A down alert is your starting point. Review the monitor name, check type, location, response code, response time, and the time the failure began before making changes.
Compare recent checks, locations, response times, and status codes. A single symptom can point to several different causes depending on what changed.
Begin with a quick confirmation
When a site is marked down, avoid guessing. Confirm the scope of the issue first. A site that is down from multiple locations usually needs faster escalation than a site that failed from only one path, one network, or one specific check type.
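As a rough illustration of the scope check, the sketch below summarizes per-location results. The function name, the location names, and the input shape (a mapping of location to pass/fail) are assumptions made for this example, not part of Spark Uptime.

```python
def outage_scope(results: dict[str, bool]) -> str:
    """Summarize scope from a mapping of location name -> True if the check passed.

    The input shape is hypothetical; adapt it to however you collect results.
    """
    failed = [loc for loc, ok in results.items() if not ok]
    if not failed:
        return "none"
    if len(failed) == 1:
        return "single_location"
    if len(failed) == len(results):
        return "all_locations"
    return "multiple_locations"


# Hypothetical location names, for illustration only.
print(outage_scope({"us-east": False, "eu-west": True, "ap-south": True}))
print(outage_scope({"us-east": False, "eu-west": False, "ap-south": False}))
```

A "single_location" result suggests looking at one path or network first, while "all_locations" usually justifies faster escalation.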
Read the alert carefully
Review the affected monitor, the check type, the detected status, the location that reported the problem, and the time the down event started.
Open the monitor report
Look at recent checks, response times, response codes, and the event timeline. Reports help separate a temporary spike from a sustained outage.
Test the target yourself
Open the website, API URL, IP address, port, or keyword page from your own connection. If possible, also test from a server or network outside your office or home connection.
Compare the failure type
A timeout, refused connection, DNS failure, SSL/TLS failure, HTTP error, or missing keyword each points to a different part of the stack.
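If you reproduce the failure with a script, the exception type often tells you which category you are in. The following is a minimal sketch using only the Python standard library. The category labels and the function name are assumptions for this example, and a missing keyword is not covered because it is a content problem rather than an exception.

```python
import socket
import ssl
import urllib.error


def classify_failure(exc: BaseException) -> str:
    """Map an exception raised during a manual test to a rough failure category."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_error_{exc.code}"
    # urllib wraps lower-level errors in URLError; unwrap to the underlying reason.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, ssl.SSLError):
        return "tls_failure"
    if isinstance(exc, socket.gaierror):
        return "dns_failure"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timeout"
    return "other"
```

A typical use is to wrap your own test request in `try`/`except Exception as exc` and pass `exc` to `classify_failure` to decide whether to look at DNS, the certificate, the listener, or the application first.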
Use the monitor type to narrow the cause
The correct troubleshooting path depends on what kind of monitor reported the down event. Spark Uptime supports website, ping, port, and keyword checks, and each type answers a different availability question.
Website monitor
Use this for homepages, stores, dashboards, and API URLs where HTTP or HTTPS availability matters. Check the HTTP status code, redirects, SSL/TLS behavior, and whether the web server is responding.
Ping monitor
Use this to test whether an IPv4 or IPv6 address responds to ICMP. A failed ping can indicate packet filtering, routing issues, host downtime, or a host that intentionally blocks ICMP.
Port monitor
Use this to verify whether a specific TCP service port is accepting connections. If the port is closed or timing out, check the service daemon, firewall, listener, and upstream routing.
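To reproduce a port check by hand, a single TCP connection attempt is usually enough to tell a closed port from a timeout. This is a minimal sketch with the standard `socket` module. The function name and return labels are assumptions for this example, and it does not claim to match how Spark Uptime performs its own check.

```python
import socket


def check_tcp_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and report 'open', 'refused', 'timeout', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing is listening: check the service and listener.
        return "refused"
    except (TimeoutError, socket.timeout):
        # No answer at all: often a firewall drop or a routing problem.
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors land here.
        return f"error: {exc}"
```

A "refused" result points at the service daemon or listener, while "timeout" more often points at a firewall or upstream routing.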
Keyword monitor
Use this when the page must contain exact case-sensitive content. If the page loads but the keyword is missing, check the page body, application output, caching, template changes, and redirects.
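When a keyword check fails on a page that loads, it helps to tell a truly missing keyword from one that only differs in letter case. The sketch below assumes the check is a plain case-sensitive substring match on the page body; the function name and labels are hypothetical.

```python
def diagnose_keyword(body: str, keyword: str) -> str:
    """Compare a page body against a keyword using an exact, case-sensitive match."""
    if keyword in body:
        return "present"
    # The text exists but with different capitalization, e.g. after a template edit.
    if keyword.lower() in body.lower():
        return "case_mismatch"
    return "missing"


print(diagnose_keyword("<h1>Order Complete</h1>", "Order complete"))
```

A "case_mismatch" result usually means a template or copy change, while "missing" is more likely an error page, a redirect, or stale cached content.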
Interpret common failure signals
The alert and report details usually reveal which layer needs attention. Use the signals below to decide what to inspect first.
Check response times before and during the event
Response time trends can show whether the site failed suddenly or degraded first. A rising response time before the down event may point to database pressure, slow upstream services, overloaded workers, heavy traffic, or network congestion.
Sudden drop
A sudden transition from normal response times to down can indicate a deploy, firewall rule, DNS change, service restart, certificate issue, or provider outage.
Gradual slowdown
A slow rise before failure often suggests resource exhaustion, application queue buildup, database latency, or a dependency becoming unstable.
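The difference between a sudden drop and a gradual slowdown can be estimated from the response times recorded before the down event. The following is one possible heuristic, not a definitive rule: it compares the median of the last few checks with the median of the earlier ones. The function name, the window of 3 checks, and the 2x ratio are all assumptions to tune for your own data.

```python
from statistics import median


def classify_onset(times_ms: list[float], recent: int = 3, ratio: float = 2.0) -> str:
    """Classify how a failure began from response times (ms) of checks before it.

    times_ms is ordered oldest to newest and excludes the failed checks themselves.
    """
    if len(times_ms) <= recent:
        return "insufficient_data"
    baseline = median(times_ms[:-recent])
    latest = median(times_ms[-recent:])
    if baseline <= 0:
        return "insufficient_data"
    # Latest checks much slower than the baseline: the service degraded first.
    return "gradual_slowdown" if latest >= ratio * baseline else "sudden_drop"
```

For "sudden_drop", start with deploys, firewall rules, DNS, and certificates; for "gradual_slowdown", start with resource usage, queues, and dependencies.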
Location pattern
If only one region is affected, investigate routing, CDN behavior, DNS answers, regional firewall rules, or provider transit issues.
Repeated flapping
Frequent down and recovery events can indicate intermittent load, unstable deployments, rate limiting, health check blocking, or a service near capacity.
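Flapping can be quantified by counting how often the status changes across recent checks. This sketch assumes each check is recorded as the string "up" or "down"; the function names and the threshold of 3 transitions are hypothetical.

```python
def count_transitions(statuses: list[str]) -> int:
    """Count how many times the status changes between consecutive checks."""
    return sum(1 for prev, cur in zip(statuses, statuses[1:]) if prev != cur)


def is_flapping(statuses: list[str], max_transitions: int = 3) -> bool:
    """Treat the monitor as flapping when it changes state more than max_transitions times."""
    return count_transitions(statuses) > max_transitions
```

A single down-then-up pair counts as two transitions, so one clean outage and recovery stays under the default threshold.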
Review recent changes
Most availability issues are easier to diagnose when you connect the alert time to recent changes. Compare the time the down event started with deployments, DNS edits, firewall changes, certificate renewals, server updates, CDN changes, redirects, or payment or provider maintenance.
Check deploy history
Look for application releases, plugin changes, theme edits, dependency updates, configuration changes, or restarts near the alert time.
Check DNS and CDN changes
Confirm the hostname still points to the correct destination and that CDN, proxy, cache, or firewall settings are not blocking monitor traffic.
Check server health
Review CPU, memory, disk, network, database, web server, and application logs. A monitor alert often appears after the underlying system is already under stress.
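Connecting the alert time to recent changes is easier with a small filter over a change log. The sketch below assumes you can export changes as (timestamp, description) pairs; the function name, the 30-minute window, and the sample entries are hypothetical.

```python
from datetime import datetime, timedelta


def changes_near(
    alert_time: datetime,
    changes: list[tuple[datetime, str]],
    window: timedelta = timedelta(minutes=30),
) -> list[str]:
    """Return descriptions of changes made within `window` before the alert, newest first."""
    start = alert_time - window
    hits = [(ts, desc) for ts, desc in changes if start <= ts <= alert_time]
    return [desc for ts, desc in sorted(hits, reverse=True)]


# Hypothetical change log, for illustration only.
alert = datetime(2024, 1, 1, 12, 0)
log = [
    (datetime(2024, 1, 1, 10, 0), "dns edit"),
    (datetime(2024, 1, 1, 11, 50), "deploy"),
    (datetime(2024, 1, 1, 11, 58), "firewall rule"),
]
print(changes_near(alert, log))
```

Changes closest to the alert time are listed first because they are usually the first candidates to review or roll back.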
Confirm the monitor configuration
If the site appears healthy but the monitor still reports down, review the monitor settings. Confirm the URL, protocol, hostname, IP address, port, and keyword match what you actually want Spark Uptime to check.
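A quick way to review a URL-based monitor is to compare the configured URL with the one you intend to check, part by part. This is a minimal sketch using `urllib.parse` from the standard library; the function name and field labels are assumptions, and it only covers protocol, hostname, port, and path.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def config_mismatches(monitor_url: str, intended_url: str) -> list[str]:
    """List which URL components differ between the monitor and the intended target."""
    m, i = urlsplit(monitor_url), urlsplit(intended_url)
    fields = {
        "protocol": (m.scheme, i.scheme),
        "hostname": (m.hostname, i.hostname),
        # Fall back to the scheme's default port when none is written in the URL.
        "port": (m.port or DEFAULT_PORTS.get(m.scheme), i.port or DEFAULT_PORTS.get(i.scheme)),
        "path": (m.path or "/", i.path or "/"),
    }
    return [name for name, (a, b) in fields.items() if a != b]


print(config_mismatches("http://example.com/health", "https://www.example.com/health"))
```

An empty list means the two URLs agree on these components; anything else names the setting to correct in the monitor.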
Communicate if customers may be affected
If the issue affects real users, publish a clear update on your status page or internal incident channel. Keep the message simple: what is affected, when it started, what users may experience, and that investigation is underway.
After recovery
Once the monitor returns to an operational state, use the report to understand duration, response time changes, and whether the recovery was stable. If the issue was caused by configuration, capacity, DNS, SSL/TLS, or application changes, document the cause and prevention steps.
Confirm stability
Watch several checks after recovery to make sure the service is not still flapping or responding slowly.
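One way to make "stable" concrete is to require a run of consecutive successful checks that also respond within an acceptable time. The sketch below assumes each check is a (passed, response time in ms) pair; the function name, the run length of 5, and the 1000 ms limit are hypothetical values to adjust.

```python
def is_stable(
    checks: list[tuple[bool, float]],
    required: int = 5,
    max_ms: float = 1000.0,
) -> bool:
    """Return True when the last `required` checks all passed within `max_ms`.

    checks is ordered oldest to newest; each item is (passed, response_time_ms).
    """
    recent = checks[-required:]
    if len(recent) < required:
        # Not enough checks since recovery to judge stability yet.
        return False
    return all(ok and ms <= max_ms for ok, ms in recent)
```

If this returns False shortly after recovery, keep the incident open and continue watching rather than closing it on the first successful check.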
Record the cause
Note what failed, what fixed it, who was impacted, and what should change to prevent the same issue.
Tune monitoring
If needed, adjust monitor type, URL, keyword, port, or alert routing so future alerts are more precise.
Update users
If you posted an incident notice, close the loop with a recovery update and any relevant next steps.

