Website Reliability Knowledgebase

DNS reliability

Learn how DNS records, TTL, propagation, DNSSEC, and resolver behavior affect website availability, alert accuracy, and customer access.

DNS is one of the most important layers of website reliability. Before a browser, monitoring system, API client, or mail server can connect to a service, it must first resolve the hostname to the correct destination. If DNS is misconfigured, slow, stale, incorrectly signed, or unavailable, the service may appear down even when the web server itself is healthy.

A reliable DNS setup helps visitors reach the correct infrastructure consistently. It also helps uptime monitors distinguish between a web server outage, a certificate problem, a routing issue, and a DNS-layer failure.

Reliability principle: DNS should be treated as production infrastructure. A website can have healthy servers, valid SSL/TLS certificates, and working application code, yet still be unreachable if DNS resolution fails.

How DNS affects availability

DNS translates human-readable names such as example.com into record data such as IPv4 addresses, IPv6 addresses, mail servers, verification strings, or delegated nameservers. When that lookup process fails, users may see browser errors, email may stop routing, APIs may become unreachable, and monitoring checks may report downtime.

  • Incorrect records can send traffic to the wrong server or an old provider.
  • Missing records can prevent a hostname from resolving at all.
  • Expired signatures or a broken DNSSEC chain of trust can cause validating resolvers to reject otherwise valid answers.
  • Long TTL values can keep old DNS answers cached after a migration.
  • Nameserver issues can prevent resolvers from receiving authoritative answers.
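
To illustrate how a check can separate a DNS-layer failure from later connection problems, here is a minimal Python sketch. The function name classify_resolution and its return labels are hypothetical. It relies on the operating system's resolver through socket.getaddrinfo, so it reflects whatever recursive resolver the host is configured to use.

```python
import socket

def classify_resolution(hostname, port=443, resolver=socket.getaddrinfo):
    """Return ("resolved", [addresses]) or ("dns_failure", reason).

    `resolver` is injectable so the logic can be exercised without
    network access; by default the system resolver is used.
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # The name could not be resolved: a DNS-layer failure,
        # independent of whether the web server is healthy.
        return "dns_failure", str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    if not addresses:
        return "dns_failure", "no addresses returned"
    return "resolved", addresses
```

A "resolved" result only shows that the name maps to some address. Whether that address is the correct destination still has to be checked against the records you expect.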

TTL and DNS propagation

TTL, or time to live, tells resolvers how long they may cache a DNS answer. A short TTL can help changes take effect faster, while a long TTL can reduce query volume and improve caching efficiency. Neither approach is universally correct. The best TTL depends on how often the record changes and how much operational flexibility you need.

DNS propagation is not a single global event. It is the process of cached answers expiring across many recursive resolvers, networks, and regions. During a migration, some visitors may receive the new answer while others continue to receive the old answer until their resolver cache expires.
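
The timing of a migration can be worked out with simple arithmetic. The sketch below uses hypothetical function names and assumes resolvers honor the published TTL. Some resolvers clamp TTLs to their own minimum or maximum, so treat the results as estimates rather than guarantees.

```python
from datetime import datetime, timedelta, timezone

def earliest_safe_change(ttl_lowered_at, previous_ttl_seconds):
    """Earliest time at which caches filled under the previous (long) TTL
    have expired, so the lowered TTL is in effect everywhere."""
    return ttl_lowered_at + timedelta(seconds=previous_ttl_seconds)

def latest_stale_answer(change_time, current_ttl_seconds):
    """Latest time a TTL-honoring resolver could still serve the old answer
    after the record is changed."""
    return change_time + timedelta(seconds=current_ttl_seconds)

# Example: the TTL is lowered from 86400 s (1 day) to 300 s (5 minutes).
lowered_at = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
change_at = earliest_safe_change(lowered_at, 86400)  # 2024-01-02 00:00 UTC
stale_until = latest_stale_answer(change_at, 300)    # 2024-01-02 00:05 UTC
```

In this example, waiting one full day after lowering the TTL shrinks the window in which visitors may still receive the old answer from up to a day to about five minutes.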

DNSSEC and validation failures

DNSSEC adds cryptographic validation to DNS responses. When configured correctly, it helps protect against forged DNS answers. When configured incorrectly, it can make a domain fail for resolvers that validate DNSSEC.

Common DNSSEC-related reliability issues include stale DS records at the registrar, missing signatures, expired signatures, mismatched keys, and DNS provider changes where DNSSEC was not updated correctly. These failures can be especially confusing because validating resolvers reject the domain, typically with a SERVFAIL response, while non-validating resolvers continue to answer normally.
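
Expired signatures are one of the easier failures to catch early. RRSIG records carry a signature expiration time that is commonly displayed as a YYYYMMDDHHmmSS timestamp in UTC, for example in the output of dig +dnssec. The sketch below assumes you have already copied that field out of a lookup. The function names and the three-day warning threshold are illustrative assumptions, not recommendations.

```python
from datetime import datetime, timezone

def rrsig_time_remaining(expiration_field, now=None):
    """Time left before an RRSIG expires.

    `expiration_field` is the expiration timestamp in the
    YYYYMMDDHHmmSS (UTC) presentation format.
    """
    expires = datetime.strptime(expiration_field, "%Y%m%d%H%M%S")
    expires = expires.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return expires - now

def signature_status(expiration_field, warn_days=3, now=None):
    """Classify a signature as "expired", "expiring_soon", or "ok"."""
    remaining = rrsig_time_remaining(expiration_field, now)
    if remaining.total_seconds() <= 0:
        return "expired"
    if remaining.days < warn_days:
        return "expiring_soon"
    return "ok"
```

This covers only signature lifetime. A stale DS record or a mismatched key needs a separate comparison between the parent zone's DS records and the zone's DNSKEY set.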

Resolver behavior matters

Different recursive resolvers may return different results during DNS changes, outages, or partial provider failures. Public resolvers, ISP resolvers, corporate resolvers, and regional resolvers may cache answers differently or validate DNSSEC differently.

For uptime monitoring, this means DNS failures should be interpreted carefully. A single failed lookup may indicate a temporary resolver issue, while repeated failures across multiple locations are stronger evidence of a real DNS reliability problem.
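
One way to encode that interpretation is to require consistent failures from several locations before declaring a DNS outage. The sketch below is a simplified assumption about how a monitor might aggregate results. The location names, labels, and the default threshold of two locations are hypothetical.

```python
def classify_dns_alert(results, min_failed_locations=2):
    """Classify recent DNS lookups gathered from several probe locations.

    `results` maps a location name to a list of booleans for its recent
    attempts (True means the lookup succeeded).
    """
    # Locations where every recent attempt failed.
    failing = [
        loc for loc, attempts in results.items()
        if attempts and not any(attempts)
    ]
    # Locations with a mix of successes and failures.
    flaky = [
        loc for loc, attempts in results.items()
        if any(attempts) and not all(attempts)
    ]
    if len(failing) >= min_failed_locations:
        return "likely_dns_outage"
    if failing or flaky:
        return "possible_resolver_issue"
    return "healthy"
```

For example, repeated failures in both "fra" and "nyc" would be reported as a likely outage, while one failed attempt in a single location would only be flagged as a possible resolver issue.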

DNS record types

Each DNS record type has a specific purpose. Understanding the record type helps you troubleshoot whether a failure is affecting website traffic, email routing, service discovery, verification, delegation, or security policy.
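
As a rough illustration of that mapping, the lookup table below pairs common record types with the area a failure is most likely to affect. It is a simplified, non-exhaustive sketch. The name RECORD_IMPACT and the wording of each entry are assumptions made for this example.

```python
# Simplified mapping from record type to the area a failure usually affects.
RECORD_IMPACT = {
    "A": "website traffic (IPv4)",
    "AAAA": "website traffic (IPv6)",
    "CNAME": "website traffic (alias to another hostname)",
    "MX": "email routing",
    "SRV": "service discovery",
    "TXT": "verification and email policy such as SPF",
    "NS": "delegation",
    "DS": "delegation (DNSSEC chain of trust)",
    "CAA": "security policy (certificate issuance)",
}

def impact_of(record_type):
    """Return the likely impact area for a record type, case-insensitively."""
    return RECORD_IMPACT.get(record_type.upper(), "unknown record type")
```

During troubleshooting, a table like this helps narrow the search: a broken MX record points to mail delivery, while a broken A or AAAA record points to website reachability.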

Recommended reliability practices

  • Use reliable authoritative DNS providers with redundant infrastructure.
  • Keep registrar nameserver settings aligned with the DNS provider actually hosting the zone.
  • Lower TTLs before planned migrations, then raise them again after the change is stable.
  • Validate A, AAAA, CNAME, MX, TXT, and CAA records after DNS changes.
  • Use DNSSEC only when you can maintain the DS, DNSKEY, and signing chain correctly.
  • Monitor both IPv4 and IPv6 when both record types are published.
  • Check DNS from multiple networks when investigating region-specific access issues.
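
The nameserver alignment check in the list above can be sketched as a set comparison. The example assumes you have already collected the nameserver names delegated at the registrar and the NS records served by the zone itself. How you collect them is outside this sketch, and the function names are hypothetical.

```python
def normalize_ns(name):
    """Lowercase a nameserver name and drop the trailing root dot."""
    return name.strip().rstrip(".").lower()

def compare_nameservers(registrar_ns, zone_ns):
    """Compare nameservers delegated at the registrar with the NS records
    published in the zone, ignoring case and trailing dots."""
    delegated = {normalize_ns(n) for n in registrar_ns}
    served = {normalize_ns(n) for n in zone_ns}
    return {
        "aligned": delegated == served,
        "only_at_registrar": sorted(delegated - served),
        "only_in_zone": sorted(served - delegated),
    }
```

A non-empty "only_at_registrar" list suggests the registrar may still point at a provider that no longer hosts the zone.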

Operational takeaway: DNS reliability is not just about whether a record exists. It also depends on caching, authoritative nameserver health, DNSSEC validity, resolver behavior, and whether the returned destination is actually correct.