The month of March is approaching rapidly, and you all know what that means: DNS Awareness Month is about to be upon us!
But not really.
Several links that I’ve seen in the past few weeks on HN have surfaced bad professional memories I have regarding DNS. When employed properly DNS is a force for good - it enhances user and operator experience and saves us from having to memorize IP addresses. When misused, whether out of malice or ignorance, it confounds and becomes a proverbial rock in one’s shoe.
I’ve seen all of (and have occasionally been responsible for) the following DNS faux pas multiple times in the wild and want to make sure that I pay forward my hard-gained lessons to the world at large. Although most of these also apply to application deployments this article is more oriented towards user services, i.e. your typical “corporate” Active Directory deployment or Amazon VPC in conjunction with VPN and e-mail services. Here we go…
Not creating PTR records
It’s always easier to solve a maze backwards and the same principle applies to your DNS records. Okay, so that’s a poor analogy - sue me. The fact still stands that if you do a netstat and see that your machine has been SSH’d into 10.108.12.10 for the past three hours it’s a hell of a lot easier to look at a hostname (or dig the IP that appears) than it is to:
- Ping the IP and hope that the Windows resolver cache has the hostname in it (I’ve been told this is a thing but it’s been so long since I’ve concerned myself with Microsoft on the desktop that I don’t even know if it’s true)
- Consult a very probably out-of-date network diagram or Excel spreadsheet
- Fire up the IPAM solution that your department uses (but only during even-numbered years when Mercury is in retrograde) and pray that the record is in there
- Dig through one of several MMC snap-ins that might reveal the answer via log or property page
- Play virtual scavenger hunt to get the server credentials (you are using secrets management, aren’t you?) and remote into it to find the hostname
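For comparison, here is roughly what the PTR route looks like from a script. This is a minimal sketch using only Python's standard library; the helper names `ptr_name` and `reverse_lookup` are my own, and the lookup only returns something useful if a PTR record actually exists for the address.

```python
import ipaddress
import socket
from typing import Optional


def ptr_name(ip: str) -> str:
    """Return the reverse-zone owner name that a PTR record for `ip` lives at."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip: str) -> Optional[str]:
    """Ask the system resolver for the hostname behind `ip`; None if there is no answer."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None


# The record you would need to create for the address from the netstat example:
print(ptr_name("10.108.12.10"))  # 10.12.108.10.in-addr.arpa
```

Calling `reverse_lookup("10.108.12.10")` on a network with the PTR record in place hands back the hostname in one step; without the record it returns None and you are back to the list above.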
Improperly naming your internal directory
I’ve made this mistake at least once, but thankfully the guidance is clear and hasn’t changed in at least a decade. Don’t use a public DNS zone for your internal directory. This can introduce a split DNS situation (see below) or just general confusion. Using a subdomain from a publicly-registered domain for your directory environment is most often the easiest solution.
Back in the Clinton administration (when everybody was busy drinking Fruitopia and listening to Ace of Base) people used to do whimsical things like use non-existent TLDs like .local but the Great TLD Explosion of 2013 meant that people could actually register such domains. Hindsight is 20/20: see Y2K, 32-bit time_t, 640k of memory, BlackBerry, and countless other examples in the storied history of tech.
Poorly-implemented Split DNS
This happens when there are multiple authoritative zones for a single namespace, e.g. you use porkchopexpress.com for your directory root and externally-accessible resources on different DNS servers. This leads to split DNS and changes what lookups return depending upon a machine’s network configuration. Introducing such unnecessary state to a system makes troubleshooting difficult and confusing. If you do want to serve internal IPs to internal users then there are ways to intelligently implement this by using the same server with multiple listeners. For smaller deployments I’d say it’s probably easier to just enable NAT loopback with external records only and be done with it.
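If you suspect you are living with accidental split DNS, one way to make the hidden state visible is to collect answers from an internal and an external resolver and compare them. The sketch below assumes you have already gathered those answers (with dig or similar) into two dictionaries; the function name, record names, and IP addresses are all made up for illustration.

```python
def split_dns_diff(
    internal: dict[str, set[str]],
    external: dict[str, set[str]],
) -> dict[str, tuple[set[str], set[str]]]:
    """Return every name whose answers differ between the two views."""
    diffs = {}
    for name in sorted(internal.keys() | external.keys()):
        inside = internal.get(name, set())
        outside = external.get(name, set())
        if inside != outside:
            diffs[name] = (inside, outside)
    return diffs


# Hypothetical answers collected from each side of the network.
internal_view = {
    "www.porkchopexpress.com": {"10.0.0.5"},
    "mail.porkchopexpress.com": {"203.0.113.25"},
}
external_view = {
    "www.porkchopexpress.com": {"203.0.113.10"},
    "mail.porkchopexpress.com": {"203.0.113.25"},
}

for name, (inside, outside) in split_dns_diff(internal_view, external_view).items():
    print(f"{name}: internal={sorted(inside)} external={sorted(outside)}")
```

Any name that shows up in the output answers differently depending on where the client sits, which is exactly the thing you want documented before someone has to troubleshoot it at 2 AM.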
Using FQDNs for user access to applications
Generally speaking FQDNs should be used to uniquely identify servers. Any time you deploy an application it should get its own record(s). “But applications are servers, aren’t they?” Not quite… applications run on servers, but the fact that there are many applications that require multiple servers to run should clue one in to the fact that applications != servers. Give each application its own DNS record, even if it only runs on a single server. This makes it easier to add or change servers later on for load balancing and migrations.
This mistake usually goes hand-in-hand with other common deployment missteps such as:
- Failing to reconfigure the default non-standard port that the application listens on
- Leaving it running in the default context without any redirect from the root of the application
- Not implementing HTTPS properly (using a self-signed certificate, a cert with the wrong CN or SAN(s), not implementing a redirect to HTTPS from the HTTP listener, having no HTTP listener, etc.)
… so that your users have to enter https://frisky-mongoose-57.yoyodyne.com:8443/tc42 then click “PROCEED ANYWAY” while they pour sweat in fear of clicking the no-no button on the Chrome warning only to have to repeat the process three months later with a new URL that they spend ten minutes looking for in a migration notification email that they didn’t receive and had to have their coworker forward them because they aren’t a member of a distro that’s as old as they are and hasn’t been updated in just as long.
Subjecting users to the intricacies of your network infrastructure is indefensible. Be part of the solution, not the problem.
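To make the "one record per application" idea concrete, here is a small sketch that renders a zone-file style CNAME line pointing an application name at whatever server currently hosts it. The `apps.yoyodyne.com` zone, the `timecard` application name, the second server name, and the 300-second TTL are all assumptions for illustration, not a recommendation for your environment.

```python
def cname_record(app: str, target_fqdn: str, zone: str, ttl: int = 300) -> str:
    """Render a zone-file style CNAME line for an application alias."""
    return f"{app}.{zone}. {ttl} IN CNAME {target_fqdn}."


# Today the (hypothetical) timecard app lives on one server...
print(cname_record("timecard", "frisky-mongoose-57.yoyodyne.com", "apps.yoyodyne.com"))

# ...and after a migration only the target changes. Users keep the same name.
print(cname_record("timecard", "placid-otter-12.yoyodyne.com", "apps.yoyodyne.com"))
```

The user-facing name on the left never moves, so a migration becomes a one-line record change instead of a company-wide email with a new URL in it.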
Throwing everything in a single zone (servers, user machines, network resources, applications, etc.)
This is more of an annoyance than a critical error but it’s confounding nonetheless. Segmenting DNS zones by purpose makes it easier to identify record types. Commingling records for applications, user machines, and servers makes it more difficult to distinguish these at a glance and becomes even more complicated if you have your DHCP server automatically creating DNS records when resources get reservations assigned to them. If you put an application record in the same zone as AD managed objects then it creates a taxonomical headache for anybody trying to use DNS names for anything meaningful.
Using cute naming conventions
This one’s a holdover from the days of physical servers that might have served multiple applications. The file server, for instance, might also be a print server and occasional licensing server. It wouldn’t make sense to immutably name it “file” because that wouldn’t adequately describe its purpose. Having an arbitrary naming scheme for physical machines is vaguely acceptable (even though other solutions involving serial numbers, location, or spec might be more descriptive). In a post-VMware/cloud world, however, there’s no reason to name virtual boxes as such because you shouldn’t be repurposing virtual assets; you should be provisioning new ones via some type of automatic process. Having to maintain a mapping of pasta sauce brands to functional roles in your network infrastructure is asinine.
“But if we name the servers what they do then the hackers will know what they do too!” They can figure that out in 10 seconds using nmap. You’re wasting hours of employee time by forcing everybody to look up and subsequently memorize the fact that your primary domain controller is “ragu” and the secondary is “raos”, not the other way around.
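A descriptive scheme doesn’t need to be clever. As a rough sketch, assuming a made-up site-role-sequence convention (the site code, role abbreviation, and two-digit index are my own invention here, so adapt them to whatever your provisioning tooling expects):

```python
def hostname(site: str, role: str, index: int) -> str:
    """Build a descriptive name such as 'nyc-dc-01' from site, role, and sequence number."""
    return f"{site}-{role}-{index:02d}".lower()


# No lookup table of pasta sauces required:
print(hostname("NYC", "dc", 1))  # nyc-dc-01
print(hostname("NYC", "dc", 2))  # nyc-dc-02
```

Because the name is generated from properties the provisioning process already knows, nobody has to memorize which sauce is the primary.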
“The infrastructure team takes months to provision new servers! I need to reuse existing ones.” This is a rough one. In my completely unprofessional opinion you should (in descending order of preference):
- Have your IT department fire up the flux capacitor and jump forward to the year 2021 where self-service/automated provisioning tools exist
- Get your team’s own infrastructure via a cloud provider or series of VM hosts in the corporate datacenter
- Get in tight with the infrastructure lead so you can call in favors when necessary
- Have your boss harass the IT department via e-mail by employing classic vagaries regarding “customer deadlines” and “work stoppages”
- Find a new job
Not using it
Why go to the trouble of maintaining DNS infrastructure if you’re just going to type IP addresses any time an application asks for them? Even in the year 2021 I see and hear people suggesting the use of IP addresses to access network resources instead of DNS names. To be fair there are times when this is preferable but it’s a very narrow set of cases. If a user is trying to access a licensing server and it’s not working via DNS name then figure out why DNS isn’t working so that the systemic issue can be solved. If your users ever ask you what the IP address to a server or resource is then it’s almost certainly indicative of a larger issue, not the need for a band-aid.
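When a name “isn’t working”, a reasonable first step is to see what the client’s own resolver actually returns for it. A minimal sketch using the standard library follows; the function name `resolve` is mine, and note that this goes through the system resolver, so hosts-file entries and local caches are included in the answer.

```python
import socket


def resolve(name: str) -> list[str]:
    """Return the sorted, de-duplicated addresses the system resolver gives for `name`."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Resolution failed outright: the name doesn't exist or no resolver answered.
        return []
    return sorted({info[4][0] for info in infos})
```

An empty list means the name didn’t resolve at all from that machine, while an unexpected address points towards a stale record or the wrong resolver. Either way you now have a systemic issue to chase rather than an IP address to hand out.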
Am I alone in having seen these over and over? Do I have a point or am I just a slavish devotee to convention? Let me know in the comments!