Fighting E-mail Spammers

by: Todd Burgess (tburgess@eddie.cis.uoguelph.ca)

Foreword

I remember the day so clearly. I came back from class, turned on my computer and checked my e-mail. In my mailbox were three messages that contained hateful and threatening language. The messages were addressed to me and had been forged. I had played around with forging e-mail messages on my Linux box and sending them to myself (why anybody would forge e-mail and send it to themselves will be discussed at a later time). As far as I was concerned the e-mail could not be traced and I did not know what to do. I figured I could simply forward them to the postmaster and let the messages be their problem. But there was this little voice in the back of my head telling me that I could track down who sent the e-mail.

I had never even considered attempting to track down e-mail forgers, let alone tried it. So I studied the e-mail messages and a couple of other resources. After 4 hours not only did I know what site the e-mail message came from but I also had an idea as to what user sent it. I notified the right people and a couple of days later my suspicions were confirmed. I was elated and had learned a lot about hunting down forgeries.

Soon I learned that the techniques I had used to catch the forger could also work on spammers. Pretty soon I was telling e-mail spammers what they could do with their promotions (or sys admins what they should be doing with their users).

A lot of my knowledge has been gained through trial and error. This document is an attempt to explain some of the things I have learned over the last year so you do not have to use the trial and error method. This is a work in progress so please bear with me. If you find any glaring errors or stupid statements let me know so that I can fix them. Hopefully, if we can show e-mail spammers how anonymous they really are not, we can stop them.
Todd Burgess
tburgess@eddie.cis.uoguelph.ca

Introduction

You have probably at one time or another received e-mail promising you lots of money or cheap phone sex. You reply to the spam but unfortunately the e-mail address is invalid so your reply to the spammer bounces. The purpose of this document is to allow you to identify the site and possibly the user who sent the mail.

Identifying these people is an art form. There is no right or wrong way to go about tracing their e-mail. This document attempts to identify several methods you may use to find them and make sure your comments are heard. Successful hunts can be aided with some background knowledge in DNS and SMTP.

Here are some common misconceptions about tracking down spammers:

* You need root access. While having root access can help in tracking down the culprit your root access would probably only be valid on one network. The e-mail probably came from another network in which you do not have root access.

* You need to be a computer hacker. While hacking may increase the odds of finding the person you might get busted in the process. Do not take the law into your own hands because nobody in power likes to be circumvented. A lot of sys admins are on power trips so do not give them an excuse to use their power against you.

* Spammers are impossible to find. Remember that for every e-mail message there is a valid e-mail address and site associated with the message. There may be attempts made to obscure it but if you know what you are looking for you should be able to find it.

Meet the Enemy

As a rule of thumb most spammers do not understand what they are doing or who they are up against. It has been my experience that spammers are ignorant about Internet technologies so this is something to be exploited and used against them. Spammers do it because they think nobody knows who they are. The secret is to use the technology they use to hide themselves against them.
Tactics

The very first thing you will want to do is decide if you want to take on the job yourself. If you do not feel that you are up to it then simply forward your message to postmaster@yoursite and let them deal with it. If you welcome a challenge then you will want to do it yourself.

The first step is to identify if it is a valid message or not. If it is valid then you are already done. If it is fake then you are going to first want to identify the site it came from. If you want a bigger challenge you can see if you can identify who actually sent the e-mail.

If you have managed to make an identification then you have to decide what to do. Retribution may be good for internationally sanctioned strategic bombing campaigns but usually does not work very well at the personal level. Attempt to avoid retribution at all costs. Your goal should be to deter the spammer, not ruin their life (as tempting as it may seem).

E-mailing the offender and asking them to remove you from their mailing lists might be a start. A better approach would be to e-mail the postmaster at the site the e-mail originated from. E-mail forgery and spamming violate a lot of connection agreements so a lot of providers will be quick to take action. More on contacting the proper people at the end of the document.

E-Mail Forgery

First and foremost, this is not an introduction on how to commit e-mail forgery. This is an introduction on how to spot a forgery and, more importantly, how to identify where the e-mail originated. Since most spammed e-mail is forged it is important to know how to spot a forgery.

Valid E-Mail

Before looking into forged e-mail we must first examine valid e-mail. Valid e-mail is e-mail whereby no attempt is made by the sender to conceal its origin. Perhaps the e-mail might look like this in your favorite e-mail reader:

    Date: Tue, 25 Mar 1997 12:00:48 -0500 (EST)
    From: John Smith
    To: Todd Burgess
    Subject: Hello

    This is a perfectly good e-mail message.

Unfortunately what most e-mail readers show us is not the real story. What we want to know is buried inside the headers. Here is the message complete with all its headers.

    From jsmith@hornet.tbsys.ca Tue Mar 25 12:00:50 1997
    Return-Path: jsmith
    Received: (from jsmith@localhost) by hornet.tbsys.ca (8.6.12/8.6.9)
        id MAA00135; Tue, 25 Mar 1997 12:00:48 -0500
    Date: Tue, 25 Mar 1997 12:00:48 -0500 (EST)
    From: John Smith
    To: Todd Burgess
    Subject: Hello
    Message-ID:
    MIME-Version: 1.0
    Content-Type: TEXT/PLAIN; charset=US-ASCII
    Status: RO
    X-Status:

    This is a perfectly good e-mail message.

Now if you have never seen the e-mail headers before it can be a little scary but not to worry. Only certain parts of the headers concern us. Why do we need headers, you might ask? Headers are what allow the programs which deliver e-mail to make sure it arrives at its proper destination.

First off we can ignore the Date:, From:, To:, Subject:, MIME-Version:, Content-Type: and Status: headers. The Date:, From:, To: and Subject: headers are what the e-mail programs use when displaying the e-mail. They are not needed to deliver the e-mail.

Next we will look at the From line at the very top of the message (the one without a colon). This line is who the sender identified themselves as. Think of it as a login used when sending mail. It is possible to forge this line so do not take it at face value.

The Return-Path: header simply contains the e-mail address from that From line. It too should not be taken at face value.

The next item of importance is the Received: line. This line can be used to identify the sender and the origin of the e-mail message. It can also identify any intermediate mail servers used to deliver the e-mail message (more on this in a later section). This line is added by the mail server that received the message and it is very difficult to forge (but not impossible). As a general rule this header can be trusted to provide accurate information.

The last line of importance is the Message-Id: line.
This can provide a way of identifying the host from which the e-mail message originated. Sometimes this line will also identify the e-mail program used to put together the e-mail message. In the above message the e-mail program used was Pine for Linux. This line does not always identify where the message originated so do not put too much faith in it.

Putting It All Together

Now that the basics of the headers have been explained it is time to put it all together. How do we know it is a valid e-mail message? Take a look at the From:, Return-Path: and Received: headers. The same username appears in all three (jsmith). It is fair to say that no attempt was made to conceal the username or the site of origin of the message.

For System Administrator's Eyes Only!

By now we have seen what the e-mail looks like from a user's view. There is also a view very few of us will ever see and that is the System Administrator's view. As a general rule all mail servers keep logs of all the messages they process (both incoming and outgoing). These logs are usually not available to users but can be used by System Administrators should the need arise. The following entries were logged by sendmail when it received the e-mail message.

    Mar 25 12:00:49 hornet sendmail[135]: MAA00135: from=jsmith, size=314, class=0, pri=30314, nrcpts=1, msgid=, relay=jsmith@localhost
    Mar 25 12:00:51 hornet sendmail[136]: MAA00135: to=Todd Burgess , ctladdr=jsmith (503/100), delay=00:00:03, mailer=local, stat=Sent

The first entry was recorded when sendmail received the e-mail message. The second entry was recorded when sendmail delivered the e-mail message. The first entry is the one that concerns us the most, and in particular its relay= field. This identifies who owned the process which sent the e-mail message. This is how a lot of System Administrators catch e-mail forgers. They do not even look at the e-mail message but instead rely on the system logs.
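
As a rough illustration of how an administrator might pull the relay= field out of such an entry, here is a minimal Python sketch. It assumes the log layout shown in the entries above (sendmail 8.6 style); other versions and configurations may log differently. The names parse_sendmail_entry and relay_user are made up for this example and are not part of sendmail.

```python
import re

# Matches the "sendmail[pid]: QUEUEID: key=value, key=value, ..." part of a
# log entry laid out like the ones shown above. Assumption: other sendmail
# versions may use a different layout.
LOG_RE = re.compile(r"sendmail\[(?P<pid>\d+)\]: (?P<qid>\w+): (?P<fields>.*)$")


def parse_sendmail_entry(line):
    """Return the key=value fields of one log entry as a dict, or None."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    fields = {}
    # Each field runs up to the next comma; an empty value (msgid=) is kept.
    for key, value in re.findall(r"(\w+)=([^,]*)", match.group("fields")):
        fields[key] = value.strip()
    fields["qid"] = match.group("qid")
    return fields


def relay_user(fields):
    """Return the user part of relay=user@host, or None if no user is given."""
    user, at_sign, _host = fields.get("relay", "").partition("@")
    return user if at_sign else None


if __name__ == "__main__":
    entry = ("Mar 25 12:00:49 hornet sendmail[135]: MAA00135: from=jsmith, "
             "size=314, class=0, pri=30314, nrcpts=1, msgid=, "
             "relay=jsmith@localhost")
    parsed = parse_sendmail_entry(entry)
    print(parsed["from"], parsed["relay"], relay_user(parsed))
```

The sketch only reads a line that is handed to it. Getting at the log file itself still requires the access a System Administrator has.
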
The only way a forger would be able to circumvent the system logs is to break into the system and alter the logs. Such activity would draw unwanted attention to the hacker's activities and could expose the hacker. It is a wise hacker who does not draw attention to themselves. I still maintain that the best hacks are the ones nobody knows about.

Forged E-Mail

Forged e-mail can come in many forms and below are two forms that you may come across.

Amateur Forgery

This style of forgery comes from a FAQ whose name will go unmentioned. The reason I call it amateur is that, although it is forged e-mail, it is easy to spot and easy to track down. The e-mail might look like this in your mail reader:

    Date: Tue, 25 Mar 1997 12:25:57 -0500
    From: nobody@nowhere.net

    hello. this is a really horrible piece of forged e-mail.

First off, the e-mail address in the From: header does not exist. You will also notice there is no To: line in the message. The absence of the To: header is usually the sign of a poorly implemented e-mail program.

Now the non-believers out there might think that the person who sent the message cannot be found. All you need to do is take a look at the headers for the message. If you understood the previous section then you should be able to tell who sent the message and where it came from.

    From nobody@nowhere.net Tue Mar 25 12:26:29 1997
    Return-Path: nobody@nowhere.net
    Received: from nowhere.com (jsmith@localhost [127.0.0.1])
        by hornet.tbsys.ca (8.6.12/8.6.9) with SMTP id MAA00153 for tburgess;
        Tue, 25 Mar 1997 12:25:57 -0500
    Date: Tue, 25 Mar 1997 12:25:57 -0500
    From: nobody@nowhere.net
    Message-Id: <199703251725.MAA00153@hornet.tbsys.ca>
    Apparently-To: tburgess@hornet.tbsys.ca
    Status: RO
    X-Status:

    hello. this is a really horrible piece of forged e-mail.

If you said that the e-mail message was sent from jsmith@localhost then you just won a new car. If you do not know why that is the correct answer then look at the message again (HINT: look at the Received: header).
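
Reading the Received: header by eye is all this takes, but the same reading can be sketched in a few lines of Python. This is only an illustration under one assumption: the header has the "from NAME (user@host [ip]) by SERVER ... for RECIPIENT;" shape seen in the example above. Other mail servers write this header differently, and the function name parse_received is made up for this example.

```python
import re

# Shape assumed here: from HELO (ident@host [ip]) by SERVER ...
# The "ident@" part is optional, since not every header carries a user name.
RECEIVED_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+"
    r"\((?:(?P<ident>[^@\s()]+)@)?(?P<host>[^\s()]+)\s+\[(?P<ip>[^\]]+)\]\)\s+"
    r"by\s+(?P<by>\S+)"
)
FOR_RE = re.compile(r"\bfor\s+(?P<rcpt>[^;\s]+)")


def parse_received(value):
    """Split one Received: header value into its parts, or return None."""
    match = RECEIVED_RE.search(value)
    if match is None:
        return None
    info = match.groupdict()
    recipient = FOR_RE.search(value)
    info["rcpt"] = recipient.group("rcpt") if recipient else None
    return info


if __name__ == "__main__":
    header = ("from nowhere.com (jsmith@localhost [127.0.0.1]) "
              "by hornet.tbsys.ca (8.6.12/8.6.9) with SMTP id MAA00153 "
              "for tburgess; Tue, 25 Mar 1997 12:25:57 -0500")
    info = parse_received(header)
    # helo is what the sender claimed to be; ident, host and ip are what the
    # receiving server recorded about the connection.
    print(info["helo"], info["ident"], info["host"], info["ip"], info["rcpt"])
```

The field to distrust is helo, which is whatever the sending program claimed. The user, host and address inside the parentheses were recorded by the receiving server.
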

First off you will notice the From, Return-Path: and From: headers are forged. Not to worry because the Received: header told us all we needed to know. You will also notice that an Apparently-To: header was introduced into the message. This header was added by sendmail because a To: header was not included in the message. Such a header is at least a sign of a poorly done e-mail message. If you suspect (or know) that the From: header is bogus then this header can indicate an amateur forgery.

Sendmail Logs

Just to confirm our answer we will take a look at the sendmail logs.

    Mar 25 12:26:29 hornet sendmail[153]: MAA00153: from=nobody@nowhere.net, size=57, class=0, pri=30057, nrcpts=1, msgid=<199703251725.MAA00153@hornet.tbsys.ca>, proto=SMTP, relay=jsmith@localhost [127.0.0.1]
    Mar 25 12:26:29 hornet sendmail[154]: MAA00153: to=tburgess, delay=00:00:32, mailer=local, stat=Sent

Notice how the from and relay tags do not match. This is a sign of a forged e-mail message. You should notice how the relay tag confirms our answer.

Better Forged E-Mail Message

The following is a modification of the previous example. The difference is that this message looks a lot more real than the previous one. The message might look like this in your favorite e-mail reader:

    Date: Tue, 1 Apr 1997 23:16:52 -0500
    From: Nobody
    To: Easter Bunny
    Subject: Send More Chocolate

    I want more chocolate. Send more!!!

I added a little twist to this message which will hopefully explain another one of those mysteries. Ever gotten a piece of junk e-mail that was not addressed to you? Ever wondered how you got it? The answer lies in the SMTP protocol. SMTP (Simple Mail Transfer Protocol) is the protocol used to exchange e-mail across the Internet. When sending e-mail the e-mail program must identify who the message is addressed to. That recipient is given to the mail server separately from the To: header, and the two do not have to match, so the To: header can be forged as easily as the From: header.

Now the full message will be shown.
The questions you must try to answer are: who sent the message, from where, and who was the message really addressed to?

    From nobody@nowhere.com Tue Apr 1 23:18:08 1997
    Return-Path: nobody@nowhere.com
    Received: from nowhere.com (jsmith@localhost [127.0.0.1])
        by hornet.tbsys.ca (8.6.12/8.6.9) with SMTP id XAA00447 for tburgess;
        Tue, 1 Apr 1997 23:16:52 -0500
    Date: Tue, 1 Apr 1997 23:16:52 -0500
    Message-Id: <199704020416.XAA00447@hornet.tbsys.ca>
    From: Nobody
    To: Easter Bunny
    Subject: Send More Chocolate
    Status: RO
    X-Status:

    I want more chocolate. Send more!!!

If you said that the message was sent from jsmith@localhost and the message was addressed to tburgess then you won a trip to some place nice and warm. If you did not get the correct answer take a look at the Received: header.

By now you might see a pattern developing. Every time you get what you think is a forged e-mail message just take a look at the Received: header. In fact there is a pattern. Always take a look at the Received: header if you want to trace the message's origin.

Sendmail Logs

Here are the sendmail logs for the above e-mail message. They will confirm what was said above:

    Apr 1 23:18:08 hornet sendmail[447]: XAA00447: from=nobody@nowhere.com, size=133, class=0, pri=30133, nrcpts=1, msgid=<199704020416.XAA00447@hornet.tbsys.ca>, proto=SMTP, relay=jsmith@localhost [127.0.0.1]
    Apr 1 23:18:08 hornet sendmail[448]: XAA00447: to=tburgess, delay=00:01:16, mailer=local, stat=Sent

In case you are still a little unsure about how the To: header works take a look at the logs. See if you can find Easter Bunny or bunny@easter.org. Notice that neither item appears? It just goes to show how meaningless the To: header really is!

Check Point

Tracking down forged e-mail is fairly easy as the above examples have illustrated and you can now feel good about yourself. Now before you go opening any beer I have to tell you about a couple of things. First of all, these examples have been made easy.
Tracking down all forged e-mail will not be this easy. The following sections will complicate everything so make sure you understand the above examples before going any further.

identd aka auth

Unix systems typically run a service called identd. identd allows programs to identify which users are running which processes. For example identd can allow sendmail to determine which user is sending the e-mail message. In the previous examples the system was running identd so the user sending the e-mail message could be identified. In the next example the identd service has been disabled.

In the next example identify the site where the message came from and the user who sent it:

    From nobody@nowhere.com Tue Apr 1 23:51:56 1997
    Return-Path: nobody@nowhere.com
    Received: from nowhere.com (localhost [127.0.0.1])
        by hornet.tbsys.ca (8.6.12/8.6.9) with SMTP id XAA00472 for tburgess;
        Tue, 1 Apr 1997 23:51:36 -0500
    Date: Tue, 1 Apr 1997 23:51:36 -0500
    From: nobody@nowhere.com
    Message-Id: <199704020451.XAA00472@hornet.tbsys.ca>
    Apparently-To: tburgess@hornet.tbsys.ca
    Status: RO
    X-Status:

    this is a message this no user

The site was localhost. As far as we know there is no user. It was a trick question. This can show you what happens when the system is not running identd. As this example illustrates sometimes you will have to be content with simply identifying the site where the message came from.

Sendmail Logs

When all else fails there are always the sendmail logs. These will answer all our questions:

    Apr 1 23:51:56 hornet sendmail[472]: XAA00472: from=nobody@nowhere.com, size=31, class=0, pri=30031, nrcpts=1, msgid=<199704020451.XAA00472@hornet.tbsys.ca>, proto=SMTP, relay=localhost [127.0.0.1]
    Apr 1 23:51:56 hornet sendmail[473]: XAA00472: to=tburgess, delay=00:00:20, mailer=local, stat=Sent

Unfortunately we have run into a problem. The sendmail logs have not helped us in identifying who sent the e-mail message.
This again can illustrate what happens when the system sending the mail does not support identd.

Intermediate Mail Servers

This is a technique that a lot of spammers use to conceal their identity. Before showing you all the headers as I have before, I am going to explain how it works.

Suppose you are on machine A and want to send an e-mail message to your friend on machine B. What will happen is your mailserver will contact your friend's mailserver. The path the e-mail message will take will look something like this:

    +---+       +---+
    + A +------>+ B +
    +---+       +---+

Now it is possible to relay/bounce your message off an intermediate mailserver. There are several legitimate reasons for doing this. There are also several illegitimate reasons for doing this. Spammers frequently do it because they are convinced nobody will know where they are sending it from. With our knowledge of SMTP we can hunt them down. There are a couple of intermediate mailservers on the net that do conceal the spammer's real address but very few spammers seem to use them.

Getting back to our previous example, let's introduce an intermediate mailserver called I. Then the path your e-mail will take will look like this:

    +---+       +---+        +---+
    + A +------>+ I +------->+ B +
    +---+       +---+        +---+

We will move on to our last piece of forged e-mail. This last piece of e-mail was not done by myself but was sent to me by an actual spammer. Be advised that this message is for a stealth bulk e-mailer made by the country's top programmers.

    From jerry@nowhere.com Wed Apr 2 21:13:04 1997
    Received: from watagashi.zzzzzzzzzzz.zzz (watagashi.zzzzzzzzzzz.zzz [10.168.192.43])
        by ccshst06.cs.uoguelph.ca with ESMTP (8.7.5/8.7.3) id OAA20088 for ;
        Wed, 2 Apr 1997 14:35:28 -0500 (EST)
    From: jerry@nowhere.com
    Received: from zzzzzzzzzzz.zzz (Cust76.Max7.Los-Angeles.xx.xxxxx.xxx [10.168.73.204])
        by watagashi.xxxxxxxxxxx.xxx (8.7.5+2.6W/3.5W) with SMTP id DAA06068;
        Thu, 3 Apr 1997 03:58:21 +0900 (JST)
    Received: from mailhost.nowhere.com (alt1.nowhere.com (206.1.562.999))
        by nowhere.com (8.8.5/8.6.5) with SMTP id GAA00597 for ;
        Wed, 02 Apr 1997 10:18:14 -0600 (EST)
    To: jerry@nowhere.com
    Message-ID: <144523806421342786@nowhere.com>
    Date: Wed, 02 Apr 97 10:18:14 EST
    Subject: How To E-Mail Up To A Million Messages Per Hour--No Kidding
    Reply-To: jerry@nowhere.com
    X-PMFLAGS: 34078848 0
    X-UIDL: 3671313288a65eb1890m0762123a
    Comments: Authenticated sender is

    [e-mail content deleted]

As you may have noticed all those headers are quite intimidating and quite confusing. The first thing we are going to do is ignore all the headers except the Received: headers. The rest will only serve to confuse you.

The first thing I wish to draw to your attention is the faked Received: header at the bottom of the list. How do we know it is faked? Notice that the for clause in it points to an e-mail address that is not mine. The more perceptive of you will also have spotted the bogus IP address in the line (no part of a real IP address can be larger than 255). The bogus IP illustrates my assertion that spammers do not understand Internet technology.

The Received: header above the faked one tells us where the message actually came from. In this case the message came from Cust76.Max7.Los-Angeles.xx.xxxxx.xxx (I have intentionally blanked out part of the real address). The spammer used the site watagashi.zzzzzzzzzzz.zzz (again not the real address) as the intermediate mailserver (found in the first Received: header).
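
The bogus-address check described above is easy to automate. The following Python sketch is one possible way to do it, not a complete forgery detector: it pulls any dotted address written in brackets or parentheses out of a Received: header and asks Python's standard ipaddress module whether it is a legal address. The function name suspicious_addresses is made up for this example.

```python
import ipaddress
import re

# A dotted quad wrapped in [] or (), as seen in the Received: headers above.
ADDR_RE = re.compile(r"[\[(](\d{1,3}(?:\.\d{1,3}){3})[\])]")


def suspicious_addresses(received_value):
    """Return the addresses in a Received: header that cannot be real."""
    bad = []
    for candidate in ADDR_RE.findall(received_value):
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            # For example, a part larger than 255.
            bad.append(candidate)
    return bad


if __name__ == "__main__":
    faked = ("from mailhost.nowhere.com (alt1.nowhere.com (206.1.562.999)) "
             "by nowhere.com (8.8.5/8.6.5) with SMTP id GAA00597 for ; "
             "Wed, 02 Apr 1997 10:18:14 -0600 (EST)")
    print(suspicious_addresses(faked))
```

A header that passes this check is not proven genuine. The check only catches forgers careless enough to invent an impossible address.
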

The Received: line at the top of the message indicates who the message was really addressed to (me) and that an intermediate mailserver was used to deliver it. It has been my experience that spammers will try to include several faked Received: lines in their messages to conceal their identity but as I have shown you the only people they will confuse are themselves.

Conclusions

Several forms of e-mail forgery have been illustrated, from the crudest to the most complex, along with how to identify where each was sent from. Now we move on to the next portion of the hunt: forwarding the e-mail to the right people.

Finding out Who Done It

So you want to bust the spammer in a big way, eh? There are several things you might do. I will start with the simplest.

Postmaster

Every mailserver according to the rules of the Internet must support a postmaster account. Postmaster should be somebody who takes care of e-mail problems. The easiest thing to do is take the offending piece of e-mail and forward it to postmaster@site.com (where site.com is the offending domain the e-mail message originated from).

Typically, if the spammer sent out a large quantity of bulk e-mail then the postmaster account will get flooded with a large number of complaints from the Internet community. Most people do not like to have their mailboxes flooded with complaints so this should motivate them to do something. The way the postmaster will handle the situation is to check the mail and system logs to find the person responsible. What the postmaster will then do depends on the rules of the site.

whois Database

If you want to make sure your complaint is seen by somebody important you can use the whois database. Most Unix systems include the whois program so all you have to say is whois site.com. If your site does not have a whois program you can telnet to rs.internic.net and talk to the whois database directly.
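
For readers without a whois program, talking to the database directly can also be sketched in Python. The sketch below assumes the usual whois convention of a plain-text query sent over TCP port 43, with the server closing the connection when it has finished answering. The function names are made up for this example, and the server name must be supplied by the caller (the text above names rs.internic.net; whether a given server answers is not something this sketch can promise).

```python
import socket


def build_query(domain):
    """A whois query is just the name followed by a carriage return and line feed."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois(domain, server, port=43, timeout=10.0):
    """Send one query to a whois server and return everything it sends back."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_query(domain))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical usage; replace both names with real ones.
    print(whois("site.com", "rs.internic.net"))
```
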

Typically, the whois database will contain the e-mail addresses and telephone numbers of important people at the site you are interested in and might want to contact. I will not cover the ins and outs of the whois database but I invite you to explore it for yourself.

Finding the Person Yourself

There are a few things you can try to see if you can find the person yourself.

Finger

Finger is a program that can be used to identify users and verify their existence. Many operating systems do not support finger and many of those which do are disabling the service because of security concerns. You will probably only have marginal success using finger.

SMTP

As mentioned earlier, SMTP (Simple Mail Transfer Protocol) is the protocol which permits e-mail to be exchanged across the Internet. SMTP provides two useful commands called vrfy (VeRiFY) and expn (EXPaNd). Verify allows you to confirm that the e-mail address is valid and expand allows you to see where the e-mail is forwarded to.

Using them is fairly straightforward. Open a telnet session to port 25 on the mailserver (i.e. telnet mailserver.host.com 25). Issue the commands vrfy userid and expn userid and see if they return anything. If you get something then you have probably identified the person. Expand is the most useful command for identifying people.

A couple of words of caution though: the person you have found may never have sent the message so do not beat them up too badly. As well, several sites have disabled the verify and expand commands so you may find that they do not work.

You might also find that system administrators frown upon people opening telnet sessions on port 25. There are other darker and more sinister reasons why people telnet to port 25 that I will not cover but be advised you might come under suspicion.

Identifying Mailservers

Sometimes it is not obvious where the mailserver is for a site. If you need to find the mailserver use the nslookup utility which is on most Unix and NT systems.

1.
Execute the command nslookup at a shell prompt (Unix) or the Command Prompt (NT)
2. Issue the command set type=MX
3. Enter the domain you want to find the mailserver for

Hopefully, you will get something back. In the case of multiple mailservers choose the one with the lowest preference number, since that is the one mail is delivered to first. If you get nothing back then you will have to play with nslookup to get an answer. A background in DNS will help (which this paper will not give you) and more advanced knowledge of nslookup will assist you in finding the mailserver.

Concluding Remarks

As the Internet grows so will the amount of noise we will have to deal with. Spam is only a small part of the noise we will encounter. Unfortunately, this noise is found in our own personal mailboxes. Luckily, technology is being developed to stop spam from entering our mailboxes. Filters are being put on mailservers to stop its delivery, many mailservers no longer allow themselves to be used as intermediate mailservers and several filters have been developed so users can filter their own mail.

Most of us feel helpless about spam but I hope I have given you some ideas on how you can deal with people who insist on invading your own private mailboxes. Remember that knowledge is power.

Todd Burgess

ftp://ftp.internic.net/rfc.

RFC 821, Simple Mail Transfer Protocol by Jonathan B. Postel
RFC 822, Standard for the Format of ARPA Internet Text Messages by David H. Crocker

Last updated December 6, 1997