Wednesday 26 December 2012

Sender Policy Framework

Nearly ten years ago, Meng Weng Wong of pobox.com proposed an anti-spam methodology for SMTP email in which the sending domain publishes a list of its authorised mail sending agents in DNS. What followed was the highly controversial IETF MARID conference that produced four experimental RFCs including RFC 4408 - Sender Policy Framework. That RFC is currently being revised by the SPFBIS working group, which will ratify SPF as an IETF standard.

What the SPF experiment has demonstrated is that SPF will never be a complete solution to the problem of spam. However, it has been highly effective in mitigating spam and reducing the load on anti-spam engines. It is believed that SPF in conjunction with two other anti-spam measures (DKIM & DMARC) should be able to virtually eliminate spam without needing to check the content of email messages.

Part of the reason SPF has not been more widely successful in preventing spam is that most email system administrators misunderstand SPF and implement it either poorly or incorrectly.

SPF is a "sender policy". It allows the sending system to define clearly what is and what is not a legitimate source of email. Before SPF, such checks were done using reverse DNS (rDNS) lookups - which are expensive and simply do not work correctly if the domain is using an external email provider.

One thing the MARID conference did settle once and for all is who has the responsibility for the delivery of email.

The standard rule for email delivery post-MARID is that it is the sending server's responsibility to deliver the email until an SMTP code 250 is received from the target mail server. After that, it is the recipient MTA (Message Transfer Agent) that accepts responsibility to deliver the email.

That may sound really simple; however, there are an enormous number of consequences resulting from this. For starters, it means that email messages can no longer be trashed by anti-spam engines without notification to either the sender or the recipient. More importantly, if the receiving MTA cannot deliver the email after issuing a 250 to the sender, it now MUST issue an NDR (Non-Delivery Report). However, it must do this without risking backscatter (sending a bounce message to a real person whose address was forged as the sender). MTAs that backscatter will quickly find themselves on a global blacklist.

The corollary to this is that the receiving MTA MUST have some way of verifying a legitimate source of email: Enter SPF.

Creating an SPF record for an email domain is quite simple: all you need to do is publish a DNS TXT record similar to the following:

<domain name> TXT "v=spf1 mx a ~all"

The "v=spf1" bit simply identifies the string as an SPF record. What follows is a list of acceptable and unacceptable hosts and the degree to which they are acceptable. Each entry in the list is optionally prefixed by a character which designates the level of acceptance or non-acceptance as follows:

'+'  (or no prefix) PASS
'-' FAIL
'?' NEUTRAL
'~' SOFTFAIL

SPF PASS is essentially a guarantee that the email came from a legitimate mail server for the sending domain and standard spam tests may be bypassed.

SPF FAIL is a statement that the email did not come from a legitimate mail server and may safely be rejected.

SPF NEUTRAL means the sender policy has nothing to say about the email. Treat it as though no SPF existed.

SPF SOFTFAIL means the email probably did not come from a legitimate mail server, but the sender cannot 100% guarantee that it didn't.

There are two other SPF Results: TempError and PermError.  I have made comments on how these results are interpreted in the SPFBIS Working Group.

The above record could have been written as "v=spf1 +mx +a ~all". However, since '+' is assumed when no symbol is given, the '+' is rarely used.
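As a rough illustration of the prefix rules above, the following sketch splits a record into its entries and makes the implied '+' explicit. The function name parseSpfRecord is made up for this example, and the sketch deliberately ignores SPF modifiers such as redirect=, so treat it as a minimal sketch rather than a complete parser.

```javascript
// Map each prefix character to the result it produces when its entry matches.
const QUALIFIERS = { "+": "PASS", "-": "FAIL", "?": "NEUTRAL", "~": "SOFTFAIL" };

// Split an SPF record into its entries, filling in '+' where no prefix is given.
// parseSpfRecord is a hypothetical helper name, not part of any library.
function parseSpfRecord(record) {
  const [version, ...terms] = record.trim().split(/\s+/);
  if (version !== "v=spf1") {
    throw new Error("not an SPF record: " + record);
  }
  return terms.map((term) => {
    const explicit = QUALIFIERS[term[0]] !== undefined;
    const qualifier = explicit ? term[0] : "+";
    const mechanism = explicit ? term.slice(1) : term;
    return { qualifier, mechanism, result: QUALIFIERS[qualifier] };
  });
}
```

Parsing "v=spf1 mx a ~all" and "v=spf1 +mx +a ~all" with this sketch gives the same three entries, which is the equivalence described above.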

The basic acceptable fields used are:

mx - all hosts defined by the mx record for the domain.
a - all hosts with an a record for the domain
ip4: - all hosts within the listed netblock (or ip address)
ip6: - all hosts within the listed ipv6 netblock
include: - include the SPF record for the listed email hosting domain
all - matches everything
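To show what "within the listed netblock" means for the ip4: field, here is a minimal sketch of the comparison. The helper names ipv4ToInt and matchesIp4 are invented for this example, the addresses used are documentation ranges, and a bare address with no prefix length is assumed to mean a single host.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const octets = ip.split(".");
  if (octets.length !== 4 || octets.some((o) => !/^\d{1,3}$/.test(o) || Number(o) > 255)) {
    throw new Error("invalid IPv4 address: " + ip);
  }
  const [a, b, c, d] = octets.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True if `ip` falls inside the netblock named by an ip4: entry,
// e.g. "ip4:192.0.2.0/24". With no "/len" the entry is treated as a /32.
function matchesIp4(mechanism, ip) {
  const [network, prefix = "32"] = mechanism.replace(/^ip4:/, "").split("/");
  const bits = Number(prefix);
  // A shift by 32 is a no-op in JavaScript, so /0 needs its own case.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(network) & mask) >>> 0);
}
```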

So for our listed example, the receiving mail server would take the sending server's IP address and check first to see if it matches the mx record for the sending domain. If it does, then it returns an SPF PASS result; otherwise it tests to see if the address matches an 'a' record for the sending domain, and if that also fails, it returns an SPF SOFTFAIL.
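The walk-through above can be sketched in a few lines. This is only an illustration: evaluateSpf is a made-up name, the dns argument is a hypothetical table of addresses that have already been resolved (a real checker would perform the MX and A lookups itself), and only the mx, a and all fields are handled.

```javascript
// Evaluate a record such as "v=spf1 mx a ~all" for a connecting IP address.
// `dns` is a hypothetical pre-resolved table: { mx: [...ips], a: [...ips] }.
function evaluateSpf(record, senderIp, dns) {
  const results = { "+": "PASS", "-": "FAIL", "?": "NEUTRAL", "~": "SOFTFAIL" };
  const terms = record.trim().split(/\s+/).slice(1); // drop "v=spf1"
  for (const term of terms) {
    const explicit = results[term[0]] !== undefined;
    const qualifier = explicit ? term[0] : "+";
    const mechanism = explicit ? term.slice(1) : term;
    const matched =
      mechanism === "all" ||
      ((mechanism === "mx" || mechanism === "a") &&
        (dns[mechanism] || []).includes(senderIp));
    if (matched) return results[qualifier]; // entries are tried left to right; first match wins
  }
  return "NEUTRAL"; // nothing matched and the record has no "all"
}
```

With mx resolving to 192.0.2.10 and a resolving to 192.0.2.20 (both placeholder addresses), a connection from either address gives PASS and a connection from anywhere else gives SOFTFAIL.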

I see a lot of SPF records that end in '-all'. I regard that as a very bold statement. It basically gives me the right to completely reject email messages with an SMTP code 550 if they fail the SPF check. The sender who publishes such a statement has to be completely SURE that no legitimate email will come from a host other than the ones listed. For the vast majority of organisations, that guarantee is simply impossible to make. The reason is that the sender cannot control email forwarding.

When you forward an email, the FROM address remains the same, as does the SMTP MAIL FROM: line. The MAIL FROM: domain is what SPF evaluates against the connecting MTA. In the case of forwarded mail, the connecting MTA is the forwarder rather than a host listed by the sending domain, so they will not match and SPF will FAIL, resulting in a rejected message. However, now the intermediate server has a problem. It has accepted responsibility to deliver the mail and cannot do so. Under the post-MARID rules it must deliver an NDR; however, with no current connection to the sending MTA, the only way it can do so is to send a DSN (Delivery Status Notification) email and risk backscatter.

There is a solution to this mess called the Sender Rewriting Scheme (SRS). This simply involves the forwarding server rewriting the MAIL FROM: line to an address in its own domain that encapsulates the original sender address, so that the SPF check passes against the forwarder while bounces can still be routed back to the original sender. It's a great idea that suffers from a simple three-part problem:

1) The sender has no control over whether SRS is implemented at the original destination or not.
2) Since SPF is a "sender policy", this means the sender still cannot make definitive statements in their SPF record.
3) To date, very few commercial MTAs implement SRS.

This means we cannot rely on SRS even if we implement SRS ourselves. You cannot know for certain how many of your email recipients implement SRS, nor can you know if they are forwarding email. This means, at best we can publish a "~all" at the end of our SPF records. Anything else risks both the possibility of incorrect non-delivery of an email and unnecessary backscatter.
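To make the rewriting step concrete, here is a simplified sketch of what a forwarder does under SRS. It is loosely modelled on the common "SRS0=hash=timestamp=domain=local" address form, but the hash length, the timestamp value and the function names (srsHash, srsForward, srsReverse) are assumptions made for this example and will not interoperate with a real SRS implementation.

```javascript
const crypto = require("node:crypto");

// Short keyed hash so the forwarder can later recognise addresses it rewrote.
// Truncating to 4 hex characters is an arbitrary choice for this sketch.
function srsHash(secret, parts) {
  return crypto.createHmac("sha1", secret).update(parts.join("=")).digest("hex").slice(0, 4);
}

// Rewrite the envelope sender into the forwarder's own domain, e.g.
// "alice@example.org" becomes "SRS0=<hash>=<stamp>=example.org=alice@forwarder.example".
function srsForward(sender, forwarderDomain, secret, stamp) {
  const at = sender.lastIndexOf("@");
  const local = sender.slice(0, at);
  const domain = sender.slice(at + 1);
  const hash = srsHash(secret, [stamp, domain, local]);
  return `SRS0=${hash}=${stamp}=${domain}=${local}@${forwarderDomain}`;
}

// Recover the original sender from a rewritten address so a bounce can be returned.
function srsReverse(rewritten, secret) {
  const at = rewritten.lastIndexOf("@");
  const [tag, hash, stamp, domain, ...localParts] = rewritten.slice(0, at).split("=");
  const local = localParts.join("="); // the local part may itself contain '='
  if (tag !== "SRS0" || hash !== srsHash(secret, [stamp, domain, local])) {
    throw new Error("not a valid rewritten address");
  }
  return `${local}@${domain}`;
}
```

Because the rewritten MAIL FROM: address is in the forwarder's domain, the final recipient evaluates SPF against the forwarder's record rather than the original sender's.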

At around the same time that SPF was floated, both Yahoo and Cisco came up with the idea of signing email messages using public-key cryptography. Yahoo's was called DomainKeys and Cisco's was called Identified Internet Mail. Both methods were similar, and they were merged into DomainKeys Identified Mail (DKIM), which has become the standard. It works similarly to SPF with the public key published in a DNS TXT record. The good thing about DKIM is that it is not broken by forwarding, since the signature covers the message headers and body rather than depending on the connecting server.

Mailing Lists, however, do break DKIM. This is because mailing lists (by their nature) modify the message, for example by adding subject tags or footers, which invalidates the signature.

This gives us hope, because now we can use both SPF and DKIM to evaluate email. If either method passes, we can now regard the email as legitimate. Only if both fail do we stamp FAIL on the email message and reject it.
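The rule just described is small enough to write down directly. In this sketch the function name combinedVerdict and the three return labels are invented for illustration, and the middle case (neither a pass nor a double failure) is my assumption that the message simply carries on to normal spam filtering.

```javascript
// Accept when either check passes; reject only when both explicitly fail.
function combinedVerdict(spfResult, dkimResult) {
  if (spfResult === "PASS" || dkimResult === "PASS") return "ACCEPT";
  if (spfResult === "FAIL" && dkimResult === "FAIL") return "REJECT";
  return "CONTINUE"; // assumption: hand over to the usual spam tests
}
```

A forwarded message that fails SPF but still carries a valid DKIM signature is accepted under this rule, which is exactly the case SPF alone gets wrong.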

The problem with this idea is that the take-up of DKIM is very low. Few commercial mail servers can support DKIM. Publishing a DKIM record is also more difficult than publishing an SPF record. So, while the take-up rate for SPF is around 40% (accounting for more than 60% of legitimate email), the take-up rate for DKIM is around 5%. So, even if you do publish a DKIM record and stamp all outgoing email with a DKIM signature, chances are that few of the recipient MTAs will do anything about it.

Enter Domain-based Message Authentication, Reporting and Conformance, or DMARC.

A DMARC record is published in DNS like SPF and DKIM. The difference is that DMARC provides a mechanism for receivers to report back to the sending domain on whether its email is being treated as legitimate. In addition, it provides a "fuzzy" method for temporary rejection of non-conformant email similar to grey-listing, but with a more intuitive and sender-determined policy for doing so. With a DMARC record, the sender policy is not just isolated to SPF or DKIM alone. The sender can define a rejection policy based upon a combination of checks and receive reports back on rejected messages. Thus it removes all the guesswork associated with message rejection.
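A DMARC record is a TXT record made up of tag=value pairs separated by semicolons, for example "v=DMARC1; p=quarantine; pct=25; rua=mailto:dmarc-reports@example.com", where p is the requested policy, pct is the percentage of mail it applies to and rua is where aggregate reports are sent. The sketch below reads such a string into an object; the function name parseDmarcRecord and the example address are placeholders, and no validation of individual tag values is attempted.

```javascript
// Read a DMARC record's tag=value pairs into a plain object.
// parseDmarcRecord is a hypothetical helper name for this sketch.
function parseDmarcRecord(record) {
  const tags = {};
  for (const part of record.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip empty segments such as a trailing ';'
    tags[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  if (tags.v !== "DMARC1") {
    throw new Error("not a DMARC record: " + record);
  }
  return tags;
}
```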

Pretty much all the major players are now behind DMARC: Google, Yahoo, Microsoft, AOL, Facebook all currently implement DMARC even before the specification has been ratified by the IETF. See http://www.dmarc.org

In summary, if you administer an email system, you SHOULD be publishing an SPF record at the very least. You should also be implementing some form of SPF checking - but not too aggressively (making sure you whitelist regulars). If possible, look at implementing DKIM with a view to eventually implementing DMARC. As for SRS, if you can implement it, then do so. It will reduce the level of SPF rejections on forwarded email.

I will write another article another time on practical SPF implementations. Oh, and if you are running MailMarshal less than version 7, then upgrade it asap. The SPF implementation is broken. It took me nearly two weeks to convince them of this. Even with the fix they implemented, it still breaks under certain circumstances.



Tuesday 25 December 2012

Autoproxy: How it works, why it sucks and why transparent proxy is so much better

Most sites I visit have some form of proxy in place. Most of the time it is pretty basic, with the proxy details either set up in the SOE or pushed out (to Windows clients) via Group Policy. In a typical scenario there will be two proxies: one requiring authentication and one that doesn't. The unauthenticated proxy is used by devices, IT staff and any user that runs some app that can't handle the authenticated proxy for some reason. The changes (when required) are performed manually in most cases.

Considering the automation available to us, why is this process done manually? All major browsers support the ability to perform autoproxy, yet very few sites implement it.

I'm only going to cover the basics of autoproxy here. This is essentially a 5 minute guide to get it up and running.

Firstly, the Proxy Auto Config (PAC) file is simply an http-delivered script that tells your browser which proxy to use under specific situations. It can be anything from mind-numbingly simple to extremely complex.

At the most basic level, you can type the location of the PAC script into your browser options. Each browser puts that setting in a different place. However, in my opinion, this defeats the purpose of autoproxy in the first place - you may as well just type in the proxy address.

The Web Proxy Auto-Discovery protocol (WPAD) uses either DHCP or DNS to discover the PAC script. The browser will first request option 252 from the DHCP server. This field must be populated in DHCP with the URL of the PAC script. DHCP has the highest priority and if present, DNS will not be used.

This brings us to the first problem: Firefox does not support the DHCP method.

Oh, and one gotcha with the DHCP method: Internet Explorer expects the string to be null terminated. If it is not, it will strip off the last octet for you. Try troubleshooting that one!

If DHCP fails (or is not used) then DNS is used for WPAD. This is simply a DNS lookup for "wpad.domainname". If not found, it walks up the DNS tree until a reference is found or the lookups are exhausted. If an entry is found, it attempts to load a wpad.dat file from the reference. For example, for the local domain "department.branch.internal.net" the successive lookups will be:

http://wpad.department.branch.internal.net/wpad.dat
http://wpad.branch.internal.net/wpad.dat
http://wpad.internal.net/wpad.dat
http://wpad.net/wpad.dat
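The walk up the tree can be reproduced with a short function. This is a sketch of the naive behaviour described here, with wpadCandidates as a made-up name; real browsers differ in exactly where they stop the walk.

```javascript
// List the wpad.dat URLs tried for a client domain, dropping one more
// leading label on each attempt until only the last label remains.
function wpadCandidates(domain) {
  const labels = domain.split(".");
  const urls = [];
  for (let i = 0; i < labels.length; i++) {
    urls.push("http://wpad." + labels.slice(i).join(".") + "/wpad.dat");
  }
  return urls;
}
```

Calling wpadCandidates("department.branch.internal.net") returns the four URLs listed above, in the same order, ending with the wpad.net lookup.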

This leads us to the second problem: security. If a site is not careful, "wpad.net" can resolve externally and a malicious PAC script can be executed on the browser. This is usually the case with notebooks taken off-site. 

The web server that the wpad name resolves to should have a virtual host that redirects requests for wpad.dat to a proxy.pac file. In the case of Apache this is done simply with the following lines in httpd.conf:

Redirect permanent /wpad.dat /proxy.pac

and

AddType application/x-ns-proxy-autoconfig .dat .pac

Finally we are at the proxy.pac script. This script is basically a simplified form of JavaScript, designed to run in browsers, that implements a single function called FindProxyForURL(). There are a limited number of additional built-in functions and you can also write your own. At its simplest, a proxy.pac file will be:


  function FindProxyForURL(url, host)
   {
      return "PROXY proxy.example.com:8080; DIRECT";
   }


For most organisations, this will probably be enough. However, a more complex script may be needed. For example:


   function FindProxyForURL(url, host) {
      if (isInNet(host, "10.0.0.0",  "255.255.248.0"))
      {
         return "PROXY fastproxy.example.com:8080";
      }
      return "PROXY proxy.example.com:8080; DIRECT";
   }


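The above example enables the proxy location to change according to the subnet used: here a different proxy is returned when the requested host is in the 10.0.0.0/21 subnet. To select by the client's own subnet instead, myIpAddress() would be passed to isInNet() in place of host.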

This brings us to our next problem: the isInNet() function can be completely unpredictable on Windows clients if the .NET 2.0 framework is loaded. The myIpAddress() function can also be unpredictable. If you have more than one adapter, the function could return either IP address, or it could even (under certain circumstances) return 127.0.0.1.
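For comparison, the subnet test itself is only a mask-and-compare. The sketch below does it in plain JavaScript for a value that is already a dotted-quad IPv4 literal, with no name resolution involved. The names quadToInt and literalInNet are invented for this example; it is not a replacement for isInNet(), since it returns false for anything that is not an IP literal.

```javascript
// Convert a dotted-quad string to an unsigned 32-bit integer, or null if
// the string is not a valid IPv4 literal (for example, a hostname).
function quadToInt(text) {
  const octets = text.split(".");
  if (octets.length !== 4 || octets.some((o) => !/^\d{1,3}$/.test(o) || Number(o) > 255)) {
    return null;
  }
  const [a, b, c, d] = octets.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// Mask-and-compare on literal addresses only; hostnames never match.
function literalInNet(ip, network, mask) {
  const ipInt = quadToInt(ip);
  const netInt = quadToInt(network);
  const maskInt = quadToInt(mask);
  if (ipInt === null || netInt === null || maskInt === null) return false;
  return ((ipInt & maskInt) >>> 0) === ((netInt & maskInt) >>> 0);
}
```

With the mask from the earlier example, literalInNet("10.0.7.200", "10.0.0.0", "255.255.248.0") is true, while 10.0.8.1 falls just outside the block.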

In fact, the behaviour of your proxy.pac is conditional upon the local environment, including any limitations of the JavaScript engine.

The irony is that in the environments where autoproxy is most useful, it is most likely to be unpredictable and simply not work for many clients. It is also very difficult to troubleshoot.

There are many sites dedicated to enabling you to write the perfect proxy.pac script. There are tips to trap all the vagaries listed plus dozens more. They also detail ways to debug your script. If maintaining a long and complex script that deals with more exceptions than rules is right up your alley then go for it! However, for me this just indicates that autoproxy simply sucks and should be avoided in all but the simplest of circumstances.

Which brings me to the concept of a Transparent Proxy. Implementing this is simplicity itself. All you need to do is set up the following on a spare server with a single network card. It doesn't have to be powerful:

1) CentOS Linux (preferably)
2) Squid Proxy
3) Shorewall firewall
4) Webmin (for administration)

Set the Squid proxy to listen on ports 80 and 443. You can run two instances.

Set up the Shorewall firewall to redirect all non-proxy traffic to the router.

Set up DHCP to set the default route to be the Linux server.

That's basically it! There are variations, but this is the nuts and bolts of it. Because the server is the default route and the proxy is listening on http and https ports, it will proxy transparently.

There is also another (newer) way of doing this using TPROXY which performs the transparent proxy at layer 3. I have never done this, because it looks a lot more complicated but more info is available here.