Populating your whitelist using Pareto analysis
OK. Now that we have a tool to measure mail volumes coming from external SMTP hosts, how do we use it?

I have so far collected roughly two weeks' worth of data, which is enough for illustrative purposes. In fact, the shape of the Pareto curve has been pretty stable since the first few days of data were collected.

First then, a summary of what we find here:

Message stats
Emails received: 8,482

Network stats
Networks (/24) connecting: 18,411 (100.00%)
Networks leaving no message: 16,705 (90.73% of all connecting networks)
Networks leaving at least one message: 1,706 (9.27% of all connecting networks)

Pareto - networks delivering most mail
80% of email delivered by 474 networks (2.57% of all connecting networks; 27.78% of networks which delivered at least one message)
37% of email delivered by 50 networks (0.27% of all connecting networks; 2.93% of networks which delivered at least one message)


We see here that

  • over 90% of connecting networks here never leave any message (most of these connect once and are never seen again)
  • 80% of all email received here in the past two weeks was delivered by just 474 different networks, less than 3% of all connecting networks and c. 28% of those networks which left at least one message
  • 37% of all email received here in the past two weeks was delivered by just 50 networks
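
For anyone wanting to reproduce this kind of summary from their own logs, the arithmetic can be sketched in a few lines of Python. This is only an illustration and not the tool used to collect the figures above: the sample addresses and counts are made up, and the names network_of and networks_for_share are hypothetical.

```python
from collections import Counter
from ipaddress import ip_network


def network_of(ip: str) -> str:
    """Return the /24 network that contains an IPv4 address."""
    return str(ip_network(f"{ip}/24", strict=False))


def networks_for_share(messages: Counter, percent: int) -> int:
    """Count the busiest networks needed to reach `percent` of all mail."""
    total = sum(messages.values())
    running = 0
    for rank, (_network, count) in enumerate(messages.most_common(), start=1):
        running += count
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if total and running * 100 >= percent * total:
            return rank
    return 0


# Made-up sample data: (connecting IP, messages accepted on that connection).
connections = [
    ("192.0.2.10", 40),
    ("192.0.2.77", 20),   # same /24 as the line above
    ("198.51.100.5", 25),
    ("203.0.113.9", 10),
    ("10.1.2.3", 5),
    ("10.9.9.9", 0),      # connected but left no message
]

# Total messages per /24 network; networks with no mail are kept with a 0 count.
messages = Counter()
for ip, delivered in connections:
    messages[network_of(ip)] += delivered

connecting = len(messages)
delivering = sum(1 for count in messages.values() if count > 0)
top = networks_for_share(messages, 80)

print(f"{connecting} networks connected, {delivering} left at least one message")
print(f"80% of email delivered by {top} networks "
      f"({top / connecting:.2%} of all connecting networks, "
      f"{top / delivering:.2%} of networks which delivered)")
```

Grouping by /24 before ranking matters: a sender's mail core usually spreads its traffic across several addresses in the same network, so ranking single IPs would understate its share.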

Pretty classic Pareto. Here's a picture to back it up.

Pareto Graph

Now let's look at who those top 50 are. This is usually pretty obvious from the DNS names of the connecting hosts. Where it's not, you can use a whois tool (Sam Spade, DNS Stuff) to find out.
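
If you would rather script the DNS part than look names up by hand, a reverse lookup is enough for most hosts. The sketch below uses Python's standard socket.gethostbyaddr; the helper name reverse_name and its injectable lookup parameter are my own assumptions, added so the function can be tried without touching the network.

```python
import socket
from typing import Callable, Optional


def reverse_name(ip: str, lookup: Callable = socket.gethostbyaddr) -> Optional[str]:
    """Return the reverse DNS host name for an IP, or None if there is none."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname.lower()


# Hypothetical stand-in for DNS so the example runs offline.
def fake_lookup(ip: str):
    table = {"192.0.2.10": "Mail1.Example.NET"}
    if ip not in table:
        raise socket.herror(1, "Unknown host")
    return table[ip], [], [ip]


print(reverse_name("192.0.2.10", lookup=fake_lookup))
print(reverse_name("192.0.2.99", lookup=fake_lookup))
```

Hosts with no reverse DNS come back as None, and those are the ones worth passing to a whois tool.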

Top 50 sending networks

To protect the anonymity of my senders and for illustrative purposes I have broken them down into six categories:

  1. Business Partner
  2. Customer
  3. eBay
  4. Misc
  5. Trusted Intermediary (e.g. Messagelabs and the mail cores of large ISPs)
  6. Web mail

All except the Misc category are candidates for whitelisting here.

How to use this information in a whitelist

It is sensible to use groups in the Domino Directory to hold both local black and whitelists. To make it easier to administer, we maintain two groups in each category, one for domain or host names and one for networks as defined by IP address.

Create groups in the Domino Directory called, for example, PermittedDNSNames and PermittedNetworks. These are Servers Only groups and might look a little like this:

Permitted Names

Permitted Networks

These are then referred to in the server's configuration document like so:


Whitelist names where possible.

NB: Whitelisting, for example, gmail.com does not mean that your Domino server will automatically accept email where the sender's email address contains gmail.com.

What it does is whitelist email from connecting systems whose resolved DNS name ends in gmail.com, and this can only ever be email that genuinely is from Gmail.
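
The distinction can be illustrated with a small Python sketch. It is not how Domino implements the check internally; host_matches is a hypothetical helper, the host names are invented, and matching on a whole-label boundary (so that notgmail.com does not match gmail.com) is my assumption about what "ends in" should mean.

```python
def host_matches(resolved_name: str, whitelisted: str) -> bool:
    """True if the connecting host's resolved DNS name falls under the entry."""
    name = resolved_name.lower().rstrip(".")
    entry = whitelisted.lower().strip(".")
    # Assumption: match the entry itself or any host name beneath it.
    return name == entry or name.endswith("." + entry)


whitelist = ["gmail.com", "messagelabs.com"]

# Invented examples: (resolved name of connecting host, claimed sender address).
attempts = [
    ("mx1.gmail.com", "alice@gmail.com"),
    ("dsl-host.example.net", "bob@gmail.com"),   # forged sender address
]

for host, sender in attempts:
    # Only the connecting host's name is tested; the sender address is ignored.
    trusted = any(host_matches(host, entry) for entry in whitelist)
    print(f"{sender} via {host}: {'whitelisted' if trusted else 'not whitelisted'}")
```

The second attempt claims a gmail.com sender address but is not whitelisted, because the connecting host's name is the only thing tested.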

When whitelisting addresses, avoid listing single IPs where you can. Senders to be whitelisted typically have network allocations varying in size from /29 (8 IP addresses) to /16 (65,536 IP addresses).
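
Those sizes follow directly from the prefix length, and Python's standard ipaddress module will confirm them. A quick sketch, where addresses_in is a hypothetical helper name and the networks are arbitrary examples:

```python
from ipaddress import ip_network


def addresses_in(cidr: str) -> int:
    """Return how many IP addresses a CIDR block contains."""
    return ip_network(cidr).num_addresses


for cidr in ("192.168.55.0/29", "192.168.55.0/24", "10.0.0.0/16"):
    print(cidr, addresses_in(cidr))
```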

NB: The correct format for IP addresses used this way is for example

[192.168.55.0-15]
[192.168.0-31.*]
[10.0.*.*]

The square brackets are important and note that you can specify a range of numbers or a wildcard as shown.
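
Since whois usually reports allocations in CIDR form, it can help to convert them to this bracket notation mechanically. The Python sketch below does that and also checks an address against a pattern. Both function names are hypothetical, and the matcher reflects my reading of the notation shown above rather than Domino's actual code.

```python
from ipaddress import IPv4Network


def cidr_to_pattern(cidr: str) -> str:
    """Convert a CIDR block such as 192.168.0.0/19 to bracket notation."""
    net = IPv4Network(cidr)  # raises ValueError if host bits are set
    first = net.network_address.packed
    last = net.broadcast_address.packed
    parts = []
    for low, high in zip(first, last):
        if low == high:
            parts.append(str(low))          # octet is fixed
        elif low == 0 and high == 255:
            parts.append("*")               # whole octet is covered
        else:
            parts.append(f"{low}-{high}")   # partial octet becomes a range
    return "[" + ".".join(parts) + "]"


def pattern_matches(pattern: str, ip: str) -> bool:
    """Check an IPv4 address against a bracketed pattern (assumed semantics)."""
    if not (pattern.startswith("[") and pattern.endswith("]")):
        raise ValueError("pattern must be enclosed in square brackets")
    fields = pattern[1:-1].split(".")
    octets = ip.split(".")
    if len(fields) != 4 or len(octets) != 4:
        raise ValueError("expected four dot-separated fields")
    for field, octet in zip(fields, octets):
        if field == "*":
            continue
        low, _, high = field.partition("-")
        # A plain number is treated as a range of one.
        if not int(low) <= int(octet) <= int(high or low):
            return False
    return True


for cidr in ("192.168.55.0/28", "192.168.0.0/19", "10.0.0.0/16"):
    print(cidr, "->", cidr_to_pattern(cidr))
```

The three sample blocks come out as the three patterns listed above, which is a handy sanity check before pasting an entry into the group.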

From my top 50, the whitelists now include:

  • gmail.com
  • ispmail.ntl.com
  • mail.demon.net
  • mail.uk.tiscali.com
  • mail.ukl.yahoo.com
  • mailcontrol.com
  • messagelabs.com
  • moutng.kundenserver.de
  • mx.aol.com
  • outbound.mail.legend.net.uk
  • smtp.bt.com
  • star.net.uk

  • [66.135.197.*] (eBay)
  • [193.252.22.*] (Wanadoo/Freeserve mail core)
  • [194.73.73.*] (BT Connect mail core)
  • [195.188.213.*] (Blueyonder mail core)
  • [212.23.3.*] (Zen Internet mail core)

Collecting and using this data took only a few minutes and the exercise could be repeated for all of the top 474 sending networks within perhaps a single afternoon.

If you repeat this exercise for the top 474 (or whichever number of networks delivers 80% of your email), you will find that you will be able to use much stronger blacklists in future without seriously compromising deliverability of inbound email.

Next time, a simple tip to help weed out false positives.

Category: SnTT

Comments :

1. Peter von Stöckel, 28/04/2006 07:18:56
Homepage: http://www.bananahome.com/


Impressive analysis, Chris! I have to admit that I wasn't aware that groups could be used in these fields. Great suggestion!




2. Rob Kirkland, 07/05/2006 17:00:53
Homepage: http://www.rockteam.com


And I wasn't aware that DNS names and IP addresses could be used in group membership fields!

Chris, I assume you have tested this? I've always understood (and a quick review of Admin Help doesn't suggest otherwise) that the Members field in Group documents only takes Notes user names, server names, and group names.




3. Rob Kirkland, 07/05/2006 17:11:02
Homepage: http://www.rockteam.com


And another thing - I'm surprised to see that you can use Servers Only groups here. I've always thought they were for use only in Connection documents and for console commands.

Chris, you are a fountain of inspiration!




4. Chris Linfoot, 07/05/2006 17:45:56


Well, obviously - I've not only tested both but both are in production here and working flawlessly.



