RFC compliance, spam and clueless administrators
What started as a trickle has become a flood and we now regularly see spams arriving where the originating system (always a compromised third party system on a cable/DSL connection, running a malware proxy service of some kind) has used an address literal in SMTP EHLO.

From RFC2821:

4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO)

These commands are used to identify the SMTP client to the SMTP server. The argument field contains the fully-qualified domain name of the SMTP client if one is available. In situations in which the SMTP client system does not have a meaningful domain name (e.g., when its address is dynamically allocated and no reverse mapping record is available), the client SHOULD send an address literal (see section 4.1.3), optionally followed by information that will help to identify the client system.

[the emphasis is mine]


Address literals are not at all common in real email, even where the source system is on a dynamically allocated address (such systems usually say EHLO meaningless.domain), but they are now common in spam.

Here are example received headers from a recent spam:

Received: from [222.96.60.145]* ([222.96.60.145]) by my.domino.host (Lotus Domino Release 6.5.4) with ESMTP id 2005042706321245-4171 ; Wed, 27 Apr 2005 06:32:12 +0100

Received: from nctta.org (nctta-org.mr.outblaze.com [205.158.62.181]) for <victim>; Wen, 27 Apr 2005 00:32:44 +0000


[that second received header is a crude forgery]

Real systems running on dynamically allocated IPs usually do not use the address literal in EHLO because they can't.

That would require that at the time of sending, the MTA service queried the OS IP stack for the current IP address and composed an EHLO greeting using the result. Domino can't do that. Nor can Exchange. I guess it could be done with SendMail but we must be talking about a very small number of systems run by hobbyists.
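As a rough illustration of what that would involve, here is a minimal Python sketch. It is not how any real MTA does it; the function names `current_ip` and `ehlo_with_literal` and the probe address 192.0.2.1 (a documentation-only address) are assumptions made for the example. The bracketed form, including the `IPv6:` tag, follows the address literal syntax in RFC 2821 section 4.1.3.

```python
import ipaddress
import socket


def current_ip(probe_host: str = "192.0.2.1", probe_port: int = 25) -> str:
    """Ask the OS which local IPv4 address it would use to reach probe_host.

    Connecting a UDP socket sends no packets; it only makes the IP stack
    pick a route and a source address. Behind NAT this returns the private
    address, not the public one.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((probe_host, probe_port))
        return s.getsockname()[0]


def ehlo_with_literal(ip: str) -> str:
    """Compose an EHLO line whose argument is an address literal."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on garbage input
    if addr.version == 6:
        return "EHLO [IPv6:" + str(addr) + "]"
    return "EHLO [" + str(addr) + "]"
```

A sender would call something like `ehlo_with_literal(current_ip())` each time it opens a connection, since the address may have changed since the last message.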

Thus, the presence of [ and ] in EHLO ought to be a reasonable indicator of spam in progress and you can make a mail rule that says:

When HELO contains [ AND HELO contains ] move to database spamtrap.nsf
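For readers filtering outside Domino, the same test can be sketched in Python. This is an illustration only: `rule_matches` mirrors the rule as written above, while `helo_is_address_literal` is a stricter, hypothetical variant that also checks that the bracketed text really is an IP address. Both names are assumptions made for the example.

```python
import ipaddress


def rule_matches(helo: str) -> bool:
    """The rule as stated: HELO contains "[" AND HELO contains "]"."""
    return "[" in helo and "]" in helo


def helo_is_address_literal(helo: str) -> bool:
    """Stricter variant: the first token is a bracketed IPv4 or IPv6 address."""
    tokens = helo.split()
    if not tokens:
        return False
    first = tokens[0]
    if not (first.startswith("[") and first.endswith("]")):
        return False
    inner = first[1:-1]
    # RFC 2821 section 4.1.3 tags IPv6 literals with an "IPv6:" prefix.
    if inner.lower().startswith("ipv6:"):
        inner = inner[5:]
    try:
        ipaddress.ip_address(inner)
    except ValueError:
        return False
    return True
```

The stricter check only looks at the first token because the RFC allows identifying information to follow the literal.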

I have made exactly that rule, and it works reasonably well. But what's this?

So far we have three false positives where benign sending systems have been configured to use an address literal in EHLO, not because the address is dynamically assigned (which is the reason that the RFC allows this exception), but because the administrator of that remote system has deliberately chosen to configure it that way.

All have permanent, static IP assignments and could publish and use PTRs. All should have PTR records for their mail hosts and use those when saying EHLO. All have chosen to eschew PTRs and use address literals in their SMTP installations because...

I'm quite sure I don't know.

Is it a security precaution? I suspect it is a misguided one. But for the life of me I cannot see what possible benefit accrues from bending the RFC in this pointless way.


* Korea. Again.

Category: Spamatomy

Comments:

1. Eric Parsons 27/04/2005 19:28:36
Homepage: http://www.startingblockcomputing.com


"SendMail...a very small number of systems run by hobbyists." I'm thinking you may have opened a can of worms, but maybe not...

Thanks for recently publishing all or most of your rules. It is a great help.

I am seeing a number of malformed received headers that make my RBL Converter up-chuck. The received header has text between the open parenthesis and the open square bracket that my parse function uses to "see" the IP address that gets stored and eventually put into the RBL list. Ex. 1-2-3-4.inaddr.arpa(something[4.3.2.1]) by my.mail.host

Ever see anything in the RFCs regarding the parentheses?




2. Chris Linfoot 27/04/2005 20:39:34


Sendmail - I didn't mean to imply that it is only used by hobbyists. Far from it. As implemented in the mail cores of countless ISPs it invariably is configured on static IPs with correct forward and reverse pointing and says EHLO with a valid primary domain name as the RFC says it should.

I mean that where it is used by a hobbyist on a home system with a dynamically allocated IP it could probably be configured to say EHLO with the appropriate address literal. Not that I have ever seen an example.

Brackets in received lines - may I see a real example?




3. Eric Parsons 27/04/2005 22:41:07
Homepage: http://StartingBlockComputing.com


My humble mistake, kind sir.

It's consistently in the second (probably forged) received header, as in:

Received:from cvra.org (mail.cvra.org [216.26.171.6])
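One way a parser could cope with text between the parenthesis and the bracket is sketched below in Python. It is only an illustration and not the RBL Converter's actual code; the names `_PEER` and `peer_ip` are hypothetical. The pattern skips any run of non-bracket text after "(" and then validates whatever sits inside the square brackets as an IP address.

```python
import ipaddress
import re
from typing import Optional

# "(", then any text that is not a parenthesis or bracket, then "[address]".
_PEER = re.compile(r"\([^()\[\]]*\[([^\]]+)\]")


def peer_ip(received: str) -> Optional[str]:
    """Return the bracketed IP from a Received header, or None if absent."""
    match = _PEER.search(received)
    if match is None:
        return None
    candidate = match.group(1).strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        # Brackets held something that is not an IP address; reject it.
        return None
```

Rejecting anything that does not validate as an address keeps forged or garbled headers from putting junk into the list.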

Each example that I have also follows the all too common FName, MI, LastName where all are apparently random words or letters such as Mooney L. Obsessives <shrubbery@cvra.org>

My only guess as to why it chews up RBL is that from time to time, we find the body of the message contains mail headers that are either forged or from a different message.

The RBL Engine is being reworked now.




4. Chris Linfoot 28/04/2005 09:17:52


This rule (HELO contains [ and ]) is trapping a lot of child porn at the moment and while some of these match on some other rule (typically X-Mailer = The Bat or one of the traps based on Korean address prefixes), many more match only on this rule.

Frankly worth the odd false positive to keep this stuff out of my users' faces.




5. Richard Schwartz 22/07/2005 19:58:58
Homepage: http://www.rhs.com/poweroftheschwartz


I've just found false positives on this from one of IBM's servers.

-rich




6. Chris Linfoot 25/07/2005 08:28:19


Oh my!

Two samples trapped here so far today - both child porn. I reckon this rule is worth keeping if occasionally tweaked to avoid false positives.




7. Richard Schwartz 25/07/2005 14:44:39
Homepage: http://www.rhs.com/poweroftheschwartz


Hmmm... since disabling the rule, I've had one get through. Not porn, though. Just noticed, though, that there's quite a lot. BTW: while reviewing these, I notice a lot of messages that claim to be from "localhost". A review of archives shows that there have always been some of these, but the numbers from the past few days are dramatically up compared to the past. This seems like it's a high-probability indicator. A quick but not exhaustive search of my own inbox shows no good mails from localhost. I don't have a rule set up for that. Do you?

-rich




8. Richard Schwartz 25/07/2005 14:55:18
Homepage: http://www.rhs.com/poweroftheschwartz


A more exhaustive search indicates that I would have gotten only two false positives from a rule that hits on "localhost" in the past year -- and both came from a "legitimate" bulk-emailer -- a PR agency hired by an organization I belong to. I wouldn't have minded missing those.

Also, apart from these messages from IBM, I would not have gotten a single false positive from the rule based on "[" and "]" in at least the past year... So I think tweaking is definitely the way to go.




9. Chris Linfoot 25/07/2005 15:26:40


Yes, we filter Received: from localhost all the time.

Couple of false positives in the past year but many hundreds if not thousands of spam kills and the false positives were borderline anyway.



