Of course we didn't notice this immediately, but only after another user had done the same thing, this time adding a fairly substantial attachment to his doomed email.
So, by the time we did notice, the poor old router was consuming more than half of the available CPU just trying fruitlessly to deliver two undeliverable emails.
Stopped the router. Disabled "hold undeliverable mail". Restarted router. Everything back to normal.
That's what you get for trying to be responsible.
But what could have caused this errant behaviour? I do not know for sure but suspect it may be something to do with the virus scan, which also uses the "held" flag while scanning emails.
It seems a race may be happening: the virus scanner believes it has marked a message held, and so does the router. When the virus scanner completes and finds no virus, it releases for onward delivery the message it thinks it alone caused to be held.
There must be a better way.
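To make the suspected loop concrete, here is a toy model in Python. It is only a sketch of my guess at what is happening, not Domino or Trend Micro code: the Message class, the single shared held flag and the simulate function are all invented for illustration.

```python
# Toy model of the suspected race. Nothing here is a real Domino or AV API;
# every name is hypothetical.

class Message:
    def __init__(self):
        self.held = False  # one shared flag, with no record of who set it


def simulate(cycles):
    """Count how often the router retries one undeliverable message."""
    msg = Message()
    attempts = 0
    for _ in range(cycles):
        if not msg.held:
            attempts += 1    # router tries delivery and fails
            msg.held = True  # "hold undeliverable mail" parks the message
        # The scanner sees a held message, finds no virus, and releases it,
        # assuming that it alone could have set the flag.
        msg.held = False
    return attempts
```

With a flag shared like this, simulate(3) makes three delivery attempts: the message never stays held, so the router keeps retrying it.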
Category: Domino: Administration
1. Richard Schwartz 16/05/2005 20:54:05
Homepage: http://smokey.rhs.com/web/blog/rhs.nsf
If you have a spare server lying around, you could set the smarthost option to point to it, make sure that the antivirus isn't running on it, and configure hold undeliverable on that server.
-rich
2. Chris Linfoot 17/05/2005 09:51:31
A mail server with no AV running on it.
Yup. That's going to happen.
NOT
3. Nathan T. Freeman 17/05/2005 10:01:37
Why don't you separate your inbound & outbound hosts? Then you can hold undeliverable on the inbound, but not on the outbound?
Rich is not totally insane in suggesting, then, that the OUTBOUND server doesn't necessarily need AV on it.
4. Chris Linfoot 17/05/2005 11:29:24
Yes I thought that until I stepped through this in detail. Nearly went so far as to implement it which would not be difficult here as we already have all required components. I stopped short for the reasons below but was too lazy to write up why, hence my earlier somewhat terse reply to Rich's post.
Sorry Rich. Here's why I won't be implementing that plan...
Objective: Stop backscatter bounces where an inbound email has found several matches on the Domino Directory.
Environment: Separate hosts for inbound and outbound. AV not running on outbound. Hold undeliverable enabled on outbound.
Let's walk through it.
1. Inbound server receives email that will later bounce due to non-unique or ambiguous recipient.
2. Scans it for malware. None found. Passes to router for delivery.
3. Router finds ambiguous directory entries and generates an NDR. At this stage the email is "undeliverable" but it is still in mail.box on the inbound server which is, remember, running AV and has hold undeliverable disabled, so...
4. Router passes NDR to outbound server for delivery.
5. Outbound server has hold undeliverable enabled, but let's look at how that works.
Does it hold based on form, MIME type or sender envelope (null in the case of an SMTP bounce)? No. It holds based on its own experience of attempting delivery. Outbound server will thus attempt, once, to deliver the NDR. That may fail if the spoofed sender of the original email does not exist, but it could equally well succeed. It is only on failure that it will go held (being an NDR it will in fact, per the RFC, go dead).
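The walk-through above boils down to one decision on the outbound server, which can be sketched in Python. This is a toy model of the behaviour as described in step 5, not the router's real logic; the function name and its two parameters are made up for the example.

```python
# Hypothetical model of the outbound server's "hold undeliverable" decision.

def outbound_handle(message_is_ndr, recipient_exists):
    """Return the final state of a message handed to the outbound server."""
    # The server does not inspect form, MIME type or envelope sender.
    # It simply attempts delivery once and reacts to the result.
    if recipient_exists:
        return "delivered"  # for an NDR this is the backscatter going out
    # Only a failed attempt parks the message; a failed NDR goes dead.
    return "dead" if message_is_ndr else "held"
```

So outbound_handle(True, True) comes back as "delivered": the bounce to a spoofed but existing sender still leaves the building, which is exactly what the plan was meant to stop.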
Sorry chaps. Great idea but it doesn't work, and in any case it is a kludge to work around a flaw in AV software (perhaps only one vendor's AV software).
The flaw is this: the AV software author has assumed that only the AV scanner will ever flag messages held, and that the AV scanner therefore has carte blanche to release held messages. This is clearly untrue.
It should work by a) flagging messages as held as now and b) writing an additional field identifying itself as the holding party. When it comes to release, it should only act on messages that it caused to be held in the first place.
This, sadly, is not how it works here with Trend Micro SMD3.
Anyone using any other AV software having similar difficulties with held mail in mail.box?
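The two-part fix suggested in this comment (flag as held, plus a field naming the holding party) can be sketched in a few lines of Python. It is a minimal illustration only: held_by is a hypothetical field name, and the hold and release helpers are invented here, not taken from any AV product or from Domino.

```python
# Sketch of an ownership-tagged hold. All names are hypothetical.

class Message:
    def __init__(self):
        self.held = False
        self.held_by = None  # extra field identifying who set the hold


def hold(msg, owner):
    """Mark the message held and record the holder, unless already held."""
    if not msg.held:
        msg.held = True
        msg.held_by = owner


def release(msg, owner):
    """Release only if this owner placed the hold. Returns True on release."""
    if msg.held and msg.held_by == owner:
        msg.held = False
        msg.held_by = None
        return True
    return False
```

With this in place, a message held by "router" survives release(msg, "scanner") untouched, and only release(msg, "router") frees it.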
5. Richard Schwartz 17/05/2005 12:51:23
Homepage: http://smokey.rhs.com/web/blog/PowerOfTheSchwartz.nsf
You missed a part. Or I missed a part. If you set the smarthost to point to the outbound server, shouldn't the ambiguous address lookup on the inbound server result in the message itself being routed to the smarthost server? Not an NDR, because the message isn't undeliverable (yet).
Or does smarthost routing only kick in in the event of a total failure of address lookup, rather than when an ambiguous result is found? If so, then my idea is worthless.
Also (... feeling a bit senile here...) wasn't there a way to require full matching on inbound messages, so that last- or first-name-only addresses aren't accepted by the server? Or am I just getting confused by the option that restricts authentication for partial matches?
-rich
6. Chris Linfoot 17/05/2005 13:10:28
1. From the Domino Admin help -
"A smart host is a directory server to which SMTP-routed messages are sent when the message recipient cannot be found in the Domino Directory or other secondary directories configured on the server." [my emphasis]
That is, the failure has already happened when the smart host is invoked.
2. Yes, you can restrict what inbound addressing works so that e.g. clinfoot@mydomain works where linfoot@mydomain does not (despite the fact that the latter is a unique match on last name). However, we deliberately leave this turned off because this type of mail routing is very frequently called upon by real users.
What I mean - It is very common for an external sender who knows a user here to send to first.last@mydomain, or first_last@mydomain, or first@mydomain, or last@mydomain rather than flast@mydomain which is the only one which should work. They do this usually because their own naming convention on their local email system works that way and because they fail to refer to addresses clearly printed on business cards but rather choose to guess.
The administrative overhead involved in dealing with the consequent bounces of improperly addressed email is far greater than the potential benefit of requiring a rigidly defined email address in the first place.
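For anyone who wants the trade-off spelled out, here is a toy Python model of strict versus lenient inbound matching. It is a sketch under assumptions, not how Domino actually resolves names: the two directory entries are hypothetical, and the lenient rules are just the address shapes listed above.

```python
# Toy model of inbound address matching. Directory contents are made up.

DIRECTORY = [
    {"first": "chris", "last": "linfoot"},
    {"first": "chris", "last": "example"},  # hypothetical second Chris
]


def short_name(person):
    """The one 'official' form: first initial plus last name (flast)."""
    return person["first"][0] + person["last"]


def lenient_names(person):
    """Every local part that guessing senders tend to try."""
    f, l = person["first"], person["last"]
    return {short_name(person), f"{f}.{l}", f"{f}_{l}", f, l}


def outcome(local_part, strict=False):
    """Deliver on exactly one match; bounce on none or on several."""
    local_part = local_part.lower()
    matches = [
        p for p in DIRECTORY
        if local_part in ({short_name(p)} if strict else lenient_names(p))
    ]
    return "deliver" if len(matches) == 1 else "bounce"
```

In this model outcome("linfoot") delivers while outcome("linfoot", strict=True) bounces, and outcome("chris") bounces in lenient mode because it matches two people, which is the ambiguous case that generates the backscatter in the first place.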
7. Richard Schwartz 17/05/2005 13:38:24
Homepage: http://smokey.rhs.com/web/blog/PowerOfTheSchwartz.nsf
re 1. The failure has happened, but the NDR is not generated. It's a lookup failure, not a delivery failure. The heritage of the word "smarthost" comes from Unix/sendmail configs, and the semantics are essentially "if a valid recipient isn't found in the local directory here, then relay it to the smarthost server so that it can check its own more extensive directory and make another delivery attempt".
re 2. Figured, but I just thought I'd mention it.
-rich
8. Chris Linfoot 17/05/2005 13:44:29
Rich, FWIW I think you may well have a point with that whole smart host plan. I do not plan to try it though for the reasons I have already explored.
This would simply be a kludge to mitigate the effects of an architectural flaw in AV.
What I now need to know is whether other AV software misbehaves this way or if it is just a Trend SMD thing? If the latter I would consider changing to a less broken AV system...
9. Richard Schwartz 17/05/2005 14:14:55
Homepage: http://smokey.rhs.com/web/blog/PowerOfTheSchwartz.nsf
Good point. A kludge it is. I had thought I remembered this coming up with respect to other virus scanners, and the proper solution is to remove the message from mail.box during AV processing rather than signaling that it is on hold. But that's another very distant memory -- possibly completely wrong.
-rich
10. Richard Schwartz 17/05/2005 14:16:03
Homepage: http://smokey.rhs.com/web/blog/PowerOfTheSchwartz.nsf
OMG! "A kludge it is". I've devolved into Yoda-speak!
I think I need more caffeine.
-rich
11. Scott Iver 17/05/2005 21:55:02
Chris,
Though many out there do not like McAfee for various reasons, I for one am a fan of their Domino AV product, Group Shield. At least in their AV implementation, the AV scanner marks the mail "dead - waiting for virus scan". By marking the mail DEAD instead of held, I was able to "Hold Undeliverable Mail" at my server without experiencing the problems you describe. Perhaps Trend will take a hint from this and change over to using the "dead" mail state for AV scanning, instead of the "held" state?
12. Chris Linfoot 17/05/2005 22:18:37
Excellent suggestion.
Trend - the ball's in your court.
13. tony roth 21/09/2005 18:05:58
In talking to McAfee, they state that "nai - waiting for virus scan" means the scan is actually done (poor wording in their case); it's just that the next process won't accept the routing of the mail file!