Re: [Savannah-help-public] [gnu.org #705563] many messages missing from
From: Karl Berry via RT
Subject: Re: [Savannah-help-public] [gnu.org #705563] many messages missing from mail archives
Date: Mon, 22 Aug 2011 17:19:00 -0400
Hi Bernie,
    It took me a while because the logs for 20110810 had already been
    rotated, but I finally figured out what happened:
Thanks for delving into it so quickly.
    So mailman *never* saw the message
Ah, that explains it. And I saw the message myself because savannah
sent it to me directly (because I had commented on the sv ticket
earlier), not through mailman. That makes sense.
    I'm not sure how we could reduce the amount of miscategorized posts.
Well, here are some ideas (actually, strong suggestions :) for the
spamassassin settings:
* 3.3 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL
* [120.62.160.64 listed in zen.spamhaus.org]
* 0.8 RCVD_IN_SORBS_WEB RBL: SORBS: sender is an abusable web server
* [120.62.160.64 listed in dnsbl.sorbs.net]
* 1.4 RCVD_IN_BRBL_LASTEXT RBL: RCVD_IN_BRBL_LASTEXT
* [120.62.160.64 listed in bb.barracudacentral.org]
Because multiple RBLs are counted, together they have a disproportionate effect.
Also, the 3.3 for PBL seems especially exaggerated. Savannah, my own
servers, and many others have been the victim of incorrect blacklisting
many times. I strongly think their contribution to the overall score
should be reduced.
Furthermore, individuals (Shailesh being a case in point) generally
cannot control whether their server is blacklisted and probably don't
even know it, until mail is lost, wasting everyone's time.
* -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
* [score: 0.0000]
This by itself should have been enough to overcome the blacklisting,
seems to me, since it means the msg is almost certainly not spam. Can
this be given greater weight? The next one down (BAYES_05?) is likely
worth increasing too.
* 0.6 HS_INDEX_PARAM URI: Link contains a common tracker pattern.
Every reply to a tracker is weighted toward spam? That does not seem a
reason to add to the spamicity. Nearly all tracker comments are real,
because the trackers themselves (like savannah's) already try hard to
avoid spam.
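To make the arithmetic concrete, here is a minimal sketch that adds up only the rule scores quoted above. The three RBL hits alone come to 5.5, which is above SpamAssassin's stock default required_score of 5.0 (the threshold actually configured on the list server is not stated in this thread, and the full report may contain rules not quoted here). The `proposed` values are purely hypothetical numbers chosen for illustration, not settings anyone has agreed on.

```python
# Rule scores exactly as quoted from the SpamAssassin report above.
quoted_scores = {
    "RCVD_IN_PBL": 3.3,
    "RCVD_IN_SORBS_WEB": 0.8,
    "RCVD_IN_BRBL_LASTEXT": 1.4,
    "BAYES_00": -1.9,
    "HS_INDEX_PARAM": 0.6,
}

RBL_RULES = ("RCVD_IN_PBL", "RCVD_IN_SORBS_WEB", "RCVD_IN_BRBL_LASTEXT")


def total(scores):
    """Sum of all rule scores, rounded to one decimal like the report."""
    return round(sum(scores.values()), 1)


def rbl_total(scores):
    """Sum of the blacklist (RBL) rule scores only."""
    return round(sum(scores[rule] for rule in RBL_RULES), 1)


# Hypothetical rebalancing (assumed values, for illustration only):
# a smaller PBL penalty, a stronger Bayes credit, no tracker-link penalty.
proposed = dict(quoted_scores, RCVD_IN_PBL=1.0, BAYES_00=-3.0, HS_INDEX_PARAM=0.0)

print("RBL rules only:", rbl_total(quoted_scores))
print("Quoted rules, current scores:", total(quoted_scores))
print("Quoted rules, hypothetical scores:", total(proposed))
```

With the quoted scores, the Bayes credit of -1.9 cancels barely a third of the blacklist penalty, which is the imbalance described above.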
    by SpamAssassin currently go to quarantine maildirs that nobody ever
    looks at. (I'm not suggesting that someone should, it would require a
Well, I and others would look at them. Where is the quarantine? Can we
have access, if we don't already?
That is, we certainly would not look at all of them -- just the ones where
the score was on the edge. Then there would be a chance to revive
messages wrongly considered spam.
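As a rough sketch of what "on the edge" could mean in practice: the function below reads the score and threshold from a message's X-Spam-Status header value and reports whether a flagged message landed within some margin of the threshold. It assumes the quarantined messages still carry that header in SpamAssassin's usual `score=... required=...` form; the function name `is_borderline`, the 1.5-point margin, and the sample header values are all assumptions made up for illustration.

```python
import re

# Extracts the score/required pair from an X-Spam-Status header value,
# e.g. "Yes, score=5.3 required=5.0 tests=..."
STATUS_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)")


def is_borderline(status_header, margin=1.5):
    """Return True if the message reached the spam threshold but
    exceeded it by no more than `margin` points."""
    match = STATUS_RE.search(status_header)
    if match is None:
        return False  # header missing or in an unexpected format
    score = float(match.group(1))
    required = float(match.group(2))
    return required <= score <= required + margin


# Made-up sample header values, for illustration only.
samples = [
    "Yes, score=5.3 required=5.0 tests=RCVD_IN_PBL,BAYES_00",
    "Yes, score=14.2 required=5.0 tests=RCVD_IN_PBL",
    "No, score=4.2 required=5.0 tests=BAYES_00",
]
for header in samples:
    print(is_borderline(header), header)
```

The quarantine maildirs could be walked with Python's standard `mailbox.Maildir` class, passing each message's X-Spam-Status value to this function, so that only the borderline cases need a human look.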
If we saw the full SA configuration, we could compare against what we do
for listhelper and perhaps improve both.
Thanks,
karl