How to improve spam detection (rspamd configuration)?

FredFF · July 15, 2016, 4:44pm

After a month on YunoHost, and a few minor struggles without which open source would not be open source :grin:, I would like to come back to a really important point: how can spam detection be improved?
The rspamd service is in charge of spam detection, but its results seem far inferior to what Gmail achieves (a benchmark in this area, it has to be said).

How can this detection be “beefed up” in rspamd without compromising the YunoHost configuration?

Another question: does marking emails as spam in a mail client improve rspamd’s detection (learning)? I don’t really get that impression, but perhaps I’m wrong?

Thanks for your answers

FredFF · January 8, 2017, 5:25pm

Is there really nobody on the forum who can explain:

how to customize the spam detection engine included in YunoHost
Because I can only confirm that, as it stands, it lets almost everything through…

A little help, please…

FredFF · January 8, 2017, 6:05pm

For information, I followed this:
https://kaworu.ch/blog/2014/03/25/dovecot-antispam-with-rspamd/

which, for YunoHost, means
sudo nano /usr/share/dovecot/conf.d/20-imap.conf
and adding this
protocol imap {
mail_plugins = $mail_plugins antispam
}

sudo nano /usr/share/dovecot/conf.d/90-plugin.conf
and adding this
plugin {
antispam_backend = pipe
antispam_spam = Junk
antispam_trash = Trash
antispam_mail_sendmail = /usr/local/bin/rspamc
antispam_mail_spam = learn_spam
antispam_mail_notspam = learn_ham
antispam_mail_sendmail_args = -h;localhost:11334;-P;q1
}

But I don’t know whether this is useful or not, nor whether it really works.
Note that “q1” is rspamd’s default password, but did the YunoHost integration keep that password?

jellium · January 10, 2017, 12:18am

I have never received a single spam on my YunoHost-powered mail server, even though my domain name has been well exposed all over the Internet!

tomdereub · August 3, 2017, 11:34pm

I’m digging up this topic because I have the same problem: spam classification is fairly poor. Quite a lot of spam gets through, and legitimate emails end up classified as spam…
And I have no idea what could be checked or configured to improve this.

FredFF · August 4, 2017, 8:13am

Hello tomdereub,
Here are the 2 workarounds I found (unfortunately I did not find how to improve YunoHost, which is very bad at detecting spam, and even worse with all the unsolicited commercial emails that arrive by the dozen):

Use the antispam engine of a mail client (Mail on Android, or Thunderbird for example), which will classify emails as spam when the IMAP folders are synchronized. In that case, take care to configure the IMAP spam folder so that all clients use the same one (this is set in the mail clients, by defining the folders).
Or trust a third-party organization such as loveyouremails, the “free” solution from VadeSecure, which proves very effective but raises a problem of principle: if you self-host, it is not to open your emails to a third party!

If someone could suggest better solutions, especially on the YunoHost server side, that would be much appreciated.

tomdereub · February 4, 2021, 10:57pm

Hi,
I’m digging this topic up again, 3 years later… Since then, my company (25 people) has also moved to YunoHost, so spam detection is a fairly important topic.
From what I understand, there are several levels of spam detection:

Postfix, in its config, uses several blocklists (RBLs) of hosts considered to be spammers. In /etc/postfix/main.cf, there is the following section:

# Requirements for the connecting server
smtpd_client_restrictions =
    permit_mynetworks,
    permit_sasl_authenticated,
    reject_rbl_client bl.spamcop.net,
    reject_rbl_client cbl.abuseat.org,
    reject_rbl_client zen.spamhaus.org,
    permit

Rspamd gives each email a score and, depending on the result, accepts the email, puts it in the spam folder, or rejects it outright. These parameters can be adjusted in /etc/rspamd/local.d/metrics.conf

Learning: there seems to be some learning when you manually put an email in the spam folder, or when you take one out (moving it from the spam folder to the inbox). This learning does not seem to work. Does anyone know how it works or how it is supposed to work? Is it actually active in YunoHost? Or would the config need to be improved for it to be?

Thanks in advance if anyone has a good grasp of this topic.

Aleks · February 5, 2021, 12:44am

In my opinion it is not active in YunoHost by default… I vaguely tried to configure something by adding files grabbed from the internet to improve the results, but it wasn’t great…

Another lead is that the rspamd version in Buster is far behind upstream… it may be better in Bullseye, but that’s not certain.

Generally speaking, this is unfortunately a topic we haven’t dug into much so far.

Limezy · February 5, 2021, 5:21am

Hello @tomdereub, I have already spent quite some time on this topic and managed to improve the results. I’ll try to write up my notes here this weekend. I’m also interested in several of us working together to improve this aspect of YunoHost!

A quick reaction already to the point about Rspamd and /etc/rspamd/local.d/metrics.conf → not exactly: first, you normally should not touch the files in /local.d but create new ones in /override.d. But above all, the only thing that happens at this stage is the creation of an X-Spam: Yes “flag” in the email headers. This has to be paired with a Sieve filter that then puts the email in the spam folder if this flag is present.

tomdereub · February 5, 2021, 3:23pm

@Limezy: how does the override work? Do you create a “metrics.conf” file in override.d with the values you want? That’s so that YunoHost doesn’t overwrite the config during an upgrade, right?
As for rspamd, however, I am sure it rejects emails that exceed a certain score (we raised that threshold because legitimate emails were being rejected). Otherwise it does indeed flag them, and it must be Thunderbird that moves them to the spam folder.

Limezy · February 8, 2021, 9:36am

I haven’t forgotten you, I simply had a few “small” emergencies to deal with this weekend, not to mention a 36-hour blackout without internet (I now live in Burma)

Limezy · February 24, 2021, 9:22am

Hello everyone,

I finally took the time to organize my notes. Here I will present the actions and settings I applied on my instances to get a working anti-spam system.
Careful, not everything is super clean, and it’s at your own risk! The idea is that this could serve as a basis for possible improvements to YunoHost’s default configuration. Sadly, I have neither the knowledge nor the time to take care of that directly!

Credit where credit is due: the vast majority of my configuration is inspired by this tutorial, which is really well done: “Filtering out spam with rspamd” (workaround.org). The only work was understanding how to adapt it to the few specifics of YunoHost. I therefore reuse the structure of that tutorial and point out the small changes needed to adapt it to the particular case of YunoHost.

The tutorial itself is in English so that it can be useful to as many people as possible.

1. Adding headers

Adding headers should be the first step when you try improving your spam filtering with Rspamd. Indeed, without these you are basically “blind” : Rspamd will consider some emails as spam and you will never know why, or some as “ham” (=not spam) and you will never know why.

Adding headers makes Rspamd write the score it gave to each email into the headers. This is invisible in everyday use, but whenever you have a doubt in the future, or during the first weeks while you are fine-tuning your setup, it is definitely helpful for understanding what’s going on under the hood → you just have to read the headers or source of the strangely processed email (how to do this depends on your email client).

Just create a new file in the correct location

sudo nano /etc/rspamd/override.d/milter_headers.conf

With the below content :

extended_spam_headers = true;

You can now restart Rspamd, send a test email to yourself and have a look at the headers

sudo yunohost service restart rspamd

You should see some lines close to the example below :

X-Spamd-Result: default: False [10.50 / 120.00];
MSBL_EBL(7.50)[odgnq78215@yahoo.co.jp,412457ff8e76de187013e526b71b1ce9d1d846f4];
HAS_REPLYTO(0.00)[odgnq78215@yahoo.co.jp];
R_SPF_ALLOW(0.00)[+ip4:82.57.200.0/24];
FREEMAIL_FROM(0.00)[alice.it];
REPLYTO_DN_EQ_FROM_DN(0.00);
TO_DN_NONE(0.00);
HAS_X_PRIO_THREE(0.00)[3];
FROM_EQ_ENVFROM(0.00);
RCVD_TLS_LAST(0.00);
R_DKIM_NA(0.00);
FREEMAIL_ENVFROM(0.00)[alice.it];
INTRODUCTION(2.00);
MID_RHS_MATCH_FROM(0.00);
ASN(0.00)[asn:20580, ipnet:82.57.200.0/21, country:IT];
ARC_NA(0.00);
FAKE_REPLY(1.00);
FROM_HAS_DN(0.00);
TO_MATCH_ENVRCPT_ALL(0.00);
MIME_GOOD(-0.10)[multipart/alternative,text/plain];
REPLYTO_DOM_NEQ_FROM_DOM(0.00);
FREEMAIL_REPLYTO(0.00)[yahoo.co.jp];
RCPT_COUNT_ONE(0.00)[1];
BAD_REP_POLICIES(0.10);
DMARC_NA(0.00)[alice.it];
RCVD_IN_DNSWL_NONE(0.00)[120.200.57.82.list.dnswl.org : 127.0.5.0];
RCVD_COUNT_TWO(0.00)[2]
X-Rspamd-Server: monserveur.fr
X-Spam: Yes

The most important one is the last one, which says Rspamd has flagged this email as spam (we will use it later to place that email into the Junk folder automatically).

The other lines give you details about the marks that were given to that email by Rspamd according to different criteria. Here you can see that this email got a total mark of 10.5, mainly from the “MSBL_EBL” criterion.

Based on that mark, Rspamd will take different actions, which is the topic of the next section.
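With several dozen criteria in the header, it can help to keep only the ones that actually moved the mark. The following is a minimal sketch, assuming the header has the `SYMBOL(score)` layout shown in the example above; `top_symbols` is a hypothetical helper name, not an Rspamd command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rspamd): read raw headers or a whole
# message on stdin and list the criteria with a non-zero mark.
top_symbols() {
    # Extract every SYMBOL(score) pair, e.g. "MSBL_EBL(7.50)".
    grep -oE '[A-Z][A-Z0-9_]*\(-?[0-9]+\.[0-9]+\)' |
        # Turn "NAME(1.23)" into "1.23 NAME" so the mark comes first.
        sed -E 's/^([A-Z0-9_]+)\((-?[0-9.]+)\)$/\2 \1/' |
        # Drop criteria that contributed nothing.
        awk '$1 != 0' |
        # Largest contribution first.
        LC_ALL=C sort -rn
}
```

Usage: save the source of a suspicious email to a file, then run `top_symbols < message.eml`. On the example above, MSBL_EBL would come out on top.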

2. Adjust score metrics

Once a mark was given to an incoming email, Rspamd can take different actions :

Let the email directly go to Inbox
Put the email in “greylist”, which will ask the sending server to wait and retry a little bit later. This was initially to avoid some low-level spam servers that didn’t have any queuing management. However this is becoming quite rare and I found that “greylist” a little bit useless nowadays
Flag the email, which will add that header X-Spam: Yes in the email to allow you to process it differently if you want (like making a filter to move it automatically to the Junk folder)
Reject the email - in that case you will never see the email

The default values are not bad. I have just put a very high value on “Reject” because I want to be able to see all incoming emails, at least for the first months. I may lower that value once I gain confidence in my system.

To change the values, just create a dedicated file in the correct location

sudo nano /etc/rspamd/override.d/metrics.conf

Copy the following content and adjust the values to your taste

actions {
reject = 15;
add_header = 6;
greylist = 4;
}

Restart Rspamd for the new values to be taken into consideration.

sudo yunohost service restart rspamd
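To get a feel for these thresholds, here is a small illustrative sketch that maps a mark to the action it would trigger. It only mimics the logic for the example values above (4, 6, 15) and assumes a threshold counts as reached when the mark is greater than or equal to it; `action_for_score` is a made-up name, not an Rspamd tool.

```shell
#!/bin/sh
# Illustration only: which action would a given mark trigger with the
# example thresholds above? Assumes "mark >= threshold" triggers the action.
action_for_score() {
    awk -v s="$1" -v greylist=4 -v add_header=6 -v reject=15 'BEGIN {
        if (s >= reject)          print "reject"
        else if (s >= add_header) print "add_header"
        else if (s >= greylist)   print "greylist"
        else                      print "no action"
    }'
}
```

For instance, `action_for_score 10.5` prints `add_header`: the example email from section 1 would be flagged with X-Spam: Yes but not rejected.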

You can always do a configdump to double check (but beware, the config is very long and confusing)

rspamadm configdump

3. Send the spam automatically in the Junk folder

I recommend doing it with Rainloop rather than doing it in command line as it is way faster and more efficient. If for some reason you absolutely want to do it in CLI, you may refer to the original tutorial.

Using rainloop will create a filter at user level, not at server level.

Login to Rainloop
Go to settings > filters
Add a new filter and create a new condition : If header X-Spam contains yes
Add an action : move to : Junk folder
You can also check “mark as read”

Save your filter. From now on all your incoming emails that got a mark above “add_header” threshold (in our example above it’s 6) will automatically go to the Junk folder. Things start to take shape !
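For those who prefer the command line, the Rainloop filter above boils down to a short Sieve rule. The sketch below only writes such a rule into a directory of your choice. The file name `spam-to-junk.sieve` and the function name are assumptions for illustration, and where the file must live to be picked up by Dovecot depends on your configuration, so check the original tutorial before using it.

```shell
#!/bin/sh
# Sketch: write a Sieve rule that files flagged mail into Junk.
# The target directory is an assumption; pass whichever directory your
# Dovecot configuration actually loads Sieve scripts from.
write_spam_filter() {
    target_dir="$1"
    mkdir -p "$target_dir" || return 1
    cat > "$target_dir/spam-to-junk.sieve" <<'EOF'
require ["fileinto"];

# Rspamd adds "X-Spam: Yes" once the add_header threshold is reached.
if header :contains "X-Spam" "Yes" {
    fileinto "Junk";
}
EOF
}
```

Trying it first on a scratch directory, for example `write_spam_filter /tmp/sieve-test`, lets you review the generated file before touching the server configuration.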

4. Learning with existing spam and ham

Among all the criteria that Rspamd uses to give a mark to an incoming email, one is a statistical (Bayes) classifier that “learns” from your actions. Sadly YunoHost is not configured to take advantage of that great feature (yet). The good news is that, unless your server is very new, you can still leverage your whole server history: you can train the classifier to detect spam using your entire Junk folder, and to detect “ham” using your entire inbox.

Of course, first make sure that you have no legitimate emails in your Junk folder, and no spam in your Inbox. It may be worth a few minutes of checking. It’s important to train both spam and ham, not only one of them.

To train the spams :

rspamc learn_spam /var/mail/YNH_USER/.Junk/cur

(where YNH_USER is the user whose inbox you want to use as a training set)

To train the hams :

rspamc learn_ham /var/mail/YNH_USER/cur

(where YNH_USER is the user whose inbox you want to use as a training set)

Notes :

You may want to train spam and ham on several users if you have more than one YunoHost user on your instance. Rspamd configuration and training are shared across the whole server, so you can’t have different settings for different users (or that would be very complicated to set up, and it’s not the purpose of this tutorial)
The first command will only work if the .Junk folder is indeed the folder where you have put all your spam. Be careful, because some email clients may not use that folder and may have created another one, for example .SPAM. In that case you can either change the path of the command to /.SPAM/cur, or set up your email client to use .Junk and move all your spam from the other folder to the .Junk folder. The latter option is the one I prefer, because it’s always better to keep YunoHost as close as possible to its defaults
The second command trains only against your inbox, not against emails stored in other folders. If your emails are spread over different folders, you may want to train against each of them one by one, using /YNH_USER/.Yourfolder/cur as a path
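If you have many users, typing these commands one by one gets tedious. Here is a minimal sketch that only prints the training commands for every mailbox found under a mail root, so you can review them before running anything. It assumes the /var/mail/YNH_USER layout used above; `print_training_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print (not run) the training commands for every mailbox found
# under a mail root. The root defaults to /var/mail, as in the commands above.
print_training_commands() {
    mail_root="${1:-/var/mail}"
    for user_dir in "$mail_root"/*/; do
        [ -d "$user_dir" ] || continue
        user_dir="${user_dir%/}"
        # Spam: the Junk folder, if this user has one.
        if [ -d "$user_dir/.Junk/cur" ]; then
            echo "rspamc learn_spam $user_dir/.Junk/cur"
        fi
        # Ham: the inbox itself.
        if [ -d "$user_dir/cur" ]; then
            echo "rspamc learn_ham $user_dir/cur"
        fi
    done
    return 0
}
```

Directories without a cur subfolder are simply skipped. Remember the warning above: only train on folders you have checked by hand.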

As usual, restart Rspamd to take these changes into account.
You can always check how much your Bayes filter has been trained by using the following command :

rspamc stat

5. Turning on auto-learning

We will now turn on auto-learning. Auto-learning is quite basic but still useful: it trains as spam any email that got rejected (very bad mark) and trains as ham any email that got a negative mark (very good mark). Let’s create a new file :

sudo nano /etc/rspamd/override.d/classifiers.conf

And write the following line in it before saving :

autolearn = true;

You can also define the boundaries at which the autolearn action is triggered (based, as always, on the mark given by Rspamd). Example where we ask Rspamd to train as spam if an email got a mark above 5, and as ham if it got a mark below -5

autolearn = [-5, 5];

As usual, restart Rspamd for these changes to take effect

6. Now the BIG thing : train spam / ham based on user action

This is definitely how we want things to work : I receive a Spam in my Inbox ? Then I mark it as a spam and hope my system will gain some “experience” from my action. I receive a ham in my spambox ? Then I place it back in my inbox and hope for the same. This is definitely possible with Yunohost + Rspamd, but there is a little work to do first.

The tutorial on which I base this Yunohost-flavored one is perfectly explaining how things work, so please refer to it if you want to understand what’s happening. Here I’ll limit myself to giving minimalistic explanations.

6.A Enabling and configuring imap_sieve plugin

For this we will need to edit the dovecot conf file

sudo nano /etc/dovecot/dovecot.conf

Find the protocol imap {} part of the file and add imap_sieve as per below :

protocol imap {
imap_client_workarounds =
mail_plugins = $mail_plugins imap_quota antispam imap_sieve
}

Then you’ll find multiple blocks plugin {}. Find the one with a few lines starting with “sieve”.
Copy paste the below lines to replace it (first 3 lines are unchanged) :

plugin {
sieve = /var/mail/sievescript/%n/.dovecot.sieve
sieve_dir = /var/mail/sievescript/%n/scripts/
sieve_before = /etc/dovecot/global_script/
sieve_plugins = sieve_imapsieve sieve_extprograms

# From elsewhere to Junk folder
imapsieve_mailbox1_name = Junk
imapsieve_mailbox1_causes = COPY
imapsieve_mailbox1_before = file:/etc/dovecot/sieve/learn-spam.sieve

# From Junk folder to elsewhere
imapsieve_mailbox2_name = *
imapsieve_mailbox2_from = Junk
imapsieve_mailbox2_causes = COPY
imapsieve_mailbox2_before = file:/etc/dovecot/sieve/learn-ham.sieve
sieve_pipe_bin_dir = /etc/dovecot/sieve
sieve_global_extensions = +vnd.dovecot.pipe
}

And then save that dovecot.conf file.

6.B Creating and compiling the sieve filters

Create a new sieve directory inside dovecot folder

sudo mkdir /etc/dovecot/sieve

Create a new learn-spam script inside that folder :

sudo nano /etc/dovecot/sieve/learn-spam.sieve

Copy the following code inside and save the file.

require ["vnd.dovecot.pipe", "copy", "imapsieve"];
pipe :copy "rspamd-learn-spam.sh";

Do the same for the learn-ham script

sudo nano /etc/dovecot/sieve/learn-ham.sieve

Copy the following code inside and save the file.

require ["vnd.dovecot.pipe", "copy", "imapsieve"];
pipe :copy "rspamd-learn-ham.sh";

Compile these two scripts with sievec :

sudo sievec /etc/dovecot/sieve/learn-spam.sieve
sudo sievec /etc/dovecot/sieve/learn-ham.sieve

Double check that last command did add two compiled scripts learn-ham.svbin and learn-spam.svbin inside the /etc/dovecot/sieve folder we just created
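This check can also be scripted. The sketch below looks for a compiled .svbin next to every .sieve file in a directory and reports what is missing; `check_compiled` is a hypothetical helper name, and the default path is the folder created above.

```shell
#!/bin/sh
# Sketch: verify that every .sieve script in a directory has a compiled
# .svbin next to it. Returns non-zero if at least one is missing.
check_compiled() {
    sieve_dir="${1:-/etc/dovecot/sieve}"
    status=0
    for script in "$sieve_dir"/*.sieve; do
        [ -e "$script" ] || continue
        compiled="${script%.sieve}.svbin"
        if [ -f "$compiled" ]; then
            echo "ok: $compiled"
        else
            echo "missing: $compiled"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_compiled` with no argument inspects /etc/dovecot/sieve. A “missing” line means the corresponding sievec command did not produce its output.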

Fix the permissions for created files :

sudo chmod u=rw,go= /etc/dovecot/sieve/learn-{spam,ham}.sieve
sudo chown vmail:mail /etc/dovecot/sieve/learn-{spam,ham}.sieve

6.C Creating the bash scripts to be run by the above sieve filters

Create a new bash script file to learn spam

sudo nano /etc/dovecot/sieve/rspamd-learn-spam.sh

Copy the following code inside then save it.

#!/bin/sh
exec /usr/bin/rspamc learn_spam

Create a new bash script file to learn ham

sudo nano /etc/dovecot/sieve/rspamd-learn-ham.sh

Copy the following code inside then save it.

#!/bin/sh
exec /usr/bin/rspamc learn_ham

Let’s fix the permissions for the created files :

sudo chmod u=rwx,go= /etc/dovecot/sieve/rspamd-learn-{spam,ham}.sh
sudo chown vmail:mail /etc/dovecot/sieve/rspamd-learn-{spam,ham}.sh

Now we should be good to go to testing phase ! But obviously, we first need to restart Dovecot

sudo yunohost service restart dovecot

6.D Testing

Let’s check in real time what’s going on on the server :

sudo tail -f /var/log/mail.log

Then you can go to your email client, and try to move one email from your inbox to your Junk folder. You should see a line saying :

imap(ynh_user@example.org): sieve: pipe action: piped message to program `rspamd-learn-spam.sh'

You can then try to put back that same email (or another) from the spam folder to inbox (or any other folder) and you should see a line saying :

imap(ynh_user@example.org): sieve: pipe action: piped message to program `rspamd-learn-ham.sh'

Alternatively, you can also check with

rspamc stat

The number of learned emails should go up each time you move an email from Inbox to Junk or vice versa

Conclusion

I hope that this will help some of you deal better with your spams. I also hope that this configuration (or an improved version of it) could be added in future versions of Yunohost

Limezy · February 24, 2021, 9:27am

Note : the above tutorial will make the diagnostic tool scream about the dovecot.conf file that has been modified. You may want to disable the alert.

tomdereub · June 14, 2021, 11:13pm

Hi,
I’m testing all this, thanks a lot for the tutorial.
One small remark: a sudo service dovecot restart is missing at the end of step 6.A, otherwise the scripts fail to compile because “vnd.dovecot.pipe” is not known. Also, if you copy-paste your lines into these scripts, the quotes are not the right ones (typographic quotes instead of straight ones), which also causes a compilation error.
There are also a few places where sudo is missing; with those fixed the tutorial will be perfect!
For information, I’m testing with the “filter for all users” version to filter spam, as described in the original tutorial.
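For anyone whose pasted scripts ended up with typographic quotes, here is a minimal sketch of a filter that turns them back into straight quotes. `straighten_quotes` is a made-up helper name; it reads text on stdin and writes the corrected text on stdout.

```shell
#!/bin/sh
# Sketch: replace typographic (curly) quotes with straight ones.
# Each character is matched literally, so this does not depend on the locale.
straighten_quotes() {
    sed -e 's/“/"/g' -e 's/”/"/g' \
        -e "s/‘/'/g" -e "s/’/'/g"
}
```

For example, `straighten_quotes < learn-spam.sieve > learn-spam.fixed.sieve` produces a corrected copy that can be reviewed before replacing the original and running sievec again.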

And a question: I use Thunderbird as my mail client, and it also has its own spam-learning options.

To what extent can the two coexist? Is it better to disable these options in Thunderbird?

To be tested, but it would certainly be really valuable to have this by default in YunoHost.

Limezy · June 23, 2021, 4:26pm

Hello,
Thanks for your reply!
I’m between two jobs and in the middle of a move, so things are hectic, but I will take a look to fix the errors in my tutorial and answer your questions!

tomdereub · September 24, 2021, 7:54am

Hi,
I’m trying to see how well this works. Right now I have several emails from 2 senders that land in spam even though I have already moved them to the inbox.
In the headers, what pushes them into spam is this field:
FUZZY_DENIED(10.51)[1:973aeae152:1.00:bin];
What does it mean? This field is absent from other emails, including other emails from the same senders that do not land in spam.

Limezy · September 24, 2021, 3:35pm

From what I understand, it is an analysis of the words and lexical fields used in the message.
https://rspamd.com/doc/modules/fuzzy_check.html
If it bothers you, a quick solution is to lower the weight of this mark in the spam detection algorithm.

tomdereub · September 25, 2021, 10:23pm

OK, what I would really like is for learning to take precedence. By the way, how is learning taken into account? Does it add a “good score” to the calculation? Or does it build sender whitelists/blacklists? Or something else?

Sty_X · September 11, 2023, 8:39pm

Hello, and thanks for the tutorial!

I just tried it, but when I test moving an email from the spam folder to the inbox (or the other way round), I get this kind of error message in the logs:

Error: sieve: open: failed to open: open(/etc/dovecot/sieve/learn-ham.svbin) failed: Permission denied (euid=500(vmail) egid=8(mail) missing +r perm: /etc/dovecot/sieve/learn-ham.svbin, dir owned by 0:0 mode=0755)
Error: open(/etc/dovecot/sieve/learn-ham.svbin.calci.fr.2766.b93119f4010a9ddb) failed: Read-only file system
Error: sieve: binary /etc/dovecot/sieve/learn-ham.svbin: save: failed to create temporary file: open(/etc/dovecot/sieve/learn-ham.svbin.) failed: Read-only file system

I tried to fix the problem by running

sudo chmod u=rw,go= /etc/dovecot/sieve/learn-{spam,ham}.svbin
sudo chown vmail.mail /etc/dovecot/sieve/learn-{spam,ham}.svbin

But after that, I no longer get any log entries when moving an email, and when checking with “rspamc stat”, the number of scanned emails does not increase.

Would you have any idea which permissions should be granted to solve this problem?

Thanks for your help

sydneyb · December 19, 2023, 6:28am

I’m digging up this topic following an increase in spam in my mailbox. Before diving into the tutorial, a quick question for the community:
Has any of this been incorporated, even partially, or has spam management been reworked in YunoHost since then?