Hello
I am trying to understand the Rspamd config in Yunohost and Freedombox. Both projects have left this part mostly untouched. The changes that were made look inspired by internet tutorials, without a full understanding of what they do (maybe I’m wrong).
As a sysadmin, I always begin with the documentation. Comparing the documentation with the current Yunohost config left me with doubts and questions. So I would like to work on this, contributing at least basic documentation for users.
(Since this is technical, would it be better to discuss it in French?)
First of all: why did you put additional config files into the /etc/rspamd/local.d/ directory?
local.d and override.d are directories for sysadmin changes, not for distro changes (local.d is for small changes and override.d for replacing a whole config file). As things stand, if a sysadmin changes values in these config files, they will be overwritten at the next package update.
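To make the difference between the two directories concrete, here is a rough Python model of it. It is only a sketch under the assumption that local.d entries are merged into the packaged defaults while override.d entries replace the matching entry wholesale; the option names (`threshold`, `headers`, and so on) are hypothetical and not taken from any real Rspamd file.

```python
import copy

def apply_local(defaults, local):
    """Model of local.d: merge keys into the defaults, recursing into nested sections."""
    merged = copy.deepcopy(defaults)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_local(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

def apply_override(config, override):
    """Model of override.d: each key replaces the existing entry entirely."""
    result = copy.deepcopy(config)
    for key, value in override.items():
        result[key] = copy.deepcopy(value)
    return result

# Hypothetical packaged defaults for one module section.
defaults = {"threshold": 15, "headers": {"add_score": True, "add_bar": True}}
# The same hypothetical sysadmin change, applied both ways.
change = {"headers": {"add_bar": False}}

via_local = apply_local(defaults, change)
via_override = apply_override(defaults, change)
```

In this model `via_local` keeps `add_score` from the defaults, while `via_override` loses it because the whole `headers` entry is replaced. Either way, a file shipped by the distribution in these directories competes with the sysadmin's own file of the same name.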
Why did Yunohost change the “reject” action threshold (rspamd/local.d/metrics.conf)?
Comparing the Debian and Rspamd deb packages, I find no significant differences in the selected modules or their config. Both packages share the same default value for the reject threshold (15).
A higher value doesn’t make sense if the module selection and module config are unchanged.
BTW metrics.conf is obsolete (see the note inside rspamd/metrics.conf); actions.conf is the right file to use.
AFAIK the milter-headers.conf file defined in the Yunohost repository is not installed; at least I didn’t find it in the scripts (in /hooks) nor on my server. Without this file, Rspamd can’t tell the user that an email is potentially spam.
Not sure if that helps a lot, but generally speaking, I think our Rspamd configuration hasn’t really been maintained by anybody in the last 4~5 years. Personally, I do not have any real experience maintaining such software and don’t really understand the algorithms or workflows it uses. If you want to work on this, and assuming you sort of know what you’re doing, I would suggest you just trash our current config and start from scratch? If I remember correctly, there’s also a new major version of Rspamd in Bullseye, so it probably makes more sense to work on that new version instead of the old one.
Feel free to join us during a contributor meeting or on our dev chat on Matrix.
We can discuss these things, but as Aleks explained, we don’t have much knowledge of Rspamd.
@Aleks @ljf I have started the work in a dedicated branch. I will add explanations to each change.
I keep the Rspamd defaults as much as possible, adding config options (in the main files, so users can customize them as needed) only for Postfix and Dovecot integration, allowing auto-learning through the Spam folder. This config assumes that Yunohost is used for personal use or small teams only, with free RBL services. Commercial and hosting use would need paid RBLs, with auto-learning from the Spam folder disabled.
While I am not an expert, I owned a hosting company in the 90’s and I understand a bit about how spam fighting works.
I’ve started to briefly document the base parameters; some of them are missing from the docs but can easily be found in the Rspamd source. The purpose is to produce some maintainer documentation about Rspamd’s interaction with Postfix and Dovecot. Also, these params could be changed if Yunohost is used for more than a family server.
I’ve also nearly finished the base configuration of Rspamd-Postfix-Dovecot interaction.
Spam fighting is another subject. In my opinion, some things can be done quickly:
Rspamd defines around 600 parameters to calculate a spam probability. Inexperienced users are not supposed to play with them. Understanding the main ones, and how the spam threshold score can be changed to cope with common situations, could result in usable sample configs.
While the Rspamd web interface is undocumented, its purpose is to help the user catch more spam. Currently it needs some love to fully work in Yunohost. And of course it needs to be documented.
Rspamd’s use of external RBLs. AFAIK, none of them is truly free, and users must pay as soon as they consume a lot of resources. This needs to be checked and documented.
I will push my changes to GitHub and let you know. We can work on my branch and discuss here to keep track of this work (likewise, it is better to discuss in English, even if we both speak French).
Hi @zeroheure great, I’ll have a look !
I suggest you make a draft PR using your branch so that we can both work on it.
I don’t have a dev environment for Yunohost itself yet, but I will probably have to set one up
Just starting here by writing down a few things and notes
What could be our goals for Yunohost default config :
Absolutely needed
Rspamd 2.7.1
Spam detection when receiving
Detected spams sent to “Junk” folder
Learning from user actions (Spam → Inbox or Inbox → Spam)
Initial learning from existing email database
Nice to have
Rspamd 3.2
Web interface
Redis configuration to get faster
Spam detection when sending
Autoexpunge Junk folder
Fuzzy feeds from Rspamd (enabled by default but if I carefully read the policy, we should talk to Rspamd team before enabling them by default within an open source project such as Yunohost)
To be discussed
Mileage varies, and not all Yunohost users have the same needs.
We should try to fine-tune the Rspamd config to be as “universal” as we can.
The two main sets of values to be defined are:
/etc/rspamd/actions.conf, which defines the score thresholds for an email to get rejected, greylisted or accepted
The kind of questions to ask ourselves: do we set the reject value so that no email ever gets rejected? Do we allow greylisting, which is useful but sometimes confusing for users (why has that email from my friend still not arrived?)…
Values defining the weight of each spam detection module in the final score (however, we may decide to keep the default values)
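To reason about the threshold questions above, a small Python model of the score-to-action mapping may help. This is only a sketch, not Rspamd's actual code: the reject value of 15 is the default quoted earlier in this thread, while the greylist and add-header values are assumptions chosen for illustration.

```python
# Hypothetical thresholds: reject=15 is the default mentioned in this thread,
# the other two values are assumed for the sake of the example.
THRESHOLDS = {"reject": 15.0, "add header": 6.0, "greylist": 4.0}

def action_for(score, thresholds=THRESHOLDS):
    """Return the most severe action whose threshold the score reaches."""
    ordered = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for action, limit in ordered:
        if score >= limit:
            return action
    return "no action"

# A policy where nothing is ever rejected: simply leave "reject" out.
NEVER_REJECT = {"add header": 6.0, "greylist": 4.0}
```

With these assumed numbers, `action_for(5)` gives `"greylist"`, and `action_for(50, NEVER_REJECT)` gives `"add header"`, so even a very high score is only flagged and can still be routed to Junk.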
I was able to run the webUI much easier than I thought
I think I’ll package Rspamd webUI as an app.
Or could we have it as part of the Yunohost web admin ?
[quote=“Limezy, post:16, topic:19283, full:true”]
Just starting here by writing down a few things and notes
What could be our goals for Yunohost default config :
Absolutely needed
Rspamd 2.7.1
Yes, one needs to enable backports on Debian Buster
Spam detection when receiving
Detected spams sent to “Junk” folder
Learning from user actions (Spam → Inbox or Inbox → Spam)
Agree, this is the basis of a good spam filter. Exactly what I’ve done so far.
Initial learning from existing email database
It is true that this is necessary, but how do we achieve it? I’ve seen a lot of users put whatever email they don’t like into the Junk folder. This means that this email database should be external (such databases do exist). NB: Rspamd needs to learn trusted emails too, in nearly the same quantity as spam.
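The point about learning ham and spam in similar quantities can be expressed as a simple readiness check. This is a Python sketch with assumed numbers: `min_learns=200` and `max_ratio=2.0` are illustrative values, not settings confirmed in this thread.

```python
def bayes_ready(ham_learned, spam_learned, min_learns=200, max_ratio=2.0):
    """Return True when both classes have enough samples and are roughly balanced."""
    if ham_learned < min_learns or spam_learned < min_learns:
        return False
    smaller = min(ham_learned, spam_learned)
    larger = max(ham_learned, spam_learned)
    if smaller == 0:
        # Only reachable when min_learns is 0; avoids dividing by zero.
        return False
    return larger / smaller <= max_ratio
```

For example, a mailbox with 500 learned ham messages and only 10 learned spam messages would not pass this check, which is why an initial corpus needs both kinds of mail.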
Related points that come to mind:
Some webmails also come with Bayesian spam detection; is it enabled in Yunohost?
Desktop email clients come with spam detection too, and move spam emails into the Junk folder. Hence, should we expunge the Junk folder on the server side? I don’t know
What Postfix, Rspamd or milter error messages related to the Junk folder could appear if a user updates its content from a webmail, desktop or mobile client?
The learning can be set up as user-based or server-based. I think that if we set it up as user-based, we don’t really care whether it’s “real” spam or just an email the user doesn’t want to receive. It’s the same action that we want in the end → go to Junk!
And yes, the learning is mainly about what the user wants, not what the user doesn’t want.
I think we should run it as a job that is run at every Yunohost update or something like that
Webmails will do their Bayesian spam detection and then move messages to the Junk folder, but we don’t really care, as we will run the Rspamd detection before the mail reaches the client anyway. I’m not sure I understand your question “is it enabled in Yunohost”
Same here: we don’t really care whether the user empties his Junk folder or not, or whether his desktop client does it on a regular basis → we do it server side by default, and everything else is a second fence (used or not). Personally, I don’t like client-side actions.
I didn’t understand that question. IMAP is designed as a client-server mechanism, so if the server performs some actions they will be mirrored by the client, and vice versa