Performance issues in YunoHost 12.x

What type of hardware are you using: VPS bought online
What YunoHost version are you running: 12.1.4 (issue happening since 12.0)
How are you able to access your server: The webadmin, SSH

Describe your issue

Some operations in YunoHost are becoming incredibly slow, and I cannot figure out why. As an example, I am sharing the logs of both adding and removing a subdomain. Each operation takes more than 5 minutes, which is very weird!

My VPS is quite beefy (24GB of RAM, 8 cores). There are three users in total, all with admin rights. The server has a total of 36 domains and subdomains registered, of which 14 are TLDs. Of those domains, only 11 have email services activated.

This thread follows a discussion started between me and @aleks here.

Share relevant logs or error messages

My server is “somewhat” comparable (4 core, 8 GB RAM, 20 domains over 4 TLDs)

Below is a totally non-scientific and messy log of the same action on my server. In short:

  • Just clicking through the domain list
  • Adding a domain
  • Removing a domain
  • Posting logs to yunopaste
  • Some screenshots of htop while performing these actions

I performed all actions through the web admin. The CLI may be more efficient. (On a side note: for some accounts I can view, but not create, mail aliases through the webadmin, while it works through the CLI. That may be performance related, or a fixed array that is too small; I did not look into that further.)

Just opening a domain is not blazing fast. The first time opening the settings for any domain takes a while. After that, the first time opening a domain related to a specific TLD takes longer than opening other domains under that same TLD afterwards. Subsequent openings of the same domain configuration are faster.

It leads me to think that some checks (and after that, cache hits) are responsible for the delay (and subsequent better performance):

  • 45 seconds on first opening of a TLD A
    • instant return to domain list
  • 10 seconds delay on second opening of A
    • instant return to domain list
  • near instant showing of third opening of A
    • instant return to domain list
  • 15 seconds delay on opening a subdomain of TLD B
    • instant return to domain list
  • 15 seconds delay on opening a subdomain of TLD C
    • instant return to domain list
  • 5 seconds delay on opening of TLD C
    • instant return to domain list
  • 5 seconds delay on opening of TLD B

While clicking through this list, yunohost-api is relatively heavily used according to htop.

Then I clicked somewhere in the domain list and got the rainbow cat trying to push DNS records to my DNS provider! I cancelled it, because I’m not sure what automatic configuration may break on dns.he.net, and afterwards was not able to find where I clicked to have this popup return.

After that I tried adding a domain

  • 5 seconds to start the “Add domain” wizard
  • 150 seconds to create the domain, run diagnosis (I suppose) and fail to register with Let’s Encrypt

While running, this line mail_in mail_out pops up on top of htop quite often:

Something Nextcloud related also popped up a few times, which it did not beforehand and not afterwards, but that can be coincidence.

Removing the domain took 120 seconds, so about as long as adding it minus the timeouts for the Let’s Encrypt registration. During removal the Nextcloud line popped up again, so it seems to get hit one way or another during this process:

Besides that, both MariaDB and Synapse are quite a bit busier than when not making any domain changes.

Both these screenshots show a relatively high CPU load; before and after the domain actions the load hovers under 5%

Opening logs:

  • 5-10 seconds to open the log list
  • 30 seconds 80% CPU load by yunohost-api to open the first log (accidentally updating DNS settings)
  • instant posting to yunopaste
  • 40 seconds to open the second log (adding a subdomain)
  • seemingly hangs on posting to yunopaste, then 3 new tabs with unique URLs (so the button was still active while processing and not giving feedback)
    • This log entry mentions 2 sub operations:
    • 40 seconds to open “5_categories”
      • 40-50 seconds to post the log to yunopaste
    • ? seconds to open oeauaoeu
      • 40 seconds to post to yunopaste
      • contains 1 suboperation
        • ?? seconds to open
        • 10 seconds to post to yunopaste
    • 30 seconds for twice browser ‘page back’ to return to the main ‘add subdomain’ log entry
  • 5 seconds to return to the log list
  • 30 seconds to open the remove domain log entry
    • 40 seconds to post to yunopaste
    • 1 sub operation "Regenerate system configurations “4_categories”"
    • 20 seconds to post to yunopaste

Second time opening log entry for adding subdomain:

opening the log entry for the sub-operation of the sub-operation:

‘page back’ - ‘page back’ to add subdomain log entry:

opening log entry for deleting the sub domain:

Without knowledge of the matter, I speculate:

  • Diagnosis may be involved
  • Checking / regenerating of (possibly related) configurations is involved
  • Some processes form bottlenecks for other processes
    • yunohost-api is Python, as is Synapse. One way or another, on my system, Synapse has higher CPU load while the api is active
    • nginx worker processes claim higher CPU load while above processes are running (I failed to notice exactly which processes, the refresh rate on htop is set too high)
    • perhaps because of the higher load, php-fpm gets pushed aside and reclaims something after that, causing higher CPU load for PHP processes such as Nextcloud

Sorry for the long and messy post!

Mmmh my times are way longer. It took around 8 minutes to add a domain.

I have the same problem and totally agree!

Wow!

Since some background processes are kicked off to perform checks and validations, it is not only the number of domains that adds to the size of the loop: additional apps or more users may lengthen the list of actions to perform as well.

One app that adds a lot of time in my case is Forgejo: there are some hooks that fire multiple times on most actions, to sync users or something. If more apps have these kinds of hooks, they each add their own delay to processes.

In your case, is it just domain actions that have these huge delays, or does it also happen when performing other actions?

On this server, there are fewer than 50 users, some 10 groups, 20 domains, and fewer than 30 applications. Seeing your server has better resources, I suppose it serves a larger community?

My server has 3 users, and the long times were happening also when it had only one. :grimacing:

Hundreds of apps?

How many log files do you have in /var/log?
I have noticed that the webadmin is very slow when the log folder gets thousands of files. Running yunohost tools basic-space-cleanup improves the webadmin responsiveness. Note that running this command will purge old log files, so if you need to keep them, maybe moving them elsewhere is more appropriate.
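
To check whether the log folder really holds thousands of files before purging anything, a count per directory is enough. The following is a minimal sketch, not part of YunoHost: the function name is made up for this example, and it only counts non-directory entries under /var/log without deleting anything.

```python
import os
from collections import Counter


def count_files_by_dir(root: str = "/var/log") -> Counter:
    """Count non-directory entries under `root`, grouped by containing directory."""
    counts = Counter()
    # Unreadable directories are skipped silently (onerror ignores the error).
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        if filenames:
            counts[dirpath] = len(filenames)
    return counts
```

Calling count_files_by_dir().most_common(5) lists the five directories holding the most files, which shows where a cleanup would have the most effect. Run it as root so that all log directories are readable.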

(I didn’t read the whole thread, so I may have missed some details)

1 Like

Daamn, you’re right: Vacuuming done, freed 2.7G of archived journals from /var/log/journal/04273ce85405f70e4e023a236385d818

Unfortunately, I checked again, and processes are still super slow. They take >5 minutes.

Can you try disabling email features for the domains that you are not using for email, and check whether this helps?

Could you run it from the CLI with the --debug option, and note how long it takes between each debug line?

yunohost domain add toto.fr --debug
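
Measuring the gaps by eye is tedious, so a small filter can do it. This is only a sketch under assumptions: the script name line_timer.py and the function names are invented for the example, and it simply prefixes every line it receives with the time elapsed since the previous line.

```python
import sys
import time
from typing import Callable, Iterable, Iterator


def annotate(lines: Iterable[str],
             clock: Callable[[], float] = time.monotonic) -> Iterator[str]:
    """Prefix each line with the seconds elapsed since the previous line arrived."""
    previous = clock()
    for line in lines:
        now = clock()
        yield f"[+{now - previous:7.1f}s] {line.rstrip()}"
        previous = now


def main() -> None:
    # Reads the command's output from stdin as it is produced.
    for annotated in annotate(sys.stdin):
        print(annotated, flush=True)
```

Saved as line_timer.py with a final call to main(), it could be used as: yunohost domain add toto.fr --debug 2>&1 | python3 line_timer.py. A large value in front of a line means the step logged just before it was the slow one. If the command buffers its output when piped, the gaps will be distorted, so treat the numbers as approximate.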

How many users, domains, and apps do you have?

This 12.1.x release contains a performance enhancement for LDAP entries related to permissions, apps, and users; it’s possible that this fix impacts domain creation in one way or another.

It’s probably not related to LDAP. The issue seems to boil down to yunohost domain list, where we want to filter on domains for which mail is enabled. In the current code this involves loading the domain config panel with the not-yet-optimized version of pydantic, and it’s just pretty slow … This call seems to take around 50-80 seconds in this case, and it needs to be done at least 4-5 times across all the regen conf, which multiplies to several minutes (4 × 50 s is already more than 3 minutes, and 5 × 80 s is close to 7).

Could be optimized by computing this ahead and feeding a bash variable to the regen conf scripts like we already do for the full domain list and the global settings
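
The idea of computing the list once and passing it down can be sketched as follows. This is not the actual YunoHost code: the function, the variable name EXAMPLE_MAIL_DOMAINS, and the way the scripts are launched are all assumptions made for illustration.

```python
import os
import subprocess
from typing import Callable, Dict, List, Sequence


def run_regen_scripts(
    scripts: Sequence[str],
    list_mail_domains: Callable[[], List[str]],
    run: Callable[..., object] = subprocess.run,
) -> Dict[str, str]:
    """Compute the mail-enabled domain list once and hand it to every script."""
    # The expensive lookup happens a single time, before any script starts.
    mail_domains = list_mail_domains()

    env = dict(os.environ)
    # Hypothetical variable name, chosen for this sketch only.
    env["EXAMPLE_MAIL_DOMAINS"] = " ".join(mail_domains)

    for script in scripts:
        # Each script reads $EXAMPLE_MAIL_DOMAINS instead of querying again.
        run(["bash", script], env=env, check=True)
    return env
```

With this shape, the slow call is paid once per regen conf run instead of once per script that needs the list, so the cost no longer scales with the number of scripts.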

I did a bit of clean-up, and I added updated and detailed data to the first post:

I added a domain again, and it still takes more than 8 minutes! I noticed that 90% of the time the processes are still stuck on that jq -r '.domains[]'. Here is the log with the --debug flag activated :eyes:

The log link is empty

Sorry @jarod5001, you’re right… I now added the link. (https://paste.yunohost.org/raw/idagehegop)