Freeze, how to debug?



My YunoHost server

Hardware: Old computer
YunoHost version: yunohost: repo: stable version: 4.3.6.2
yunohost-admin: repo: testing version: 4.3.4.1
moulinette: repo: stable version: 4.3.3.1
ssowat: repo: stable version: 4.3.3.1
Description: Debian GNU/Linux 10 (buster)
I have access to my server : Through SSH, through the webadmin, direct access via keyboard / screen
Are you in a special context or did you perform some particular tweaking on your YunoHost instance ? : no

Description of my issue

The server sometimes freezes for no apparent reason: no more access through SSH, HTTP, or keyboard/screen. Nothing.

No problem with the hard disk:

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   102   099   006    Pre-fail  Always       -       3958584
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       267
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       4377017538
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       10401
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       267
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   058   045    Old_age   Always       -       25 (Min/Max 24/25)
194 Temperature_Celsius     0x0022   025   042   000    Old_age   Always       -       25 (0 15 0 0 0)
195 Hardware_ECC_Recovered  0x001a   042   021   000    Old_age   Always       -       3958584
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       10399h+11m+05.961s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       4171227750
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       2904713329

SMART Error Log Version: 1
No Errors Logged
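
As a rough cross-check of a table like the one above, here is a small Python sketch that flags attributes whose normalized VALUE or WORST has reached THRESH, or whose raw count is nonzero for a few sector-related attributes. The function name and the list of attributes to watch are assumptions made for this sketch, not something smartctl provides, and vendors do not all use raw values the same way.

```python
import re

# Attributes whose raw value should normally stay at 0 (an assumption for
# this sketch; vendors differ in how they use raw values).
WATCH_RAW_ZERO = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Reported_Uncorrect",
}

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(\d+)"
)


def suspicious_attributes(smart_table):
    """Return names of attributes that look worrying in a SMART table."""
    flagged = []
    for line in smart_table.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header or unrelated line
        _id, name, value, worst, thresh, raw = m.groups()
        value, worst, thresh, raw = int(value), int(worst), int(thresh), int(raw)
        # A threshold of 0 means the attribute cannot fail by threshold.
        at_threshold = thresh > 0 and min(value, worst) <= thresh
        bad_raw = name in WATCH_RAW_ZERO and raw > 0
        if at_threshold or bad_raw:
            flagged.append(name)
    return flagged
```

Fed the table above, this sketch flags nothing: every normalized value is above its threshold and the sector counters are all 0.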

Nothing special in the logs.

Extract of syslog.1

Apr  5 11:45:01 server CRON[15357]: (nextcloud) CMD (/usr/bin/php7.3 --define apc.enable_cli=1 -f /var/www/nextcloud/cron.php)
Apr  5 11:45:05 server postfix/submission/smtpd[15370]: connect from unknown[141.98.10.203]
Apr  5 11:45:08 server postfix/submission/smtpd[15370]: Anonymous TLS connection established from unknown[141.98.10.203]: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)
Apr  5 11:45:10 server postfix/submission/smtpd[15370]: warning: unknown[141.98.10.203]: SASL LOGIN authentication failed: UGFzc3dvcmQ6
Apr  5 11:45:10 server postfix/submission/smtpd[15370]: disconnect from unknown[141.98.10.203] ehlo=2 starttls=1 auth=0/1 quit=1 commands=4/5
Apr  5 11:46:01 server CRON[15380]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:47:01 server CRON[15399]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:48:01 server CRON[15438]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:48:30 server postfix/anvil[15372]: statistics: max connection rate 1/60s for (submission:141.98.10.203) at Apr  5 11:45:05
Apr  5 11:48:30 server postfix/anvil[15372]: statistics: max connection count 1 for (submission:141.98.10.203) at Apr  5 11:45:05
Apr  5 11:48:30 server postfix/anvil[15372]: statistics: max cache size 1 at Apr  5 11:45:05
Apr  5 11:49:01 server CRON[15457]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:50:01 server CRON[15477]: (root) CMD (: YunoHost DynDNS update; sleep $((RANDOM%60)); ! ping -q -W5 -c1 ip.yunohost.org >/dev/null 2>&1 || test -e /var/run/moulinette_yunohost.lock || yunohost dyndns update >> /dev/null)
Apr  5 11:50:01 server CRON[15478]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:51:01 server CRON[15505]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:51:42 server postfix/smtpd[15523]: connect from unknown[193.56.29.154]
Apr  5 11:51:43 server postfix/smtpd[15523]: disconnect from unknown[193.56.29.154] ehlo=1 auth=0/1 rset=1 quit=1 commands=3/4
Apr  5 11:52:01 server CRON[15530]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:53:01 server CRON[15560]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:54:01 server CRON[15579]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:55:01 server CRON[15598]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:55:03 server postfix/anvil[15528]: statistics: max connection rate 1/60s for (smtp:193.56.29.154) at Apr  5 11:51:42
Apr  5 11:55:03 server postfix/anvil[15528]: statistics: max connection count 1 for (smtp:193.56.29.154) at Apr  5 11:51:42
Apr  5 11:55:03 server postfix/anvil[15528]: statistics: max cache size 1 at Apr  5 11:51:42
Apr  5 11:56:01 server CRON[15617]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:57:01 server CRON[15637]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:58:02 server CRON[15666]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  5 11:59:01 server CRON[15685]: (grav) CMD (php7.3 /var/www/grav/bin/grav scheduler 1>> /dev/null 2>&1)
Apr  6 14:41:11 server systemd[1]: Starting Flush Journal to Persistent Storage...
Apr  6 14:41:11 server systemd[1]: Started udev Coldplug all Devices.
Apr  6 14:41:11 server fake-hwclock[220]: Current system time: 2022-04-06 12:41:05
Apr  6 14:41:11 server fake-hwclock[220]: fake-hwclock saved clock information is in the past: 2022-04-05 09:17:01
Apr  6 14:41:11 server fake-hwclock[220]: To set system time to this saved clock anyway, use "force"
Apr  6 14:41:11 server systemd[1]: Starting Helper to synchronize boot up for ifupdown...
Apr  6 14:41:11 server systemd[1]: Started Restore / save the current clock.
Apr  6 14:41:11 server systemd[1]: Started Load/Save Random Seed.
Apr  6 14:41:11 server systemd[1]: Mounted RPC Pipe File System.
Apr  6 14:41:11 server systemd[1]: Started Apply Kernel Variables.
Apr  6 14:41:11 server systemd[1]: Started Flush Journal to Persistent Storage.
Apr  6 14:41:11 server systemd[1]: Started Helper to synchronize boot up for ifupdown.
Apr  6 14:41:11 server systemd[1]: Started Set the console keyboard layout.
Apr  6 14:41:11 server systemd[1]: Started Create System Users.
Apr  6 14:41:11 server systemd[1]: Starting Create Static Device Nodes in /dev...
Apr  6 14:41:11 server systemd-tmpfiles[263]: [/usr/lib/tmpfiles.d/fail2ban-tmpfiles.conf:1] Line references path below legacy directory /var/run/, updating /var/run/fail2ban → /run/fail2ban; please update the tmpfiles.d/ drop-in file accordingly.
Apr  6 14:41:11 server systemd[1]: Started Create Static Device Nodes in /dev.
Apr  6 14:41:11 server systemd[1]: Reached target Local File Systems (Pre).
Apr  6 14:41:11 server systemd[1]: Starting udev Kernel Device Manager...
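
In the extract above, the last entry before the silence is at Apr 5 11:59:01 and the next one is the boot at Apr 6 14:41:11. To spot such silent periods in a longer log, something like the following Python sketch could help. It assumes the classic syslog timestamp prefix, English month names, and a year supplied by hand (the prefix carries none); `find_gaps` and its parameters are names invented for this example.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2022, min_gap=timedelta(minutes=10)):
    """Yield (last_before, first_after) timestamps around silent periods."""
    previous = None
    for line in lines:
        try:
            # Classic syslog prefix, e.g. "Apr  5 11:59:01" (15 characters).
            stamp = datetime.strptime(
                f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # wrapped continuation line or another format
        if previous is not None and stamp - previous >= min_gap:
            yield previous, stamp
        previous = stamp


# Usage (the path is only an example):
# with open("/var/log/syslog.1") as f:
#     for before, after in find_gaps(f):
#         print(f"silent from {before} to {after} ({after - before})")
```

The entries just before each gap are the ones worth reading closely, although a hard freeze may leave nothing useful there, as seems to be the case here.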

Thanks for your help!

Have you checked htop? Maybe something is using lots of RAM or CPU.

Is it freezing, and then you get back control of the machine?


Do you have enough swap and RAM for the installed apps?


Hello,

Thanks,

After a freeze, I never get back control of the machine.

Here are the apps

apps: 
  0: 
    description: Web based audio/video streaming application
    domain_path: domain.com/ampache
    id: ampache
    name: Ampache
    version: 4.4.2~ynh1
  1: 
    description: Lightweight, simple to use and highly versatile wiki
    domain_path: domain.com/scw
    id: dokuwiki
    name: wiki sc
    version: 2020.07.29~ynh6
  2: 
    description: Online editor providing collaborative editing in real-time
    domain_path: domain.com/pad
    id: etherpad_mypads
    name: Etherpad MyPads
    version: 1.8.17~ynh1
  3: 
    description: A modern open source flat-file CMS
    domain_path: domain.com/
    id: grav
    name: C
    version: 1.7.30~ynh1
  4: 
    description: Password manager compatible with KeePass
    domain_path: domain.com/keeweb
    id: keeweb
    name: Keeweb
    version: 1.18.8~ynh1
  5: 
    description: Online storage, file sharing platform and various other applications
    domain_path: domain.com/espace_collaboratif
    id: nextcloud
    name: E
    version: 22.2.3~ynh1
  6:
    description: Lightweight multi-account webmail
    domain_path: domain.com/rainloop
    id: rainloop
    name: Rainloop
    version: 1.16.0~ynh2

And the output of the top command:

top - 21:53:03 up 1 day,  7:12,  1 users,  load average: 0.01, 0.02, 0.00
Tasks: 161 total,   1 running, 160 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.3 us,  0.3 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3841.0 total,   1341.7 free,    756.1 used,   1743.2 buff/cache
MiB Swap:    976.0 total,    976.0 free,      0.0 used.   2727.2 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                              
  499 root      20   0 1726684  28740  10276 S   0.3   0.7   4:46.85 fail2ban-server                                                                                                                      
 1943 user  20   0   10032   3484   1800 S   0.3   0.1   0:03.39 screen                                                                                                                               
30078 user  20   0   11064   3520   2960 R   0.3   0.1   0:06.54 top                                                                                                                                  
    1 root      20   0  170928  10904   7976 S   0.0   0.3   0:06.40 systemd                                                                                                                              
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd                                                                                                                             
    3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp                                                                                                                               
    4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp                                                                                                                           
    6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd                                                                                                                 
    8 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq                                                                                                                         
    9 root      20   0       0      0      0 S   0.0   0.0   0:00.03 ksoftirqd/0                                                                                                                          
   10 root      20   0       0      0      0 I   0.0   0.0   0:40.96 rcu_sched                                                                                                                            
   11 root      20   0       0      0      0 I   0.0   0.0   0:00.00 rcu_bh                                                                                                                               
   12 root      rt   0       0      0      0 S   0.0   0.0   0:00.18 migration/0                                                                                                                          
   14 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0                                                                                                                              
   15 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1                                                                                                                              
   16 root      rt   0       0      0      0 S   0.0   0.0   0:00.34 migration/1                                                                                                                          
   17 root      20   0       0      0      0 S   0.0   0.0   0:00.02 ksoftirqd/1                                                                                                                          
   19 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/1:0H-kblockd                                                                                                                 
   20 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/2                                                                                                                              
   21 root      rt   0       0      0      0 S   0.0   0.0   0:00.35 migration/2                                                                                                                          
   22 root      20   0       0      0      0 S   0.0   0.0   0:00.02 ksoftirqd/2                                                                                                                          
   24 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/2:0H-kblockd                                                                                                                 
   25 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/3                                                                                                                              
   26 root      rt   0       0      0      0 S   0.0   0.0   0:00.36 migration/3                                                                                                                          
   27 root      20   0       0      0      0 S   0.0   0.0   0:00.06 ksoftirqd/3                                                                                                                          
   29 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/3:0H-kblockd                                                                                                                 
   30 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kdevtmpfs                                                                                                                            
   31 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 netns                                                                                                                                
   32 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kauditd                                                                                                                              
   33 root      20   0       0      0      0 S   0.0   0.0   0:00.03 khungtaskd                                                                                                                           
   34 root      20   0       0      0      0 S   0.0   0.0   0:00.00 oom_reaper                                                                                                                           
   35 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 writeback                                                                                                                            
   36 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kcompactd0                                                                                                                           
   37 root      25   5       0      0      0 S   0.0   0.0   0:00.00 ksmd                                                                                                                                 
   38 root      39  19       0      0      0 S   0.0   0.0   0:00.44 khugepaged                                                                                                                           
   39 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 crypto                      

There is very low activity on the server.
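
A snapshot taken while the server is idle does not show what memory and load looked like just before a freeze. One possible approach is to log them periodically and force each line to disk, so the last entries survive a hard hang as far as possible. A minimal Python sketch, where the log path, the interval, and all function names are assumptions chosen for illustration:

```python
import os
import time


def parse_meminfo(text):
    """Turn /proc/meminfo content into {field: value}, mostly in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(meminfo_text, loadavg_text):
    """Format one log line from the content of the two /proc files."""
    mem = parse_meminfo(meminfo_text)
    load1 = loadavg_text.split()[0]  # 1-minute load average
    return (
        f"{time.strftime('%Y-%m-%d %H:%M:%S')} load1={load1} "
        f"MemAvailable={mem.get('MemAvailable', -1)}kB "
        f"SwapFree={mem.get('SwapFree', -1)}kB"
    )


def log_forever(path="/var/log/freeze-watch.log", interval=10):
    """Append a snapshot every `interval` seconds and sync it to disk."""
    with open(path, "a") as out:
        while True:
            with open("/proc/meminfo") as mem, open("/proc/loadavg") as load:
                out.write(snapshot(mem.read(), load.read()) + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the line past the page cache
            time.sleep(interval)
```

Running `log_forever()` as root (for example inside screen) and reading the tail of the file after the next freeze would show whether available memory was shrinking or the load was climbing beforehand.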

Hi Claveryu,

It is an old computer, but new enough to run 64-bit code (AMD64), is it not?

My first suspicion, after excluding HD-errors, would be RAM. You can install either memtester or memtest86+ to scan RAM for errors:

  • sudo apt install memtester; it runs within Linux but can’t scan all of your memory
  • sudo apt install memtest86+; you have to run it from GRUB (before booting Linux)

Did you run fsck on your filesystems? Even without clear SMART warnings, filesystems can go sour.

While at your keyboard, you could try Alt+PrintScreen+? to get a list of direct ‘non-interruptible’ commands if the system is not gone completely. Is your keyboard wired? Does pressing Caps Lock still turn on the Caps Lock LED on your keyboard?
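
Whether a given magic SysRq key does anything also depends on the value in /proc/sys/kernel/sysrq, which is 0 (disabled), 1 (everything enabled), or a bitmask of allowed functions. Here is a small Python sketch to decode that value, using the bit meanings from the kernel's sysrq documentation; the function name is made up for this example, and 438 below is only an example value.

```python
# Bit meanings as listed in the Linux kernel's sysrq documentation.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(value):
    """Return the SysRq function groups allowed by a kernel.sysrq value."""
    if value == 0:
        return []  # SysRq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())  # everything enabled
    return [desc for bit, desc in SYSRQ_BITS.items() if value & bit]


# Usage on a running system:
# with open("/proc/sys/kernel/sysrq") as f:
#     print(decode_sysrq(int(f.read())))
```

With a value of 438, for instance, reboot/poweroff and sync are allowed but process dumps are not. If an allowed key still does nothing during a freeze, the kernel itself has most likely stopped responding.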


Hello,
Thanks.
I performed a memory test, without any errors.
I didn’t run fsck; I will try.
The keyboard is wired (USB); no light on the keyboard when it freezes.
Nothing happens with alt+printscreen+b.
