NetData - alarms and monitoring user-friendly dashboard


#1

NetData for YunoHost


NetData is a system for distributed real-time performance and health monitoring.
It provides unparalleled insights, in real-time, of everything happening on the
system it runs (including applications such as web and database servers), using
modern interactive web dashboards.

netdata is fast and efficient, designed to permanently run on all systems
(physical & virtual servers, containers, IoT devices), without
disrupting their core function.

Shipped version: 1.10.0

Customization brought by the package:

  • MySQL statistics, through a dedicated netdata MySQL user
  • nginx root log statistics, by adding the netdata user to the adm group
  • Dovecot statistics, by giving the netdata user access to the Dovecot stats socket (works only with Dovecot 2.2.16+)
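
The three customizations above boil down to a few statements and commands. The sketch below only builds them as strings so they can be reviewed before being run by hand. The SQL and the `setfacl` call mirror the install trace quoted later in this thread; the `usermod` call is an assumption about how the adm group membership is granted, not something taken from the package scripts.

```python
def mysql_statements(user="netdata", host="localhost"):
    """SQL that creates a password-less account limited to USAGE (statistics only)."""
    account = f"'{user}'@'{host}'"
    return [
        f"create user {account};",
        f"grant usage on *.* to {account} with grant option;",
        "flush privileges;",
    ]


def system_commands(user="netdata"):
    """Shell commands, as argument lists, for the log and Dovecot access."""
    return [
        # Assumption: group membership is added with usermod.
        ["usermod", "-a", "-G", "adm", user],
        # Seen in the install trace: ACL on the Dovecot stats socket.
        ["setfacl", "-m", f"u:{user}:rw", "/var/run/dovecot/stats"],
    ]


if __name__ == "__main__":
    for statement in mysql_statements():
        print(statement)
    for command in system_commands():
        print(" ".join(command))
```

Nothing is executed here; the output is meant to be read, then run as root if it matches what you want.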

Further recommendations:
YunoHost packages are not allowed to make sensitive changes to system files, so here are further customizations you can make yourself to enable more monitoring:

It has been tested on x86_64 and ARM.

Features

  • Stunning interactive bootstrap dashboards

    mouse and touch friendly, in 2 themes: dark, light

  • Amazingly fast

    responds to all queries in less than 0.5 ms per metric,
    even on low-end hardware

  • Highly efficient

    collects thousands of metrics per server per second,
    with just 1% CPU utilization of a single core, a few MB of RAM and no disk I/O at all

  • Sophisticated alarming

    hundreds of alarms, out of the box!

    supports dynamic thresholds, hysteresis, alarm templates,
    multiple role-based notification methods (such as email, slack.com,
    pushover.net, pushbullet.com, telegram.org, twilio.com, messagebird.com)

  • Extensible

    you can monitor anything you can get a metric for,
    using its Plugin API (anything can be a netdata plugin,
    BASH, python, perl, node.js, java, Go, ruby, etc)

  • Embeddable

    it can run anywhere a Linux kernel runs (even IoT)
    and its charts can be embedded on your web pages too

  • Customizable

    custom dashboards can be built using simple HTML (no javascript necessary)

  • Zero configuration

    auto-detects everything, it can collect up to 5000 metrics
    per server out of the box

  • Zero dependencies

    it is even its own web server, for its static web files and its web API

  • Zero maintenance

    you just run it, it does the rest

  • Scales to infinity

    requiring minimal central resources

  • Several operating modes

    autonomous host monitoring, headless data collector, forwarding proxy, store and forward proxy, central multi-host monitoring, in all possible configurations.
    Each node may have different metrics retention policy and run with or without health monitoring.

  • Time-series back-ends supported

    can archive its metrics on graphite, opentsdb, prometheus, json document DBs, in the same or lower detail
    (lower: to prevent it from congesting these servers due to the amount of data collected)
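
As a rough illustration of the web API mentioned under “Zero dependencies”, the sketch below builds a query URL for the recent data of one chart. It assumes the `/api/v1/data` endpoint with `chart`, `after` and `format` parameters and NetData's default port 19999. On a YunoHost install the base URL will be whatever domain and path were chosen at install time, and the chart name `system.cpu` is only an example.

```python
from urllib.parse import urlencode


def data_url(base_url, chart, seconds=60, fmt="json"):
    """Build a URL asking for the last `seconds` seconds of one chart."""
    # A negative 'after' is relative to now (here: the last minute by default).
    query = urlencode({"chart": chart, "after": -seconds, "format": fmt})
    return f"{base_url.rstrip('/')}/api/v1/data?{query}"


if __name__ == "__main__":
    # Hypothetical local instance; replace with your own base URL.
    print(data_url("http://localhost:19999", "system.cpu"))
```

The resulting URL can be fetched with any HTTP client; if the app is installed as private, the request has to go through the YunoHost SSO first.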


What does it monitor?

netdata collects several thousands of metrics per device.
All these metrics are collected and visualized in real-time.

Almost all metrics are auto-detected, without any configuration.

This is a list of what it currently monitors:

  • CPU

    usage, interrupts, softirqs, frequency, total and per core, CPU states

  • Memory

    RAM, swap and kernel memory usage, KSM (Kernel Samepage Merging), NUMA

  • Disks

    per disk: I/O, operations, backlog, utilization, space, software RAID (md)

  • Network interfaces

    per interface: bandwidth, packets, errors, drops

  • IPv4 networking

    bandwidth, packets, errors, fragments,
    tcp: connections, packets, errors, handshake,
    udp: packets, errors,
    broadcast: bandwidth, packets,
    multicast: bandwidth, packets

  • IPv6 networking

    bandwidth, packets, errors, fragments, ECT,
    udp: packets, errors,
    udplite: packets, errors,
    broadcast: bandwidth,
    multicast: bandwidth, packets,
    icmp: messages, errors, echos, router, neighbor, MLDv2, group membership,
    break down by type

  • Interprocess Communication - IPC

    such as semaphores and semaphore arrays

  • netfilter / iptables Linux firewall

    connections, connection tracker events, errors

  • Linux DDoS protection

    SYNPROXY metrics

  • fping latencies

    for any number of hosts, showing latency, packets and packet loss

  • Processes

    running, blocked, forks, active

  • Entropy

    random number pool, used in cryptography

  • NFS file servers and clients

    NFS v2, v3, v4: I/O, cache, read ahead, RPC calls

  • Network QoS

    the only tool that visualizes network tc classes in realtime

  • Linux Control Groups

    containers: systemd, lxc, docker

  • Applications

    by grouping the process tree and reporting CPU, memory, disk reads,
    disk writes, swap, threads, pipes, sockets - per group

  • Users and User Groups resource usage

    by summarizing the process tree per user and group,
    reporting: CPU, memory, disk reads, disk writes, swap, threads, pipes, sockets

  • Apache and lighttpd web servers

    mod-status (v2.2, v2.4) and cache log statistics, for multiple servers

  • Nginx web servers

    stub-status, for multiple servers

  • Tomcat

    accesses, threads, free memory, volume

  • web server log files

    extracting web server performance metrics in real time and applying several health checks

  • MySQL databases

    multiple servers, each showing: bandwidth, queries/s, handlers, locks, issues,
    tmp operations, connections, binlog metrics, threads, innodb metrics, and more

  • Postgres databases

    multiple servers, each showing: per database statistics (connections, tuples
    read - written - returned, transactions, locks), backend processes, indexes,
    tables, write ahead, background writer and more

  • Redis databases

    multiple servers, each showing: operations, hit rate, memory, keys, clients, slaves

  • mongodb

    operations, clients, transactions, cursors, connections, asserts, locks, etc

  • memcached databases

    multiple servers, each showing: bandwidth, connections, items

  • elasticsearch

    search and index performance, latency, timings, cluster statistics, threads statistics, etc

  • ISC Bind name servers

    multiple servers, each showing: clients, requests, queries, updates, failures and several per view metrics

  • NSD name servers

    queries, zones, protocols, query types, transfers, etc.

  • Postfix email servers

    message queue (entries, size)

  • exim email servers

    message queue (emails queued)

  • Dovecot POP3/IMAP servers

  • ISC dhcpd

    pools utilization, leases, etc.

  • IPFS

    bandwidth, peers

  • Squid proxy servers

    multiple servers, each showing: clients bandwidth and requests, servers bandwidth and requests

  • HAproxy

    bandwidth, sessions, backends, etc

  • varnish

    threads, sessions, hits, objects, backends, etc

  • OpenVPN

    status per tunnel

  • Hardware sensors

    lm_sensors and IPMI: temperature, voltage, fans, power, humidity

  • NUT and APC UPSes

    load, charge, battery voltage, temperature, utility metrics, output metrics

  • PHP-FPM

    multiple instances, each reporting connections, requests, performance

  • hddtemp

    disk temperatures

  • smartd

    disk S.M.A.R.T. values

  • SNMP devices

    can be monitored too (although you will need to configure these)

  • statsd

    netdata is a fully featured statsd server
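
To make the statsd item concrete, here is a minimal sketch of an application pushing metrics to a statsd server such as NetData. It assumes the plain statsd line protocol (`name:value|type`) over UDP on the conventional port 8125. The `myapp.*` metric names are made up for illustration.

```python
import socket


def statsd_line(name, value, metric_type="g"):
    """Format one metric in the statsd line protocol: name:value|type."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="g", host="127.0.0.1", port=8125):
    """Send a single metric as one UDP datagram (fire and forget)."""
    payload = statsd_line(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    print(statsd_line("myapp.logins", 1, "c"))            # counter
    print(statsd_line("myapp.queue_length", 42, "g"))     # gauge
    print(statsd_line("myapp.request_time", 120, "ms"))   # timer
```

Calling `send_metric("myapp.logins", 1, "c")` from the application would then be enough for a chart to appear, provided the statsd listener is enabled and reachable on that port.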

And you can extend it, by writing plugins that collect data from any source, using any computer language.
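
As a hedged sketch of what such a plugin looks like: an external plugin is a program that writes plain text lines to its standard output, first declaring a chart and its dimensions, then reporting values. The keywords used below (`CHART`, `DIMENSION`, `BEGIN`, `SET`, `END`) are my understanding of that text protocol. The chart id `example.random` and the dimension names are illustrative only.

```python
import random


def chart_definition(chart_id, title, units, dimensions):
    """Lines that declare a chart and its dimensions (sent once at startup)."""
    lines = [f"CHART {chart_id} '' '{title}' '{units}'"]
    lines += [f"DIMENSION {dim}" for dim in dimensions]
    return lines


def chart_update(chart_id, values):
    """Lines that report one round of collected values for a chart."""
    lines = [f"BEGIN {chart_id}"]
    lines += [f"SET {dim} = {value}" for dim, value in values.items()]
    lines.append("END")
    return lines


if __name__ == "__main__":
    dims = ["random1", "random2"]
    for line in chart_definition("example.random", "Random numbers", "value", dims):
        print(line)
    # A real plugin would repeat this block once per update interval.
    for line in chart_update("example.random", {d: random.randint(0, 100) for d in dims}):
        print(line, flush=True)
```

A real plugin would loop forever, sleeping between updates, and would be dropped into the plugins directory of the NetData installation.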



#2

You made my day!

Thanks a lot for your work, I’m very happy to install and test this tool! :slight_smile:

Just a question: why is it public by default? (People can easily make a mistake here.)

edit: strangely, it is installed with the two letters “ne” (not “Ne”) on the icon, and it sits at the end of the list (after Z), so the alphabetical order is wrong ^^


#3

… because I just didn’t think about it! :slight_smile:
I tend to agree with you, I’ll change to private as default.

Nice catch! It comes from the name attribute “netdata” in the manifest (uppercase letters sort before lowercase ones)… it will be changed to “NetData”! You’ll have to change the label on your server anyway.
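
For the curious, the ordering effect is easy to reproduce. This is only an illustration of a plain code-point sort versus a case-insensitive one, not the actual YunoHost code, and the other app names are arbitrary examples.

```python
apps = ["Wordpress", "netdata", "Baikal", "Zerobin"]

# Plain sort compares code points: every uppercase letter precedes lowercase.
by_code_point = sorted(apps)

# Case-insensitive sort puts "netdata" where a reader expects it.
case_insensitive = sorted(apps, key=str.casefold)

if __name__ == "__main__":
    print(by_code_point)
    print(case_insensitive)
```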

Thanks for your feedback :wink:


#4

Already done :smiley:

By the way, I’m currently looking for a way to change the swap alarm values. This server has only 1 GB of RAM (Raspberry Pi 3) and 3 GB of swap. I set a pretty high swappiness value, so I often have ~400 MB of swap in use.
That raises an alarm because the swap in use exceeds 30% of the RAM size, which is the intended behaviour.

edit: I just needed to edit /opt/netdata/etc/netdata/health.d/swap.conf and change the values (maybe it’s possible through the web interface, but I didn’t find where).
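
To see why the alarm fires with these numbers, here is a small worked example. It assumes the alarm compares swap in use to total RAM, as described above, and reads the standard `MemTotal`, `SwapTotal` and `SwapFree` fields of /proc/meminfo. The sample values are illustrative (1 GB RAM, 3 GB swap, 400 MB swapped out), not taken from a real machine.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integers (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_percent_of_ram(fields):
    """Swap in use, expressed as a percentage of total RAM."""
    swap_used = fields["SwapTotal"] - fields["SwapFree"]
    return 100.0 * swap_used / fields["MemTotal"]


# Illustrative sample: 1 GB RAM, 3 GB swap, 400 MB of swap in use.
SAMPLE = """MemTotal:        1048576 kB
SwapTotal:       3145728 kB
SwapFree:        2736128 kB
"""

if __name__ == "__main__":
    percent = swap_percent_of_ram(parse_meminfo(SAMPLE))
    print(f"swap in use = {percent:.1f}% of RAM")
```

With about 39% against a 30% threshold the alarm is expected, so on a small-RAM, large-swap machine the threshold in swap.conf has to be raised well above that figure.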


#5

You did the right thing: there’s no direct access to these parameters from the web interface.
There is extensive documentation on the NetData wiki.


#6

I have a small feature request (maybe it’s better to post it on GitHub?).

By default this software tries to contact the netdata servers, to count how many servers are monitored (I haven’t read the code yet to check whether that’s the only thing it does). I don’t want that, so I deleted the script doing it.
Could you add an option (during install?) to disable it? Thanks :slight_smile:


#7

Could you please tell me the name of the script?
I suppose it is linked to the registry mechanism; I wouldn’t recommend offering a low-level option to disable it, as it can’t do any harm…


#8

Sorry, I made a mistake: I didn’t delete it, I blocked it with uMatrix (a Firefox extension), which blocks the call to registry.my-netdata.io
edit: Reading the code, it doesn’t seem so easy to block, so it might be simpler not to add an option. I’ve already solved the problem on my side.
Sorry for the inconvenience :sweat_smile:


#9

NetData package was just updated!

Enhancements

  • added automatic monitoring of MySQL, Nginx default logs (to be extended by configuration) and Dovecot (only works with Dovecot 2.2.16+)
  • provided further information in README to extend monitoring to Nginx requests/connections, phpfpm and Nginx web logs - needs system modifications that can’t (shouldn’t!) be made via this package
  • raised to quality level 7

Happy upgrading! :wink:


#10

Thanks for the update :slight_smile:

Do you know if there is a way to link the alert sent by email to another action, like launching a script?
(in my case I’d reboot the server in case of big swap usage, to prevent it from being unavailable if one process fails)
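
One way to picture what I have in mind, independently of NetData's own notification mechanism: a small watchdog that only acts when swap usage stays above a threshold for several consecutive samples, so a short spike does not trigger a reboot. This is a sketch under my own assumptions; the threshold, the sample count and the `systemctl reboot` command are placeholders, and it runs in dry-run mode by default.

```python
import subprocess


def sustained_breach(samples, threshold, needed=3):
    """True if the last `needed` samples are all above `threshold`."""
    if len(samples) < needed:
        return False
    return all(value > threshold for value in samples[-needed:])


def react(samples, threshold=80.0, needed=3,
          command=("systemctl", "reboot"), dry_run=True):
    """Return the command that would run, or None if no action is needed."""
    if not sustained_breach(samples, threshold, needed):
        return None
    if not dry_run:
        # Placeholder action; requires root privileges.
        subprocess.run(list(command), check=True)
    return list(command)


if __name__ == "__main__":
    # Swap usage percentages collected once per minute (made-up values).
    print(react([10.0, 85.0, 90.0, 95.0]))
    print(react([90.0, 95.0, 10.0]))
```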


#11

Unfortunately, I don’t know of any way to do that simply (but my NetData knowledge is very basic!).


#12

Maybe I should manually modify the netdata shell script. Not a very convenient way to do it… I’ll investigate :slight_smile:


#13

NetData package was just updated to version 1.6.0!

The official change log is available here.

Happy upgrading! :wink:


#14

Failure :’( I’ve posted part of the install logs; any idea what the cause is?

Version
Debian 8.8 64bit (Linux 4.5.1-std-1)
yunohost 2.5.6
yunohost-admin 2.5.1
moulinette 2.5.2
ssowat 2.6.8

Installation failed

+ sudo systemctl reload nginx

+ sudo rm -rf /home/yunohost.app/netdata

+ sudo rm -f /etc/nginx/conf.d/vincentux.fr.d/netdata.conf

+ mysql -uroot -*************

++ sudo cat /etc/yunohost/mysql

+ echo 'drop user '\''netdata'\''@'\''localhost'\'';'

Unknown service 'netdata'

+ sudo yunohost service remove netdata

userdel: user netdata is currently used by process 31850

+ sudo userdel netdata

+ sudo setfacl -x u:netdata /var/run/dovecot/stats

delete mode 120000 alternatives/readline-editor.1.gz

delete mode 120000 alternatives/readline-editor

2 files changed, 2 deletions(-)

Author: admin

[master a43e250] committing changes in /etc after apt run

Processing triggers for libc-bin (2.19-18+deb8u9) ...

Processing triggers for man-db (2.7.0.2-5) ...

Removing uuid-dev:amd64 (2.25.2-6) ...

Removing rlwrap (0.41-1) ...

Removing python-mysqldb (1.2.3-2.1) ...

Removing libopts25-dev:amd64 (1:5.18.4-3) ...

Removing libmnl-dev (1.0.3-5) ...

Removing libgc1c2:amd64 (1:7.2d-6.4) ...

Removing jq (1.4-2.1+deb8u1) ...

Removing guile-2.0-libs:amd64 (2.0.11+1-9+deb8u1) ...

Removing freeipmi-common (1.4.5-3) ...

Removing libfreeipmi16 (1.4.5-3) ...

Removing libipmimonitoring5a (1.4.5-3) ...

Removing libipmimonitoring-dev (1.4.5-3) ...

Removing autogen-doc (1:5.18.4-3) ...

Removing autogen (1:5.18.4-3) ...

Removing netdata-deps (1.6.0-1) ...

(Reading database ... 52067 files and directories currently installed.)

create mode 100644 systemd/system/netdata.service

create mode 120000 systemd/system/multi-user.target.wants/netdata.service

create mode 100644 logrotate.d/netdata

16 files changed, 48 insertions(+), 4 deletions(-)

Author: admin

[master ac6ec38] saving uncommitted changes in /etc prior to apt run

+ sudo apt-get -y -qq autoremove netdata-deps

+ DEBIAN_FRONTEND=noninteractive

+ ynh_apt autoremove netdata-deps

+ ynh_package_autoremove netdata-deps

sudo: ./netdata-uninstaller.sh: command not found

+ sudo ./netdata-uninstaller.sh --force

sed: can't read netdata-uninstaller.sh: No such file or directory

+ sudo sed -i 's/rm -I/rm -f/g' netdata-uninstaller.sh

sed: can't read netdata-uninstaller.sh: No such file or directory

+ sudo sed -i 's/rm -i/rm -f/g' netdata-uninstaller.sh

+ cd /tmp

mv: cannot stat ‘/opt/netdata/etc/netdata/netdata-uninstaller.sh’: No such file or directory

+ sudo mv /opt/netdata/etc/netdata/netdata-uninstaller.sh /tmp

+ UNINSTALL_SCRIPT=netdata-uninstaller.sh

+ domain=vincentux.fr

++ sudo yunohost app setting netdata domain --output-as plain --quiet

++ ynh_app_setting_get netdata domain

++ DEPS_PKG_NAME=netdata-deps

++ APPLICATION_SOURCE_URL=https://github.com/firehol/netdata/releases/download/v1.6.0/netdata-1.6.0.tar.gz

++ VERSION=1.6.0

+ source ./_common.sh

++ . /usr/share/yunohost/helpers.d/utils

++ '[' -r /usr/share/yunohost/helpers.d/utils ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/user

++ '[' -r /usr/share/yunohost/helpers.d/user ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/string

++ '[' -r /usr/share/yunohost/helpers.d/string ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/setting

++ '[' -r /usr/share/yunohost/helpers.d/setting ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/print

++ '[' -r /usr/share/yunohost/helpers.d/print ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/package

++ '[' -r /usr/share/yunohost/helpers.d/package ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

+++ MYSQL_ROOT_PWD_FILE=/etc/yunohost/mysql

++ . /usr/share/yunohost/helpers.d/mysql

++ '[' -r /usr/share/yunohost/helpers.d/mysql ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

++ . /usr/share/yunohost/helpers.d/ip

++ '[' -r /usr/share/yunohost/helpers.d/ip ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

+++ CAN_BIND=1

++ . /usr/share/yunohost/helpers.d/filesystem

++ '[' -r /usr/share/yunohost/helpers.d/filesystem ']'

++ for helper in '$(run-parts --list /usr/share/yunohost/helpers.d 2>/dev/null)'

+++ run-parts --list /usr/share/yunohost/helpers.d

+ source /usr/share/yunohost/helpers

+ app=netdata

+ set -u

Exécution du script « /var/cache/yunohost/from_file/netdata_ynh-master/scripts/remove »...

setfacl: /var/run/dovecot/stats: Operation not supported

+ sudo setfacl -m u:netdata:rw /var/run/dovecot/stats

+ mysql -u root --password=********** -B ''

+ ynh_mysql_connect_as root r5L9BdZQ ''

++ sudo cat /etc/yunohost/mysql

flush privileges;'

grant usage on *.* to '\''netdata'\''@'\''localhost'\'' with grant option;

+ ynh_mysql_execute_as_root 'create user '\''netdata'\''@'\''localhost'\'';

enjoy real-time performance and health monitoring...

+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->

| '-' '-' '-' '-' '-' is installed and running now! -' '-'

|.-. .-. .-. .-. .-. . netdata .-. .-

^

--- We are done! ---

#15

The first part of the logs seems to be missing; the part you posted corresponds to the uninstall attempt that followed the failed installation. Do you still have it?


#16

No, I couldn’t retrieve the logs… but I think I didn’t have enough RAM to install it.


#17

NetData package was just updated to version 1.7.0!

The official change log is available here.

Moreover, the package has been enhanced to setup NetData to look into nginx access logs for every configured domain on your YunoHost instance.

Happy upgrading! :wink:


#18

NetData package was just updated to version 1.8.0!

The official change log is available here.

Moreover, configuration for nginx access logs has been fixed for good.

Happy upgrading! :wink:


#19

NetData package was just updated to version 1.9.0!

The official change log is available here.

Happy upgrading! :wink:


#20

Minor update: settings adapted to automatically monitor PostgreSQL if installed by a YunoHost application (e.g. synapse).

Happy upgrading! :wink: