Graphite, Grafana with gunicorn and NGINX

At work I have wanted to implement system monitoring using graphite and grafana. Eventually I will use this particularly to monitor our Lustre storage system with either collectl or collectd shipping Lustre and general host stats to graphite. To try things out I have installed the stack onto my Kimsufi server.

The graphite installation documentation uses apache modfcgi, but I'm running various other python projects with gunicorn and then an NGINX or Apache front-end proxy, so will use gunicorn to run the graphite-web app, and NGINX to proxy it and grafana over https.

Installation on CentOS 7

I run CentOS 7 on my Kimsufi 2c server, and the procedure to install graphite, grafana, gunicorn, nginx is as follows:

EPEL repository

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Graphite Dependencies & Graphite Packages

# Distribution packages
yum install -y httpd net-snmp perl python-devel git gcc-c++ pycairo mod_wsgi libffi-devel
yum install -y python-pip node npm
# Python packages via pip
pip install django
pip install django-tagging
pip install pytz  
pip install Twisted==16.4.1
# Installing graphite from master, as the stable release is quite old at the moment
export PYTHONPATH="/opt/graphite/lib/:/opt/graphite/webapp/"
pip install --no-binary=:all: https://github.com/graphite-project/whisper/tarball/master
pip install --no-binary=:all: https://github.com/graphite-project/carbon/tarball/master
pip install --no-binary=:all: https://github.com/graphite-project/graphite-web/tarball/master

Configure Graphite

Copy the example configuration into live locations

sudo cp /opt/graphite/conf/storage-schemas.conf.example /opt/graphite/conf/storage-schemas.conf
sudo cp /opt/graphite/conf/storage-aggregation.conf.example /opt/graphite/conf/storage-aggregation.conf  
sudo cp /opt/graphite/conf/graphTemplates.conf.example /opt/graphite/conf/graphTemplates.conf 
sudo cp /opt/graphite/conf/graphite.wsgi.example /opt/graphite/conf/graphite.wsgi  
sudo cp /opt/graphite/webapp/graphite/local_settings.py.example /opt/graphite/webapp/graphite/local_settings.py  
sudo cp /opt/graphite/conf/carbon.conf.example /opt/graphite/conf/carbon.conf

Configure the storage schema:

vi /opt/graphite/conf/storage-schemas.conf

Add the default retention:

[default]
pattern = .*
retentions = 12s:4h, 2m:3d, 5m:8d, 13m:32d, 1h:1y
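As a quick sanity check on a retention string, the sketch below parses the precision:duration pairs into seconds-per-point and point counts. It then checks two rules that, as far as I understand it, whisper enforces when creating a file: each coarser precision must be a multiple of the finer one before it, and each coarser archive must cover a longer period. The function names are my own, not part of carbon or whisper, and only the unit-suffixed form (such as 12s:4h) is handled.

```python
import re

# Seconds per unit suffix, as used in storage-schemas.conf retention strings.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(text):
    """Convert a value such as '12s' or '4h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", text.strip())
    if match is None:
        raise ValueError("unsupported retention value: %r" % text)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retentions(spec):
    """Return [(seconds_per_point, points), ...] for a retention string."""
    archives = []
    for part in spec.split(","):
        precision, duration = (to_seconds(v) for v in part.split(":"))
        archives.append((precision, duration // precision))
    return archives


def check_archives(archives):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for (fine, fine_points), (coarse, coarse_points) in zip(archives, archives[1:]):
        if coarse % fine:
            problems.append("%ds is not a multiple of %ds" % (coarse, fine))
        if coarse * coarse_points <= fine * fine_points:
            problems.append("%ds archive does not cover a longer period" % coarse)
    return problems


if __name__ == "__main__":
    archives = parse_retentions("12s:4h, 2m:3d, 5m:8d, 13m:32d, 1h:1y")
    for precision, points in archives:
        print("%5ds per point, %6d points" % (precision, points))
    for problem in check_archives(archives):
        print("warning:", problem)
```

If it prints warnings, adjusting the precisions so that each is a multiple of the previous one, before carbon starts creating whisper files, is likely easier than resizing files afterwards.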

Add a user account for graphite:

useradd -d /opt/graphite graphite
chown graphite /opt/graphite -R

Configure the carbon daemon, and start it

vi /opt/graphite/conf/carbon.conf

Make these edits:

USER = graphite

Create the systemd unit file /etc/systemd/system/carbon.service

[Unit]
Description = Carbon Metrics store

[Service]
Type = forking
GuessMainPID = false
PIDFile = /opt/graphite/storage/carbon-cache-a.pid
ExecStart = /opt/graphite/bin/carbon-cache.py start

[Install]
WantedBy = multi-user.target

Reload systemd, then start and enable the service:

systemctl daemon-reload
systemctl start carbon
systemctl status carbon
systemctl enable carbon

Configure and enable graphite-web

Edit /opt/graphite/webapp/graphite/local_settings.py

SECRET_KEY = '' # Get a random hash to put here
DEBUG = False
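One way to get a suitable random value is Python's secrets module. This is a minimal sketch assuming Python 3.6 or later is available somewhere (the stock Python on CentOS 7 is 2.7, which lacks secrets); the helper name is hypothetical.

```python
import secrets


def make_secret_key(nbytes=48):
    """Return a URL-safe random string suitable for pasting into SECRET_KEY."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Prints a ready-to-paste line for local_settings.py.
    print("SECRET_KEY = '%s'" % make_secret_key())
```

The output only contains letters, digits, '-' and '_', so it can be pasted between the quotes without escaping.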

Install nginx, gunicorn

yum install nginx
pip install gunicorn

Setup webapp

cp /opt/graphite/conf/graphite.wsgi.example /opt/graphite/webapp/wsgi.py
cd /opt/graphite/webapp/graphite
sudo -u graphite python manage.py migrate

Create /etc/systemd/system/graphite-web.service

[Unit]

[Service]
Environment=PYTHONPATH="/opt/graphite/lib/:/opt/graphite/webapp/"
WorkingDirectory=/opt/graphite/webapp
ExecStart=/usr/bin/gunicorn -u graphite -g graphite -b 127.0.0.1:8080 --log-file=/opt/graphite/storage/log/webapp/gunicorn.log wsgi:application
Restart=on-failure
User=graphite
Group=graphite
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Reload systemd, then start and enable the service:

systemctl daemon-reload
systemctl start graphite-web
systemctl status graphite-web
systemctl enable graphite-web

NGINX proxy for graphite-web

Set up a self-signed certificate for nginx for now

cd /etc/nginx
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/nginx/nginx-selfsigned.key -out /etc/nginx/nginx-selfsigned.crt

Create /etc/nginx/conf.d/graphite.conf

server {
    listen 80;
    return 301 https://$host$request_uri;
}

server {

    listen 443;
    server_name graphite.example.com;

    ssl_certificate           /etc/nginx/nginx-selfsigned.crt;
    ssl_certificate_key       /etc/nginx/nginx-selfsigned.key;

    ssl on;
    ssl_session_cache  builtin:1000  shared:SSL:10m;
    ssl_protocols  TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    location /graphite/static/ {
        alias /opt/graphite/webapp/content/;
    }

    location /graphite {

      proxy_set_header        Host $host;
      proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header        X-Forwarded-Proto $scheme;

      # Fix the "It appears that your reverse proxy set up is broken" error.
      proxy_pass          http://127.0.0.1:8080/graphite;
      proxy_read_timeout  90;

      proxy_redirect      http://127.0.0.1:8080/graphite https://graphite.example.com/graphite;
    }
  }

Open the firewall and start nginx

firewall-cmd --add-service=http
firewall-cmd --add-service=https
firewall-cmd --runtime-to-permanent

systemctl start nginx
systemctl enable nginx

Try collectl locally to sanity check the install

I will use collectl, running locally on the same server, to make sure that data is getting into graphite.

yum install collectl
# Test it
collectl --export graphite,127.0.0.1
# Configure in /etc/collectl.conf
DaemonCommands = -smcdn --export=graphite,127.0.0.1
# Start
systemctl start collectl
systemctl enable collectl

Now browse to https://localhost/graphite and check that data is making it into graphite, and the graphite-web interface is working OK.
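If nothing shows up, it can help to push a single data point by hand and look for it in the metric tree. The sketch below uses carbon's plaintext protocol (one "path value timestamp" line per data point) and assumes carbon-cache is listening on its default line receiver port, 2003, on the local host. The function names and the metric name are made up for this example.

```python
import socket
import time


def format_metric(path, value, timestamp):
    """Build one line of carbon's plaintext protocol: path, value, timestamp."""
    return "%s %s %d\n" % (path, value, int(timestamp))


def send_metric(path, value, host="127.0.0.1", port=2003, timestamp=None):
    """Send a single data point to carbon-cache over TCP."""
    if timestamp is None:
        timestamp = time.time()
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
```

Calling send_metric("test.sanity_check", 42) on the server should make a test.sanity_check metric appear in graphite-web shortly afterwards, provided carbon is running with the default port.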

Install grafana

Grafana is a nicer dashboard front-end than the default graphite-web, so we will use that by default.

yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.2.0-1.x86_64.rpm
systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server
systemctl enable grafana-server

Add the Grafana proxy location inside the HTTPS server block in /etc/nginx/conf.d/graphite.conf

location / {

  proxy_set_header        Host $host;
  proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header        X-Forwarded-Proto $scheme;

  # Fix the "It appears that your reverse proxy set up is broken" error.
  proxy_pass          http://127.0.0.1:3000/;
  proxy_read_timeout  90;

}

Then restart nginx:

systemctl restart nginx

Configure Grafana

  • Browse to the host URL and log in with default user/password admin:admin
  • Add data source
  • Name graphite, type graphite, URL http://localhost:8080/graphite, access=proxy
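The same data source can also be created through Grafana's HTTP API rather than the web UI. This is a sketch only: it builds a POST to /api/datasources with the settings listed above, authenticating with HTTP basic auth. The function name is hypothetical, and the Grafana address passed in is an assumption about your setup.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, user, password):
    """Build (but do not send) a request that adds the graphite data source."""
    payload = {
        "name": "graphite",
        "type": "graphite",
        "url": "http://localhost:8080/graphite",
        "access": "proxy",
    }
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen() sends it. Talking to Grafana directly on http://127.0.0.1:3000 from the server itself avoids certificate verification errors from the self-signed certificate.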

Fixing Old Email - GMail Import Dates & MIME Multipart Problems

I have a collection of email that has followed me around since the mid-2000s, moving between web hosts, GMail, Outlook.com and iCloud as they added useful features, as I moved from Android to iOS, and so on. I've come across two problems as I've moved mail around over the years:

  • After importing into GMail from other IMAP accounts everything looks great in GMail, but on moving the mail to another host the received date changes to the date of the GMail import.
  • On a few occasions, moving mail around using different desktop email clients, MIME multipart messages have broken. I end up seeing the raw source, not the HTML mail, attachments etc.

Fixing GMail Import Mangled Dates

When you move from another IMAP account into GMail, using the GMail online account import tool, everything looks great. Unfortunately when later copying mail somewhere else (e.g. iCloud) the received dates may show up incorrectly.

To fix:

  1. Download email from Google Takeout - gives you an mbox file containing all mail. Unfortunately this loses any folder structure, but never mind.
  2. Examine the mbox file. The messages imported by gmail will have additional headers, inserted at the top of each mail, e.g.

    From 1245753982836402098@xxx Sat Aug 25 12:06:17 +0000 2007
    Delivered-To: xxxxx@gmail.com
    Received: by 10.107.187.193 with SMTP id l184csp149097iof; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
    X-Received: by 10.107.170.32 with SMTP id t32mr30219550ioe.173.1442879371734; Mon, 21 Sep 2015 16:49:31 -0700 (PDT)
    Received: from 303668833448.apps.googleusercontent.com named unknown by gmailapi.google.com with HTTPREST; Mon, 21 Sep 2015 19:49:31 -0400
    Received: from web38814.mail.mud.yahoo.com (209.191.125.105) by spam2.34sp.com with SMTP; 25 Aug 2007 13:13:01 +0100

The original Received header shows the message was received 25 Aug 2007. Unfortunately clients other than gmail will display the 21 Sep 2015 date in the Received headers added by gmail.

To fix this, remove the GMail-added headers from each message in the mbox file. This can be accomplished with some creative regex, e.g. in Sublime Text. The headers vary a little between the imports I've seen. The fixed mbox file can then be imported into a mail client.
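As an alternative to hand-written regex, the same clean-up can be sketched with Python's standard mailbox and email modules. The rules in is_gmail_import_header() are a heuristic based only on the example headers shown above (X-Received, a Delivered-To for a gmail.com address, and Received lines mentioning gmailapi.google.com or starting with "by 10."). They are my assumption, not a definitive list, and should be adjusted to what your own mbox contains. All function names are hypothetical.

```python
import email.message
import mailbox


def is_gmail_import_header(name, value):
    """Heuristic: does this header look like one added by the GMail import?"""
    name = name.lower()
    value = str(value).strip()
    if name == "x-received":
        return True
    if name == "delivered-to":
        return value.lower().endswith("@gmail.com")
    if name == "received":
        return value.startswith("by 10.") or "gmailapi.google.com" in value
    return False


def strip_gmail_import_headers(msg: email.message.Message):
    """Remove the GMail-added headers in place, keeping the others in order."""
    headers = msg.items()          # snapshot of (name, value) pairs
    for name in set(msg.keys()):   # deleting a name removes every occurrence
        del msg[name]
    for name, value in headers:
        if not is_gmail_import_header(name, value):
            msg[name] = value      # assignment appends, so order is preserved
    return msg


def fix_mbox(src_path, dst_path):
    """Copy every message from src_path to a new mbox with the headers stripped."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for message in src:
            dst.add(strip_gmail_import_headers(message))
        dst.flush()
    finally:
        src.close()
        dst.close()
```

Writing to a new file rather than editing in place keeps the Takeout download intact in case the heuristic removes too much.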

Fixing MIME errors

After using a number of different clients to move mail around over several years (Thunderbird, Outlook, OSX Mail, Windows Live Mail), I've often seen MIME multipart emails with HTML and attachments become broken. The message is moved or copied but will no longer display properly - I see the raw source of all the MIME parts, cannot view attachments etc.

The problem turns out to be that the clients or servers somehow inserted a spurious Content-Type header above the original Content-Type header for the MIME mail. The added header prevents the message being read correctly.

An example:

From: xxxxxxx <xxxxxx@gmail.com>
Sender: <xxxx@gmail.com>
To: "xxxxxx"
References: <df696fdc-b801-4be8-bcb0-f3d59169dcba@SwitchService>
Date: Wed, 11 Sep 2013 16:19:50 -0500
Message-ID: <06460AE9-3D64-4422-88A2-C05D869A22BC@xxxxxxxxx>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 15.0
Content-Language: en-us
X-Google-Sender-Auth: 1IItZObPtrZmxbo-5dealV_naxQ
Content-type: multipart/alternative;
        boundary="B_3521739430_567641463"

> This message is in MIME format. Since your mail reader does not
understand
this format, some or all of this message may not be legible.

--B_3521739430_567641463
Content-type: text/plain;
        charset="UTF-8"
Content-transfer-encoding: 7bit

In an email client I see all the source, from the original MIME Content-Type header…

Content-type: multipart/alternative;
        boundary="B_3521739430_567641463"

… which is being overridden by the header further up, which has crept in at some point:

MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

Once this block is removed the message is correctly recognized as MIME multipart, and displays properly. This can also be fixed in an mbox file with some find and replace.
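The find and replace can be sketched in Python as follows. It works on the raw text of one message. If the header section has more than one Content-Type and the last one is multipart, it drops the earlier Content-Type headers and any Content-Transfer-Encoding header that precedes the real one. Unlike the manual fix described above, this sketch leaves MIME-Version in place, which is my own choice. It also assumes LF line endings, and the function name is hypothetical.

```python
import re


def drop_spurious_content_type(raw):
    """Return raw message text with earlier, overriding Content-Type headers removed."""
    head, sep, body = raw.partition("\n\n")
    # Split into logical header fields; folded continuation lines start with
    # a space or tab and stay attached to their field.
    fields = re.split(r"\n(?=[^ \t])", head)
    ct_idx = [i for i, f in enumerate(fields)
              if f.lower().startswith("content-type:")]
    if len(ct_idx) < 2:
        return raw  # nothing to fix
    last = ct_idx[-1]
    if "multipart/" not in fields[last].lower():
        return raw  # not the pattern described here; leave untouched
    drop = set(ct_idx[:-1])
    drop.update(i for i, f in enumerate(fields)
                if i < last and f.lower().startswith("content-transfer-encoding:"))
    kept = [f for i, f in enumerate(fields) if i not in drop]
    return "\n".join(kept) + sep + body
```

Messages that do not match the pattern are returned unchanged, so it should be safe to run over every message in an mbox, though I would still keep a backup of the original file.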

Bioconductor GenomicFeatures and Intel Compilers

At work we run R on an HPC cluster, and have built it with the Intel compiler suite, Math Kernel library etc. to achieve best performance. Following through the great rnaseqGene tutorial workflow for RNA Seq differential expression analysis I hit a roadblock.

The GenomicFeatures package makes use of S4Vectors. In S4Vectors there's a method to produce an SVN format timestamp, used to mark the creation time of TranscriptDB objects by GenomicFeatures. When built with an Intel compiler the code will just return -1, and you'll never successfully get the TranscriptDB you need.

Eventually I found some notes on this. The return of -1 instead of a valid timestamp is due to the Intel compiler not supporting the timezone functionality in the code:

https://support.bioconductor.org/p/54740/

Thinking about this, a simple workaround is to avoid the timezone code and return a UTC timestamp. This isn't friendly, but it is technically correct and allows things to proceed, rather than failing with -1.

The following change in str-utils.c of the S4Vectors package will do the trick and let you use TranscriptDB objects despite the Intel compiler:

static int get_svn_time(time_t t, char *out, size_t out_size)
{
	struct tm result;
	int utc_offset, n;

	static const char
	  *wday2str[] = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"},
	  *mon2str[] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
			"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"},
	  *svn_format = "%d-%02d-%02d %02d:%02d:%02d %+03d00 (%s, %02d %s %d)";
#if defined(__INTEL_COMPILER)
	result = *gmtime(&t);
	utc_offset = 0;
#else /* defined(__INTEL_COMPILER) */
	//localtime_r() not available on Windows+MinGW
	//localtime_r(&t, &result);
	result = *localtime(&t);
#if defined(__APPLE__) || defined(__FreeBSD__)
	//'struct tm' has no member named 'tm_gmtoff' on Windows+MinGW
	utc_offset = result.tm_gmtoff / 3600;
#else /* defined(__APPLE__) || defined(__FreeBSD__) */
	tzset();
	//timezone is not portable (is a function, not a long, on OS X Tiger)
	utc_offset = - (timezone / 3600);
	if (result.tm_isdst > 0)
		utc_offset++;
#endif /* defined(__APPLE__) || defined(__FreeBSD__) */
#endif /* defined(__INTEL_COMPILER) */
...
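To see what the UTC fallback produces, here is a Python rendering of the same format string with the offset fixed at zero. The order of the arguments is my assumption from the format string, since the rest of the C function is elided above. The function name is made up for this illustration.

```python
import time

# Python's tm_wday starts at Monday (C's starts at Sunday), hence this order.
WDAY = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MON = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
       "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
SVN_FORMAT = "%d-%02d-%02d %02d:%02d:%02d %+03d00 (%s, %02d %s %d)"


def svn_time_utc(t):
    """Format a Unix timestamp as an SVN-style time string in UTC."""
    tm = time.gmtime(t)
    utc_offset = 0  # the workaround: always report UTC
    return SVN_FORMAT % (
        tm.tm_year, tm.tm_mon, tm.tm_mday,
        tm.tm_hour, tm.tm_min, tm.tm_sec,
        utc_offset,
        WDAY[tm.tm_wday], tm.tm_mday, MON[tm.tm_mon - 1], tm.tm_year,
    )


if __name__ == "__main__":
    print(svn_time_utc(0))  # 1970-01-01 00:00:00 +0000 (Thu, 01 Jan 1970)
```

The timestamp is still a valid point in time; it is just always reported with a +0000 offset instead of the local one.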

HowTo - Fedora 21 on MacBook Air 6,2

Update 2015-02-09: High CPU in gnome-shell is caused by the gnome-session background-logo extension. It is easy to disable by turning off 'Background Logo' via https://extensions.gnome.org/local or gnome-tweak-tool.

I've recently been playing with Ubuntu on home VMs and DigitalOcean again after realizing how far ahead of CentOS/Fedora/RHEL it is for the availability of Ansible playbooks etc. However, I've used RHEL/CentOS every day at work for the past 5 years and just prefer its way of doing things to Ubuntu. Tiring of OSX, I enjoy having Fedora on my laptop.

There are some great guides for Fedora 20 on an Air, but I couldn't find anything yet for Fedora 21 workstation. So here are my notes on making it work, working roughly (but with various changes) from a Fedora 20 guide from mattoncloud.org

1. Make Space for Fedora

Starting from an installation of Yosemite using the whole disk and with FileVault enabled, the most foolproof way of getting a functioning dual boot I've found is to ignore guides that mention splitting an encrypted partition, using refit/refind etc. and do the following:

  1. Decrypt the OSX install by disabling FileVault in OSX settings. Wait for the decryption to complete (needs Mac to be plugged in) and reboot.
  2. Using OSX disk utility reduce the size of the OSX partition. On a 256GB Air I made 100GB of free space for Fedora.
  3. Reboot again.
  4. Enable FileVault again, let it completely encrypt the drive, and reboot once more into OSX before installing Fedora.

I've found over the course of trying many distro installs that this gives the cleanest partition layout, and never fails. Note that if you try to install Linux before first re-enabling FileVault you may later find that FileVault complains about an incompatible disk layout, and you are stuck without encryption on OSX.

This doesn't give you refit/refind on bootup to choose OSX / Fedora, but you can hold option to get the Mac's own menu, and set your preferred startup disk in OSX settings.

2. Install Fedora 21 Workstation

Download the Fedora 21 .iso image and write it to a USB stick. Make sure you use a direct image writer (e.g. dd or the methods detailed in the Fedora docs). If you use unetbootin it won't work!

Attach the USB stick to your Mac, reboot while holding down the option key. Choose the Fedora image in the boot menu and proceed through the installer using default options. Automatic partitioning will magically work, install the correct EFI bootloader etc. Wonderful!

Reboot into Fedora. You can login but you won't have WiFi, and your Air will get hot very quickly due to some problems…

3. Stop the gpe66 interrupt storm

On Airs that have been updated to recent OSX you'll find if you check top that kworker uses a huge amount of CPU. This will drain your battery, and heat up your Air very quickly and is due to an interrupt storm on gpe66. To disable the interrupt you can use a systemd script.

Put the following into a file /etc/systemd/system/disable-gpe66.service

[Unit]
Description=Disables interrupt GPE66

[Service]
ExecStart=/usr/bin/bash -c 'echo "disable" > /sys/firmware/acpi/interrupts/gpe66'

[Install]
WantedBy=multi-user.target

Then start and enable on boot:

sudo systemctl daemon-reload
sudo systemctl enable disable-gpe66.service
sudo systemctl start disable-gpe66.service
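To confirm that the storm has stopped, the interrupt counter can be read from the same sysfs file. This sketch assumes the file holds a count followed by one or more status words (for example a number and then "enabled" or "disabled"); that layout is my assumption and may differ between kernel versions. The function names are hypothetical.

```python
def parse_gpe_status(text):
    """Split sysfs output such as '  123456   enabled' into (count, flags)."""
    fields = text.split()
    return int(fields[0]), fields[1:]


def read_gpe(name="gpe66", base="/sys/firmware/acpi/interrupts"):
    """Read and parse the current counter for one GPE."""
    with open("%s/%s" % (base, name)) as handle:
        return parse_gpe_status(handle.read())
```

Calling read_gpe() twice a few seconds apart should show the count no longer increasing once the service has run.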

4. Get WiFi Working

The Air uses a WiFi chipset that isn't supported by the drivers that come with the Kernel / Fedora. The non-free 'wl' driver is available from rpmfusion.

Connect to the net using a USB Ethernet dongle (hopefully you have one!). Update your Fedora install so that you are using the latest kernel, and the kernel / kernel-devel package versions will match etc. Then, reboot into the latest kernel.

sudo yum update
sudo reboot

Now enable the rpmfusion repo by following the instructions for Fedora 21 at:

http://rpmfusion.org/Configuration

Once this is done you can install the 'akmod-wl' driver that will rebuild automatically if your kernel version changes etc.:

sudo yum install kernel-devel akmods akmod-wl
sudo akmods

Once this is finished you should be able to unplug from ethernet and connect by WiFi. Hooray!

5. Fix Backlight on Suspend/Resume

The standard mechanism which sets backlight brightness works fine on first boot, but after a suspend/resume it will fail to work correctly, so that the backlight can only be fully on or off. patjak on Github has created a module that's easy to install to fix this issue:

git clone https://github.com/patjak/mba6x_bl
cd mba6x_bl
make
sudo make install
sudo depmod -a
sudo modprobe mba6x_bl

Log out and back in and the backlight should work properly.

6. Wakeup / Keyboard Mapping Fixes from matthicksj/mba-fixes

In the mattoncloud article for Fedora 20 on the Air 6,2 the author provides a package to fix:

  • unwanted wakeups from suspend
  • missing mapping for the ~ key
  • SSD i/o errors

I don't see the SSD i/o errors anymore on F21, but the unwanted wakeups and missing ~ mapping are still there. Luckily the mba-fixes package works fine on F21 to fix these issues, so you can install mba-fixes and these problems will disappear after a reboot.

sudo rpm -Uvh https://files-oncloud.rhcloud.com/yum/RPMS/x86_64/oncloud-repo-0.4-1.fc20.x86_64.rpm
sudo yum install mba-fixes

7. Ditch Gnome for Cinnamon <optional>

UPDATE - High gnome-shell CPU is caused by gnome-shell-extension-background-logo. Turn off the 'Background Logo' extension by browsing in Firefox to https://extensions.gnome.org/local or using gnome-tweak-tool (yum install gnome-tweak-tool). No need to ditch GNOME to get good battery life! See: https://bugzilla.redhat.com/show_bug.cgi?id=1177683

On Fedora 21 I've found that gnome-session constantly uses 40-50% CPU - something that didn't happen in Fedora 20. This CPU usage will cut battery life quite a bit. I couldn't find any fix for this and need to file a bug report. However, after switching to Cinnamon for the desktop I don't have any issue. Since I use RHEL6 with GNOME 2.x all day at work the 'Classic' layout in Cinnamon is more familiar than the standard gnome-shell anyhow.

sudo yum group install cinnamon

Log out. Choose the 'Cinnamon' session via the little gear icon setting button when logging back in.

8. Install thermald

Intel provides a Linux daemon for thermal management that works with many features of newer Intel chips. In other distros I've noticed an improvement in battery life when using thermald, perhaps due to CPU governor settings. I haven't checked this thoroughly in F21, but I'm installing it anyway from the hadrons123/thermald copr:

sudo dnf copr enable hadrons123/thermald
sudo yum install thermal-daemon
sudo systemctl enable thermald
sudo systemctl start thermald

I'm just leaving it with the default config. You can configure a lot via thermald to have a cooler Mac that uses its fan more, or aggressively throttles the CPU etc. if you want to.

9. powertop and tuned power saving tweaks

At this point everything (except the iSight) should be working nicely, and battery life is fairly good. A little bit more battery life can be gained by using powertop to find some tuning options and tuned to ensure these are applied.

Install powertop and calibrate

sudo yum install powertop
sudo powertop --calibrate

Let powertop calibrate without touching anything. It takes a couple of minutes. At this point with backlight at 50% brightness and an idle Cinnamon desktop I see a draw of about 5.3 watts. By applying all of the recommended tunables within powertop I get down to a draw of 4.8 watts. This is 9% lower, which definitely seems worth having.

Powertop tuning doesn't persist over restarts. One method of making it persist is to use tuned and set up a tuned profile that applies powertop's recommended tuning. Unfortunately this isn't working at the moment. It seems that the 'powertop2tuned' command isn't compatible with the version of powertop on F21.

As a temporary workaround until this is fixed I just call powertop with the --auto-tune option directly from the root user's crontab at reboot. This does mean that if any of powertop's tuning options cause problems there's no way of turning some off, but I've had no issues so far.

sudo crontab -e

Add the line

@reboot /usr/sbin/powertop --auto-tune > /dev/null 2>&1

All Done!

After these steps I'm happy with the way Fedora 21 is working on my Air. Everything is shiny, fast, and the battery life is great. Not too much effort overall - far better than when I attempted this back in the day on an original Air with much older distros.

Make linuxbrew work on Fedora 20

When installed, linuxbrew looks for versioned gcc binaries/symlinks e.g. gcc-4.8. Fedora 20 comes with gcc 4.8 which is new enough for linuxbrew, but there is no versioned symlink. If you try to 'brew install' it will complain about there being no suitable compiler on the system. To fix:

sudo ln -s /usr/bin/gcc /usr/bin/gcc-4.8

Linuxbrew is great on older systems like RHEL 6.x which I use in work. It's indispensable there to get up-to-date software installed to a home directory quickly and easily. It's less necessary on more up-to-date systems with newish packages via yum or apt-get, but having it on Fedora 20 means I can do the same thing in work, on my Mac, or on my Linux machine.