Tuesday, 18 February 2014

Changing the Default Synagios Port

UPDATE: The port can now be changed when Synagios is installed. These instructions are still useful if you need to change the port after installation.

The Synagios in-built Web server is set to start on port 8888 and nagrestconf, pnp4nagios and nagios all work from this port. If it needs to be changed then currently it can only be changed by using the terminal on the Synology device. How to change the default port follows.

Only use these instructions if you are comfortable using the Synology terminal.

First make a backup using the Nagrestconf web interface. If anything goes wrong Synagios can be uninstalled, then re-installed, and the backup restored.

In the Synology DSM ensure that the terminal is enabled. Open the Control Panel, click Terminal, then ensure that 'Enable SSH service' is checked.

From the workstation, use putty or ssh to log into the diskstation, for example:
ssh root@diskstation
Then, once logged in, set the port you want to use, for example:
Change to the Synagios directory:
cd /volume1/@appstore/Synagios/
Then copy the following block and paste it directly into the terminal:
sed -i 's/\(port =\).*/\1 '$PORT'/' \
sed -i 's/\(SYNAGIOS_PORT=\).*/\1'$PORT'/' \
sed -i 's/\(NameVirtualHost.*:\).*/\1'$PORT'/' \
sed -i 's/\(<VirtualHost .*:\).*/\1'$PORT'>/' \
sed -i 's/^\(Listen \).*/\1'$PORT'/' \
sed -i 's/\(^.*\)[^\/]*/\1:'$PORT'/' \
sed -i 's/^\(adminport="\).*/\1'$PORT'"/' \
This will change the port in all the relevant files. Stop then start Synagios using the Package Manager and it should now be using the new port.
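To get a feel for what substitutions of this kind do before touching a live install, they can be tried on a scratch file first. The following is only a minimal sketch: the scratch file, its three lines of content and the port value 8889 are made up for illustration and are not real Synagios files.

```shell
# Hypothetical dry run: the scratch file and its contents are invented
# for illustration, they are not the real Synagios configuration.
PORT=8889
scratch=$(mktemp)
printf 'Listen 8888\nNameVirtualHost *:8888\n<VirtualHost *:8888>\n' > "$scratch"

# Three of the substitutions from above, run without -i so that the
# scratch file itself is left unchanged and the result is only captured.
result=$(sed -e 's/^\(Listen \).*/\1'"$PORT"'/' \
    -e 's/\(NameVirtualHost.*:\).*/\1'"$PORT"'/' \
    -e 's/\(<VirtualHost .*:\).*/\1'"$PORT"'>/' "$scratch")

printf '%s\n' "$result"
rm -f "$scratch"
```

Each expression keeps the part of the line captured in the brackets and replaces whatever follows it with the new port.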

Thursday, 13 February 2014

Nagrestconf Service Sets

Nagrestconf uses service sets to group services but what are they for and why bother with them?


When first learning to configure Nagios a number of 'Object Tricks' are suggested. These are really great time-saving tips when configuring nagios by hand, editing the text files directly, but as more tricks are learned and used it becomes more difficult to understand the configuration, and more difficult to change it with confidence.

After spending time and effort learning all the tricks it feels natural to organise hosts into hostgroups and then to say, 'whichever host belongs to host group X gets the services assigned to that host group'. Naturally this then extends to having hostgroups named by a role and assigning hosts to many hostgroups. This is great in theory, but in practice it causes many problems and reduces the granularity of changes that an administrator is willing to make. The configuration ends up containing so much redirection, so many ifs and buts, that it looks more like an SQL database than what should be a simple nagios configuration.

Simply put, using host groups to configure services can get messy, difficult to manage, and sometimes it is even dangerous. As the configuration grows it becomes increasingly difficult to make localised changes and mistakes creep in - especially in the form of changes that affect more servers than was expected. Often the answer to this problem is to restrict changes - only allowing changes to a group of servers, which reduces the configurability of the monitoring system.

Additionally, the hostgroups view, in the nagios web interface, is not a logical grouping that would be useful to those supporting the servers, but instead it's a grouping only to facilitate the configuration. Using host groups this way means that the hostgroups view will contain many duplicate entries, especially as the configuration becomes complex, which makes it harder to see the state of the network at a glance. With a quick scroll down the host groups list, for example, a single problem will be shown many times, making things look worse than they are, and, of course, this happens just as someone important is being shown the monitoring system.

Service sets solve part of the overall configuration problem and make many automation tasks straightforward. It's about trying to make the configuration work the way we think, and not the other way around.

I will describe service sets in the next section and, even though it's a small section, you will finish knowing everything there is to know about them.

Service Sets Definition

In nagios, a service is the definition of a monitoring check that nagios will read and execute at regular intervals. This service check can be assigned to a host and will be shown connected to that host in the nagios web interface.

A service set is a named collection of services, and once defined this service set can be assigned to a host when the host is created. Service sets are a nagrestconf feature and are not available in nagios.

More than one service set can be assigned to a host, in which case the services contained in each listed service set will be added to the host. If there are duplicate services in the service sets assigned to a host then the rightmost service set containing the duplicate service will be used.
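The 'rightmost wins' rule can be modelled in a few lines of shell. This is only a sketch of the rule as described above, not how nagrestconf implements it: the function name, the three-column 'set service value' input format and the sample values are all assumptions made for illustration.

```shell
# resolve "SET1 SET2 ...": read hypothetical "set service value" lines on
# stdin and print one "service value" line per service, taking the value
# from the rightmost listed service set that defines the service.
resolve() {
    awk -v sets="$1" '
        BEGIN {
            # rank each service set by its position in the list, left to right
            n = split(sets, order, " ")
            for (i = 1; i <= n; i++) rank[order[i]] = i
        }
        # keep a definition if its set is listed and is at least as far
        # right as the one already kept for this service
        ($1 in rank) && rank[$1] >= best[$2] { best[$2] = rank[$1]; value[$2] = $3 }
        END { for (s in value) print s, value[s] }
    ' | sort
}

# Example: both sets define Load, so the value from aa-db (rightmost) is used
printf 'base-lin Load 5\naa-db Load 9\nbase-lin Disk 80\n' | resolve "base-lin aa-db"
```

Swapping the order to "aa-db base-lin" would make the base-lin value for Load win instead.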


Using service sets allows you to make nagios do exactly what you want and no more.

Using the Bulk Tools plugin allows service sets to be applied to many hosts at once, and for automation it allows a host to be added with only one REST request.

Service sets can be modified and re-applied to one or many hosts using the host edit dialog or by using the bulk tools plugin, and since the list of servers to re-apply the service sets to is chosen by the administrator, there is a much lower risk of making changes that weren't expected. 

Thinking about service sets is natural and simple, and allows host groups to be used for grouping hosts into logical groups with no duplicate entries.

In distributed environments with a central console, or 'single pane of glass', service sets can be copied to slave hosts without any modifications by using the Backup and Restore plugin, and without having to change any names. Nagrestconf deals with name collisions by name mangling so identical configurations can be used throughout an organisation.

The Bulk Tools plugin also allows hosts to be created using a 'csv' file, which can be created using a spreadsheet program. The format of the csv file is short and simple since it expects the service set to be specified rather than all of the individual checks. Service sets can be named by the role they fulfill so creating the csv file for an environment should feel quite natural. As an example, an environment, called 'aa', might contain 2 database servers, 2 apache web servers, and 2 application servers. The csv format is:

Hostname, IP Address, Host Template, Hostgroup, Service Sets

So the csv file might look like:

aa-db1,,host-tmpl-linux,aa-env,base-lin aa-db
aa-db2,,host-tmpl-linux,aa-env,base-lin aa-db
aa-web1,,host-tmpl-linux,aa-env,base-lin aa-apache
aa-web2,,host-tmpl-linux,aa-env,base-lin aa-apache
aa-app1,,host-tmpl-linux,aa-env,base-lin aa-app
aa-app2,,host-tmpl-linux,aa-env,base-lin aa-app

After a successful import all the 'aa' hosts will belong to the 'aa-env' hostgroup, which might be displayed as 'AA Environment' in the nagios web interface. Before the file is imported into nagrestconf the host template, host groups and service sets should already have been defined. If not, then the hosts will not be added since nagrestconf will spot that the configuration is wrong.
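Because the format is so regular, a csv file like the one above could also be generated with a small script rather than a spreadsheet. A minimal sketch follows, assuming the same template, hostgroup and service set naming as the 'aa' example; the function name and the role list are hypothetical.

```shell
# make_env_csv ENV: print Bulk Tools csv lines for a hypothetical
# environment with two db, two web and two app servers.
make_env_csv() {
    envname=$1
    # each entry is host-role:service-set-suffix (an assumed naming scheme)
    for role in db:db web:apache app:app; do
        host_role=${role%%:*}
        set_suffix=${role#*:}
        for n in 1 2; do
            printf '%s-%s%s,,host-tmpl-linux,%s-env,base-lin %s-%s\n' \
                "$envname" "$host_role" "$n" "$envname" "$envname" "$set_suffix"
        done
    done
}

make_env_csv aa
```

Redirecting the output to a file gives something that can be imported, as long as the host template, hostgroup and service sets it names already exist.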

After adding the hosts it is still possible to edit values for any host or service. For example, if the host 'aa-app2' always runs with higher load then the load threshold for that server can be changed, but it would be better to create another service set with this higher load threshold and apply that service set to the host, which is easy to do since service sets can be duplicated (cloned) using the web interface.

The Future

I think service sets are pretty useful and would be excellent for sharing. Imagine an on-line repository of service sets which could be downloaded and used, along with the required plugins. This could save lots of time and headache for many monitoring tasks.

It would be great if when I'm asked to monitor something, say an IIS Web server, I could go to the repository, look for IIS, and find a bunch of ready-made service sets that can be downloaded and modified for the local environment. I want this feature - and it's a planned addition for the future!

Wednesday, 12 February 2014

Enabling Synagios Email Alerts

So, you've just installed Synagios, restored from the backup file to get the default configuration, and now your Diskstation is monitored; but how do you make synagios send email alerts? This is fairly easy to set up but how you do it does depend on your email setup.

Sending to the Synology Email Server

The default configuration is set up to send notification emails to the local server, so if you are using the Synology email server, built into the diskstation, then all that's required is to add your email address to the default ds-admin account in the Contacts tab. The following video shows how.

Sending via your ISP

Many ISPs and most companies block emails coming from known broadband addresses to reduce spam, and many ISPs also block outbound email messages to all addresses apart from their own relay servers for the same reason. It's never a good idea to send emails directly from a standard broadband account - always use the relay server unless you have a business account. Additionally, some ISPs require secure authentication to be able to send emails through their relay server.

So, even if you don't use the Synology email server it's a good idea to set it up to send to your ISP's SMTP relay server. Doing it this way means that setting up Synagios is the same as the above video, so, really simple, and the diskstation's mail server can handle all the difficult bits (authentication, relay servers etc.).

Sending from within a Corporate Network

This is something that will need to be cleared with the IT department at your company. It will probably be the same process as 'Sending via your ISP' above, but you'll need to know which internal email server to point the diskstation's email server at, and there may be other restrictions.

Synology Diskstation Performance Tuning (3)

Synology Diskstation Performance Tuning (2)
Synology Diskstation Performance Tuning (1)

Performance Notifications (Continued)

Previously, in part 2, I sketched out an algorithm for reliably determining when the diskstation is overloaded. It looks like it should work but it needs to be implemented and tested.


The implementation will use the same macro technique used by the stock nagios plugin, check_cluster, but will be implemented as a shell script so that the table rules, shown in part 2 and repeated here, can be implemented in the script.

Metric Status

The macro technique check_cluster uses is pretty clever because it's simple to use and very efficient. It relies on nagios getting all the required service status values directly from its in-memory data structures and passing them to the plugin as an argument. This means that the plugin can be simple and will exit quickly, using few resources.

To write the nagios plugin I'll start by specifying the interface, how it will be run:
check_cluster_table (-S | -H) \
    -t col1[,col2,..,colN] \
    -c NUM \
    -d col1[,col2,..,colN]
The options are similar to check_cluster's options, but not the same. This is what the options will mean:

-S for checking services
-H for checking hosts
-t for one 'BAD' column of the table, given as comma-separated values; repeat the option once for each 'BAD' column
-c for specifying how many statuses in critical state will cause this plugin to show critical
-d for the list of service statuses that nagios supplies through its macros

Using the interface definition above, the command to implement the table, including the macro, would be:
check_cluster_table -S \
    -t 0,1,1 \
    -t 1,0,1 \
    -t 0,0,1 \
    -t 1,1,1 \
    -c 2 \
    -d "$SERVICESTATEID:diskstation:Load$,
        $SERVICESTATEID:diskstation:CPU$,
        $SERVICESTATEID:diskstation:Swap Activity$"
The '-d' option contains the nagios macros. These can only be tested from within nagios in the service check and will expand to three numbers separated by commas, for example "1,0,0", so it should be fairly easy to see how the '-t' options relate to the '-d' option. Refer to the On-demand Macros section in the nagios documentation for a good description of how to use them.
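The core of the idea, comparing the expanded '-d' value against each '-t' column, can be sketched in shell as follows. This is not the published check_cluster_table script, just an illustration of the matching step: the function names are hypothetical, and treating every non-zero state ID as 1 is my assumption about how the statuses map onto the table.

```shell
# normalise STATES: turn nagios state IDs (0=OK, 1=WARNING, 2=CRITICAL,
# 3=UNKNOWN) into a 0/1 vector, for example "2, 0,1" becomes "1,0,1".
# Mapping every non-zero state to 1 is an assumption.
normalise() {
    printf '%s' "$1" | tr -d ' \n\t' | sed 's/[1-9]/1/g'
}

# is_bad STATES COLUMN...: succeed if the normalised states are equal to
# any of the given 'BAD' columns.
is_bad() {
    states=$(normalise "$1")
    shift
    for column in "$@"; do
        [ "$states" = "$column" ] && return 0
    done
    return 1
}

# Example: Load critical, CPU ok, Swap Activity warning matches column 1,0,1
if is_bad "2, 0,1" 0,1,1 1,0,1 0,0,1 1,1,1; then
    echo "BAD"
else
    echo "OK"
fi
```

A real plugin would also need to parse the options, apply the '-c' count and exit with the usual nagios return codes, which this sketch leaves out.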

Next, I wrote the script and it passed all the table tests, so now it's time to copy the script over to synagios and set up the nagios configuration.

Trying it Out

  1. The nagios plugin script needs to be copied to the SyNagios plugin area, and is available for download from nagios exchange.
  2. Using the synagios web interface, nagrestconf:
    1. Add the command in the Commands tab.
    2. Add the check to the diskstation service set in the Service Sets tab.
    3. Re-apply the service set in the Hosts tab.
    4. Apply and Restart.
  3. View the new check in the nagios3 web interface and disable notifications for the clustered service checks.
  4. Adjust thresholds for the clustered checks.
  5. Test.

Copying the plugin.

Until the System Tools plugin is written, the plugin, check_cluster_table, will need to be copied using scp. In DSM 4.3 enable the SSH service in the DSM Control Panel then copy using an scp client like filezilla or cygwin. I use cygwin on Windows so for me the command is:

scp check_cluster_table root@diskstationarm:/volume1/@appstore/Synagios/nagios-chroot/usr/local/nagios/plugins/

Then make sure it's executable:

ssh root@diskstationarm chmod 755 /volume1/@appstore/Synagios/nagios-chroot/usr/local/nagios/plugins/check_cluster_table

Add the command, service-set service, and re-apply the service set to the host.

The following video shows the process involved and the text to paste in is shown just below the video. For Firefox users the video can be paused and resumed using the Toggle animated GIFs plugin.

Screenshot video

Command, 'check_svc_cluster_table':

$USER5$/check_cluster_table -S -c $ARG1$ -d "$ARG2$" $ARG3$

Service set service, 'System Performance':

check_svc_cluster_table!2!$SERVICESTATEID:diskstation:Load$, $SERVICESTATEID:diskstation:CPU$, $SERVICESTATEID:diskstation:Swap Activity$!-t 1,1,1 -t 0,1,1 -t 1,0,1 -t 0,0,1

Stop notifications from the clustered service checks.

Now that the System Performance plugin is checking the values from the service checks, load, cpu and swap activity, notifications for those checks should be disabled. Simply disable the notifications in the Nagios Web interface.

Adjust thresholds for the clustered checks. 

Ensure that the clustered checks each have thresholds set. If they don't then check_cluster_table won't work correctly. In my setup the cpu check had no thresholds set so I appended the options -w 75 -c 90 to the cpu check, so the command ended up as:
check_cpu.sh!-i 4 -w 75 -c 90

Test by making load.

I used many different packages and tools to create load in different ways and email alerts were only sent when there was a real problem. Success!

So, that's it for accurate performance notifications. I expect to do a bit more tweaking of thresholds and it's looking good so far, but there's still Reliable Media Streaming to get back to, which I'll talk about in part 4.

Monday, 10 February 2014

Synology Diskstation Performance Tuning (2)

Synology Diskstation Performance Tuning (1)

Continuing with my ongoing quest for good system performance on my Synology devices, I have more issues to solve: reliable media streaming and performance notifications.

Reliable Media Streaming

I switched from the Plex media server to Synology's Video Station (plus Media Indexer) and the performance was much better. Now, I really like Video Station (and Plex by the way) but Video Station seems to take a long time to pick up new media, and I don't get thumbnail images shown on my LG TV, so, from the shadows, in steps MediaTomb.

I installed MediaTomb, which, so far, has solved both issues, and it's really efficient too, beating Plex and Video Station for performance. As usual, I'm losing lots of features to get better performance but I'm very happy so far: it's fast, efficient, uses little memory, shows thumbnails on my LG TV and in Media House Pro (on Android), and picks up new media instantly - great!

Searching the Net for installing mediatomb on a synology device, I saw some ways to install it directly in the Diskstation using ipkg. I don't like this idea - to me it feels like messing with the device firmware, and it's not something I want to get used to doing. Instead I put mediatomb inside a chroot and ran it from there. If I ever want to delete it I just delete the chroot and then it is completely gone, with no risk of breaking the system or filling up precious system space.

I'll explain how I installed mediatomb on a synology device in another post soon. It's pretty simple.

Performance Notifications

I only want SyNagios to tell me when there's an actual performance issue, not when the system is simply under heavy load. When the system is working hard then load will be high, and this means the diskstation is working correctly, so please Synagios, don't bug me with email notifications!

Getting rid of false positives

To accomplish this, a few existing metrics need to be checked and clustered together to implement the logic in the following table:

Metric Status

My first thought was to use the standard nagios check_cluster plugin, but it's not configurable enough to implement the logic shown in the table. Before looking further into implementing the logic, let's first test it on a performance problem I have encountered recently.

The screenshot below shows where I tried the Plex media server again on the tiny DS112, and it shows that there was a performance issue that I would want to be notified about.

The graphs show that Plex is too heavy for the tiny DS112, so I stopped Plex midway through its scan, built a chroot, later installed mediatomb in the chroot, then did a media scan on the same media using mediatomb.

The load is still high, sitting at around 3-4, and there is plenty of cpu usage, but there is no performance problem. Using the table logic from above I would have been notified about Plex, which is good, but would not have been notified about mediatomb doing its scan, or about the load from building the chroot, which is what I want.

Plex versus Mediatomb for the initial media scan on a Diskstation DS112

Now that I think I have the theory covered, it's time to work out how to do it with synagios.

I'll cover this in Synology Diskstation Performance Tuning (3).

Saturday, 8 February 2014

Synology Diskstation Performance Tuning

I'm lucky enough to own a couple of Synology Diskstation NAS devices. They are excellent devices, but when they're overloaded they don't feel so great, so here's how I reduced the load and increased the performance.

How to Tell When a Diskstation is Overloaded

Strictly speaking the diskstation is overloaded when the load, divided by the number of CPUs plus Hyperthreads, exceeds 1.0. The load indicates the number of processes that are using the cpu or waiting for the cpu to be ready. So a system running at a load of 1.0 is utilising its cpu perfectly: neither the cpu nor the processes are waiting for each other. A load of 1.5 means that processes were waiting for cpu time for 50% of the time that the load average was taken. This waiting time is felt as lag or unresponsiveness.

The load can be subjective, but there's one metric that Synagios takes that really will show a poorly performing box, and that metric is Swap Activity. This is the amount of actual swapping the system is doing, transferring data to or from the swap file, and if swap activity is high then performance will be poor. The following graph shows the swap activity on my DS112 before and after tuning.

Swap Activity - before and after tuning
Using the 'top' program the load average will be shown as three numbers, for example '1.8 1.4 1.0', which are the load averages for the last 1 minute, 5 minutes and 15 minutes, respectively.
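The 'load divided by the number of CPUs' rule is easy to check by hand, but it can also be sketched as a couple of small shell functions. The function names are hypothetical and the load and cpu count are passed in as arguments, so nothing here depends on how a particular device reports them.

```shell
# load_per_cpu LOAD CPUS: print the load divided by the cpu count
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# is_overloaded LOAD CPUS: succeed when the load per cpu exceeds 1.0
is_overloaded() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { exit !(load / cpus > 1.0) }'
}

# Example using the 1 minute figure from the 'top' output above on one cpu
load_per_cpu 1.8 1
if is_overloaded 1.8 1; then
    echo "overloaded"
fi
```

On a Linux based system the 1 minute figure can usually be read from the first field of /proc/loadavg instead of being typed in.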

The current load on a Diskstation can be seen by enabling the terminal, logging in, then typing 'top', but it is useful to be able to see historical graphs to tune system performance. For this, Synagios can be installed, which will graph this and other statistics. Synagios can be downloaded for free from http://nagrestconf.smorg.co.uk.

Setting up Synagios is quite straightforward and a sample configuration can be loaded which will, among other things, graph the diskstation's load average.

A happy diskstation.
A slightly worried diskstation.

Overloading: Is it bad?

The Synology boxes can run under higher than optimal load quite comfortably, without damaging the system or components, so is it bad? The answer to this question depends on your expectations and the types of packages used.

How long are you willing to wait for media to start playing, for media to be added, to open files over the network, or for pages to load? It's well worth getting the load down to below 1.0 then trying your favourite programs to see if they perform better. If the performance is the same or similar then allow the load to stay high - it's fine to run with high load if the performance is acceptable to you.

Types of Packages
As a general rule of thumb, packages written by Synology are highly performant. For example, by simply changing from Plex to Video station (plus Media Server) your system load will be greatly reduced. However, Plex has a few features that Video station doesn't so it's a trade off. For me, the reduction in performance is too great to warrant running Plex, although I miss it. Now I've switched, the whole system works better: Symform health now stays at 100%, applications are responsive and media playback starts quickly.

If you're only using your Synology NAS device as an, err, NAS device (perish the thought) then you'll probably not need to do any tuning.

Reducing the Load

To find out how much the system load is affecting performance stop as many packages as possible and try to get the load below 1.0, then browse shares, selectively add packages and use user interfaces or network access points to see if the experience is improved. If the increase in performance is only slight then don't bother tuning and allow the load to stay at its previous level. If not then some tips follow below:

Stop all packages that are no longer used - don't leave them running. Unused packages may sit in the swap file, which may not increase the load, but if they poll occasionally or try to do a bit of house-keeping then they will affect performance. It's safer just to stop them or uninstall them.

Tweak packages that poll. Packages that access sites on the Internet to check for things should be tweaked to lower the polling frequency, or to do their polling when you're not around, maybe very early in the morning or while you're at work.

Tweak downloaders. Packages that download from the Internet can really slow things down, especially if you have a fast internet connection. Set them to download when you're not going to be using the diskstation, which will also keep your ISP happy since this will usually be after midnight.

Reduce threads and processes. Some packages have settings that control how many threads, processes or connections they use. Reduce these values and system load will reduce.

Backup when you're away. Synology's Time Backup is great, but doing it hourly for a large directory will incur a bit of a load penalty. Try to organise directories (folders) and shares so that hourly backups can be made for smaller directories, incurring less load, and backup larger directories when you're away.

Move DHCP and DNS. If you're using DNS and DHCP on the Synology box then maybe try moving one or both to your Internet router.

Reducing the Resources Synagios Uses

Synagios itself will use precious resources so some tweaking may be required to reduce the load Synagios puts on the system, especially for devices with smaller memory. Once the system tuning is finished, stop or uninstall Synagios, or leave it on and allow it to email you when things aren't going well.

I won't say exactly how to tune Synagios as files have to be edited manually. Instead, the following is for Linux savvy people, but for non-linux nerds a Synagios plugin is currently being written to make this process easy.

So for the techies among you:
  1. Edit the apache main configuration file and reduce the number of servers that are started and the max number of servers.
  2. Edit the main pnp4nagios configuration file reducing the number of threads and use the option that stops pnp4nagios from processing stats when the load is over a certain value.
The quest for better performance continues in Synology Diskstation Performance Tuning (2).

Monday, 3 February 2014

Synagios ARM Fixed

I picked up a Synology DS112 this evening and found out what was wrong with Synagios ARM. It ended up being a problem with libapr1 on wheezy for ARM, which I think is a kernel version mismatch between the headers in the chroot and the actual kernel version used in the Synology device. Rebuilding libapr1 fixed this and now Synagios is working, as shown above - Great!

The DS112 is a really good looking piece of kit and performs really well, which surprised me, but comes with the slightly older DSM 4.1. It was a real bargain at under £100, and could probably monitor a small office!

Try it out and please let me know if it works on your Synology device.

Saturday, 1 February 2014

Synagios ARM broken

Someone reported that the arm port doesn't work so I ordered a Synology DS112 today, from Currys, for under 100 quid! I went for the cheapest one I could find so if it works on that then it should work on anything right? The log he sent was not very useful either so I created a ticket on Github to improve that. I should get it on Monday evening and I'm really looking forward to having a play with it and getting Synagios arm working. I'll let you know how I get on...