Install Domoticz and Razberry2 on Raspbian 2017-01-11

I just installed Domoticz with the following setup:

* Razberry2
* Raspberry Pi 3
* Raspbian Jessie, 2017-01-11

There are a couple of things to keep in mind for the Razberry2 to work properly, especially with the later Jessie releases:

* The serial port has to be turned ON
* Console on the serial port has to be turned OFF
* Bluetooth has to be disabled
* hciuart.service can optionally be disabled (to get rid of an error message during boot)

So, the minor gotcha is that when you use “raspi-config” to turn off the serial console, it does not only turn off the console output on the serial port; it also turns off the serial port itself, which is not what we want. That is why most people get a bit confused and fiddle around until they figure out that the “enable_uart=0” entry in /boot/config.txt should be “enable_uart=1”, without ever asking why it was set that way in the first place.

The “console output” to serial is configured in /boot/cmdline.txt with the entry “console=serial0,115200”, which we need to get rid of, but still make sure that there is no “enable_uart=0” in /boot/config.txt.
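If the system is already up and running, the same changes can be made in place (a minimal sketch; run it on the Pi itself and reboot afterwards):

[code]
# Drop the serial console from the kernel command line
sudo sed -i 's/console=serial0,115200 //' /boot/cmdline.txt
# Make sure the UART itself stays enabled
grep -q '^enable_uart=1' /boot/config.txt || echo 'enable_uart=1' | sudo tee -a /boot/config.txt
[/code]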

Unless you really want to, there is no need to reconfigure the GPU RAM split.

So, a working setup (as of 2017-01-20) is:

* Create an SD card with 2017-01-11-raspbian-jessie.img
* Before you unmount it from your PC, change the following files on the SD card:

/boot/cmdline.txt

[code]
cat /boot/cmdline.txt
dwc_otg.lpm_enable=0 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait
[/code]

/boot/config.txt

[code]
enable_uart=1
dtoverlay=pi3-disable-bt
[/code]

* Boot the raspberry pi
* Disable the hciuart service

[code]
sudo systemctl stop hciuart
sudo systemctl disable hciuart
[/code]

* Ensure you have a /dev/ttyAMA0 file

[code]
ls -la /dev/ttyAMA0
crw-rw---- 1 root dialout 204, 64 Jan 20 08:19 /dev/ttyAMA0
[/code]

* Install Domoticz as described above by kent

[code]
mkdir ~/domoticz
cd ~/domoticz
wget http://releases.domoticz.com/releases/release/domoticz_linux_armv7l.tgz
tar xvfz domoticz_linux_armv7l.tgz
rm domoticz_linux_armv7l.tgz
sudo cp domoticz.sh /etc/init.d
sudo chmod +x /etc/init.d/domoticz.sh
sudo update-rc.d domoticz.sh defaults
sudo service domoticz.sh start
[/code]

* Go to “Setup”->”Hardware”
* Add an OpenZWave USB device with the serial port: /dev/ttyAMA0

Done.

Reverting network interface naming in Ubuntu 16 back to eth0

In Ubuntu 16 the network interface naming has changed, so you won’t have your usual “ethX” naming.

If you, for any reason, would like to revert back to the old behavior, do the following:

  • Add the following line to /etc/default/grub:
GRUB_CMDLINE_LINUX="net.ifnames=0 biosdevname=0"
  • Run “sudo update-grub”
  • Reboot
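To verify after the reboot (a quick sketch):

grep -o 'net.ifnames=0' /proc/cmdline        # the kernel picked up the new parameters
ip -o link show | awk -F': ' '{print $2}'    # interfaces are back to the classic ethX names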

That’s all.

Meta monitoring revisited – EC2 meta monitoring

This post is a revisit of a topic I have already blogged about, Who monitors the monitor?

Meta monitoring: frequently compare an inventory of your infrastructure with your monitoring configuration.

In my terminology I call this meta monitoring, since it is not actively monitoring a business function or the functionality of an infrastructure item. By using meta monitoring I am making myself aware of the completeness of my monitoring. Meta monitoring should give an answer to the question: Is there something I am missing?

Well, as most of you will say: we always miss something. I agree. But with meta monitoring, we aim to limit the unknown to a bare minimum. If you don’t do it, your configuration will be hopelessly out of date within days.


My take on meta monitoring is to make a list of something that could be monitored, filter away known exceptions, then compare it with the monitoring system configuration.

There are plenty of tools on the market that will help you make inventories of more or less every aspect of your infrastructure. They are usually very expensive. And, honestly, to do this yourself is not even hard.

  • Get a list of items from your current infrastructure (be it vCenter or Amazon EC2)
  • Remove items that you know should not be monitored
  • Compare this list with your monitoring system.

In an OP5 environment, you can even do this in a “one-liner”, for example:

root@op5-system:~# echo mysql-v001fry magnus monitor synology03 | sed -e 's/ /\n/g' | grep -wv "$(printf "GET hosts\nColumns:name\n" | unixcat /opt/monitor/var/rw/live)" | xargs -I"{}" echo "Host \"{}\" is not configured in OP5"
Host "magnus" is not configured in OP5
Host "synology03" is not configured in OP5


Now, this one-liner lists all configured hosts in your monitoring environment and uses that list to filter away known (monitored) hosts from the list you echo. It is probably not the most efficient way to do it, but it works. See it as an example of how easy it can be.
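For readability, here is the same logic written out as a small script. The name check_inventory.sh is just my example; the Livestatus socket path is the same OP5 one as above:

#!/bin/bash
# check_inventory.sh - read host names on stdin, report the ones unknown to OP5

# All host names currently configured in the monitoring system (via Livestatus)
configured=$(printf "GET hosts\nColumns:name\n" | unixcat /opt/monitor/var/rw/live)

# Anything on stdin that is not in the configured list is not monitored
while read -r host; do
  echo "$configured" | grep -qw "$host" || echo "Host \"$host\" is not configured in OP5"
done

Feed it one host name per line, for example: echo mysql-v001fry magnus monitor synology03 | tr ' ' '\n' | ./check_inventory.sh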

Gathering the complete (or partial) inventory from your infrastructure is also not that hard. In its simplest form you just copy/paste from your favorite Excel sheet, or you request it from your infrastructure through an API. Amazon EC2 has a very powerful API. Just create a read-only user with access to your environment, and use a simple Ruby script to get the names from EC2. Note that you need to point out which region you would like to list, and optionally add your proxy URI to the script below.

Example:

#!/usr/bin/ruby
%w[ rubygems aws-sdk ].each { |f| require f }

# Read-only credentials; set :region to the region you want to list
aws_api = AWS::EC2.new(:access_key_id => 'YOUR_ACCESS_KEY', :secret_access_key => 'YOUR_SECRET_KEY', :region=>'us-west-2' , :proxy_uri => '')

# Print the value of the "Name" tag of every instance in the region
aws_api.client.describe_instances[:reservation_set].each do | instance |
  instance[:instances_set][0][:tag_set].each do | tag |
    puts tag[:value] if tag[:key] == 'Name'
  end
end

Running this script will give you a list of your instances in Amazon EC2. I called this script “listEC2Instances.minimal.rb” and put it together with my one-liner:

root@op5-system:/opt/kmg/ec2/bin# ./listEC2Instances.minimal.rb
vpn-v001ec2
kmg-test002ec2

root@op5-system:/opt/kmg/ec2/bin# ./listEC2Instances.minimal.rb | sed -e 's/ /\n/g' | grep -wv "$(printf "GET hosts\nColumns:name\n" | unixcat /opt/monitor/var/rw/live)" | xargs -I"{}" echo "Host \"{}\" is not configured in OP5"
Host "vpn-v001ec2" is not configured in OP5


Now you know which hosts in your Amazon cloud are not monitored. Do something about it! =)


One-liner to kill an Oracle expdp job

This is a very obscure one-liner to kill a running Oracle expdp job.

Background:

  • expdp/impdp are the Oracle Data Pump tools for exporting and importing data
  • Killing the expdp process is not enough to stop an export job
  • To kill an export, you have to attach to it with “expdp attach=JOB_NAME” and issue a “KILL_JOB” command

Prerequisites for my one-liner:

  • You have a log file in the current directory called exp_something.log (by using LOGFILE=exp_something.log in your parameter file)

Here comes the one-liner which works in ksh:

expdp attach=$(grep Starting $(ls -tr exp*.log | tail -1) | cut -d":" -f 1 | cut -d"." -f 2 | sed -e 's/"//g') < $(printf "/ as sysdba\nKILL_JOB\nyes\n" > /tmp/someFile; echo /tmp/someFile)

That’s it! If the expdp job exits with an exit code > 0 (echo $?), it failed; just run the one-liner again. The output of the one-liner will hang for some seconds at the question “are you really really sure: [yes]/no:”, which is normal. Just wait.
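If the nested command substitutions make your eyes bleed, the same thing can be written out in a few lines (a sketch; it should behave like the one-liner above):

#!/bin/ksh
# Find the job name in the newest exp*.log; the log contains a line like:
#   Starting "SYS"."SYS_EXPORT_SCHEMA_01": ...
logfile=$(ls -tr exp*.log | tail -1)
job=$(grep Starting "$logfile" | cut -d":" -f 1 | cut -d"." -f 2 | sed -e 's/"//g')

# Answer the login prompt, then issue KILL_JOB and the "yes" confirmation
printf "/ as sysdba\nKILL_JOB\nyes\n" | expdp attach="$job"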

Over and out!

Hotplugging more than 15 SCSI devices in Ubuntu

Hi all,

Today I ran into something that took me a bit to figure out. I could not add new disks to a virtual machine running Ubuntu.

The basic scenario:

  • VMWare ESXi 5.1
  • Ubuntu 12.04.3 LTS
  • 15 virtual hard drives already configured

I had to add more space to a filesystem without rebooting the server, which normally is very simple. This is what I normally do (a concrete example follows right after the list):

  1. Add a virtual disk in vSphere Client
  2. “rescan-scsi-bus -w -c” on the guest system (Ubuntu)
  3. fdisk -> create a partition and set the partition type to 8e (Linux LVM)
  4. pvcreate /dev/sdX1
  5. vgextend vgName /dev/sdX1
  6. lvextend -L +100G /dev/vgName/lvName
  7. sudo fsadm -v resize /dev/vgName/lvName
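For concreteness, here is what a full run can look like (a sketch; the device and volume group names are made up):

sudo rescan-scsi-bus -w -c                 # make the guest see the new disk, say /dev/sdq
sudo fdisk /dev/sdq                        # n (new partition), t (type 8e), w (write)
sudo pvcreate /dev/sdq1                    # turn the partition into an LVM physical volume
sudo vgextend vgData /dev/sdq1             # add it to the volume group
sudo lvextend -L +100G /dev/vgData/lvData  # grow the logical volume
sudo fsadm -v resize /dev/vgData/lvData    # grow the filesystem on top of it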

I tried to do this, but no matter what, I just could not get my Ubuntu box to see the new disks (I added a few).

Now over to the solution: After quite some research, I figured out what I had to do, but first some theory.

VMWare: When you add your 16th SCSI device, VMWare will not only add the disk you ask it to add, but also add a new SCSI controller, and attach the new disk to that controller. This is because a single SCSI bus can only hold 16 devices, one of which is the controller itself. The new disk will be “SCSI (1:0)”. If you want to test this, add a new disk to your VM and, in the last section of the wizard, assign it to (1:0). Before you apply this to your VM, you will see that you are not only adding a new disk; you are also adding a new SCSI controller.

Ubuntu: If you just run rescan-scsi-bus on your Ubuntu system, it will happily do so, but it will not be able to see your new disk, since it does not know about your new SCSI controller yet. You will notice that, since the adapters are listed at the beginning of the output:

maglub@nfs-v001alt:~$ sudo rescan-scsi-bus -c -w
/sbin/rescan-scsi-bus: line 592: [: 1.03: integer expression expected
Host adapter 0 (ata_piix) found.
Host adapter 1 (ata_piix) found.
Host adapter 2 (mptspi) found.
...

So, the million dollar question is: How do you add this adapter without rebooting?

First, check the PCI bus, just to see that you don’t have the new scsi controller listed:

1
2
3
4
maglub@nfs-v001alt:~$ lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
...
00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)

No trace of the new controller. This is because you need to rescan the PCI bus as well. To do this, run the following (as root):

echo "1" > /sys/bus/pci/rescan

If you check your PCI bus now, you will see the new scsi-controller:

root@nfs-v001alt:~# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
...
00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
...
02:02.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
02:03.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)

This will also add your new disks, but if you are curious, you can scan your scsi bus for new disks to see what happens:

root@nfs-v001alt:~# rescan-scsi-bus -w -c
/sbin/rescan-scsi-bus: line 592: [: 1.03: integer expression expected
Host adapter 0 (ata_piix) found.
Host adapter 1 (ata_piix) found.
Host adapter 2 (mptspi) found.
Host adapter 3 (mptspi) found.
...

The rescan-scsi-bus command can see your new scsi adapter! Voila!
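To sum up, the whole trick when the new disk lands on a brand new controller is just two commands (as root):

echo "1" > /sys/bus/pci/rescan    # detect the new SCSI controller on the PCI bus
rescan-scsi-bus -w -c             # then scan the new controller for its disks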

Alias is silver, function() is gold!

Everyone who has reached the first rank in any meaningful martial art knows that in order to get that first level you need to learn some very basic, yet useful techniques.

Being a unix admin is the same. The more tools you know, and the more you know how to put them together, the fancier the belt to keep your trousers up will be.

I sometimes claim to be a black belt unix admin; or at least I was, once upon a time. Some people might disagree, but after almost 20 years of hacking around I think I know my way around. I have Niklas to thank for my yellow belt, since he lent me a Sun SPARCstation 2 so that I could cram in the basics of Solaris after accepting a job offer. It helped, and I practiced a lot afterwards to get better.

Now over to this post.

Everyone who claims to know anything about UNIX/Linux/OSX knows about alias. It is very useful to simplify commands you use a lot. I often use it to manage ssh connections for customers.

E.g., in my .bash_profile, I have an alias for each customer I am working for at the moment:

alias set_custA=". ~/.aliases.custA"

alias set_custB=". ~/.aliases.custB"

alias set_kmg=". ~/.aliases.kmggroup"

Every time I start a new terminal window, if I would like to work for e.g. my own company, I type set_kmg, which sources the alias file ~/.aliases.kmggroup. That file could look like this:

alias op5-v001fry="ssh malu@192.168.2.34"

alias guran="ssh malu@192.168.2.37"

This is decently ok, but there is one drawback: aliases do not accept any parameters. E.g., I could not do a:

guran ls -la

At least not if I expect it to resolve to “ssh malu@192.168.2.37 ls -la”. And I often have a wish to do so, or to do something more complex. But this will serve as a good example.

So, how do I solve this, then? Enter: function()

function() beats alias in any cage match there ever will be. Bash has a wonderful implementation of functions which you can use in all its glory. I will use guran again to show how to implement this. Another good example I have is for one customer where I sometimes want to do the same thing on multiple servers, as root. But let’s start with guran. I would put this in my ~/.aliases.kmggroup file next to the aliases.

function guran(){
  ssh malu@192.168.2.37 "$@"
}

Voilà!

Again, if I would like to use my example of the multiple hosts:

function sshAll(){
  hosts="100 101 102 103 104"

  for host in $hosts
  do
    echo "Host: $host" 1>&2
    ssh -t malu@10.0.0.$host "$@"
  done
}

Using this function I could, for example, edit /etc/hosts on these five hosts by issuing “sshAll sudo vi /etc/hosts“.

That’s it, for today!


Edit: Fredrik Roubert just mentioned that I should change the functions from $* to $@ (“ssh xxx xxx $@”) to better handle parameters with spaces. Thanks!

Edit2: Fredrik came back to me, pointing out that there is a difference between $@ and “$@”, which there of course is. He was not sure about the implications in bash, since he is a zsh guy, but I made this quick hack as an example:


malu@KMG-Hotspot.local:/Users/malu/test $cat functions
function mother(){
echo Parameter 1: $1
echo Parameter 2: $2
echo Parameter 3: $3
echo Parameter 4: $4
}
function function1(){
mother $@
}

function function2(){
mother "$@"
}
malu@KMG-Hotspot.local:/Users/malu $function1 "parameter 1" "parameter 2"
Parameter 1: parameter
Parameter 2: 1
Parameter 3: parameter
Parameter 4: 2
malu@KMG-Hotspot.local:/Users/malu $function2 "parameter 1" "parameter 2"
Parameter 1: parameter 1
Parameter 2: parameter 2
Parameter 3:
Parameter 4:


Could not chdir to home directory

I was at a customer’s site the other day, and ran into an issue that I could not really understand.

When logging in on my Linux box, a server I was setting up for a small application, the first thing I got on my terminal was the following error message:

Could not chdir to home directory /app/prd/kmggroup: Permission denied

The background is that the application I am setting up has its home directory in a non-standard location. Let us call the user kmggroup, just for kicks, and say that the home directory is /app/prd/kmggroup. Logging into this user directly using a password should be banned anyway, as it is an anonymous user owning an application. I will write about my preferred way of logging in as anonymous users (e.g. oracle, apache, kmgapp, whatever) in a different post.

At this point, my user “landed” in “/”, but it was still possible to do a “cd /app/prd/kmggroup” to go to that directory. Very annoying, though.

It took me a little while to figure out, as I had just ordered a virtual machine with no preference of flavor. I got a RedHat server, and for me there is not much to say about that.

kmggroup@server.org:/usr/local/samba/etc $cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.0 (Santiago)

I debugged my .bash_profile, the /etc/passwd file and /etc/profile, tweaking them a bit (adding “echo bappen” to the startup scripts here and there). I realized that the error message appeared even before /etc/profile was run, so the chdir must fail in the login process itself, and I got a bit curious.

After searching the topic on the world wide information network, also known as the Internet, I slowly realized that this has to do with the SELinux, namely the context settings for the directories.

SELinux is dreaded by the un-initiated, and there are few admins out there who really know how to set it up and live with it properly. (I am mainly one of the un-initiated.)

Enough said about that. Here is my solution to the problem, without messing up someone else’s system.

My directories were set up like this:

kmggroup@server.org:/app/prd/kmggroup $ls --context -d /app /app/prd /app/prd/kmggroup
drwxr-xr-x. root root unconfined_u:object_r:default_t:s0 /app
drwxr-xr-x. kmggroup kmggroup unconfined_u:object_r:default_t:s0 /app/prd
drwxr-xr-x. kmggroup kmggroup unconfined_u:object_r:default_t:s0 /app/prd/kmggroup

Normally, /home is set to the following context:

kmggroup@server.org:/app/prd/kmggroup $ls --context -d /home /home/*
drwxr-xr-x. root root system_u:object_r:home_root_t:s0 /home
drwx------. apa apa unconfined_u:object_r:user_home_dir_t:s0 /home/apa

My “/app/prd/kmggroup” directory is “special”, as we set it up in a non-default location, where the context was not set yet.

So, a couple of chcon later, the problem was solved:

sudo chcon -t home_root_t /app
sudo chcon -t home_root_t /app/prd
sudo chcon -t user_home_dir_t /app/prd/kmggroup

kmggroup@server.org:/app/prd/kmggroup $ls --context -d /app /app/prd /app/prd/kmggroup
drwxr-x---. kmggroup kmggroup unconfined_u:object_r:home_root_t:s0 /app
drwxr-x---. kmggroup kmggroup unconfined_u:object_r:home_root_t:s0 /app/prd
drwx------. kmggroup kmggroup unconfined_u:object_r:user_home_dir_t:s0 /app/prd/kmggroup
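A side note: chcon changes do not survive a full filesystem relabel. If SELinux is actually in use on the box, the persistent way is to record the contexts in the policy and let restorecon apply them (a sketch; semanage comes from the policycoreutils-python package on RHEL 6):

sudo semanage fcontext -a -t home_root_t "/app(/prd)?"
sudo semanage fcontext -a -t user_home_dir_t "/app/prd/kmggroup"
sudo restorecon -Rv /app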

The error message no longer appears, and my user ends up in his home directory. After telling the sysadmin at the site, he told me that they are not using SELinux (for good reasons in their environment); he had just forgotten to turn it off before giving me the box.

We both had a good laugh about it.

Have a nice day!
//magnus

Nagios and OP5 – writing a nrpe check script

Long time no see…

One of my main interests in working with production systems is to be able to sleep well at night. A very important component in making sure I can is to know when things go bad, which they will, sooner or later. It is just part of life. Just like any car or mechanical thing, a computer system will eventually have a hiccup. It is better to know yourself when and what went wrong, than having a customer call you and tell you that something in your shop is broken.

Be proactive, not reactive.

In my world, where a small shop has a minimum of a handful of servers, and a large shop has hundreds or perhaps thousands of servers and services, there is no way one can know for sure whether something is working or broken. A single server, no matter which brand/make/OS, has more than one service running, and everything running can break. So, unless you are willing to constantly log in to each and every system, you need to automate the monitoring of your stuff. For decades there have been monitoring systems around, ranging from very cheap to very expensive.

Short story: You can implement quite a mature and powerful monitoring even with a very small budget. Even large corporations are looking into cost effective solutions.

Today, I checked out the OP5 Monitor, which is a commercial but very attractive extension of Nagios. It has many bells and whistles which are not part of the standard issue, mainly when it comes to reporting and configuration. It still took me a couple of hours to set it up the way I wanted. But man, the configuration is a walk in the park in comparison. After the first hit, there is almost no way back to plain vanilla Nagios.

I have used Nagios quite a lot in the past, but it is ugly (eh, the GUI honestly looks like crap, but it for sure fulfills its purpose) and there is a horde of config files to keep track of.

Well, being an old school Nagios hacker, I already know the basic concepts. Perhaps the ease of configuration of the OP5 Monitor software comes easier to me than to many others, but I will put that aside. Here, I will just give you a quick glance at how easy it is to extend the Nagios NRPE (Nagios Remote Plugin Executor), so that the monitoring server (Nagios or OP5) can execute remote scripts on a host without having to deal with weird home grown ssh scripts and keys.

First, I have to give you a short introduction to how Nagios checks a service. It is simple, really simple.

If you want to write your own check-script, you need to know what you want to check. A good example is to look for the presence of a file, e.g /tmp/foo.bar. Let us say, that your whole corporation is depending on knowing whether this file exists. A simple way to check this, is to write a script.

#!/bin/ksh
[ ! -f /tmp/foo.bar ] && echo "The file does not exist"

This will just echo a warning if the file does not exist.

If you would like for Nagios to understand this, you need to tell it just a little more; a return code.

  • 0 – All is fine, just go on as before
  • 1 – Warn that something is not really ok
  • 2 – Critical – this is bad, call for the fire brigade
  • 3 – Unknown – the check itself could not determine the state

So, to extend this script into a fully fledged Nagios module, you just need to send back the correct return code:

#!/bin/ksh

if [ ! -f /tmp/foo.bar ]
then
  msg="CRITICAL - The file does not exist"
  rc=2
else
  msg="OK - The file is here!"
  rc=0
fi

echo $msg
exit $rc

It is as simple as that (plus that you have to go through the tedious job of configuring the _checkcommands.cfg_ file and your Nagios services). With this you have a simple Nagios module.
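For reference, the server-side definitions could look something like this (a sketch; the file names, the template name and the host name are examples, not gospel):

# checkcommands.cfg - how the plugin is invoked
define command{
    command_name    check_myfile
    command_line    /opt/plugins/check_myfile
}

# services.cfg - tie the command to a host
define service{
    use                     default-service
    host_name               myhost
    service_description     Presence of /tmp/foo.bar
    check_command           check_myfile
}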

To make this an NRPE module, which is remotely executed by the Nagios or OP5 server on the server of choice, you just have to put this script somewhere on your monitored server, e.g. in /opt/plugins/check_myfile, and set up the NRPE configuration.

remote host $> sudo chmod 755 /opt/plugins/check_myfile
remote host $> grep check_myfile /etc/nrpe.d/my_config.cfg

command[myfile]=/opt/plugins/check_myfile

remote host $> sudo /etc/init.d/nrpe restart

On the Nagios server, check that your script works (my remote host has the IP address 192.168.2.90):

OP5 $> /opt/plugins/check_nrpe -H 192.168.2.90 -c myfile

CRITICAL - The file does not exist

remote_host $> touch /tmp/foo.bar

OP5 $> /opt/plugins/check_nrpe -H 192.168.2.90 -c myfile

OK - The file is here!

That is basically it! Now, go ahead and configure a new _nrpe_ service for a host in your OP5 environment, put the word “myfile” in the “check_command_args” field, and you are done. Two minutes of work, and you save yourself tons of headache.

DEBUG: The script _has to_ send at least something to stdout; it doesn’t really matter what. Otherwise you will get an error message from the server-side _check_nrpe_ script:

remote host $> grep echo
#  echo $msg
OP5 $> /opt/plugins/check_nrpe -H 192.168.2.90 -c myfile
NRPE: Unable to read output