Monday, 28 June 2010

IPTables Lost After Reboot of Head Node

The iptables rules were lost every time the machine was rebooted, so I found I needed to save them to a file, /etc/iptables.rules.
First set up NAT for the compute nodes with the command 'sudo iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE'.
Then, in Ubuntu, become root by using 'sudo su' and save the rules with the command 'iptables-save > /etc/iptables.rules'.
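
Saving the rules does not by itself load them again at boot. One common approach on Ubuntu is a pre-up line in /etc/network/interfaces that restores the saved file before the interface comes up. A minimal sketch, assuming eth1 is brought up from that file and uses DHCP (both are assumptions; adjust the stanza to match the real interface settings):

```text
auto eth1
iface eth1 inet dhcp
    pre-up iptables-restore < /etc/iptables.rules
```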

Wednesday, 16 June 2010

Wake on LAN


I have been working with the wakeonlan command, and it does not work in its plain form, e.g. 'wakeonlan 00:30:48:70:45:8c'.
I have, however, got it to work with 'wakeonlan -i <broadcast address> 00:30:48:70:45:8c',
using the broadcast address of the private network the node is on.
I have tried it a good number of times and it consistently switches on the compute-4 node.
I think the main problem was that the wakeonlan command did not know where to send the magic packet: on the bridge network or on the private network.
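
The broadcast address that 'wakeonlan -i' needs can be worked out from any address on the private network and its netmask. A minimal sketch in shell follows; the helper name broadcast_address and the example address and netmask in the usage note are hypothetical, not the cluster's real values:

```shell
# Print the broadcast address for an IPv4 address ($1) and netmask ($2),
# both given in dotted-quad form.
broadcast_address() {
    old_ifs=$IFS
    IFS=.
    # Split both dotted quads into eight positional parameters:
    # $1-$4 are the address octets, $5-$8 the netmask octets.
    set -- $1 $2
    IFS=$old_ifs
    # Each broadcast octet is the address octet OR the inverted mask octet.
    printf '%d.%d.%d.%d\n' \
        "$(( $1 | (255 - $5) ))" "$(( $2 | (255 - $6) ))" \
        "$(( $3 | (255 - $7) ))" "$(( $4 | (255 - $8) ))"
}
```

The result can then be passed straight to the command, for example 'wakeonlan -i "$(broadcast_address 192.168.1.10 255.255.255.0)" 00:30:48:70:45:8c'.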

Monday, 7 June 2010

Mac OS X Command Line Software Update

Command line Software Update

I often want to do a software update on Mac OS X from the command line. I use the following command.

sudo softwareupdate -i -a


Thursday, 3 June 2010

SSH Connection Problems

Ssh-agent Problem

I attempted to use the command 'ssh-agent' and then the command 'ssh-add' to enter my RSA passphrase, so that I could log in to a host without typing my passphrase every time. The error I got after entering 'ssh-add' was 'Could not open a connection to your authentication agent'.
The solution was to enter the command 'exec ssh-agent bash', then the command 'ssh-add', then the passphrase. It worked after this. I am not sure why, as it used to work before; it could have been an upgrade in Ubuntu 9.10.
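
A likely explanation (not confirmed on the original machine) is that 'ssh-agent' run on its own starts an agent and only prints the SSH_AUTH_SOCK and SSH_AGENT_PID settings, without setting them in the current shell, so 'ssh-add' cannot find the agent. 'exec ssh-agent bash' gets round this by replacing the shell with a new bash that has those variables set. An alternative that keeps the current shell is to eval the agent's output, sketched here:

```shell
# Start an agent and import its environment into the current shell.
# 'ssh-agent -s' prints Bourne-shell commands that set SSH_AUTH_SOCK
# and SSH_AGENT_PID; eval runs them here.
eval "$(ssh-agent -s)" > /dev/null

# ssh-add finds the agent through SSH_AUTH_SOCK, so it should now be set.
echo "agent socket: $SSH_AUTH_SOCK"

# Then load the key (this prompts for the passphrase):
#   ssh-add
```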

Hadoop Cluster Problems

Network Instability

The cluster had intermittent network availability: it would accept ssh connections only some of the time. The connections only seemed to last about 10 minutes before they were disconnected.
It was initially thought that it may have been something to do with the DHCP server, but it turned out that this was not the cause.
One of the symptoms was that, during a ping to the DNS server, one would get a few replies and then a few 'Destination Host Unreachable' errors.
It was eventually traced to having two gateways set up in '/etc/network/interfaces', one on the public network and one on the private network. Correcting this to a single gateway, on the public network, fixed the problem.
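
A quick way to spot this kind of problem is to count the gateway lines in the interfaces file. A minimal sketch; the helper name count_gateways is hypothetical, and it only looks at 'gateway' entries in the one file it is given:

```shell
# Print how many 'gateway' entries an interfaces file contains.
# $1 is an optional path; by default the live file is checked.
count_gateways() {
    grep -c '^[[:space:]]*gateway[[:space:]]' "${1:-/etc/network/interfaces}"
}
```

A result greater than 1 means more than one gateway is configured, which was the situation on this cluster.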

Setting up OpenSSH in Ubuntu


I set up OpenSSH by installing the package ssh using the command 'sudo apt-get -y install ssh'.
I then edited the config file /etc/ssh/sshd_config to include a reference to a banner text file, change the ssh port to 11000 and add a 'UseDNS no' entry, which cures some login delays.

The banner.txt file is displayed when you first log in to the machine remotely; it gives warnings about the acceptable use policy etc. I chose to listen on port 11000, which has prevented a lot of login attempts on my home server.
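
The three changes correspond to the following sshd_config entries. This is a sketch only; the banner path /etc/ssh/banner.txt is an assumption, so use wherever the banner file actually lives:

```text
Port 11000
Banner /etc/ssh/banner.txt
UseDNS no
```

The ssh daemon has to be restarted, for example with 'sudo /etc/init.d/ssh restart', before the changes take effect.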

To test the server I used the command 'scp -P 11000 file.txt jimp@mumetal:/home/jimp' from a remote machine. This transferred file.txt from the remote machine to the home directory on my machine using ssh.


Wednesday, 2 June 2010

Ubuntu Enterprise Cloud

Hadoop Cluster

I had problems with passwordless ssh logins taking ages on the Hadoop Ubuntu cluster I was working on. This was fixed by adding a 'UseDNS no' entry to the /etc/ssh/sshd_config file and restarting the ssh daemon using the command 'sudo /etc/init.d/ssh restart'.

The compute nodes on the cluster could not access the internet, so NAT had to be set up on the Hadoop head node.

NAT was set up using the command 'sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE', where eth0 is the network card with access to the private network. The '/proc/sys/net/ipv4/ip_forward' file entry should be 1.
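
The forwarding setting can be checked, and switched on if needed, from the shell. A minimal sketch; the helper name ip_forward_enabled is hypothetical, and the sysctl commands in the comments are the usual way to change the value rather than something taken from the original setup:

```shell
# Succeed if IPv4 forwarding is switched on.
# $1 is an optional path, so the check can be pointed at a test file;
# by default it reads the live kernel setting.
ip_forward_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}

# To switch forwarding on until the next reboot (needs root):
#   sudo sysctl -w net.ipv4.ip_forward=1
# To keep it across reboots, set net.ipv4.ip_forward=1 in /etc/sysctl.conf.
```

For example, 'ip_forward_enabled && echo "forwarding is on"' prints the message only when the entry is 1.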

Restart dnsmasq using the following command 'sudo /etc/init.d/dnsmasq restart'.

Configure the NAT with the following command 'iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE', where eth1 is the network card with access to the internet.

I tested the cluster by pinging a host on the internet with the command 'ping -c 4'; the result was as follows:

PING ( 56(84) bytes of data.
64 bytes from ( icmp_seq=1 ttl=51 time=18.8 ms
64 bytes from ( icmp_seq=2 ttl=51 time=18.8 ms
64 bytes from ( icmp_seq=3 ttl=51 time=18.9 ms
64 bytes from ( icmp_seq=4 ttl=51 time=18.8 ms

--- ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 18.800/18.840/18.912/0.106 ms

thus proving that the NAT settings had worked.

Information used was from the page for this blog