2015-01-06

Setting up a Cassandra Cluster in Virtual Machines

Intro

From time to time, a single Cassandra instance on your machine is not enough because you want to test behavior that only shows up when a Cassandra cluster is up and running. Spare hardware or paid processing time on Amazon is not always an option, so it's a good idea to set up a simple cluster on your own machine with the instances running in virtual machines. This post shows how to do it with VirtualBox.

Getting VirtualBox Images

I chose VirtualBox because there are a lot of free virtual machine images available for it. Most of the time you'll be installing Cassandra on a Linux machine, and I decided to go with CentOS. Head over to http://virtualboxes.org/images/centos/ and download CentOS-6.6-x86_64-minimal. The default settings are fine for every machine. Create a couple of them and give them names that let you tell them apart (Node1, Node2, etc.).

Perhaps the best approach is to set up one node first and make copies afterwards. Do not forget to set the network to bridged adapter. The username and password for the virtual machines are probably set to "root/reverse", but check the details listed when downloading the VirtualBox image. To keep it short I'll just continue as the root user. Doing this in production is extremely bad practice.
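
The clone-and-bridge step can also be scripted from the host with VBoxManage. The sketch below is only an illustration: the VM names `Node1` and `Node2` and the host interface `eth0` are placeholders, and `run` and `clone_node` are helper names made up for this example. By default it only prints the commands; set `DRY_RUN=0` to actually execute them, with the source VM powered off.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# clone_node SOURCE_VM NEW_VM HOST_INTERFACE (all three are placeholders)
clone_node() {
    # Copy the source VM and register the copy under the new name.
    run VBoxManage clonevm "$1" --name "$2" --register
    # Attach the clone's first adapter to the host interface in bridged mode.
    run VBoxManage modifyvm "$2" --nic1 bridged --bridgeadapter1 "$3"
}

clone_node Node1 Node2 eth0
```

Running `VBoxManage list bridgedifs` on the host shows which interface names are valid for the last argument.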

Setup networking

When importing the .ova file, VirtualBox is going to ask whether you want to reinitialize the MAC address. Check that option. Networking can be somewhat buggy, so to prevent errors run the following command after logging in to the virtual machine (root/reverse):

        rm /etc/udev/rules.d/70-persistent-net.rules
    
When VirtualBox initializes the networking on the virtual machine, it writes a new MAC address to a file. There seems to be a bug where this MAC address is not transferred from that file to the virtual machine settings. Run the following command and copy the MAC address (the HWADDR value):

        cat /etc/sysconfig/network-scripts/ifcfg-eth0
    
Shut down the machine and set the MAC address under Settings > Network > Advanced > MAC Address.
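
VirtualBox expects the MAC address as 12 hex digits without colons, so the value from the file needs a little cleanup before pasting. A minimal sketch, assuming the file contains a single `HWADDR=` line; `hwaddr_for_virtualbox` is a helper name invented for this example.

```shell
# Print the HWADDR value from an ifcfg file as 12 upper-case hex digits,
# with quotes and colons removed.
hwaddr_for_virtualbox() {
    sed -n 's/^HWADDR=//p' "$1" | tr -d '":' | tr 'abcdef' 'ABCDEF'
}

# Inside the guest, print the cleaned-up address if the file is readable.
cfg=/etc/sysconfig/network-scripts/ifcfg-eth0
if [ -r "$cfg" ]; then
    hwaddr_for_virtualbox "$cfg"
fi
```

With the machine shut down, the same value can be set from the host command line with `VBoxManage modifyvm Node1 --macaddress1 <value>`, where `Node1` is a placeholder VM name.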

Install Java

Just to make things a bit easier we're going to install wget:

        yum install wget
    
Now we are going to install Java:

        $ cd /opt/
        $ wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u72-b14/jdk-7u72-linux-x64.tar.gz"
        $ tar xzf jdk-7u72-linux-x64.tar.gz
        $ rm jdk-7u72-linux-x64.tar.gz

        $ cd /opt/jdk1.7.0_72/

        $ alternatives --install /usr/bin/java java /opt/jdk1.7.0_72/bin/java 2
        $ alternatives --config java

        $ alternatives --install /usr/bin/jar jar /opt/jdk1.7.0_72/bin/jar 2
        $ alternatives --install /usr/bin/javac javac /opt/jdk1.7.0_72/bin/javac 2
        $ alternatives --set jar /opt/jdk1.7.0_72/bin/jar
        $ alternatives --set javac /opt/jdk1.7.0_72/bin/javac

        $ vi /etc/profile.d/java.sh
            export JAVA_HOME=/opt/jdk1.7.0_72
            export JRE_HOME=/opt/jdk1.7.0_72/jre
            export PATH=$PATH:/opt/jdk1.7.0_72/bin:/opt/jdk1.7.0_72/jre/bin
    
Reboot, then check the result with echo $JAVA_HOME.
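
The profile script can also be written without opening an editor. The sketch below produces the same three exports with a here-document; `write_java_profile` is a helper name invented for this example, and the JDK directory is the one used above.

```shell
# write_java_profile TARGET_FILE JDK_DIR
# Writes the JAVA_HOME, JRE_HOME and PATH exports for JDK_DIR into TARGET_FILE.
write_java_profile() {
    cat > "$1" <<EOF
export JAVA_HOME=$2
export JRE_HOME=$2/jre
export PATH=\$PATH:$2/bin:$2/jre/bin
EOF
}

# As root on the node:
#   write_java_profile /etc/profile.d/java.sh /opt/jdk1.7.0_72
# After the reboot, both of these should point at the new JDK:
#   echo "$JAVA_HOME"
#   java -version
```

The `\$PATH` is escaped so that it is expanded at login time rather than when the file is written.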

Install Cassandra

Cassandra is installed and run by the following commands:

        $ cd /opt/
        $ wget http://downloads.datastax.com/community/dsc-cassandra-2.1.2-bin.tar.gz
        $ tar xzf dsc-cassandra-2.1.2-bin.tar.gz
        $ rm dsc-cassandra-2.1.2-bin.tar.gz

        [check ip address with ifconfig]

        $ cd dsc-cassandra-2.1.2/conf

        $ vi cassandra.yaml
            listen_address: ip address of the node
            rpc_address: ip address of the node
            broadcast_address: ip address of the node
            - seeds: ip address of the first node

        $ cd ../bin
        $ ./cassandra
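
Since every copy of the machine needs the same edits with a different address, it can help to script them. This is a rough sketch, not a tested deployment tool: `configure_node` is a made-up helper, the regular expressions assume the layout of the stock cassandra.yaml, and `listen_address` is included on the assumption that the stock file still has it set to localhost. Check the resulting file before starting the node.

```shell
# configure_node YAML_FILE NODE_IP SEED_IP
# Rewrites the address settings in place via a temporary file.
configure_node() {
    sed -e "s/^listen_address:.*/listen_address: $2/" \
        -e "s/^rpc_address:.*/rpc_address: $2/" \
        -e "s/^#* *broadcast_address:.*/broadcast_address: $2/" \
        -e "s/^\( *- seeds:\).*/\1 \"$3\"/" \
        "$1" > "$1.new" && mv "$1.new" "$1"
}

# Example with placeholder addresses, run from the conf directory:
#   configure_node cassandra.yaml 192.168.1.11 192.168.1.10
```

The third expression also uncomments `broadcast_address` if it is commented out, and leaves `broadcast_rpc_address` alone.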
    

Firewall settings

The cluster will not work out of the box because of the firewall settings. To let clients and the other nodes connect, open the following ports (9042 for CQL clients, 7000 and 7001 for inter-node traffic, 7199 for JMX):

        $ iptables -I INPUT -p tcp -m tcp --dport 9042 -j ACCEPT
        $ iptables -I INPUT -p tcp -m tcp --dport 7000 -j ACCEPT
        $ iptables -I INPUT -p tcp -m tcp --dport 7001 -j ACCEPT
        $ iptables -I INPUT -p tcp -m tcp --dport 7199 -j ACCEPT

        $ /etc/init.d/iptables save

        $ service iptables restart
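
After the restart it is worth confirming that all four rules survived. A small sketch, assuming the rules were added exactly as above: it reads `iptables -S INPUT` output on standard input and prints any of the four ports that has no ACCEPT rule. `missing_ports` is a name made up for this example.

```shell
# Print each Cassandra port that has no matching ACCEPT rule in the
# `iptables -S INPUT` listing supplied on standard input.
missing_ports() {
    rules=$(cat)
    for port in 9042 7000 7001 7199; do
        printf '%s\n' "$rules" | grep -q -e "--dport $port -j ACCEPT" || echo "$port"
    done
}

# As root on the node (prints nothing when all four ports are open):
#   iptables -S INPUT | missing_ports
```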
    
Now make copies of this machine and update the cassandra.yaml file on each copy with the IP address of that machine. Also check /var/log/cassandra/system.log to see whether the other nodes are joining.
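
Besides reading the log, `nodetool status` in the Cassandra bin directory lists every node the cluster knows about, with `UN` marking nodes that are up and in the normal state. The sketch below counts those lines; `count_up_nodes` is a helper name invented for this example.

```shell
# Count the nodes reported as Up/Normal ("UN") in `nodetool status`
# output supplied on standard input.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# From the Cassandra bin directory on any node:
#   ./nodetool status | count_up_nodes
```

When the count matches the number of machines you started, every node has joined.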