
CDH4 Hadoop/HBase Installation for Ubuntu 12.04 LTS

Changing the Hostname and /etc/hosts (applies to both manual and automated installation)

Step A: Change the hostname of each machine to a meaningful name. Log in to each node as the ubuntu user (in the case of AWS VPC).

sudo nano /etc/hostname
Delete the existing content and add master.domain on the master machine, slave1.domain on the slave1 node, and slave2.domain on the slave2 node.

Step B: Restart the hostname service

sudo service hostname restart

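Step C: Edit /etc/hosts on every node so that all the hostnames resolve. Each entry starts with the machine's IP address (not shown here), followed by its names. The file might contain something of the form:

localhost.localdomain localhost
master.domain master
slave1.domain slave1
slave2.domain slave2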
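To confirm that every node name really appears in /etc/hosts, a small check along these lines may help. It is only a sketch: the function name check_hosts is made up for this example, and it looks only at the name columns, not at whether the addresses are correct.

```shell
# check_hosts FILE NAME...
# Report, for each NAME, whether some non-comment line of FILE lists it
# as a hostname or alias (any column after the address column).
check_hosts() {
  file=$1
  shift
  missing=0
  for name in "$@"; do
    if awk -v n="$name" '
         { sub(/#.*/, "")                       # drop comments
           for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
         END { exit !found }' "$file"; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return $missing
}

# Example (run on each node):
#   check_hosts /etc/hosts master.domain slave1.domain slave2.domain
```

The function returns a non-zero status if any name is missing, so it can be used in a larger setup script.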

Manual Installation

1) Install Oracle Java
2) Passwordless SSH from master to slaves
3) Hadoop Configuration & HBase Configuration
4) Test the setup

1. Install Oracle Java

1. Download Oracle Java from

Accept the license, download "jdk-6u32-linux-x64.bin", and keep it in the Downloads directory of the Linux machine.

2. Create the installation folder

Command: sudo mkdir -p /usr/lib/jvm

3. Navigate to Downloads Directory

Command: cd ~/Downloads

4. Move the downloaded file to the installation folder

Command: sudo mv jdk-6u32-linux-x64.bin /usr/lib/jvm

5. Navigate to the installation folder

Command: cd /usr/lib/jvm

6. Make the downloaded binary executable

Command: sudo chmod u+x jdk-6u32-linux-x64.bin

7. Extract the compressed binary file

Command: sudo ./jdk-6u32-linux-x64.bin

8. Check your extracted folder names

Command: ls -l
Check: the jdk1.6.0_32 directory is there

9. Inform Ubuntu where your Java installation is located

Command: sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.6.0_32/bin/java" 1

10. Inform Ubuntu that this is your default Java installation

Command: sudo update-alternatives --set java /usr/lib/jvm/jdk1.6.0_32/bin/java

11. Update your system-wide PATH

Command: sudo nano /etc/profile

Add the lines below to the end of /etc/profile.

JAVA_HOME=/usr/lib/jvm/jdk1.6.0_32
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME
export PATH

12. Reload your system-wide PATH

Command: source /etc/profile

13. Test your new installation

Command: java -version

Output: java version "1.6.0_32"
Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
Java HotSpot(TM) Client VM (build 20.7-b02, mixed mode, sharing)

If you see this output, Java is installed properly.
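When the same check has to be repeated on every node, it can be scripted. The sketch below pulls the version string out of the first line of the java -version output; note that java -version writes to standard error, so the stream has to be redirected before parsing. The function name parse_java_version is made up for this example.

```shell
# Read `java -version` output on stdin and print just the version string,
# e.g. 1.6.0_32 for the output shown in step 13.
parse_java_version() {
  sed -n 's/^java version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example:
#   v=$(java -version 2>&1 | parse_java_version)
#   [ "$v" = "1.6.0_32" ] && echo "Java OK" || echo "unexpected Java: $v"
```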

2. Passwordless SSH

All nodes need the same user account, and you must be logged in as that user on each of them.

Step 1: Generate SSH Key

Command: ssh-keygen -t rsa

Step 2: Enable SSH access to your master machine with this newly generated key

Command: cat $HOME/.ssh/ >> $HOME/.ssh/authorized_keys

Step 3: Copy the master machine's public key to each slave machine's authorized keys

Master machine
a) Command: cat $HOME/.ssh/
b) Copy the content

Slave machine
c) Command: nano $HOME/.ssh/authorized_keys
d) Paste the content
e) Press Ctrl+O
f) Press Enter
g) Press Ctrl+X

Step 4: From the master machine, check by issuing

ssh slave1

It should log in without asking for a password. Repeat the check for each slave.
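The manual copy in Step 3 can often be replaced by ssh-copy-id, and the check in Step 4 can be scripted. The sketch below makes several assumptions: the key pair sits in the default location that ssh-keygen -t rsa offers ($HOME/.ssh/id_rsa.pub), the slaves are reachable as slave1 and slave2, and password login still works for the first copy (on many AWS images it does not, in which case the manual paste remains the way to go). The function names are made up for this example.

```shell
# Append the master's public key to authorized_keys on each given host.
# Assumes the default key path; adjust it if ssh-keygen saved the key elsewhere.
copy_key_to() {
  for host in "$@"; do
    ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host" || return 1
  done
}

# Report whether each host accepts key-based login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless() {
  failed=0
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true < /dev/null; then
      echo "ok: $host"
    else
      echo "needs password or unreachable: $host"
      failed=1
    fi
  done
  return $failed
}

# Example (run on the master):
#   copy_key_to slave1 slave2
#   check_passwordless slave1 slave2
```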

3. Hadoop Configuration

Step 1: Create the source directories and the Hadoop filesystem directories, and set the necessary permissions.

sudo mkdir /hadoop

sudo chmod -R 755 /hadoop

sudo chown -R ubuntu:ubuntu /hadoop

sudo mkdir /dfs

sudo chmod -R 755 /dfs

sudo chown -R ubuntu:ubuntu /dfs

Note: Run these commands on all nodes.

Step 2: Navigate to the /hadoop directory and download the Hadoop packages from the site below

a) Download hadoop-2.0.0+922 from
b) Download hbase-0.94.2+202 from
c) Download mr1-2.0.0-mr1-cdh4.2.0
cd /hadoop




Step 3: Extract the packages and rename the directory.

tar zxf hadoop-2.0.0-cdh4.2.0.tar.gz
tar zxf hbase-0.94.2-cdh4.2.0.tar.gz
tar zxf mr1-2.0.0-mr1-cdh4.2.0.tar.gz

mv hadoop-2.0.0-cdh4.2.0 chadoop-2.0.0
mv hbase-0.94.2-cdh4.2.0 chbase-0.94.2
mv mr1-2.0.0-mr1-cdh4.2.0  hadoop-2.0.0-mr1
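Since every later step depends on the renamed directories, it can be worth checking that all three exist before continuing. A minimal sketch follows; the function name check_layout is made up for this example.

```shell
# check_layout [BASE]
# Verify that the three renamed directories exist under BASE (default /hadoop).
check_layout() {
  base=${1:-/hadoop}
  status=0
  for d in chadoop-2.0.0 chbase-0.94.2 hadoop-2.0.0-mr1; do
    if [ -d "$base/$d" ]; then
      echo "found: $base/$d"
    else
      echo "missing: $base/$d"
      status=1
    fi
  done
  return $status
}

# Example:
#   check_layout /hadoop
```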

Step 4: Navigate to the Hadoop configuration directory (under chadoop-2.0.0 after the rename above) and make the necessary configuration changes

cd /hadoop/chadoop-2.0.0/etc/hadoop

a) core-site.xml

nano core-site.xml

and paste the below contents in between the <configuration> </configuration> tags.
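No property block is shown here, so the following is only a sketch of what a minimal core-site.xml entry for this cluster could look like, not the original configuration. It assumes the NameNode runs on master.domain and listens on port 8020 (both are assumptions). fs.defaultFS is the Hadoop 2 property that names the default filesystem. The helper name core_site_properties is made up for this example.

```shell
# Print a <property> block to paste between the <configuration> tags.
# $1: NameNode host (assumed: master.domain)
# $2: NameNode RPC port (assumed: 8020)
core_site_properties() {
  host=${1:-master.domain}
  port=${2:-8020}
  cat <<EOF
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://$host:$port</value>
</property>
EOF
}

# Example:
#   core_site_properties master.domain 8020
```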


b) hdfs-site.xml

nano hdfs-site.xml

and paste the below contents in between the <configuration> </configuration> tags.
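As with core-site.xml, no property block is shown here. The sketch below is a guess at a minimal hdfs-site.xml for this layout, not the original configuration. The replication factor of 2 (one copy per slave) and the subdirectories /dfs/nn and /dfs/dn under the /dfs directory created in Step 1 are all assumptions. The helper name hdfs_site_properties is made up for this example.

```shell
# Print <property> blocks to paste between the <configuration> tags.
# $1: replication factor         (assumed: 2)
# $2: NameNode metadata directory (assumed: /dfs/nn)
# $3: DataNode block directory    (assumed: /dfs/dn)
hdfs_site_properties() {
  replication=${1:-2}
  name_dir=${2:-/dfs/nn}
  data_dir=${3:-/dfs/dn}
  cat <<EOF
<property>
  <name>dfs.replication</name>
  <value>$replication</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>$name_dir</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>$data_dir</value>
</property>
EOF
}

# Example:
#   hdfs_site_properties 2 /dfs/nn /dfs/dn
```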


c) mapred-site.xml

cd /hadoop/hadoop-2.0.0-mr1/conf

nano  mapred-site.xml

and paste the below contents in between the <configuration> </configuration> tags.

    <value> -Xmx419222449</value>


d) hadoop-env.sh

cd /hadoop/chadoop-2.0.0/etc/hadoop

nano hadoop-env.sh

and paste the content below

export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_32


export HADOOP_OPTS="-server  -XX:+UseConcMarkSweepGC $HADOOP_CLIENT_OPTS"

and make the same change in /hadoop/hadoop-2.0.0-mr1/conf/

e) slaves

cd /hadoop/chadoop-2.0.0/etc/hadoop
nano slaves

Add the slave nodes, one per line.

Update the same in /hadoop/hadoop-2.0.0-mr1/conf/slaves
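Instead of editing both slaves files by hand, they can be written from one list. The sketch below assumes the slaves are slave1.domain and slave2.domain as named earlier; the function name write_slaves is made up for this example.

```shell
# write_slaves FILE HOST...
# Write the given hosts to FILE, one per line, replacing its previous content.
write_slaves() {
  file=$1
  shift
  printf '%s\n' "$@" > "$file"
}

# Example:
#   write_slaves /hadoop/chadoop-2.0.0/etc/hadoop/slaves slave1.domain slave2.domain
#   cp /hadoop/chadoop-2.0.0/etc/hadoop/slaves /hadoop/hadoop-2.0.0-mr1/conf/slaves
```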

f) hbase-site.xml

cd /hadoop/chbase-0.94.2/conf

nano hbase-site.xml

and paste the below contents in between the <configuration> </configuration> tags.
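No property block is shown here either. The sketch below shows what a minimal distributed-mode hbase-site.xml could contain; it is not the original configuration. It assumes HDFS is reachable at master.domain:8020 and that ZooKeeper runs on master.domain only (both are assumptions). The helper name hbase_site_properties is made up for this example.

```shell
# Print <property> blocks to paste between the <configuration> tags.
# $1: NameNode host:port       (assumed: master.domain:8020)
# $2: ZooKeeper quorum host(s) (assumed: master.domain)
hbase_site_properties() {
  namenode=${1:-master.domain:8020}
  quorum=${2:-master.domain}
  cat <<EOF
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://$namenode/hbase</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>$quorum</value>
</property>
EOF
}

# Example:
#   hbase_site_properties master.domain:8020 master.domain
```

The host and port in hbase.rootdir have to match whatever fs.defaultFS is set to in core-site.xml.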



g) hbase-env.sh

cd /hadoop/chbase-0.94.2/conf

nano hbase-env.sh

Add the following lines

export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_32
export HBASE_HEAPSIZE=2048
export HBASE_OPTS="-server  $HBASE_OPTS -XX:+UseConcMarkSweepGC"

Step 4: Transfer the configuration files to all the slave nodes

scp -r /hadoop/hadoop-2.0.0-mr1 /hadoop/chadoop-2.0.0 /hadoop/chbas* slave1:/hadoop
Repeat the same for every datanode.
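With more than a couple of datanodes, a loop saves retyping the scp command. The sketch below wraps the same command in a function that takes the slave hostnames as arguments; the function name push_to_slaves is made up for this example, and it relies on the passwordless SSH set up earlier.

```shell
# push_to_slaves HOST...
# Copy the three configured directories to /hadoop on each host,
# stopping at the first failure.
push_to_slaves() {
  for host in "$@"; do
    scp -r /hadoop/hadoop-2.0.0-mr1 /hadoop/chadoop-2.0.0 /hadoop/chbas* \
        "$host":/hadoop || return 1
    echo "copied to $host"
  done
}

# Example:
#   push_to_slaves slave1 slave2
```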

Step 5: Format the namenode

Log in to the namenode machine and format the namenode

/hadoop/chadoop-2.0.0/bin/hdfs namenode -format

Step 6: Start DFS

Log in to the namenode and issue the command


Step 7: Start mapred

Log in to the namenode and issue the command


Wait a few seconds for the namenode to come out of safe mode.
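Rather than guessing how long to wait, the dfsadmin tool can block until safe mode is off. The sketch below uses hdfs dfsadmin -safemode with the wait and get subcommands; the wrapper function names and the HDFS variable are made up for this example.

```shell
# Path to the hdfs launcher; override HDFS if the install lives elsewhere.
HDFS=${HDFS:-/hadoop/chadoop-2.0.0/bin/hdfs}

# Block until the NameNode reports that safe mode is off.
wait_for_safemode_exit() {
  "$HDFS" dfsadmin -safemode wait
}

# Print the current safe mode state without waiting.
safemode_state() {
  "$HDFS" dfsadmin -safemode get
}

# Example:
#   wait_for_safemode_exit && echo "HDFS is ready"
```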

Step 8: Check that Hadoop works by creating a directory, putting some files into it, and getting them back

/hadoop/chadoop-2.0.0/bin/hdfs dfs -mkdir /test
/hadoop/chadoop-2.0.0/bin/hdfs dfs -put /sourcefile /test/ # replace /sourcefile with your own file
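The two commands above cover the mkdir and the put. A sketch of the full round trip, including the get and a byte-for-byte comparison, might look like this. The function name hdfs_roundtrip and the HDFS variable are made up for this example. Note that -put refuses to overwrite, so use a file that is not already in the target directory.

```shell
# Path to the hdfs launcher; override HDFS if the install lives elsewhere.
HDFS=${HDFS:-/hadoop/chadoop-2.0.0/bin/hdfs}

# hdfs_roundtrip LOCAL_FILE [REMOTE_DIR]
# Upload LOCAL_FILE, download it again, and compare the two copies.
hdfs_roundtrip() {
  src=$1
  dir=${2:-/test}
  name=$(basename "$src")
  back=$(mktemp -d)/$name
  # mkdir fails if the directory already exists; that is fine here
  "$HDFS" dfs -mkdir "$dir" >/dev/null 2>&1 || true
  "$HDFS" dfs -put "$src" "$dir/" || return 1
  "$HDFS" dfs -get "$dir/$name" "$back" || return 1
  if cmp -s "$src" "$back"; then
    echo "round trip ok: $name"
  else
    echo "round trip MISMATCH: $name"
    return 1
  fi
}

# Example:
#   hdfs_roundtrip /sourcefile /test
```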

Step 9: Start HBase


/hadoop/chbase-0.94.2/bin/hbase shell

create 't1',{NAME=>'cf'}
put 't1','r1','cf:c1','v1'

scan 't1'

If you get the results, then Hadoop and HBase are working.

Advanced HBase configuration

1. sudo nano /etc/security/limits.conf

Add the content

ubuntu  -       nofile  32768

2. sudo nano /etc/pam.d/common-session

Add the content

session required
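The new limit only applies to sessions started after the change, so log out and back in before checking it. On Ubuntu the PAM line that activates limits.conf is normally "session required pam_limits.so"; treat that as an assumption about what the incomplete line above was meant to say. The sketch below checks the open-file limit of the current shell; the function name check_nofile is made up for this example.

```shell
# check_nofile [MIN]
# Succeed if the current shell's open-file limit is at least MIN (default 32768).
check_nofile() {
  min=${1:-32768}
  cur=$(ulimit -n)
  if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$min" ]; then
    echo "nofile limit ok: $cur"
  else
    echo "nofile limit too low: $cur (want at least $min)"
    return 1
  fi
}

# Example (as the ubuntu user, in a fresh login):
#   check_nofile 32768
```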

To format the disk with ext3

sudo mkfs.ext3 -m 1 /dev/xvdd

To mount the disk with the noatime option

sudo mount -o noatime /dev/xvdd /dfs
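To confirm that /dfs really is mounted with noatime, the kernel's mount table can be inspected. The sketch below reads /proc/mounts, where the second column is the mount point and the fourth is the option list; the function name has_noatime is made up for this example.

```shell
# has_noatime MOUNTPOINT [TABLE]
# Succeed if MOUNTPOINT appears in TABLE (default /proc/mounts)
# with noatime among its mount options.
has_noatime() {
  mountpoint=$1
  table=${2:-/proc/mounts}
  awk -v m="$mountpoint" '
    $2 == m && $4 ~ /(^|,)noatime(,|$)/ { ok = 1 }
    END { exit !ok }' "$table"
}

# Example:
#   has_noatime /dfs && echo "/dfs is mounted with noatime"
```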
