
Compression in Hadoop/HBase

Installing Snappy in Hadoop/HBase on Ubuntu

1. Download the Snappy native library (version 1.1.0 at the time of writing) from http://snappy.googlecode.com/files/snappy-1.1.0.tar.gz

    wget http://snappy.googlecode.com/files/snappy-1.1.0.tar.gz

2. Extract, build, and install Snappy. Make sure gcc and g++ are installed first.


    tar zxvf snappy-1.1.0.tar.gz
    cd snappy-1.1.0
    ./configure
    make
    sudo make install
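
To confirm that the install step above put files where the later build expects them, a small check like the following can help. This is only a sketch: the function name snappy_installed is made up for illustration, and it assumes the default /usr/local prefix used by ./configure.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the Snappy header and at least one
# libsnappy library file exist under the given prefix (default /usr/local).
snappy_installed() {
    prefix="${1:-/usr/local}"
    # snappy.h is the main public header of the library.
    [ -f "$prefix/include/snappy.h" ] || return 1
    # Accept any libsnappy artifact (libsnappy.so, libsnappy.so.1, libsnappy.a, ...).
    for lib in "$prefix"/lib/libsnappy.*; do
        [ -e "$lib" ] && return 0
    done
    return 1
}
```

For example, `snappy_installed /usr/local && echo "snappy found"`. If the library is present but the loader still cannot find it later, refreshing the linker cache with `sudo ldconfig` may help.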

3. Download, build and install hadoop-snappy

3.1 Install git and libtool, make sure 'libtoolize' works, then clone the repository from GitHub


    sudo apt-get install git
    sudo apt-get install libtool
    libtoolize --version
    git clone https://github.com/electrum/hadoop-snappy.git
    cd hadoop-snappy/
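
Since the build depends on several command-line tools, it can be handy to check for all of them in one go. The helper below is a sketch, not part of hadoop-snappy; the name missing_tools is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that is
# not found on PATH. Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}
```

A typical call for this guide would be `missing_tools gcc g++ make git libtoolize mvn`, which prints nothing when everything is installed.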

3.2 Install Maven 3 if necessary

    wget http://mirror.sdunix.com/apache/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
    tar zxf apache-maven-3.0.5-bin.tar.gz
    sudo mv apache-maven-3.0.5 /usr/local
    cd /usr/local/
    sudo ln -s apache-maven-3.0.5/ maven
    sudo nano /etc/profile

Add the following lines to /etc/profile:

    export MAVEN_HOME=/usr/local/maven
    export MAVEN_PATH=$MAVEN_HOME/bin
    export PATH=$MAVEN_PATH:$PATH

Reload the profile and check that Maven is on the PATH:

    . /etc/profile
    mvn -version
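
If you prefer to script the profile change instead of editing the file by hand, something like the following sketch avoids adding the same export twice when the setup is re-run. The function name append_once is made up for illustration, and writing to /etc/profile itself would need root.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so repeated runs do not duplicate the exports.
append_once() {
    file="$1"
    line="$2"
    # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For a per-user setup, one could call `append_once "$HOME/.profile" 'export MAVEN_HOME=/usr/local/maven'` and likewise for the other two export lines. The single quotes keep `$MAVEN_HOME` and `$PATH` from being expanded at the time of writing.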


3.3 Hadoop Snappy Installation

Navigate to the hadoop-snappy folder (where git cloned the repository). In my case:

    cd /home/cluster/snappy-1.1.0/hadoop-snappy
    mvn package -Dsnappy.prefix=/usr/local

Enter the target/ directory, extract hadoop-snappy-0.0.1-SNAPSHOT.tar.gz, and copy its libraries into Hadoop:

    cd target
    tar zxf hadoop-snappy-0.0.1-SNAPSHOT.tar.gz
    cd hadoop-snappy-0.0.1-SNAPSHOT
    cp -r lib/* $HADOOP_HOME/lib

If you want to test against HBase, also copy them into HBase:

    cp -r lib/* $HBASE_HOME/lib
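
One thing to watch with the copy commands above: if HADOOP_HOME or HBASE_HOME is not set, the destination expands to /lib. A guarded variant might look like the sketch below; install_snappy_libs is a hypothetical name, not something shipped with hadoop-snappy.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of the unpacked hadoop-snappy lib/
# directory into <target_home>/lib, refusing to run when <target_home> is
# empty or not an existing directory.
install_snappy_libs() {
    src="$1"           # e.g. lib inside hadoop-snappy-0.0.1-SNAPSHOT
    target_home="$2"   # e.g. "$HADOOP_HOME" or "$HBASE_HOME"
    if [ -z "$target_home" ] || [ ! -d "$target_home" ]; then
        echo "target home '$target_home' is not set or not a directory" >&2
        return 1
    fi
    # "src/." copies the directory contents, including the native/ subtree.
    mkdir -p "$target_home/lib" && cp -r "$src"/. "$target_home/lib/"
}
```

From inside hadoop-snappy-0.0.1-SNAPSHOT this would be called as `install_snappy_libs lib "$HADOOP_HOME"` and, for HBase, `install_snappy_libs lib "$HBASE_HOME"`.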

3.4 Add the native library path to hbase-env.sh

    export HBASE_LIBRARY_PATH=$HBASE_HOME/lib/native/Linux-amd64-64


4. Check that Snappy is working for HBase

    hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://namenode:8020/path snappy

You should get something like:

    13/04/12 11:19:34 WARN snappy.LoadSnappy: Snappy native library is available
    13/04/12 11:19:34 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    13/04/12 11:19:34 INFO snappy.LoadSnappy: Snappy native library loaded
    13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new compressor
    13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new decompressor
    SUCCESS
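
If you want to run this check from a script rather than reading the log by eye, one option is to look for the SUCCESS line in the output. The sketch below assumes the test prints a line consisting only of SUCCESS, as in the sample above; compression_ok is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read CompressionTest output on stdin and succeed
# only if some line is exactly "SUCCESS".
compression_ok() {
    grep -qx 'SUCCESS'
}
```

It could be used as `hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://namenode:8020/path snappy 2>&1 | compression_ok && echo "snappy OK"`.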

You can then create a table with Snappy compression from the HBase shell:

    create 't5', { NAME => 'cf1', COMPRESSION => 'SNAPPY' }

