Compression in Hadoop/HBase
Installing Snappy for Hadoop/HBase on Ubuntu
1. Download the Snappy native library (version 1.1.0 at the time of writing):
wget http://snappy.googlecode.com/files/snappy-1.1.0.tar.gz
2. Extract, build and install Snappy. Make sure gcc and g++ are installed first.
tar zxvf snappy-1.1.0.tar.gz
cd snappy-1.1.0
./configure
make
sudo make install
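Before moving on, it can help to confirm that the install step actually produced a shared library. The following is a minimal sketch, assuming the default ./configure prefix of /usr/local; the helper name snappy_installed is made up for this example and is not part of Snappy.

```shell
# Hypothetical helper: succeed if a libsnappy shared object exists under PREFIX/lib.
snappy_installed() {
    prefix="${1:-/usr/local}"
    for f in "$prefix"/lib/libsnappy.so*; do
        # The glob stays literal when nothing matches, so test for existence.
        [ -e "$f" ] && return 0
    done
    return 1
}

if snappy_installed /usr/local; then
    echo "libsnappy found under /usr/local/lib"
else
    echo "libsnappy not found; re-check the build output"
fi
```

If the library is present but later steps cannot load it, running sudo ldconfig to refresh the dynamic linker cache is a common next thing to try.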
3. Download, build and install hadoop-snappy
3.1 Download it from GitHub. Install git and libtool first, and make sure 'libtoolize' works:
sudo apt-get install git
sudo apt-get install libtool
libtoolize --version
git clone https://github.com/electrum/hadoop-snappy.git
cd hadoop-snappy/
3.2 Install Maven 3 if necessary
wget http://mirror.sdunix.com/apache/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz
tar zxf apache-maven-3.0.5-bin.tar.gz
sudo mv apache-maven-3.0.5 /usr/local
cd /usr/local/
sudo ln -s apache-maven-3.0.5/ maven
sudo nano /etc/profile
Add the following lines:
export MAVEN_HOME=/usr/local/maven
export MAVEN_PATH=$MAVEN_HOME/bin
export PATH=$MAVEN_PATH:$PATH
. /etc/profile
mvn -version
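The /etc/profile edit in this step can also be scripted instead of done by hand in an editor. This is only a sketch: append_once and PROFILE_FILE are names invented here, and PROFILE_FILE defaults to a scratch file so the example is safe to run as-is.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    line="$1"
    file="$2"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Assumption: a scratch file is used unless PROFILE_FILE is set by the caller.
PROFILE_FILE="${PROFILE_FILE:-$(mktemp)}"
append_once 'export MAVEN_HOME=/usr/local/maven' "$PROFILE_FILE"
append_once 'export MAVEN_PATH=$MAVEN_HOME/bin' "$PROFILE_FILE"
append_once 'export PATH=$MAVEN_PATH:$PATH' "$PROFILE_FILE"
cat "$PROFILE_FILE"
```

To apply it for real, run it as root with PROFILE_FILE=/etc/profile. Because each line is added only once, re-running the script does not duplicate the exports.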
3.3 Hadoop Snappy Installation
Navigate to the hadoop-snappy folder (where git cloned the repository).
In my case:
cd /home/cluster/snappy-1.1.0/hadoop-snappy
mvn package -Dsnappy.prefix=/usr/local
Enter the target/ directory, extract hadoop-snappy-0.0.1-SNAPSHOT.tar.gz, then copy its libraries into the Hadoop lib directory:
cd target
tar zxf hadoop-snappy-0.0.1-SNAPSHOT.tar.gz
cd hadoop-snappy-0.0.1-SNAPSHOT
cp -r lib/* $HADOOP_HOME/lib
If you want to test against HBase, use the following command:
cp -r lib/* $HBASE_HOME/lib
3.4 Add the native library path to hbase-env.sh
export HBASE_LIBRARY_PATH=$HBASE_HOME/lib/native/Linux-amd64-64
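A quick way to see whether that directory contains anything loadable is to list its shared objects. The sketch below assumes the earlier copy step placed the native libraries under lib/native/Linux-amd64-64; list_native_libs is a hypothetical helper, and /usr/local/hbase is only a placeholder used when HBASE_HOME is unset.

```shell
# Hypothetical helper: print the shared objects in a native library directory,
# returning non-zero if there are none.
list_native_libs() {
    dir="$1"
    found=1
    for f in "$dir"/*.so*; do
        if [ -e "$f" ]; then
            basename "$f"
            found=0
        fi
    done
    return "$found"
}

# Assumption: fall back to a placeholder path when HBASE_HOME is not set.
native_dir="${HBASE_HOME:-/usr/local/hbase}/lib/native/Linux-amd64-64"
list_native_libs "$native_dir" || echo "no native libraries in $native_dir"
```

An empty listing suggests the copy step did not land where HBASE_LIBRARY_PATH points, which is worth fixing before running the compression test.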
4. Check that Snappy is working for HBase:
hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://namenode:8020/path snappy
You should see output similar to:
13/04/12 11:19:34 WARN snappy.LoadSnappy: Snappy native library is available
13/04/12 11:19:34 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/12 11:19:34 INFO snappy.LoadSnappy: Snappy native library loaded
13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new compressor
13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new decompressor
SUCCESS
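When this check is run from a script, the result can be detected by looking for the SUCCESS line rather than reading the log by eye. A minimal sketch, assuming the test prints SUCCESS on a line of its own as in the sample above; compression_ok is a name invented for this example.

```shell
# Hypothetical helper: read CompressionTest output on stdin and succeed only
# if some line consists of exactly SUCCESS.
compression_ok() {
    grep -qx 'SUCCESS'
}

# Demo on a shortened copy of the sample output shown above.
if compression_ok <<'EOF'
13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new compressor
13/04/12 11:19:34 INFO compress.CodecPool: Got brand-new decompressor
SUCCESS
EOF
then
    echo "snappy codec OK"
else
    echo "snappy codec test failed"
fi
```

Against a real cluster, the same function could be fed from the command in step 4, for example by piping its output (with 2>&1) into compression_ok.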
You can then create a table with Snappy compression from the HBase shell:
create 't5', { NAME => 'cf1', COMPRESSION => 'SNAPPY' }