This weekend I wanted to use HBase with Zeppelin on my Hadoop cluster. The HBase interpreter does not appear to be available by default with my install of Hortonworks (HDP 2.6), so I had to install it. This was the first interpreter setup that was not smooth, so I thought I would share my experience. FYI – I am running an IaaS cluster in Azure on Ubuntu VMs.
Here are the simple steps I took to fulfill my weekend dream.
Installing the HBase Interpreter on the Zeppelin Server
You will need to SSH into the server running Zeppelin. You can find this server using Ambari: click on the Zeppelin Notebook menu item, then click or hover over Zeppelin Notebook in the summary section.
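For example (the user and host below are placeholders; use whichever admin account you normally log in with and the host name shown in Ambari):

ssh <admin-user>@<zeppelin-host>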
Once SSH-ed into the server, let’s look at the available interpreters by running the following:
sudo /usr/hdp/current/zeppelin-server/bin/install-interpreter.sh --list
A list of the available interpreter names will be displayed.
To install the HBase interpreter, run:
sudo /usr/hdp/current/zeppelin-server/bin/install-interpreter.sh --name hbase
Once the installation is complete, we are ready to add the HBase interpreter using the Zeppelin UI. (If the hbase group does not show up in the next step, restarting the Zeppelin service from Ambari should pick up the newly installed interpreter.)
Adding the Interpreter to Zeppelin
In the Zeppelin UI, use the drop-down arrow next to your logged-in user name at the top right and click Interpreter.
Click the Create button to add the HBase interpreter. Give it the name hbase and select the group hbase (the install command we ran added hbase to the available groups). Once you choose the hbase group, three default properties will be added (this is very helpful when a lot of properties are required).
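For reference, the defaults come from the stock Zeppelin HBase interpreter; on my install they looked roughly like this (treat the values as a sketch, since they can differ by Zeppelin version):

hbase.home                /usr/lib/hbase
hbase.ruby.sources        lib/ruby
zeppelin.hbase.test.mode  false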
Let’s leave the defaults for now. Click the Save button.
Create a new notebook and… almost forgot, we need to ensure the HBase interpreter is available to the notebook.
Click on the small gear icon in the top right of the notebook.
Select our HBase interpreter – it should be a blue-ish color when selected. Click the Save button.
Alright, back to the notebook.
Create a new notebook paragraph and let’s create a new HBase table with our new interpreter. Right now I want a table to capture comfort conditions inside my house. It is cold and dry, and I have IoT sensors streaming data to be stored in HBase. Maybe someday I can use an Azure Function to tell me to move south when it gets too cold in my house.
Let’s write and run the following, which creates a Sensors table with a single Comfort column family:
%hbase create 'Sensors','Comfort'
You may get an error:
org.apache.zeppelin.interpreter.InterpreterException: HBase ruby sources is not available at '/usr/lib/hbase/lib/ruby'
I discovered the default directory does not exist. There is a simple workaround: use the current client directory, where everything required appears to be available.
Go back to the Interpreter screen and edit your HBase interpreter. Update hbase.home with:
/usr/hdp/current/hbase-client/
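Before saving, you can sanity-check that the ruby sources actually live under the client directory (just a directory listing on the Zeppelin host; if this directory is missing on your build, the workaround will not help):

ls /usr/hdp/current/hbase-client/lib/ruby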
Problem Solved!!!
Or is it…
When the code is run again, I get the error:
org.jruby.exceptions.RaiseException: (NameError) cannot load Java class org.apache.hadoop.hbase.quotas.ThrottleType
Dependency Jars
I found that the following dependencies need to be added to the HBase interpreter (in the Dependencies section of the interpreter settings). Thus far (fingers crossed) I have not had any additional problems after adding these dependencies.
/usr/hdp/current/hbase-client/lib/hbase-client.jar
/usr/hdp/current/hbase-client/lib/hbase-protocol.jar
/usr/hdp/current/hbase-client/lib/hbase-common.jar
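The exact file names and version suffixes can vary between HDP releases, so it may be worth confirming what is actually present on your Zeppelin host before adding the entries, for example:

ls /usr/hdp/current/hbase-client/lib/ | grep -E '^hbase-(client|common|protocol)'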
Everything should work now once the interpreter has been saved and restarted. Running the code again should produce something similar to:
0 row(s) in 1.9130 seconds
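Once the create succeeds, you can sanity-check the table with the standard HBase shell commands list and describe in another paragraph (not part of the original steps, just a quick verification):

%hbase
list
describe 'Sensors'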
Fun with HBase and Zeppelin
Let’s add some data; create a new notebook paragraph and add:
%hbase
put 'Sensors', 1, 'Comfort:Temperature', 68
put 'Sensors', 1, 'Comfort:Humidity', 0.22
put 'Sensors', 2, 'Comfort:Temperature', 65
put 'Sensors', 2, 'Comfort:Humidity', 0.23
put 'Sensors', 3, 'Comfort:Temperature', 60
put 'Sensors', 3, 'Comfort:Humidity', 0.21
put 'Sensors', 4, 'Comfort:Temperature', 55
put 'Sensors', 4, 'Comfort:Humidity', 0.20
put 'Sensors', 4, 'Comfort:Relocate', 'TRUE'
Let’s create a new notebook paragraph and view the data:
%hbase scan 'Sensors'
Your results should look similar to:
ROW  COLUMN+CELL
1    column=Comfort:Humidity, timestamp=1518416761608, value=0.22
1    column=Comfort:Temperature, timestamp=1518416761602, value=68
2    column=Comfort:Humidity, timestamp=1518416761617, value=0.23
2    column=Comfort:Temperature, timestamp=1518416761612, value=65
3    column=Comfort:Humidity, timestamp=1518416761625, value=0.21
3    column=Comfort:Temperature, timestamp=1518416761621, value=60
4    column=Comfort:Humidity, timestamp=1518416761634, value=0.2
4    column=Comfort:Relocate, timestamp=1518416761638, value=TRUE
4    column=Comfort:Temperature, timestamp=1518416761629, value=55
4 row(s) in 0.0150 seconds
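Not part of the original walkthrough, but if you only want a single row back instead of a full scan, the standard HBase shell get command works the same way in a %hbase paragraph:

%hbase
get 'Sensors', 4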
Where Next?
Next time I will query this table from Zeppelin using Phoenix and Hive. Using Hive or Phoenix with HBase in Zeppelin is a great way to conduct data exploration and gain some quick insights.
I hope this helps those having the same problems I had. I will be sure to keep this article current if I come across new information.
Best Regards,
Jonathan
Categories: Hadoop
What version of HBase are you using? I read it works with HBase 1.0 but not with HBase 1.1?
At the time I believe I was using HDP 2.5, which would be HBase 1.1.2
I have tried all of this configuration, but I keep getting this message in Zeppelin:
TABLE
ERROR: KeeperErrorCode = ConnectionLoss for /hbase
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
Are you using HDP or something else? What are the versions: Zeppelin, HBase, etc.?
Hi Jonathan, I am using HDP 2.6 with Zeppelin 0.8. I am getting the same error. Any idea how to resolve it?
I’ll see if I can recreate it this week to get you an answer. Out of curiosity, is there anything you could tell me about the cluster, such as cloud IaaS VMs, an on-premises cluster, or a PaaS cloud offering?
I am trying this on an on-prem cluster. I entered hbase_home in zeppelin-env under conf, along with the necessary entries in the interpreter. If I type
%hbase
help table_help
I get the help for table-reference commands, but from the Zeppelin editor it is not establishing a connection to the namespace and table.
Jonathan, I followed all the steps but I am getting the following error: org.apache.hadoop.hbase.Hconstants
Hi Jonathan, I’m using HDP 3.1.0 and getting the following error:
org.jruby.exceptions.RaiseException: (NameError) cannot link Java class org.apache.hadoop.hbase.HConstants, probable missing dependency: org/apache/commons/lang3/ArrayUtils
at org.jruby.javasupport.JavaUtilities.get_proxy_or_package_under_package(org/jruby/javasupport/JavaUtilities.java:54)
at (Anonymous).method_missing(/builtin/javasupport/java.rb:51)
at Module.HBaseConstants(/usr/hdp/current/hbase-client/lib/ruby/hbase_constants.rb:39)
at (Anonymous).(root)(/usr/hdp/current/hbase-client/lib/ruby/hbase_constants.rb:34)
at org.jruby.RubyKernel.require(org/jruby/RubyKernel.java:1062)
at (Anonymous).(root)(/usr/hdp/current/hbase-client/lib/ruby/hbase_constants.rb:105)
Do you know how to solve it?
Great post. The step of verifying if the interpreter is active helped me a lot, thanks!