DarkMatter in Cyberspace

HBase Notes


HBase can be seen as a key-value database with structure: rowkey -> column family -> column (qualifier) -> value
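This layered key can be sketched with nested hashes. A toy model only: the row, family, and qualifier names below are made up, and real HBase additionally versions every cell by timestamp.

```ruby
# A toy in-memory model of HBase's logical layout:
#   rowkey -> column family -> qualifier -> value
table = Hash.new do |rows, rowkey|
  rows[rowkey] = Hash.new { |fams, fam| fams[fam] = {} }
end

table['row1']['201905']['0568242218'] = 'v1'
table['row2']['201905']['0568242218'] = 'v2'
table['row2']['201906']['0568242218'] = 'v3'

# A get walks the full path to a single cell:
table['row1']['201905']['0568242218']   # => "v1"

# A scan over one column visits every row:
table.each do |rowkey, fams|
  v = fams['201905']['0568242218']
  puts "#{rowkey}: #{v}" if v
end
```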

HBase shell

The HBase shell is a JRuby DSL. Start it with hbase shell:

help
list_namespace
list_namespace_tables 'default'  # show all tables in namespace 'default'
help 'list'
list   # display all the tables in HBase
desc 'SN'  # show column families of table 'SN'
count 'SN'
scan 'SN', {COLUMNS => '201905', LIMIT => 3} # list the first 3 rows of column family '201905'
scan 'SN', {COLUMNS => '201905:0568242218'} # list all cells of column '201905:0568242218' (in all rows)
scan 'SN', {COLUMNS => '201905:0568242218', LIMIT => 3} # list only first 3 cells in above results

get 'SN', 'row1'                        # get requires a rowkey; show all cells of row 'row1'
get 'SN', 'row1', {COLUMN => '201905'}  # restrict to the cells in column family '201905'

create 't2', {NAME => 'fa'}, {NAME => 'fb'}   # create a new table with 2 column families 'fa' and 'fb'
create 'wlwqx', '201901', '201902', '201903'  # create a table 'wlwqx' with 3 column families
alter 't2', NAME => 'fc'          # add a new column family 'fc'
alter 't2', 'delete' => 'fa'      # delete column family 'fa'
put 't2', 'row3', 'fc:kk', 33     # add a cell with value 33
scan 't2'

# you must disable a table before dropping it
disable 't2'
drop 't2'

# you can delete multiple tables with a regex
# the parameter of the following commands uses regex syntax, not shell wildcards,
# so the dot is mandatory
disable_all 't.*'
drop_all 't.*'
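A sketch of the difference (the table names below are made up): in a regex, the dot matches any single character and the star quantifies whatever precedes it, so the shell wildcard 't*' must be written as the regex 't.*'.

```ruby
tables = ['t2', 'tmp', 'trace', 'stats']

# Shell wildcard 't*' translates to the regex t.* (anchored here for clarity):
matched = tables.grep(/\At.*\z/)
# => ["t2", "tmp", "trace"]

# Writing 't*' directly as a regex means "zero or more letters t";
# a full match against these names hits nothing:
wrong = tables.select { |name| /\At*\z/.match?(name) }
# => []
```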

Note: in a Ruby method call the parentheses may be omitted, and when the last argument is a hash (the Ruby equivalent of a Python dict) the braces around it may be omitted as well; this is why commands like alter 't2', NAME => 'fc' work.
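A minimal illustration with a stand-in method (in the real shell, keys like COLUMNS and LIMIT are predefined constants; plain strings are used here):

```ruby
# A stand-in with the same shape as the shell's scan command
def scan(table, options = {})
  [table, options]
end

# The three calls below are equivalent; the shell examples use the last form.
with_parens = scan('SN', {'COLUMNS' => '201905', 'LIMIT' => 3})
no_parens   = scan 'SN', {'COLUMNS' => '201905', 'LIMIT' => 3}
no_braces   = scan 'SN', 'COLUMNS' => '201905', 'LIMIT' => 3

with_parens == no_parens && no_parens == no_braces   # => true
```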

Convert timestamp to datetime

Convert a timestamp such as 1557935890060 (milliseconds since the epoch) in an HBase cell to a normal datetime with:

>>> import time
>>> time.ctime(1557935890060/1000)
'Wed May 15 23:58:10 2019'
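Since the shell itself is JRuby, the same conversion can also be done without leaving it. A sketch using Ruby's Time; passing seconds and microseconds separately avoids float rounding of the millisecond part:

```ruby
ms = 1557935890060

# Time.at takes seconds plus an optional sub-second part in microseconds
t = Time.at(ms / 1000, (ms % 1000) * 1000)

puts t                                        # local time, like time.ctime above
puts t.utc.strftime('%Y-%m-%d %H:%M:%S.%L')   # => 2019-05-15 15:58:10.060
```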

Run Script

To simply run a script:

cat << EOF > get_info
list_namespace_tables 'default'
desc 'SN'
exit
EOF

hbase shell get_info

Save output to a file: hbase shell <<< "scan 'SN', {COLUMNS => '201905'}" > sn201905.txt, or run a multi-line script:

hbase shell << EOF > res.txt
list_namespace_tables 'default'
desc 'SN'
EOF

Export Data to CSV file

Export table '111' from CDH HBase on host 220:

cd /opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/hbase
hbase org.apache.hadoop.hbase.mapreduce.Export 111 dump111
hadoop fs -ls ./dump111
hadoop fs -cat dump111/part-m-00000

Dump and Restore

$ hbase org.apache.hadoop.hbase.mapreduce.Export wlwqx /tmp/out
$ hadoop fs -ls /tmp/out  # verify the dumped files
$ hbase shell
> create 'newwl2', '201901'
> exit
$ hbase org.apache.hadoop.hbase.mapreduce.Import newwl2 /tmp/out
$ hadoop fs -get /tmp/out .

Note:

  • The first parameter of the Import command (newwl2 in this case) must already exist and have the same column families as the origin table (wlwqx in this case);

  • The 2nd parameter of the Import command (/tmp/out in this case) is a folder on HDFS, not in the local FS;

  • The Export command dumps the HBase table into HDFS; you need to download the files to the local FS to back them up.

Ref:

  • Export HBase data to csv

  • Working with the HBase Import and Export Utility



Published

Jun 11, 2019

Last Updated

Sep 16, 2019

Category

Tech

Tags

  • hbase
