Infinite Programming Tips: Commands Available on HBase Shell

Thursday, April 5, 2012

Commands Available on HBase Shell

Groups of commands and their explanations:

General Commands:
1. status : Shows the cluster status, e.g. "hbase> status"
2. version : Shows the HBase version
DDL Commands:
1. alter : Using this command you can alter a table's schema

Alter column family schema; pass table name and a dictionary

specifying new column family schema. Dictionaries are described

on the main help command output. Dictionary must include name

of column family to alter. For example,

To change or add the 'f1' column family in table 't1' from defaults

to instead keep a maximum of 5 cell VERSIONS, do:

hbase> alter 't1', NAME => 'f1', VERSIONS => 5

To delete the 'f1' column family in table 't1', do:

hbase> alter 't1', NAME => 'f1', METHOD => 'delete'

or a shorter version:

hbase> alter 't1', 'delete' => 'f1'

You can also change table-scope attributes like MAX_FILESIZE,

MEMSTORE_FLUSHSIZE, READONLY, and DEFERRED_LOG_FLUSH.

For example, to change the maximum file size of table 't1' (the size beyond which a region is split) to 128MB, do:

hbase> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '134217728'

There could be more than one alteration in one command:

hbase> alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
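The MAX_FILESIZE value in the example above is a byte count. As a minimal sketch in plain Ruby (the language the HBase shell is built on), the hypothetical helper megabytes_to_bytes shows where 134217728 comes from. The helper is an illustration only and is not part of the shell:

```ruby
# Hypothetical helper: convert a size in megabytes to bytes,
# the unit the MAX_FILESIZE example uses.
def megabytes_to_bytes(megabytes)
  megabytes * 1024 * 1024
end

puts megabytes_to_bytes(128)  # => 134217728
```

The result can then be pasted into the alter command as the MAX_FILESIZE value.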

create : create table in hbase

Create table; pass table name, a dictionary of specifications per

column family, and optionally a dictionary of table configuration.

Dictionaries are described in the GENERAL NOTES section of the shell's help output.

Examples:

hbase> create 't1', {NAME => 'f1', VERSIONS => 5}

hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}

hbase> # The above in shorthand would be the following:

hbase> create 't1', 'f1', 'f2', 'f3'

hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
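TTL is given in seconds, so the 2592000 in the last example corresponds to 30 days. A small sketch in plain Ruby, assuming you prefer to think in days; days_to_ttl_seconds is a hypothetical helper and not a shell command:

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper: express a retention period in days as a TTL in seconds.
def days_to_ttl_seconds(days)
  days * SECONDS_PER_DAY
end

puts days_to_ttl_seconds(30)  # => 2592000
```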

describe : Gives the description of the named table, e.g. "hbase> describe 't1'"
disable : Start disable of named table:

e.g. "hbase> disable 't1'"

drop : Drop the named table.

Table must first be disabled. If table has

more than one region, run a major compaction on .META.:

hbase> major_compact ".META."

enable : Start enable of named table

e.g. "hbase> enable 't1'"

exists : Does the named table exist? e.g. "hbase> exists 't1'"
is_disabled : Check if table is disabled
is_enabled : Check if table is enabled
list : List all the tables present in HBase

DML Commands:
1. count :

Count the number of rows in a table. This operation may take a LONG

time (Run '$HADOOP_HOME/bin/hadoop jar hbase.jar rowcount' to run a

counting mapreduce job). Current count is shown every 1000 rows by

default. Count interval may be optionally specified. Scan caching

is enabled on count scans by default. Default cache size is 10 rows.

If your rows are small in size, you may want to increase this

parameter. Examples:

hbase> count 't1'

hbase> count 't1', INTERVAL => 100000

hbase> count 't1', CACHE => 1000

hbase> count 't1', INTERVAL => 10, CACHE => 1000

delete

Put a delete cell value at specified table/row/column and optionally

timestamp coordinates. Deletes must match the deleted cell's

coordinates exactly. When scanning, a delete cell suppresses older

versions. To delete a cell from 't1' at row 'r1' under column 'c1'

marked with the time 'ts1', do:

hbase> delete 't1', 'r1', 'c1', ts1
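The statement that a delete cell suppresses older versions can be pictured with a toy model. The sketch below is plain Ruby and not HBase code. The helper visible_versions and the sample timestamps are hypothetical, and it assumes the delete marker hides every version written at or before the marker's timestamp:

```ruby
# Toy model of one cell: a hash of timestamp => value.
# Returns the versions a scan would still see after a delete marker
# placed at delete_ts (assumption: versions at or before it are hidden).
def visible_versions(versions, delete_ts)
  versions.select { |ts, _value| ts > delete_ts }
end

cell = { 100 => 'a', 200 => 'b', 300 => 'c' }
p visible_versions(cell, 200).values  # => ["c"]
```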

deleteall

Delete all cells in a given row; pass a table name, row, and optionally

a column and timestamp. Examples:

hbase> deleteall 't1', 'r1'

hbase> deleteall 't1', 'r1', 'c1'

hbase> deleteall 't1', 'r1', 'c1', ts1

get

Get row or cell contents; pass table name, row, and optionally

a dictionary of column(s), timestamp, timerange and versions. Examples:

hbase> get 't1', 'r1'

hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}

hbase> get 't1', 'r1', {COLUMN => 'c1'}

hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}

hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}

hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}

hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}

hbase> get 't1', 'r1', 'c1'

hbase> get 't1', 'r1', 'c1', 'c2'

hbase> get 't1', 'r1', ['c1', 'c2']

get_counter

Return a counter cell value at specified table/row/column coordinates.

A counter cell should be managed with the atomic increment function of HBase

and the data should be binary encoded. Example:

hbase> get_counter 't1', 'r1', 'c1'
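"Binary encoded" here means the counter is stored as raw bytes rather than as a printable string. As a hedged sketch in plain Ruby, assuming the usual HBase layout of an 8-byte big-endian signed long, the hypothetical helpers encode_counter and decode_counter show what such a value looks like:

```ruby
# Hypothetical helpers, assuming counters are 8-byte big-endian signed longs.
def encode_counter(number)
  [number].pack('q>')
end

def decode_counter(bytes)
  bytes.unpack('q>').first
end

raw = encode_counter(10)
p raw.bytes            # => [0, 0, 0, 0, 0, 0, 0, 10]
p decode_counter(raw)  # => 10
```

This is why a counter cell read with a plain get shows up as escaped bytes, while get_counter prints the number.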

incr

Increments a cell 'value' at specified table/row/column coordinates.

To increment a cell value in table 't1' at row 'r1' under column

'c1' by 1 (can be omitted) or 10 do:

hbase> incr 't1', 'r1', 'c1'

hbase> incr 't1', 'r1', 'c1', 1

hbase> incr 't1', 'r1', 'c1', 10

put

Put a cell 'value' at specified table/row/column and optionally

timestamp coordinates. To put a cell value into table 't1' at

row 'r1' under column 'c1' marked with the time 'ts1', do:

hbase> put 't1', 'r1', 'c1', 'value', ts1

scan

Scan a table; pass table name and optionally a dictionary of scanner

specifications. Scanner specifications may include one or more of:

TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH,

or COLUMNS. If no columns are specified, all columns will be scanned.

To scan all members of a column family, leave the qualifier empty as in

'col_family:'.

Some examples:

hbase> scan '.META.'

hbase> scan '.META.', {COLUMNS => 'info:regioninfo'}

hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}

hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}

hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}

For experts, there is an additional option -- CACHE_BLOCKS -- which

switches block caching for the scanner on (true) or off (false). By

default it is enabled. Examples:

hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
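By default HBase stamps cells with the server time in milliseconds since the Unix epoch, so TIMESTAMP and TIMERANGE values are normally millisecond counts. A small sketch in plain Ruby of turning a Ruby Time into such a value follows. The helper to_hbase_millis is hypothetical, and the sample instant simply reuses the first number from the TIMERANGE example above, read as seconds:

```ruby
# Hypothetical helper: convert a Ruby Time to whole milliseconds
# since the Unix epoch.
def to_hbase_millis(time)
  time.to_i * 1000 + time.usec / 1000
end

start = Time.at(1303668804)  # interpreted as seconds since the epoch
puts to_hbase_millis(start)  # => 1303668804000
```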

truncate

Disables, drops and recreates the specified table. Example:

hbase> truncate 't1'

Tools Commands:
1. assign

Assign a region. Add 'true' to force assign of a region. Use with caution.

If region already assigned, this command will just go ahead and reassign

the region anyways. For experts only.

balance_switch

Enable/Disable balancer. Returns previous balancer state.

Examples:

hbase> balance_switch true

hbase> balance_switch false

balancer :

Triggers the cluster balancer and returns true if it ran (it will not run while regions are in transition). HBase has a built-in balancer that by default runs every 5 minutes and, once started, tries to even out the number of regions assigned to each region server.
close_region :

Close a single region. Optionally specify regionserver. Connects to the

regionserver and runs close on hosting regionserver. The close is done

without the master's involvement (It will not know of the close). Once

closed, region will stay closed. Use assign to reopen/reassign. Use

unassign or move to assign the region elsewhere on cluster. Use with

caution. For experts only. Examples:

hbase> close_region 'REGIONNAME'

hbase> close_region 'REGIONNAME', 'REGIONSERVER_IP:PORT'

compact :

Compact all regions in passed table or pass a region row to compact an individual region

flush

Flush all regions in passed table or pass a region row to

flush an individual region. For example:

hbase> flush 'TABLENAME'

hbase> flush 'REGIONNAME'

major_compact :

Run major compaction on passed table or pass a region row to major compact an individual region

move

Move a region. Optionally specify target regionserver else we choose one

at random. NOTE: You pass the encoded region name, not the region name so

this command is a little different to the others. The encoded region name

is the hash suffix on region names: e.g. if the region name were

TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then

the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396

A server name is its host, port plus startcode. For example:

host187.example.com,60020,1289493121758

Examples:

hbase> move 'ENCODED_REGIONNAME'

hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
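The two naming rules above can be sketched in plain Ruby. The helpers encoded_region_name and parse_server_name are hypothetical illustrations, not shell commands, and they assume names shaped exactly like the examples in the text:

```ruby
# Hypothetical helper: the encoded name is the hash between the last
# '.' separator and the trailing '.' of a full region name.
def encoded_region_name(region_name)
  region_name.chomp('.').rpartition('.').last
end

# Hypothetical helper: a server name is "host,port,startcode".
def parse_server_name(server_name)
  host, port, startcode = server_name.split(',')
  { host: host, port: port.to_i, startcode: startcode.to_i }
end

region = 'TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.'
puts encoded_region_name(region)  # => 527db22f95c8a9e0116f0cc13c680396

server = parse_server_name('host187.example.com,60020,1289493121758')
puts server[:host]       # => host187.example.com
puts server[:port]       # => 60020
puts server[:startcode]  # => 1289493121758
```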

split

Split table or pass a region row to split individual region

unassign : Unassign a region. The region is closed in its current location and then reopened. Use with caution.
zk_dump : Dump status of HBase cluster as seen by ZooKeeper. Sample output:

HBase is rooted at /hbase

Master address: shashwat.com:60000

Region server holding ROOT: shashwat.com:60020

Region servers:

shashwat.com:60020

Quorum Server Statistics:

localhost:2181

Zookeeper version: 3.3.2-1031432, built on 11/05/2010 05:32 GMT

Clients:

/127.0.0.1:50641[1](queued=0,recved=63,sent=65)

/127.0.0.1:50637[1](queued=0,recved=173,sent=226)

/127.0.0.1:50644[1](queued=0,recved=164,sent=198)

/127.0.0.1:50643[1](queued=0,recved=63,sent=65)

/127.0.0.1:51874[0](queued=0,recved=1,sent=0)

/127.0.0.1:50713[1](queued=0,recved=63,sent=63)

Latency min/avg/max: 0/8/210

Received: 536

Sent: 626

Outstanding: 0

Zxid: 0x32f0

Mode: standalone

Node count: 12

Replication Commands:
1. add_peer :

Add a peer cluster to replicate to. The id must be a short integer, and

the cluster key is composed like this:

hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent

This gives a full path for HBase to connect to another cluster.

Examples:

hbase> add_peer '1', "server1.cie.com:2181:/hbase"

hbase> add_peer '2', "zk1,zk2,zk3:2182:/hbase-prod"
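The cluster key format described above is three configuration values joined with colons. A minimal sketch in plain Ruby, assuming you already know the peer's ZooKeeper quorum hosts, client port, and parent znode; cluster_key is a hypothetical helper and not part of the shell:

```ruby
# Hypothetical helper: build a replication cluster key from
# hbase.zookeeper.quorum, hbase.zookeeper.property.clientPort
# and zookeeper.znode.parent.
def cluster_key(quorum_hosts, client_port, znode_parent)
  [quorum_hosts.join(','), client_port, znode_parent].join(':')
end

puts cluster_key(%w[zk1 zk2 zk3], 2182, '/hbase-prod')
# => zk1,zk2,zk3:2182:/hbase-prod
```

The resulting string is what goes in the second argument of add_peer.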

disable_peer :

Stops the replication stream to the specified cluster, but still

keeps track of new edits to replicate.

CURRENTLY UNSUPPORTED

Examples:

hbase> disable_peer '1'

enable_peer

Restarts the replication to the specified peer cluster,

continuing from where it was disabled.

CURRENTLY UNSUPPORTED

Examples:

hbase> enable_peer '1'

remove_peer

Stops the specified replication stream and deletes all the meta

information kept about it. Examples:

hbase> remove_peer '1'

start_replication

Restarts all the replication features. The state in which each

stream starts is undetermined.

WARNING:

start/stop replication is only meant to be used in critical load situations.

Examples:

hbase> start_replication

stop_replication

Stops all the replication features. The state in which each

stream stops is undetermined.

WARNING:

start/stop replication is only meant to be used in critical load situations.

Examples:

hbase> stop_replication
