Hive Installation
After Installing Hadoop:
Download and extract Hive, then move it to /usr/local/hive:
wget https://archive.apache.org/dist/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
tar -xvzf apache-hive-3.1.2-bin.tar.gz
sudo mv apache-hive-3.1.2-bin /usr/local/hive
Set Environment Variables: Edit your .bashrc file to include Hive environment variables:
nano ~/.bashrc
Add the following lines:
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
Apply the changes:
source ~/.bashrc
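If you would rather not edit the file by hand, the two export lines can be appended from the shell. The snippet below is a minimal sketch, not part of the original steps: `add_line_once` and the `BASHRC_FILE` variable are hypothetical names introduced here, and by default it writes to a preview file in the current directory so you can inspect the result first.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present,
# so running the script twice does not duplicate the exports.
add_line_once() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# BASHRC_FILE is a hypothetical override; the default is a preview file.
target="${BASHRC_FILE:-./bashrc.preview}"

# Single quotes keep $PATH and $HIVE_HOME unexpanded in the written line.
add_line_once 'export HIVE_HOME=/usr/local/hive' "$target"
add_line_once 'export PATH=$PATH:$HIVE_HOME/bin' "$target"
```

To apply it for real, run the script with BASHRC_FILE=~/.bashrc and then run source ~/.bashrc as above.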
Configure Hive: Edit the hive-config.sh file, which ships in $HIVE_HOME/bin, to set the Hadoop home path:
nano $HIVE_HOME/bin/hive-config.sh
Add the following line:
export HADOOP_HOME=/usr/local/hadoop
Create Hive Directories in HDFS:
hdfs dfs -mkdir -p /tmp
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod ugo+w /
hdfs dfs -chmod ugo+w /tmp
hdfs dfs -chmod ugo+w /user/hive/warehouse
Initialize the Derby Database:
schematool -dbType derby -initSchema
On success, the output ends with:
Initialization script completed
schemaTool completed
Start Hive:
hive
hive> show databases;
Hive Data Types
Integer types
TINYINT 1 byte
SMALLINT 2 bytes
INT 4 bytes
BIGINT 8 bytes
Floating-point types
FLOAT 4 bytes, single precision
DOUBLE 8 bytes, double precision
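String types
STRING "xyz", variable length with no declared maximum
VARCHAR(n) variable length, declared maximum n from 1 to 65535
CHAR(n) fixed length n, up to 255, padded with spaces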
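Complex types
STRUCT STRUCT<field_name:type, ...>, built with struct(val1, val2, ...)
MAP MAP<key_type, value_type>, built with map('hey', value)
ARRAY ARRAY<type>, built with array('a', 'b', 'c')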
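To see the complex types in a table definition, the sketch below saves a few HiveQL statements to a script file. The file name complex_types.hql, the table contact_demo, and its columns and values are all made up for illustration; the snippet only writes the file and does not start Hive.

```shell
# Write a small HiveQL script that declares one column of each complex type.
# The quoted 'EOF' stops the shell from expanding anything inside.
cat > complex_types.hql <<'EOF'
-- contact_demo is a hypothetical table used only for this example.
CREATE TABLE contact_demo (
  name    STRING,
  address STRUCT<city:STRING, zip:INT>,
  phones  MAP<STRING, STRING>,
  tags    ARRAY<STRING>
);

-- Constructor functions build complex values inline.
SELECT named_struct('city', 'Pune', 'zip', 411001),
       map('home', '555-0100'),
       array('a', 'b', 'c');
EOF

echo "Wrote complex_types.hql"
```

Run it with hive -f complex_types.hql once Hive is installed and the schema has been initialized.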
Hive DDL Commands
Data Definition Language
create
use
show
describe
drop
alter
hive> show databases;
hive> create schema db;
hive> drop database db;
hive> show databases;
hive> create schema emp_db;
hive> show databases;
hive> use emp_db;
hive> create table employee (id int, name string, salary int, position string);
hive> describe employee;
hive> alter table employee change position role string;
hive> describe employee;
hive> drop table employee;
hive> show tables employee;
hive> show tables;
Hive DML Commands
Data Manipulation Language
load
select
insert
delete
update
Check Database
hive> show databases;
default
emp_db
hive> use emp_db;
hive> create table employee (id int, name string, salary int, role string) row format delimited fields terminated by '\t';
Create sample.txt, separating the fields with the Tab key (not spaces):
$ nano sample.txt
1 Alen 50000 devloper
2 Tom 45000 jr. devloper
3 Harry 20000 sales
Save and exit: press CTRL+X, then Y, then Enter.
$ ls
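Typing tabs in an editor is easy to get wrong, and a row with spaces instead of tabs loads as NULL columns. As an alternative, here is a minimal sketch that writes the same three rows with printf and checks that every row has four tab-separated fields. The values are copied as-is from the sample above; the file is written to the current directory, so adjust the path in the load command if yours differs.

```shell
# printf repeats the format for each group of four arguments,
# and \t in the format becomes a real Tab character.
printf '%s\t%s\t%s\t%s\n' \
  1 Alen  50000 'devloper' \
  2 Tom   45000 'jr. devloper' \
  3 Harry 20000 'sales' > sample.txt

# Check that every row splits into exactly 4 fields on Tab.
if awk -F'\t' 'NF != 4 { bad = 1 } END { exit bad + 0 }' sample.txt; then
  echo "sample.txt: every row has 4 tab-separated fields"
else
  echo "sample.txt: at least one row is malformed" >&2
fi
```

Note that 'jr. devloper' contains a space, which is why the table uses Tab rather than a space as the field delimiter.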
hive> load data local inpath '/home/user/sample.txt' overwrite into table employee;
hive> select * from employee;
1 Alen 50000 devloper
2 Tom 45000 jr. devloper
3 Harry 20000 sales
hive> insert into table employee values (4, 'Vicky', 40000, 'Marketing');
hive> select * from employee;
4 Vicky 40000 Marketing
1 Alen 50000 devloper
2 Tom 45000 jr. devloper
3 Harry 20000 sales
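hive> delete from employee where salary=20000;
hive> update employee set role='Sr devloper' where id=1;
Both statements fail on this table: nothing is deleted or updated. DELETE and UPDATE work only on transactional tables stored in the ORC (Optimized Row Columnar) format. The solution is covered in the next post.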
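As a rough preview of what an ORC-based setup might look like, the sketch below saves a HiveQL script that creates a transactional copy of the table and runs the DELETE and UPDATE against it. This is an assumption-laden sketch for Hive 3.x, not the author's solution: employee_acid and employee_acid.hql are hypothetical names, the two SET lines may already be configured in hive-site.xml, and it has not been verified against the embedded Derby metastore used here.

```shell
# Write the HiveQL to a script file; the quoted 'EOF' disables shell expansion.
cat > employee_acid.hql <<'EOF'
-- Session settings commonly needed for transactional tables (assumed here).
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

USE emp_db;

-- employee_acid is a hypothetical name chosen for this sketch.
CREATE TABLE employee_acid (id INT, name STRING, salary INT, role STRING)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Copy the rows from the text table, then modify the ORC copy.
INSERT INTO TABLE employee_acid SELECT * FROM employee;
DELETE FROM employee_acid WHERE salary = 20000;
UPDATE employee_acid SET role = 'Sr devloper' WHERE id = 1;
SELECT * FROM employee_acid;
EOF

echo "Wrote employee_acid.hql"
```

Run it with hive -f employee_acid.hql. If the transaction manager is not accepted on your installation, the settings above are the first thing to check.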
Hive Extra Commands with Practical Examples
