Installing and configuring Hive 2.1.0 with MySQL 5.1 (local mode) on a fully distributed Hadoop cluster

Database · Source: windy_girl
  1. Download Hive and related packages
  2. Configure and deploy
    2.1 Configure environment variables
    2.2 Create the required directories on HDFS
    2.3 Edit hive-site.xml
    2.4 Edit hive-log4j2.properties and hive-exec-log4j2.properties
    2.5 Create the required objects in MySQL
    2.6 Initialize with schematool
  3. Start Hive

Prerequisites: MySQL is already installed, and after Hadoop starts, jps shows at least the following six processes:
DataNode
Jps
NodeManager
NameNode
ResourceManager
SecondaryNameNode
System environment: master node (master): 172.16.77.98
Slave nodes (slave): 172.16.77.94, 172.16.77.95, 172.16.77.97
MySQL is installed on the master; Hive is also downloaded and installed on the master.
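The prerequisite check above can be scripted. A small sketch (the helper name check_daemons is my own, not from the article) that reports any required daemon missing from the output of jps:

```shell
# check_daemons: read `jps` output from its first argument and print a
# MISSING line for each required Hadoop daemon that is absent.
check_daemons() {
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$1" | grep -qw "$d" || echo "MISSING: $d"
  done
}
# typical use on the master node before installing Hive:
# check_daemons "$(jps)"
```

No output means all five daemons are up (jps also lists itself as Jps, which needs no check).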

1. Download Hive

[yqzhou@groot-sm ~]$ wget http://mirrors.cnnic.cn/apache/hive/hive-2.1.0/apache-hive-2.1.0-bin.tar.gz
// Unpack the Hive tarball
[yqzhou@groot-sm ~]$ tar -zxf apache-hive-2.1.0-bin.tar.gz

// Create a symlink
[yqzhou@groot-sm ~]$ ln -s apache-hive-2.1.0-bin hive

[yqzhou@groot-sm ~]$ ls
apache-hive-2.1.0-bin
apache-hive-2.1.0-bin.tar.gz
hadoop-2.6.4
hive -> apache-hive-2.1.0-bin
soft

// Download the MySQL JDBC connector (Connector/J)
[yqzhou@groot-sm ~]$ wget http://cdn.mysql.com/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz

// Unpack
[yqzhou@groot-sm ~]$ tar xzvf mysql-connector-java-5.1.39.tar.gz

// Contents of the unpacked directory
[yqzhou@groot-sm soft]$ cd mysql-connector-java-5.1.39

[yqzhou@groot-sm mysql-connector-java-5.1.39]$ ll

total 1448
-rw-r--r--. 1 yqzhou yqzhou  90721 May  4 19:11 build.xml
-rw-r--r--. 1 yqzhou yqzhou 239747 May  4 19:11 CHANGES
-rw-r--r--. 1 yqzhou yqzhou  18122 May  4 19:11 COPYING
drwxr-xr-x. 2 yqzhou yqzhou   4096 Aug  8 10:50 docs
-rw-r--r--. 1 yqzhou yqzhou 989497 May  4 19:11 mysql-connector-java-5.1.39-bin.jar
-rw-r--r--. 1 yqzhou yqzhou  61407 May  4 19:11 README
-rw-r--r--. 1 yqzhou yqzhou  63658 May  4 19:11 README.txt
drwxr-xr-x. 8 yqzhou yqzhou   4096 May  4 19:11 src

// Copy the driver jar into Hive's lib directory
[yqzhou@groot-sm mysql-connector-java-5.1.39]$ cp mysql-connector-java-5.1.39-bin.jar ~/hive/lib/

2.1 Configure environment variables

// Without root privileges, edit ~/.bashrc; with root, you can edit /etc/profile instead
[yqzhou@groot-sm ~]$ vi ~/.bashrc

export HIVE_HOME=/home/yqzhou/apache-hive-2.1.0-bin
export PATH=$PATH:$HIVE_HOME/bin

[yqzhou@groot-sm ~]$ source ~/.bashrc
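A quick way to verify the variables took effect after sourcing ~/.bashrc; a sketch using the paths from above:

```shell
# Re-create the two exports, then confirm hive/bin actually ended up on
# PATH. Appending with $PATH: (rather than overwriting PATH) keeps
# existing commands such as ls and vi reachable.
export HIVE_HOME=/home/yqzhou/apache-hive-2.1.0-bin
export PATH="$PATH:$HIVE_HOME/bin"
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive/bin is on PATH" ;;
  *)                    echo "hive/bin is NOT on PATH" ;;
esac
```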

2.2 Create the Hive data directories on HDFS

[yqzhou@groot-sm ~]$ $HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp
$HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
$HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/log
$HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/log
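The commands above can be collected into a small reusable helper; a sketch (the function name is mine, and the filesystem command is passed in as an argument so the same helper works with either hadoop fs or hdfs dfs):

```shell
# make_hive_dirs FS: create the three Hive directories group-writable.
# -mkdir -p creates missing parent directories and is a no-op if the
# directory already exists, so the helper is safe to re-run.
make_hive_dirs() {
  for d in /tmp /user/hive/warehouse /user/hive/log; do
    $1 -mkdir -p "$d"
    $1 -chmod g+w "$d"
  done
}
# on the cluster:
# make_hive_dirs "$HADOOP_HOME/bin/hadoop fs"
```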

2.3 Edit hive-site.xml

[yqzhou@groot-sm ~]$ cd hive/conf
[yqzhou@groot-sm conf]$ cp hive-default.xml.template hive-site.xml

// Each of the settings below goes inside its own <property> element in hive-site.xml.

// Set the location of the Hive warehouse on HDFS (the directory created in 2.2)

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>

// Set the address and name of the database that stores the metadata (the hive database is created in MySQL in 2.5)

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://172.16.77.98:3306/hive</value>
  <description>
    JDBC connect string for a JDBC metastore.
    To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
    For example, jdbc:postgresql://myhost/dbName?ssl=true for postgres database.
  </description>
</property>

// The driver class name

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

// The user name for the connection

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>

// The password for the connection

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>

// Important setting: the metastore Thrift URI

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://172.16.77.98:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
If Hive fails to start with "java.net.URISyntaxException: Relative path in absolute URI: {system:java.io.tmpdir%7D/$%7B", additionally change the following three properties in hive-site.xml.


<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/ is created, with ${hive.scratch.dir.permission}.</description>
</property>

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive/local</value>
  <description>Local scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive/resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
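After hand-editing hive-site.xml it is easy to leave a <property> element unclosed, which produces cryptic startup errors. A crude, dependency-free check is to compare the counts of opening and closing tags; a sketch, demonstrated on a miniature sample file (the sample path is my own; point f at ~/hive/conf/hive-site.xml on the real machine):

```shell
# Unequal counts of <property> and </property> mean an element was left
# unclosed somewhere in the file.
f=/tmp/hive-site.sample.xml
cat > "$f" <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
EOF
open_count=$(grep -c '<property>' "$f")
close_count=$(grep -c '</property>' "$f")
if [ "$open_count" -eq "$close_count" ]; then
  echo "balanced: $open_count property element(s)"
else
  echo "UNBALANCED: $open_count open vs $close_count close"
fi
```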

2.4 Edit hive-log4j2.properties and hive-exec-log4j2.properties under conf

(1) $ cp hive-log4j2.properties.template hive-log4j2.properties

Open hive-log4j2.properties and change the log directory from /tmp/${user.name} to /user/hive/log/, i.e.:

property.hive.log.dir = /user/hive/log/${sys:user.name}

(2) $ cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties

property.hive.log.dir = /user/hive/log/exec/${sys:user.name}
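The change to property.hive.log.dir can also be made with a one-line sed. A sketch, demonstrated on a throwaway copy of the relevant line (the sample path is my own; on the real machine run the sed command against ~/hive/conf/hive-log4j2.properties):

```shell
# Recreate the default line, then rewrite it to point at the chosen log
# directory.
f=/tmp/hive-log4j2.sample
printf 'property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}\n' > "$f"
sed -i 's|^property\.hive\.log\.dir *=.*|property.hive.log.dir = /user/hive/log/${sys:user.name}|' "$f"
cat "$f"
```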

2.5 Create the required objects in MySQL

Create a dedicated MySQL account for Hive (user: hive, password: hive) and grant it permission to log in remotely from any host:

mysql> create user 'hive' identified by 'hive';
Query OK, 0 rows affected (0.28 sec)
# allow remote login from any host as the hive user
mysql> grant all privileges on *.* to 'hive'@'%' with grant option;
Query OK, 0 rows affected (0.00 sec)
# create the dedicated metastore database for Hive
mysql> create database hive;
Query OK, 1 row affected (0.00 sec)
# list all databases
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
+--------------------+
4 rows in set (0.00 sec)

2.6 Initialize with schematool

After configuration is complete, run "schematool -dbType mysql -initSchema" once on the server, before starting the Hive service, to initialize the metastore. Note: this is a one-time operation; do not run it again the next time Hive starts. Execute it from the bin directory.
[yqzhou@groot-sm bin]$ schematool -dbType mysql -initSchema
Metastore connection URL:    jdbc:mysql://172.16.77.98:3306/hive
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:   hive
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed

3. Start Hive

Start the metastore service:
[yqzhou@groot-sm bin]$ hive --service metastore &
[2] 31995
[1]   Killed                  hive --service metastore  (wd: ~)
(wd now: ~/hive/bin)

// If the service starts successfully, the session switches to the following output
[yqzhou@groot-sm bin]$ Starting Hive Metastore Server

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yqzhou/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yqzhou/soft/hadoop-2.6.4/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
(the cursor keeps blinking with no further output...)

Press Ctrl+C to get the shell prompt back; the metastore was started in the background and keeps running.

[yqzhou@groot-sm bin]$ jps
2771 DataNode
32229 Jps
3192 NodeManager
31995 RunJar
2668 NameNode
3085 ResourceManager
2926 SecondaryNameNode

There is now an extra RunJar process (PID 31995, the backgrounded metastore), so the start succeeded.

[yqzhou@groot-sm ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yqzhou/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yqzhou/soft/hadoop-2.6.4/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/home/yqzhou/apache-hive-2.1.0-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> show tables;
OK
test1
Time taken: 1.562 seconds, Fetched: 1 row(s)
hive> 

test1 is a table I created the first time I started Hive.

Next, connect to Hive from a slave client. If HIVE_HOME and the other environment variables are not configured on the slave, you must cd into Hive's bin directory and start it with ./hive; otherwise the shell reports that the command cannot be found:

[yqzhou@groot-hs bin]$ hive
-bash: hive: command not found