Log Analysis Tools: Installation (Cloudera Manager + Druid)

I recently needed some data-processing tools, and getting them installed turned out to be fairly fiddly. I'm recording the whole process here so that future installs can be done quickly.

Cloudera Manager can manage CDH, Flume, Spark, and so on, mainly through managing machine state. There is a free edition; it is not open source, but it works quite well.

Compared with tools like Hive or Kudu, Druid is easy to pick up for log analysis. Apart from not keeping the raw data, its query speed and graphical interface make it possible to stand up a simple log analysis system quickly.

I chose these two tools to build the first, simple version of my log analysis system.

1. Installing Cloudera Manager

1.1 Machine preparation

I used AWS EC2 directly: CentOS 6.8, 64 GB RAM, 8 cores, a 40 GB SSD as the root volume plus a 1 TB disk mounted at /cluster, with an Elastic IP attached.

Check the existing disks and current filesystem type first:
[root@ip-172-31-12-89 software]# df -T
The filesystem type here is ext4.

mkdir /cluster  
mkfs -t ext4 /dev/xvdb  
mount -t ext4 /dev/xvdb /cluster  
df -T   # confirm the mount
  • Pitfall: do not use AWS's own default OS image. It has no /etc/redhat-release, so CM cannot identify the version. The AWS image looks like CentOS, but we don't know which version it was derived from (possibly 5.*). To avoid trouble later, do not use it; pick a proper CentOS image on AWS instead. If an ops colleague provisions the machines for you, remind them of this. I wasted time on it.
1.1.1 Set up the network environment

vi /etc/hosts:

172.31.13.153 yourmachine02  
172.31.13.191 yourmachine03  
172.31.13.100 yourmachine04  

Set up passwordless SSH access:

ssh-keygen  
vi ~/.ssh/authorized_keys  
chmod 600  ~/.ssh/authorized_keys

Run cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys, collect the id_rsa.pub of every machine into authorized_keys, then copy the merged authorized_keys back to all machines and test with ssh.  
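A minimal sketch of pushing the merged key file out (the hostnames are the ones from /etc/hosts above; adjust to yours):

for host in yourmachine02 yourmachine03 yourmachine04; do
  scp ~/.ssh/authorized_keys root@$host:~/.ssh/authorized_keys
  ssh root@$host 'chmod 600 ~/.ssh/authorized_keys'
done
ssh root@yourmachine02 hostname   # should print the hostname with no password prompt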

System and network settings

service iptables stop


Add one line to /etc/sysctl.conf:  
vm.swappiness=10  
Or simply: echo 'vm.swappiness=10' >> /etc/sysctl.conf

echo never > /sys/kernel/mm/transparent_hugepage/enabled  
echo never > /sys/kernel/mm/transparent_hugepage/defrag  
If you skip these, the web-based setup wizard will warn about them later (see 1.3.7), so you can also deal with them when the warnings appear.
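The two THP echo commands do not survive a reboot; the CM warning quoted in 1.3.7 suggests persisting them in an init script such as /etc/rc.local, e.g.:

cat >> /etc/rc.local <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
EOF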

vi /etc/selinux/config  
SELINUX=disabled  
service iptables stop  
chkconfig iptables off

I also ran into DNS resolution failures.
A temporary fix: vi /etc/sysconfig/network-scripts/ifcfg-eth0
and add two lines:
DNS1=172.31.0.2  
DNS2=114.114.114.114  
service network restart  
The result is then visible in cat /etc/resolv.conf.
Editing /etc/resolv.conf directly is useless: when the network service restarts, it regenerates the file from /etc/sysconfig/network-scripts/ifcfg-eth0, and if ifcfg-eth0 has no DNS entries, resolv.conf is wiped back to empty.

So it's best to have the ops team configure this properly.
1.1.2 Create directories
mkdir -p /opt/cloudera-manager &&  
mkdir -p /opt/cloudera/ &&  
mkdir -p /opt/cloudera/parcel-repo  &&  
chmod 777  /opt/cloudera/parcel-repo -R  &&  
mkdir  /opt/software/  &&  
chmod 777  /opt/software -R 

sudo mkdir /var/log/cloudera-scm-headlamp &&  
sudo mkdir /var/log/cloudera-scm-alertpublisher  &&  
sudo mkdir /var/log/cloudera-scm-eventserver  &&  
sudo mkdir /var/log/cloudera-scm-firehose &&  
sudo mkdir /var/lib/cloudera-scm-headlamp &&  
sudo mkdir /var/lib/cloudera-scm-alertpublisher &&  
sudo mkdir /var/lib/cloudera-scm-eventserver &&  
sudo mkdir /var/lib/cloudera-scm-firehose &&  
sudo chmod 777 /var/lib/cloudera* -R  &&  
sudo chmod 777 /var/log/cloudera* -R  &&  
sudo chown -R cloudera-scm:cloudera-scm /var/lib/cloudera* &&  
sudo chown -R cloudera-scm:cloudera-scm /var/log/cloudera*

Remember to fix ownership:
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-* -R

 chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-* -R
 chown cloudera-scm:cloudera-scm /cluster -R
 chown cloudera-scm:cloudera-scm /opt/* -R
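These directories are needed on every node. A sketch that replays the key commands over the passwordless SSH set up in 1.1.1 (hostnames are placeholders):

for host in yourmachine02 yourmachine03 yourmachine04; do
  ssh root@$host 'mkdir -p /opt/cloudera/parcel-repo /opt/software /opt/cloudera-manager &&
    chmod -R 777 /opt/cloudera/parcel-repo /opt/software'
done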
1.1.3 Required packages

From the Cloudera Manager download directory, choose this version: https://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.11.1_x86_64.tar.gz

From the CDH parcel download directory — I'm on CentOS 6.8, so http://archive.cloudera.com/cdh5/parcels/5.11.1/ — download these three files: CDH-5.11.1-1.cdh5.11.1.p0.4-el6.parcel, CDH-5.11.1-1.cdh5.11.1.p0.4-el6.parcel.sha1, and manifest.json.

You also need to download the packages used during installation:

cloudera-manager-agent-5.11.1-1.cm5111.p0.9.el6.x86_64.rpm  
cloudera-manager-el6-cm5.11.1_x86_64.tar.gz  
jdk-6u31-linux-amd64.rpm  
cloudera-manager-daemons-5.11.1-1.cm5111.p0.9.el6.x86_64.rpm  
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm  
Download these from https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.11.1/RPMS/x86_64/, upload them to each machine, and install with yum install ...; afterwards they show up under yum list installed ... If you don't pre-install them, the graphical installer will have Cloudera download and install them itself, which takes a very long time, so I installed them on CentOS myself. Cloudera later checks via yum whether a package is already installed and skips it if so.

 yum install /opt/software/jdk-6u31-linux-amd64.rpm && 
 yum install /opt/software/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm && 
 yum install /opt/software/cloudera-manager-daemons-5.11.1-1.cm5111.p0.9.el6.x86_64.rpm && 
 yum install /opt/software/cloudera-manager-agent-5.11.1-1.cm5111.p0.9.el6.x86_64.rpm 
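To confirm that CM will treat these as already installed (it checks via yum, as noted above), something like:

yum list installed | grep -E 'cloudera-manager|jdk'
# expect cloudera-manager-daemons, cloudera-manager-agent, jdk and oracle-j2sdk1.7 in the output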

My software directory listing (the original post showed a screenshot here).

Install the JDK and Python:

cd /opt/software &&  tar -zxvf jdk-8u131-linux-x64.tar.gz

vi ~/.bash_profile  
export JAVA_HOME=/opt/software/jdk1.8.0_131  
export PATH=$JAVA_HOME/bin:$PATH  
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib

source ~/.bash_profile

yum install python26 -y  
If your yum repos are wrong you may get "no package python26"; configure a yum mirror first. The NetEase or Aliyun mirrors are much faster inside China.
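A sketch of switching to the Aliyun mirror; the repo URL is Aliyun's published CentOS 6 repo file, so verify it still resolves before relying on it:

cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-6.repo
yum clean all && yum makecache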
1.1.4 Unpack cloudera-manager
mkdir /opt/cloudera-manager

tar xzf /opt/software/cloudera-manager-el6-cm5.11.1_x86_64.tar.gz -C /opt/cloudera-manager  
(All the paths used later, e.g. /opt/cloudera-manager/cm-5.11.1/..., assume it was extracted here.)
1.1.5 Install the database

I just used MySQL on one of the machines. If you have high availability requirements, use AWS RDS MySQL or set up your own HA MySQL.

1. yum install mysql-server

2. cp mysql-connector-java-5.1.40.jar /usr/share/java/mysql-connector-java.jar

3. Initialize the database:  
service mysqld start  
/usr/bin/mysql_secure_installation

4.  
mysql -uroot -pyourpasswd  
create database scm DEFAULT CHARACTER SET utf8;  
grant all on  scm.* TO 'root'@'%' IDENTIFIED BY 'yourpasswd';  
create database hive DEFAULT CHARACTER SET utf8;  
create database oozie DEFAULT CHARACTER SET utf8;  
create database hue DEFAULT CHARACTER SET utf8;  
grant all on hive.* TO 'root'@'%' IDENTIFIED BY 'yourpasswd';  
grant all on oozie.* TO 'root'@'%' IDENTIFIED BY 'yourpasswd';  
grant all on hue.* TO 'root'@'%' IDENTIFIED BY 'yourpasswd';  
flush privileges;

5. Edit the database configuration (paths below are relative to /opt/cloudera-manager/cm-5.11.1):
vi etc/cloudera-scm-server/db.properties  
com.cloudera.cmf.db.type=mysql  
com.cloudera.cmf.db.host=localhost  
com.cloudera.cmf.db.name=cmf  
com.cloudera.cmf.db.user=root  
com.cloudera.cmf.db.password=yourpasswd  
com.cloudera.cmf.db.setupType=EXTERNAL

6.  
./share/cmf/schema/scm_prepare_database.sh mysql -h yourmachine01  -uroot -pyourpasswd --scm-host yourhost  scm root yourpasswd
This step is required. After it runs, check
vi etc/cloudera-scm-server/db.properties  
and you'll see the username and password have been updated.
1.1.6 Add the user
useradd --system --home=/opt/cloudera-manager/cm-5.11.1/run/cloudera-scm-server --no-create-home --shell=/bin/bash --comment "Cloudera SCM User" cloudera-scm

Set a password:
passwd cloudera-scm  
yourpasswd

A note on users: the subsequent steps can all be run as root.

1.2 Starting the server

Before starting, review a few files: vi /opt/cloudera-manager/cm-5.11.1/etc/cloudera-scm-server/db.properties

Full contents:
com.cloudera.cmf.db.type=mysql  
com.cloudera.cmf.db.host=localhost  
com.cloudera.cmf.db.name=scm  
com.cloudera.cmf.db.user=root  
com.cloudera.cmf.db.password=yourpasswd  
com.cloudera.cmf.db.setupType=EXTERNAL  

vi /opt/cloudera-manager/cm-5.11.1/etc/default/cloudera-scm-server

Full contents:
CMF_SERVER_ARGS=""  
export CMF_JDBC_DRIVER_JAR="/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar"  
export CMF_JAVA_OPTS="-Xmx2G -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"  
DEFAULT_DIR=`dirname ${BASH_SOURCE-$0}`  
CMF_ROOT=${CMF_ROOT:-`cd $DEFAULT_DIR ; cd ../.. ; pwd`}

export CMF_AGENT_ROOT=$CMF_ROOT/lib64/cmf  
export CMF_SERVER_ROOT=$CMF_ROOT/share/cmf  
export CMF_SBINDIR=$CMF_ROOT/sbin  
export CMF_ETC=$CMF_ROOT/etc  
export CMF_VAR=$CMF_ROOT  
export CMF_SUDO_CMD=" "  
export CMF_AGENT_MGMT_HOME="$CMF_SERVER_ROOT"  

vi /opt/cloudera-manager/cm-5.11.1/etc/init.d/cloudera-scm-server

Note in particular:
USER=cloudera-scm  
GROUP=cloudera-scm  
...
install -d -o $USER -g $GROUP ${CMF_VAR:-/var}/run/cloudera-scm-server  

vi /opt/cloudera-manager/cm-5.11.1/etc/cloudera-scm-agent/config.ini

Note in particular:
server_host=yourClouderaManagerHost  
server_port=7182  
# listening_port=9000   (I keep this line commented out)

vi /opt/cloudera-manager/cm-5.11.1/etc/init.d/cloudera-scm-agent

Note in particular:
CMF_DIR_OWNER=${USER:-cloudera-scm}  

vi /opt/cloudera-manager/cm-5.11.1/etc/default/cloudera-scm-agent

Full contents:
CMF_AGENT_ARGS=""  
USER="cloudera-scm"

DEFAULT_DIR=`dirname ${BASH_SOURCE-$0}`  
CMF_ROOT=${CMF_ROOT:-`cd $DEFAULT_DIR ; cd ../.. ; pwd`}

export CMF_AGENT_ROOT=$CMF_ROOT/lib64/cmf  
export CMF_SERVER_ROOT=$CMF_ROOT/share/cmf  
export CMF_SBINDIR=$CMF_ROOT/sbin  
export CMF_ETC=$CMF_ROOT/etc  
export CMF_VAR=$CMF_ROOT  
export CMF_SUDO_CMD=" "

export CMF_AGENT_MGMT_HOME="$CMF_SERVER_ROOT"  

1.2.1 Start the server first
/opt/cloudera-manager/cm-5.11.1/etc/init.d/cloudera-scm-server start

Watch the log: tail -f /opt/cloudera-manager/cm-5.11.1/log/cloudera-scm-server/cloudera-scm-server.log
1.2.2 Start the agents

Then on every machine (including the server), start the agent with /opt/cloudera-manager/cm-5.11.1/etc/init.d/cloudera-scm-agent start
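With the passwordless SSH from 1.1.1 in place, a one-liner sketch to start them all (hostnames are placeholders):

for host in yourmachine02 yourmachine03 yourmachine04; do
  ssh root@$host '/opt/cloudera-manager/cm-5.11.1/etc/init.d/cloudera-scm-agent start'
done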

1.2.3 Web UI configuration

Browse to serverHost:7180 and log in with admin/admin.
From the home page, select your CDH version and proceed through the installation wizard (the original post had a series of screenshots here).

1.2.4 Done

What follows is one long troubleshooting session.

1.3 Troubleshooting

1.3.1 Permission denied

Check the exception in the log:
tail -f log/cloudera-scm-server/cloudera-scm-server.log  
It reports:
2017-07-07 03:27:48,937 WARN NodeConfiguratorThread-8-3:com.cloudera.server.cmf.node.NodeConfigurator: Could not authenticate to machine01  
net.schmizz.sshj.common.SSHException: No provider available for Unknown key file  
    at net.schmizz.sshj.SSHClient.loadKeys(SSHClient.java:575)
    at com.cloudera.server.cmf.node.NodeConfigurator.connect(NodeConfigurator.java:352)

This turned out to be caused by not starting the server as root; I had been misled into thinking everything had to run as the cloudera-scm user. So again, just work as root; otherwise some directory will need root permissions and you'll end up sudo-ing cloudera-scm's commands.

1.3.2 CDH parcels not visible

The wizard reports no parcels: "Note: No parcels found from the configured repositories. Try adding a Custom Repository under More Options. Otherwise, you may only proceed with Use Packages."
Manually go to the CDH download site https://archive.cloudera.com/cdh5/parcels/5/, download 5.11.1, and put the files into the directory mentioned earlier (/opt/cloudera/parcel-repo).
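A sketch of fetching the three files from 1.1.3 straight into the parcel repo (same URLs as above):

cd /opt/cloudera/parcel-repo
base=http://archive.cloudera.com/cdh5/parcels/5.11.1
wget $base/CDH-5.11.1-1.cdh5.11.1.p0.4-el6.parcel
wget $base/CDH-5.11.1-1.cdh5.11.1.p0.4-el6.parcel.sha1
wget $base/manifest.json
# restart the server afterwards so the new parcel shows up (see 1.3.5)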

1.3.3 OS version mismatch

This one took a long time to untangle; the root cause was using the AWS-provided OS.

It appeared that CM on the AWS machine could not identify the OS version, so I created /etc/centos-release with the following content, after which that error disappeared:
CentOS Linux release 6.8.1611 (Core)  
There were still problems, so I also added:
cat /etc/redhat-release  
CentOS release 6.5 (Final)  
It still didn't work, so I followed the prompt and tried to update the cloudera manager agent.

After a lot of attempts I gave up, switched the OS to a real CentOS 6.8 image, and that solved it.
1.3.4 Wrong time zone, installation stuck

This one was self-inflicted. I had started with the AWS OS, then added two machines set to the China time zone, but never changed the server, so the clocks never matched during synchronization and the install hung. Fixing the time zone fixed it.

It can also be caused by a firewall that was never turned off:
service iptables stop  
1.3.5 parcel-repo changes are not picked up without a server restart

For install progress, always read the server log to see what error is reported. Also note: if you change the packages in /opt/cloudera/parcel-repo, restart the server, or the newly added parcels may not appear in the list. You can put parcels for several OS versions here and see which one the wizard picks; that tells you which version the system will use.

1.3.6 No outbound access without an Elastic IP

If I don't assign an Elastic IP to the AWS EC2 instance, it cannot reach the outside network. This needs to go through the ops team rather than touching AWS yourself. Try curl http://www.baidu.com to see whether the network is up. It appears to come down to DNS not being configured.

1.3.7 Bad kernel parameters

Covered above already. Problem: "Cloudera recommends setting /proc/sys/vm/swappiness to at most 10. Current setting is 60. Use the sysctl command to change this setting at runtime and edit /etc/sysctl.conf for this setting to be saved after a reboot. You may continue with installation, but you may run into issues with Cloudera Manager reporting that your hosts are unhealthy because they are swapping. The following hosts are affected:"

Fix: on every machine run sysctl vm.swappiness=10, and add a line to /etc/sysctl.conf:
vm.swappiness=10
echo 'vm.swappiness=10' >> /etc/sysctl.conf

Problem: "Transparent Huge Page Compaction is enabled and can cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and "echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable this, and then add the same command to an init script such as /etc/rc.local so it will be set on system reboot. The following hosts are affected:"

Fix:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

1.3.8 Switching users asks for a password the installer cannot type

Error: sudo: no tty present and no askpass program specified. Fix: set up passwordless sudo.
chmod 644 /etc/sudoers   # to make it editable; restore mode 0440 afterwards or sudo will complain
vi /etc/sudoers
Add NOPASSWD to root's ALL entry, and add the line Defaults visiblepw. That resolved it.

1.3.9 cloudera-scm lacks sudo rights and is prompted for a password

Edit /etc/sudoers again, and run usermod -g root cloudera-scm; the sudoers additions:
%wheel ALL=(ALL) NOPASSWD: ALL
%sudo ALL=(ALL) NOPASSWD: ALL
cloudera-scm ALL=(ALL) NOPASSWD: ALL

1.3.10 zookeeper user permissions

Add a zookeeper user:

useradd --system --home=/opt/cloudera-manager/cm-5.11.1/run/cloudera-scm-server --no-create-home --shell=/bin/bash --comment "zookeeper User" zookeeper
usermod -g root zookeeper
Note this in the default/cloudera-scm-agent file:

Uncomment to change the user the agent runs as (default is root).

# USER="cloudera-scm"
Stop the services and kill the processes; note that one pass may not kill everything, so double-check before restarting! Do this on every machine, and afterwards confirm once more that the agent was started by root. Batch-killing the processes:
ps -ef | grep agent | grep 498 | grep -v grep | awk '{print "kill -9 " $2}' | sh
ps -ef | grep cloudera | grep -v grep | awk '{print "kill -9 " $2}' | sh

1.3.11 zookeeper file permissions

tail -f /opt/cloudera-manager/cm-5.11.1/log/cloudera-scm-server/cloudera-scm-server.log
When something breaks, keep reading the log and fix issues one by one. For example, it asked for a directory to be created by hand: mkdir /var/lib/zookeeper/version-2. It then failed again without a hint; more digging in the log showed it needed write permission: chmod 777 /var/lib/zookeeper/version-2 -R

1.3.12 oozie permissions

Oozie passed the connection test earlier but broke halfway through the install; I suspected a jar problem. Looking under /var/lib/oozie, there was indeed no jar; my guess is a permission problem prevented the jar from being copied over. After chmod 777 /var/lib/oozie, this error cleared.

Then the next error: apparently the database tables already existed from the earlier failed attempt, so I cleaned up manually, dropping and recreating the oozie database. It still failed after recreating, and hadoop fs showed the owner was cloudera-scm rather than root. Switching to the cloudera-scm account gave: This account is currently not available.
In /etc/passwd, its shell was "/sbin/nologin"; changing it to "/bin/bash" fixed that. Then: hadoop fs -chown -R root:root /
Not using the cloudera-scm user really felt like a minefield. Still more problems, so: hadoop fs -chmod -R 777 /
Then another one: cp: cannot open `/opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p0.5/lib/oozie/../../etc/oozie/tomcat-conf.http/conf/server.xml' for reading: Permission denied
I set those permissions to 777 as well and changed all cloudera-scm ownership under etc to root. That got it through.

1.3.13 Restart this Service Monitor Command fails

INTERNAL_SERVER_ERROR
Searching around suggested it may be out of memory; the host needs at least 10 GB of RAM.

1.3.14 HOST_MONITORING errors

com.cloudera.cmon.MgmtServiceLocatorException: Could not find a HOST_MONITORING

For this, and for "Could not find a HOST_MONITORING nozzle from SCM.", try restarting the server.

1.3.15 CM server guid updated

"Error, CM server guid updated, expected 85587073-270d-43d9-a44a-e213d9f7e45b, received 4c1402a5-8364-4598-a382-0c760710e897"

Delete: rm /opt/cloudera-manager/cm-5.11.1/lib/cloudera-scm-agent/cm_guid
as well as rm /var/lib/cloudera-scm-agent/cm_guid

1.3.16 Start as root!

log/cloudera-scm-agent/cloudera-scm-agent.log:282:Exception: Non-root agent cannot execute process as user 'root'
This error means the config correctly names the cloudera-scm user, but the agent was also started as cloudera-scm, which is wrong: start it as root. This is very important — start with the root user, but keep cloudera-scm (not root) in the config. The database username can be anything; it has nothing to do with this message. For the third time: start as root!!!

2017-07-11 16:21:42,290 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.io.IOException: failed to stat a path component: '/var/run/hdfs-sockets'. error code 13 (Permission denied)
at org.apache.hadoop.net.unix.DomainSocket.validateSocketPathSecurity0(Native Method)
at org.apache.hadoop.net.unix.DomainSocket.bindAndListen(DomainSocket.java:189)
at org.apache.hadoop.hdfs.net.DomainPeerServer.<init>(DomainPeerServer.java:40)

1.3.17 hdfs permissions

The error above points at permissions on that directory. The owner is hdfs:hadoop, so I added the hdfs user to the root group: usermod -g root hdfs
Problem solved.

1.3.18 mapred permissions

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
It appears the mapred user lacks write access to the HDFS root directory /. Go to the HDFS configuration page, search for "permission", disable the permission check, restart HDFS so the setting takes effect, then restart YARN.

In general, when a service won't start: find the corresponding instance, read its log; it's usually a permissions problem, e.g. /cluster/yarn/nm/ must be owned by yarn.

1.3.19 MapReduce jobs fail

I copied the required config files from CM into my project's resources directory:
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml

Running the job still failed with: Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

Fix: stepping through the source, this line in Cluster.java was not loading any ClientProtocolProvider, which suggested missing jars:

private static ServiceLoader<ClientProtocolProvider> frameworkLoader =  
      ServiceLoader.load(ClientProtocolProvider.class);

Adding the following dependencies and re-running mvn clean compile package solved it.

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.0</version>
    <exclusions>
        <exclusion>
            <groupId>io.netty</groupId>
            <artifactId>netty</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-common</artifactId>
    <version>2.6.0</version>
</dependency>
1.3.20 MapReduce jobs fail (continued)

I first ran the job locally and got:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException):  

Then I moved it to the machine running CM and ran it with java -jar, which reported:

Exception in thread "main" java.lang.ClassNotFoundException  
for my own Map and Reduce classes.

Switching to hadoop jar my.jar yourpath.TrainingLogCollect still failed. It turned out Spring Boot had repackaged the Maven artifact, so the classes couldn't be found on the expected path. Maven's target directory holds three jars:

 58K  8  1 15:49 original-materials-0.1.jar
79M  8  1 15:49 materials-0.1.jar.original  
161M  8  1 15:50 materials-0.1.jar  

These are, respectively: the plain compiled code, the jar with libs added, and the jar repackaged with the Spring Boot libs. So I took materials-0.1.jar.original only, renamed it to end in .jar, uploaded it to a CM-managed machine, and then it ran fine.
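In commands, that workaround looks roughly like this (materials-0.1.jar.original and the class name are from my build; substitute your own):

cp target/materials-0.1.jar.original target/materials-plain.jar
scp target/materials-plain.jar root@yourmachine02:/opt/software/
ssh root@yourmachine02 'hadoop jar /opt/software/materials-plain.jar yourpath.TrainingLogCollect'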

2. Installing Druid

2.1 Software installation

  • yum install rsync
  • Install Node.js:
sudo rpm -Uvh https://rpm.nodesource.com/pub_4.x/el/7/x86_64/nodesource-release-el7-1.noarch.rpm  
sudo yum install nodejs  
  • Download Imply.
Imply is the company founded by Druid's lead developers; it packages Druid to be easier to use and offers enterprise support. In the end I found I only really used its bin/* commands.  
https://imply.io/get-started  
Download: https://static.imply.io/release/imply-2.2.3.tar.gz
Reference docs:
https://docs.imply.io/on-premise/quickstart  

Start Imply:

bin/supervise -c conf/supervise/quickstart.conf

Once it's up, the Imply console is at http://yourhost:8090/console.html,
the Druid console at http://yourhost:8081,
and the Pivot UI at http://yourhost:9095/.

Later I found Imply's documentation rather unclear, so I switched to using Druid itself.

What I ended up doing: I compiled the 0.10.1 source and replaced all of Imply's Druid directories with it; after that, no further problems. The directories to replace are extensions, hadoop-dependencies, and lib under dist/druid.
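A hedged sketch of that build-and-replace; the tag and tarball names follow the usual Druid build layout, so check distribution/target after the build, and /path/to/imply-2.2.3 is a placeholder:

git clone https://github.com/druid-io/druid.git && cd druid
git checkout druid-0.10.1
mvn clean package -DskipTests
tar xzf distribution/target/druid-0.10.1-bin.tar.gz -C /tmp
for d in extensions hadoop-dependencies lib; do
  rsync -a --delete /tmp/druid-0.10.1/$d/ /path/to/imply-2.2.3/dist/druid/$d/
done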

Port reference (a quick reachability check follows the list):

  ● 3306 (MySQL; I use my own highly available MySQL, so this one can be ignored)
  ● 2181 (ZooKeeper; not needed if you are using a separate ZooKeeper cluster — I use my own separate cluster, so ignored too)
  ● 8081 (Coordinator)
  ● 8082 (Broker)
  ● 8083 (Historical)
  ● 8084 (Standalone Realtime, if used)
  ● 8088 (Router, if used)
  ● 8090 (Overlord)
  ● 8091, 8100–8199 (Druid Middle Manager; you may need higher than port 8199 if you have a very high druid.worker.capacity)
  ● 8200 (Tranquility Server, if used)
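A quick reachability sketch over the main service ports (assumes nc is installed; the host is a placeholder):

for port in 8081 8082 8083 8090; do
  nc -z -w2 yourmachine02 $port && echo "port $port open" || echo "port $port closed"
done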

2.2 My configuration files

Layout of the conf directory:

├── druid
│   ├── broker
│   │   ├── jvm.config
│   │   ├── main.config
│   │   └── runtime.properties
│   ├── _common
│   │   ├── common.runtime.properties
│   │   ├── core-site.xml
│   │   ├── hdfs-site.xml
│   │   ├── log4j2.xml
│   │   ├── mapred-site.xml
│   │   └── yarn-site.xml
│   ├── coordinator
│   │   ├── jvm.config
│   │   ├── main.config
│   │   └── runtime.properties
│   ├── historical
│   │   ├── jvm.config
│   │   ├── main.config
│   │   └── runtime.properties
│   ├── middleManager
│   │   ├── jvm.config
│   │   ├── main.config
│   │   └── runtime.properties
│   └── overlord
│       ├── jvm.config
│       ├── main.config
│       └── runtime.properties
├── pivot
│   └── config.yaml
├── supervise
│   ├── data.conf
│   ├── master-no-zk.conf
│   ├── master-with-zk.conf
│   ├── query.conf
│   └── quickstart.conf
├── tranquility
│   ├── kafka.json
│   └── server.json
└── zk
    ├── jvm.config
    ├── log4j.xml
    └── zoo.cfg

vi conf/druid/broker/runtime.properties

druid.service=druid/broker  
druid.port=8082

# HTTP server threads
druid.broker.http.numConnections=5  
druid.server.http.numThreads=20

# Processing threads and buffers
druid.processing.buffer.sizeBytes=536870912  
druid.processing.numMergeBuffers=2  
druid.processing.numThreads=7  
druid.processing.tmpDir=var/druid/processing

# Query cache disabled -- push down caching and merging instead
druid.broker.cache.useCache=false  
druid.broker.cache.populateCache=false

# SQL
druid.sql.enable=true
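With druid.sql.enable=true, the broker also answers SQL over HTTP at /druid/v2/sql. A hedged example against the datasource defined in 2.3:

curl -X POST -H 'Content-Type: application/json' \
  http://yourhost:8082/druid/v2/sql \
  -d '{"query":"SELECT COUNT(*) FROM \"productAction\""}'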

vi conf/druid/_common/common.runtime.properties

druid/_common/common.runtime.properties  
druid.extensions.directory=dist/druid/extensions  
druid.extensions.hadoopDependenciesDir=dist/druid/hadoop-dependencies

druid.extensions.loadList=["druid-hdfs-storage","druid-kafka-eight", "druid-s3-extensions", "druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "mysql-metadata-storage"]

druid.startup.logging.logProperties=true

druid.zk.service.host=yourmachine02,yourmachine03,yourmachine04  
druid.zk.paths.base=/druid

druid.metadata.storage.type=mysql  
druid.metadata.storage.connector.connectURI=jdbc:mysql://yourmysqlhost:3306/druid  
druid.metadata.storage.connector.user=yourusername  
druid.metadata.storage.connector.password=yourpasswd

# for hdfs
druid.storage.type=hdfs  
druid.storage.storageDirectory=hdfs://yourmachine02:8020/druid/segments  
druid.indexer.logs.type=hdfs  
druid.indexer.logs.directory=hdfs://yourmachine02:8020/druid/indexlogs

druid.selectors.indexing.serviceName=druid/overlord  
druid.selectors.coordinator.serviceName=druid/coordinator

druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]  
druid.emitter=logging  
druid.emitter.logging.logLevel=debug  

vi conf/druid/coordinator/runtime.properties

druid.service=druid/coordinator  
druid.host=yourmachine02  
druid.port=8081

druid.coordinator.startDelay=PT30S  
druid.coordinator.period=PT30S  

vi conf/druid/historical/runtime.properties

druid.service=druid/historical  
druid.port=8083

# HTTP server threads
druid.server.http.numThreads=40

# Processing threads and buffers
druid.processing.buffer.sizeBytes=536870912  
druid.processing.numMergeBuffers=2  
druid.processing.numThreads=7  
druid.processing.tmpDir=var/druid/processing

# Segment storage
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize"\:130000000000}]  
druid.server.maxSize=130000000000

# Query cache
druid.historical.cache.useCache=true  
druid.historical.cache.populateCache=true  
druid.cache.type=caffeine  
druid.cache.sizeInBytes=2000000000  

vi conf/druid/middleManager/runtime.properties

druid.service=druid/middlemanager  
druid.port=8191

# Number of tasks per middleManager: CPU core count - 1
druid.worker.capacity=7

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx2g -Duser.timezone=UTC+0800 -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager  
druid.indexer.task.baseTaskDir=var/druid/task  
druid.indexer.task.restoreTasksOnRestart=true

# HTTP server threads
druid.server.http.numThreads=40

# Processing threads and buffers
druid.processing.buffer.sizeBytes=100000000  
druid.processing.numMergeBuffers=2  
druid.processing.numThreads=7  
druid.processing.tmpDir=var/druid/processing

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp  
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.3"]  

vi conf/druid/overlord/runtime.properties

druid.service=druid/overlord  
druid.port=8090

druid.indexer.queue.startDelay=PT30S

druid.indexer.runner.type=remote  
druid.indexer.storage.type=metadata

druid.host=yourmachine02  

Also, in every service's JVM config, set Xmx to at least 6 GB and change the time zone to UTC+0800; otherwise you'll have a painful time later.

-Xmx6g
-Duser.timezone=UTC+0800
-Dfile.encoding=UTF-8

Startup script: vi conf-my/master-with-query.conf

:verify bin/verify-java
:verify bin/verify-node
:verify bin/verify-default-ports
:verify bin/verify-version-check
:kill-timeout 10

coordinator bin/run-druid coordinator conf  
broker bin/run-druid broker conf  
!p80 overlord bin/run-druid overlord conf
pivot bin/run-pivot-quickstart conf  

vi conf-my/data.conf

:verify bin/verify-java

historical bin/run-druid historical conf  
!p90 middleManager bin/run-druid middleManager conf

!p95 tranquility-server bin/tranquility server -configFile conf-my/tranquility.json

2.3 My indexing task spec

{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "productAction",
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "$aTime",
            "format" : "millis"
          },
          "dimensionsSpec" : {
            "dimensions": ["$userId","duration","$action","$itemId"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : "minute"
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "hdfs://yourmachine02:8020/flume/test/product/useractions/yourmachine01/20170728/log_20170728_11.1501212142394.txt"
      }
    },
    "tuningConfig" : {
      "type": "hadoop",
      "jobProperties" : {
        "mapreduce.job.classloader": "true",
        "mapreduce.job.user.classpath.first": "true"
      }
    }
  },
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.7.3"]
}
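Saved as, say, hadoop-index.json (the filename is mine), the spec is submitted to the overlord's standard task endpoint:

curl -X POST -H 'Content-Type: application/json' \
  -d @hadoop-index.json \
  http://yourmachine02:8090/druid/indexer/v1/task
# progress can then be watched in the overlord console at http://yourmachine02:8090/console.html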

2.4 Start and stop

nohup ./bin/supervise -c conf-my/master-with-query.conf > master-with-query.log &

Then start the data services on the other machines: nohup ./bin/supervise -c conf-my/data.conf > data.log &

Stop the services: ./bin/service --down

2.5 Real-time streaming

My Tranquility server config file:

{
  "dataSources" : [
    {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "productRealtime",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "timestamp",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : [],
                "dimensionExclusions" : [
                  "timestamp",
                  "value"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "hour",
            "queryGranularity" : "none"
          },
          "metricsSpec" : [
            {
              "name" : "value_sum",
              "type" : "doubleSum",
              "fieldName" : "value"
            },
            {
              "fieldName" : "value",
              "name" : "value_min",
              "type" : "doubleMin"
            },
            {
              "type" : "doubleMax",
              "name" : "value_max",
              "fieldName" : "value"
            }
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "50000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT10M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  ],
  "properties" : {
    "zookeeper.connect" : "yourmachine02:2181,yourmachine03:2181,yourmachine04:2181",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "http.port" : "8200",
    "http.threads" : "40",
    "serialization.format" : "smile",
    "druidBeam.taskLocator": "overlord"
  }
}

Start it:

bin/tranquility server -configFile conf-my/tranquility.json  

Send a test event:

curl -X POST -H 'Content-Type: application/json' --data '{"timestamp": "2017-07-14T07:19:25Z","value":102}' http://yourmachine01:8200/v1/post/productRealtime  
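Because of the windowPeriod check described in 2.7, the timestamp must be close to the current time or the event is silently dropped; a sketch that sends a fresh one:

now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
curl -X POST -H 'Content-Type: application/json' \
  --data "{\"timestamp\": \"$now\", \"value\": 102}" \
  http://yourmachine01:8200/v1/post/productRealtime
# expect {"result":{"received":1,"sent":1}} when the event lands inside the window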

If you upgrade Druid to 0.10.1 but not Tranquility, they may be incompatible, so it's best to rebuild the real-time module too:

cd ~/workspace/github/druid-io  
git clone https://github.com/druid-io/tranquility.git  
cd tranquility  
git checkout druid-0.10.1  
brew install sbt  
sbt compile package   # a bit slow

2.6 Visualization UI

Imply ships with the Pivot UI, but it's fairly limited; Superset is more general-purpose.
Official site: https://superset.incubator.apache.org/installation.html
The third-party install notes I followed: http://zhmgz.lofter.com/post/90909_e745201

First install Python 2.7 (the system ships with 2.6):

wget http://www.python.org/ftp/python/2.7.8/Python-2.7.8.tar.xz
tar -xvf Python-2.7.8.tar.xz
cd Python-2.7.8
./configure
make  
make install  
This installs the python2.7 binary; point the system default at it:
mv /usr/bin/python /usr/bin/python.bak  
ln -s /usr/local/bin/python2.7 /usr/bin/python  
(Note that yum on CentOS 6 needs Python 2.6; if yum breaks after this, change the shebang in /usr/bin/yum back to /usr/bin/python2.6.)

Install pip:

wget https://bootstrap.pypa.io/ez_setup.py  
python ez_setup.py  
easy_install pip  
 ln -s /usr/local/bin/pip2.7  /usr/bin/pip
pip -V  

Install the OS package dependencies:

sudo yum upgrade python-setuptools  
sudo yum install gcc gcc-c++ libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel  

Create the superset data directory:

mkdir /cluster/software/druid/superset  
Work inside a virtualenv:
pip install virtualenv  
virtualenv venv  
. ./venv/bin/activate

Install superset with pip:

pip install superset

Then run:
fabmanager create-admin --app superset  
which failed with:
Was unable to import superset Error: No module named pysqlite2  
Installing pysqlite2 looked like a lot of trouble, so I went straight to MySQL:
pip install mysql-python


Add a config file:
vi venv/lib/python2.7/site-packages/superset_config.py  
ROW_LIMIT = 5000  
SUPERSET_WORKERS = 4  
SUPERSET_WEBSERVER_PORT = 8388  
SECRET_KEY = 'anything'  
SQLALCHEMY_DATABASE_URI = 'mysql://yourmysqlhost:3306/druid?charset=utf8'  
CSRF_ENABLED = True  
MAPBOX_API_KEY = ''

First create the database in MySQL:
create database druid DEFAULT CHARACTER SET utf8;

Run fabmanager create-admin --app superset again and answer the prompts:
vicviz
vic  
viz  
myemail@email.address  
mycompany

Repeat the password once, and the admin user is created:
Recognized Database Authentications.  
Admin User vicviz created.

Initialize the database:
superset db upgrade

Load the example data:
superset load_examples  
This takes a few minutes.

Create the default roles and permissions:
superset init

nohup superset runserver -p 8388 &

Then the nice-looking UI is available at http://aws-prophet-recom02:8388/.

2.7 Druid troubleshooting

  • If ./bin/generate-example-metrics fails with ImportError: No module named argparse, install it: yum install python-argparse

  • Real-time sends return {"result":{"received":1,"sent":0}}: the event time is wrong. The setting ("windowPeriod" : "PT10M") means only events inside that window get processed; otherwise var/sv/tranquility-server.log records nothing at all.

  • Get the time zone right, and the timestamps of the real-time events you send must be right as well.

  • While installing superset I hit this gcc error:

error: command 'gcc' failed with exit status 1  
  Failed building wheel for sasl

Command "/cluster/software/druid/superset/venv/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-wSzBfB/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-9tgpvG-record/install-record.txt --single-version-externally-managed --compile --install-headers /cluster/software/druid/superset/venv/include/site/python2.7/sasl" failed with error code 1 in /tmp/pip-build-wSzBfB/sasl/  

No idea why, but reinstalling g++ and redoing the steps from the official site fixed it...
