ZooKeeper usage

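A ZooKeeper server persists two kinds of data: transaction logs and snapshots.

The transaction log directory (dataLogDir in zoo.cfg) stores the transaction commands and dataDir stores the snapshots. Each contains a subdirectory named version-2, whose files are named log.zxid and snapshot.lastProcessedZxid respectively, and each directory can hold many such files. The zxid in a transaction log file name is the smallest zxid of all commands in that file, while the lastProcessedZxid in a snapshot file name is the zxid of the last operation applied, which is generally the largest zxid.

The transaction log records each command to be executed together with its arguments, so it can be thought of as dynamic. A snapshot records the static ZK data structures at that moment: the ACLs, the node tree, and the mapping between sessions and ephemeral nodes.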

For example:
penn@ubuntu:/mnt/app/zookeeper.1/bin$ ls -l /mnt/data/zookeeper.1/version-2/
total 12
-rw-rw-r-- 1 penn penn 1 Nov 3 14:12 acceptedEpoch
-rw-rw-r-- 1 penn penn 1 Nov 3 14:12 currentEpoch
-rw-rw-r-- 1 penn penn 296 Nov 3 14:12 snapshot.0

penn@ubuntu:/mnt/app/zookeeper.1/bin$ ls -l /mnt/log/zookeeper.1/version-2/
total 8
-rw-rw-r-- 1 penn penn 67108880 Nov 3 16:49 log.100000001
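
The suffixes in the file names above are zxids written in hexadecimal. A zxid is a 64-bit number whose high 32 bits are the leader epoch and whose low 32 bits are a counter within that epoch. The following is a minimal sketch of decoding such a name; the helper names `zxid_from_filename` and `split_zxid` are made up for illustration and are not part of ZooKeeper.

```python
def split_zxid(zxid: int) -> tuple:
    """Split a 64-bit zxid into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def zxid_from_filename(name: str) -> int:
    """Parse the hexadecimal zxid suffix of names such as 'log.100000001'."""
    prefix, _, suffix = name.partition(".")
    if prefix not in ("log", "snapshot") or not suffix:
        raise ValueError(f"not a ZooKeeper data file name: {name!r}")
    return int(suffix, 16)


if __name__ == "__main__":
    for name in ("snapshot.0", "log.100000001"):
        epoch, counter = split_zxid(zxid_from_filename(name))
        print(f"{name}: epoch={epoch} counter={counter}")
```

The same split applies to the cZxid, mZxid and pZxid values printed by the get command later in this post.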

Viewing the transaction log:
The log files are stored in a binary format by default; the following command prints their contents:
penn@ubuntu:/mnt/app/zookeeper.1$ java -cp ./zookeeper-3.4.9.jar:./lib/log4j-1.2.16.jar:./lib/slf4j-log4j12-1.6.1.jar:./lib/slf4j-api-1.6.1.jar org.apache.zookeeper.server.LogFormatter /mnt/log/zookeeper.1/version-2/log.100000001

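Snapshot files:
ZooKeeper keeps its data in memory in a DataTree structure. A snapshot is the whole DataTree serialized and written to disk at intervals; this is the ZooKeeper snapshot file. Because a snapshot is a periodic backup, its contents are usually not the latest. How often a snapshot is taken is configurable: the snapCount setting controls how many transaction requests are processed before a snapshot file is generated, so the interval is measured in transactions rather than in wall-clock time. Like transaction log files, snapshot files use a ZXID as the file name suffix. The file is created in the save method of the FileTxnSnapLog class, which calls the FileSnap class to serialize the DataTree and write it to the snapshot file.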
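
Because a snapshot usually lags behind the latest state, restoring from disk needs both kinds of file: the newest snapshot plus the transaction logs that may contain later commands. The sketch below picks those files purely from their names, relying on the naming rules described earlier. It is a simplified illustration under that assumption, not the server's actual recovery code, and `files_for_recovery` is a hypothetical name.

```python
def zxid_of(name):
    """Return the hexadecimal zxid suffix of 'log.<zxid>' or 'snapshot.<zxid>'."""
    return int(name.split(".", 1)[1], 16)


def files_for_recovery(snapshot_names, log_names):
    """Pick the newest snapshot and the logs that may hold newer transactions.

    Assumes at least one snapshot exists. A log file is named after the
    smallest zxid it contains, so the last log that starts at or before the
    snapshot's zxid can still contain transactions newer than the snapshot.
    """
    snapshot = max(snapshot_names, key=zxid_of)
    snap_zxid = zxid_of(snapshot)
    starts = [zxid_of(n) for n in log_names]
    floor = max((z for z in starts if z <= snap_zxid), default=0)
    logs = [n for n in sorted(log_names, key=zxid_of) if zxid_of(n) >= floor]
    return snapshot, logs


if __name__ == "__main__":
    print(files_for_recovery(["snapshot.0"], ["log.100000001"]))
```

Transactions in the selected logs whose zxid is not greater than the snapshot's zxid are already reflected in the snapshot and would be skipped during replay.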

Viewing a snapshot file:
penn@ubuntu:/mnt/app/zookeeper.1$ java -cp ./zookeeper-3.4.9.jar:./lib/log4j-1.2.16.jar:./lib/slf4j-log4j12-1.6.1.jar:./lib/slf4j-api-1.6.1.jar org.apache.zookeeper.server.SnapshotFormatter /mnt/data/zookeeper.1/version-2/snapshot.0

Basic ZooKeeper operations

  1. Check the current ZK role

    penn@ubuntu:~$ cd /mnt/app/zookeeper.1/bin/
    penn@ubuntu:/mnt/app/zookeeper.1/bin$ ./zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /mnt/app/zookeeper.1/bin/../conf/zoo.cfg
    Mode: follower
  2. Connect to ZK

    penn@ubuntu:/mnt/app/zookeeper.1/bin$ ./zkCli.sh -server 10.0.2.15:2181
  3. Show the help

    help
    ZooKeeper -server host:port cmd args
    stat path [watch]
    set path data [version]
    ls path [watch]
    delquota [-n|-b] path
    ls2 path [watch]
    setAcl path acl
    setquota -n|-b val path
    history
    redo cmdno
    printwatches on|off
    delete path [version]
    sync path
    listquota path
    rmr path
    get path [watch]
    create [-s] [-e] path data acl
    addauth scheme auth
    quit
    getAcl path
    close
    connect host:port
  4. List znodes

    [zk: 10.0.2.15:2181(CONNECTED) 0] ls /
    [zookeeper]
  5. Create the znode "zk" and associate the string "MyData" with it

    [zk: 10.0.2.15:2181(CONNECTED) 1] create /zk "MyData"
    Created /zk
    [zk: 10.0.2.15:2181(CONNECTED) 2] ls /
    [zk, zookeeper]
    //Note: if no string is supplied, this client version does not create a new znode
    [zk: 10.0.2.15:2181(CONNECTED) 3] create /test
    [zk: 10.0.2.15:2181(CONNECTED) 4] ls /
    [zk, zookeeper]
  6. Check that the znode holds the string that was set

    [zk: 10.0.2.15:2181(CONNECTED) 5] get /zk
    MyData
    cZxid = 0x100000006
    ctime = Thu Nov 03 14:46:39 CST 2016
    mZxid = 0x100000006
    mtime = Thu Nov 03 14:46:39 CST 2016
    pZxid = 0x100000006
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 6
    numChildren = 0
  7. Change the associated string

    [zk: 10.0.2.15:2181(CONNECTED) 6] set /zk "zsl"
    cZxid = 0x100000006
    ctime = Thu Nov 03 14:46:39 CST 2016
    mZxid = 0x100000007
    mtime = Thu Nov 03 14:51:12 CST 2016
    pZxid = 0x100000006
    cversion = 0
    dataVersion = 1
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 3
    numChildren = 0

    [zk: 10.0.2.15:2181(CONNECTED) 8] get /zk
    zsl
    cZxid = 0x100000006
    ctime = Thu Nov 03 14:46:39 CST 2016
    mZxid = 0x100000007
    mtime = Thu Nov 03 14:51:12 CST 2016
    pZxid = 0x100000006
    cversion = 0
    dataVersion = 1
    aclVersion = 0
    ephemeralOwner = 0x0
    dataLength = 3
    numChildren = 0
  8. Delete the znode

    [zk: 10.0.2.15:2181(CONNECTED) 9] delete /zk
    [zk: 10.0.2.15:2181(CONNECTED) 10] ls /
    [zookeeper]

Advanced ZooKeeper operations

[zk: 10.0.2.15:2181(CONNECTED) 13] create /zk "test"
Created /zk
[zk: 10.0.2.15:2181(CONNECTED) 14] ls /
[zk, zookeeper]

[zk: 10.0.2.15:2181(CONNECTED) 15] create /zk/n1 "n1"
Created /zk/n1
[zk: 10.0.2.15:2181(CONNECTED) 16] ls /zk
[n1]
[zk: 10.0.2.15:2181(CONNECTED) 17] ls /zk/n1
[]

//If a znode has children, delete fails; use the rmr command instead
[zk: 10.0.2.15:2181(CONNECTED) 18] delete /zk
Node not empty: /zk
[zk: 10.0.2.15:2181(CONNECTED) 28] rmr /zk
[zk: 10.0.2.15:2181(CONNECTED) 29] ls /
[zookeeper]
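
The difference between delete and rmr in the session above can be modeled with a small in-memory tree: delete refuses a node that still has children, while rmr removes the children depth-first before removing the node itself. This is only an illustrative model; the `Tree` class and `NotEmptyError` are hypothetical names and do not use any ZooKeeper client API.

```python
class NotEmptyError(Exception):
    """Raised when deleting a node that still has children."""


class Tree:
    def __init__(self):
        # Maps each node path to the set of its child names.
        self.children = {"/": set()}

    def create(self, path):
        parent, _, name = path.rpartition("/")
        self.children[parent or "/"].add(name)
        self.children[path] = set()

    def delete(self, path):
        if self.children[path]:
            raise NotEmptyError(f"Node not empty: {path}")
        parent, _, name = path.rpartition("/")
        self.children[parent or "/"].discard(name)
        del self.children[path]

    def rmr(self, path):
        if path == "/":
            raise ValueError("refusing to remove the root")
        # Remove every child subtree first, then the node itself.
        for name in sorted(self.children[path]):
            self.rmr(f"{path}/{name}")
        self.delete(path)


if __name__ == "__main__":
    tree = Tree()
    tree.create("/zk")
    tree.create("/zk/n1")
    try:
        tree.delete("/zk")
    except NotEmptyError as err:
        print(err)
    tree.rmr("/zk")
    print(sorted(tree.children))
```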

ZooKeeper quota

The ZooKeeper quota mechanism supports limits on the number of nodes (znodes) and on space (bytes).

//Check the quota; by default there is no limit
[zk: 10.0.2.15:2181(CONNECTED) 6] create /test "test quota"
Created /test
[zk: 10.0.2.15:2181(CONNECTED) 7] listquota /test
absolute path is /zookeeper/quota/test/zookeeper_limits
quota for /test does not exist.
//Set a quota
[zk: 10.0.2.15:2181(CONNECTED) 9] setquota -n 3 /test
Comment: the parts are option -n val 3 path /test
//View the quota
[zk: 10.0.2.15:2181(CONNECTED) 10] listquota /test
absolute path is /zookeeper/quota/test/zookeeper_limits
Output quota for /test count=3,bytes=-1
Output stat for /test count=1,bytes=10
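//-n sets a znode count limit, here the limit on the number of znodes under the path /test (the stat above shows that /test itself is counted)
//-b sets a limit on the byte size of the znode data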
//Test
[zk: 10.0.2.15:2181(CONNECTED) 14] create /test/0 "0"
Created /test/0
[zk: 10.0.2.15:2181(CONNECTED) 15] create /test/1 "1"
Created /test/1
[zk: 10.0.2.15:2181(CONNECTED) 16] create /test/2 "2"
Created /test/2
[zk: 10.0.2.15:2181(CONNECTED) 17] create /test/3 "3"
Created /test/3
[zk: 10.0.2.15:2181(CONNECTED) 18] ls /test
[0, 1, 2, 3]
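//The count is now above the limit of 3 znodes that we set, yet creation still succeeds. ZooKeeper's quota mechanism is soft: exceeding the limit is only reported in the log and does not restrict the client, which can keep operating on znodes
//Log output: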
//2016-11-03 16:38:08,876 [myid:1] - WARN [CommitProcessor:1:DataTree@301] - Quota exceeded: /test count=4 limit=3
//2016-11-03 16:38:12,998 [myid:1] - WARN [CommitProcessor:1:DataTree@301] - Quota exceeded: /test count=5 limit=3
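
The soft behavior described above amounts to comparing the stats with the limits and logging a warning, without rejecting the request. The sketch below parses the `count=...,bytes=...` format shown by listquota and reports which fields are over the limit, treating -1 as "no limit". The function names are hypothetical and the code models the behavior rather than reproducing the server's implementation.

```python
import logging


def parse_quota(text):
    """Parse 'count=3,bytes=-1' into {'count': 3, 'bytes': -1}."""
    return {key: int(value)
            for key, value in (item.split("=") for item in text.split(","))}


def exceeded(limits, stats):
    """Return the fields whose usage is above the limit; -1 means no limit."""
    return [key for key, limit in limits.items()
            if limit != -1 and stats.get(key, 0) > limit]


limits = parse_quota("count=3,bytes=-1")
stats = parse_quota("count=5,bytes=14")

# A soft quota only warns; the operation itself is still allowed.
for field in exceeded(limits, stats):
    logging.warning("Quota exceeded: /test %s=%d limit=%d",
                    field, stats[field], limits[field])
```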
//View the stored quota data
[zk: 10.0.2.15:2181(CONNECTED) 25] get /zookeeper/quota/test/zookeeper_limits
count=3,bytes=-1
cZxid = 0x10000001f
ctime = Thu Nov 03 16:34:55 CST 2016
mZxid = 0x10000001f
mtime = Thu Nov 03 16:34:55 CST 2016
pZxid = 0x10000001f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 16
numChildren = 0

[zk: 10.0.2.15:2181(CONNECTED) 26] get /zookeeper/quota/test/zookeeper_stats
count=5,bytes=14
cZxid = 0x100000020
ctime = Thu Nov 03 16:34:55 CST 2016
mZxid = 0x100000020
mtime = Thu Nov 03 16:34:55 CST 2016
pZxid = 0x100000020
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 16
numChildren = 0

ZooKeeper access control and authentication

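ZooKeeper's permission management, that is, its ACL control, is carried out jointly by the server and the client:
* Server side
A ZooKeeper node (znode) stores two things: data and state (the state includes the ACL information)
Creating a znode produces an ACL list; each ACL in the list consists of:
1.Authentication scheme (scheme)
2.Identity (Id) (when scheme="digest", the Id is the user name and password digest, for example "root:J0sTy9BCUKubtK1y8pkbL7qoxSw=")
3.Permissions (perms)

Further detail:
ZooKeeper provides the following authentication schemes (scheme):
1.digest
The client is authenticated by user name and password, for example "user:password"; the stored password digest is the Base64 encoding of the SHA-1 hash
2.auth
Uses no id; stands for any authenticated user
3.ip
The client is authenticated by IP address, for example 172.2.0.0/24
4.world
The fixed user anyone; opens the permissions to every client
5.super
Under this scheme the corresponding id has super-user rights and can do anything (cdrwa)
Note: the exists and getAcl operations are not subject to ACL checks, so any client can query a node's state and its ACL

The main node permissions (perms) are:
1.Create: allows Create on child nodes, c
2.Read: allows GetChildren and GetData on this node, r
3.Write: allows SetData on this node, w
4.Delete: allows Delete on child nodes, d
5.Admin: allows setAcl on this node, a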
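
Internally the five permissions are stored as bits of a single integer (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16), and the command-line client converts between that integer and letter strings such as cdrwa. A minimal sketch of the conversion follows; the `encode` and `decode` helpers are hypothetical names used only for illustration.

```python
# Bit values of the individual permissions.
PERMS = {"r": 1 << 0, "w": 1 << 1, "c": 1 << 2, "d": 1 << 3, "a": 1 << 4}


def encode(letters):
    """Turn a letter string such as 'crwda' into a permission bitmask."""
    mask = 0
    for letter in letters:
        mask |= PERMS[letter]  # raises KeyError on an unknown letter
    return mask


def decode(mask):
    """Turn a bitmask back into letters, in the order getAcl prints them."""
    return "".join(letter for letter in "cdrwa" if mask & PERMS[letter])


if __name__ == "__main__":
    mask = encode("crwda")
    print(mask, decode(mask))
```

This is why an ACL set with crwda is printed back as cdrwa: the order of the letters is not stored, only the bits.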
1.View the ACL
[zk: 10.0.2.15:2181(CONNECTED) 3] create /test "test auth"
Created /test
[zk: 10.0.2.15:2181(CONNECTED) 6] getAcl /test
'world,'anyone
: cdrwa
2.Set an ip policy
[zk: 10.0.2.15:2181(CONNECTED) 7] setAcl /test ip:10.0.2.15:crwda
cZxid = 0x100000013
ctime = Thu Nov 03 15:22:20 CST 2016
mZxid = 0x100000013
mtime = Thu Nov 03 15:22:20 CST 2016
pZxid = 0x100000013
cversion = 0
dataVersion = 0
aclVersion = 1
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0

[zk: 10.0.2.15:2181(CONNECTED) 8] getAcl /test
'ip,'10.0.2.15
: cdrwa
3.Set a digest policy
//Generate the digest
penn@ubuntu:~$ cd /mnt/app/zookeeper.1/
penn@ubuntu:/mnt/app/zookeeper.1$ java -cp ./zookeeper-3.4.9.jar:./lib/log4j-1.2.16.jar:./lib/slf4j-log4j12-1.6.1.jar:./lib/slf4j-api-1.6.1.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider test:test
test:test->test:V28q/NynI4JI3Rk54h0r8O5kMug=

//Set the ACL
[zk: 10.0.2.15:2181(CONNECTED) 2] setAcl /test digest:test:V28q/NynI4JI3Rk54h0r8O5kMug=:crwda
cZxid = 0x100000013
ctime = Thu Nov 03 15:22:20 CST 2016
mZxid = 0x100000013
mtime = Thu Nov 03 15:22:20 CST 2016
pZxid = 0x100000013
cversion = 0
dataVersion = 0
aclVersion = 2
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0

//View the ACL
[zk: 10.0.2.15:2181(CONNECTED) 3] getAcl /test
'digest,'test:V28q/NynI4JI3Rk54h0r8O5kMug=
: cdrwa

//Verify
[zk: 10.0.2.15:2181(CONNECTED) 4] ls /test
Authentication is not valid : /test
[zk: 10.0.2.15:2181(CONNECTED) 5] addauth digest test:test
[zk: 10.0.2.15:2181(CONNECTED) 6] ls /test
[]
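
The digest used in the setAcl command above can also be computed without the ZooKeeper jar. The sketch below assumes, as described earlier, that the digest is the Base64 encoding of the SHA-1 hash of the whole "user:password" string, prefixed with the user name. `generate_digest` is a hypothetical helper, not the ZooKeeper API.

```python
import base64
import hashlib


def generate_digest(id_password: str) -> str:
    """Return 'user:<base64(sha1("user:password"))>' for a 'user:password' string."""
    user, sep, _ = id_password.partition(":")
    if not sep:
        raise ValueError("expected a string of the form user:password")
    sha1 = hashlib.sha1(id_password.encode("utf-8")).digest()
    return f"{user}:{base64.b64encode(sha1).decode('ascii')}"


if __name__ == "__main__":
    print(generate_digest("test:test"))
```

If the assumption holds, the printed value should match the output of DigestAuthenticationProvider for the same input. The plain-text form (test:test) is what addauth takes, while the digest form is what setAcl stores.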
4.Set up a super administrator
//Generate the digest for the super user's password
penn@ubuntu:/mnt/app/zookeeper.1$ java -cp ./zookeeper-3.4.9.jar:./lib/log4j-1.2.16.jar:./lib/slf4j-log4j12-1.6.1.jar:./lib/slf4j-api-1.6.1.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider super:super
super:super->super:gG7s8t3oDEtIqF6DM9LlI/R+9Ss=

//Add the digest to zkServer.sh
penn@ubuntu:/mnt/app/zookeeper.1$ vim /mnt/app/zookeeper.1/bin/zkServer.sh
SUPER_ACL="-Dzookeeper.DigestAuthenticationProvider.superDigest=super:gG7s8t3oDEtIqF6DM9LlI/R+9Ss="

//Add it to the start options
nohup "$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" "${SUPER_ACL}" -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &

//Restart ZooKeeper
penn@ubuntu:/mnt/app/zookeeper.1$ /mnt/app/zookeeper.1/bin/zkServer.sh restart
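5.ACL check process
When a client operates on a znode, the ACL is checked as follows:
a.Iterate over all ACLs of the znode:
i.For each ACL, first match the operation type against the permissions (perms)
ii.Only if the permission matches, match the session's auth information against the user name and password in the ACL
b.If both matches succeed for some ACL, the operation is allowed; otherwise a permission-denied error is returned (rc=-102)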
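
The procedure above can be sketched as a small function. This is a simplified model: ids are compared by exact equality, whereas a real server delegates the comparison to the provider of each scheme (for example subnet matching for ip). The names `check_acl`, `NOAUTH` and the tuple layout are assumptions made for this illustration.

```python
NOAUTH = -102  # error code returned when no ACL entry permits the operation
OK = 0

READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16


def check_acl(acl_list, required_perm, session_ids):
    """Check one operation against a znode's ACL list.

    acl_list:     list of (scheme, id, perms) tuples stored on the znode
    required_perm: permission bit the operation needs
    session_ids:  set of (scheme, id) pairs the session has authenticated as
    """
    for scheme, ident, perms in acl_list:
        # Step i: the entry must grant the permission the operation needs.
        if not perms & required_perm:
            continue
        # Step ii: the entry must apply to this session.
        if (scheme, ident) == ("world", "anyone"):
            return OK
        if (scheme, ident) in session_ids:
            return OK
    return NOAUTH


if __name__ == "__main__":
    acl = [("digest", "test:V28q/NynI4JI3Rk54h0r8O5kMug=", READ | WRITE)]
    print(check_acl(acl, READ, set()))
    print(check_acl(acl, READ, {("digest", "test:V28q/NynI4JI3Rk54h0r8O5kMug=")}))
```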

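Note: if none of the ACLs in a znode's ACL list grants setAcl permission, then even superDigest cannot change its permissions; and if the znode does not grant delete permission either, none of its children can be deleted. The only way out is to manually delete snapshots and logs to roll ZK back to an earlier state and then restart, which of course also affects nodes other than that znode. (This may depend on the version: the 3.4.x server skips ACL checks for a session authenticated as super, so it is worth testing on your own deployment.)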
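6.ACL limitations
ACL is only access control, not a complete permission management system, and using it to isolate multiple clusters has many limitations. ACLs are not recursive: every znode needs its ACL set individually after creation and cannot inherit the ACL of its parent. Apart from the ip scheme, digest and auth are not transparent to users, which raises the cost of adoption. Many open-source frameworks that depend on ZooKeeper, such as HBase and Storm, have not added ACL support either.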