银河星空下看灯塔 posted on 2023-9-22 00:00:00

etcd Cluster Management and Maintenance

<p><b><span>Official site:</span></b></p>
<p>https://github.com/coreos/etcd/</p>
<p><b><span>Environment:</span></b></p>
<p><b>CentOS7</b></p>
<p><b>etcd-3.0.4</b></p>
<p>Example 3-node cluster</p>
<p>etcd1:192.168.8.101</p>
<p>etcd2:192.168.8.102</p>
<p>etcd3:192.168.8.103</p>
<p><span><b>I. Install etcd (all nodes)</b></span></p>
<p>curl -L https://github.com/coreos/etcd/releases/download/v3.0.4/etcd-v3.0.4-linux-amd64.tar.gz -o etcd-v3.0.4-linux-amd64.tar.gz</p>
<p>tar xzvf etcd-v3.0.4-linux-amd64.tar.gz</p>
<p>cp -af etcd-v3.0.4-linux-amd64/{etcd,etcdctl} /usr/local/bin</p>
<p>chmod +x /usr/local/bin/{etcd,etcdctl}</p>
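<p>The install steps above can be wrapped in a small script so that the version appears in only one place. This is a minimal sketch: the function names etcd_release_url and install_etcd are made up for illustration, and the URL pattern is simply copied from the curl command above.</p>

```shell
# Build the release tarball URL for a given etcd version.
# $1: version without the leading "v", e.g. 3.0.4
etcd_release_url() {
    printf 'https://github.com/coreos/etcd/releases/download/v%s/etcd-v%s-linux-amd64.tar.gz\n' "$1" "$1"
}

# Hypothetical helper: download, unpack and install etcd and etcdctl.
# Defined only; nothing is downloaded until it is called.
install_etcd() {
    version=$1
    tarball="etcd-v${version}-linux-amd64.tar.gz"
    curl -L "$(etcd_release_url "$version")" -o "$tarball" &&
    tar xzf "$tarball" &&
    cp -af "etcd-v${version}-linux-amd64/etcd" \
           "etcd-v${version}-linux-amd64/etcdctl" /usr/local/bin &&
    chmod +x /usr/local/bin/etcd /usr/local/bin/etcdctl
}
```

<p>Run install_etcd 3.0.4 as root on each node to repeat the four commands above.</p>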
<p><b><span>II. Configure the etcd cluster</span></b></p>
<p>https://github.com/coreos/etcd/blob/master/Documentation/op-guide/clustering.md</p>
<p>The same clustering guide ships inside the tarball at etcd-v3.0.4-linux-amd64/Documentation/op-guide/<b>clustering.md</b></p>
<p>This guide will cover the following mechanisms for bootstrapping an etcd cluster:</p>
<p>* Static (#static)</p>
<p>* etcd Discovery (#etcd-discovery)</p>
<p>* DNS Discovery (#dns-discovery)</p>
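<p>Three bootstrap mechanisms are currently supported: Static suits hosts with fixed IP addresses, etcd Discovery suits DHCP environments where addresses are not known in advance, and DNS Discovery relies on DNS SRV records.</p>
<p><b>Static method</b></p>
<p>Tip: etcd supports SSL/TLS; see the official documentation:</p>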
<p>https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md</p>
<p><b><span>Node 1: etcd1: 192.168.8.101</span></b></p>
<p>etcd --name <b><span>etcd1</span></b> --data-dir /opt/etcd \</p>
<p>--initial-advertise-peer-urls http://192.168.8.<b>101</b>:2380 \</p>
<p>--listen-peer-urls http://192.168.8.<b>101</b>:2380 \</p>
<p>--listen-client-urls http://192.168.8.<b>101</b>:2379,http://127.0.0.1:2379 \</p>
<p>--advertise-client-urls http://192.168.8.<b>101</b>:2379 \</p>
<p>--initial-cluster-token etcd-cluster-1 \</p>
<p>--initial-cluster <b>etcd1=http://192.168.8.101:2380,etcd2=http://192.168.8.102:2380,etcd3=http://192.168.8.103:2380</b> \</p>
<p>--initial-cluster-state new</p>
<p><b><span>Node 2: etcd2: 192.168.8.102</span></b></p>
<p>etcd --name <b><span>etcd2</span></b> --data-dir /opt/etcd \</p>
<p>--initial-advertise-peer-urls http://192.168.8.<b>102</b>:2380 \</p>
<p>--listen-peer-urls http://192.168.8.<b>102</b>:2380 \</p>
<p>--listen-client-urls http://192.168.8.<b>102</b>:2379,http://127.0.0.1:2379 \</p>
<p>--advertise-client-urls http://192.168.8.<b>102</b>:2379 \</p>
<p>--initial-cluster-token etcd-cluster-1 \</p>
<p>--initial-cluster <b>etcd1=http://192.168.8.101:2380,etcd2=http://192.168.8.102:2380,etcd3=http://192.168.8.103:2380</b> \</p>
<p>--initial-cluster-state new</p>
<p><b><span>Node 3: etcd3: 192.168.8.103</span></b></p>
<p>etcd --name <b><span>etcd3</span></b> --data-dir /opt/etcd \</p>
<p>--initial-advertise-peer-urls http://192.168.8.<b>103</b>:2380 \</p>
<p>--listen-peer-urls http://192.168.8.<b>103</b>:2380 \</p>
<p>--listen-client-urls http://192.168.8.<b>103</b>:2379,http://127.0.0.1:2379 \</p>
<p>--advertise-client-urls http://192.168.8.<b>103</b>:2379 \</p>
<p>--initial-cluster-token etcd-cluster-1 \</p>
<p>--initial-cluster <b>etcd1=http://192.168.8.101:2380,etcd2=http://192.168.8.102:2380,etcd3=http://192.168.8.103:2380</b> \</p>
<p>--initial-cluster-state new</p>
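<p>Port 2379 serves client requests and port 2380 carries peer (cluster) traffic. Use --data-dir to choose where data is stored; if it is omitted, etcd uses a directory named after the member (for example etcd3.etcd) under the current working directory.</p>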
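<p>The three bootstrap commands differ only in the node name and IP address, so they can be generated from one list of nodes. A minimal sketch, assuming the ports, data directory and cluster token used above; the function names initial_cluster and bootstrap_command are hypothetical.</p>

```shell
# Join "name=ip" pairs into the value expected by --initial-cluster.
initial_cluster() {
    out=""
    for pair in "$@"; do
        out="${out:+$out,}${pair%%=*}=http://${pair#*=}:2380"
    done
    printf '%s\n' "$out"
}

# Print the first-boot command for one node.
# $1: node name, $2: node IP, remaining arguments: every "name=ip" pair.
bootstrap_command() {
    node_name=$1
    node_ip=$2
    shift 2
    printf 'etcd --name %s --data-dir /opt/etcd \\\n' "$node_name"
    printf '  --initial-advertise-peer-urls http://%s:2380 \\\n' "$node_ip"
    printf '  --listen-peer-urls http://%s:2380 \\\n' "$node_ip"
    printf '  --listen-client-urls http://%s:2379,http://127.0.0.1:2379 \\\n' "$node_ip"
    printf '  --advertise-client-urls http://%s:2379 \\\n' "$node_ip"
    printf '  --initial-cluster-token etcd-cluster-1 \\\n'
    printf '  --initial-cluster %s \\\n' "$(initial_cluster "$@")"
    printf '  --initial-cluster-state new\n'
}
```

<p>For example, bootstrap_command etcd2 192.168.8.102 etcd1=192.168.8.101 etcd2=192.168.8.102 etcd3=192.168.8.103 prints the command shown for node 2.</p>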
<p># netstat -tunlp|grep etcd</p>
<p>tcp        0      0 192.168.8.103:2379      0.0.0.0:*               LISTEN      11103/<b>etcd</b></p>
<p>tcp        0      0 127.0.0.1:2379          0.0.0.0:*               LISTEN      11103/<b>etcd</b></p>
<p>tcp        0      0 192.168.8.103:2380      0.0.0.0:*               LISTEN      11103/<b>etcd</b></p>
<p># ls</p>
<p><b>etcd3.etcd</b></p>
<p># ls etcd3.etcd/</p>
<p>fixtures/ member/</p>
<p># ls etcd3.etcd/fixtures/</p>
<p>client/ peer/</p>
<p># ls etcd3.etcd/fixtures/peer/</p>
<p>cert.pem  key.pem</p>
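<p><b><span>Note:</span></b> the --initial-* flags above are only needed once, when the cluster is first bootstrapped. Remove them on later restarts, otherwise errors may occur. The clustering guide states that these flags are ignored once a member has been bootstrapped, so removing them is safe.</p>
<p>Use a command like the following instead:</p>
<p>etcd --name etcd3 --data-dir /opt/etcd \</p>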
<p>--listen-peer-urls http://192.168.8.103:2380 \</p>
<p>--listen-client-urls http://192.168.8.103:2379,http://127.0.0.1:2379 \</p>
<p>--advertise-client-urls http://192.168.8.103:2379</p>
<p><b><span>III. Manage the cluster</span></b></p>
<p><b><span>etcdctl</span></b></p>
<p>https://github.com/coreos/etcd/blob/master/Documentation/op-guide/maintenance.md</p>
<p># etcdctl --version</p>
<p>etcdctl version: 3.0.4</p>
<p>API version: 2</p>
<p>COMMANDS:</p>
<p>backup          backup an etcd directory</p>
<p>cluster-health  check the health of the etcd cluster</p>
<p>mk              make a new key with a given value</p>
<p>mkdir           make a new directory</p>
<p>rm              remove a key or a directory</p>
<p>rmdir           removes the key if it is an empty directory or a key-value pair</p>
<p>get             retrieve the value of a key</p>
<p>ls              retrieve a directory</p>
<p>set             set the value of a key</p>
<p>setdir          create a new directory or update an existing directory TTL</p>
<p>update          update an existing key with a given value</p>
<p>updatedir       update an existing directory</p>
<p>watch           watch a key for changes</p>
<p>exec-watch      watch a key for changes and exec an executable</p>
<p>member          member add, remove and list subcommands</p>
<p>import          import a snapshot to a cluster</p>
<p>user            user add, grant and revoke subcommands</p>
<p>role            role add, grant and revoke subcommands</p>
<p>auth            overall auth controls</p>
<p><b><span>Cluster health</span></b></p>
<p># etcdctl <b>cluster-health</b></p>
<p>member 2947dd07df9e44da is healthy: got healthy result from http://192.168.8.102:2379</p>
<p>member 571bf93ce7760601 is healthy: got healthy result from http://192.168.8.101:2379</p>
<p>member b200a8bec19bd22e is healthy: got healthy result from http://192.168.8.103:2379</p>
<p>cluster is healthy</p>
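<p>For monitoring scripts it can be handy to turn this output into an exit status. The sketch below only relies on the two line formats shown above ("member ... is healthy: ..." and "cluster is healthy"); the function names are hypothetical and the output of an unhealthy cluster is not modelled.</p>

```shell
# Read `etcdctl cluster-health` output on stdin; succeed only if the
# summary line reports a healthy cluster.
cluster_is_healthy() {
    grep -qx 'cluster is healthy'
}

# Read the same output on stdin and print how many members were
# reported as healthy.
healthy_member_count() {
    grep -c ' is healthy: '
}
```

<p>Usage: etcdctl cluster-health | cluster_is_healthy || echo "etcd needs attention"</p>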
<p><b><span>List cluster members</span></b></p>
<p># etcdctl <b>member list</b></p>
<p>2947dd07df9e44da: name=etcd2 peerURLs=http://192.168.8.102:2380 clientURLs=http://192.168.8.102:2379 isLeader=false</p>
<p>571bf93ce7760601: name=etcd1 peerURLs=http://192.168.8.101:2380 clientURLs=http://192.168.8.101:2379 <span>isLeader=true</span></p>
<p>b200a8bec19bd22e: name=etcd3 peerURLs=http://192.168.8.103:2380 clientURLs=http://192.168.8.103:2379 isLeader=false</p>
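<p>To find the current leader from a script, the member list can be filtered on the isLeader=true field. A minimal sketch that assumes the line format shown above; leader_name is a hypothetical helper.</p>

```shell
# Read `etcdctl member list` output on stdin and print the name of the
# member whose line carries isLeader=true.
leader_name() {
    awk '/isLeader=true/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^name=/) { sub(/^name=/, "", $i); print $i }
    }'
}
```

<p>Usage: etcdctl member list | leader_name</p>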
<p><b><span>Remove a cluster member</span></b></p>
<p># etcdctl <b>member remove </b>b200a8bec19bd22e</p>
<p>Removed member b200a8bec19bd22e from cluster</p>
<p># etcdctl member list</p>
<p>2947dd07df9e44da: name=etcd2 peerURLs=http://192.168.8.102:2380 clientURLs=http://192.168.8.102:2379 isLeader=false</p>
<p>571bf93ce7760601: name=etcd1 peerURLs=http://192.168.8.101:2380 clientURLs=http://192.168.8.101:2379 isLeader=true</p>
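<p>member remove takes the hexadecimal member ID, not the name, so a lookup by name avoids copying the wrong ID by hand. A sketch under the assumption that the member list format is the one shown above; both function names are hypothetical.</p>

```shell
# Read `etcdctl member list` output on stdin and print the ID of the
# member with the given name. $1: member name, e.g. etcd3
member_id_by_name() {
    awk -v want="name=$1" '$2 == want { sub(/:$/, "", $1); print $1 }'
}

# Hypothetical wrapper: resolve the name to an ID, then remove it.
# Defined only; it talks to the cluster when called.
remove_member_by_name() {
    id=$(etcdctl member list | member_id_by_name "$1")
    [ -n "$id" ] || { echo "no member named $1" >&2; return 1; }
    etcdctl member remove "$id"
}
```

<p>Usage: remove_member_by_name etcd3</p>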
<p><b><span>Add a cluster member</span></b></p>
<p>https://github.com/coreos/etcd/blob/master/Documentation/op-guide/runtime-configuration.md</p>
<p><b><span>Note:</span> the order of the steps matters; otherwise etcd reports a cluster ID mismatch</b></p>
<p># etcdctl member add --help</p>
<p>NAME:</p>
<p>etcdctl member add - add a new member to the etcd cluster</p>
<p>USAGE:</p>
<p>etcdctl member add</p>
<p><b><span>1. Add the target node to the cluster</span></b></p>
<p># etcdctl member <b>add etcd3 http://192.168.8.103:2380</b></p>
<p>Added member named etcd3 with ID 28e0d98e7ec15cd4 to cluster</p>
<p>ETCD_NAME="etcd3"</p>
<p>ETCD_INITIAL_CLUSTER="etcd3=http://192.168.8.103:2380,etcd2=http://192.168.8.102:2380,etcd1=http://192.168.8.101:2380"</p>
<p>ETCD_INITIAL_CLUSTER_STATE="existing"</p>
<p># etcdctl member list</p>
<p>2947dd07df9e44da: name=etcd2 peerURLs=http://192.168.8.102:2380 clientURLs=http://192.168.8.102:2379 isLeader=false</p>
<p>571bf93ce7760601: name=etcd1 peerURLs=http://192.168.8.101:2380 clientURLs=http://192.168.8.101:2379 isLeader=true</p>
<p><b>28e0d98e7ec15cd4</b>: peerURLs=http://192.168.8.103:2380</p>
<p>At this point the cluster has generated a unique member ID for the target node.</p>
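<p>The ETCD_* lines printed by member add are exactly the settings the new node needs, so they can be captured into a file instead of being retyped. A minimal sketch; member_add_env and the file name in the usage line are hypothetical.</p>

```shell
# Read `etcdctl member add` output on stdin and keep only the
# ETCD_NAME=..., ETCD_INITIAL_CLUSTER=... style assignments.
member_add_env() {
    grep '^ETCD_[A-Z_]*='
}
```

<p>Usage: etcdctl member add etcd3 http://192.168.8.103:2380 | member_add_env &gt; etcd3.env</p>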
<p><b><span>2. Clear the target node's data-dir</span></b></p>
<p># rm -rf /opt/etcd</p>
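<p><b><span>Note:</span></b> after a node is removed, the cluster's membership information is updated, and the node rejoins as a brand-new member. If data-dir still holds data, etcd reads the existing data on startup and keeps using the old member ID, which also prevents it from joining the cluster. Always clear the new node's data-dir first; otherwise the log shows errors like these:</p>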
<p>2016-08-12 01:59:41.084928 E | rafthttp: failed to find member 2947dd07df9e44da in cluster ce2f2517679629de</p>
<p>2016-08-12 01:59:41.133698 W | rafthttp: failed to process raft message (raft: stopped)</p>
<p>2016-08-12 01:59:41.135746 W | rafthttp: failed to process raft message (raft: stopped)</p>
<p>2016-08-12 01:59:41.170915 E | rafthttp: failed to find member 2947dd07df9e44da in cluster ce2f2517679629de</p>
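<p>A small guard can catch a leftover data directory before etcd is started on the joining node. This is only a sketch: data_dir_is_clean is a hypothetical helper, and /opt/etcd in the usage line is the data-dir used throughout this post.</p>

```shell
# Succeed if the given path does not exist or is an empty directory.
# $1: data directory to check
data_dir_is_clean() {
    [ ! -e "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}
```

<p>Usage: data_dir_is_clean /opt/etcd || echo "clear /opt/etcd before joining"</p>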
<p><b><span>3. Start etcd on the target node</span></b></p>
<p>etcd --name <b><span>etcd3</span></b> --data-dir /opt/etcd \</p>
<p>--initial-advertise-peer-urls http://192.168.8.<b>103</b>:2380 \</p>
<p>--listen-peer-urls http://192.168.8.<b>103</b>:2380 \</p>
<p>--listen-client-urls http://192.168.8.<b>103</b>:2379,http://127.0.0.1:2379 \</p>
<p>--advertise-client-urls http://192.168.8.<b>103</b>:2379 \</p>
<p>--initial-cluster-token etcd-cluster-1 \</p>
<p>--initial-cluster <b>etcd1=http://192.168.8.101:2380,etcd2=http://192.168.8.102:2380,etcd3=http://192.168.8.103:2380</b> \</p>
<p>--initial-cluster-state <b><span>existing</span></b></p>
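<p><b><span>Note:</span></b> --initial-cluster-state must be set to existing here. With new, etcd generates a fresh member ID that differs from the one created when the node was added in step 1, and the log reports a member ID mismatch.</p>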
<p># etcdctl member list</p>
<p>28e0d98e7ec15cd4: name=etcd3 peerURLs=http://192.168.8.103:2380 clientURLs=http://192.168.8.103:2379 isLeader=false</p>
<p>2947dd07df9e44da: name=etcd2 peerURLs=http://192.168.8.102:2380 clientURLs=http://192.168.8.102:2379 isLeader=false</p>
<p>571bf93ce7760601: name=etcd1 peerURLs=http://192.168.8.101:2380 clientURLs=http://192.168.8.101:2379 isLeader=true</p>
<p><b><span>Create, read, update and delete keys</span></b></p>
<p># etcdctl set foo "bar"</p>
<p>bar</p>
<p># etcdctl get foo</p>
<p>bar</p>
<p># etcdctl mkdir hello</p>
<p># etcdctl ls</p>
<p>/foo</p>
<p>/hello</p>
<p># etcdctl --output extended get foo</p>
<p>Key: /foo</p>
<p>Created-Index: 9</p>
<p>Modified-Index: 9</p>
<p>TTL: 0</p>
<p>Index: 10</p>
<p>bar</p>
<p># etcdctl <b>--output json</b> get foo</p>
<p>{"action":"get","node":{"key":"/foo","value":"bar","nodes":null,"createdIndex":9,"modifiedIndex":9},"prevNode":null}</p>
<p># etcdctl <b>update</b> foo "etcd cluster is ok"</p>
<p>etcd cluster is ok</p>
<p># etcdctl get foo</p>
<p>etcd cluster is ok</p>
<p># etcdctl <b>import --snap</b> /opt/etcd/member/snap/db</p>
<p>starting to import snapshot /opt/etcd/member/snap/db with 10 clients</p>
<p>2016-08-12 01:18:17.281921 I | entering dir: /</p>
<p>finished importing 0 keys</p>
<p><b><span>REST API</span></b></p>
<p>https://github.com/coreos/etcd/tree/master/Documentation/learning</p>
<p># curl 192.168.8.101:2379/v2/keys</p>
<p>{"action":"get","node":{"dir":true,"nodes":[{"key":"/foo","value":"etcd cluster is ok","modifiedIndex":28,"createdIndex":9},{"key":"/hello","dir":true,"modifiedIndex":10,"createdIndex":10},{"key":"/registry","dir":true,"modifiedIndex":47,"createdIndex":47}]}}</p>
<p># curl -fs -X PUT 192.168.8.101:2379/v2/keys/_test</p>
<p>{"action":"set","node":{"key":"/_test","value":"","modifiedIndex":1439,"createdIndex":1439}}</p>
<p># curl -X GET 192.168.8.101:2379/v2/keys/_test</p>
<p>{"action":"get","node":{"key":"/_test","value":"","modifiedIndex":1439,"createdIndex":1439}}</p>
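<p>The curl calls above can be wrapped so the endpoint is configured once. A minimal sketch against the v2 keys API used in this section; ETCD_ENDPOINT, v2_key_url, v2_put and v2_get are names chosen for illustration, and the value is sent as the form field "value" as in the v2 API.</p>

```shell
# Endpoint of any cluster member; override via the environment.
ETCD_ENDPOINT=${ETCD_ENDPOINT:-http://192.168.8.101:2379}

# Build the v2 keys URL for a key. $1: key, with or without leading slash
v2_key_url() {
    printf '%s/v2/keys/%s\n' "$ETCD_ENDPOINT" "${1#/}"
}

# Set a key to a value. $1: key, $2: value
v2_put() {
    curl -fs -X PUT "$(v2_key_url "$1")" --data-urlencode "value=$2"
}

# Fetch a key as JSON. $1: key
v2_get() {
    curl -fs "$(v2_key_url "$1")"
}
```

<p>Usage: v2_put foo bar, then v2_get foo. Nothing is sent to the cluster until one of these functions is called.</p>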
<p><span><b>IV. Manage etcd with systemd</b></span></p>
<p><span><b>1. Create the etcd user</b></span></p>
<p>useradd -r -s /sbin/nologin etcd</p>
<p>chown -R etcd: /opt/etcd</p>
<p><span><b>2. Create the systemd unit file etcd.service</b></span></p>
<p>cat &gt;/lib/systemd/system/etcd.service &lt;&lt;HERE</p>
<p><span>[Unit]</span></p>
<p><span>Description=Etcd Server</span></p>
<p><span>After=network.target</span></p>
<p><span>After=network-online.target</span></p>
<p><span>Wants=network-online.target</span></p>
<p><span>[Service]</span></p>
<p><span>Type=notify</span></p>
<p><span>WorkingDirectory=/opt/etcd/</span></p>
<p><span>User=etcd</span></p>
<p><span>ExecStart=/usr/local/bin/etcd --config-file /etc/etcd.conf</span></p>
<p><span>Restart=on-failure</span></p>
<p><span>LimitNOFILE=1000000</span></p>
<p><span>[Install]</span></p>
<p><span>WantedBy=multi-user.target</span></p>
<p>HERE</p>
<p><span><b>3. Create the main configuration file etcd.conf</b></span></p>
<p>cat &gt;/etc/etcd.conf &lt;&lt;HERE</p>
<p><span>name: etcd2</span></p>
<p><span>data-dir: "/opt/etcd"</span></p>
<p><span>listen-peer-urls: "http://192.168.8.102:2380"</span></p>
<p><span>listen-client-urls: "http://192.168.8.102:2379,http://127.0.0.1:2379"</span></p>
<p><span>advertise-client-urls: "http://192.168.8.102:2379"</span></p>
<p>HERE</p>
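<p><b><span>Tip:</span></b> etcd 3.x accepts configuration files in both <b>yaml</b> and <b>json</b> format; a template is available at https://github.com/coreos/etcd/blob/master/etcd.conf.yml.sample</p>
<p>Each node needs its own configuration file; the one above is the template for etcd2.</p>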
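<p>Since only the name and IP address change between nodes, the file can be rendered from two arguments. A minimal sketch that reproduces the etcd2 file above; render_etcd_conf is a hypothetical helper.</p>

```shell
# Print an etcd.conf for one node. $1: node name, $2: node IP
render_etcd_conf() {
    cat <<EOF
name: $1
data-dir: "/opt/etcd"
listen-peer-urls: "http://$2:2380"
listen-client-urls: "http://$2:2379,http://127.0.0.1:2379"
advertise-client-urls: "http://$2:2379"
EOF
}
```

<p>Usage on etcd1: render_etcd_conf etcd1 192.168.8.101 &gt; /etc/etcd.conf</p>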
<p><span><b>4. Test startup via systemd</b></span></p>
<p># systemctl enable etcd</p>
<p>Created symlink from /etc/systemd/system/multi-user.target.wants/etcd.service to /usr/lib/systemd/system/etcd.service.</p>
<p># systemctl start etcd</p>
<p># systemctl status etcd</p>
<p><b>●</b> etcd.service - Etcd Server</p>
<p>Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)</p>
<p>Active: <b>active (running)</b> since Fri 2016-08-12 03:06:30 CST; 8min ago</p>
<p>Main PID: 12099 (etcd)</p>
<p>CGroup: /system.slice/etcd.service</p>
<p>└─12099 /usr/local/bin/etcd --config-file /etc/etcd.conf</p>
<p>Aug 12 03:10:30 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:11:00 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:11:30 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:12:00 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:12:30 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:13:00 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:13:30 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:14:00 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:14:30 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Aug 12 03:15:00 node2.example.com etcd: <b>the clock difference against peer 571bf93ce776...1s]</b></p>
<p>Hint: Some lines were ellipsized, use -l to show in full.</p>