ELK Beats general configuration guide (12th)

幸福有爱 posted on 2023-9-27 00:00:00

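<div id="navCategory"><h5 class="catalogue">Contents</h5><ul class="first_class_ul"><li>Shipper</li><li>Output</li><li>Redis Output (deprecated)</li></ul></div><p>Beats configuration files use YAML syntax. Each file contains general options common to all Beats, plus options specific to the individual Beat. This article covers the general options; for the specific ones, see the documentation of the Beat in question. The general configuration consists of the following parts:</p>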
<ul>
<li>Shipper</li>
<li>Output</li>
<li>Logging (optional)</li>
<li>Run Options (optional)</li>
</ul>
<p class="maodian"></p><h2>Shipper</h2>
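<p>This section contains the Beat's configuration options and some general settings that control its behavior.</p>
<p>The comments on each option already explain things quite clearly; some people simply don't read them.</p>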
<p>As shown below:</p><pre class="brush:bash;toolbar:false">shipper:
  # The name of the shipper that publishes the network data. It can be used to group
  # all the transactions sent by a single shipper in the web interface.
  # If this option is not defined, the hostname is used.
  #name:

  # The tags of the shipper are included in their own field with each
  # transaction published. Tags make it easy to group servers by different
  # logical properties.
  tags: ["service-X", "web-tier"]

  # Uncomment the following if you want to ignore transactions created
  # by the server on which the shipper is installed. This option is useful
  # to remove duplicates if shippers are installed on multiple servers.
  ignore_outgoing: true

  # How often (in seconds) shippers are publishing their IPs to the topology map.
  # The default is 10 seconds.
  refresh_topology_freq: 10

  # Expiration time (in seconds) of the IPs published by a shipper to the topology map.
  # All the IPs will be deleted afterwards. Note, that the value must be higher than
  # refresh_topology_freq. The default is 15 seconds.
  topology_expire: 15

  # Configure local GeoIP database support.
  # If no paths are configured, GeoIP is disabled.
  #geoip:
    #paths:
    #  - "/usr/share/GeoIP/GeoLiteCity.dat"
    #  - "/usr/local/var/GeoIP/GeoLiteCity.dat"</pre><p></p>
<h4>name</h4>
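<p>The name of the Beat. If it is not set, the hostname is used. The name is included in the shipper field of every published transaction, so you can use it to group all the transactions sent by a single Beat.</p>
<p>At startup, each Beat sends its IP address, port, and name to Elasticsearch. This information is stored in Elasticsearch as a network topology map, which maps each Beat's IP address and port to the name you specify here.</p>
<p>When a Beat receives a new request and response (together called a transaction), it queries Elasticsearch to check whether the topology map contains the IP address and port of the source server and of the destination server. If that information is available, the client_server field in the output is set to the name of the Beat running on the source server, and the server field is set to the name of the Beat running on the destination server.</p>
<p>To use the topology map in Elasticsearch, you must set save_topology to true and use Elasticsearch as an output.</p><pre class="brush:bash;toolbar:false">shipper:
  name: "ttlsa-shipper"</pre><p></p>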
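<p>As a rough sketch of the two requirements just mentioned, a named Beat that stores its topology entry might be configured as follows. This assumes Packetbeat and reuses the Elasticsearch host that appears in the output examples of this article; the values are illustrative only.</p><pre class="brush:bash;toolbar:false">shipper:
  # Name under which this Beat appears in the topology map
  name: "ttlsa-shipper"

output:
  elasticsearch:
    hosts: ["http://es.ttlsa.com:9200"]

    # Needed for the topology map to be stored in Elasticsearch
    save_topology: true</pre>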
<h4>tags</h4>
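<p>A list of tags for the Beat, included in the tags field of every published transaction. Tags make it easy to group servers by different logical properties. For example, for a cluster of web servers you can add a webservers tag to the Beat on each server, then filter and query the whole group by that tag in Kibana's visualisation interface.</p><pre class="brush:bash;toolbar:false">shipper:
  tags: ["mysql-db", "aws", "rdb"]</pre><p></p>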
<h4>ignore_outgoing</h4>
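<p>If ignore_outgoing is enabled, the Beat ignores all transactions initiated from the server it runs on. That is hard to convey in one sentence, so see the explanation below.</p>
<p>This is useful when two Beats publish the same transaction: one Beat sees it as an outgoing transaction and the other sees it as an incoming one, so it is published twice. Enabling this option eliminates the duplicate.</p>
<p>For example, consider three servers, each with a Beat installed. Transaction t1 is exchanged between server1 and server2, and t2 is exchanged between server2 and server3.</p>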
<p><img title="ELK beats通用配置说明(12th)" class="alignnone size-full wp-image-10733" src="https://zhuji.jb51.net/uploads/img/20230519/6f7729e8def255480c059d27e03192ae.jpg" width="536" height="34"></p>
<p>By default, each transaction is indexed twice, because Beat2 sees both transactions. With ignore_outgoing set to false, the published transactions are:</p>
<ul class="itemizedlist" type="disc">
<li class="listitem">Beat1: t1</li>
<li class="listitem">Beat2: t1 and t2</li>
<li class="listitem">Beat3: t2</li>
</ul>
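<p>To avoid the duplicates, force each Beat to send only incoming transactions and to ignore those created by the local server. With ignore_outgoing set to true, the published transactions are:</p>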
<ul class="itemizedlist" type="disc">
<li class="listitem">Beat1: none</li>
<li class="listitem">Beat2: t1</li>
<li class="listitem">Beat3: t2</li>
</ul>
<h4>refresh_topology_freq</h4>
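<p>The refresh interval of the topology map, that is, how often each Beat publishes its IP addresses to the topology map. The default is 10 seconds.</p>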
<h4>topology_expire</h4>
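<p>The expiration time of topology entries. This is useful when a Beat stops publishing its IP addresses: once the entries expire, the addresses are removed from the topology map automatically. The default is 15 seconds.</p>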
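<p>The comments in the sample configuration note that topology_expire must be higher than refresh_topology_freq. A minimal sketch that respects this constraint is shown here; the values 20 and 30 are assumptions chosen only for illustration.</p><pre class="brush:bash;toolbar:false">shipper:
  # Publish this Beat's IPs to the topology map every 20 seconds
  refresh_topology_freq: 20

  # Entries expire after 30 seconds; keep this above refresh_topology_freq
  topology_expire: 30</pre>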
<h4>geoip.paths</h4>
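<p>The search paths for the GeoIP database. The Beat loads the GeoIP database once it finds it, then adds the GeoIP location of the client to each exported transaction.</p>
<p>The recommended values are <code class="literal">/usr/share/GeoIP/GeoLiteCity.dat</code> and <code class="literal">/usr/local/var/GeoIP/GeoLiteCity.dat</code>.</p>
<p>Currently only Packetbeat uses this option.</p>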
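<p>To turn GeoIP lookup on, the commented-out lines from the sample configuration would be uncommented, roughly as follows. This is a sketch that assumes the database file actually exists at one of the listed paths.</p><pre class="brush:bash;toolbar:false">shipper:
  geoip:
    paths:
      - "/usr/share/GeoIP/GeoLiteCity.dat"
      - "/usr/local/var/GeoIP/GeoLiteCity.dat"</pre>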
<p class="maodian"></p><h2>Output</h2>
<p>You can configure multiple outputs for exporting transactions. The currently supported output types are:</p>
<div class="itemizedlist">
<ul class="itemizedlist" type="disc">
<li class="listitem">Elasticsearch</li>
<li class="listitem">Logstash</li>
<li class="listitem">
Redis (deprecated)</li>
<li class="listitem">File</li>
<li class="listitem">Console</li>
</ul>
</div>
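<p>One or more outputs can be enabled at the same time. The output plugins are responsible for sending the transaction data, formatted as JSON, to the next stage of the pipeline. They also maintain the network topology.</p>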
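<p>For instance, enabling two outputs at once might look like the sketch below, which sends events to Elasticsearch and also prints them to the console. Both outputs are described in the sections that follow; the host is the one used elsewhere in this article.</p><pre class="brush:bash;toolbar:false">output:
  # First output: index events into Elasticsearch
  elasticsearch:
    hosts: ["http://es.ttlsa.com:9200"]

  # Second output: also print each event to standard output
  console:
    pretty: true</pre>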
<h4>Elasticsearch Output</h4>
<p>When Elasticsearch is specified as the output, the Beat sends transactions directly to Elasticsearch through the Elasticsearch HTTP API.</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    # The Elasticsearch cluster
    hosts: ["http://es.ttlsa.com:9200"]

    # Comment this option if you don't want to store the topology in
    # Elasticsearch. The default is false.
    # This option makes sense only for Packetbeat
    save_topology: true

    # Optional index name. The default is packetbeat and generates
    # packetbeat-YYYY.MM.DD keys.
    index: "packetbeat"

    # List of root certificates for HTTPS server verifications
    cas: ["/etc/pki/root/ca.pem"]

    # TLS configuration.
    tls:
      # Certificate for TLS client authentication
      certificate: "/etc/pki/client/cert.pem"

      # Client Certificate Key
      certificatekey: "/etc/pki/client/cert.key"</pre><p>To enable SSL, specify https in the hosts option.</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    # The Elasticsearch cluster
    hosts: ["https://localhost:9200"]

    # Comment this option if you don't want to store the topology in
    # Elasticsearch. The default is false.
    # This option makes sense only for Packetbeat
    save_topology: true

    # HTTP basic auth
    username: "admin"
    password: "s3cr3t"</pre><p>If the Elasticsearch nodes are defined as IP:PORT, add protocol: https, as follows:</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    # The Elasticsearch cluster
    hosts: ["localhost"]

    # Optional http or https. Default is http
    protocol: "https"

    # Comment this option if you don't want to store the topology in
    # Elasticsearch. The default is false.
    # This option makes sense only for Packetbeat
    save_topology: true

    # HTTP basic auth
    username: "admin"
    password: "s3cr3t"</pre><p></p>
<h4>hosts</h4>
<p>The list of Elasticsearch nodes to connect to. Events are distributed randomly among these nodes. If a node becomes unreachable, events are automatically sent to another node. Each node is defined either as a URL or as IP:PORT, for example http://es1.ttlsa.com, https://es2.ttlsa.com, or 10.0.0.1. If no port is specified, the default is 9200.</p>
<p>When nodes are defined in IP:PORT form, the scheme and path are taken from the protocol and path options. For example:</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    # The Elasticsearch cluster
    hosts: ["10.45.3.2:9220", "10.45.3.1:9230"]

    # Optional http or https. Default is http
    protocol: https

    # HTTP Path at which each Elasticsearch server lives
    path: /elasticsearch</pre><p>In the example above, the Elasticsearch nodes are available at https://10.45.3.2:9220/elasticsearch and https://10.45.3.1:9230/elasticsearch.</p>
<div class="titlepage">
<div>
<div>
<h4 class="title">worker</h4>
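<p>The number of workers per configured host that publish events to Elasticsearch. This is best used with load balancing mode enabled. For example, with 2 hosts and 3 workers, 6 workers are started in total, 3 for each host.</p>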
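<p>The arithmetic in that example can be sketched as a configuration. The node addresses are the ones from the hosts example above; the worker value is illustrative.</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    # 2 hosts
    hosts: ["10.45.3.2:9220", "10.45.3.1:9230"]

    # 3 workers per host, so 2 x 3 = 6 workers in total
    worker: 3</pre>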
</div>
</div>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">host (deprecated)</h4>
</div>
</div>
</div>
<p>The host of the Elasticsearch server. This option is deprecated and has been replaced by hosts.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
port (deprecated)</h4>
</div>
</div>
</div>
<p>The port of the Elasticsearch server. This option is deprecated and has been replaced by hosts.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
username</h4>
</div>
</div>
</div>
<p>The basic authentication username for connecting to Elasticsearch.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
password</h4>
</div>
</div>
</div>
<p>The basic authentication password for connecting to Elasticsearch.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">protocol</h4>
</div>
</div>
</div>
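<p>The protocol used to reach Elasticsearch. The options are http or https; the default is http. However, if a URL is specified in hosts, the scheme given in the URL overrides the protocol value.</p>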
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">path</h4>
</div>
</div>
</div>
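<p>An HTTP path prefix that is prepended to HTTP API calls. This is typically used when Elasticsearch listens behind an HTTP reverse proxy that exposes the API under a custom prefix.</p>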
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">index</h4>
</div>
</div>
</div>
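<p>The root name of the index that events are written to. The default is the Beat name. For Packetbeat, for example, the indices are named <code class="literal">packetbeat-YYYY.MM.DD</code> (for example, <code class="literal">packetbeat-2015.11.29</code>).</p>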
</div>
<div class="section">
<div class="titlepage">
<h4 class="title">max_retries</h4>
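<p>The maximum number of times a given send to Elasticsearch is attempted. If the send still has not succeeded after this many attempts, the events are dropped. The default is 3.</p>
<p>A value of 0 disables retrying. A value less than 0 retries indefinitely until the events are published.</p>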
</div>
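<p>If the output plugin drops an event, each Beat implementation must decide whether to lose the event or to try sending it again. If the send operation still does not succeed after max_retries, the Beat is optionally notified.</p>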
</div>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">bulk_max_size</h4>
</div>
</div>
</div>
<p>The maximum number of events in a single Elasticsearch bulk API index request. The default is 50.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
timeout</h4>
</div>
</div>
</div>
<p>The timeout for Elasticsearch requests. The default is 90 seconds.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
flush_interval</h4>
</div>
</div>
</div>
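<p>The number of seconds to wait for new events between two bulk API index requests. If bulk_max_size is reached before this interval expires, additional bulk index requests are made.</p>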
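<p>Putting the last few options together, a hedged sketch of a tuned Elasticsearch output could look like this. The values for max_retries, bulk_max_size, and timeout are the defaults stated above; the flush_interval value is an assumption chosen only for illustration, and the host is the one used elsewhere in this article.</p><pre class="brush:bash;toolbar:false">output:
  elasticsearch:
    hosts: ["http://es.ttlsa.com:9200"]

    # Retry a failed send up to 3 times before the events are dropped
    # (0 disables retries, a negative value retries indefinitely)
    max_retries: 3

    # At most 50 events per bulk API index request
    bulk_max_size: 50

    # Give up on a request after 90 seconds
    timeout: 90

    # Wait up to 1 second for new events between bulk requests (assumed value)
    flush_interval: 1</pre>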
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
save_topology</h4>
</div>
</div>
</div>
<p>Whether the topology is stored in Elasticsearch. The default is false. Only Packetbeat uses this option.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
topology_expire</h4>
</div>
</div>
</div>
<p>How long topology information is kept in Elasticsearch. The default is 15 seconds.</p>
<div class="titlepage">
<div>
<div>
<h4 class="title">tls</h4>
</div>
</div>
</div>
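<p>TLS parameters, such as the certificate authorities, used for HTTPS-based connections. If the tls section is missing, the host's CAs are used for HTTPS connections to Elasticsearch.</p>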
</div>
<div class="titlepage">
<div>
<div>
<h2 class="title">Logstash Output</h2>
</div>
</div>
</div>
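<p>The Logstash output sends events directly to Logstash using the lumberjack protocol. To use it, you must install and configure the logstash-input-beats plugin on Logstash. Logstash then allows additional processing and routing of the generated events.</p>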
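<p>Every event sent to Logstash carries additional metadata for indexing and filtering. For example:</p><pre class="brush:bash;toolbar:false">{
    ...
    "@metadata": {
      "beat": "&lt;beat&gt;",
      "type": "&lt;event type&gt;"
    }
}</pre><p>In Logstash, you can configure the Elasticsearch output plugin to use the metadata and the event type for indexing.</p>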
<p>The following Logstash 1.5 configuration sets Logstash up to index events into Elasticsearch using the index and document type reported by the Beat. The index used depends on the @timestamp field as determined by Logstash.</p><pre class="brush:bash;toolbar:false">input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}</pre><p>The equivalent configuration for Logstash 2.x:</p><pre class="brush:bash;toolbar:false">input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}</pre><p>Events indexed into Elasticsearch this way look similar to events that Beats index into Elasticsearch directly. The following shows how to configure a Beat to use Logstash:</p><pre class="brush:bash;toolbar:false">output:
  logstash:
    hosts: ["localhost:5044"]

    # index configures '@metadata.beat' field to be used by Logstash for
    # indexing. By Default the beat name is used (e.g. filebeat, topbeat, packetbeat)
    index: mybeat</pre><p></p>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">hosts</h4>
</div>
</div>
</div>
<p>The list of Logstash servers to connect to. Each entry may include a port number. If no port is given, the default port is used.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
worker</h4>
</div>
</div>
</div>
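<p>The number of workers per configured host that publish events. This is best used with load balancing mode enabled. For example, with 2 hosts and 3 workers, 6 workers are started in total, 3 for each host.</p>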
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
loadbalance</h4>
</div>
</div>
</div>
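<p>If set to true and multiple Logstash hosts are configured, the output plugin load balances published events across all Logstash hosts. If set to false, the output plugin sends all events to one host chosen at random, and switches to another host if the selected one becomes unreachable. The default is false.</p><pre class="brush:bash;toolbar:false">output:
  logstash:
    hosts: ["localhost:5044", "localhost:5045"]

    # configure index prefix name
    index: mybeat

    # configure logstash plugin to loadbalance events between the logstash instances
    loadbalance: true</pre><p></p>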
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">port</h4>
</div>
</div>
</div>
<p>The default port to use when an entry in hosts does not specify one. The default is 10200.</p>
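<p>A small sketch of how port interacts with hosts. The host names are hypothetical: the first entry has no port and would fall back to the port option, while the second keeps its explicit port.</p><pre class="brush:bash;toolbar:false">output:
  logstash:
    # logstash1 has no port, so the port option below applies to it;
    # logstash2 keeps its explicit port 5045
    hosts: ["logstash1.example.com", "logstash2.example.com:5045"]

    # Port for entries in hosts that do not specify one
    port: 5044</pre>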
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
index</h4>
</div>
</div>
</div>
<p>As explained above.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
tls</h4>
</div>
</div>
</div>
<p>As explained above.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
timeout</h4>
</div>
</div>
</div>
<p>The time to wait for a response from Logstash before timing out. The default is 30 seconds.</p>
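<p>As an illustration of how the Logstash options described in this section combine, the sketch below load balances across the two local instances from the earlier example. The timeout and max_retries values are the defaults stated in this article; the worker count is an assumption.</p><pre class="brush:bash;toolbar:false">output:
  logstash:
    hosts: ["localhost:5044", "localhost:5045"]

    # Spread events across both Logstash instances
    loadbalance: true

    # 2 workers per host, so 4 workers in total (assumed value)
    worker: 2

    # Seconds to wait for a response from Logstash
    timeout: 30

    # Attempts before the events are dropped
    max_retries: 3</pre>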
<div class="titlepage">
<div>
<div>
<h4 class="title">max_retries</h4>
<p>As explained above.</p>
</div>
</div>
</div>
<p class="maodian"></p><h2>Redis Output (deprecated)</h2>
</div>
</div>
<p>It has been replaced by beats and is no longer recommended, so it is not covered here.</p>
<div class="titlepage">
<div>
<div>
<h2 class="title">File Output</h2>
</div>
</div>
</div>
<p>The file output dumps transactions to a file, with each transaction in JSON format. It is mainly used for testing, but it can also serve as input for Logstash.</p><pre class="brush:bash;toolbar:false">output:

  # File as output
  # Options:
  # path: where to save the files
  # filename: name of the files
  # rotate_every_kb: maximum size of the files in path
  # number_of_files: maximum number of files in path
  file:
    path: "/tmp/packetbeat"
    filename: packetbeat
    rotate_every_kb: 1000
    number_of_files: 7</pre><p></p>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">path</h4>
</div>
</div>
</div>
<p>The path of the directory where the files are saved. Required.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
filename</h4>
</div>
</div>
</div>
<p>The file name. The default is the Beat name. The configuration above generates files named <code class="literal">packetbeat</code>, <code class="literal">packetbeat.1</code>, <code class="literal">packetbeat.2</code>, and so on.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
rotate_every_kb</h4>
</div>
</div>
</div>
<p>The maximum size of each file. When a file reaches this size, it is rotated. The default is 1000 KB.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
number_of_files</h4>
</div>
</div>
</div>
<p>The maximum number of files to keep. When this number is reached, the oldest file is deleted. The default is 7.</p>
<div class="titlepage">
<div>
<div>
<h2 class="title">Console Output</h2>
</div>
</div>
</div>
<p>Writes events to standard output in JSON format.</p>
</div>
<p></p><pre class="brush:bash;toolbar:false">output:
  console:
    pretty: true</pre><p></p>
<div class="titlepage">
<div>
<div>
<h4 class="title">pretty</h4>
</div>
</div>
</div>
<p>If set to true, events written to standard output are nicely formatted. The default is false.</p>
<div class="titlepage">
<div>
<div>
<h2 class="title">Logging (Optional)</h2>
</div>
</div>
</div>
<p>Configures logging for Beats. Logs can be written to syslog or to rotating log files. The default is syslog.</p><pre class="brush:bash;toolbar:false">logging:
  level: warning

  # enable file rotation with default configuration
  to_files: true

  # do not log to syslog
  to_syslog: false

  files:
    path: /var/log/mybeat
    name: mybeat.log
    keepfiles: 7</pre><p></p>
<div class="titlepage">
<div>
<div>
<h3 class="title">Logging options</h3>
</div>
</div>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
to_syslog</h4>
</div>
</div>
</div>
<p>If enabled, all logging output is sent to syslog.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
to_files</h4>
</div>
</div>
</div>
<p>If enabled, logging output is written to rotating files.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
level</h4>
</div>
</div>
</div>
<p>The log level: debug, info, warning, error, or critical. If debug is used but no selectors are configured, the * selector is used. The default is error.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">selectors</h4>
</div>
</div>
</div>
<p>The list of debugging-only selector tags used by different Beats components. Use <code class="literal">*</code> to enable debug output for all components. For example, add <code class="literal">publish</code> to display all the debug messages related to event publishing. When starting the Beat, selectors can be overwritten using the <code class="literal">-d</code> command line option (<code class="literal">-d</code> also sets the debug log level).</p>
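<p>A minimal sketch of the case described above, limiting debug output to the publishing component:</p><pre class="brush:bash;toolbar:false">logging:
  # Selectors only take effect at the debug level
  level: debug

  # Show only debug messages related to event publishing;
  # use ["*"] to enable debug output for all components
  selectors: ["publish"]</pre>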
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
files.path</h4>
</div>
</div>
</div>
<p>The directory for the log files.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
files.name</h4>
</div>
</div>
</div>
<p>The name of the log file. The default is the Beat name.</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">files.rotateeverybytes</h4>
</div>
</div>
</div>
<p>The maximum size of a log file, in bytes. The default is 10485760 (10 MB).</p>
</div>
<div class="section">
<div class="titlepage">
<div>
<div>
<h4 class="title">
files.keepfiles</h4>
</div>
</div>
</div>
<p>The number of rotated log files to keep. The default is 7. The value must be in the range 2 to 1024.</p>
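<p>The file-related options can be combined as in the following sketch. The path and file name are taken from the logging example above, and the rotation values are the defaults stated in this section.</p><pre class="brush:bash;toolbar:false">logging:
  to_files: true
  to_syslog: false

  files:
    path: /var/log/mybeat
    name: mybeat.log

    # Rotate when a file reaches 10485760 bytes (10 MB)
    rotateeverybytes: 10485760

    # Keep 7 rotated files (allowed range: 2 to 1024)
    keepfiles: 7</pre>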
<div class="titlepage">
<div>
<div>
<h3 class="title">Logging Format</h3>
</div>
</div>
</div>
<p>Each logging type uses a different log format:</p>
<div class="itemizedlist">
<ul class="itemizedlist" type="disc">
<li class="listitem">to syslog: syslog adds its own timestamp.</li>
<li class="listitem">to file: timestamps use the RFC 3339 format, <code class="literal">2006-01-02T15:04:05Z07:00 WARN log-message</code>. The format includes the time zone and the log level.</li>
<li class="listitem">to stderr: timestamps use a UTC format, <code class="literal">2015/11/12 09:03:37.369262 geolite.go:52: WARN log-message</code>. The format includes the UTC timestamp with milliseconds and is mainly used for debugging.
</li>
</ul>
</div>
</div>
<div class="titlepage">
<div>
<div>
<h2 class="title">Run Options (Optional)</h2>
</div>
</div>
</div>
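<p>A Beat can drop privileges after creating its socket. Opening the socket requires root access, but nothing else does, so it is recommended to run the Beat as an unprivileged user. The user and group are specified with uid and gid.</p>
<p>On Linux, setuid does not change the uid of all threads, so the Go garbage collector will still run as root. Also note that process monitoring needs to run with root privileges.</p><pre class="brush:bash;toolbar:false">runoptions:
  uid: 501
  gid: 501</pre><p></p>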
