xnow - A Quick Start to the EFK Logging Stack

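Elasticsearch, Filebeat, and Kibana are collectively known as EFK and are commonly used to build enterprise-grade logging systems. Filebeat handles collection: it reads log files and ships them to Elasticsearch. Elasticsearch is the database: it processes and stores the data that Filebeat sends. Kibana is the UI for querying the data in Elasticsearch. Together the three form a complete logging platform. Elasticsearch also has strong full-text search capabilities, so it is often used as a search engine in business applications as well.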

This article uses docker compose to orchestrate Nginx, Filebeat, Elasticsearch, and Kibana. The Nginx access log is parsed into separate fields, stored in Elasticsearch, and then displayed and queried in Kibana.

Demo environment:

Host: Debian 12
Docker version: 28.3.2

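The versions of the EFK components are all pinned in compose.yaml. Configuration and UI are likely to differ between versions of Elasticsearch, Filebeat, and Kibana, so prefer recent versions. If your versions differ from the ones used here and the setup fails, consult the official documentation.

Create a working directory for the files used in this article: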

$ mkdir /tmp/efk
$ cd /tmp/efk

Nginx configuration file

Create a directory for Nginx to hold its configuration file and logs:

$ mkdir -p nginx/logs

Edit the Nginx configuration file nginx/nginx.conf and add the following minimal configuration:

user  nginx;
worker_processes  auto;
error_log  /var/log/nginx/error.log notice;
pid        /run/nginx.pid;

events {
  worker_connections  1024;
}

http {
  include       /etc/nginx/mime.types;
  default_type  application/octet-stream;

  log_format main '[$time_local] | $http_host | $remote_addr | "$remote_user" | ' 
    '$request | $status | $body_bytes_sent | "$http_user_agent" | ' 
    '"$http_referer" | "$http_x_forwarded_for" | $request_time'; 


  access_log  /var/log/nginx/access.log  main;
  sendfile        on;
  keepalive_timeout  65;

  server {
    listen 80 default_server;
    server_name _;
    location / {
      root   /usr/share/nginx/html;
      index  index.html index.htm;
    }
  }
}

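Compared with the default nginx.conf, this file adds a server block. It also changes the Nginx access log format so that fields are separated by vertical bars, which is easier to read and easier to analyze with tools. Later, an Elasticsearch pipeline will parse these Nginx log lines.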
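To illustrate why a bar-separated format is easy to process, here is a minimal Python sketch that splits one access-log line into named fields. The field names simply follow the order of the variables in the log_format directive above; the sample line is written by hand to match that format and is not real output.

```python
# Field names follow the order of variables in the log_format directive above.
FIELDS = [
    "time_local", "http_host", "remote_addr", "remote_user", "request",
    "status", "body_bytes_sent", "http_user_agent", "http_referer",
    "http_x_forwarded_for", "request_time",
]

# Fields that the log_format wraps in double quotes.
QUOTED = ("remote_user", "http_user_agent", "http_referer", "http_x_forwarded_for")


def split_access_log(line: str) -> dict:
    """Split one access-log line on the ' | ' separator and name the parts."""
    parts = line.rstrip("\n").split(" | ")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Strip the decoration the format adds around some values.
    record["time_local"] = record["time_local"].strip("[]")
    for key in QUOTED:
        record[key] = record[key].strip('"')
    return record


# Hypothetical sample line, hand-written to match the format (not real output).
SAMPLE = ('[30/Jul/2025:01:07:49 +0000] | 127.0.0.1 | 172.18.0.1 | "-" | '
          'GET / HTTP/1.1 | 200 | 615 | "curl/8.5.0" | "-" | "-" | 0.000')

record = split_access_log(SAMPLE)
print(record["request"], record["status"])
```

This naive split would break if a value such as the user agent itself contained " | ", which is one reason the article relies on a grok pattern in Elasticsearch rather than a plain split.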

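Filebeat configuration

Logstash, Fluentd, and other tools offer the same collection capabilities as Filebeat. Filebeat is chosen here because it is written in Go, is lightweight, uses few resources, and is well suited to running as a daemon on every node.

Create the Filebeat configuration directory: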

$ mkdir -p filebeat/data
$ sudo chown -R 1000:1000 filebeat

Create the Filebeat configuration file filebeat/filebeat.yml with the following content:

filebeat.inputs:
  - type: filestream
    id: nginx_access_log_id
    prospector.scanner.fingerprint.length: 64
    paths:
      - /tmp/logs/access.log
    fields:
      log_type: nginx_access_log
    pipeline: "nginx_access_log_pipeline"


  - type: filestream
    id: nginx_error_log_id
    prospector.scanner.fingerprint.length: 64
    paths:
      - /tmp/logs/error.log
    fields:
      log_type: nginx_error_log

output.elasticsearch:
  hosts: ['elasticsearch:9200']
  flush_interval: 1s

  indices:
    - index: 'nginx_access_log-%{+yyyy_MM_dd}'
      when.equals:
        fields.log_type: "nginx_access_log"

    - index: 'nginx_error_log-%{+yyyy_MM_dd}'
      when.equals:
        fields.log_type: "nginx_error_log"

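The file defines two inputs that read the Nginx access.log and error.log files and send them to the corresponding Elasticsearch index.

The Elasticsearch index names use the variable %{+yyyy_MM_dd} so that each index covers one day, which makes it easy to delete old indices by date later.
The access log input is configured with a pipeline, which parses the Nginx access log. This pipeline will be created manually in Elasticsearch later.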
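The date-suffixed naming makes retention simple to reason about. The following Python sketch shows how a daily index name is formed and how indices older than a retention window could be selected for deletion. The helper names and the 7-day window are illustrative assumptions, not part of Filebeat or Elasticsearch.

```python
from datetime import date, datetime, timedelta


def index_name(prefix: str, day: date) -> str:
    """Build a daily index name in the same shape as '<prefix>-yyyy_MM_dd'."""
    return f"{prefix}-{day:%Y_%m_%d}"


def expired_indices(names, prefix: str, today: date, keep_days: int):
    """Return the indices with the given prefix whose date is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix + "-"):
            continue  # belongs to another log type
        suffix = name[len(prefix) + 1:]
        try:
            day = datetime.strptime(suffix, "%Y_%m_%d").date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired


# Hypothetical index list; the 7-day retention is an arbitrary example value.
names = [
    "nginx_access_log-2025_07_01",
    "nginx_access_log-2025_07_29",
    "nginx_error_log-2025_07_01",
]
print(index_name("nginx_access_log", date(2025, 7, 30)))
print(expired_indices(names, "nginx_access_log", date(2025, 7, 30), keep_days=7))
```

Each name returned by expired_indices could then be removed with the DELETE index call shown later in this article.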

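For Filebeat configuration, see the reference: https://www.elastic.co/docs/reference/beats/filebeat/filebeat-reference-yml

Besides Elasticsearch, Filebeat supports other outputs such as Kafka and Logstash. Going through Kafka (Filebeat -> Kafka -> Elasticsearch) smooths out traffic peaks and absorbs sudden bursts. Going through Logstash (Filebeat -> Logstash -> Elasticsearch) gives access to Logstash's rich log parsing and processing features. The two can also be combined into a Filebeat -> Kafka -> Logstash -> Elasticsearch chain. However many layers of middleware the logs pass through, their final destination is usually still Elasticsearch.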

Deploying with docker compose

Create the Elasticsearch data directory:

$ mkdir es_data
$ sudo chown -R 1000:1000 es_data

Create a compose.yaml file that orchestrates all the containers, with the following content:

services:
  nginx:
    image: nginx:1.29.0
    volumes:
      - ./nginx/logs/:/var/log/nginx/         # write logs to a directory on the host
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf
    ports:
      - 80:80

  filebeat:
    image: elastic/filebeat:9.1.0
    container_name: filebeat
    restart: always
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml
      - ./filebeat/data:/usr/share/filebeat/data
      - ./nginx/logs:/tmp/logs:ro   # mount the host log directory into the container

  elasticsearch:
    image: elasticsearch:9.0.4
    container_name: elasticsearch
    restart: always
    environment:
      - cluster.name=elasticsearch
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - discovery.type=single-node
      - xpack.security.enabled=false
      - xpack.security.enrollment.enabled=false
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - ./es_data:/usr/share/elasticsearch/data
    healthcheck:
      interval: 5s
      retries: 60
      test: curl --fail -s -o /dev/null http://127.0.0.1:9200/

  kibana:
    image: kibana:9.0.4
    container_name: kibana
    restart: always
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      elasticsearch:
        condition: service_healthy

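Configuration notes:

nginx: mounts nginx/nginx.conf into the container and maps the log directory inside the Nginx container to the host.
filebeat: mounts the Nginx logs read-only into the Filebeat container so that Filebeat can read and collect them.
elasticsearch: maps the Elasticsearch data directory to the host for persistence.
kibana: receives the Elasticsearch address through an environment variable and waits for Elasticsearch to be healthy before starting.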

This article uses a single-node Elasticsearch. For a clustered Elasticsearch deployment in Docker, see the official documentation: https://www.elastic.co/docs/deploy-manage/deploy/self-managed/install-elasticsearch-docker-compose

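Security note: this article simplifies the deployment and disables Elasticsearch authentication and related features. These security options must be enabled in a production deployment.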

Start the compose project:

$ docker compose up -d

Check the Elasticsearch status:

$ curl http://127.0.0.1:9200/_cat/health?v
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign unassign.pri pending_tasks max_task_wait_time active_shards_percent
1753751269 01:07:49  elasticsearch yellow          1         1     37  37    0    0        5            0             0                  -                 88.1%

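A healthy Elasticsearch cluster normally reports green. This is a single-node cluster, so replica shards have no second node to be allocated to and the status is yellow. This does not affect the steps that follow.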

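Configuring a pipeline in Elasticsearch

An Elasticsearch pipeline can parse, split, transform, add, and remove fields on incoming messages. See the documentation for details: https://www.elastic.co/docs/reference/enrich-processor/

Create the following JSON file, named nginx_access_log_pipeline.json. It contains the pipeline that parses the Nginx access log. The grok processor provides patterns that extract structured fields from a string field, giving each field a name and a type.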

{
  "description": "Nginx access log pipeline",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "\\[%{HTTPDATE:timestamp}\\] \\| %{DATA:domain} \\| %{IP:client_ip} \\| \"%{DATA:remote_user}\" \\| %{WORD:method} %{DATA:request} %{DATA:http_version} \\| %{NUMBER:status_code:int} \\| %{NUMBER:response_size:int} \\| \"%{DATA:user_agent}\" \\| \"%{DATA:referer}\" \\| \"%{DATA:x_forwarded_for}\" \\| %{NUMBER:response_time:float}"
        ],
        "ignore_missing": true
      }
    },
    {
      "date": {
        "field": "timestamp",
        "target_field": "@timestamp",
        "formats": [
          "dd/MMM/yyyy:HH:mm:ss Z"
        ]
      }
    },
    {
      "remove": {
        "field": [
          "timestamp",
          "ecs.version"
        ]
      }
    }
  ]
}

Create the pipeline through the Elasticsearch API:

$ curl -X PUT -H "content-type: application/json" \
    http://127.0.0.1:9200/_ingest/pipeline/nginx_access_log_pipeline \
    -d @./nginx_access_log_pipeline.json

Check the pipeline that was created:

# List all pipelines
$ curl http://127.0.0.1:9200/_ingest/pipeline | jq .

# Or view a specific pipeline
$ curl http://127.0.0.1:9200/_ingest/pipeline/nginx_access_log_pipeline?pretty

Documentation for the pipeline APIs: https://www.elastic.co/guide/en/elasticsearch/reference/master/get-pipeline-api.html
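Before sending real traffic, a stored pipeline can be dry-run with the ingest simulate API (POST /_ingest/pipeline/<name>/_simulate). The Python sketch below builds such a request using only the standard library. The sample log line is hand-written, and the ecs.version value is a placeholder added so that the remove processor in the pipeline finds the field it expects; both are assumptions for illustration.

```python
import json
import urllib.request

ES_URL = "http://127.0.0.1:9200"          # same address as the curl commands above
PIPELINE = "nginx_access_log_pipeline"    # pipeline name used in this article

# Hypothetical sample line, hand-written to match the log_format (not real output).
SAMPLE = ('[30/Jul/2025:01:07:49 +0000] | 127.0.0.1 | 172.18.0.1 | "-" | '
          'GET / HTTP/1.1 | 200 | 615 | "curl/8.5.0" | "-" | "-" | 0.000')


def build_simulate_request(message: str) -> urllib.request.Request:
    """Build (but do not send) a simulate request for one log line."""
    body = {
        "docs": [
            {
                "_source": {
                    "message": message,
                    # Placeholder value: the pipeline removes ecs.version, and the
                    # remove processor fails on a missing field by default.
                    "ecs": {"version": "8.0.0"},
                }
            }
        ]
    }
    return urllib.request.Request(
        f"{ES_URL}/_ingest/pipeline/{PIPELINE}/_simulate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def simulate(message: str) -> dict:
    """Send the request to a running Elasticsearch and return the parsed reply."""
    with urllib.request.urlopen(build_simulate_request(message)) as resp:
        return json.load(resp)


request = build_simulate_request(SAMPLE)
print(request.get_method(), request.full_url)
```

With the stack running, calling simulate(SAMPLE) should return the document as the pipeline would transform it, which is a quick way to check the grok pattern against new log formats.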

Testing log collection

Once docker compose is running, requests to Nginx generate log entries, which Filebeat collects and sends to Elasticsearch.

$ curl http://127.0.0.1/    # status_code 200
$ curl http://127.0.0.1/x   # status_code 404

Check the number of log entries in Elasticsearch:

$ curl -s http://127.0.0.1:9200/_cat/indices?v | awk '/nginx/ || NR ==1'
health status index                          uuid                   pri rep docs.count docs.deleted store.size pri.store.size dataset.size
yellow open   nginx_access_log-2025_07_30    L4T0gXEJQjiOIIvOX3oBhw   1   1          2            0       227b           227b         227b
yellow open   nginx_error_log-2025_07_30     88e58AUhQ3SW2_zRNWPj7w   1   1         19            0     36.7kb         36.7kb       36.7kb

The output shows the document count and the disk usage of the Nginx access log and error log indices.

Querying with Kibana

Open Kibana in a browser at http://127.0.0.1:5601 . Follow the steps below to create Kibana data views and run queries against them.

Creating a Kibana Data View

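1) Open the Kibana left sidebar, click Management, then under Kibana in the left sidebar click Data Views. Alternatively, open http://127.0.0.1:5601/app/management/kibana/dataViews directly.
2) On the new page, click the Create data view button and fill in the form that appears.
- Name: nginx_access_log
- Index pattern: nginx_access_log-*
- When the form is filled in, click the Save data view to Kibana button to submit it.
3) In the same way, create a data view named nginx_error_log for the Nginx error log.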

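Querying logs in Kibana

1) Open the left sidebar and click Discover under Analytics, or open http://127.0.0.1:5601/app/discover#/ directly.
2) In the Data view selector at the top left of the page, choose the nginx_access_log view created earlier. The access log entries generated above appear, with 2 requests recorded.
- The left side lists all the fields the log line was split into.
- Enter status_code: 404 in the search box to see the entries whose HTTP status code is 404.
3) Switch the Data view in Kibana to nginx_error_log.
- Look at the fields on the left. Because no pipeline parses this log, most of them are built-in fields.
- Enter "no such file" in the search box to run a full-text search, which finds the matching error log entry.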

Common Elasticsearch APIs

The Elasticsearch API reference is at https://www.elastic.co/docs/api/doc/elasticsearch/v9/

Check basic cluster information, including component versions and compatibility:

$ curl http://127.0.0.1:9200/

Check cluster health:

$ curl http://127.0.0.1:9200/_cluster/health?pretty

List the nodes in the Elasticsearch cluster:

$ curl -s http://127.0.0.1:9200/_cat/nodes?v

Index operations

Create an index:

$ curl -X PUT -H "content-type: application/json" http://127.0.0.1:9200/test-00001 -d '{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

View the index settings:

$ curl http://127.0.0.1:9200/test-00001/_settings?pretty

List all indices:

$ curl http://127.0.0.1:9200/_cat/indices

View a specific index, including shards, replicas, and other details:

$ curl http://127.0.0.1:9200/test-00001

Bulk-write data to an index. If the index does not exist, it is created automatically:

$ curl -X POST http://127.0.0.1:9200/_bulk?pretty \
  -H "Content-Type: application/json" \
  -d'{ "index" : { "_index" : "test-00001" } }
    {"name": "kong", "age": "bar" }
    { "index" : { "_index" : "test-00001" } }
    {"name": "ming", "title": "bar" }
    { "index" : { "_index" : "test-00001" } }
    {"name": "fang", "title": "bar" }
'
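The bulk body above is newline-delimited JSON: one action line followed by one document line, repeated, with a newline after the last line. The Python sketch below builds the same kind of body programmatically; the helper name bulk_body is an illustrative assumption, and the documents mirror the ones in the curl command.

```python
import json


def bulk_body(index: str, docs: list) -> str:
    """Build an NDJSON body for the _bulk API: an action line, then the document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Same documents as in the curl example above.
docs = [
    {"name": "kong", "age": "bar"},
    {"name": "ming", "title": "bar"},
    {"name": "fang", "title": "bar"},
]
body = bulk_body("test-00001", docs)
print(body, end="")
```

The resulting string can be written to a file and posted with curl --data-binary @file, which keeps the newlines intact, whereas -d @file strips them.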

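View the mapping of an index. A mapping is comparable to a schema in a relational database: it defines the structure of the index and the types of its fields.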

$ curl http://127.0.0.1:9200/nginx_error_log-2025_07_30/_mapping

Delete an index:

$ curl -X DELETE http://127.0.0.1:9200/test-00001

Search operations

For the full query syntax, see https://www.elastic.co/docs/reference/query-languages/querydsl . Two common query types are shown below.

Find entries with status_code=404 in the access log index:

$ curl -H "content-type: application/json" http://127.0.0.1:9200/nginx_access_log-2025_07_30/_search -d  '
{"query":{"match": {"status_code": 404}}}'

Run a full-text search on the error log for the string "No such file":

$ curl -H "content-type: application/json" http://127.0.0.1:9200/nginx_error_log-2025_07_30/_search -d  '
{"query":{"query_string": {"query":"No such file"}}}'
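The two request bodies above follow a simple pattern, and the matching documents come back under hits.hits in the response. The Python sketch below builds both bodies and pulls the _source of each hit out of a response. The helper names are illustrative, and the response is a hand-written stand-in with the usual shape, not real output.

```python
def match_query(field: str, value) -> dict:
    """Body for a match query on a single field."""
    return {"query": {"match": {field: value}}}


def query_string_query(text: str) -> dict:
    """Body for a query_string query across the default fields."""
    return {"query": {"query_string": {"query": text}}}


def sources(response: dict) -> list:
    """Extract the original documents from a _search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hand-written stand-in for a _search response (illustrative values only).
fake_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "nginx_access_log-2025_07_30",
                "_source": {"status_code": 404, "request": "/x"},
            }
        ],
    }
}

print(match_query("status_code", 404))
print(query_string_query("No such file"))
print(sources(fake_response))
```

Note that query_string treats unquoted words as separate terms combined with OR by default; wrapping the text in escaped double quotes, as in query_string_query('"No such file"'), asks for the exact phrase instead.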

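Summary

This article outlined the core concepts and components of the EFK logging system and used docker compose to build a simplified EFK setup. For production use, every component still leaves a lot of room for tuning, and Elasticsearch cluster deployment and security authentication in particular need much more work.

EFK is a powerful and complex logging solution, and it can do far more than the introductory material covered here. Filebeat, Elasticsearch, and Kibana each have a very rich set of configuration options, and this article has only scratched the surface. When you run into problems, read the official documentation linked throughout the article. Finally, have fun.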