
How to Configure Highly Available Container Log Management on Linux



I. Solution Overview

Configuring a highly available container log management system on Linux means making every stage of the pipeline reliable: collection, transport, storage, and analysis. The following recommended architecture is built from mainstream open-source tools:

Core components

  1. Log collection: Fluentd/Fluent Bit
  2. Log buffering: Kafka
  3. Log storage: Elasticsearch cluster
  4. Visualization: Kibana
  5. Orchestration: Docker Swarm/Kubernetes

II. Detailed Configuration Steps

1. Log Collection Layer (Fluentd/Fluent Bit)

Install Fluentd

# Install via td-agent (this script targets Ubuntu 20.04 "focal")
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh

Configure Fluentd (td-agent.conf)

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match docker.**>
  @type kafka2
  brokers kafka1:9092,kafka2:9092,kafka3:9092
  # out_kafka2 takes default_topic (topic_name is not a kafka2 parameter)
  default_topic docker_logs
  <format>
    @type json
  </format>
  # File-backed buffer so collected logs survive a Fluentd restart
  # or a temporary Kafka outage
  <buffer>
    @type file
    path /var/log/td-agent/buffer/kafka
  </buffer>
</match>

Alternatively, use the lighter-weight Fluent Bit

# Install Fluent Bit via the official install script
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh

Fluent Bit configuration

[INPUT]
    Name forward
    Listen 0.0.0.0
    Port 24224

[OUTPUT]
    Name kafka
    Match *
    Brokers kafka1:9092,kafka2:9092,kafka3:9092
    Topics docker_logs

2. Message Queue Layer (Kafka)

Deploy ZooKeeper and a Kafka cluster

Note: for brevity this example runs a single ZooKeeper node, which is itself a single point of failure; a production HA setup should use a 3-node ZooKeeper ensemble (or Kafka in KRaft mode, which removes ZooKeeper entirely).

# Example: 3-node Kafka cluster deployed with Docker Compose
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - "2181:2181"

  kafka1:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3

  kafka2:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9093:9093"
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka2:9093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3

  kafka3:
    image: confluentinc/cp-kafka:latest
    depends_on:
      - zookeeper
    ports:
      - "9094:9094"
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka3:9094
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
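With the brokers up, create the docker_logs topic explicitly so the replication settings are deliberate rather than broker defaults. A sketch, assuming 6 partitions (tune to your throughput); replication-factor 3 with min.insync.replicas=2 tolerates one broker failure without losing acknowledged writes:

# Create the log topic with one replica per broker
kafka-topics --create --bootstrap-server kafka1:9092 \
  --topic docker_logs --partitions 6 --replication-factor 3 \
  --config min.insync.replicas=2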

3. Log Storage Layer (Elasticsearch)

Deploy an Elasticsearch cluster

# Example: 3-node ES cluster docker-compose configuration
version: '3'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200

  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata02:/usr/share/elasticsearch/data

  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.14.0
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata03:/usr/share/elasticsearch/data

volumes:
  esdata01:
  esdata02:
  esdata03:
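Before wiring up the consumer, it is worth pre-creating an index template so the daily docker-logs-* indices (written by the Fluentd consumer in the next step) get deliberate shard and replica counts; with one replica per shard, the data survives the loss of any single node. A sketch using the ES 7.x composable-template API, with shard numbers that are assumptions to tune:

# Template applied to every index the consumer creates below
curl -X PUT "es01:9200/_index_template/docker-logs" \
  -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["docker-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}'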

4. Log Consumption Layer

Fluentd consumes from Kafka and writes to Elasticsearch

<source>
  # fluent-plugin-kafka's input plugin is kafka_group (kafka2 is output-only);
  # a consumer group lets several Fluentd instances share partitions for HA
  @type kafka_group
  brokers kafka1:9092,kafka2:9092,kafka3:9092
  consumer_group fluentd_es_writer
  topics docker_logs
  format json
</source>

<match **>
  @type elasticsearch
  # List all ES nodes so the plugin can fail over between them
  hosts es01:9200,es02:9200,es03:9200
  logstash_format true
  logstash_prefix docker-logs
  <buffer>
    @type file
    path /var/log/td-agent/buffer/elasticsearch
    flush_mode interval
    flush_interval 10s
    retry_type exponential_backoff
    retry_wait 1s
    retry_max_interval 60s
    retry_timeout 60m
    chunk_limit_size 8MB
    total_limit_size 1GB
    overflow_action block
  </buffer>
</match>
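If documents stop arriving in Elasticsearch, check consumer-group lag on the Kafka side first (the group name matches the fluentd_es_writer chosen in the source block above):

# Show partition assignments and per-partition lag for the consumer group
kafka-consumer-groups --bootstrap-server kafka1:9092 \
  --describe --group fluentd_es_writer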

5. Visualization Layer (Kibana)

Install Kibana

# Point Kibana at all three ES nodes; elasticsearch_net must be the Docker
# network the ES containers are attached to
docker run -d --name kibana --net elasticsearch_net -p 5601:5601 \
  -e ELASTICSEARCH_HOSTS='["http://es01:9200","http://es02:9200","http://es03:9200"]' \
  docker.elastic.co/kibana/kibana:7.14.0

Kibana itself is stateless (its saved objects live in Elasticsearch), so for full HA you can run two or more instances behind a load balancer.
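To confirm the instance is up and can reach the cluster, Kibana exposes a status endpoint on its HTTP port:

# Returns overall and per-plugin status as JSON
curl -s http://localhost:5601/api/status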

III. High-Availability Safeguards

1. Component Redundancy

  • Deploy at least 3 nodes of every component
  • Use an odd node count so quorum-based elections always have a clear majority

2. Data Persistence

# Configure persistent storage for the critical components
volumes:
  esdata01:
    driver: local
    driver_opts:
      type: none
      device: /data/elasticsearch/node1
      o: bind
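With a bind mount like this, the host directory must exist and be writable by the elasticsearch user inside the container (uid 1000 in the official images), otherwise the node fails at startup:

# Prepare the bind-mount target on each host
sudo mkdir -p /data/elasticsearch/node1
sudo chown -R 1000:1000 /data/elasticsearch/node1   # uid 1000 = container's elasticsearch user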

3. Monitoring and Alerting

  • Monitor each component's health with Prometheus
  • Build Grafana dashboards on top of those metrics
  • Alert on the key signals (ES node offline, Kafka consumer lag, etc.); a minimal probe you can wire into any alerting channel is sketched below
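A minimal sketch of such a probe, built only on the endpoints configured above (how the alert is delivered is left open):

#!/bin/sh
# Exit non-zero with a reason when the pipeline looks unhealthy
status=$(curl -s "es01:9200/_cluster/health" | grep -o '"status":"[a-z]*"')
case "$status" in
  *green*|*yellow*) ;;                       # cluster usable
  *) echo "ALERT: ES cluster status: $status"; exit 1 ;;
esac
# A broker that answers the API-versions call is alive
kafka-broker-api-versions --bootstrap-server kafka1:9092 >/dev/null 2>&1 \
  || { echo "ALERT: kafka1 is not responding"; exit 1; }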

4. Log Rotation and Archiving

# Manage node-level log files with logrotate
/var/log/containers/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    # copytruncate rather than create: the writing process keeps its file
    # handle open and would otherwise continue writing to the rotated file
    copytruncate
}
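logrotate can dry-run a rule before it goes live; assuming the stanza above is saved as /etc/logrotate.d/containers:

# -d prints what would be rotated without changing anything
logrotate -d /etc/logrotate.d/containers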

IV. Container Log Driver Configuration

Docker log driver configuration

# Edit /etc/docker/daemon.json
{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "fluentd:24224",
    "fluentd-async": "true",
    "tag": "docker.{{.Name}}",
    "labels": "production_status"
  }
}

The fluentd-async option makes the driver connect in the background, so containers can still start while Fluentd is briefly unreachable.
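daemon.json changes only apply after a daemon restart, and it is worth confirming the default driver actually switched:

# Apply the new default log driver and verify it
sudo systemctl restart docker
docker info --format '{{.LoggingDriver}}'   # should print: fluentd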

Kubernetes log collection

# Deploy Fluent Bit as a DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.8.0
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
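The DaemonSet above references a fluent-bit-config ConfigMap that is not shown. A minimal sketch, created from the shell, that tails pod logs and forwards them to the Kafka cluster (note: with the Docker runtime the files under /var/log/containers are symlinks into /var/lib/docker/containers, which would then also need a hostPath mount):

# Minimal Fluent Bit config: tail pod logs, ship them to Kafka
kubectl create configmap fluent-bit-config --from-literal=fluent-bit.conf='
[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Tag kube.*

[OUTPUT]
    Name kafka
    Match *
    Brokers kafka1:9092,kafka2:9092,kafka3:9092
    Topics docker_logs
'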

V. Performance Tuning Suggestions

  1. Batch writes: tune flush_interval and chunk_limit_size in Fluentd's buffer section
  2. Index optimization: create ES indices by date (logstash_format above already does this) and choose a deliberate shard count; an index-lifecycle sketch follows this list
  3. Resource limits: give every component explicit CPU/memory limits
  4. Network optimization: consider a dedicated network for log traffic under heavy load
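For the date-based indices, index lifecycle management can expire old data automatically. A sketch of a minimal ILM policy (7-day retention is an assumption); to take effect it must also be referenced from the index template via the index.lifecycle.name setting:

# Delete docker log indices 7 days after creation
curl -X PUT "es01:9200/_ilm/policy/docker-logs-retention" \
  -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "7d",
        "actions": { "delete": {} }
      }
    }
  }
}'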

VI. Troubleshooting Commands

# Check Fluentd status
systemctl status td-agent

# List Kafka topics
kafka-topics --bootstrap-server kafka1:9092 --list

# Check ES cluster health
curl -X GET "es01:9200/_cluster/health?pretty"

# View Kibana logs
docker logs kibana

# Smoke-test the pipeline: the forward protocol is msgpack-based, so raw JSON
# over nc only proves the port is open; fluent-cat (bundled with td-agent,
# under /opt/td-agent/bin) injects a real event
echo '{"message":"test log"}' | fluent-cat docker.test

With the configuration above you get a highly available container log management system: it can absorb the log volume of a large containerized environment, and the pipeline keeps running when individual components fail.