grep - text search
grep "ERROR" /var/log/syslog
grep -i "warning" /var/log/messages
grep -A 3 -B 2 "critical" application.log # show 2 lines before and 3 lines after each match
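The context flags can be verified on inline sample data; the log lines below are made up for illustration:

```shell
# Hypothetical inline log lines; -B 1 prints one line of context
# before the match and -A 1 prints one line after it.
printf 'boot ok\nservice critical failure\nretrying\n' \
  | grep -B 1 -A 1 "critical"
```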
awk - powerful text-analysis tool
awk '{print $1}' access.log | sort | uniq -c | sort -nr # count requests per IP
awk -F':' '{print $5}' /etc/passwd | sort | uniq # extract user full names
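A quick way to see the field splitting in action is to feed awk a single passwd-style record inline (the record below is hypothetical):

```shell
# Hypothetical passwd-style record; -F':' splits on colons and
# field 5 is the GECOS (full name) field.
echo 'alice:x:1000:1000:Alice Liddell:/home/alice:/bin/bash' \
  | awk -F':' '{print $5}'
# → Alice Liddell
```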
sed - stream editor
sed -n '10,20p' large.log # print lines 10-20
sed '/ERROR/!d' app.log # keep only lines containing ERROR
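The `!d` idiom is easy to check on made-up input before pointing it at a real log:

```shell
# Inline sample lines (made up); '!d' deletes every line that
# does NOT match ERROR, leaving only the matching lines.
printf 'INFO start\nERROR disk full\nINFO done\n' | sed '/ERROR/!d'
# → ERROR disk full
```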
logrotate - log rotation tool
# Example configuration (/etc/logrotate.d/yourapp)
/var/log/yourapp/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 640 root adm
    sharedscripts
    postrotate
        /usr/bin/systemctl reload yourapp > /dev/null
    endscript
}
journalctl - viewing systemd logs
journalctl -u nginx --since "2023-01-01" --until "2023-01-02"
journalctl -p err -b # errors from the current boot
journalctl -f # follow logs in real time
ELK installation and configuration:
# Install Java (required by ELK)
sudo apt install openjdk-11-jdk
# Download the ELK components
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb
wget https://artifacts.elastic.co/downloads/kibana/kibana-7.10.2-amd64.deb
wget https://artifacts.elastic.co/downloads/logstash/logstash-7.10.2.deb
Logstash configuration example (/etc/logstash/conf.d/apache.conf):
input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  stdout { codec => rubydebug }
}
Graylog installation:
# Ubuntu installation example
wget https://packages.graylog2.org/repo/packages/graylog-4.3-repository_latest.deb
sudo dpkg -i graylog-4.3-repository_latest.deb
sudo apt-get update && sudo apt-get install graylog-server graylog-enterprise-plugins
Configuring inputs:
1. Configure a Syslog/UDP input via the web interface (default http://your-server:9000)
2. Configure Extractors to parse log fields
Prometheus installation:
wget https://github.com/prometheus/prometheus/releases/download/v2.30.3/prometheus-2.30.3.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
cd prometheus-*
Configuration example (prometheus.yml):
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100'] # Node Exporter
Grafana installation:
sudo apt-get install -y apt-transport-https
sudo apt-get install -y software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install grafana
gnuplot
# Example: plot CSV data (gnuplot splits on whitespace by default, so set the separator)
echo 'set datafile separator ","; plot "data.csv" using 1:2 with lines' | gnuplot -persist
termgraph
# Install
pip install termgraph
# Usage (printf interprets \n portably, unlike plain echo)
printf 'Jan 200\nFeb 300\nMar 400\n' | termgraph --title "Monthly Sales"
Grafana dashboard configuration
Kibana visualizations
# Top client IPs by request count
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -20
# Count HTTP status codes
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -nr
# Real-time analysis with GoAccess
goaccess /var/log/nginx/access.log --log-format=COMBINED --real-time-html --output=report.html
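The count-and-rank pipeline above can be dry-run on inline sample requests (the IPs below are made up) to see how each stage transforms the data:

```shell
# Field 1 is the client IP; sort groups duplicates,
# uniq -c counts them, and sort -nr ranks by count.
printf '1.1.1.1 GET /\n2.2.2.2 GET /\n1.1.1.1 GET /x\n' \
  | awk '{print $1}' | sort | uniq -c | sort -nr
```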
# Using vmstat and gnuplot (strip the two header lines; 'id' is column 15 in default vmstat output)
vmstat 1 10 | tail -n +3 > vmstat.out
gnuplot -e "set terminal png; set output 'vmstat.png'; plot 'vmstat.out' using 15 with lines title 'CPU idle'"
import pandas as pd
import matplotlib.pyplot as plt
# Load the log
logs = pd.read_csv('app.log', sep='\t', names=['timestamp', 'level', 'message'])
# Analyze the distribution of log levels
level_counts = logs['level'].value_counts()
level_counts.plot(kind='bar')
plt.title('Log Level Distribution')
plt.savefig('log_levels.png')
Log rotation optimization
ELK performance tuning
# Elasticsearch JVM settings (/etc/elasticsearch/jvm.options)
-Xms4g
-Xmx4g
# Logstash pipeline tuning (logstash.yml)
pipeline.workers: 4
pipeline.batch.size: 125
Real-time analysis optimization
inotifywait -m -e modify /var/log/app.log | while read -r _; do
    tail -n 1 /var/log/app.log | grep "ERROR" && notify-send "Error detected"
done
Ensure log file permissions are correct
chmod 640 /var/log/sensitive.log
chown root:adm /var/log/sensitive.log
Filtering sensitive information
# Logstash filter example
filter {
  mutate {
    gsub => [
      "message", "(password=)[^&\s]+", "\1[REDACTED]",
      "message", "(credit_card=)\d+", "\1[REDACTED]"
    ]
  }
}
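Because gsub uses ordinary regular expressions, the first redaction rule can be rehearsed outside Logstash with sed; the query string below is made up:

```shell
# sed -E equivalent of the first gsub rule above; \1 keeps the
# captured "password=" key and replaces only the value.
echo 'login?user=bob&password=hunter2&x=1' \
  | sed -E 's/(password=)[^&[:space:]]+/\1[REDACTED]/'
# → login?user=bob&password=[REDACTED]&x=1
```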
Encrypted log transport
# Filebeat SSL configuration
output.logstash:
  hosts: ["logstash.example.com:5044"]
  ssl.certificate_authorities: ["/etc/filebeat/logstash.crt"]
By combining the tools and techniques above, you can build a powerful Linux log analysis and visualization system, capable of everything from basic monitoring to complex business analytics.