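Collecting Metrics with Prometheus + Nginx Exporter
Prometheus is an open-source monitoring system that collects metrics from target servers using a pull model. Once the internal state of Nginx and Tomcat is exposed as Prometheus metrics, you can connect visualization tools such as Grafana to build real-time dashboards and alerting. This chapter walks step by step from enabling Nginx stub_status to assembling the full monitoring stack with Docker Compose.
Enabling the Nginx stub_status module
Nginx exposes basic statistics, such as the current number of connections and the number of requests handled, over an HTTP endpoint through ngx_http_stub_status_module.
Checking whether the module is included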
nginx -V 2>&1 | grep stub_status
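# If the output contains --with-http_stub_status_module, the module is available
Configuring the /nginx_status endpoint
# /etc/nginx/conf.d/status.conf
server {
    listen 127.0.0.1:8080;  # Reachable from the local host only (security)
    # A loopback listener cannot be reached from other containers. To scrape over a Docker network,
    # listen on a non-loopback address (e.g. "listen 8080;") and rely on the allow/deny rules below.
    server_name localhost;
    location /nginx_status {
        stub_status;            # Versions before 1.7.5 require an argument: "stub_status on;"
        access_log off;
        allow 127.0.0.1;
        allow 172.16.0.0/12;    # Allow Docker internal networks
        deny all;
    }
}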
nginx -t && systemctl reload nginx
# Verify that it works
curl http://127.0.0.1:8080/nginx_status
Sample response:
Active connections: 42
server accepts handled requests
1024 1024 5678
Reading: 2 Writing: 5 Waiting: 35
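| Field | Description |
|---|---|
| Active connections | Current number of active client connections (Reading + Writing + Waiting) |
| accepts | Total number of accepted TCP connections |
| handled | Total number of handled connections (equal to accepts unless connections were dropped) |
| requests | Total number of HTTP requests handled |
| Reading | Connections where Nginx is reading the request header |
| Writing | Connections where Nginx is sending the response |
| Waiting | Idle keep-alive connections waiting for a request |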
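To make the field layout concrete, here is a minimal sketch that splits the stub_status text into key=value pairs with awk. The function name parse_stub_status is hypothetical, and the input is the sample response shown earlier; in a real setup the exporter described in the next section does this conversion for you.

```shell
#!/bin/sh
# parse_stub_status: hypothetical helper that reads stub_status text on stdin
# and prints one key=value pair per line.
parse_stub_status() {
  awk '
    /^Active connections:/      { print "active=" $3 }
    /^ *[0-9]+ +[0-9]+ +[0-9]+/ { print "accepts=" $1; print "handled=" $2; print "requests=" $3 }
    /^Reading:/                 { print "reading=" $2; print "writing=" $4; print "waiting=" $6 }
  '
}

# Usage with the sample response. Against a live server you would pipe in
# the output of: curl -s http://127.0.0.1:8080/nginx_status
parse_stub_status <<'EOF'
Active connections: 42
server accepts handled requests
 1024 1024 5678
Reading: 2 Writing: 5 Waiting: 35
EOF
```

The second pattern matches the unlabeled numeric line, which is why the order of accepts, handled, and requests in the header line matters.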
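Installing and configuring nginx-prometheus-exporter
This is the official exporter that converts the text output of the Nginx stub_status page into the Prometheus metric format.
Installing the binary directly
# Check the latest release: https://github.com/nginxinc/nginx-prometheus-exporter/releases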
VERSION="1.3.0"
ARCH="amd64"
wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v${VERSION}/nginx-prometheus-exporter_${VERSION}_linux_${ARCH}.tar.gz
tar -xzf nginx-prometheus-exporter_${VERSION}_linux_${ARCH}.tar.gz
sudo mv nginx-prometheus-exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/nginx-prometheus-exporter
# Test run (exporter 1.x expects long flags with two dashes)
nginx-prometheus-exporter --nginx.scrape-uri=http://127.0.0.1:8080/nginx_status
Registering it as a systemd service
# /etc/systemd/system/nginx-prometheus-exporter.service
[Unit]
Description=Nginx Prometheus Exporter
After=network.target
[Service]
User=nobody
ExecStart=/usr/local/bin/nginx-prometheus-exporter \
    --nginx.scrape-uri=http://127.0.0.1:8080/nginx_status \
    --web.listen-address=:9113 \
    --web.telemetry-path=/metrics
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable --now nginx-prometheus-exporter
# Verify that the metrics are exposed
curl http://localhost:9113/metrics
Sample output:
# HELP nginx_connections_accepted Accepted client connections
# TYPE nginx_connections_accepted counter
nginx_connections_accepted 1024
# HELP nginx_connections_active Active client connections
# TYPE nginx_connections_active gauge
nginx_connections_active 42
# HELP nginx_connections_handled Handled client connections
# TYPE nginx_connections_handled counter
nginx_connections_handled 1024
# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 5678
# HELP nginx_connections_reading Connections where Nginx is reading the request header
# TYPE nginx_connections_reading gauge
nginx_connections_reading 2
# HELP nginx_connections_waiting Idle client connections
# TYPE nginx_connections_waiting gauge
nginx_connections_waiting 35
# HELP nginx_connections_writing Connections where Nginx is writing the response
# TYPE nginx_connections_writing gauge
nginx_connections_writing 5
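The exposition format is plain text, so individual values are easy to pull out in a script. The sketch below uses a hypothetical helper, metric_value, to read two counters from an abbreviated copy of the sample output and derive the number of dropped connections (accepted minus handled). It only handles metrics without labels, which is all the stub_status exporter emits for these series.

```shell
#!/bin/sh
# metric_value: hypothetical helper; prints the value of the unlabeled metric
# named $1 from Prometheus text-format input on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Abbreviated sample. HELP/TYPE lines are skipped automatically because
# their first field is "#", never a metric name.
sample='# TYPE nginx_connections_accepted counter
nginx_connections_accepted 1024
# TYPE nginx_connections_handled counter
nginx_connections_handled 1024'

accepted=$(printf '%s\n' "$sample" | metric_value nginx_connections_accepted)
handled=$(printf '%s\n' "$sample" | metric_value nginx_connections_handled)

# Connections that were accepted but never handled were dropped by Nginx.
echo "dropped=$((accepted - handled))"   # prints dropped=0 for this sample
```

Against a running exporter, the same helper could be fed from `curl -s http://localhost:9113/metrics`.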
Installing Prometheus and configuring scraping
Installing the Prometheus binary
VERSION="2.51.2"
wget https://github.com/prometheus/prometheus/releases/download/v${VERSION}/prometheus-${VERSION}.linux-amd64.tar.gz
tar -xzf prometheus-${VERSION}.linux-amd64.tar.gz
sudo mv prometheus-${VERSION}.linux-amd64/{prometheus,promtool} /usr/local/bin/
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo mv prometheus-${VERSION}.linux-amd64/{consoles,console_libraries} /etc/prometheus/
Configuring prometheus.yml
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s      # Default scrape interval
  evaluation_interval: 15s  # Rule evaluation interval
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093   # Use localhost:9093 if Alertmanager runs on this host
rule_files:
  - /etc/prometheus/rules/*.yml
scrape_configs:
  # Prometheus's own metrics
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Nginx Exporter
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']
    relabel_configs:
      # Strip the port so that the instance label holds only the host name
      - source_labels: [__address__]
        target_label: instance
        regex: '([^:]+).*'
        replacement: '${1}'
  # Tomcat JMX Exporter (see the section below)
  - job_name: 'tomcat'
    static_configs:
      - targets: ['localhost:9404']
    metrics_path: /metrics
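The relabel rule in the nginx job rewrites the instance label from host:port to just the host. As a rough illustration, the sketch below reproduces that regex with sed; strip_port is a hypothetical helper, and the explicit ^ and $ stand in for the fact that Prometheus anchors relabel regexes at both ends.

```shell
#!/bin/sh
# strip_port: hypothetical helper that mimics the relabel rule
#   regex: '([^:]+).*'   replacement: '${1}'
# Prometheus anchors relabel regexes at both ends, so ^ and $ are added here.
strip_port() {
  printf '%s\n' "$1" | sed -E 's/^([^:]+).*$/\1/'
}

strip_port 'localhost:9113'        # prints localhost
strip_port 'nginx-exporter:9113'   # prints nginx-exporter
```

One consequence of this rule is that two exporters on the same host but different ports would end up with the same instance value, so it fits best when each host runs a single exporter per job.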
Registering the systemd service
# /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
After=network.target
[Service]
User=prometheus
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/data \
--storage.tsdb.retention.time=30d \
--web.listen-address=:9090 \
--web.enable-lifecycle
Restart=on-failure
[Install]
WantedBy=multi-user.target
useradd --no-create-home --shell /bin/false prometheus
chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus
systemctl daemon-reload
systemctl enable --now prometheus
# Check the web UI: http://localhost:9090
Integrating the Tomcat JMX Exporter
Tomcat exposes internal state such as JVM heap memory, thread pools, and request throughput through JMX (Java Management Extensions). With jmx_prometheus_javaagent, that state can be converted into Prometheus metrics.
Downloading the javaagent JAR
VERSION="0.20.0"
wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/${VERSION}/jmx_prometheus_javaagent-${VERSION}.jar \
-O /opt/tomcat/lib/jmx_prometheus_javaagent.jar
JMX Exporter configuration file
# /opt/tomcat/conf/jmx-exporter.yml
---
lowercaseOutputLabelNames: true
lowercaseOutputName: true   # Metric names are lowercased after the rules below are applied
whitelistObjectNames:
  - "Catalina:type=GlobalRequestProcessor,name=*"
  - "Catalina:type=ThreadPool,name=*"
  - "java.lang:type=Memory"
  - "java.lang:type=GarbageCollector,name=*"
  - "java.lang:type=Threading"
  - "java.lang:type=ClassLoading"
  - "java.lang:type=OperatingSystem"
rules:
  # Tomcat request-processing metrics
  - pattern: 'Catalina<type=GlobalRequestProcessor, name="(.+)"><>(\w+)'
    name: tomcat_$2_total
    labels:
      connector: "$1"
    help: Tomcat global request processor metric $2
    type: COUNTER
  # Tomcat thread pool metrics
  - pattern: 'Catalina<type=ThreadPool, name="(.+)"><>(\w+)'
    name: tomcat_threadpool_$2
    labels:
      connector: "$1"
    help: Tomcat thread pool metric $2
    type: GAUGE
  # JVM memory
  - pattern: "java.lang<type=Memory><HeapMemoryUsage>(\\w+)"
    name: jvm_memory_heap_$1_bytes
    help: JVM heap memory $1
  # GC statistics (exported as jvm_gc_collectioncount_total and jvm_gc_collectiontime_total)
  - pattern: "java.lang<type=GarbageCollector, name=(.+)><>(CollectionCount|CollectionTime)"
    name: jvm_gc_$2_total
    labels:
      gc: "$1"
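Because lowercaseOutputName is enabled, the metric names you query are not spelled the way the JMX attributes are. The sketch below mimics the GC rule's name template followed by lowercasing; gc_metric_name is a hypothetical helper used only to show the resulting names.

```shell
#!/bin/sh
# gc_metric_name: hypothetical helper that mimics the GC rule:
# name template "jvm_gc_$2_total", then lowercaseOutputName: true.
gc_metric_name() {
  printf 'jvm_gc_%s_total\n' "$1" | tr '[:upper:]' '[:lower:]'
}

gc_metric_name CollectionCount   # prints jvm_gc_collectioncount_total
gc_metric_name CollectionTime    # prints jvm_gc_collectiontime_total
```

The same applies to the other rules: the ThreadPool attribute currentThreadsBusy, for example, would surface as tomcat_threadpool_currentthreadsbusy.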
Adding the javaagent to the Tomcat JVM options
# /opt/tomcat/bin/setenv.sh
CATALINA_OPTS="$CATALINA_OPTS -javaagent:/opt/tomcat/lib/jmx_prometheus_javaagent.jar=9404:/opt/tomcat/conf/jmx-exporter.yml"
systemctl restart tomcat
# Check the metrics
curl -s http://localhost:9404/metrics | head -30
Building the full monitoring stack with Docker Compose
# docker-compose.yml
version: '3.8'
services:
  # Nginx web server
  nginx:
    image: nginx:1.25-alpine
    container_name: nginx
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      # status.conf must listen on a non-loopback address (e.g. "listen 8080;")
      # so that nginx-exporter can reach http://nginx:8080/nginx_status
      - ./nginx/status.conf:/etc/nginx/conf.d/status.conf:ro
    networks:
      - monitoring
  # Nginx Prometheus Exporter
  nginx-exporter:
    image: nginx/nginx-prometheus-exporter:1.3.0
    container_name: nginx-exporter
    command:
      - --nginx.scrape-uri=http://nginx:8080/nginx_status
    ports:
      - "9113:9113"
    depends_on:
      - nginx
    networks:
      - monitoring
  # Prometheus
  prometheus:
    image: prom/prometheus:v2.51.2
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./prometheus/rules:/etc/prometheus/rules:ro
      - prometheus-data:/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.time=30d
      - --web.enable-lifecycle
    networks:
      - monitoring
  # Alertmanager
  alertmanager:
    image: prom/alertmanager:v0.27.0
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
      - alertmanager-data:/alertmanager
    networks:
      - monitoring
  # Grafana
  grafana:
    image: grafana/grafana:10.4.2
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin123   # Example value only; change it before exposing Grafana
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning:ro
    depends_on:
      - prometheus
    networks:
      - monitoring
volumes:
  prometheus-data:
  alertmanager-data:
  grafana-data:
networks:
  monitoring:
    driver: bridge
# prometheus/prometheus.yml (Docker Compose environment)
global:
  scrape_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
rule_files:
  - /etc/prometheus/rules/*.yml
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx-exporter:9113']
  # The compose file above does not define a tomcat service; this target stays
  # down until a Tomcat container named tomcat joins the monitoring network.
  - job_name: 'tomcat'
    static_configs:
      - targets: ['tomcat:9404']
Run:
docker compose up -d
docker compose ps
Configuring Alertmanager notifications
Slack and email notification settings
# alertmanager/alertmanager.yml
global:
  slack_api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'your-email@gmail.com'
  smtp_auth_password: 'your-app-password'
route:
  group_by: ['alertname', 'job']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'default'
  routes:
    - match:
        severity: critical
      receiver: 'critical-alerts'
    - match:
        severity: warning
      receiver: 'slack-warnings'
receivers:
  - name: 'default'
    slack_configs:
      - channel: '#alerts'
        title: '[{{ .Status | toUpper }}] {{ .CommonAnnotations.summary }}'
        # Double quotes so that \n is a real newline; in single quotes YAML keeps it as a literal backslash-n
        text: "{{ range .Alerts }}*Alert:* {{ .Annotations.description }}\n*Labels:* {{ range .Labels.SortedPairs }}{{ .Name }}={{ .Value }} {{ end }}{{ end }}"
  - name: 'critical-alerts'
    slack_configs:
      - channel: '#critical-alerts'
        title: '🚨 CRITICAL: {{ .CommonAnnotations.summary }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
    email_configs:
      - to: 'ops-team@example.com'
        headers:
          Subject: 'CRITICAL Alert: {{ .CommonAnnotations.summary }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
  - name: 'slack-warnings'
    slack_configs:
      - channel: '#alerts'
        title: '⚠️ WARNING: {{ .CommonAnnotations.summary }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
Prometheus alerting rules file
# prometheus/rules/nginx-alerts.yml
groups:
  - name: nginx_alerts
    rules:
      # High error rate
      # Note: the stub_status-based exporter exposes nginx_http_requests_total without a status label,
      # so this rule only matches if requests are exported with a status label by another source.
      - alert: NginxHighErrorRate
        expr: sum by (instance) (rate(nginx_http_requests_total{status=~"5.."}[5m])) / sum by (instance) (rate(nginx_http_requests_total[5m])) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Nginx 5xx error rate above 5%"
          description: "{{ $labels.instance }}: 5xx error rate {{ $value | humanizePercentage }}"
      # Active connections above threshold
      - alert: NginxHighConnections
        expr: nginx_connections_active > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Nginx active connections above 1000"
          description: "{{ $labels.instance }}: currently {{ $value }} connections"
      # Nginx down
      - alert: NginxDown
        expr: up{job="nginx"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Nginx exporter not responding"
          description: "{{ $labels.instance }}: Nginx server down or exporter failure"
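The error-rate condition is a plain ratio compared against 0.05. As a worked example of that arithmetic, the sketch below uses a hypothetical helper, error_ratio_exceeds, with made-up per-second rates; it does not model the `for: 2m` hold period, which additionally requires the condition to stay true for two minutes before the alert fires.

```shell
#!/bin/sh
# error_ratio_exceeds: hypothetical helper; succeeds (exit 0) when
# errors/total is strictly greater than the limit, mirroring "> 0.05".
# A total of zero never exceeds the limit (and avoids dividing by zero).
error_ratio_exceeds() {
  awk -v e="$1" -v t="$2" -v limit="$3" \
    'BEGIN { exit !(t > 0 && e / t > limit) }'
}

# Hypothetical rates: 6 5xx responses per second out of 100 requests per second.
if error_ratio_exceeds 6 100 0.05; then
  echo "condition true"    # 6 / 100 = 0.06, which is above 0.05
else
  echo "condition false"
fi
```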
Basic PromQL query examples
These queries can be used in the Prometheus web UI (http://localhost:9090) or in Grafana.
# 1. Nginx requests per second (RPS)
rate(nginx_http_requests_total[5m])
# 2. Total increase in requests over the last hour
increase(nginx_http_requests_total[1h])
# 3. Current number of active Nginx connections
nginx_connections_active
# 4. Share of connections in the Waiting state (%)
nginx_connections_waiting / nginx_connections_active * 100
# 5. Tomcat JVM heap usage (%)
jvm_memory_heap_used_bytes / jvm_memory_heap_max_bytes * 100
# 6. Number of busy Tomcat threads
tomcat_threadpool_currentthreadsbusy{connector="http-nio-8080"}
# 7. Tomcat request rate (RPS)
rate(tomcat_requestcount_total[5m])
# 8. 95th-percentile response time (requires a histogram metric, which the stub_status exporter does not provide)
histogram_quantile(0.95, sum by (le) (rate(nginx_request_duration_seconds_bucket[5m])))
# 9. Nginx connection accept rate (per second)
rate(nginx_connections_accepted[1m])
# 10. GC run frequency (per second); the name is lowercase because lowercaseOutputName is enabled
rate(jvm_gc_collectioncount_total[5m])
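Most of the queries above rely on rate(), which turns an ever-growing counter into a per-second value. The sketch below shows the core arithmetic with a hypothetical helper, simple_rate, and made-up samples. It is a simplification: PromQL's rate() also compensates for counter resets and extrapolates to the edges of the window.

```shell
#!/bin/sh
# simple_rate: hypothetical helper; average per-second increase between two
# counter samples taken $3 seconds apart. Unlike PromQL rate(), it does not
# handle counter resets or extrapolate to the window boundaries.
simple_rate() {
  awk -v first="$1" -v last="$2" -v secs="$3" \
    'BEGIN { printf "%.2f\n", (last - first) / secs }'
}

# Hypothetical samples: nginx_http_requests_total went from 5678 to 6578
# over a 5-minute (300 s) window.
simple_rate 5678 6578 300   # prints 3.00
```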
Add these queries to Grafana dashboard panels to monitor system state in real time. The next chapter covers Grafana dashboard configuration in detail.