ELK Stack Integration — Filebeat + Elasticsearch + Kibana
The ELK Stack is the industry-standard open-source solution for log collection, storage, and visualization. It enables real-time collection of millions of log entries from web servers like Nginx and Tomcat, and allows rapid root-cause analysis through powerful search capabilities. This chapter walks through setting up the ELK Stack with Docker Compose, collecting Nginx/Tomcat logs via Filebeat, and building a Kibana dashboard from end to end.
ELK Stack Components
The ELK Stack consists of four core components: the classic Elasticsearch, Logstash, and Kibana trio, plus the lightweight Filebeat shipper.
Elasticsearch is a distributed search and analytics engine. It provides a JSON-based RESTful API, stores log data in indices, and supports full-text search powered by Apache Lucene. Horizontal scaling through sharding and replicas allows it to handle massive log volumes with ease.
Logstash is a data collection, transformation, and forwarding pipeline. It ingests data from various input plugins (file, Kafka, Beats), parses it using Grok patterns and other filters, and outputs to Elasticsearch. Because it is resource-intensive, Filebeat is often used as a lightweight replacement for simple log forwarding.
Kibana is a web UI for visualizing Elasticsearch data. It offers Discover (log exploration), Visualize (charting), Dashboard (dashboard builder), and APM (performance monitoring), among many other features.
Filebeat is a lightweight log shipper installed on servers. It monitors log files in real time and forwards them directly to Logstash or Elasticsearch. Its minimal resource footprint makes it safe to deploy on production servers without impacting application performance.
┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐    ┌──────────────┐
│   Nginx/    │───▶│  Filebeat   │───▶│      Logstash       │───▶│ Elasticsearch│
│   Tomcat    │    │  (collect)  │    │ (parse/transform)   │    │   (store)    │
│ (generate)  │    └─────────────┘    └─────────────────────┘    └──────┬───────┘
└─────────────┘                                                         │
                                                                  ┌─────▼─────┐
                                                                  │  Kibana   │
                                                                  │(visualize)│
                                                                  └───────────┘
Setting Up ELK Stack with Docker Compose
The entire stack is defined in a single Docker Compose file. Versions and passwords are managed through a .env file.
mkdir -p ~/elk-stack/{logstash/pipeline,filebeat,nginx/logs,tomcat/logs}
cd ~/elk-stack
Create .env file:
# .env
ELK_VERSION=8.12.0
ELASTIC_PASSWORD=changeme123!
KIBANA_PASSWORD=changeme123!
ENCRYPTION_KEY=a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4
Write docker-compose.yml:
# docker-compose.yml
version: '3.8'

services:
  # ──────────────────────────────────────────────
  # Elasticsearch
  # ──────────────────────────────────────────────
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELK_VERSION}
    container_name: elasticsearch
    environment:
      - node.name=elasticsearch
      - cluster.name=elk-cluster
      - discovery.type=single-node
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=false
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es_data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    networks:
      - elk
    healthcheck:
      test: ["CMD-SHELL", "curl -s -u elastic:${ELASTIC_PASSWORD} http://localhost:9200/_cluster/health | grep -q '\"status\":\"green\"\\|\"status\":\"yellow\"'"]
      interval: 30s
      timeout: 10s
      retries: 5

  # ──────────────────────────────────────────────
  # Kibana
  # ──────────────────────────────────────────────
  kibana:
    image: docker.elastic.co/kibana/kibana:${ELK_VERSION}
    container_name: kibana
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - xpack.encryptedSavedObjects.encryptionKey=${ENCRYPTION_KEY}
    ports:
      - "5601:5601"
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy

  # ──────────────────────────────────────────────
  # Logstash
  # ──────────────────────────────────────────────
  logstash:
    image: docker.elastic.co/logstash/logstash:${ELK_VERSION}
    container_name: logstash
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - "LS_JAVA_OPTS=-Xms256m -Xmx256m"
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    ports:
      - "5044:5044"   # Beats input
      - "5000:5000"   # TCP input
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy

  # ──────────────────────────────────────────────
  # Filebeat
  # ──────────────────────────────────────────────
  filebeat:
    image: docker.elastic.co/beats/filebeat:${ELK_VERSION}
    container_name: filebeat
    user: root
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - ./nginx/logs:/var/log/nginx:ro
      - ./tomcat/logs:/var/log/tomcat:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - filebeat_data:/usr/share/filebeat/data
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy
    command: filebeat -e --strict.perms=false

volumes:
  es_data:
    driver: local
  filebeat_data:
    driver: local

networks:
  elk:
    driver: bridge
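The healthcheck above treats both green and yellow cluster status as healthy, which is appropriate for a single node where replica shards can never be assigned. A minimal Python sketch of the same decision, assuming only the standard `_cluster/health` response shape:

```python
import json

def is_cluster_usable(health_json: str) -> bool:
    """Mirror the docker-compose healthcheck: accept a green or yellow cluster."""
    return json.loads(health_json).get("status") in ("green", "yellow")

# A single-node cluster with zero replicas normally reports "green";
# unassigned replica shards would turn it "yellow" (still usable).
sample = '{"cluster_name": "elk-cluster", "status": "yellow"}'
print(is_cluster_usable(sample))  # → True
```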
Logstash Pipeline Configuration
Define Grok patterns to parse Nginx access logs and Tomcat logs.
# logstash/pipeline/nginx.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # ─── Nginx access log parsing ────────────────────────────────────────────
  if [fields][log_type] == "nginx_access" {
    grok {
      match => {
        "message" => '%{IPORHOST:client_ip} - %{DATA:user_name} \[%{HTTPDATE:time_local}\] "%{WORD:method} %{DATA:request_uri} HTTP/%{NUMBER:http_version}" %{NUMBER:status_code:int} %{NUMBER:body_bytes_sent:int} "%{DATA:http_referer}" "%{DATA:user_agent}"'
      }
      tag_on_failure => ["_grokparsefailure_nginx"]
    }

    # Parse timestamp
    date {
      match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }

    # Parse User-Agent string
    useragent {
      source => "user_agent"
      target => "ua"
    }

    # Enrich with GeoIP data; disable ECS mode so the classic field layout
    # (geoip.location, geoip.country_name, ...) matches the index template
    geoip {
      source => "client_ip"
      target => "geoip"
      ecs_compatibility => disabled
    }

    # Categorize HTTP status codes
    if [status_code] >= 500 {
      mutate { add_field => { "status_category" => "5xx_error" } }
    } else if [status_code] >= 400 {
      mutate { add_field => { "status_category" => "4xx_error" } }
    } else if [status_code] >= 300 {
      mutate { add_field => { "status_category" => "3xx_redirect" } }
    } else {
      mutate { add_field => { "status_category" => "2xx_success" } }
    }
  }

  # ─── Tomcat access log parsing ───────────────────────────────────────────
  if [fields][log_type] == "tomcat_access" {
    grok {
      match => {
        "message" => '%{IPORHOST:client_ip} - %{DATA:user_name} \[%{HTTPDATE:time_local}\] "%{WORD:method} %{DATA:request_uri} HTTP/%{NUMBER:http_version}" %{NUMBER:status_code:int} %{NUMBER:body_bytes_sent:int} %{NUMBER:response_time:int}'
      }
      tag_on_failure => ["_grokparsefailure_tomcat"]
    }

    date {
      match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }

    # Convert response time from ms to seconds
    if [response_time] {
      ruby {
        code => "event.set('response_time_sec', event.get('response_time').to_f / 1000)"
      }
    }
  }

  # ─── Tomcat error log parsing ────────────────────────────────────────────
  if [fields][log_type] == "tomcat_error" {
    grok {
      match => {
        "message" => '%{DATA:time_local} %{LOGLEVEL:log_level} \[%{DATA:thread}\] %{JAVACLASS:logger} %{GREEDYDATA:log_message}'
      }
      tag_on_failure => ["_grokparsefailure_tomcat_error"]
    }
  }

  # ─── Common: remove noisy fields ─────────────────────────────────────────
  mutate {
    remove_field => ["agent", "ecs", "input", "log", "host"]
  }
}

output {
  if [fields][log_type] == "nginx_access" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "nginx-access-%{+YYYY.MM.dd}"
      # The index template is registered manually (see the Elasticsearch
      # Index Template section), so keep Logstash from installing its own
      manage_template => false
    }
  } else if [fields][log_type] == "tomcat_access" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "tomcat-access-%{+YYYY.MM.dd}"
      manage_template => false
    }
  } else if [fields][log_type] == "tomcat_error" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "tomcat-error-%{+YYYY.MM.dd}"
      manage_template => false
    }
  }

  # Debug only — remove in production
  # stdout { codec => rubydebug }
}
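Before shipping real traffic, the Nginx Grok pattern can be sanity-checked offline. The sketch below is a rough Python-regex equivalent of that pattern plus the status-code bucketing from the filter block; it is an approximation for building intuition, not the actual Grok engine:

```python
import re

# Rough Python equivalent of the Nginx access-log Grok pattern above
NGINX_RE = re.compile(
    r'(?P<client_ip>\S+) - (?P<user_name>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request_uri>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status_code>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def categorize(status: int) -> str:
    """Same bucketing as the mutate/add_field chain in the filter block."""
    if status >= 500:
        return "5xx_error"
    if status >= 400:
        return "4xx_error"
    if status >= 300:
        return "3xx_redirect"
    return "2xx_success"

line = ('192.168.1.100 - - [31/Mar/2026:10:00:01 +0900] '
        '"GET /api/users HTTP/1.1" 200 1234 "-" "Mozilla/5.0"')
m = NGINX_RE.match(line)
event = m.groupdict()
event["status_category"] = categorize(int(event["status_code"]))
print(event["method"], event["status_code"], event["status_category"])  # → GET 200 2xx_success
```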
Filebeat Configuration
Configure Filebeat to monitor Nginx and Tomcat log files.
# filebeat/filebeat.yml
filebeat.inputs:
  # ─── Nginx access log ───────────────────────────────────────────────────
  - type: log
    id: nginx-access
    enabled: true
    paths:
      - /var/log/nginx/access.log
      - /var/log/nginx/*access*.log
    fields:
      log_type: nginx_access
      server: nginx
    fields_under_root: false
    multiline:
      type: pattern
      pattern: '^\d+\.\d+\.\d+\.\d+'
      negate: true
      match: after

  # ─── Nginx error log ────────────────────────────────────────────────────
  - type: log
    id: nginx-error
    enabled: true
    paths:
      - /var/log/nginx/error.log
    fields:
      log_type: nginx_error
      server: nginx
    fields_under_root: false

  # ─── Tomcat access log ──────────────────────────────────────────────────
  - type: log
    id: tomcat-access
    enabled: true
    paths:
      - /var/log/tomcat/localhost_access_log.*.txt
      - /var/log/tomcat/access_log.*
    fields:
      log_type: tomcat_access
      server: tomcat
    fields_under_root: false

  # ─── Tomcat error log (multiline for Java stack traces) ─────────────────
  - type: log
    id: tomcat-error
    enabled: true
    paths:
      - /var/log/tomcat/catalina.out
    fields:
      log_type: tomcat_error
      server: tomcat
    fields_under_root: false
    multiline:
      type: pattern
      # Lines not starting with a date are appended to the previous line
      pattern: '^\d{2}-[A-Za-z]{3}-\d{4}'
      negate: true
      match: after
      max_lines: 500
      timeout: 5s

# ─── Processors ──────────────────────────────────────────────────────────
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - drop_fields:
      fields: ["agent.ephemeral_id", "agent.hostname", "agent.id", "agent.version"]
      ignore_missing: true

# ─── Output: send to Logstash ────────────────────────────────────────────
output.logstash:
  hosts: ["logstash:5044"]
  bulk_max_size: 2048
  backoff.init: 1s
  backoff.max: 60s

# ─── Logging ─────────────────────────────────────────────────────────────
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

# ─── Monitoring ──────────────────────────────────────────────────────────
monitoring.enabled: false
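The multiline settings for catalina.out can be pictured as a small state machine: any line that does not start with a date is glued onto the previous event. A hedged Python sketch of that stitching logic (Filebeat's real implementation also handles max_lines, timeouts, and partial flushes):

```python
import re

# Date prefix used in catalina.out, e.g. "31-Mar-2026 10:00:00.123 SEVERE ..."
EVENT_START = re.compile(r'^\d{2}-[A-Za-z]{3}-\d{4}')

def stitch_multiline(lines):
    """Group continuation lines (e.g. Java stack traces) under the
    preceding event line, like the multiline settings above."""
    events, current = [], []
    for line in lines:
        if EVENT_START.match(line):   # negate+after: a matching line starts a new event
            if current:
                events.append("\n".join(current))
            current = [line]
        elif current:                 # continuation of the previous event
            current.append(line)
        else:                         # leading continuation with no parent line
            events.append(line)
    if current:
        events.append("\n".join(current))
    return events

raw = [
    "31-Mar-2026 10:00:00.123 SEVERE [main] org.example.App Boom",
    "java.lang.NullPointerException",
    "\tat org.example.App.main(App.java:42)",
    "31-Mar-2026 10:00:01.000 INFO [main] org.example.App Recovered",
]
print(len(stitch_multiline(raw)))  # → 2
```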
Elasticsearch Index Template
Register an index template that is automatically applied when a new index is created.
# Set kibana_system password (run once)
curl -X POST "http://localhost:9200/_security/user/kibana_system/_password" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{"password": "changeme123!"}'
# Register Nginx access log index template
curl -X PUT "http://localhost:9200/_index_template/nginx-access" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{
    "index_patterns": ["nginx-access-*"],
    "template": {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "index.lifecycle.name": "nginx-logs-policy"
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "client_ip": { "type": "ip" },
          "method": { "type": "keyword" },
          "request_uri": { "type": "keyword" },
          "status_code": { "type": "integer" },
          "status_category": { "type": "keyword" },
          "body_bytes_sent": { "type": "long" },
          "response_time": { "type": "float" },
          "user_agent": { "type": "text" },
          "geoip": {
            "properties": {
              "location": { "type": "geo_point" },
              "country_name": { "type": "keyword" },
              "city_name": { "type": "keyword" }
            }
          }
        }
      }
    },
    "priority": 200
  }'
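When a new daily index such as nginx-access-2026.03.31 is created, Elasticsearch applies the matching composable template with the highest priority. A simplified Python sketch of that selection rule (the second, catch-all template here is hypothetical, added only for illustration):

```python
from fnmatch import fnmatch

# Simplified view of composable-template selection: among templates whose
# index_patterns match the new index name, the highest priority wins.
templates = {
    "nginx-access": {"index_patterns": ["nginx-access-*"], "priority": 200},
    "logs-default": {"index_patterns": ["*"], "priority": 0},  # hypothetical
}

def matching_template(index_name):
    candidates = [
        (t["priority"], name)
        for name, t in templates.items()
        if any(fnmatch(index_name, p) for p in t["index_patterns"])
    ]
    return max(candidates)[1] if candidates else None

print(matching_template("nginx-access-2026.03.31"))  # → nginx-access
```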
Index Lifecycle Management (ILM)
Create an ILM policy that moves aging indices through the hot, warm, and cold phases and deletes them after 90 days. Because Logstash writes one date-based index per day rather than rolling over a write alias, the policy below is driven purely by index age (a rollover action would fail without a bootstrapped write alias, and the freeze action was removed in Elasticsearch 8.x).

curl -X PUT "http://localhost:9200/_ilm/policy/nginx-logs-policy" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "set_priority": { "priority": 100 }
          }
        },
        "warm": {
          "min_age": "7d",
          "actions": {
            "shrink": { "number_of_shards": 1 },
            "forcemerge": { "max_num_segments": 1 },
            "set_priority": { "priority": 50 }
          }
        },
        "cold": {
          "min_age": "30d",
          "actions": {
            "set_priority": { "priority": 0 },
            "readonly": {}
          }
        },
        "delete": {
          "min_age": "90d",
          "actions": { "delete": {} }
        }
      }
    }
  }'
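With daily indices, the phase an index sits in depends only on its age. A minimal sketch of the thresholds defined above (real ILM also waits for action completion, cluster state changes, and polling intervals):

```python
# Age thresholds (in days) from the nginx-logs-policy above,
# checked from oldest phase to newest.
PHASES = [("delete", 90), ("cold", 30), ("warm", 7), ("hot", 0)]

def phase_for_age(age_days: float) -> str:
    """Return the ILM phase a daily index of the given age belongs to."""
    for phase, min_age in PHASES:
        if age_days >= min_age:
            return phase
    return "hot"

for age in (0, 3, 10, 45, 120):
    print(age, "->", phase_for_age(age))  # e.g. 10 -> warm
```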
Starting and Initializing the Stack
cd ~/elk-stack
# Start the stack (initial startup takes a few minutes)
docker-compose up -d
# Follow logs
docker-compose logs -f elasticsearch
docker-compose logs -f logstash
docker-compose logs -f filebeat
# Verify Elasticsearch cluster health
curl -u elastic:changeme123! "http://localhost:9200/_cluster/health?pretty"
# List indices (after log ingestion begins)
curl -u elastic:changeme123! "http://localhost:9200/_cat/indices?v"
# Test Filebeat connectivity
docker-compose exec filebeat filebeat test output
docker-compose exec filebeat filebeat test config
Generating Test Logs
# Append sample Nginx access log entries
cat >> ~/elk-stack/nginx/logs/access.log << 'EOF'
192.168.1.100 - - [31/Mar/2026:10:00:01 +0900] "GET /api/users HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
192.168.1.101 - - [31/Mar/2026:10:00:02 +0900] "POST /api/login HTTP/1.1" 401 89 "-" "curl/7.68.0"
192.168.1.102 - - [31/Mar/2026:10:00:03 +0900] "GET /static/js/app.js HTTP/1.1" 304 0 "http://example.com/" "Chrome/120"
10.0.0.1 - admin [31/Mar/2026:10:00:04 +0900] "DELETE /api/admin/users/5 HTTP/1.1" 403 45 "-" "Python/requests"
EOF
# Confirm documents were indexed
curl -u elastic:changeme123! \
"http://localhost:9200/nginx-access-*/_count?pretty"
Kibana Dashboard Setup
Open Kibana at http://localhost:5601 and build a dashboard.
Step 1: Create a Data View
- Navigate to Stack Management → Data Views → Create data view
- Name: Nginx Access Logs
- Index pattern: nginx-access-*
- Timestamp field: @timestamp
Step 2: Explore Logs in Discover
- Open Kibana → Discover
- Select the new data view and adjust the time range (e.g., Last 24 hours)
- Use KQL queries to filter:
# Show only 5xx errors
status_code >= 500
# Requests from a specific IP
client_ip: "192.168.1.100"
# Failed POST requests
method: "POST" AND status_code >= 400
# Slow Tomcat responses, over 1 second (response_time is in milliseconds;
# query this in a tomcat-access-* data view)
response_time > 1000
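Under the hood, Kibana compiles KQL into Elasticsearch query DSL. A rough sketch of what one of the queries above becomes (the helper names are made up for illustration; real KQL translation handles far more syntax):

```python
def term(field, value):
    """Hypothetical helper: a KQL `field: "value"` clause."""
    return {"match": {field: value}}

def gte(field, value):
    """Hypothetical helper: a KQL `field >= value` clause."""
    return {"range": {field: {"gte": value}}}

# method: "POST" AND status_code >= 400
query = {"bool": {"filter": [term("method", "POST"), gte("status_code", 400)]}}
print(query)
```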
Step 3: Build Visualizations
# Bar Chart — request count by HTTP status category
- Aggregation: Count
- Bucket: Terms → status_category
- Quickly see the ratio of 2xx / 4xx / 5xx
# Line Chart — requests over time
- Y-axis: Count
- X-axis: Date histogram → @timestamp (Auto interval)
# Pie Chart — top URL paths
- Aggregation: Count
- Bucket: Terms → request_uri (Top 10; the field is already mapped as keyword by the index template)
Step 4: Assemble a Dashboard
- Kibana → Dashboard → Create dashboard
- Add panel → select the charts built above
- Arrange the layout and save as "Nginx Overview"
Resource Requirements and Operational Considerations
| Component | Minimum RAM | Recommended RAM | CPU |
|---|---|---|---|
| Elasticsearch | 1 GB | 4 GB+ | 2 cores+ |
| Logstash | 512 MB | 1 GB+ | 1 core+ |
| Kibana | 512 MB | 1 GB | 1 core |
| Filebeat | 50 MB | 100 MB | Very low |
Key operational tips:
- JVM heap sizing: give Elasticsearch no more than half of the machine's RAM, leaving the rest for the filesystem cache, and keep the heap under roughly 31 GB so compressed object pointers stay enabled. On a dedicated 8 GB server, -Xms4g -Xmx4g is a sensible setting.
- Disk space: maintain at least 30× the daily log volume as free disk space, and always configure an ILM policy to prevent disk exhaustion.
- Shard sizing: keep each shard between 10 GB and 50 GB. Too many small shards degrade cluster performance.
- Security: in production, always enable TLS, change the default elastic password, and configure role-based access control (RBAC).
- Backups: use the snapshot API or Kibana's Snapshot and Restore feature to schedule regular backups.
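The heap-sizing rule above can be written down as a tiny calculator. This is a common rule of thumb, not an official Elastic formula:

```python
def recommended_heap_gb(total_ram_gb: float) -> float:
    """Rule of thumb: at most half of RAM for the Elasticsearch heap,
    capped below ~31 GB so the JVM keeps compressed object pointers."""
    return min(total_ram_gb / 2, 31.0)

for ram in (4, 8, 64, 128):
    heap = recommended_heap_gb(ram)
    print(f"{ram} GB RAM -> -Xms{heap:g}g -Xmx{heap:g}g")  # e.g. 8 GB RAM -> -Xms4g -Xmx4g
```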