ELK Stack Integration — Filebeat + Elasticsearch + Kibana
The ELK Stack is the industry-standard open-source solution for log collection, storage, and visualization. It enables real-time collection of millions of log entries from web servers like Nginx and Tomcat, and allows rapid root-cause analysis through powerful search capabilities. This chapter walks through setting up the ELK Stack with Docker Compose, collecting Nginx/Tomcat logs via Filebeat, and building a Kibana dashboard from end to end.
ELK Stack Components
The ELK Stack consists of four core components: the classic Elasticsearch, Logstash, and Kibana trio, plus the lightweight Filebeat shipper.
Elasticsearch is a distributed search and analytics engine. It provides a JSON-based RESTful API, stores log data in indices, and supports full-text search powered by Apache Lucene. Horizontal scaling through sharding and replicas allows it to handle massive log volumes with ease.
Logstash is a data collection, transformation, and forwarding pipeline. It ingests data from various input plugins (file, Kafka, Beats), parses it using Grok patterns and other filters, and outputs to Elasticsearch. Because it is resource-intensive, Filebeat is often used as a lightweight replacement for simple log forwarding.
Kibana is a web UI for visualizing Elasticsearch data. It offers Discover (log exploration), Visualize (charting), Dashboard (dashboard builder), and APM (performance monitoring), among many other features.
Filebeat is a lightweight log shipper installed on servers. It monitors log files in real time and forwards them directly to Logstash or Elasticsearch. Its minimal resource footprint makes it safe to deploy on production servers without impacting application performance.
┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐    ┌──────────────┐
│   Nginx/    │───▶│  Filebeat   │───▶│      Logstash       │───▶│ Elasticsearch│
│   Tomcat    │    │  (collect)  │    │ (parse/transform)   │    │   (store)    │
│ (generate)  │    └─────────────┘    └─────────────────────┘    └──────┬───────┘
└─────────────┘                                                         │
                                                                  ┌─────▼─────┐
                                                                  │  Kibana   │
                                                                  │(visualize)│
                                                                  └───────────┘
Setting Up ELK Stack with Docker Compose
The entire stack is defined in a single Docker Compose file. Versions and passwords are managed through a .env file.
mkdir -p ~/elk-stack/{logstash/pipeline,filebeat,nginx/logs,tomcat/logs}
cd ~/elk-stack
Create .env file:
# .env
ELK_VERSION=8.12.0
ELASTIC_PASSWORD=changeme123!
KIBANA_PASSWORD=changeme123!
ENCRYPTION_KEY=a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4
Write docker-compose.yml:
# docker-compose.yml
version: '3.8'

services:
  # ──────────────────────────────────────────────
  # Elasticsearch
  # ──────────────────────────────────────────────
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELK_VERSION}
    container_name: elasticsearch
    environment:
      - node.name=elasticsearch
      - cluster.name=elk-cluster
      - discovery.type=single-node
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=false
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - es_data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    networks:
      - elk
    healthcheck:
      test: ["CMD-SHELL", "curl -s -u elastic:${ELASTIC_PASSWORD} http://localhost:9200/_cluster/health | grep -q '\"status\":\"green\"\\|\"status\":\"yellow\"'"]
      interval: 30s
      timeout: 10s
      retries: 5

  # ──────────────────────────────────────────────
  # Kibana
  # ──────────────────────────────────────────────
  kibana:
    image: docker.elastic.co/kibana/kibana:${ELK_VERSION}
    container_name: kibana
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - xpack.encryptedSavedObjects.encryptionKey=${ENCRYPTION_KEY}
    ports:
      - "5601:5601"
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy

  # ──────────────────────────────────────────────
  # Logstash
  # ──────────────────────────────────────────────
  logstash:
    image: docker.elastic.co/logstash/logstash:${ELK_VERSION}
    container_name: logstash
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - "LS_JAVA_OPTS=-Xms256m -Xmx256m"
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    ports:
      - "5044:5044"   # Beats input
      - "5000:5000"   # TCP input
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy

  # ──────────────────────────────────────────────
  # Filebeat
  # ──────────────────────────────────────────────
  filebeat:
    image: docker.elastic.co/beats/filebeat:${ELK_VERSION}
    container_name: filebeat
    user: root
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - ./nginx/logs:/var/log/nginx:ro
      - ./tomcat/logs:/var/log/tomcat:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - filebeat_data:/usr/share/filebeat/data
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy
    command: filebeat -e --strict.perms=false

volumes:
  es_data:
    driver: local
  filebeat_data:
    driver: local

networks:
  elk:
    driver: bridge
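The healthcheck above treats both green and yellow cluster status as healthy, which is appropriate for a single node where replica shards can never be assigned. A minimal Python sketch of the same decision, assuming only the standard `_cluster/health` response shape:

```python
import json

def is_cluster_usable(health_json: str) -> bool:
    """Mirror the docker-compose healthcheck: accept a green or yellow cluster."""
    return json.loads(health_json).get("status") in ("green", "yellow")

# A single-node cluster with zero replicas normally reports "green";
# unassigned replica shards would turn it "yellow" (still usable).
sample = '{"cluster_name": "elk-cluster", "status": "yellow"}'
print(is_cluster_usable(sample))  # → True
```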
Logstash Pipeline Configuration
Define Grok patterns to parse Nginx access logs and Tomcat logs.
# logstash/pipeline/nginx.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # ─── Nginx access log parsing ────────────────────────────────────────────
  if [fields][log_type] == "nginx_access" {
    grok {
      match => {
        "message" => '%{IPORHOST:client_ip} - %{DATA:user_name} \[%{HTTPDATE:time_local}\] "%{WORD:method} %{DATA:request_uri} HTTP/%{NUMBER:http_version}" %{NUMBER:status_code:int} %{NUMBER:body_bytes_sent:int} "%{DATA:http_referer}" "%{DATA:user_agent}"'
      }
      tag_on_failure => ["_grokparsefailure_nginx"]
    }

    # Parse timestamp
    date {
      match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }

    # Parse User-Agent string
    useragent {
      source => "user_agent"
      target => "ua"
    }

    # Enrich with GeoIP data; disable ECS mode so the classic field layout
    # (geoip.location, geoip.country_name, ...) matches the index template
    geoip {
      source => "client_ip"
      target => "geoip"
      ecs_compatibility => disabled
    }

    # Categorize HTTP status codes
    if [status_code] >= 500 {
      mutate { add_field => { "status_category" => "5xx_error" } }
    } else if [status_code] >= 400 {
      mutate { add_field => { "status_category" => "4xx_error" } }
    } else if [status_code] >= 300 {
      mutate { add_field => { "status_category" => "3xx_redirect" } }
    } else {
      mutate { add_field => { "status_category" => "2xx_success" } }
    }
  }

  # ─── Tomcat access log parsing ───────────────────────────────────────────
  if [fields][log_type] == "tomcat_access" {
    grok {
      match => {
        "message" => '%{IPORHOST:client_ip} - %{DATA:user_name} \[%{HTTPDATE:time_local}\] "%{WORD:method} %{DATA:request_uri} HTTP/%{NUMBER:http_version}" %{NUMBER:status_code:int} %{NUMBER:body_bytes_sent:int} %{NUMBER:response_time:int}'
      }
      tag_on_failure => ["_grokparsefailure_tomcat"]
    }

    date {
      match => ["time_local", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }

    # Convert response time from ms to seconds
    if [response_time] {
      ruby {
        code => "event.set('response_time_sec', event.get('response_time').to_f / 1000)"
      }
    }
  }

  # ─── Tomcat error log parsing ────────────────────────────────────────────
  if [fields][log_type] == "tomcat_error" {
    grok {
      match => {
        "message" => '%{DATA:time_local} %{LOGLEVEL:log_level} \[%{DATA:thread}\] %{JAVACLASS:logger} %{GREEDYDATA:log_message}'
      }
      tag_on_failure => ["_grokparsefailure_tomcat_error"]
    }
  }

  # ─── Common: remove noisy fields ─────────────────────────────────────────
  mutate {
    remove_field => ["agent", "ecs", "input", "log", "host"]
  }
}

output {
  if [fields][log_type] == "nginx_access" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "nginx-access-%{+YYYY.MM.dd}"
      # The index template is registered manually (see the Elasticsearch
      # Index Template section), so keep Logstash from installing its own
      manage_template => false
    }
  } else if [fields][log_type] == "tomcat_access" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "tomcat-access-%{+YYYY.MM.dd}"
      manage_template => false
    }
  } else if [fields][log_type] == "tomcat_error" {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "tomcat-error-%{+YYYY.MM.dd}"
      manage_template => false
    }
  }

  # Debug only — remove in production
  # stdout { codec => rubydebug }
}
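Before shipping real traffic, the Nginx Grok pattern can be sanity-checked offline. The sketch below is a rough Python-regex equivalent of that pattern plus the status-code bucketing from the filter block; it is an approximation for building intuition, not the actual Grok engine:

```python
import re

# Rough Python equivalent of the Nginx access-log Grok pattern above
NGINX_RE = re.compile(
    r'(?P<client_ip>\S+) - (?P<user_name>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request_uri>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status_code>\d+) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def categorize(status: int) -> str:
    """Same bucketing as the mutate/add_field chain in the filter block."""
    if status >= 500:
        return "5xx_error"
    if status >= 400:
        return "4xx_error"
    if status >= 300:
        return "3xx_redirect"
    return "2xx_success"

line = ('192.168.1.100 - - [31/Mar/2026:10:00:01 +0900] '
        '"GET /api/users HTTP/1.1" 200 1234 "-" "Mozilla/5.0"')
m = NGINX_RE.match(line)
event = m.groupdict()
event["status_category"] = categorize(int(event["status_code"]))
print(event["method"], event["status_code"], event["status_category"])  # → GET 200 2xx_success
```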
Filebeat Configuration
Configure Filebeat to monitor Nginx and Tomcat log files.
# filebeat/filebeat.yml
filebeat.inputs:
  # ─── Nginx access log ───────────────────────────────────────────────────
  - type: log
    id: nginx-access
    enabled: true
    paths:
      - /var/log/nginx/access.log
      - /var/log/nginx/*access*.log
    fields:
      log_type: nginx_access
      server: nginx
    fields_under_root: false
    multiline:
      type: pattern
      pattern: '^\d+\.\d+\.\d+\.\d+'
      negate: true
      match: after

  # ─── Nginx error log ────────────────────────────────────────────────────
  - type: log
    id: nginx-error
    enabled: true
    paths:
      - /var/log/nginx/error.log
    fields:
      log_type: nginx_error
      server: nginx
    fields_under_root: false

  # ─── Tomcat access log ──────────────────────────────────────────────────
  - type: log
    id: tomcat-access
    enabled: true
    paths:
      - /var/log/tomcat/localhost_access_log.*.txt
      - /var/log/tomcat/access_log.*
    fields:
      log_type: tomcat_access
      server: tomcat
    fields_under_root: false

  # ─── Tomcat error log (multiline for Java stack traces) ─────────────────
  - type: log
    id: tomcat-error
    enabled: true
    paths:
      - /var/log/tomcat/catalina.out
    fields:
      log_type: tomcat_error
      server: tomcat
    fields_under_root: false
    multiline:
      type: pattern
      # Lines not starting with a date are appended to the previous line
      pattern: '^\d{2}-[A-Za-z]{3}-\d{4}'
      negate: true
      match: after
      max_lines: 500
      timeout: 5s

# ─── Processors ──────────────────────────────────────────────────────────
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - drop_fields:
      fields: ["agent.ephemeral_id", "agent.hostname", "agent.id", "agent.version"]
      ignore_missing: true

# ─── Output: send to Logstash ────────────────────────────────────────────
output.logstash:
  hosts: ["logstash:5044"]
  bulk_max_size: 2048
  backoff.init: 1s
  backoff.max: 60s

# ─── Logging ─────────────────────────────────────────────────────────────
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

# ─── Monitoring ──────────────────────────────────────────────────────────
monitoring.enabled: false
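The multiline settings for catalina.out can be pictured as a small state machine: any line that does not start with a date is glued onto the previous event. A hedged Python sketch of that stitching logic (Filebeat's real implementation also handles max_lines, timeouts, and partial flushes):

```python
import re

# Date prefix used in catalina.out, e.g. "31-Mar-2026 10:00:00.123 SEVERE ..."
EVENT_START = re.compile(r'^\d{2}-[A-Za-z]{3}-\d{4}')

def stitch_multiline(lines):
    """Group continuation lines (e.g. Java stack traces) under the
    preceding event line, like the multiline settings above."""
    events, current = [], []
    for line in lines:
        if EVENT_START.match(line):   # negate+after: a matching line starts a new event
            if current:
                events.append("\n".join(current))
            current = [line]
        elif current:                 # continuation of the previous event
            current.append(line)
        else:                         # leading continuation with no parent line
            events.append(line)
    if current:
        events.append("\n".join(current))
    return events

raw = [
    "31-Mar-2026 10:00:00.123 SEVERE [main] org.example.App Boom",
    "java.lang.NullPointerException",
    "\tat org.example.App.main(App.java:42)",
    "31-Mar-2026 10:00:01.000 INFO [main] org.example.App Recovered",
]
print(len(stitch_multiline(raw)))  # → 2
```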
Elasticsearch Index Template
Register an index template that is automatically applied when a new index is created.
# Set kibana_system password (run once)
curl -X POST "http://localhost:9200/_security/user/kibana_system/_password" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{"password": "changeme123!"}'
# Register Nginx access log index template
curl -X PUT "http://localhost:9200/_index_template/nginx-access" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{
    "index_patterns": ["nginx-access-*"],
    "template": {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "index.lifecycle.name": "nginx-logs-policy"
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "client_ip": { "type": "ip" },
          "method": { "type": "keyword" },
          "request_uri": { "type": "keyword" },
          "status_code": { "type": "integer" },
          "status_category": { "type": "keyword" },
          "body_bytes_sent": { "type": "long" },
          "response_time": { "type": "float" },
          "user_agent": { "type": "text" },
          "geoip": {
            "properties": {
              "location": { "type": "geo_point" },
              "country_name": { "type": "keyword" },
              "city_name": { "type": "keyword" }
            }
          }
        }
      }
    },
    "priority": 200
  }'
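When a new daily index such as nginx-access-2026.03.31 is created, Elasticsearch applies the matching composable template with the highest priority. A simplified Python sketch of that selection rule (the second, catch-all template here is hypothetical, added only for illustration):

```python
from fnmatch import fnmatch

# Simplified view of composable-template selection: among templates whose
# index_patterns match the new index name, the highest priority wins.
templates = {
    "nginx-access": {"index_patterns": ["nginx-access-*"], "priority": 200},
    "logs-default": {"index_patterns": ["*"], "priority": 0},  # hypothetical
}

def matching_template(index_name):
    candidates = [
        (t["priority"], name)
        for name, t in templates.items()
        if any(fnmatch(index_name, p) for p in t["index_patterns"])
    ]
    return max(candidates)[1] if candidates else None

print(matching_template("nginx-access-2026.03.31"))  # → nginx-access
```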
Index Lifecycle Management (ILM)
Create an ILM policy that moves aging indices through the hot, warm, and cold phases and deletes them after 90 days. Because Logstash writes one date-based index per day rather than rolling over a write alias, the policy below is driven purely by index age (a rollover action would fail without a bootstrapped write alias, and the freeze action was removed in Elasticsearch 8.x).

curl -X PUT "http://localhost:9200/_ilm/policy/nginx-logs-policy" \
  -H "Content-Type: application/json" \
  -u elastic:changeme123! \
  -d '{
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "set_priority": { "priority": 100 }
          }
        },
        "warm": {
          "min_age": "7d",
          "actions": {
            "shrink": { "number_of_shards": 1 },
            "forcemerge": { "max_num_segments": 1 },
            "set_priority": { "priority": 50 }
          }
        },
        "cold": {
          "min_age": "30d",
          "actions": {
            "set_priority": { "priority": 0 },
            "readonly": {}
          }
        },
        "delete": {
          "min_age": "90d",
          "actions": { "delete": {} }
        }
      }
    }
  }'
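With daily indices, the phase an index sits in depends only on its age. A minimal sketch of the thresholds defined above (real ILM also waits for action completion, cluster state changes, and polling intervals):

```python
# Age thresholds (in days) from the nginx-logs-policy above,
# checked from oldest phase to newest.
PHASES = [("delete", 90), ("cold", 30), ("warm", 7), ("hot", 0)]

def phase_for_age(age_days: float) -> str:
    """Return the ILM phase a daily index of the given age belongs to."""
    for phase, min_age in PHASES:
        if age_days >= min_age:
            return phase
    return "hot"

for age in (0, 3, 10, 45, 120):
    print(age, "->", phase_for_age(age))  # e.g. 10 -> warm
```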
Starting and Initializing the Stack
cd ~/elk-stack
# Start the stack (initial startup takes a few minutes)
docker-compose up -d
# Follow logs
docker-compose logs -f elasticsearch
docker-compose logs -f logstash
docker-compose logs -f filebeat
# Verify Elasticsearch cluster health
curl -u elastic:changeme123! "http://localhost:9200/_cluster/health?pretty"
# List indices (after log ingestion begins)
curl -u elastic:changeme123! "http://localhost:9200/_cat/indices?v"
# Test Filebeat connectivity
docker-compose exec filebeat filebeat test output
docker-compose exec filebeat filebeat test config
Generating Test Logs
# Append sample Nginx access log entries
cat >> ~/elk-stack/nginx/logs/access.log << 'EOF'
192.168.1.100 - - [31/Mar/2026:10:00:01 +0900] "GET /api/users HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
192.168.1.101 - - [31/Mar/2026:10:00:02 +0900] "POST /api/login HTTP/1.1" 401 89 "-" "curl/7.68.0"
192.168.1.102 - - [31/Mar/2026:10:00:03 +0900] "GET /static/js/app.js HTTP/1.1" 304 0 "http://example.com/" "Chrome/120"
10.0.0.1 - admin [31/Mar/2026:10:00:04 +0900] "DELETE /api/admin/users/5 HTTP/1.1" 403 45 "-" "Python/requests"
EOF
# Confirm documents were indexed
curl -u elastic:changeme123! \
"http://localhost:9200/nginx-access-*/_count?pretty"
Kibana Dashboard Setup
Open Kibana at http://localhost:5601 and build a dashboard.
Step 1: Create a Data View
- Navigate to Stack Management → Data Views → Create data view
- Name: Nginx Access Logs
- Index pattern: nginx-access-*
- Timestamp field: @timestamp
Step 2: Explore Logs in Discover
- Open Kibana → Discover
- Select the new data view and adjust the time range (e.g., Last 24 hours)
- Use KQL queries to filter:
# Show only 5xx errors
status_code >= 500
# Requests from a specific IP
client_ip: "192.168.1.100"
# Failed POST requests
method: "POST" AND status_code >= 400
# Slow Tomcat responses, over 1 second (response_time is in milliseconds;
# query this in a tomcat-access-* data view)
response_time > 1000
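Under the hood, Kibana compiles KQL into Elasticsearch query DSL. A rough sketch of what one of the queries above becomes (the helper names are made up for illustration; real KQL translation handles far more syntax):

```python
def term(field, value):
    """Hypothetical helper: a KQL `field: "value"` clause."""
    return {"match": {field: value}}

def gte(field, value):
    """Hypothetical helper: a KQL `field >= value` clause."""
    return {"range": {field: {"gte": value}}}

# method: "POST" AND status_code >= 400
query = {"bool": {"filter": [term("method", "POST"), gte("status_code", 400)]}}
print(query)
```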
Step 3: Build Visualizations
# Bar Chart — request count by HTTP status category
- Aggregation: Count
- Bucket: Terms → status_category
- Quickly see the ratio of 2xx / 4xx / 5xx
# Line Chart — requests over time
- Y-axis: Count
- X-axis: Date histogram → @timestamp (Auto interval)
# Pie Chart — top URL paths
- Aggregation: Count
- Bucket: Terms → request_uri (Top 10; the field is already mapped as keyword by the index template)
Step 4: Assemble a Dashboard
- Kibana → Dashboard → Create dashboard
- Add panel → select the charts built above
- Arrange the layout and save as "Nginx Overview"
Resource Requirements and Operational Considerations
| Component | Minimum RAM | Recommended RAM | CPU |
|---|---|---|---|
| Elasticsearch | 1 GB | 4 GB+ | 2 cores+ |
| Logstash | 512 MB | 1 GB+ | 1 core+ |
| Kibana | 512 MB | 1 GB | 1 core |
| Filebeat | 50 MB | 100 MB | Very low |
Key operational tips:
- JVM heap sizing: give Elasticsearch no more than half of the machine's RAM, leaving the rest for the filesystem cache, and keep the heap under roughly 31 GB so compressed object pointers stay enabled. On a dedicated 8 GB server, -Xms4g -Xmx4g is a sensible setting.
- Disk space: maintain at least 30× the daily log volume as free disk space, and always configure an ILM policy to prevent disk exhaustion.
- Shard sizing: keep each shard between 10 GB and 50 GB. Too many small shards degrade cluster performance.
- Security: in production, always enable TLS, change the default elastic password, and configure role-based access control (RBAC).
- Backups: use the snapshot API or Kibana's Snapshot and Restore feature to schedule regular backups.
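The heap-sizing rule above can be written down as a tiny calculator. This is a common rule of thumb, not an official Elastic formula:

```python
def recommended_heap_gb(total_ram_gb: float) -> float:
    """Rule of thumb: at most half of RAM for the Elasticsearch heap,
    capped below ~31 GB so the JVM keeps compressed object pointers."""
    return min(total_ram_gb / 2, 31.0)

for ram in (4, 8, 64, 128):
    heap = recommended_heap_gb(ram)
    print(f"{ram} GB RAM -> -Xms{heap:g}g -Xmx{heap:g}g")  # e.g. 8 GB RAM -> -Xms4g -Xmx4g
```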