
Pro Tips — Architecture Design and Failure Response Patterns

Beyond theory, this chapter covers architecture decision scenarios and failure points commonly encountered in production environments.


Three Real-World Architecture Patterns

Pattern 1: Single Server (Development / Small Scale)

[Internet]
    ↓
[Single Server]
├─ Nginx (port 80/443)
└─ Tomcat (port 8080, blocked from external access)
  • Simplest configuration
  • Both web server and WAS run on a single machine
  • Suitable for development environments and low-traffic small services
  • Nginx handles static files and acts as a proxy to Tomcat
  • Drawback: SPOF (Single Point of Failure) — if the one server goes down, the entire service goes down
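A minimal Nginx config for this pattern might look like the following sketch. The server name, web root, and port are placeholders, not values prescribed by this chapter:

```nginx
# Single-server sketch: Nginx listens publicly and proxies to the local
# Tomcat, which should be bound to 127.0.0.1 (or firewalled) so it is
# never reachable from outside.
server {
    listen 80;
    server_name example.com;        # placeholder domain

    root /var/www/myapp;            # static files served by Nginx itself

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host      $host;
        proxy_set_header X-Real-IP $remote_addr;  # preserve client IP for Tomcat logs
    }
}
```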

Pattern 2: Separate Web Server and WAS (Standard Production)

[Internet]
    ↓
[Web Server] — Nginx (public IP)
 ├─ Serves static files
 └─ SSL termination
    ↓
[WAS Server] — Tomcat (private IP, no direct external access)
 ├─ Business logic
 └─ DB connections
  • Web server and WAS physically separated onto different machines
  • Tomcat's ports blocked by firewall from external access → enhanced security
  • Failures are isolated: if the Nginx server goes down, Tomcat stays alive; if Tomcat goes down, Nginx remains up and can serve an error page (typically 502/503) instead of the connection failing outright
  • Standard configuration for small-to-medium production services
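On the web server, the only change from the single-server config is that the proxy target becomes the WAS machine's private address. A sketch, where `10.0.0.10` is a placeholder private IP:

```nginx
# Two-machine sketch: Nginx terminates SSL on the public box and
# forwards to the WAS over the private network.
location / {
    proxy_pass http://10.0.0.10:8080;
    proxy_set_header Host              $host;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;  # tells the WAS the original scheme after SSL termination
}
```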

Pattern 3: Load Balancer + WAS Cluster (High Availability)

[Internet]
    ↓
[Load Balancer] — AWS ALB / Nginx / HAProxy
    ↓
 ├─ [WAS Server 1] — Tomcat
 ├─ [WAS Server 2] — Tomcat
 └─ [WAS Server 3] — Tomcat
    ↓
[Database Cluster]
 ├─ Primary DB
 └─ Replica DB (read distribution)
  • Run multiple WAS instances for horizontal scaling (Scale Out)
  • If one WAS instance goes down, the remaining instances continue serving traffic
  • Suitable for high-traffic services and environments requiring zero-downtime deployment
  • Requires a separate session management strategy (Sticky Session, Redis Session)
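When Nginx itself plays the load balancer role, the cluster is expressed as an upstream block; `ip_hash` is one simple way to get sticky sessions. A sketch with placeholder IPs:

```nginx
# Load-balancer sketch: three WAS instances behind one upstream.
upstream was_cluster {
    ip_hash;                 # sticky sessions keyed on the client IP
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;   # an unreachable server is skipped automatically
}

server {
    listen 80;
    location / {
        proxy_pass http://was_cluster;
    }
}
```

With Redis-based session sharing instead of sticky sessions, `ip_hash` can be dropped and requests distributed round-robin.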

Common Failure Points and Solutions

Failure 1: Security Incident from Direct Tomcat Exposure

Symptom: Tomcat's default port (8080) is directly exposed to the internet

# Incorrect configuration — anyone can directly access Tomcat
curl http://your-server:8080/manager/html
→ Tomcat Manager page is exposed

Solution:

# Block port 8080 from external access via firewall (iptables example)
# Rule order matters: the ACCEPT rule must be appended before the DROP rule
iptables -A INPUT -p tcp --dport 8080 -s 127.0.0.1 -j ACCEPT # allow localhost (also allow the Nginx server's IP if it is a separate machine)
iptables -A INPUT -p tcp --dport 8080 -j DROP # block everything else

# For AWS Security Group:
# Port 8080: Source = Nginx server IP only

Failure 2: Timeout Mismatch Between Web Server and WAS

Symptom: For long-running API requests (e.g. heavy processing taking over a minute), Nginx cuts the upstream connection first, so clients see gateway errors (504 Gateway Timeout, or 502 Bad Gateway depending on how the connection fails)

Cause: Nginx's proxy_read_timeout default (60 seconds) is shorter than the actual WAS processing time

# Incorrect configuration (using defaults only)
location /api/ {
    proxy_pass http://tomcat;
    # proxy_read_timeout defaults to 60s
}

Solution:

# Increase timeout for slow endpoints like file uploads or heavy processing
location /api/upload {
    proxy_pass http://tomcat;
    proxy_connect_timeout 10s;  # Max wait for Tomcat connection
    proxy_send_timeout   300s;  # Max wait for request transmission
    proxy_read_timeout   300s;  # Max wait for Tomcat response
}

Failure 3: Performance Degradation from WAS Serving Static Files

Symptom: Tomcat threads saturated by image/JS file requests, causing API response delays

[Scenario] With Tomcat maxThreads=200:
- 180 threads occupied by image requests
- Only 20 threads remaining for actual API requests
→ API response delays and queue buildup spike

Solution: Have Nginx serve static files directly

server {
    root /var/www/myapp;

    # Nginx handles static files directly (not forwarded to Tomcat)
    location ~* \.(html|css|js|png|jpg|gif|ico|woff2|pdf)$ {
        expires 7d;
        add_header Cache-Control "public, immutable";
    }

    # Forward everything else to Tomcat
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}

Failure 4: Momentary 503 Errors During WAS Restart

Symptom: Users see 503 errors for tens of seconds when Tomcat is restarted for deployment

Solution: Handle errors with a maintenance page in Nginx

upstream tomcat_backend {
    server 127.0.0.1:8080;
    # Try the next server if Tomcat doesn't respond (cluster environment)
    # server 127.0.0.1:8081 backup;
}

server {
    location / {
        proxy_pass http://tomcat_backend;

        # Show custom error page on connection error
        proxy_intercept_errors on;
        error_page 502 503 504 /maintenance.html;
    }

    # Maintenance page (returned immediately as a static file)
    location = /maintenance.html {
        root /var/www/maintenance;
        internal;
    }
}

Architecture Selection Checklist

Answer these questions when designing a new service architecture.

□ What is the expected DAU (Daily Active Users)?
- Under 100K: Single server or Nginx+Tomcat on separate servers
- 100K–1M: Load balancer + 2–3 WAS instances
- 1M+: Auto-scaling and CDN required

□ Is the service heavy on static files?
- Yes: Nginx serves static files directly + CDN integration
- No (API Only): Use Nginx only as a reverse proxy

□ Is zero-downtime deployment required?
- Yes: 2+ WAS instances + load balancer + rolling deployment
- No: Simple restart deployment is acceptable

□ What is the session management strategy?
- Sticky Session: Simple but sessions may be lost if a specific WAS fails
- Redis Session Sharing: More complex but more reliable

□ What are the budget and staffing constraints?
- Small team: Use managed services (AWS ALB, Cloud Run)
- Sufficient operational capacity: Build Nginx + Tomcat cluster directly

Key Summary

Situation                  | Recommended Architecture
---------------------------|----------------------------------------------------
Development / Testing      | Tomcat standalone or Nginx+Tomcat on a single server
Small-scale production     | Nginx + Tomcat on separate servers (2 machines)
Medium scale and above     | Nginx load balancer + Tomcat cluster
WAS protection             | Block Tomcat ports from external access via firewall
Performance optimization   | Always have Nginx handle static files