The Target & Threat Context
Our client, a rapidly scaling e-commerce powerhouse, tasked us with a comprehensive security audit of their new microservices architecture. Their platform handled millions of transactions daily, processing sensitive customer data, payment information, and intricate supply chain logistics. The stakes, as always, were sky-high. Compliance requirements (PCI DSS, GDPR, CCPA) meant any breach could lead to catastrophic financial penalties and irreparable reputational damage.
The system under review was a complex ecosystem built predominantly on Node.js microservices, orchestrated via Kubernetes (EKS) within their AWS VPC. Data was stored in Aurora PostgreSQL, S3 buckets, and DynamoDB. The specific component that caught our eye was a seemingly benign PDF generation service. This service, let's call it pdf-gen-svc, was responsible for creating customer invoices, shipping labels, and custom reports. It was a standalone Node.js application running on an EC2 instance within a private subnet, using a headless Chrome instance (Puppeteer) to render HTML content into PDFs. The service exposed a REST API endpoint, /generate-pdf, which accepted a JSON payload containing a sourceUrl and an optional customHeaders object.
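As a rough illustration of the request shape described above, the sketch below builds a call to /generate-pdf. The helper name buildGeneratePdfRequest, the host pdf-gen-svc.internal, and the sample URLs are assumptions made for illustration. Only the endpoint path, the POST method, port 3000, and the two payload fields come from the service itself.

```javascript
// Hypothetical helper: builds the arguments for a /generate-pdf request.
// The base URL passed in is an assumption; the path and the payload fields
// (sourceUrl, optional customHeaders) follow the service description.
function buildGeneratePdfRequest(baseUrl, sourceUrl, customHeaders) {
  const payload = { sourceUrl };
  if (customHeaders !== undefined) {
    payload.customHeaders = customHeaders; // optional field
  }
  return {
    url: new URL('/generate-pdf', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  };
}

// Example values are placeholders, not real client endpoints.
const example = buildGeneratePdfRequest(
  'http://pdf-gen-svc.internal:3000',
  'https://example.com/invoices/1234.html',
  { Referer: 'https://example.com/' }
);
console.log(example.url);
console.log(example.options.body);
```

Both fields are caller-controlled, which is why each of them matters to the rest of this write-up.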
The architectural design dictated that all outbound requests from internal services, including pdf-gen-svc, were routed through a centralized internal proxy. This proxy was intended to enforce network policies, perform logging, and cache frequently accessed external resources. The pdf-gen-svc would send a request to this internal proxy, which would then fetch the content from the specified sourceUrl and return it to pdf-gen-svc for rendering. The proxy itself was a custom-built Go application, running on a separate EC2 instance, and was designed to be highly performant.
The client's AWS environment was fairly mature, but like many organizations, they had a mix of older and newer configurations. While some newer services were adopting stricter controls such as IMDSv2 (the session-token version of the EC2 instance metadata service), pdf-gen-svc and its internal proxy had been deployed before these standards were universally enforced. This left a subtle but critical window of exposure that we were about to pry wide open.
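To make the IMDSv2 difference concrete, here is a minimal sketch of the request shapes involved. It only builds request descriptions and sends nothing. The function names are hypothetical; the metadata address, the token path, and the two header names are the ones AWS documents for the instance metadata service.

```javascript
// Link-local address of the EC2 instance metadata service (IMDS).
const IMDS_BASE = 'http://169.254.169.254';

// IMDSv1: a single unauthenticated GET is enough, which is the kind of
// request a server-side fetcher can be tricked into making.
function imdsV1Request(path) {
  return { method: 'GET', url: IMDS_BASE + path, headers: {} };
}

// IMDSv2, step 1: obtain a session token with a PUT and a TTL header.
function imdsV2TokenRequest(ttlSeconds = 21600) {
  return {
    method: 'PUT',
    url: IMDS_BASE + '/latest/api/token',
    headers: { 'X-aws-ec2-metadata-token-ttl-seconds': String(ttlSeconds) },
  };
}

// IMDSv2, step 2: every metadata read must carry the session token.
function imdsV2Request(path, token) {
  return {
    method: 'GET',
    url: IMDS_BASE + path,
    headers: { 'X-aws-ec2-metadata-token': token },
  };
}

// Hypothetical check: a fetcher limited to the given HTTP methods can only
// start an IMDSv2 session if it is able to issue the PUT from step 1.
function canObtainV2Token(allowedMethods) {
  return allowedMethods.includes(imdsV2TokenRequest().method);
}

console.log(imdsV1Request('/latest/meta-data/'));
console.log(canObtainV2Token(['GET']));
```

The point of the comparison is that a component which can only be coerced into plain GET requests can read IMDSv1 directly, while IMDSv2 requires a PUT first. That is why instances still accepting IMDSv1 are the interesting ones in a review like this.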
Corrected Code / Configuration
Here's how the pdf-gen-svc and the internal proxy should be hardened:
// pdf-gen-svc (Node.js) - Hardened
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch';

// Whitelist of allowed domains (e.g., example.com, cdn.example.com)
const ALLOWED_DOMAINS = ['example.com', 'cdn.example.com'];

// Example whitelist of caller-supplied headers, compared in lowercase
const ALLOWED_CUSTOM_HEADERS = ['user-agent', 'referer'];

// Validate a URL against the whitelist. Anything not explicitly listed is
// rejected, which also blocks raw IP addresses, including private ranges
// and the link-local range (169.254.0.0/16).
function isValidUrl(url) {
  try {
    const parsedUrl = new URL(url);
    if (parsedUrl.protocol !== 'http:' && parsedUrl.protocol !== 'https:') {
      return false; // Block file:, ftp:, and other schemes
    }
    // A whitelisted hostname can still resolve to an internal IP. Checking that
    // requires resolving DNS, which needs careful asynchronous handling and is
    // often done at a firewall level. For simplicity, we assume DNS resolution
    // is handled by a trusted resolver or that strict egress filtering applies.
    return ALLOWED_DOMAINS.includes(parsedUrl.hostname);
  } catch {
    return false; // Not a parseable URL
  }
}

app.post('/generate-pdf', async (req, res) => {
  const { sourceUrl, customHeaders } = req.body || {};

  if (typeof sourceUrl !== 'string' || !isValidUrl(sourceUrl)) {
    return res.status(400).send('Invalid or disallowed source URL.');
  }

  // Sanitize customHeaders: only allow explicitly whitelisted headers.
  // HTTP header names are case-insensitive, so compare in lowercase.
  const sanitizedHeaders = {};
  if (customHeaders && typeof customHeaders === 'object') {
    for (const [headerName, headerValue] of Object.entries(customHeaders)) {
      if (
        ALLOWED_CUSTOM_HEADERS.includes(headerName.toLowerCase()) &&
        typeof headerValue === 'string'
      ) {
        sanitizedHeaders[headerName] = headerValue;
      }
    }
  }

  try {
    const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
      targetUrl: sourceUrl,
      headers: sanitizedHeaders
    });
    // proxyResponse.data holds the fetched HTML. Rendering it to a PDF with
    // Puppeteer is omitted from this example.
    res.status(200).send('PDF generated successfully.');
  } catch (error) {
    console.error('Error during PDF generation:', error.message);
    res.status(500).send('Failed to generate PDF.');
  }
});

app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));
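The isValidUrl function above leaves DNS resolution to a trusted resolver or to egress filtering. As a hedged sketch of what an in-process check could look like, the snippet below resolves a hostname and rejects it if any resulting address falls in a blocked range. The names isBlockedAddress and resolvesOnlyToPublicAddresses are hypothetical, and the list of ranges is an illustrative assumption rather than the client's actual policy. It relies on net.BlockList, which needs a reasonably recent Node.js release.

```javascript
const dns = require('node:dns');
const net = require('node:net');

// Ranges a server-side fetcher should never reach (illustrative list).
const blocked = new net.BlockList();
blocked.addSubnet('0.0.0.0', 8, 'ipv4');
blocked.addSubnet('10.0.0.0', 8, 'ipv4');
blocked.addSubnet('127.0.0.0', 8, 'ipv4');
blocked.addSubnet('169.254.0.0', 16, 'ipv4'); // link-local, includes the metadata address
blocked.addSubnet('172.16.0.0', 12, 'ipv4');
blocked.addSubnet('192.168.0.0', 16, 'ipv4');
blocked.addAddress('::1', 'ipv6'); // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6'); // unique local
blocked.addSubnet('fe80::', 10, 'ipv6'); // IPv6 link-local

// Returns true when the address is in a blocked range or is not an IP at all.
function isBlockedAddress(address) {
  const family = net.isIP(address); // 4, 6, or 0 when not an IP
  if (family === 0) {
    return true; // fail closed on anything unparseable
  }
  return blocked.check(address, family === 6 ? 'ipv6' : 'ipv4');
}

// Resolves the hostname and accepts it only if every address is public.
// The lookup parameter is injectable so the check can be tested offline.
async function resolvesOnlyToPublicAddresses(hostname, lookup = dns.promises.lookup) {
  const records = await lookup(hostname, { all: true });
  return records.length > 0 && records.every((r) => !isBlockedAddress(r.address));
}
```

A check like this narrows the gap but does not close it: the later fetch performs its own DNS lookup, so a hostname can resolve differently between the check and the request. Enforcing the same ranges at the point where the connection is made, or at the network layer, remains the stronger control.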
And for the internal proxy (conceptual Go code - Hardened):
// internal-proxy.corp (Go) - Hardened
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"net/url" // For parsing URLs
	"strings"
	"time"
)

// isValidProxyTargetUrl validates target URLs for the proxy.
func isValidProxyTargetUrl(targetURL string) bool {
	u, err := url.Parse(targetURL)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false
	}
	if u.Hostname() == "" {
		return false
	}
	// Cheap defense in depth: reject literal loopback, private, and link-local IPs.
	if ip := net.ParseIP(u.Hostname()); ip != nil &&
		(ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified()) {
		return false
	}
	// Implement strict whitelisting for domains the proxy is allowed to fetch from.
	// Or, more robustly, block all private IPs after DNS resolution.
	// For simplicity, we'll assume the `pdf-gen-svc` has already done primary URL validation.
	// The proxy's role is to ensure it doesn't get tricked by headers.
	// Crucially, the proxy *must not* use X-Forwarded-For for routing.
	return true
}

func fetchHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
		return
	}

	var requestBody struct {
		TargetURL string            `json:"targetUrl"`
		Headers   map[string]string `json:"headers"`
	}
	// Parse the JSON body sent by pdf-gen-svc.
	if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
		http.Error(w, "Invalid request body", http.StatusBadRequest)
		return
	}

	if !isValidProxyTargetUrl(requestBody.TargetURL) {
		http.Error(w, "Invalid target URL for proxy", http.StatusBadRequest)
		return
	}

	// CRITICAL FIX: The proxy *must not* use X-Forwarded-For or similar headers for routing.
	// It should *always* use the `targetURL` provided directly for the actual network connection.
	req, err := http.NewRequest(http.MethodGet, requestBody.TargetURL, nil) // Always use TargetURL
	if err != nil {
		http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
		return
	}

	// Only set whitelisted headers if necessary, or pass none from user input.
	// For this example, we'll assume the pdf-gen-svc has already sanitized them.
	for k, v := range requestBody.Headers {
		// Explicitly block sensitive headers from being set by user input on the proxy
		if strings.EqualFold(k, "Host") || strings.EqualFold(k, "X-Forwarded-For") || strings.EqualFold(k, "X-Real-IP") {
			continue // Do not allow these to be set by user
		}
		req.Header.Set(k, v)
	}

	// Bound the request time (10s is an illustrative value) and do not follow
	// redirects, since a redirect could point at an internal address.
	client := &http.Client{
		Timeout: 10 * time.Second,
		CheckRedirect: func(_ *http.Request, _ []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Do(req)
	if err != nil {
		http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body) // io.ReadAll replaces the deprecated ioutil.ReadAll
	if err != nil {
		http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
		return
	}

	w.WriteHeader(resp.StatusCode)
	w.Write(body)
}

func main() {
	http.HandleFunc("/fetch", fetchHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}