CG
SkillsPerforming API Inventory and Discovery
Start Free
Back to Skills Library
API Security🟡 Intermediate

Performing API Inventory and Discovery

Performs API inventory and discovery to identify all API endpoints in an organization's environment including documented, undocumented, shadow, zombie, and deprecated APIs.

8 min read7 code examples

Prerequisites

  • Written authorization specifying the target domains and network ranges
  • Passive traffic capture capability (network tap, proxy, or cloud traffic mirroring)
  • Active scanning tools: Amass, subfinder, httpx, and nuclei
  • JavaScript analysis tools: LinkFinder, JS-Miner, or custom parsers
  • Access to cloud console (AWS, Azure, GCP) for API gateway inventory
  • Burp Suite Professional for passive API endpoint discovery

Performing API Inventory and Discovery

When to Use

  • Mapping the complete API attack surface of an organization before a security assessment
  • Identifying shadow APIs deployed by development teams without security review
  • Discovering deprecated or zombie API versions that remain accessible but unmaintained
  • Finding undocumented API endpoints exposed through mobile applications, SPAs, or microservices
  • Building an API inventory for compliance requirements (PCI-DSS, SOC2, GDPR)

Do not use without written authorization. API discovery involves scanning network infrastructure and analyzing traffic.

Prerequisites

  • Written authorization specifying the target domains and network ranges
  • Passive traffic capture capability (network tap, proxy, or cloud traffic mirroring)
  • Active scanning tools: Amass, subfinder, httpx, and nuclei
  • JavaScript analysis tools: LinkFinder, JS-Miner, or custom parsers
  • Access to cloud console (AWS, Azure, GCP) for API gateway inventory
  • Burp Suite Professional for passive API endpoint discovery

Workflow

Step 1: Passive API Discovery from Traffic Analysis

import re
import json
from collections import defaultdict

# Parse HAR file from browser developer tools or proxy
def analyze_har_for_apis(har_file_path):
    """Extract API endpoints from HTTP Archive (HAR) file."""
    with open(har_file_path) as f:
        har = json.load(f)

    api_endpoints = defaultdict(lambda: {
        "methods": set(), "content_types": set(),
        "auth_types": set(), "count": 0
    })

    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        method = entry["request"]["method"]

        # Identify API patterns
        api_patterns = [
            r'/api/', r'/v\d+/', r'/graphql', r'/rest/',
            r'/ws/', r'/rpc/', r'/grpc', r'/json',
        ]

        if any(re.search(p, url) for p in api_patterns):
            # Normalize the URL (remove query params and IDs)
            normalized = re.sub(r'\?.*$', '', url)
            normalized = re.sub(r'/\d+(/|$)', '/{id}\\1', normalized)
            normalized = re.sub(
                r'/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
                '/{uuid}', normalized)

            ep = api_endpoints[normalized]
            ep["methods"].add(method)
            ep["count"] += 1

            # Detect authentication type
            for header in entry["request"]["headers"]:
                name = header["name"].lower()
                if name == "authorization":
                    if "bearer" in header["value"].lower():
                        ep["auth_types"].add("Bearer/JWT")
                    elif "basic" in header["value"].lower():
                        ep["auth_types"].add("Basic")
                elif name == "x-api-key":
                    ep["auth_types"].add("API Key")

            # Detect content type
            content_type = next(
                (h["value"] for h in entry["request"]["headers"]
                 if h["name"].lower() == "content-type"), None)
            if content_type:
                ep["content_types"].add(content_type.split(";")[0])

    print(f"Discovered {len(api_endpoints)} unique API endpoints:\n")
    for url, info in sorted(api_endpoints.items()):
        methods = ", ".join(sorted(info["methods"]))
        auth = ", ".join(info["auth_types"]) or "None"
        print(f"  [{methods}] {url}")
        print(f"    Auth: {auth} | Requests: {info['count']}")

    return api_endpoints

Step 2: Active API Endpoint Discovery

# DNS enumeration for API subdomains
amass enum -d example.com -o amass_results.txt
subfinder -d example.com -o subfinder_results.txt

# Filter for API-related subdomains
grep -iE '(api|rest|graphql|ws|gateway|backend|internal|staging|dev|v1|v2)' \
    amass_results.txt subfinder_results.txt | sort -u > api_subdomains.txt

# Check which subdomains are alive
cat api_subdomains.txt | httpx -status-code -content-length -title \
    -tech-detect -o live_apis.txt

# Probe common API paths on each live subdomain
cat api_subdomains.txt | while read domain; do
    for path in /api /api/v1 /api/v2 /graphql /swagger.json /openapi.json \
                /api-docs /docs /health /status /metrics /actuator; do
        curl -s -o /dev/null -w "%{http_code} %{url_effective}\n" \
            "https://${domain}${path}" 2>/dev/null | grep -v "^404"
    done
done
import requests
import concurrent.futures

def discover_api_endpoints(base_domains):
    """Actively probe for API endpoints across discovered domains."""

    # Common API paths to test
    API_PATHS = [
        "/api", "/api/v1", "/api/v2", "/api/v3",
        "/graphql", "/gql", "/query",
        "/rest", "/json", "/rpc",
        "/swagger.json", "/swagger/v1/swagger.json",
        "/openapi.json", "/openapi.yaml", "/api-docs",
        "/docs", "/redoc", "/explorer",
        "/.well-known/openid-configuration",
        "/health", "/healthz", "/ready",
        "/status", "/info", "/version",
        "/metrics", "/prometheus",
        "/actuator", "/actuator/health", "/actuator/info",
        "/admin", "/admin/api", "/internal",
        "/debug", "/debug/vars", "/debug/pprof",
        "/ws", "/websocket", "/socket.io",
        "/grpc", "/twirp",
    ]

    discovered = []

    def check_endpoint(domain, path):
        for scheme in ["https", "http"]:
            url = f"{scheme}://{domain}{path}"
            try:
                resp = requests.get(url, timeout=5, allow_redirects=False,
                                  verify=False)
                if resp.status_code not in (404, 502, 503):
                    return {
                        "url": url,
                        "status": resp.status_code,
                        "content_type": resp.headers.get("Content-Type", ""),
                        "server": resp.headers.get("Server", ""),
                        "size": len(resp.content),
                    }
            except requests.exceptions.RequestException:
                pass
        return None

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
        futures = {}
        for domain in base_domains:
            for path in API_PATHS:
                future = executor.submit(check_endpoint, domain, path)
                futures[future] = (domain, path)

        for future in concurrent.futures.as_completed(futures):
            result = future.result()
            if result:
                discovered.append(result)
                print(f"  [FOUND] {result['url']} -> {result['status']} ({result['content_type']})")

    return discovered

Step 3: JavaScript Source Analysis for API Endpoints

import re
import requests

def extract_apis_from_javascript(js_urls):
    """Extract API endpoints from JavaScript source files."""
    api_pattern = re.compile(
        r'''(?:['"`])((?:/api/|/v[0-9]+/|/graphql|/rest/)[^'"`\s<>{}]+)(?:['"`])''',
        re.IGNORECASE
    )
    url_pattern = re.compile(
        r'''(?:['"`])(https?://[a-zA-Z0-9._-]+(?:\.[a-zA-Z]{2,})+(?:/[^'"`\s<>{}]*)?)(?:['"`])'''
    )
    fetch_pattern = re.compile(
        r'''(?:fetch|axios|ajax|XMLHttpRequest|\.get|\.post|\.put|\.delete|\.patch)\s*\(\s*(?:['"`])([^'"`]+)'''
    )

    all_endpoints = set()

    for js_url in js_urls:
        try:
            resp = requests.get(js_url, timeout=10)
            content = resp.text

            # Extract relative API paths
            for match in api_pattern.findall(content):
                all_endpoints.add(("relative", match))

            # Extract absolute URLs
            for match in url_pattern.findall(content):
                if any(kw in match.lower() for kw in ["/api", "/v1", "/v2", "graphql"]):
                    all_endpoints.add(("absolute", match))

            # Extract from fetch/axios calls
            for match in fetch_pattern.findall(content):
                all_endpoints.add(("fetch", match))

        except requests.exceptions.RequestException:
            pass

    print(f"\nAPI endpoints discovered from JavaScript ({len(all_endpoints)}):")
    for source, endpoint in sorted(all_endpoints):
        print(f"  [{source}] {endpoint}")

    return all_endpoints

# Find JavaScript files from the target domain
def find_js_files(domain):
    """Discover JavaScript files from a web application."""
    resp = requests.get(f"https://{domain}", timeout=10)
    js_files = re.findall(r'src=["\']([^"\']+\.js[^"\']*)', resp.text)
    full_urls = []
    for js in js_files:
        if js.startswith("http"):
            full_urls.append(js)
        elif js.startswith("//"):
            full_urls.append(f"https:{js}")
        elif js.startswith("/"):
            full_urls.append(f"https://{domain}{js}")
    return full_urls

Step 4: Cloud API Gateway Inventory

import boto3

def inventory_aws_apis():
    """Inventory all APIs in AWS API Gateway."""
    apigw = boto3.client('apigateway')
    apigwv2 = boto3.client('apigatewayv2')

    apis = []

    # REST APIs (API Gateway v1)
    rest_apis = apigw.get_rest_apis()
    for api in rest_apis['items']:
        resources = apigw.get_resources(restApiId=api['id'])
        stages = apigw.get_stages(restApiId=api['id'])

        for stage in stages['item']:
            for resource in resources['items']:
                for method in resource.get('resourceMethods', {}).keys():
                    apis.append({
                        "type": "REST",
                        "name": api['name'],
                        "stage": stage['stageName'],
                        "path": resource['path'],
                        "method": method,
                        "url": f"https://{api['id']}.execute-api.{boto3.session.Session().region_name}.amazonaws.com/{stage['stageName']}{resource['path']}",
                        "created": str(api.get('createdDate', '')),
                    })

    # HTTP APIs (API Gateway v2)
    http_apis = apigwv2.get_apis()
    for api in http_apis['Items']:
        routes = apigwv2.get_routes(ApiId=api['ApiId'])
        stages = apigwv2.get_stages(ApiId=api['ApiId'])

        for route in routes['Items']:
            apis.append({
                "type": "HTTP",
                "name": api['Name'],
                "route": route['RouteKey'],
                "api_id": api['ApiId'],
                "protocol": api['ProtocolType'],
            })

    print(f"\nAWS API Inventory ({len(apis)} endpoints):")
    for api in apis:
        print(f"  [{api['type']}] {api.get('name')} - {api.get('method', '')} {api.get('path', api.get('route', ''))}")

    return apis

Step 5: API Version and Shadow API Detection

def detect_shadow_and_zombie_apis(discovered_endpoints, documented_endpoints):
    """Compare discovered APIs against documented inventory."""

    # Normalize endpoints for comparison
    def normalize(ep):
        ep = re.sub(r'/v\d+/', '/vX/', ep)
        ep = re.sub(r'/\d+', '/{id}', ep)
        return ep.lower().rstrip('/')

    documented_normalized = {normalize(ep) for ep in documented_endpoints}

    shadow_apis = []  # Discovered but not documented
    zombie_apis = []  # Old versions still accessible

    for ep in discovered_endpoints:
        normalized = normalize(ep["url"])

        if normalized not in documented_normalized:
            # Check if it is an old version of a documented API
            if re.search(r'/v[0-9]+/', ep["url"]):
                zombie_apis.append(ep)
            else:
                shadow_apis.append(ep)

    print(f"\nShadow APIs (undocumented): {len(shadow_apis)}")
    for api in shadow_apis:
        print(f"  [SHADOW] {api['url']} -> {api['status']}")

    print(f"\nZombie APIs (deprecated versions): {len(zombie_apis)}")
    for api in zombie_apis:
        print(f"  [ZOMBIE] {api['url']} -> {api['status']}")

    # Check if zombie APIs lack security controls
    for api in zombie_apis:
        resp = requests.get(api["url"], timeout=5)
        if resp.status_code not in (401, 403):
            print(f"  [CRITICAL] Zombie API accessible without auth: {api['url']}")

    return shadow_apis, zombie_apis

Key Concepts

TermDefinition
Shadow APIAn API deployed by a development team without going through the official API management or security review process
Zombie APIA deprecated or old API version that remains accessible and running but is no longer maintained or monitored
API InventoryA comprehensive catalog of all APIs in an organization including endpoint URLs, owners, versions, authentication methods, and data classifications
Improper Inventory ManagementOWASP API9:2023 - failure to maintain an accurate API inventory, leading to unmonitored and unprotected API endpoints
Attack SurfaceThe total set of API endpoints, methods, and parameters that an attacker can potentially interact with
API SprawlThe uncontrolled proliferation of APIs in an organization, often resulting from microservice adoption without centralized governance

Tools & Systems

  • Amass: OWASP tool for attack surface mapping through DNS enumeration, web scraping, and API discovery
  • httpx: Fast HTTP probing tool for validating discovered domains and identifying live API endpoints
  • nuclei: Template-based scanner for detecting exposed API documentation, debug endpoints, and misconfigured services
  • Swagger UI Detector: Tool for finding exposed Swagger/OpenAPI documentation endpoints across the organization
  • Akto: API security platform that discovers APIs through traffic analysis and maintains an automated inventory

Common Scenarios

Scenario: Enterprise API Attack Surface Assessment

Context: A large enterprise has 200+ development teams using microservices. The security team suspects many undocumented APIs are exposed to the internet. A comprehensive API inventory is needed for a security audit.

Approach:

  1. DNS enumeration discovers 340 subdomains, 45 contain API-related keywords (api, rest, gateway, backend)
  2. Active probing of all subdomains with API path wordlist discovers 127 live API endpoints
  3. JavaScript analysis of the main web application reveals 34 API endpoints, 8 of which point to undocumented internal services
  4. AWS API Gateway inventory shows 67 REST APIs and 23 HTTP APIs across 12 accounts
  5. Cross-referencing against the official API catalog: 31 shadow APIs (undocumented), 14 zombie APIs (deprecated versions)
  6. 3 zombie APIs have no authentication, exposing customer data through endpoints that were supposed to be decommissioned
  7. 2 shadow APIs expose internal admin functions to the internet without authorization

Pitfalls:

  • Only checking documented API endpoints and missing shadow APIs deployed outside the API gateway
  • Not scanning JavaScript bundles where frontend applications hardcode API endpoint URLs
  • Missing APIs behind non-standard ports or subpaths
  • Not checking for multiple API versions where older versions may lack security controls
  • Assuming all APIs go through the API gateway when some may be directly exposed

Output Format

## API Inventory and Discovery Report

**Organization**: Example Corp
**Assessment Date**: 2024-12-15
**Domains Scanned**: 340

### Summary

| Category | Count |
|----------|-------|
| Total APIs Discovered | 127 |
| Documented APIs | 82 |
| Shadow APIs (undocumented) | 31 |
| Zombie APIs (deprecated) | 14 |
| APIs Without Authentication | 8 |
| APIs Exposing Sensitive Data | 5 |

### Critical Findings

1. **Zombie API**: api-v1.example.com/api/v1/users - Deprecated in 2022,
   still accessible, no authentication required, returns full user data
2. **Shadow API**: internal-tools.example.com/api/admin - Admin functions
   exposed to internet without authorization
3. **Exposed Documentation**: 12 Swagger UI instances accessible publicly,
   revealing full API schema and endpoint details

Verification Criteria

Confirm successful execution by validating:

  • [ ] All prerequisite tools and access requirements are satisfied
  • [ ] Each workflow step completed without errors
  • [ ] Output matches expected format and contains expected data
  • [ ] No security warnings or misconfigurations detected
  • [ ] Results are documented and evidence is preserved for audit

Compliance Framework Mapping

This skill supports compliance evidence collection across multiple frameworks:

  • SOC 2: CC6.1 (Logical Access), CC6.6 (System Boundaries)
  • ISO 27001: A.14.1 (Security Requirements), A.9.4 (System Access Control)
  • NIST 800-53: AC-3 (Access Enforcement), SI-10 (Input Validation), SC-8 (Transmission Confidentiality)
  • OWASP LLM Top 10: LLM06 (Excessive Agency), LLM08 (Excessive Autonomy)

Claw GRC Tip: When this skill is executed by a registered agent, compliance evidence is automatically captured and mapped to the relevant controls in your active frameworks.

Deploying This Skill with Claw GRC

Agent Execution

Register this skill with your Claw GRC agent for automated execution:

# Install via CLI
npx claw-grc skills add performing-api-inventory-and-discovery

# Or load dynamically via MCP
grc.load_skill("performing-api-inventory-and-discovery")

Audit Trail Integration

When executed through Claw GRC, every step of this skill generates tamper-evident audit records:

  • SHA-256 chain hashing ensures no step can be modified after execution
  • Evidence artifacts (configs, scan results, logs) are automatically attached to relevant controls
  • Trust score impact — successful execution increases your agent's trust score

Continuous Compliance

Schedule this skill for recurring execution to maintain continuous compliance posture. Claw GRC monitors for drift and alerts when re-execution is needed.

Use with Claw GRC Agents

This skill is fully compatible with Claw GRC's autonomous agent system. Deploy it to any registered agent via MCP, and every execution will be logged in the tamper-evident audit trail.

// Load this skill in your agent
npx claw-grc skills add performing-api-inventory-and-discovery
// Or via MCP
grc.load_skill("performing-api-inventory-and-discovery")

Tags

api-securityowaspapi-discoveryshadow-apiinventoryattack-surface

Related Skills

API Security

Detecting Shadow API Endpoints

5m·intermediate
API Security

Implementing API Security Posture Management

6m·intermediate
API Security

Performing API Rate Limiting Bypass

8m·intermediate
API Security

Performing API Security Testing with Postman

8m·intermediate
API Security

Testing API Authentication Weaknesses

9m·intermediate
API Security

Testing API for Broken Object Level Authorization

9m·intermediate

Skill Details

Domain
API Security
Difficulty
intermediate
Read Time
8 min
Code Examples
7

On This Page

When to UsePrerequisitesWorkflowKey ConceptsTools & SystemsCommon ScenariosOutput FormatAPI Inventory and Discovery ReportVerification CriteriaCompliance Framework MappingDeploying This Skill with Claw GRC

Deploy This Skill

Add this skill to your Claw GRC agent and start automating.

Get Started Free →