Bug Bounty Hunting: OSINT, Reconnaissance & Attack Surface Mapping Playbook

Executive Summary

Bug bounty hunting in 2026 demands a disciplined, automated approach to reconnaissance that integrates open-source intelligence (OSINT) with continuous asset discovery. Modern attack surface mapping combines passive intelligence gathering, active enumeration, and intelligent automation across multiple attack vectors: subdomains, cloud infrastructure, hidden endpoints, and company-owned infrastructure. This playbook provides a battle-tested methodology and toolchain designed to identify the complete externally visible attack surface before exploitation.

I. Foundation: Understanding Attack Surface Mapping

Attack surface mapping is the systematic identification of all externally accessible systems, applications, and services belonging to a target organization. The goal is comprehensive coverage across multiple dimensions:

Horizontal Expansion (infrastructure breadth): Discovering related domains, acquisitions, IP ranges, cloud assets, and associated infrastructure owned or operated by the target organization.

Vertical Deepening (component depth): Identifying technologies, versions, services, open ports, endpoints, parameters, and exposed functionality within discovered assets.

The fundamental principle is that you cannot exploit what you cannot find. Professional bug bounty hunters spend 60-70% of their engagement time on reconnaissance because the quality of subsequent vulnerability discovery depends entirely on the completeness of the attack surface map.

II. Phase 1: Passive Reconnaissance & Intelligence Gathering

Passive reconnaissance involves extracting publicly available information without directly probing the target. This phase establishes your baseline understanding of the organization's digital footprint.

A. Subdomain Enumeration (Passive)

Primary Tools:

Subfinder is a widely used tool for rapid passive subdomain discovery. It aggregates data from 30+ passive sources including Certificate Transparency logs, DNS APIs, web archives, and threat intelligence feeds. Its main strength is speed: it can return thousands of candidate subdomains per minute.

Command: subfinder -d example.com -all

Amass (OWASP) provides the most comprehensive results by combining passive sources (DNS records, CT logs, archives, reverse DNS) with graph-based relationship tracking. While slower than Subfinder, Amass discovers 20-30% more subdomains on average, particularly for large organizations.

Command: amass enum -d example.com -passive

BBOT is an emerging alternative that reports 20-50% higher subdomain discovery rates than Amass/Subfinder due to NLP-powered mutation techniques and 100+ integrated modules. It's event-driven and recursive, meaning discovered subdomains automatically trigger additional enumeration.

Command: bbot -t example.com -p subdomain-enum

Findomain prioritizes speed with parallel processing across multiple data sources and supports API integration for automation at scale.

B. Certificate Transparency Log Mining

SSL/TLS certificates are mandatory for HTTPS domains and are logged publicly in Certificate Transparency (CT) logs. These logs reveal domain names, subdomains, and email addresses associated with certificates.

Primary Sources:

Censys (censys.io) - Queries CT logs in real-time, supports bulk searches
crt.sh (free CT log aggregator) - Simplest interface, no authentication required
SecurityTrails - Historical certificate data with API access

CT logs are particularly valuable because they reveal:

Development/staging subdomains (dev.example.com, staging-api.example.com)
Acquisition targets (recently acquired domains)
Internal naming schemes and architecture hints
Email domains and addresses

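Command: curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/^\*\.//' | sort -u

(The % wildcard is URL-encoded as %25. Each name_value field can hold several newline-separated names, so jq -r is enough to put one name per line.)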
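When CT results feed later pipeline stages, the same extraction can be scripted. The sketch below is a minimal Python example, assuming the crt.sh JSON shape of a list of objects whose name_value field may hold several newline-separated names. The function name names_from_crtsh and the sample records are illustrative, not part of any tool.

```python
import json

def names_from_crtsh(raw_json: str, domain: str) -> list[str]:
    """Return sorted, de-duplicated hostnames under `domain` from crt.sh-style JSON."""
    domain = domain.lower()
    suffix = "." + domain
    names = set()
    for entry in json.loads(raw_json):
        # One certificate entry may list several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard prefix
            # Keep only names inside the target domain (skips emails and other domains).
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)

# Hypothetical sample data in the assumed response shape.
sample = json.dumps([
    {"name_value": "dev.example.com\n*.example.com"},
    {"name_value": "Staging-API.example.com"},
    {"name_value": "admin@example.com"},
    {"name_value": "other.org"},
])
print(names_from_crtsh(sample, "example.com"))
```

Filtering by suffix keeps email addresses and unrelated domains that share a certificate out of the subdomain list.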

C. GitHub Dorking & Code Repository Intelligence

Developer mistakes are one of the richest sources of reconnaissance data. GitHub repositories frequently leak API endpoints, internal domain names, configuration files, and credentials.

Techniques:

Organization targeting: Use org:targetcompany in GitHub's code search to find all public repositories
File discovery: Search for configuration patterns like .env, config.json, settings.xml, docker-compose.yml
Keyword hunting: Search for company-specific patterns, internal API names, or infrastructure terms
Endpoint harvesting: Look for API URLs, webhook targets, and service endpoints in code

Tools:

GitHub CLI code search: gh search code "api_key" --owner examplecorp
GitDorker (automated dork execution against org code)
Gitleaks / TruffleHog (scan repositories for exposed secrets)

Key Dorks:

org:targetcompany filename:.env
org:targetcompany "api_key"
org:targetcompany "internal"
user:employee_name filename:config

D. Internet-Wide Scanning Platforms (Passive Querying)

Platforms like Shodan, ZoomEye, Censys, and Netlas maintain databases of internet-wide scans. Querying these platforms (without triggering active scans) provides:

Open ports and services on company IP addresses
Technology fingerprints and versions
Certificate information and SSL/TLS details
Historical data on previously exposed assets

Platform Comparison (2026):

Platform	Ports Scanned	Strength	Use Case
Shodan	1237	Largest dataset, best UI	Initial discovery
ZoomEye	3828	Asian infrastructure, historical data	Comprehensive coverage
Censys	N/A	Certificate focus, API-first	Structured searches
Netlas	146 public / 1300+ private	Consistent freshness, ASM	Attack surface monitoring

E. WHOIS, DNS & ASN Reconnaissance

ASN Mapping identifies the IP space owned or operated by the target organization, enabling you to discover forgotten or unmaintained infrastructure.

Command: asnmap -i 17.0.0.0 -json -o output.json (finds the ASN announcing this IP and lists its prefixes; use -a to query an AS number directly)

Horizontal domain correlation uses WHOIS records, domain registrations, and business registries to identify:

Related companies (acquisitions, subsidiaries)
Shared infrastructure (same IP ranges, name servers, registrars)
Associated domain owners

Tools: bgp.he.net (ASN lookup), WHOIS queries via whois -h whois.radb.net, WhoisXMLAPI
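Once the prefixes for a target are known, resolved IP addresses can be checked against them to separate company-operated hosts from third-party hosting. The following is a minimal sketch using Python's standard ipaddress module. The function name in_scope_ips and the sample prefixes and addresses are assumptions for illustration.

```python
import ipaddress

def in_scope_ips(ips, prefixes):
    """Return the IPs (in input order) that fall inside any of the given CIDR prefixes."""
    networks = [ipaddress.ip_network(p, strict=False) for p in prefixes]
    hits = []
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        # Membership is False when the address and network versions differ.
        if any(addr in net for net in networks):
            hits.append(ip)
    return hits

# Hypothetical prefixes, e.g. taken from an ASN lookup.
prefixes = ["17.0.0.0/8", "2001:db8::/32"]
resolved = ["17.1.2.3", "8.8.8.8", "2001:db8::1"]
print(in_scope_ips(resolved, prefixes))
```

Hosts that resolve outside the announced prefixes are not necessarily out of scope (they may sit on a CDN or cloud provider), so this check works best as a sorting aid rather than a filter.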

F. Favicon Hashing for Infrastructure Discovery

Favicon hashes (MurmurHash3) identify services across the internet that share the same application or icon. This technique reveals:

All publicly accessible instances of internal tools (Jira, Jenkins, etc.)
Services behind load balancers or CDNs with the same favicon
Related infrastructure operated by the same team

Command: Extract favicon hash, query Shodan: http.favicon.hash:[hash]

Tools: Huntrs (automated favicon extraction + reverse IP lookup), OWASP Favicon Database
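The hash used in http.favicon.hash queries is commonly described as the signed 32-bit MurmurHash3 of the favicon's base64 encoding, with line breaks inserted the way Python's base64.encodebytes does. Most scripts use the third-party mmh3 package for this; the sketch below implements the 32-bit hash in pure Python only to stay self-contained. The function names are illustrative, and the result is best confirmed against a known favicon before relying on it.

```python
import base64

def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit) returning an unsigned 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF

    def mix(k: int) -> int:
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask  # rotate left by 15
        return (k * c2) & mask

    h = seed & mask
    full = len(data) & ~3  # length of the part made of whole 4-byte blocks
    for i in range(0, full, 4):
        h ^= mix(int.from_bytes(data[i:i + 4], "little"))
        h = ((h << 13) | (h >> 19)) & mask  # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask
    if len(data) > full:  # 1 to 3 trailing bytes
        h ^= mix(int.from_bytes(data[full:], "little"))
    # Finalization: fold in the length, then avalanche the bits.
    h ^= len(data) & mask
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h

def favicon_hash(favicon_bytes: bytes) -> int:
    """Signed 32-bit hash of the base64-encoded favicon (assumed Shodan convention)."""
    encoded = base64.encodebytes(favicon_bytes)  # base64 with newlines every 76 chars
    unsigned = murmur3_32(encoded)
    return unsigned - (1 << 32) if unsigned & 0x80000000 else unsigned
```

In practice the bytes would come from a downloaded /favicon.ico, and the returned integer is what goes into the http.favicon.hash filter.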

III. Phase 2: Active Enumeration & Service Discovery

Active enumeration involves direct probing of discovered assets and their network infrastructure. This phase requires careful scoping to avoid triggering WAF/IDS alerts.

A. Comprehensive Subdomain Validation & HTTP Probing

Discovered subdomains must be validated (some may not resolve or respond). The tool HTTPX efficiently probes thousands of subdomains in parallel, returning only those with active HTTP/HTTPS services.

Command: cat subdomains.txt | httpx -status-code -title -tech-detect -o alive_hosts.txt

HTTPX output includes:

HTTP status codes (identify redirects, errors, accessible endpoints)
Page titles (useful for identifying admin panels, login pages)
Technology fingerprinting (detects WordPress, nginx versions, etc.)
Response headers (reveals CDN, WAF, and service information)

B. DNS Resolution & Wildcard Detection

DNS resolution validates that discovered subdomains actually resolve to IP addresses. PureDNS combines high-speed DNS resolution (via MassDNS) with sophisticated wildcard subdomain filtering:

Command: puredns bruteforce wordlist.txt example.com -r resolvers.txt --wildcard-tests 50

PureDNS filters out false positives from wildcard DNS records that respond to any subdomain query, saving time and reducing noise.
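The idea behind wildcard filtering can be sketched in a few lines: resolve random labels that should not exist, and treat any candidate that resolves only to those same addresses as a wildcard hit. This is a simplified illustration, not PureDNS's actual algorithm. The function names and the fake resolver are hypothetical, and the resolver is injectable so the logic can be exercised without network access.

```python
import secrets
import socket

def resolve(hostname):
    """Return the set of IPs for hostname, or an empty set if it does not resolve."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except socket.gaierror:
        return set()

def wildcard_ips(domain, resolver=resolve, tests=3):
    """Resolve random labels under domain; any answers indicate a wildcard record."""
    ips = set()
    for _ in range(tests):
        label = secrets.token_hex(8)  # random label that should not exist
        ips |= resolver(f"{label}.{domain}")
    return ips

def filter_wildcards(candidates, domain, resolver=resolve):
    """Keep candidates that resolve to at least one non-wildcard address."""
    wild = wildcard_ips(domain, resolver)
    kept = []
    for host in candidates:
        ips = resolver(host)
        if ips and not ips <= wild:
            kept.append(host)
    return kept

def fake_resolver(host):
    """Hypothetical zone: one real record, everything else hits a wildcard."""
    table = {"www.example.com": {"203.0.113.10"}}
    return table.get(host, {"198.51.100.1"})

print(filter_wildcards(["www.example.com", "junk.example.com"], "example.com", fake_resolver))
```

A real host that happens to share the wildcard's IP would be dropped by this simple version, which is one reason dedicated tools run more tests and compare more than addresses.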

C. Port Scanning & Service Enumeration

Naabu (by ProjectDiscovery) provides fast port scanning across thousands of IPs/subdomains, identifying open ports for follow-up service enumeration.

Command: naabu -l hosts.txt -p 80,443,8080,8443,3306,5432,27017 -o ports.txt

Critical ports to scan:

Web services: 80, 443, 8080, 8443, 8888, 9000
Databases: 3306 (MySQL), 5432 (PostgreSQL), 27017 (MongoDB), 6379 (Redis)
Remote access: 22 (SSH), 3389 (RDP), 5900 (VNC)
APIs & services: 8000-8100 (common API ranges)

D. Web Technology Fingerprinting

Technology detection reveals:

CMS and framework versions (WordPress 5.8.1, etc.)
Outdated or vulnerable libraries
Custom applications and internal tools
Infrastructure components (load balancers, WAFs)

Tools:

Wappalyzer (passive technology detection)
Nuclei with tech-detect templates
Shodan/Censys queries for specific technologies

E. Cloud Infrastructure & Storage Enumeration

Cloud platforms (AWS, GCP, Azure) often expose storage buckets, databases, and services through misconfigurations.

S3 Bucket Discovery:

Identify cloud usage (DNS CNAME to s3.amazonaws.com, GCP records, etc.)
Enumerate bucket name permutations: example, example-prod, example-backup, example-data
Test bucket accessibility via HTTPS: https://bucket-name.s3.amazonaws.com/
Assess permissions using the AWS CLI or, on GCP, the testIamPermissions API

Command: aws s3 ls s3://bucket-name/ --no-sign-request --region us-east-1 (anonymous listing; succeeds only if the bucket allows public list access)

Tools:

cloud_enum (multi-cloud enumeration for AWS/GCP/Azure)
S3Scanner / GCPBucketBrute
Nuclei cloud templates (detect misconfigured cloud resources)

GCP Buckets: https://storage.googleapis.com/bucket-name/
Azure Blobs: https://accountname.blob.core.windows.net/container-name/
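The bucket name permutation step can be sketched as a small generator. This is a minimal example under stated assumptions: the suffix list, the separators, and the function names are illustrative choices, and the URL patterns are the AWS and GCP forms shown in this section. Azure is left out because it needs both an account and a container name.

```python
from itertools import product

def bucket_candidates(keyword, suffixes=("prod", "backup", "data", "dev"), separators=("-", "")):
    """Generate candidate bucket names from a keyword and common suffixes."""
    names = {keyword}
    for sep, suffix in product(separators, suffixes):
        names.add(f"{keyword}{sep}{suffix}")  # e.g. example-prod, exampleprod
        names.add(f"{suffix}{sep}{keyword}")  # e.g. prod-example, prodexample
    return sorted(names)

def bucket_urls(name):
    """Map a candidate name to the URL forms used for AWS S3 and GCP buckets."""
    return {
        "aws": f"https://{name}.s3.amazonaws.com/",
        "gcp": f"https://storage.googleapis.com/{name}/",
    }

for candidate in bucket_candidates("example")[:3]:
    print(candidate, bucket_urls(candidate))
```

The candidate list only proposes names to check; whether checking them is permitted depends on the program's scope.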

IV. Phase 3: Deep Reconnaissance & Hidden Asset Discovery

A. JavaScript Analysis & Endpoint Extraction

Modern web applications load functionality from JavaScript files. These files often contain:

API endpoints (hardcoded URLs to /api/v1/, /graphql, etc.)
Authentication tokens and service URLs
Frontend routing and application structure
References to internal services and domains

Extraction Methods:

Burp Suite + JS Link Finder: Crawl the application via Burp proxy; JS Link Finder automatically extracts endpoints from intercepted JavaScript files
Manual parsing: Search JavaScript for patterns like fetch(), axios.post(), /api/
Automated tools: JSpector, JS Miner parse all JavaScript in scope

Command: grep -r "fetch(\|axios\.\|api\/" *.js | grep -oP '(https?:)?//[^\s"]+' | sort -u
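Relative paths such as /api/v1/users carry no scheme, so a pattern that looks only for absolute URLs will miss them. The sketch below is a minimal Python example that pulls both absolute URLs and root-relative paths out of quoted strings in JavaScript source. The regular expression and the function name extract_endpoints are assumptions for illustration and will produce some false positives on real bundles.

```python
import re

# Quoted string literals that start with http(s):// or with a slash.
QUOTED_ENDPOINT = re.compile(r"""["'`]((?:https?://|/)[^"'`\s]+)["'`]""")

def extract_endpoints(js_source: str) -> list[str]:
    """Return sorted unique URL-like string literals found in JavaScript source."""
    return sorted({m.group(1) for m in QUOTED_ENDPOINT.finditer(js_source)})

# Hypothetical snippet of application JavaScript.
sample_js = '''
fetch("/api/v1/users");
axios.post('https://internal.example.com/graphql', body);
const logo = "logo.png";
'''
print(extract_endpoints(sample_js))
```

Plain file names like logo.png are ignored because they do not start with a slash or a scheme.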

B. Parameter Discovery & Hidden Input Fields

Web applications often have undocumented or hidden parameters that accept user input. Discovering these parameters reveals:

Debugging endpoints (debug=true, verbose=1)
Undocumented features (internal flags, admin features)
Legacy parameter support that maintains backward compatibility
Sensitive features protected by weak validation

Parameter Discovery Tools:

Arjun (Python): Brute-force parameter names from wordlists against target endpoints
  Command: arjun -u https://example.com -o found_params.txt
ParamSpider: Harvest parameters from Wayback Machine archives and web crawls
  Command: paramspider -d example.com --stream
Param Miner (Burp extension): Uses advanced diffing techniques to guess up to 65,000 parameter names per request
FFUF: Fast parameter fuzzing with custom wordlists
  Command: ffuf -w wordlist.txt -u "https://example.com/api/user?FUZZ=test" -fw 20
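Harvesting parameters from archived URLs comes down to parsing query strings and counting names. The sketch below is a minimal Python example using the standard urllib.parse module. The function name harvest_params and the sample URLs are illustrative, and it assumes the archived URLs have already been collected into a list.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

def harvest_params(urls):
    """Count how often each query parameter name appears across a list of URLs."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values retains parameters that appear with an empty value.
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts

# Hypothetical archived URLs.
archived = [
    "https://example.com/search?q=test&debug=true",
    "https://example.com/item?id=5&debug=",
    "https://example.com/about",
]
print(harvest_params(archived).most_common())
```

The resulting names make a target-specific wordlist for brute-force tools, which tends to be more productive than a generic list alone.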

C. Directory & Path Enumeration

Common paths and directories often exist across web applications:

Command: ffuf -w common_paths.txt -u https://example.com/FUZZ -o dirs.txt

High-value paths to fuzz:

/admin/, /administrator/, /wp-admin/ (admin panels)
/.env, /config/, /settings/ (configuration files)
/.git/, /.svn/ (version control exposure)
/api/, /api/v1/, /api/v2/ (API endpoints)
/backup/, /old/, /test/ (legacy/backup data)
/.well-known/ (service discovery)

D. API Enumeration & Hidden Endpoints

APIs are frequently targeted in bug bounties due to weak authentication and authorization. Discovering the full API surface is critical.

API Discovery Workflow:

Identify API patterns in traffic: /api/, /graphql, /rest/
Extract endpoints from JavaScript files
Use Burp Intruder to fuzz path segments
Test different HTTP methods (GET, POST, PUT, DELETE, PATCH)
Discover parameters specific to each endpoint

Command: httpx -l subdomains.txt -path /api/ -status-code (identifies APIs)

GraphQL Detection: Check for /graphql, /graphql/query, /query endpoints and attempt introspection queries
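An introspection check can be sketched as a request body plus a test on the response. The example below is a minimal Python illustration: the reduced query uses the standard __schema, queryType, and types fields, while the function names and sample responses are assumptions. Sending the request is left out so the logic stays runnable offline.

```python
import json

# Reduced introspection query; full introspection queries request far more detail.
INTROSPECTION_QUERY = "query { __schema { queryType { name } types { name } } }"

def introspection_body() -> bytes:
    """JSON body to POST to a candidate GraphQL endpoint."""
    return json.dumps({"query": INTROSPECTION_QUERY}).encode()

def introspection_enabled(response_text: str) -> bool:
    """True if the response looks like a successful introspection result."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return False  # HTML error pages and other non-JSON responses
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    return isinstance(data, dict) and data.get("__schema") is not None

# Hypothetical responses.
ok = '{"data": {"__schema": {"queryType": {"name": "Query"}, "types": []}}}'
blocked = '{"errors": [{"message": "introspection is disabled"}]}'
print(introspection_enabled(ok), introspection_enabled(blocked))
```

An endpoint that rejects introspection may still be a GraphQL endpoint, so a structured errors array is itself a useful detection signal.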

V. Phase 4: Attack Surface Automation & Orchestration

A. Integrated Scanning Frameworks

BBOT is the most advanced reconnaissance automation platform (2026), combining 100+ modules with recursive, event-driven triggering.

Example: Complete Subdomain + Cloud + Ports + Vulnerabilities Scan

bbot -t example.com \
  -p subdomain-enum cloud-enum web-basic \
  -m nmap gowitness nuclei \
  --allow-deadly

BBOT modules:

subdomain-enum: Multiple passive and active subdomain sources
cloud-enum: AWS, GCP, Azure bucket and resource discovery
web-basic: Technology fingerprinting, robots.txt, certificate analysis
nmap: Port scanning and service enumeration
gowitness: Web screenshots for visual asset mapping
nuclei: Template-based vulnerability scanning

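Results are written as files in the scan's output directory by default; the optional neo4j output module loads them into a Neo4j graph database for relationship visualization and filtering.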

B. Nuclei Template-Based Scanning

Nuclei is a template-based vulnerability scanner designed for automation. Templates define:

Target matching conditions (technology-specific scans)
HTTP requests (fuzzing payloads, custom headers)
Response analysis (matchers for success/failure)

Nuclei Workflow:

# Update templates daily
nuclei -update-templates

# Scan for specific vulnerability class
nuclei -l live_hosts.txt -tags sqli,xss,rce -o vulnerabilities.txt

# Run custom templates
nuclei -l hosts.txt -t custom_templates/ -config nuclei.conf

Critical Nuclei Templates for Reconnaissance:

http/cves/ (CVE-specific checks)
http/technologies/ (app fingerprinting)
cloud-config/ (cloud misconfiguration)
dns/ (DNS-based checks)
ssl/ (certificate issues)

C. HTTPX Integration Pipeline

Chain multiple ProjectDiscovery tools for efficiency:

# Discover subdomains and keep only hosts with live HTTP services
subfinder -d example.com -silent | httpx -silent -o live_hosts.txt

# Screenshot web interfaces (gowitness v3 syntax; v2 uses: gowitness file -f live_hosts.txt)
gowitness scan file -f live_hosts.txt

# Scan for vulnerabilities
nuclei -l live_hosts.txt -o final_report.txt

D. Parameter & Endpoint Discovery at Scale

# Extract absolute URLs referenced by each live page
while read -r url; do
  curl -s "$url" | grep -oP "(https?://[^\s\"']+)"
done < alive_hosts.txt | sort -u > all_endpoints.txt

# Fuzz parameters on discovered endpoints
ffuf -w params_wordlist.txt -u "[ENDPOINT]?FUZZ=test" -fc 404

VI. Threat Intelligence & External Datasources

A. Shodan Queries for Reconnaissance

Shodan provides immediate visibility into exposed services without active scanning.

High-Value Queries:

org:"Company Name" port:80,443,8080
org:"Company Name" product:nginx
org:"Company Name" "X-Powered-By"
org:"Company Name" http.title:"Admin"
org:"Company Name" os:Linux
org:"Company Name" mongodb

Favicon Hash Search:

http.favicon.hash:[hash_value]  # Find all instances of a specific app

B. Censys & SecurityTrails

Censys: Query CT logs, IP space ownership, certificate timelines
SecurityTrails: Historical DNS records, WHOIS changes, subdomain history

VII. Organizing & Prioritizing Results

A. Data Consolidation

Aggregate results into structured files:

# All discovered subdomains
cat subfinder.txt amass.txt bbot.txt | sort -u > all_subdomains.txt

# Alive hosts with technologies
httpx -l all_subdomains.txt -json > live_hosts.json

# Merge port scan data (JSON lines from naabu -json, with host and port fields)
jq -r '"\(.host):\(.port)"' ports.json > services.txt
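When consolidation outgrows one-liners, the same merge can be done in a script. The sketch below is a minimal Python example that turns JSON Lines port-scan records into unique host:port strings. It assumes each record carries host and port fields; the function name services_from_jsonl and the sample records are illustrative.

```python
import json

def services_from_jsonl(lines):
    """Return sorted unique "host:port" strings from JSON Lines scan records."""
    services = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the input
        record = json.loads(line)
        # The port is numeric in the JSON, so format it rather than concatenate.
        services.add(f"{record['host']}:{record['port']}")
    return sorted(services)

# Hypothetical scan output in the assumed shape.
scan_lines = [
    '{"host": "a.example.com", "ip": "203.0.113.5", "port": 443}',
    "",
    '{"host": "a.example.com", "ip": "203.0.113.5", "port": 80}',
    '{"host": "a.example.com", "ip": "203.0.113.5", "port": 443}',
]
print(services_from_jsonl(scan_lines))
```

To run it on a real file, pass an open file handle, since iterating a file yields one line at a time.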

B. Risk Prioritization

Focus on highest-risk assets:

Exposed admin panels (identify by title, path, technology)
Known vulnerable technologies (outdated WordPress, Joomla, etc.)
Cloud storage (S3 buckets, GCP storage, Azure blobs)
API endpoints without apparent authentication
Development/staging subdomains (weaker security controls)
Database services exposed on non-standard ports
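A simple scoring pass can turn these priorities into a sorted work queue. The sketch below is a minimal Python example. The keywords, the weights, and the record fields (url, title, tech) are assumptions chosen for illustration and should be tuned per program.

```python
def score_host(host: dict) -> int:
    """Assign a rough priority score to one probed host record."""
    title = host.get("title", "").lower()
    url = host.get("url", "").lower()
    tech = [t.lower() for t in host.get("tech", [])]
    score = 0
    if any(word in title for word in ("admin", "login", "dashboard")):
        score += 3  # exposed admin or login panel
    if any(t.startswith(("wordpress", "joomla")) for t in tech):
        score += 2  # CMS worth checking for outdated versions
    if any(tag in url for tag in ("dev.", "staging", "test.")):
        score += 2  # development/staging host
    if "/api" in url:
        score += 1  # API endpoint
    return score

def prioritize(hosts):
    """Return hosts ordered from highest to lowest score."""
    return sorted(hosts, key=score_host, reverse=True)

# Hypothetical host records.
hosts = [
    {"url": "https://www.example.com", "title": "Home", "tech": ["Nginx"]},
    {"url": "https://staging.example.com/api", "title": "Admin Login", "tech": ["WordPress:5.8.1"]},
]
for h in prioritize(hosts):
    print(score_host(h), h["url"])
```

Additive weights keep the ranking easy to explain in a report; the exact numbers matter less than applying them consistently.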

VIII. Detection Evasion & Operational Security

A. Rate Limiting & Stealth

Professional scanners implement rate limiting to avoid triggering WAF/IDS:

# HTTPX with reduced concurrency and a requests-per-second cap
httpx -l hosts.txt -threads 10 -rate-limit 20 -timeout 10

# FFUF with delay between requests
ffuf -w wordlist.txt -u "https://example.com/FUZZ" -p "0.1-0.2"
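Custom scripts need the same discipline as the tools above. The sketch below is a minimal Python example of a jittered delay between requests, mirroring the 0.1 to 0.2 second range in the ffuf example. The function name throttled is illustrative, and the sleep function is injectable so the behavior can be checked without actually waiting.

```python
import random
import time

def throttled(items, min_delay=0.1, max_delay=0.2, sleep=time.sleep):
    """Yield items with a random pause between consecutive items."""
    first = True
    for item in items:
        if not first:
            # Random jitter avoids a perfectly regular request rhythm.
            sleep(random.uniform(min_delay, max_delay))
        first = False
        yield item

# Record the delays instead of sleeping, to show what would happen.
recorded = []
for url in throttled(["https://example.com/a", "https://example.com/b"], sleep=recorded.append):
    print("would request", url)
print(len(recorded), "pause(s) taken")
```

Wrapping the iteration rather than the request function keeps the throttle independent of whichever HTTP client the script uses.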

B. WAF Detection & Evasion

Identify and bypass Web Application Firewalls:

WAF Detection:

Use WAFW00F to identify WAF type
Analyze error messages, status codes, and response patterns
Check for common WAF headers (X-Sucuri, X-Fortinet, etc.)

Evasion Techniques:

Randomize User-Agent headers
Distribute traffic across multiple IPs
Use slow scanning rates
Encode payloads (Base64, URL encoding)
Leverage CNAME/redirect chains to bypass IP-based blocking

IX. Complete 2026 Toolchain Summary

Phase	Primary Tools	Alternative Tools
Passive Subdomain	Subfinder, Amass, BBOT	Findomain, Sublist3r
DNS Resolution	PureDNS, MassDNS	DNSRecon
HTTP Probing	HTTPX	Httprobe, Curl
Port Scanning	Naabu	Nmap, Masscan
Technology Fingerprint	Wappalyzer, Nuclei	Shodan queries
Cloud Enumeration	cloud_enum, BBOT	S3Scanner, GCPBucketBrute
API Discovery	Burp Suite, JS Link Finder	Manual JavaScript analysis
Parameter Discovery	Arjun, ParamSpider, FFUF	Param Miner, x8
Vulnerability Scanning	Nuclei	Burp Suite, OWASP ZAP
Automation/Orchestration	BBOT, Hakscale	Custom bash/Python
Visualization	Neo4j (BBOT output)	Maltego, Shodan UI

X. Workflow Example: Complete Reconnaissance

Target: example.com (unknown scope)

Day 1: Passive Intelligence

# Subdomain enumeration (parallel)
subfinder -d example.com -o subfinder.txt &
amass enum -d example.com -passive -o amass.txt &
bbot -t example.com -p subdomain-enum -o bbot/ -n day1 -y &
wait  # let all three background jobs finish before consolidating

# CT log mining
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/^\*\.//' > ct.txt

# GitHub dorking
gh search code "api" --owner example-org > github_endpoints.txt

# Consolidate (BBOT writes subdomains.txt inside its scan directory)
cat subfinder.txt amass.txt bbot/day1/subdomains.txt ct.txt | sort -u > all_subdomains.txt

Day 2: Validation & Enumeration

# HTTP probing with technology detection
httpx -l all_subdomains.txt -status-code -title -tech-detect -json > live_hosts.json

# Extract high-value targets
jq -r 'select(.status_code | IN(200,301,302,403)) | .url' live_hosts.json > targets.txt

# Port scanning (pass bare hostnames to naabu: strip scheme, port, and path)
sed -E 's#^https?://##; s#[:/].*$##' targets.txt | sort -u > target_hosts.txt
naabu -l target_hosts.txt -p 80,443,8080,8443,3306,5432,27017 -json > ports.json

# Cloud enumeration (-o names an output directory)
bbot -t example.com -p cloud-enum -o cloud_results/

Day 3: Deep Reconnaissance & Vulnerability Assessment

# JavaScript endpoint extraction & analysis
for url in $(cat targets.txt); do
  curl -s "$url" | grep -oP "(https?://[^\s\"'<>]+)" | grep -E "api|graphql" >> api_endpoints.txt
done

# Parameter discovery
arjun -u "$(head -1 targets.txt)" -o found_params.txt

# Nuclei vulnerability scanning
nuclei -l targets.txt -t templates/ -tags sqli,xss,ssrf,rce -o vulns.txt

# Custom template scans for identified technologies
nuclei -l targets.txt -t custom_templates/wordpress/ -o wordpress_vulns.txt

Day 4: Reporting & Prioritization

# Consolidate findings
{
  echo "=== Discovered Assets ==="
  wc -l all_subdomains.txt
  echo "=== Live Services ==="
  jq '.status_code' live_hosts.json | sort | uniq -c
  echo "=== High-Risk Technologies ==="
  jq -r '.tech[]?' live_hosts.json | sort | uniq -c | sort -rn
  echo "=== Vulnerabilities Found ==="
  wc -l vulns.txt
} > reconnaissance_summary.txt

XI. Advanced Techniques & 2026 Trends

A. AI-Powered OSINT

Modern bug bounty professionals leverage AI/LLM tools for:

Automated script generation for custom reconnaissance
Payload crafting and optimization
Analyzing error messages for vulnerability clues
Summarizing large datasets and identifying patterns

Tools: Claude, GPT-4 (for payload generation and analysis), Nuclei + AI plugins

B. Agentic & Continuous Reconnaissance

2025-2026 tools implement autonomous agents that:

Run continuous, automated scanning
Trigger dependent scans based on discoveries
Learn from previous findings
Adapt scanning strategies in real-time

BBOT exemplifies this approach with event-driven modules that automatically cascade discoveries.

C. Multicloud & Serverless Reconnaissance

Focus areas:

Lambda function enumeration
API Gateway discovery
Serverless database exposure
Container registry scanning
Infrastructure-as-Code repository mining

XII. Ethical & Legal Considerations

Always verify scope against the bug bounty program rules
Avoid resource exhaustion (rate limit scanning)
Do not access or download data from misconfigured storage
Report findings responsibly with clear proof-of-concept
Respect rate limits on third-party OSINT platforms
Use VPN/proxy to avoid exposing your IP during scans

Conclusion

Attack surface mapping in 2026 is a discipline combining automated discovery, intelligent filtering, and manual verification. Success depends on:

Comprehensive tooling (BBOT, Nuclei, HTTPX + specialized tools)
Layered approach (passive → active → deep enumeration)
Integration (automated pipelines combining multiple data sources)
Context awareness (prioritizing findings based on risk)
Continuous learning (staying current with new tools and techniques)

The reconnaissance phase determines the quality of subsequent vulnerability discovery. Invest 60-70% of your time in thorough, systematic attack surface mapping, and the exploitation phase becomes exponentially more productive.

jfmaes/Bugbounty.md
