@jfmaes
Created January 27, 2026 08:53
Bugbounty.md

Bug Bounty Hunting: OSINT, Reconnaissance & Attack Surface Mapping Playbook

Executive Summary

Bug bounty hunting in 2026 demands a disciplined, automated approach to reconnaissance that integrates open-source intelligence (OSINT) with continuous asset discovery. Modern attack surface mapping combines passive intelligence gathering, active enumeration, and intelligent automation across multiple attack vectors—subdomains, cloud infrastructure, hidden endpoints, and company-owned infrastructure. This playbook provides a battle-tested methodology and toolchain designed to identify the complete externally-visible attack surface before exploitation.


I. Foundation: Understanding Attack Surface Mapping

Attack surface mapping is the systematic identification of all externally accessible systems, applications, and services belonging to a target organization. The goal is comprehensive coverage across multiple dimensions:

Horizontal Expansion (infrastructure breadth): Discovering related domains, acquisitions, IP ranges, cloud assets, and associated infrastructure owned or operated by the target organization.

Vertical Deepening (component depth): Identifying technologies, versions, services, open ports, endpoints, parameters, and exposed functionality within discovered assets.

The fundamental principle is that you cannot exploit what you cannot find. Experienced bug bounty hunters commonly devote the majority of an engagement (60-70% by this playbook's estimate) to reconnaissance, because the quality of subsequent vulnerability discovery depends on the completeness of the attack surface map.


II. Phase 1: Passive Reconnaissance & Intelligence Gathering

Passive reconnaissance involves extracting publicly available information without directly probing the target. This phase establishes your baseline understanding of the organization's digital footprint.

A. Subdomain Enumeration (Passive)

Primary Tools:

Subfinder is the industry standard for rapid passive subdomain discovery. It aggregates data from 30+ passive sources, including Certificate Transparency logs, DNS APIs, web archives, and threat intelligence feeds. Its main strength is speed: it can return thousands of candidate subdomains per minute.

Command: subfinder -d example.com -all

Amass (OWASP) aims for the most comprehensive results by combining passive sources (DNS records, CT logs, archives, reverse DNS) with graph-based relationship tracking. It is slower than Subfinder but is often reported to find 20-30% more subdomains, particularly for large organizations.

Command: amass enum -d example.com -passive

BBOT is an emerging alternative that reports 20-50% higher subdomain discovery rates than Amass/Subfinder due to NLP-powered mutation techniques and 100+ integrated modules. It's event-driven and recursive, meaning discovered subdomains automatically trigger additional enumeration.

Command: bbot -t example.com -p subdomain-enum

Findomain prioritizes speed with parallel processing across multiple data sources and supports API integration for automation at scale.

B. Certificate Transparency Log Mining

SSL/TLS certificates are mandatory for HTTPS domains and are logged publicly in Certificate Transparency (CT) logs. These logs reveal domain names, subdomains, and email addresses associated with certificates.

Primary Sources:

  • Censys (censys.io) – Queries CT logs in real-time, supports bulk searches
  • Crt.sh (free CT log aggregator) – Simplest interface, no authentication required
  • SecurityTrails – Historical certificate data with API access

CT logs are particularly valuable because they reveal:

  • Development/staging subdomains (dev.example.com, staging-api.example.com)
  • Acquisition targets (recently acquired domains)
  • Internal naming schemes and architecture hints
  • Email domains and addresses

Command: curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/^\*\.//' | sort -u
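For scripted pipelines, the same cleanup can be done in Python. The following is a minimal sketch, assuming the crt.sh JSON response has already been downloaded and that each record carries a name_value field with newline-separated names; the function name extract_hostnames and the sample data are illustrative only.

```python
import json


def extract_hostnames(crtsh_json: str, domain: str) -> list[str]:
    """Return sorted, unique hostnames under `domain` from crt.sh JSON output."""
    names = set()
    for record in json.loads(crtsh_json):
        # name_value may hold several SAN entries separated by newlines
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            # Drop the wildcard label so "*.dev.example.com" becomes "dev.example.com"
            if name.startswith("*."):
                name = name[2:]
            # Keep only the apex and true subdomains (filters emails and lookalikes)
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Hypothetical sample mimicking the shape of a crt.sh response
sample = json.dumps([
    {"name_value": "dev.example.com\n*.staging.example.com"},
    {"name_value": "DEV.example.com"},
    {"name_value": "admin@example.com"},
    {"name_value": "evil-example.com"},
])
print(extract_hostnames(sample, "example.com"))
```

The suffix check matters because certificate data can contain email addresses and lookalike domains that a plain substring match would let through.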

C. GitHub Dorking & Code Repository Intelligence

Developer mistakes are one of the richest sources of reconnaissance data. GitHub repositories frequently leak API endpoints, internal domain names, configuration files, and credentials.

Techniques:

  1. Organization targeting: Use org:targetcompany in GitHub's code search to find all public repositories
  2. File discovery: Search for configuration patterns like .env, config.json, settings.xml, docker-compose.yml
  3. Keyword hunting: Search for company-specific patterns, internal API names, or infrastructure terms
  4. Endpoint harvesting: Look for API URLs, webhook targets, and service endpoints in code

Tools:

  • GitHub CLI code search: gh search code "api_key" --owner examplecorp
  • GitDorker (automated dork execution against org code)
  • Gitleaks / TruffleHog (scan repositories for exposed secrets)

Key Dorks:

  • org:targetcompany filename:.env
  • org:targetcompany "api_key"
  • org:targetcompany "internal"
  • user:employee_name filename:config

D. Internet-Wide Scanning Platforms (Passive Querying)

Platforms like Shodan, ZoomEye, Censys, and Netlas maintain databases of internet-wide scans. Querying these platforms (without triggering active scans) provides:

  • Open ports and services on company IP addresses
  • Technology fingerprints and versions
  • Certificate information and SSL/TLS details
  • Historical data on previously exposed assets

Platform Comparison (2026):

| Platform | Ports Scanned | Strength | Use Case |
|----------|---------------|----------|----------|
| Shodan | 1237 | Largest dataset, best UI | Initial discovery |
| ZoomEye | 3828 | Asian infrastructure, historical data | Comprehensive coverage |
| Censys | N/A | Certificate focus, API-first | Structured searches |
| Netlas | 146 public / 1300+ private | Consistent freshness, ASM | Attack surface monitoring |

E. WHOIS, DNS & ASN Reconnaissance

ASN Mapping identifies the IP space owned or operated by the target organization, enabling you to discover forgotten or unmaintained infrastructure.

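Command: asnmap -i 17.0.0.0 -json -o output.json (resolves the IP to its ASN and lists the CIDR ranges announced by that ASN; use -a with an AS number to start from the ASN directly)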

Horizontal domain correlation uses WHOIS records, domain registrations, and business registries to identify:

  • Related companies (acquisitions, subsidiaries)
  • Shared infrastructure (same IP ranges, name servers, registrars)
  • Associated domain owners

Tools: bgp.he.net (ASN lookup), WHOIS queries via whois -h whois.radb.net, WhoisXMLAPI
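Once ASN lookups return a list of prefixes, it helps to merge overlapping ranges and to check whether a discovered IP falls inside them. This is a minimal sketch using only the standard library; the prefixes are documentation ranges used as placeholders, and the function names are hypothetical.

```python
import ipaddress


def collapse_prefixes(prefixes: list[str]) -> list[str]:
    """Merge adjacent or overlapping CIDR ranges into the smallest equivalent set."""
    nets = [ipaddress.ip_network(p, strict=False) for p in prefixes]
    merged = []
    # collapse_addresses rejects mixed IP versions, so handle v4 and v6 separately
    for version in (4, 6):
        same_version = [n for n in nets if n.version == version]
        merged.extend(ipaddress.collapse_addresses(same_version))
    return [str(n) for n in merged]


def in_scope(ip: str, prefixes: list[str]) -> bool:
    """Return True if `ip` belongs to any of the given prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(p, strict=False) for p in prefixes)


# Placeholder prefixes standing in for asnmap output
asn_prefixes = ["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24"]
print(collapse_prefixes(asn_prefixes))
print(in_scope("192.0.2.200", asn_prefixes))
```

An IP that resolves inside the organization's own prefixes is more likely self-hosted infrastructure, while one outside usually points to a CDN or third-party provider whose scope status needs checking.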

F. Favicon Hashing for Infrastructure Discovery

Favicon hashes (MurmurHash3) identify services across the internet that share the same application or icon. This technique reveals:

  • All publicly accessible instances of internal tools (Jira, Jenkins, etc.)
  • Services behind load balancers or CDNs with the same favicon
  • Related infrastructure operated by the same team

Command: Extract favicon hash, query Shodan: http.favicon.hash:[hash]

Tools: Huntrs (automated favicon extraction + reverse IP lookup), OWASP Favicon Database
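The hash that Shodan indexes is commonly computed as the signed 32-bit MurmurHash3 of the base64-encoded icon, with base64 line breaks included. The sketch below implements that with the standard library only; it assumes the favicon bytes have already been downloaded, and in practice the third-party mmh3 package is the usual shortcut for the hashing step.

```python
import base64


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit), returned as a signed integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    n = len(data)
    rounded = n - (n % 4)
    # Body: process the input four bytes at a time (little-endian)
    for i in range(0, rounded, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    # Tail: mix in the remaining one to three bytes
    tail = data[rounded:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    # Finalization mix
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    # Convert to a signed 32-bit value
    return h - (1 << 32) if h & 0x80000000 else h


def favicon_hash(icon_bytes: bytes) -> int:
    """Hash the base64 text (with its newlines) of the raw favicon bytes."""
    return murmur3_32(base64.encodebytes(icon_bytes))


# Placeholder bytes; a real run would use the downloaded /favicon.ico content
icon = b"\x00\x00\x01\x00placeholder-icon-bytes"
print(f"http.favicon.hash:{favicon_hash(icon)}")
```

The base64 step must keep the newline every 76 characters that encodebytes inserts; hashing the raw bytes or unwrapped base64 produces a value that will not match the index.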


III. Phase 2: Active Enumeration & Service Discovery

Active enumeration involves direct probing of discovered assets and their network infrastructure. This phase requires careful scoping to avoid triggering WAF/IDS alerts.

A. Comprehensive Subdomain Validation & HTTP Probing

Discovered subdomains must be validated (some may not resolve or respond). The tool HTTPX efficiently probes thousands of subdomains in parallel, returning only those with active HTTP/HTTPS services.

Command: cat subdomains.txt | httpx -status-code -title -tech-detect -o alive_hosts.txt

HTTPX output includes:

  • HTTP status codes (identify redirects, errors, accessible endpoints)
  • Page titles (useful for identifying admin panels, login pages)
  • Technology fingerprinting (detects WordPress, nginx versions, etc.)
  • Response headers (reveals CDN, WAF, and service information)

B. DNS Resolution & Wildcard Detection

DNS resolution validates that discovered subdomains actually resolve to IP addresses. PureDNS combines high-speed DNS resolution (via MassDNS) with sophisticated wildcard subdomain filtering:

Command: puredns bruteforce wordlist.txt example.com -r resolvers.txt --wildcard-tests 50

PureDNS filters out false positives from wildcard DNS records that respond to any subdomain query, saving time and reducing noise.
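The idea behind wildcard filtering can be illustrated with a simplified sketch: resolve a few random labels that should not exist, record the answers, and discard candidates whose answers fall entirely within that set. This is not how PureDNS is implemented internally; the resolver is injected as a function so the logic can be shown without network access, and all names here are hypothetical.

```python
import secrets
from typing import Callable, Optional

# A resolver takes a hostname and returns its set of IPs, or None if it does not resolve
Resolver = Callable[[str], Optional[frozenset]]


def wildcard_answers(domain: str, resolve: Resolver, tests: int = 3) -> set:
    """Resolve random, almost certainly nonexistent labels and collect any answers."""
    answers = set()
    for _ in range(tests):
        probe = f"{secrets.token_hex(8)}.{domain}"
        result = resolve(probe)
        if result:
            answers |= result
    return answers


def filter_wildcards(candidates: list, domain: str, resolve: Resolver, tests: int = 3) -> list:
    """Keep candidates that resolve to something other than the wildcard answers."""
    wild = wildcard_answers(domain, resolve, tests)
    kept = []
    for host in candidates:
        result = resolve(host)
        if result and not result <= wild:
            kept.append(host)
    return kept


# Fake zone: one real record plus a wildcard that answers every other name
ZONE = {"www.example.com": frozenset({"192.0.2.10"})}
WILDCARD = frozenset({"192.0.2.99"})


def fake_resolve(name: str) -> Optional[frozenset]:
    return ZONE.get(name, WILDCARD)


print(filter_wildcards(["www.example.com", "nope.example.com"], "example.com", fake_resolve))
```

One limitation of this simplification is that a real host pointing at the same IP as the wildcard record is dropped too, which is why dedicated tools apply more careful heuristics.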

C. Port Scanning & Service Enumeration

Naabu (by ProjectDiscovery) provides fast port scanning across thousands of IPs/subdomains, identifying open ports that can then be handed to a service scanner.

Command: naabu -l hosts.txt -p 80,443,8080,8443,3306,5432,27017 -o ports.txt

Critical ports to scan:

  • Web services: 80, 443, 8080, 8443, 8888, 9000
  • Databases: 3306 (MySQL), 5432 (PostgreSQL), 27017 (MongoDB), 6379 (Redis)
  • Remote access: 22 (SSH), 3389 (RDP), 5900 (VNC)
  • APIs & services: 8000-8100 (common API ranges)

D. Web Technology Fingerprinting

Technology detection reveals:

  • CMS and framework versions (WordPress 5.8.1, etc.)
  • Outdated or vulnerable libraries
  • Custom applications and internal tools
  • Infrastructure components (load balancers, WAFs)

Tools:

  • Wappalyzer (passive technology detection)
  • Nuclei with tech-detect templates
  • Shodan/Censys queries for specific technologies

E. Cloud Infrastructure & Storage Enumeration

Cloud platforms (AWS, GCP, Azure) often expose storage buckets, databases, and services through misconfigurations.

S3 Bucket Discovery:

  1. Identify cloud usage (DNS CNAME to s3.amazonaws.com, GCP records, etc.)
  2. Enumerate bucket name permutations: example, example-prod, example-backup, example-data
  3. Test bucket accessibility via HTTP: http://bucket-name.s3.amazonaws.com/
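  4. Assess permissions using the AWS CLI (for GCP buckets, the testIamPermissions API serves the same purpose)

Command: aws s3 ls s3://bucket-name/ --no-sign-request --region us-east-1 (lists the bucket anonymously if public listing is allowed)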

Tools:

  • cloud_enum (multi-cloud enumeration for AWS/GCP/Azure)
  • S3Scanner / GCPBucketBrute
  • Nuclei cloud templates (detect misconfigured cloud resources)

GCP Buckets: https://storage.googleapis.com/bucket-name/

Azure Blobs: https://accountname.blob.core.windows.net/container-name/
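The bucket-name permutation step can be sketched as a small generator. The suffix list and separators below are assumptions chosen for illustration, not an exhaustive wordlist, and the function names are hypothetical.

```python
from itertools import product


def bucket_candidates(keyword: str,
                      suffixes=("prod", "backup", "data", "dev", "staging"),
                      separators=("-", "")) -> list:
    """Generate candidate bucket names from a company keyword."""
    names = {keyword}
    for sep, suffix in product(separators, suffixes):
        names.add(f"{keyword}{sep}{suffix}")
        names.add(f"{suffix}{sep}{keyword}")
    # S3 bucket names must be between 3 and 63 characters long
    return sorted(n for n in names if 3 <= len(n) <= 63)


def s3_url(bucket: str) -> str:
    """Virtual-hosted-style URL for an S3 bucket."""
    return f"https://{bucket}.s3.amazonaws.com/"


def gcs_url(bucket: str) -> str:
    """Path-style URL for a Google Cloud Storage bucket."""
    return f"https://storage.googleapis.com/{bucket}/"


candidates = bucket_candidates("example")
print(len(candidates), "candidates, for instance:", s3_url("example-prod"))
```

Each generated URL can then be requested at a conservative rate; the response distinguishes a nonexistent bucket from one that exists but denies access.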


IV. Phase 3: Deep Reconnaissance & Hidden Asset Discovery

A. JavaScript Analysis & Endpoint Extraction

Modern web applications load functionality from JavaScript files. These files often contain:

  • API endpoints (hardcoded URLs to /api/v1/, /graphql, etc.)
  • Authentication tokens and service URLs
  • Frontend routing and application structure
  • References to internal services and domains

Extraction Methods:

  1. Burp Suite + JS Link Finder: Crawl the application via Burp proxy; JS Link Finder automatically extracts endpoints from intercepted JavaScript files
  2. Manual parsing: Search JavaScript for patterns like fetch(), axios.post(), /api/
  3. Automated tools: JSpector, JS Miner parse all JavaScript in scope

Command: grep -hE 'fetch\(|axios\.|/api/' *.js | grep -oP '(https?:)?//[^\s"]+' | sort -u
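A regex-based extractor can also catch relative paths, which a URL-only pattern misses. The sketch below is deliberately simple, assuming endpoints appear as quoted string literals; it does not parse JavaScript and will miss concatenated or templated URLs. The function name and sample source are illustrative.

```python
import re

# Match the contents of single-, double-, or backtick-quoted literals without whitespace
STRING_RE = re.compile(r"""["'`]([^"'`\s]+)["'`]""")


def extract_endpoints(js_source: str) -> list:
    """Return sorted unique literals that look like absolute URLs or rooted paths."""
    found = set()
    for literal in STRING_RE.findall(js_source):
        is_url = literal.startswith(("http://", "https://"))
        is_path = literal.startswith("/") and len(literal) > 1
        if is_url or is_path:
            found.add(literal)
    return sorted(found)


# Hypothetical JavaScript snippet
sample_js = '''
fetch("/api/v1/users");
axios.post('https://internal.example.com/graphql', body);
const img = "logo.png";
'''
print(extract_endpoints(sample_js))
```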

B. Parameter Discovery & Hidden Input Fields

Web applications often have undocumented or hidden parameters that accept user input. Discovering these parameters reveals:

  • Debugging endpoints (debug=true, verbose=1)
  • Undocumented features (internal flags, admin features)
  • Legacy parameter support that maintains backward compatibility
  • Sensitive features protected by weak validation

Parameter Discovery Tools:

  1. Arjun (Python): Brute-forces parameter names from wordlists against target endpoints.
     Command: arjun -u https://example.com -oT found_params.txt

  2. ParamSpider: Harvests parameters from Wayback Machine archives.
     Command: paramspider -d example.com --stream

  3. Param Miner (Burp extension): Uses advanced diffing techniques to guess up to 65,536 parameter names per request

  4. FFUF: Fast parameter fuzzing with custom wordlists (-fw filters out responses with the given word count, so set it to the baseline response's word count).
     Command: ffuf -w wordlist.txt -u "https://example.com/api/user?FUZZ=test" -fw 20
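The diffing idea behind these tools can be illustrated with a simplified sketch: send many candidate names at once, compare the response with a baseline, and bisect any batch that changes the response. This is not the actual algorithm of Arjun or Param Miner; the fetch function is injected so the logic runs without a network, and the marker value, function names, and fake target are hypothetical.

```python
from typing import Callable


def differs(baseline: str, response: str, marker: str) -> bool:
    """Treat a reflected marker or a changed body length as a signal."""
    return marker in response or len(response) != len(baseline)


def discover_params(fetch: Callable[[dict], str], wordlist, chunk_size: int = 50,
                    marker: str = "zqx9") -> list:
    """Find parameter names that alter the response, using batch-and-bisect."""
    baseline = fetch({})
    found = []

    def search(names):
        if not names:
            return
        response = fetch({name: marker for name in names})
        if not differs(baseline, response, marker):
            return  # nothing in this batch affects the response
        if len(names) == 1:
            found.append(names[0])
            return
        mid = len(names) // 2
        search(names[:mid])
        search(names[mid:])

    words = list(wordlist)
    for start in range(0, len(words), chunk_size):
        search(words[start:start + chunk_size])
    return found


def fake_fetch(params: dict) -> str:
    """Stand-in for an HTTP request: 'debug' adds output, 'q' is reflected."""
    body = "<html>ok</html>"
    if "debug" in params:
        body += "<!-- debug on -->"
    if "q" in params:
        body += params["q"]
    return body


print(discover_params(fake_fetch, ["id", "debug", "page", "q", "lang"]))
```

On real targets the body length often varies between identical requests, so a practical comparison needs several baseline samples and a more tolerant similarity check.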

C. Directory & Path Enumeration

Common paths and directories often exist across web applications:

Command: ffuf -w common_paths.txt -u https://example.com/FUZZ -o dirs.txt

High-value paths to fuzz:

  • /admin/, /administrator/, /wp-admin/ (admin panels)
  • /.env, /config/, /settings/ (configuration files)
  • /.git/, /.svn/ (version control exposure)
  • /api/, /api/v1/, /api/v2/ (API endpoints)
  • /backup/, /old/, /test/ (legacy/backup data)
  • /.well-known/ (service discovery)

D. API Enumeration & Hidden Endpoints

APIs are frequently targeted in bug bounties due to weak authentication and authorization. Discovering the full API surface is critical.

API Discovery Workflow:

  1. Identify API patterns in traffic: /api/, /graphql, /rest/
  2. Extract endpoints from JavaScript files
  3. Use Burp Intruder to fuzz path segments
  4. Test different HTTP methods (GET, POST, PUT, DELETE, PATCH)
  5. Discover parameters specific to each endpoint

Command: httpx -l subdomains.txt -path /api/ -status-code (identifies APIs)

GraphQL Detection: Check for /graphql, /graphql/query, /query endpoints and attempt introspection queries
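An introspection probe amounts to a POST with a small query plus a check on the response shape. The sketch below builds the request and evaluates a response body without sending anything; the shortened query and the helper names are assumptions for illustration.

```python
import json
import urllib.request

# A deliberately small introspection query; full schema dumps use a longer one
INTROSPECTION_QUERY = "query { __schema { queryType { name } types { name kind } } }"


def build_introspection_request(url: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST carrying the introspection query."""
    body = json.dumps({"query": INTROSPECTION_QUERY}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def introspection_enabled(response_text: str) -> bool:
    """Return True if the response body contains a populated __schema object."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    return isinstance(data, dict) and data.get("__schema") is not None


request = build_introspection_request("https://example.com/graphql")
print(request.get_method(), request.full_url)
print(introspection_enabled('{"data": {"__schema": {"types": []}}}'))
```

Sending the request is a single urllib.request.urlopen(request, timeout=10) call, which should only be made against in-scope hosts.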


V. Phase 4: Attack Surface Automation & Orchestration

A. Integrated Scanning Frameworks

BBOT is the most advanced reconnaissance automation platform (2026), combining 100+ modules with recursive, event-driven triggering.

Example: Complete Subdomain + Cloud + Ports + Vulnerabilities Scan

bbot -t example.com \
  -p subdomain-enum cloud-enum web-basic \
  -m nmap gowitness nuclei \
  --allow-deadly

BBOT presets (-p) and modules (-m):

  • subdomain-enum: Multiple passive and active subdomain sources
  • cloud-enum: AWS, GCP, Azure bucket and resource discovery
  • web-basic: Technology fingerprinting, robots.txt, certificate analysis
  • nmap: Port scanning and service enumeration
  • gowitness: Web screenshots for visual asset mapping
  • nuclei: Template-based vulnerability scanning

Output can be sent to a Neo4j graph database (through BBOT's neo4j output module) for relationship visualization and filtering.

B. Nuclei Template-Based Scanning

Nuclei is a template-based vulnerability scanner designed for automation. Templates define:

  • Target matching conditions (technology-specific scans)
  • HTTP requests (fuzzing payloads, custom headers)
  • Response analysis (matchers for success/failure)

Nuclei Workflow:

# Update templates daily
nuclei -update-templates

# Scan for specific vulnerability class
nuclei -l live_hosts.txt -tags sqli,xss,rce -o vulnerabilities.txt

# Run custom templates
nuclei -l hosts.txt -t custom_templates/ -config nuclei.conf

Critical Nuclei Templates for Reconnaissance:

  • http/cves/ (CVE-specific checks)
  • http/technologies/ (app fingerprinting)
  • cloud/ (cloud misconfiguration)
  • dns/ (DNS-based checks)
  • ssl/ (certificate issues)

C. HTTPX Integration Pipeline

Chain ProjectDiscovery tools with gowitness, writing intermediate results to files so each stage receives clean URLs:

# Discover subdomains and keep only hosts with live HTTP services
subfinder -d example.com -silent | httpx -silent -o live_urls.txt
# Screenshot web interfaces (gowitness v3 syntax; v2 uses "gowitness file -f")
gowitness scan file -f live_urls.txt
# Scan for vulnerabilities
nuclei -l live_urls.txt -o final_report.txt

D. Parameter & Endpoint Discovery at Scale

# Extract absolute URLs referenced by each live page, deduplicated across all hosts
while read -r url; do
  curl -s "$url" | grep -oP "https?://[^\s\"']+"
done < alive_hosts.txt | sort -u > all_endpoints.txt

# Fuzz parameters on discovered endpoints
ffuf -w params_wordlist.txt -u "[ENDPOINT]?FUZZ=test" -fc 404

VI. Threat Intelligence & External Datasources

A. Shodan Queries for Reconnaissance

Shodan provides immediate visibility into exposed services without active scanning.

High-Value Queries:

org:"Company Name" port:80,443,8080
org:"Company Name" product:nginx
org:"Company Name" "X-Powered-By"
org:"Company Name" http.title:"Admin"
org:"Company Name" os:Linux
org:"Company Name" mongodb

Favicon Hash Search:

http.favicon.hash:[hash_value]  # Find all instances of a specific app

B. Censys & SecurityTrails

  • Censys: Query CT logs, IP space ownership, certificate timelines
  • SecurityTrails: Historical DNS records, WHOIS changes, subdomain history

VII. Organizing & Prioritizing Results

A. Data Consolidation

Aggregate results into structured files:

# All discovered subdomains
cat subfinder.txt amass.txt bbot.txt | sort -u > all_subdomains.txt

# Alive hosts with technologies
httpx -l all_subdomains.txt -json > live_hosts.json

# Extract host:port pairs from naabu's JSON output (nmap has no native JSON output)
jq -r '"\(.host):\(.port)"' naabu_output.json > services.txt

B. Risk Prioritization

Focus on highest-risk assets:

  1. Exposed admin panels (identify by title, path, technology)
  2. Known vulnerable technologies (outdated WordPress, Joomla, etc.)
  3. Cloud storage (S3 buckets, GCP storage, Azure blobs)
  4. API endpoints without apparent authentication
  5. Development/staging subdomains (weaker security controls)
  6. Database services exposed on non-standard ports
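These criteria can be turned into a rough scoring pass over httpx JSON output. The sketch below assumes one JSON object per line with url, title, tech, and status_code fields; the hint lists and weights are arbitrary illustrations rather than a calibrated model.

```python
import json

# Hypothetical keyword lists; tune them to the target
ADMIN_HINTS = ("admin", "login", "dashboard", "jenkins")
STAGING_HINTS = ("dev.", "staging", "test.", "uat.")
RISKY_TECH = ("wordpress", "joomla")


def risk_score(record: dict) -> int:
    """Score a probed host; higher means look at it sooner."""
    url = (record.get("url") or "").lower()
    title = (record.get("title") or "").lower()
    tech = [t.lower() for t in record.get("tech") or []]
    score = 0
    if any(hint in title or hint in url for hint in ADMIN_HINTS):
        score += 3  # exposed admin or login surface
    if any(hint in url for hint in STAGING_HINTS):
        score += 2  # development or staging host
    if any(t.startswith(RISKY_TECH) for t in tech):
        score += 2  # CMS with a long vulnerability history
    if record.get("status_code") == 200:
        score += 1  # directly reachable
    return score


def prioritize(jsonl: str) -> list:
    """Parse JSON lines and return records sorted by descending score."""
    records = [json.loads(line) for line in jsonl.splitlines() if line.strip()]
    return sorted(records, key=risk_score, reverse=True)


# Hypothetical records shaped like httpx -json output
sample_jsonl = "\n".join([
    '{"url": "https://www.example.com", "title": "Welcome", "status_code": 200, "tech": ["Nginx"]}',
    '{"url": "https://staging-api.example.com", "title": "Admin Login", "status_code": 200, "tech": ["WordPress"]}',
    '{"url": "https://shop.example.com", "title": "Shop", "status_code": 403, "tech": []}',
])
for rec in prioritize(sample_jsonl):
    print(risk_score(rec), rec["url"])
```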

VIII. Detection Evasion & Operational Security

A. Rate Limiting & Stealth

Professional scanners implement rate limiting to avoid triggering WAF/IDS:

# httpx with reduced concurrency and a request-rate cap
httpx -l hosts.txt -threads 10 -rate-limit 20 -timeout 10

# FFUF with delay between requests
ffuf -w wordlist.txt -u "https://example.com/FUZZ" -p "0.1-0.2"
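Custom scripts can apply the same pacing. This is a minimal sketch of a generator that inserts a random delay between items, similar in spirit to the ffuf delay range above; the sleep and random functions are injectable so the behaviour can be checked without waiting, and the name paced is hypothetical.

```python
import random
import time


def paced(items, min_delay: float = 0.1, max_delay: float = 0.2,
          sleep=time.sleep, rng=random.uniform):
    """Yield items one by one, pausing a random interval between them."""
    first = True
    for item in items:
        if not first:
            sleep(rng(min_delay, max_delay))
        first = False
        yield item


# Record the delays instead of sleeping, to show what would happen
recorded = []
urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
for url in paced(urls, sleep=recorded.append):
    pass  # a real script would send its request here
print(len(recorded), "pauses between", len(urls), "requests")
```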

B. WAF Detection & Evasion

Identify and bypass Web Application Firewalls:

WAF Detection:

  • Use WAFW00F to identify WAF type
  • Analyze error messages, status codes, and response patterns
  • Check for common WAF headers (X-Sucuri, X-Fortinet, etc.)

Evasion Techniques:

  • Randomize User-Agent headers
  • Distribute traffic across multiple IPs
  • Use slow scanning rates
  • Encode payloads (Base64, URL encoding)
  • Leverage CNAME/redirect chains to bypass IP-based blocking

IX. Complete 2026 Toolchain Summary

| Phase | Primary Tools | Alternative Tools |
|-------|---------------|-------------------|
| Passive Subdomain | Subfinder, Amass, BBOT | Findomain, Sublist3r |
| DNS Resolution | PureDNS, MassDNS | DNSRecon |
| HTTP Probing | HTTPX | Httprobe, Curl |
| Port Scanning | Naabu | Nmap, Masscan |
| Technology Fingerprint | Wappalyzer, Nuclei | Shodan queries |
| Cloud Enumeration | cloud_enum, BBOT | S3Scanner, GCPBucketBrute |
| API Discovery | Burp Suite, JS Link Finder | Manual JavaScript analysis |
| Parameter Discovery | Arjun, ParamSpider, FFUF | Param Miner, x8 |
| Vulnerability Scanning | Nuclei | Burp Suite, OWASP ZAP |
| Automation/Orchestration | BBOT, Hakscale | Custom bash/Python |
| Visualization | Neo4j (BBOT output) | Maltego, Shodan UI |

X. Workflow Example: Complete Reconnaissance

Target: example.com (unknown scope)

Day 1: Passive Intelligence

# Subdomain enumeration (parallel)
subfinder -d example.com -o subfinder.txt &
amass enum -d example.com -passive -o amass.txt &
bbot -t example.com -p subdomain-enum -o bbot/ -n recon -y &
wait  # let all three background jobs finish before consolidating

# CT log mining (%25 is the URL-encoded % wildcard; SAN entries are newline-separated)
curl -s "https://crt.sh/?q=%25.example.com&output=json" | jq -r '.[].name_value' | sed 's/^\*\.//' > ct.txt

# GitHub dorking
gh search code "api" --owner example-org > github_endpoints.txt

# Consolidate (BBOT writes subdomains.txt inside its scan directory)
cat subfinder.txt amass.txt bbot/recon/subdomains.txt ct.txt | sort -u > all_subdomains.txt

Day 2: Validation & Enumeration

# HTTP probing with technology detection
httpx -l all_subdomains.txt -status-code -title -tech-detect -json > live_hosts.json

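# Extract high-value targets
jq -r 'select(.status_code | IN(200, 301, 302, 403)) | .url' live_hosts.json > targets.txt

# Port scanning (naabu expects hosts, so strip scheme, port, and path from the URLs)
sed -E 's#^https?://##; s#[:/].*$##' targets.txt | sort -u > hosts.txt
naabu -l hosts.txt -p 80,443,8080,8443,3306,5432,27017 -json -o ports.json

# Cloud enumeration (-o names an output directory)
bbot -t example.com -p cloud-enum -o cloud_results/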

Day 3: Deep Reconnaissance & Vulnerability Assessment

# JavaScript endpoint extraction & analysis
for url in $(cat targets.txt); do
  curl -s "$url" | grep -oP "(https?://[^\s\"'<>]+)" | grep -E "api|graphql" >> api_endpoints.txt
done

# Parameter discovery
arjun -u "$(head -1 targets.txt)" -oT found_params.txt

# Nuclei vulnerability scanning
nuclei -l targets.txt -t templates/ -tags sqli,xss,ssrf,rce -o vulns.txt

# Custom template scans for identified technologies
nuclei -l targets.txt -t custom_templates/wordpress/ -o wordpress_vulns.txt

Day 4: Reporting & Prioritization

# Consolidate findings
{
  echo "=== Discovered Assets ==="
  wc -l all_subdomains.txt
  echo "=== Live Services ==="
  jq '.status_code' live_hosts.json | sort | uniq -c
  echo "=== High-Risk Technologies ==="
  jq -r '.tech[]?' live_hosts.json | sort | uniq -c | sort -rn
  echo "=== Vulnerabilities Found ==="
  wc -l vulns.txt
} > reconnaissance_summary.txt

XI. Advanced Techniques & 2026 Trends

A. AI-Powered OSINT

Modern bug bounty professionals leverage AI/LLM tools for:

  • Automated script generation for custom reconnaissance
  • Payload crafting and optimization
  • Analyzing error messages for vulnerability clues
  • Summarizing large datasets and identifying patterns

Tools: Claude, GPT-4 (for payload generation and analysis), Nuclei + AI plugins

B. Agentic & Continuous Reconnaissance

2025-2026 tools implement autonomous agents that:

  • Run continuous, automated scanning
  • Trigger dependent scans based on discoveries
  • Learn from previous findings
  • Adapt scanning strategies in real-time

BBOT exemplifies this approach with event-driven modules that automatically cascade discoveries.

C. Multicloud & Serverless Reconnaissance

Focus areas:

  • Lambda function enumeration
  • API Gateway discovery
  • Serverless database exposure
  • Container registry scanning
  • Infrastructure-as-Code repository mining

XII. Ethical & Legal Considerations

  1. Always verify scope against the bug bounty program rules
  2. Avoid resource exhaustion (rate limit scanning)
  3. Do not access or download data from misconfigured storage
  4. Report findings responsibly with clear proof-of-concept
  5. Respect rate limits on third-party OSINT platforms
  6. Use VPN/proxy to avoid exposing your IP during scans

Conclusion

Attack surface mapping in 2026 is a discipline combining automated discovery, intelligent filtering, and manual verification. Success depends on:

  1. Comprehensive tooling (BBOT, Nuclei, HTTPX + specialized tools)
  2. Layered approach (passive → active → deep enumeration)
  3. Integration (automated pipelines combining multiple data sources)
  4. Context awareness (prioritizing findings based on risk)
  5. Continuous learning (staying current with new tools and techniques)

The reconnaissance phase determines the quality of subsequent vulnerability discovery. Invest the bulk of your time (the 60-70% cited above) in thorough, systematic attack surface mapping, and the exploitation phase becomes far more productive.
