name	wpdoc
description	Research and document a WordPress website. Crawls the public site HTML, logs into the WP admin via Playwright, and produces a comprehensive inventory of page types, content counts, plugins, media, frontend libraries, animations, and more. All scraped HTML and documentation are saved to a wpdocs/ subfolder in the current working directory. Use when asked to document a WordPress site, audit WP content, inventory a WP site, research a WordPress site before migration, catalogue what a WordPress site contains, or gather intel on a WP install.

WordPress Site Documenter

Research a WordPress website and save a complete inventory to wpdocs/ in the current working directory. This skill is research-only: it gathers and documents, and it does not migrate or modify anything.

Output structure

Everything goes into wpdocs/ relative to the current working directory:

wpdocs/
├── html/                  # raw crawled HTML from the public site
│   └── <domain>/          # httrack mirror (HTML/CSS/JS only)
├── screenshots/           # admin dashboard screenshots
├── site-overview.md       # high-level summary of the site
├── pages-and-posts.md     # page types, post types, counts
├── plugins.md             # installed plugins with descriptions
├── theme.md               # active theme details
├── media.md               # uploaded media inventory
├── frontend-libraries.md  # CSS/JS frameworks and libraries
├── animations.md          # animations and interactive features
├── forms.md               # forms found on the site
├── headers-raw.txt        # raw response headers saved by curl
└── headers.md             # HTTP response header notes

Prerequisites

Before starting, confirm the working directory contains a .env file with:

WP_LOGIN_PAGE=https://example.com/wp-login.php
WP_USERNAME=...
WP_PASSWORD=...

If .env is missing or incomplete, ask the user for credentials before proceeding. Never echo credentials in shell output.
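A minimal sketch of the prerequisite check, assuming .env uses plain KEY=value lines with no export prefix or quoting. The function name check_env is hypothetical; it reports which keys are missing or empty without ever printing a value.

```shell
# Verify that an env file defines the three keys this skill needs.
# Only key names are printed, never values.
check_env() {
  local file="${1:-.env}" key missing=0
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 1
  fi
  for key in WP_LOGIN_PAGE WP_USERNAME WP_PASSWORD; do
    # Match "KEY=<at least one character>"; -q keeps the value off the terminal.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_env with no argument checks .env in the current directory; a non-zero exit status is the cue to ask the user for credentials.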

Phase 1 — Crawl the Public Site

Create the output directory first:

mkdir -p wpdocs/html wpdocs/screenshots

Use httrack to mirror the site HTML (no images or media):

httrack "<site-url>" \
  -O "wpdocs/html" \
  "+*.<domain>/*" \
  "-*?*" \
  "-*.jpg" "-*.jpeg" "-*.png" "-*.gif" "-*.svg" "-*.webp" \
  "-*.mp4" "-*.mp3" "-*.woff" "-*.woff2" "-*.ttf" "-*.ico" \
  -v

Replace <site-url> with the public URL and <domain> with the bare domain (e.g. example.com).
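If only the public URL is known, the bare domain can be derived from it. This is a sketch under the assumption that "bare domain" means the host without scheme, port, path, or a leading www.; site_domain is an illustrative helper name, not part of httrack.

```shell
# Reduce a site URL to the bare domain used in the "+*.<domain>/*" filter.
site_domain() {
  local url="$1" host
  host="${url#*://}"    # drop the scheme, e.g. https://
  host="${host%%/*}"    # drop the path
  host="${host%%:*}"    # drop a port, if present
  host="${host#www.}"   # drop a leading www.
  printf '%s\n' "$host"
}
```

For example, site_domain "https://www.example.com/blog/" prints example.com.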

After the crawl finishes, analyse the downloaded HTML to extract:

Total page count — unique URLs crawled.
Page type classification — inspect <body> class attributes, URL path patterns, and structural HTML to identify distinct page templates (e.g. home, about, blog listing, single post, portfolio item, landing page, contact, 404).
Collection content — repeating URL patterns with similar HTML (blog posts, portfolio items, team members, products, testimonials).
Frontend libraries — scan <link> and <script> tags across pages for CSS/JS frameworks (Bootstrap, Tailwind, Foundation, jQuery, React, Vue, GSAP, AOS, Slick, Swiper, Lottie, etc.). Note CDN URLs and local paths. Check for bundled/minified files and try to identify what they contain.
Animations & interactive features — look for: CSS animations and transitions (keyframes, transition properties), JS animation libraries (GSAP, AOS, Animate.css, ScrollMagic, Lottie), parallax effects, scroll-triggered animations, hover effects, page transition effects, carousels/sliders, modals/lightboxes, accordions, tabs, sticky elements, lazy loading, infinite scroll, AJAX content loading.
Forms — count <form> elements, note fields and likely purpose (contact, newsletter, search, login).
Embedded content — iframes (maps, videos, calendars, social feeds), third-party widgets, chat widgets.
CSS frameworks — identify any grid/utility frameworks in use.
Meta & SEO — check for structured data (JSON-LD, microdata), Open Graph tags, canonical URLs, sitemap references.
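Several of the checks above can be roughed out with standard tools before reading pages by hand. The sketch below assumes the mirror stores pages as .html files and that each tag of interest sits on a single line; the function names (count_pages, body_classes, asset_urls, count_forms) are illustrative. Regex scans like these miss tags split across lines, so treat the results as candidates for closer inspection rather than final counts.

```shell
# Number of crawled HTML pages under a mirror directory.
count_pages() {
  find "$1" -type f -name '*.html' | wc -l | tr -d ' '
}

# Tally of class tokens on <body> tags, most frequent first.
# Tokens such as home or single-post hint at distinct page templates.
body_classes() {
  find "$1" -type f -name '*.html' \
    -exec grep -ohiE '<body[^>]*class="[^"]*"' {} + \
    | sed -E 's/.*class="([^"]*)"/\1/' \
    | tr -s ' ' '\n' \
    | sed '/^$/d' \
    | sort | uniq -c | sort -rn
}

# Tally of URLs referenced by <script src> and <link href> tags.
# This includes non-library links (canonical, icons), so filter by eye.
asset_urls() {
  find "$1" -type f -name '*.html' \
    -exec grep -ohiE '<(script|link)[^>]*(src|href)="[^"]+"' {} + \
    | sed -E 's/.*(src|href)="([^"]+)"/\2/' \
    | sort | uniq -c | sort -rn
}

# Number of opening <form> tags across the mirror.
count_forms() {
  find "$1" -type f -name '*.html' \
    -exec grep -ohiE '<form[ >]' {} + | wc -l | tr -d ' '
}
```

Each function takes the mirror directory as its only argument, for example body_classes wpdocs/html.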

Phase 2 — Scrape the WP Admin via Playwright

Use the Playwright MCP tools. Read .env to get credentials:

source .env

Then use Playwright MCP to visit each admin page below. For every page, take a screenshot (save to wpdocs/screenshots/) and a snapshot to extract text data.

Login

Navigate to $WP_LOGIN_PAGE
Fill username and password fields, click Log In

Dashboard

Snapshot + screenshot the Dashboard home. Note:
- WordPress version
- At-a-glance widget: post count, page count, comment count
- Any update notices or warnings
- Site health status if shown

Plugins

Navigate to /wp-admin/plugins.php — snapshot the full plugin list. For each plugin record:
- Name
- Version
- Active or inactive
- Whether an update is available
- Brief description of what the plugin does (from the description shown on the page, or infer from the plugin name if not visible)

Theme

Navigate to /wp-admin/themes.php — snapshot. Record:
- Active theme name, version, author
- Whether it's a custom, vendor, or child theme
- Any inactive themes installed

Posts

Navigate to /wp-admin/edit.php — snapshot. Record:
- Total published posts
- Draft count if visible
- Categories and tags in use (navigate to /wp-admin/edit-tags.php?taxonomy=category and /wp-admin/edit-tags.php?taxonomy=post_tag if needed)

Custom Post Types

Inspect the admin sidebar for menu items beyond Posts / Pages / Media / Comments. Navigate to each custom post type list page and record:
- Post type name/slug
- Number of published items
- Sample of field names if visible (ACF, custom fields, etc.)

Media Library

Navigate to /wp-admin/upload.php — snapshot. Record:
- Total number of media items (shown in the media library header)
- If possible, note the breakdown by type (images, documents, audio, video) by checking the filter dropdown
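- Total disk usage for all media, if it can be found (the media library itself does not report it; the uploads directory size on the Site Health Info tab is one place to look)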

Users

Navigate to /wp-admin/users.php — snapshot. Note:
- Total user count
- Roles in use (admin, editor, author, etc.)

Settings

Navigate to /wp-admin/options-general.php — snapshot. Note:
- Site title, tagline
- WordPress address and site address URLs
- Timezone, date/time format

Site Health

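Navigate to /wp-admin/site-health.php and snapshot it, then open the Info tab (/wp-admin/site-health.php?tab=debug), where the server details are listed. Note: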
- PHP version
- Database version (MySQL/MariaDB)
- Web server
- Any critical issues or recommendations

Updates

Navigate to /wp-admin/update-core.php — snapshot. Note:
- Pending core updates
- Pending plugin updates (count)
- Pending theme updates (count)

If login fails or any page returns an error, report what is and isn't accessible and proceed with whatever data you have.
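The admin paths named in the sections above are relative, so they need a base URL. The sketch below builds the full list from WP_LOGIN_PAGE, assuming the login page is the standard wp-login.php sitting at the WordPress root; a site with a custom login URL would need the root supplied by hand. admin_urls is a hypothetical helper name.

```shell
# Print the full URL of each admin page to visit, given the login page URL.
admin_urls() {
  # Strip "/wp-login.php" and anything after it (such as a query string).
  local root="${1%/wp-login.php*}" page
  for page in plugins.php themes.php edit.php upload.php users.php \
              options-general.php site-health.php update-core.php; do
    printf '%s/wp-admin/%s\n' "$root" "$page"
  done
}
```

The output can serve as a checklist: any URL that could not be loaded is recorded as inaccessible in the final report.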

Phase 3 — HTTP Response Headers

Run:

curl -sI "<site-url>" > wpdocs/headers-raw.txt

Note anything interesting: server software, CDN/WAF headers (Cloudflare, Sucuri, etc.), caching headers, PHP version exposure, X-Powered-By, etc.
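To pull the usual suspects out of the saved file, a filter along these lines may help. It is a sketch only: the header list is a starting selection chosen for illustration, not an exhaustive one, and notable_headers is a hypothetical name.

```shell
# Print only the response headers that tend to reveal server, CDN/WAF,
# caching, and security configuration.
notable_headers() {
  # curl -I output uses CRLF line endings; strip the carriage returns first.
  tr -d '\r' < "$1" \
    | grep -iE '^(server|x-powered-by|via|cf-ray|cf-cache-status|x-sucuri-id|x-cache|cache-control|age|expires|strict-transport-security|content-security-policy|x-frame-options|x-content-type-options):'
}
```

Run it as notable_headers wpdocs/headers-raw.txt, then read the full file as well, since custom headers will not match the list.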

Phase 4 — Write Documentation

Using all the data gathered, write the following markdown files into wpdocs/. Each file should be clear, scannable, and factual — just report what you found. Use headings, lists, and counts. Don't editorialize.
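Once the files are written, a completeness check like the following can confirm that every markdown file named in the output structure exists and is non-empty. It is a sketch; missing_docs is a hypothetical helper name.

```shell
# List the expected documentation files that are absent or empty.
# Returns non-zero if any are missing.
missing_docs() {
  local dir="${1:-wpdocs}" name missing=0
  for name in site-overview pages-and-posts plugins theme media \
              frontend-libraries animations forms headers; do
    # -s is true only for files that exist and have a size above zero.
    if [ ! -s "$dir/$name.md" ]; then
      echo "$name.md"
      missing=1
    fi
  done
  return "$missing"
}
```

With no argument it checks wpdocs/ in the current working directory.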

site-overview.md

Site URL
Date of audit
WordPress version
PHP version
Server / hosting clues (from headers)
Active theme (name, version, type)
Total pages, total posts, total custom post type items
Total plugins (active / inactive)
Total media files
Number of forms found
Number of distinct page templates identified
Quick note on overall site complexity (simple / moderate / complex)

pages-and-posts.md

For each content type (pages, posts, each custom post type):

Type name
Total count (published / draft)
Sample titles (up to 10)
URL pattern
Template used (if identifiable)
Notable fields or structured data

Also include a section on page templates — list each distinct template found from the HTML crawl with a description of its layout and which pages use it.

plugins.md

A list of every plugin, formatted as:

## Plugin Name (vX.X.X) — Active/Inactive
Brief description of what the plugin does.
Update available: Yes/No

Group plugins by category where it makes sense:

SEO
Security
Performance / Caching
Forms
Page builders
E-commerce
Media / Gallery
Social
Analytics
Backup
Other / Utility

theme.md

Active theme name, version, author, URL
Parent theme (if child theme)
Customiser settings observed
Notable template files or custom functionality
Any theme-specific plugins or requirements

media.md

Total media file count
Breakdown by type if available (images, documents, video, audio)
Estimated volume (if inferable from media library pagination)
Notable observations (e.g. very large library, unoptimised images)

frontend-libraries.md

List every CSS and JS library/framework detected:

## Library Name (vX.X.X)
- Source: CDN / local / bundled
- URL or file path
- Purpose: what it does on the site

Include: CSS frameworks, JS frameworks, animation libraries, utility libraries, font loading, icon sets (Font Awesome, etc.).

animations.md

Document every animation or interactive behaviour found:

CSS animations/transitions (describe the effect and which elements)
JS-driven animations (library used, trigger, effect)
Scroll-triggered effects
Parallax layers
Hover states (beyond basic colour changes)
Page/route transitions
Loading animations
Carousels/sliders (library, number of instances)
Modals/lightboxes
Accordions, tabs, toggles
Any other notable interactive UI

forms.md

For each form found:

Location (which page)
Likely purpose (contact, newsletter, search, login, registration)
Fields list (name, email, phone, message, file upload, etc.)
Submission handler (Contact Form 7, Gravity Forms, WPForms, custom, mailto, etc.)
Notable features (CAPTCHA, conditional fields, multi-step)

headers.md

Paste the raw response headers and annotate anything notable:

Server software
CDN / WAF / proxy
Caching behaviour
Security headers (CSP, HSTS, X-Frame-Options, etc.)
PHP version exposure
Any custom headers
