
@eevmanu
Created August 4, 2025 02:25
deepresearch cli plan spec prd prompt
You are an expert software architect tasked with creating detailed technical specifications for software development projects.
Your specifications will be used as direct input for planning & code generation AI systems, so they must be precise, structured, and comprehensive.
First, carefully review the project request:
<project_request>
ROLE
You are a system architect and AI engineer. Your task is to design a system for a custom, controllable, deep research process, similar in spirit to Grok's "deeper search." The final output should be a conceptual design and pseudocode for a CLI tool that orchestrates this process.
OBJECTIVE
Design a CLI tool that performs deep, multi-source, multi-layer research on a given query. The system must be highly configurable, allowing the user to adapt the "budget" in every aspect: thinking time, tool call limits, and iteration cycles. It should be optimized for both cost and effectiveness.
CORE CONCEPTS
- Knobs & Budget: The user must be able to control the research process through parameters. This includes:
- `depth`: Number of iterative refinement loops.
- `breadth`: Number of parallel search queries or draft generations.
- `tool_calls`: Maximum number of calls to external tools (e.g., web search).
- `thinking_budget`: If the underlying model supports it (like extended thinking in Claude), use this to control processing time per step. If not, use the model's maximum capacity.
- Model Selection: Assume we can use two types of models:
- "Thinking Model": A powerful, state-of-the-art model for complex tasks like synthesis, critique, and in-depth analysis. (e.g., GPT-4, Claude 3 Opus).
- "Non-Thinking Model": A smaller, faster, and cheaper model for simple, routine tasks like query generation, planning, and ranking. (e.g., Haiku, a fine-tuned local model).
- Tool Integration: The web search tool can be the one provided natively by the model OR a third-party tool accessed via an API (what I call an "mcp server").
PROPOSED WORKFLOW & CHOREOGRAPHY
The system's logic must be inspired by the following multi-step choreography. Design the system to execute these steps:
Step 0: Plan
- The 'Planner/Orchestrator' (a "Non-Thinking Model") receives the user query.
- It performs a cheap internal 'extended thinking' pass to decide if web access is even necessary. If not, it can answer directly.
Step 1: Breadth Pass (Fan-Out Retrieval)
- The Planner generates `n` sub-queries or rewritten variants of the original query in parallel (`n` is controlled by the `breadth` parameter).
- It calls a search API for each sub-query, fetching the top-k URLs per query to ensure wide coverage.
Step 2: Ranking & Pruning
- A "Non-Thinking Model" (acting as a cross-encoder or LLM judge) scores all the retrieved snippets/sources for relevance to the original query.
- It prunes the list, keeping only the best 30-50 candidate sources.
Step 3: Depth Loops (Iterative Gap-Filling)
- This is a loop that runs up to `depth` times or until the budget is exhausted.
- In each iteration:
- The Planner (a "Thinking Model" this time) reads the currently synthesized information and the ranked sources.
- It identifies gaps, missing evidence, or contradictions.
- It issues new, targeted follow-up search queries to fill these gaps.
- It can optionally delegate tasks, like asking a 'specialist' LLM to extract specific facts from a long PDF.
- The loop terminates when the `depth` limit, `tool_calls` limit, or a confidence threshold is met.
Step 4: Synthesis & Self-Critique
- A "Thinking Model" drafts a comprehensive report based on all verified information, including inline citations.
- It then enters a self-critique loop to check for and repair contradictions, hallucinations, or logical flaws.
- Optional Breadth: For maximum quality, the system can generate 2-3 candidate drafts in parallel and use a "Non-Thinking Model" as a judge to select the best one.
Step 5: Final Output
- The system returns the final, polished report along with a full bibliography of the sources used.
CLI TOOL DESIGN
Propose a design for this as a CLI tool. Include pseudocode for the main orchestration loop and define the command-line arguments. For example:
`python deep_research.py "What are the latest advancements in solid-state battery technology?" --depth 3 --breadth 5 --tool-calls 15`
- `--depth`: Controls the number of iterative gap-filling cycles.
- `--breadth`: Controls the number of initial parallel sub-queries.
- `--tool-calls`: Sets the hard limit for external API calls.
- `--model-fast`: Specifies the "non-thinking" model to use.
- `--model-smart`: Specifies the "thinking" model to use.
Your final output should be a complete conceptual breakdown and the requested pseudocode that brings this entire vision to life.
</project_request>
Next, carefully review the project rules:
<project_rules>
- use python when possible
- if you can build on good tools like simonw's llm cli, that's great
</project_rules>
Finally, carefully review the starter template:
<starter_template>
...
</starter_template>
Your task is to generate a comprehensive technical specification based on this information.
Before creating the final specification, analyze the project requirements and plan your approach. Wrap your thought process in <specification_planning> tags, considering the following:
Core system architecture and key workflows
Project structure and organization
Detailed feature specifications
Database schema design
Server actions and integrations
Design system and component architecture
Authentication and authorization implementation
Data flow and state management
Payment implementation
Analytics implementation
Testing strategy
For each of these areas:
Provide a step-by-step breakdown of what needs to be included
List potential challenges or areas needing clarification
Consider potential edge cases and error handling scenarios
In your analysis, be sure to:
Break down complex features into step-by-step flows
Identify areas that require further clarification or have potential risks
Propose solutions or alternatives for any identified challenges
After your analysis, generate the technical specification using the following markdown structure:
# {Project Name} Technical Specification
## 1. System Overview
- Core purpose and value proposition
- Key workflows
- System architecture
## 2. Project Structure
- Detailed breakdown of project structure & organization
## 3. Feature Specification
For each feature:
### 3.1 Feature Name
- User story and requirements
- Detailed implementation steps
- Error handling and edge cases
## 4. Database Schema
### 4.1 Tables
For each table:
- Complete table schema (field names, types, constraints)
- Relationships and indexes
## 5. Server Actions
### 5.1 Database Actions
For each action:
- Detailed description of the action
- Input parameters and return values
- SQL queries or ORM operations
### 5.2 Other Actions
- External API integrations (endpoints, authentication, data formats)
- File handling procedures
- Data processing algorithms
## 6. Design System
### 6.1 Visual Style
- Color palette (with hex codes)
- Typography (font families, sizes, weights)
- Component styling patterns
- Spacing and layout principles
### 6.2 Core Components
- Layout structure (with examples)
- Navigation patterns
- Shared components (with props and usage examples)
- Interactive states (hover, active, disabled)
## 7. Component Architecture
### 7.1 Server Components
- Data fetching strategy
- Suspense boundaries
- Error handling
- Props interface (with TypeScript types)
### 7.2 Client Components
- State management approach
- Event handlers
- UI interactions
- Props interface (with TypeScript types)
## 8. Authentication & Authorization
- Clerk implementation details
- Protected routes configuration
- Session management strategy
## 9. Data Flow
- Server/client data passing mechanisms
- State management architecture
## 10. Stripe Integration
- Payment flow diagram
- Webhook handling process
- Product/Price configuration details
## 11. PostHog Analytics
- Analytics strategy
- Event tracking implementation
- Custom property definitions
## 12. Testing
- Unit tests with Jest (example test cases)
- e2e tests with Playwright (key user flows to test)
Ensure that your specification is extremely detailed, providing specific implementation guidance wherever possible. Include concrete examples for complex features and clearly define interfaces between components.
Begin your response with your specification planning, then proceed to the full technical specification in the markdown output format.
Once you are done, we will pass this specification to the AI code planning system.

<specification_planning>

  1. Core system architecture & key workflows
    a. CLI entry ➔ parse args (depth, breadth, tool_calls, models, etc.).
    b. Instantiate Orchestrator (state machine).
    c. Step 0 – cheap plan: use fast model to decide whether direct answer possible; otherwise, prepare sub-queries.
    d. Step 1 – breadth retrieval fan-out: fire breadth async search jobs, collect results, persist in cache (SQLite + FTS).
    e. Step 2 – rank & prune: fast cross-encoder LLM → top K (≤50) docs retained.
    f. Step 3 – depth loop: for i in range(depth) while calls_left & confidence<τ:
    • smart model reads synthesis_so_far, gaps → targeted queries.
    • run search / specialty tasks.
    • update doc store; decrement budgets.
    g. Step 4 – synthesis & critique with smart model (optionally k candidates).
    h. Step 5 – final output (markdown + bibliography, JSON side-car).

    Challenges / clarifications
    • Budget accounting across parallel async calls.
    • LLM context window: need summarisation & chunking.
    • Source deduplication, URL normalisation.
    • Citation alignment in generated text (token ↔ doc mapping).
    • API key management (openai, anthropic, search provider).
    • Resilience to network failures / HTTP 429 – back-off & retry.
    • Local vs cloud execution; offline mode.

  2. Project structure & organisation
    • src/cli.py (entry)
    • src/core/orchestrator.py
    • src/core/budget.py
    • src/llm/fast.py, src/llm/smart.py (adapters)
    • src/retrieval/search.py (pluggable providers)
    • src/retrieval/ranker.py
    • src/storage/doc_store.py (SQLite FTS)
    • src/synthesis/synthesiser.py
    • src/utils/…
    Tests/, docs/, examples/

  3. Feature specifications
    • Budget knobs with hard ceilings and soft warnings.
    • Multiple search back-ends (SerpAPI, Bing, mcp).
    • Iterative depth loops.
    • Multiple candidate drafts & adjudication.
    • Config file override + CLI override precedence.
    • Caching layer to avoid duplicate tool calls.
    • Streaming progress bar (rich).
    Edge cases: 0 depth, no results, exceeded budget mid-iteration.

  4. DB schema
    SQLite: tables for documents (id, url, title, snippet, content, vector, added_at), searches, budgets, runs. Indexes on url, FTS on content.

  5. Server actions & integrations
    Not a web service but external APIs: search, OpenAI/Anthropic chat completions. Abstract adaptor pattern.

  6. Design system
    CLI colours (rich); docs website (mkdocs) if future UI is added.

  7. Component architecture
    Pure Python modules; orchestrator coordinates.

  8. Auth & authorisation
    API keys via env vars or ~/.deep_research/config.toml; file permissions.

  9. Data flow & state
    Orchestrator context object carries query, budgets, doc_store ref; messages flow through.

  10. Payment / billing
    Stripe out-of-scope for CLI; placeholder to support future SaaS daemon.

  11. Analytics
    Optional: PostHog self-host URL; events: run_started, search_called, draft_generated.

  12. Testing strategy
    Unit: budget accounting, search adaptor mocking, ranker scoring.
    e2e: fixtures with deterministic llm stub.

Potential risks
• Rapid API changes; mitigate with interface layer.
• LLM hallucination; self-critique + citations.
• Cost overruns; strict tool_call ledger.

Open questions
• Preferred search provider? default to SerpAPI.
• PDF extraction needed now or future plug-in? Provide interface stub.
• Concurrency model: asyncio vs multiprocessing? Choose asyncio.

</specification_planning>

DeepResearchCLI Technical Specification

1. System Overview

• Purpose: A Python CLI that orchestrates configurable, cost-aware, multi-stage deep research on any user query.
• Value proposition: Combines fast/cheap LLMs for orchestration with smart/expensive LLMs for reasoning, enforcing strict budgets while yielding high-quality, citation-rich reports.
• High-level workflow: Plan → Breadth retrieval → Rank & prune → Depth loops → Synthesis & critique → Final output.
• System architecture:

  • CLI (Click) → Orchestrator (state machine)
  • LLM Adapters (fast/smart)
  • Search Adapters (SerpAPI, Bing, mcp)
  • Ranker (fast LLM)
  • Document Store (SQLite w/ FTS5 + optional FAISS vector index)
  • Synthesiser (smart LLM)
  • Budget Ledger

2. Project Structure

deepresearch/
 ├─ src/
 │   ├─ cli.py
 │   ├─ core/
 │   │   ├─ orchestrator.py
 │   │   ├─ budget.py
 │   │   └─ config.py
 │   ├─ llm/
 │   │   ├─ base.py
 │   │   ├─ fast.py
 │   │   └─ smart.py
 │   ├─ retrieval/
 │   │   ├─ search.py
 │   │   ├─ ranker.py
 │   │   └─ pdf_extractor.py
 │   ├─ storage/
 │   │   ├─ doc_store.py
 │   │   └─ schema.sql
 │   ├─ synthesis/
 │   │   └─ synthesiser.py
 │   └─ utils/
 │       ├─ logging.py
 │       └─ text.py
 ├─ tests/
 ├─ examples/
 └─ pyproject.toml

3. Feature Specification

3.1 CLI Invocation & Config

• User story: As a researcher, I run deepresearch "prompt" --depth 3 --breadth 5 ... to obtain a report.
• Implementation steps:

  1. Parse CLI flags with Click.
  2. Merge with ~/.deep_research/config.toml defaults.
  3. Validate budgets; error if any negative or conflicting.
  4. Instantiate Orchestrator with a RunContext.

Edge cases: missing API key → instruct user; invalid numeric flag → exit code 2.
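A minimal sketch of the Click entry point, assuming a tomllib-based loader for ~/.deep_research/config.toml; helper names and built-in defaults are illustrative, not part of this spec. Click's IntRange validation already exits with code 2 on an invalid numeric flag.

import sys
import tomllib
from pathlib import Path

import click

CONFIG_PATH = Path.home() / ".deep_research" / "config.toml"

def load_defaults() -> dict:
    # A missing config file is not an error; CLI flags stand on their own.
    return tomllib.loads(CONFIG_PATH.read_text()) if CONFIG_PATH.exists() else {}

@click.command()
@click.argument("query")
@click.option("--depth", type=click.IntRange(min=0), default=None)
@click.option("--breadth", type=click.IntRange(min=1), default=None)
@click.option("--tool-calls", type=click.IntRange(min=0), default=None)
@click.option("--model-fast", default=None)
@click.option("--model-smart", default=None)
def main(query, depth, breadth, tool_calls, model_fast, model_smart):
    """Run a deep-research session for QUERY."""
    # Precedence: CLI flag > config-file value > built-in default.
    cfg = {"depth": 2, "breadth": 4, "tool_calls": 10, "model_fast": None, "model_smart": None}
    cfg.update(load_defaults())
    flags = {"depth": depth, "breadth": breadth, "tool_calls": tool_calls,
             "model_fast": model_fast, "model_smart": model_smart}
    cfg.update({k: v for k, v in flags.items() if v is not None})
    if not cfg["model_fast"] or not cfg["model_smart"]:
        click.echo("error: --model-fast / --model-smart not configured", err=True)
        sys.exit(2)
    click.echo(f"researching {query!r} with {cfg}")  # Orchestrator wiring would follow here

if __name__ == "__main__":
    main()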

3.2 Budget Ledger

• Tracks counts for tool_calls, elapsed tokens (thinking_budget), wall-time.
• Each external call passes through ledger.reserve(cost); raises BudgetExceeded.

Edge cases: concurrent async calls double-booking → use asyncio.Lock.
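A sketch of the ledger interface implied above (reserve raises BudgetExceeded; an asyncio.Lock prevents double-booking); the counter names and the remaining() helper are illustrative.

import asyncio

class BudgetExceeded(RuntimeError):
    """Raised when a reservation would push a counter past its ceiling."""

class BudgetLedger:
    def __init__(self, tool_calls: int, thinking_tokens: int):
        self._limits = {"tool_calls": tool_calls, "thinking_tokens": thinking_tokens}
        self._used = {k: 0 for k in self._limits}
        self._lock = asyncio.Lock()  # serialises reservations from concurrent tasks

    async def reserve(self, kind: str, cost: int = 1) -> None:
        async with self._lock:
            if self._used[kind] + cost > self._limits[kind]:
                raise BudgetExceeded(f"{kind} budget exhausted")
            self._used[kind] += cost

    def remaining(self, kind: str) -> int:
        return self._limits[kind] - self._used[kind]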

3.3 Planner (Step 0)

• Fast model prompt template decides:
“Given QUERY, can you answer it from prior knowledge without the web? Respond YES/NO, plus a short answer if YES.”
• If YES, skip to Step 4 with answer as synthesis.

Error handling: ambiguous answer → treat as NO.
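A sketch of the Step 0 prompt and its defensive parse (anything ambiguous is treated as NO); the exact wording and helper names are illustrative.

PLAN_PROMPT = (
    "Given the query below, can you answer it reliably from prior knowledge, "
    "without web access? Reply 'YES: <short answer>' or 'NO'.\n\nQuery: {query}"
)

def parse_plan(raw: str) -> tuple[bool, str | None]:
    text = raw.strip()
    if text.upper().startswith("YES:"):
        return True, text[4:].strip()
    return False, None  # ambiguous or malformed output falls through to web research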

3.4 Breadth Retrieval (Step 1)

• Fast model generates breadth query variants (temperature=0.7).
• Launch async search tasks: search_api.search(q, top_k); each returns {url,title,snippet}.
• Persist into doc_store.

Challenges: provider rate limits → back-off exponential (max 3 retries).
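A sketch of the fan-out with exponential back-off (max 3 retries), assuming an async search_api.search(q, top_k) adapter as described above; RateLimitError stands in for whatever 429 error the chosen provider surfaces.

import asyncio
import random

class RateLimitError(Exception):
    """Assumed to be raised by the search adapter on HTTP 429."""

async def search_with_backoff(search_api, query: str, top_k: int = 5, retries: int = 3):
    for attempt in range(retries + 1):
        try:
            return await search_api.search(query, top_k=top_k)
        except RateLimitError:
            if attempt == retries:
                return []                                # give up; run continues with fewer sources
            await asyncio.sleep(2 ** attempt + random.random())  # 1s, 2s, 4s plus jitter

async def breadth_retrieval(search_api, variants: list[str], top_k: int = 5) -> list[dict]:
    tasks = [search_with_backoff(search_api, v, top_k) for v in variants]
    batches = await asyncio.gather(*tasks)
    return [doc for batch in batches for doc in batch]   # flat list of {url, title, snippet}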

3.5 Rank & Prune (Step 2)

• For each snippet, call fast cross-encoder LLM: “Score 0-10 relevance to original query.”
• Keep top 50 by score.
• Optionally compute embedding & vector similarity for tie-breakers.

Edge: if <5 docs retrieved → skip prune.
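A sketch of the scoring-and-prune pass, assuming the fast-model adapter returns plain text; the prompt wording and the zero-score fallback for unparseable replies are illustrative.

SCORE_PROMPT = (
    "Score 0-10 how relevant this source is to the query.\n"
    "Query: {query}\nTitle: {title}\nSnippet: {snippet}\n"
    "Reply with a single number."
)

def parse_score(raw: str) -> float:
    try:
        return max(0.0, min(10.0, float(raw.strip().split()[0])))
    except (ValueError, IndexError):
        return 0.0  # unparseable responses rank last

async def rank_and_prune(fast_llm, docs: list[dict], query: str, keep: int = 50) -> list[dict]:
    if len(docs) < 5:
        return docs  # too few sources retrieved: skip pruning entirely
    scored = []
    for doc in docs:
        prompt = SCORE_PROMPT.format(query=query, title=doc.get("title", ""),
                                     snippet=doc.get("snippet", ""))
        scored.append((parse_score(await fast_llm.chat(prompt)), doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:keep]]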

3.6 Depth Loop (Step 3)

Pseudocode outline:

for i in range(depth):
    gaps = smart_llm.identify_gaps(context_summary, citations)
    if not gaps: break
    queries = smart_llm.formulate_queries(gaps, max_q=3)
    for q in queries:
        if ledger.remaining('tool_calls') < 1: break
        results = search_api.search(q, top_k=5)
        doc_store.upsert(results)
        ledger.decrement('tool_calls')
    update_context()
    if confidence > τ: break

Edge cases: no new docs added → break loop early.

3.7 Synthesis & Critique (Step 4)

• Option --drafts N (default 1).
• For each draft: smart model prompted with curated sources → produce markdown with [n] citations.
• Self-critique: smart_llm.critique(draft) → list issues → smart_llm.revise(draft, issues).
• If N>1: fast judge scores factuality/coherence and selects best.

Error: missing citation marker → regex pass to fix.

3.8 Final Output (Step 5)

• Save report.md and report.json ({body, citations:[{id,url}]}) under ./runs/{timestamp}/
• Print summary stats: tokens used, tool calls, elapsed cost.
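A sketch of the save step, assuming the citations are already a list of {id, url} dicts; the timestamped directory layout follows the convention above.

import json
import time
from pathlib import Path

def save_report(body_md: str, citations: list[dict], root: Path = Path("runs")) -> Path:
    run_dir = root / time.strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "report.md").write_text(body_md, encoding="utf-8")
    (run_dir / "report.json").write_text(
        json.dumps({"body": body_md, "citations": citations}, indent=2),
        encoding="utf-8",
    )
    return run_dir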

4. Database Schema

4.1 Tables

documents

  • id INTEGER PK
  • url TEXT UNIQUE NOT NULL
  • title TEXT
  • snippet TEXT
  • content TEXT
  • vector BLOB NULL
  • added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP

searches

  • id INTEGER PK
  • query TEXT
  • provider TEXT
  • run_id TEXT
  • executed_at TIMESTAMP

runs

  • id TEXT PK (uuid)
  • root_query TEXT
  • config JSON
  • started_at TIMESTAMP
  • ended_at TIMESTAMP
  • total_tokens INTEGER
  • total_tool_calls INTEGER

Indexes:
• UNIQUE(url) on documents.
• FTS5 virtual table documents_fts(content, title, snippet) linked to documents.
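A possible schema.sql, embedded here as a Python bootstrap for readability; the external-content FTS5 table is one common way to wire the index (keeping it in sync via triggers or explicit inserts is assumed but not shown).

import sqlite3

SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS runs (
  id TEXT PRIMARY KEY,
  root_query TEXT,
  config TEXT,                       -- JSON serialised as text
  started_at TIMESTAMP,
  ended_at TIMESTAMP,
  total_tokens INTEGER,
  total_tool_calls INTEGER
);

CREATE TABLE IF NOT EXISTS documents (
  id INTEGER PRIMARY KEY,
  url TEXT UNIQUE NOT NULL,
  title TEXT,
  snippet TEXT,
  content TEXT,
  vector BLOB,
  added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS searches (
  id INTEGER PRIMARY KEY,
  query TEXT,
  provider TEXT,
  run_id TEXT REFERENCES runs(id),
  executed_at TIMESTAMP
);

CREATE VIRTUAL TABLE IF NOT EXISTS documents_fts
  USING fts5(content, title, snippet, content='documents', content_rowid='id');
"""

def init_db(path: str = "deepresearch.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA_SQL)
    return conn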

5. Server Actions

5.1 Database Actions

doc_store.upsert(docs) – bulk insert; on conflict update snippet/content.
doc_store.search_fts(query, k) – SQL: SELECT * FROM documents JOIN documents_fts ... ORDER BY rank LIMIT ?.
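Possible shapes for these two actions, assuming the sqlite3 connection from the schema sketch above and that each doc dict carries url/title/snippet/content keys; the SQL is illustrative, and FTS synchronisation is assumed to happen via triggers (not shown).

def upsert(conn, docs: list[dict]) -> None:
    conn.executemany(
        """
        INSERT INTO documents (url, title, snippet, content)
        VALUES (:url, :title, :snippet, :content)
        ON CONFLICT(url) DO UPDATE SET
          snippet = excluded.snippet,
          content = excluded.content
        """,
        docs,
    )
    conn.commit()

def search_fts(conn, query: str, k: int = 20) -> list[tuple]:
    return conn.execute(
        """
        SELECT d.* FROM documents_fts
        JOIN documents d ON d.id = documents_fts.rowid
        WHERE documents_fts MATCH ?
        ORDER BY rank LIMIT ?
        """,
        (query, k),
    ).fetchall()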

5.2 Other Actions

• Search API: GET /search (SerpAPI) with params q, num, api_key.
• LLM API: POST /v1/chat/completions (OpenAI) or /v1/complete (Anthropic).
• PDF extraction: pdftotext CLI wrapper or Unstructured lib.

6. Design System (CLI)

6.1 Visual Style

• Colors: primary cyan (#00BCD4), secondary grey (#B0BEC5), error red (#FF5252).
• Typography: monospace font inherits terminal; headings bold.
• Layout: Rich Panels, aligned tables, progress bars.

6.2 Core Components

• ProgressBar(id, total)
• Panel(title, body, style)
• Table(headers, rows, highlight)
Interactive states: spinner (searching), checkmark (completed), warning (budget almost exceeded).

7. Component Architecture

7.1 Server Components

(Not applicable; CLI only)

7.2 Client Components (Python)

• State stored in RunContext dataclass.
• Event callbacks (on_search_complete, on_draft_ready).
• Concurrency: asyncio tasks with asyncio.gather.

8. Authentication & Authorization

• API keys read in order: CLI flag → ENV (OPENAI_API_KEY, SERPAPI_KEY) → config file.
• Permission check: config file chmod 600.
• If missing → prompt once (hidden input) and offer to save.
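A sketch of the resolution order (flag → env → config file); the env-var names match the list above, everything else is illustrative.

import os
import tomllib
from pathlib import Path

CONFIG_PATH = Path.home() / ".deep_research" / "config.toml"

def resolve_key(flag_value: str | None, env_var: str, config_field: str) -> str | None:
    if flag_value:                            # 1. explicit CLI flag wins
        return flag_value
    if os.environ.get(env_var):               # 2. then the environment
        return os.environ[env_var]
    if CONFIG_PATH.exists():                  # 3. finally the (chmod 600) config file
        cfg = tomllib.loads(CONFIG_PATH.read_text())
        return cfg.get("keys", {}).get(config_field)
    return None

# e.g. resolve_key(cli_openai_key, "OPENAI_API_KEY", "openai")
#      resolve_key(cli_serpapi_key, "SERPAPI_KEY", "serpapi")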

9. Data Flow

CLI → Orchestrator
         ↘ search_api → internet
         ↘ fast_llm  → OpenAI/Anthropic
         ↘ smart_llm → OpenAI/Anthropic
         ↘ doc_store → SQLite

RunContext mutable object passed down, gathers tokens and docs.
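A possible shape for RunContext; field names are illustrative and simply track the knobs and artifacts referenced throughout this spec.

from dataclasses import dataclass, field
from typing import Any

@dataclass
class RunContext:
    query: str
    depth: int = 2
    breadth: int = 4
    tool_calls: int = 10
    thinking_budget: int = 0          # 0 = use the model's default / maximum
    model_fast: str = ""
    model_smart: str = ""
    search_backend: str = "serpapi"
    drafts_n: int = 1
    doc_store: Any = None             # storage handle, injected by the CLI
    drafts: list[str] = field(default_factory=list)
    final: str = ""
    tokens_used: int = 0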

10. Stripe Integration

(Currently N/A, but planned)
• Future daemon could expose /pay endpoints; Stripe checkout session; webhook payment_succeeded marks runs.paid = true.

11. PostHog Analytics

• Optional --analytics flag.
• Events: run_started, search_performed, llm_call, run_completed.
• Identify by anonymised hash of machine id.

12. Testing

12.1 Unit (pytest + pytest-asyncio)

test_budget.py – ensure exception when limit crossed.
test_ranker.py – deterministic stub returns ordered scores.
test_orchestrator_skip_web.py – fast model stub answers YES.
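A sketch of test_budget.py against the ledger interface sketched in 3.2; the import path assumes the package layout in section 2.

import pytest

from deepresearch.core.budget import BudgetExceeded, BudgetLedger  # assumed module path

@pytest.mark.asyncio
async def test_reserve_raises_when_limit_crossed():
    ledger = BudgetLedger(tool_calls=2, thinking_tokens=1000)
    await ledger.reserve("tool_calls")
    await ledger.reserve("tool_calls")
    with pytest.raises(BudgetExceeded):
        await ledger.reserve("tool_calls")
    assert ledger.remaining("tool_calls") == 0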

12.2 e2e (Playwright ➔ node script calling CLI)

• Scenario: “Who is Ada Lovelace?” depth=0 – expect no web calls.
• Scenario: Solid state batteries query depth=1 – expect output file with ≥5 citations.


Pseudocode – Main Orchestration Loop (orchestrator.py)

class Orchestrator:
    def __init__(self, ctx: RunContext):
        self.ctx = ctx
        self.ledger = BudgetLedger(ctx.tool_calls, ctx.thinking_budget)
        self.fast = FastLLM(ctx.model_fast)
        self.smart = SmartLLM(ctx.model_smart)
        self.search = SearchProvider(ctx.search_backend, self.ledger)

    async def run(self):
        if await self._cheap_plan():
            return await self._finalise()   # direct answer path; no web research needed

        await self._breadth_retrieval()
        await self._rank_and_prune()
        await self._depth_loop()
        await self._synthesise()
        await self._finalise()

    async def _cheap_plan(self):
        resp = await self.fast.chat(prompt_plan(self.ctx.query))
        self.ledger.count(resp)
        if resp.answerable:
            self.ctx.final = resp.answer    # direct answer feeds straight into _finalise()
            return True
        return False

    async def _breadth_retrieval(self):
        variants = await self.fast.chat(prompt_variants(self.ctx.query, self.ctx.breadth))
        tasks = [self.search.search(v) for v in variants]
        for docs in await asyncio.gather(*tasks):
            self.ctx.doc_store.upsert(docs)

    async def _rank_and_prune(self):
        scored = []
        for doc in self.ctx.doc_store.all():
            score = await self.fast.chat(prompt_score(doc, self.ctx.query))
            scored.append((score, doc))
        top = sorted(scored, key=lambda t: t[0], reverse=True)[:50]
        self.ctx.doc_store.keep([d for _, d in top])

    async def _depth_loop(self):
        for i in range(self.ctx.depth):
            gaps = await self.smart.chat(prompt_identify_gaps(self.ctx))
            if not gaps: break
            queries = await self.smart.chat(prompt_queries(gaps))
            for q in queries:
                if self.ledger.remaining('tool_calls') < 1: return
                docs = await self.search.search(q)
                self.ctx.doc_store.upsert(docs)
            if self._confidence_high(): break

    async def _synthesise(self):
        drafts = []
        for _ in range(self.ctx.drafts_n):
            draft = await self.smart.chat(prompt_synth(self.ctx))
            critique = await self.smart.chat(prompt_critique(draft))
            draft = await self.smart.chat(prompt_revise(draft, critique))
            drafts.append(draft)
        if len(drafts) == 1:
            self.ctx.final = drafts[0]
        else:
            scores = [await self.fast.chat(prompt_judge(d, self.ctx.query)) for d in drafts]
            self.ctx.final = drafts[scores.index(max(scores))]

    async def _finalise(self):
        save_report(self.ctx.final, self.ctx.doc_store.citations())
        print_summary(self.ledger, self.ctx)

The specification meets all outlined requirements and is ready for hand-off to the code-generation phase.

decoding = how i turn logits into a final string
- how many forward passes (total compute)?
- do the passes talk to each other or only to the prompt?
- who or what decides which text the user finally sees?
deep serial refinement - each step feeds on the single previous draft
- iterative refinement
- any loop that feeds the same answer back to the lm for improvement.
- condition types / budget limits:
- time (ex: 15 min)
- tool calls (ex: 3)
- confidence threshold (ask for it via prompt and parse it from the llm response)
- self-critique / reflection (reflexion, self-refine),
- self-refinement / self-reflective decoding (madaan et al., 2023)
- answer the question.
- critique your answer; list flaws.
- improve it based on your critique.
- iterative refinement where the lm explicitly critiques itself in natural language (see the sketch after this list)
- chain-of-thought with reflection
- refinement prompting
- multi-pass or two-pass decoding (deliberation networks)
- deliberate decoding (shazeer, 2019)
- draft pass (cheap, low-temperature, maybe truncated).
- refinement pass conditioned on the entire draft.
- iterative refinement with exactly one draft → polish hop
- iterative editing (model alternates writer ↔ editor roles)
- depth: 10 sequential passes over a growing context (draft + critiques)
- quadratic-ish context cost and no parallelism.
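a minimal python sketch of the answer → critique → improve loop above, assuming a generic llm(prompt) -> str callable; prompts and the crude stop check are illustrative.

def self_refine(llm, question: str, passes: int = 3) -> str:
    # pass 1: answer the question
    draft = llm(f"Answer the question.\n\nQ: {question}")
    for _ in range(passes):
        # critique your answer; list flaws
        critique = llm(f"Critique this answer; list concrete flaws.\n\nQ: {question}\nA: {draft}")
        if "no flaws" in critique.lower():
            break  # crude confidence threshold, parsed from the llm response
        # improve it based on your critique
        draft = llm(f"Improve the answer using the critique.\n\nQ: {question}\n"
                    f"A: {draft}\nCritique: {critique}")
    return draft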
wide parallel generation + final judge
- diverse beam search + rerank in older mt literature
- generate-then-rank (a.k.a. n-best list + reranker)
- self-consistency decoding (diverse chain-of-thought sampling + majority vote)
- self-consistency with llm-as-judge
- self-consistency sampling (wang et al., 2022)
- sample k independent chain-of-thoughts.
- strip the final answers.
- let the plurality vote decide, or pick via lm-judge. key point: each branch never sees the other branches.
- run k independent chain-of-thoughts, aggregate answers by vote or judge (see the sketch after this list)
- aka parallel fan-out
- ensemble-of-reasoners with an llm-as-a-judge
- many models or many stochastic copies of one model
- marketing phrase for self-consistency with different temperature seeds or different model checkpoints.
- multi-agent debate
- differs from plain self-consistency because the agents can reference and attack the other branches, not just vote blindly.
- k agents attack each other’s answers; a judge (often another lm) decides
- ensemble multi-agent methods
- judge-synthesize patterns
- same as debate, but instead of picking a winner the judge writes a new composite answer
- breadth: 10 independent 1× calls + 1 extra aggregation pass
- linear context cost and full parallelism.
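a minimal python sketch of self-consistency (sample k independent chains, strip the final answers, plurality vote), assuming a sampling llm(prompt, temperature) callable; the answer-extraction regex is illustrative, and an llm-as-judge could replace the vote.

import re
from collections import Counter

def self_consistency(llm, question: str, k: int = 10, temperature: float = 0.8) -> str:
    answers = []
    for _ in range(k):
        # each branch reasons independently and never sees the other branches
        chain = llm(f"Think step by step, then end with 'Answer: <answer>'.\n\nQ: {question}",
                    temperature=temperature)
        match = re.search(r"Answer:\s*(.+)", chain)
        if match:
            answers.append(match.group(1).strip())
    if not answers:
        return ""
    return Counter(answers).most_common(1)[0][0]  # plurality vote decides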
both are techniques that could be used in reasoning / thinking models
deep research
step 0 user query ➜ 'planner / orchestrator' lm
- decides whether it needs the web at all
(cheap internal 'extended thinking' pass).
step 1 breadth pass: 'fan-out retrieval'
- generate n sub-queries or rewrite variants in parallel
(gemini literally calls this _query fan-out_).
- hit a search api or internal index;
fetch top-k urls per sub-query (wide coverage).
step 2 ranking / pruning
- cross-encoder or llm judge scores snippets
- keep the best 30-50 candidates.
step 3 depth loops: 'iterative gap-filling'
- planner reads what it has, spots missing evidence,
issues follow-up searches, or
asks a 'specialist' lm to extract facts from long pdfs.
- this repeats until a budget limit
(time, tool calls, or confidence threshold) is hit.
step 4 synthesis & self-critique
- draft answer with inline citations
- self-critique loop (serial depth)
to repair contradictions / hallucinations.
- optional parallelism: generate 2-3 candidate drafts and
let an internal judge choose (breadth at the document level).
step 5 return report + bibliography.
choreography is:
- plan
- fan-out search
- rank
- iterative gap-fill
- self-critique
- answer
breadth (parallel fan-out) and
depth (serial gap-filling + self-critique)
iterative refinement w/o tools
-> re-explore just latent knowledge
iterative refinement w/ tools
-> evidence (new tokens)
https://grokaimodel.com/how-to-use/#%f0%9f%94%8d_deepsearch_vs_deepersearch
deepersearch
multi-source, multi-layer research. longer to run but yields more analytical depth. best for technical, legal, or market analysis.
https://support.anthropic.com/en/articles/11095361
extended thinking + research