The Open Deep Research application currently uses configuration-driven search API selection:
```python
# Current approach in utils.py
async def select_and_execute_search(search_api: str, query_list: list[str], params_to_pass: dict):
    if search_api == "tavily":
        return await tavily_search.ainvoke({'queries': query_list}, **params_to_pass)
    elif search_api == "arxiv":
        search_results = await arxiv_search_async(query_list, **params_to_pass)
    # ... etc
```
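One way to see the shape of this configuration-driven dispatch is as a mapping from API name to handler, which avoids the growing if/elif chain. The handler names below are illustrative stand-ins, not the real `utils.py` search clients:

```python
import asyncio

# Hypothetical stand-ins for the real search clients in utils.py
async def tavily_handler(queries, **params):
    return f"tavily results for {queries}"

async def arxiv_handler(queries, **params):
    return f"arxiv results for {queries}"

# Dispatch table: the caller still just names an API, as in the
# configuration-driven approach, but adding an API is one dict entry.
SEARCH_HANDLERS = {
    "tavily": tavily_handler,
    "arxiv": arxiv_handler,
}

async def select_and_execute(search_api: str, query_list: list[str], params: dict):
    try:
        handler = SEARCH_HANDLERS[search_api]
    except KeyError:
        raise ValueError(f"Unsupported search API: {search_api}")
    return await handler(query_list, **params)

result = asyncio.run(select_and_execute("arxiv", ["quantum error correction"], {}))
```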
To add intelligent search tool reasoning, implement a search router node:
```mermaid
graph LR
    A[Query Analysis] --> B{Search Router}
    B --> |Academic| C[ArXiv/PubMed]
    B --> |Medical| D[PubMed]
    B --> |Recent News| E[Tavily/Perplexity]
    B --> |Technical| F[Exa + Domain Filtering]
    B --> |General| G[Tavily]
```
Implementation Approach:
```python
async def route_search_apis(queries: list[str], topic: str) -> dict[str, list[str]]:
    """Route queries to appropriate search APIs based on content analysis."""
    # Analyze query characteristics
    academic_keywords = ["research", "study", "paper", "publication"]
    medical_keywords = ["disease", "treatment", "medical", "clinical"]
    recent_keywords = ["2024", "2025", "latest", "recent", "current"]

    api_routing: dict[str, list[str]] = {}
    for query in queries:
        query_lower = query.lower()
        # Rule-based routing (could be replaced with LLM classification)
        if any(keyword in query_lower for keyword in academic_keywords):
            api_routing.setdefault("arxiv", []).append(query)
        elif any(keyword in query_lower for keyword in medical_keywords):
            api_routing.setdefault("pubmed", []).append(query)
        elif any(keyword in query_lower for keyword in recent_keywords):
            api_routing.setdefault("tavily", []).append(query)
        else:
            api_routing.setdefault("tavily", []).append(query)  # general fallback
    return api_routing
```
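The rule-based router can be exercised without any async machinery. This standalone sketch reproduces the same first-match-wins keyword logic (academic before medical before recency) so the grouping behavior is visible:

```python
ACADEMIC = ["research", "study", "paper", "publication"]
MEDICAL = ["disease", "treatment", "medical", "clinical"]
RECENT = ["2024", "2025", "latest", "recent", "current"]

def route(queries: list[str]) -> dict[str, list[str]]:
    # First-match-wins: a query with both academic and recency
    # keywords goes to arxiv, not tavily.
    routing: dict[str, list[str]] = {}
    for query in queries:
        q = query.lower()
        if any(k in q for k in ACADEMIC):
            api = "arxiv"
        elif any(k in q for k in MEDICAL):
            api = "pubmed"
        elif any(k in q for k in RECENT):
            api = "tavily"
        else:
            api = "tavily"  # general fallback
        routing.setdefault(api, []).append(query)
    return routing

routed = route([
    "transformer research papers",
    "clinical treatment outcomes",
    "latest GPU benchmarks",
    "history of chess openings",
])
```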
Advanced LLM-Based Routing:
```python
from typing import Dict, List

from pydantic import BaseModel, Field


class SearchAPIRouter(BaseModel):
    api_selections: Dict[str, List[str]] = Field(
        description="Mapping of search API to queries that should use it"
    )
    reasoning: str = Field(description="Explanation of routing decisions")


async def llm_route_searches(queries: list[str], topic: str) -> SearchAPIRouter:
    """Use an LLM to intelligently route searches based on query analysis."""
    router_prompt = f"""
    Analyze these search queries for topic '{topic}' and route them to appropriate search APIs:

    Available APIs:
    - arxiv: Academic papers and research
    - pubmed: Medical and life sciences literature
    - tavily: General web search with recent content
    - perplexity: AI-powered search with analysis
    - exa: Semantic search with domain filtering

    Queries: {queries}

    Route each query to the most appropriate API based on:
    - Content domain (academic, medical, general)
    - Recency requirements
    - Source quality needs
    """
    # Implementation with structured output...
```
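In the real node, this prompt would go through a structured-output call (for example, LangChain's `llm.with_structured_output(SearchAPIRouter)`), which enforces the schema for you. As a dependency-free sketch, the validation step looks like this, with a plain dataclass standing in for the pydantic model:

```python
import json
from dataclasses import dataclass

KNOWN_APIS = {"arxiv", "pubmed", "tavily", "perplexity", "exa"}

@dataclass
class RoutedSearches:
    # Dataclass stand-in for the pydantic SearchAPIRouter schema
    api_selections: dict[str, list[str]]
    reasoning: str

def parse_router_output(raw: str) -> RoutedSearches:
    """Validate a model's JSON answer against the known API names."""
    data = json.loads(raw)
    for api in data["api_selections"]:
        if api not in KNOWN_APIS:
            raise ValueError(f"Model routed to unknown API: {api}")
    return RoutedSearches(data["api_selections"], data["reasoning"])

# Example payload shaped like a well-formed model response
raw = '{"api_selections": {"arxiv": ["diffusion model survey"]}, "reasoning": "academic query"}'
routed = parse_router_output(raw)
```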
The Open Deep Research application uses `Command` instead of conditional edges for sophisticated routing control:
```mermaid
graph TB
    subgraph "Command Approach"
        A[human_feedback] --> B{Feedback Type}
        B --> |Boolean True| C[Parallel Dispatch<br/>Send to Multiple Sections]
        B --> |String Feedback| D[Update State +<br/>Route to Regenerate]
    end
    subgraph "Conditional Edge Limitation"
        E[Node] --> F{Simple Function}
        F --> |static return| G[Single Next Node]
    end
```
1. Dynamic Parallel Dispatch:
```python
# Command enables this complex routing:
return Command(goto=[
    Send("build_section_with_web_research", {
        "topic": topic,
        "section": s,
        "search_iterations": 0
    })
    for s in sections if s.research  # Dynamic list creation
])
```
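`Command` and `Send` come from LangGraph (`langgraph.types` in recent releases); the interesting part is that the destination list is computed at runtime. Here is a dependency-free sketch of just that computation, with a minimal `Section` stand-in and plain `(node, payload)` tuples in place of `Send`:

```python
from dataclasses import dataclass

@dataclass
class Section:
    name: str
    research: bool  # only research-flagged sections get dispatched

def build_dispatch_list(topic: str, sections: list[Section]) -> list[tuple[str, dict]]:
    # Mirrors the Send(...) comprehension: one (node, payload) pair
    # per section that needs web research.
    return [
        ("build_section_with_web_research",
         {"topic": topic, "section": s, "search_iterations": 0})
        for s in sections if s.research
    ]

sections = [Section("Introduction", False), Section("Benchmarks", True), Section("Methods", True)]
dispatch = build_dispatch_list("LLM evaluation", sections)
```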
2. State Updates During Routing:
```python
# Command allows state mutations with routing decisions:
return Command(
    goto="generate_report_plan",
    update={"feedback_on_report_plan": [feedback]}  # State update
)
```
3. Complex Decision Logic:
- Commands can contain arbitrary Python logic
- Conditional edges are limited to simple function returns
- Commands support both single destinations and parallel dispatch
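The `human_feedback` branching described above can be sketched in plain Python: boolean approval fans out to parallel section builders, while string feedback routes back to planning and records the feedback in state. `Command` is mocked as a `(goto, destination, update)` tuple here:

```python
def route_feedback(feedback, research_sections: list[str]):
    """Toy version of the human_feedback node's Command logic."""
    if feedback is True:
        # Parallel dispatch: one destination per research section
        return ("goto",
                [("build_section_with_web_research", {"section": s})
                 for s in research_sections],
                {})
    # String feedback: route back to planning and update state
    return ("goto", "generate_report_plan",
            {"feedback_on_report_plan": [feedback]})

approved = route_feedback(True, ["Benchmarks", "Methods"])
revise = route_feedback("add a cost section", ["Benchmarks"])
```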
| Feature | Conditional Edge | Command |
|---|---|---|
| Routing Logic | Simple function return | Full Python logic |
| State Updates | No | Yes, with `update` parameter |
| Parallel Dispatch | No | Yes, via `Send()` list |
| Dynamic Destinations | No | Yes, computed at runtime |
| Error Handling | Limited | Full exception handling |
The Open Deep Research application has limited conflict resolution:
- No Explicit Conflict Detection: The system doesn't identify contradictory information
- LLM-Dependent Synthesis: Relies on the LLM to handle conflicts during writing
- No Consistency Validation: Quality grading doesn't check for internal contradictions
```mermaid
graph LR
    A[Multiple Sources] --> B[Deduplication by URL]
    B --> C[Format & Present to LLM]
    C --> D[LLM Synthesis]
    D --> E[Quality Grade Pass/Fail]
    E --> |Pass| F[Accept]
    E --> |Fail| G[More Research]
```
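The deduplication-by-URL step in this pipeline is straightforward to sketch: keep the first result seen for each URL and drop later duplicates, preserving order:

```python
def dedupe_by_url(results: list[dict]) -> list[dict]:
    """Keep the first result per URL, preserving input order."""
    seen: set[str] = set()
    unique = []
    for r in results:
        if r["url"] not in seen:
            seen.add(r["url"])
            unique.append(r)
    return unique

hits = [
    {"url": "https://a.example", "title": "A"},
    {"url": "https://b.example", "title": "B"},
    {"url": "https://a.example", "title": "A (dup)"},
]
unique = dedupe_by_url(hits)
```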
Add a Conflict Detection Node:
```mermaid
graph LR
    A[Research Results] --> B[Conflict Detector]
    B --> |No Conflicts| C[Standard Synthesis]
    B --> |Conflicts Found| D[Conflict Resolver]
    D --> E[Reconciled Content]
    E --> F[Enhanced Section Writer]
```
Implementation Approach:
```python
from typing import List

from pydantic import BaseModel, Field


class ConflictAnalysis(BaseModel):
    has_conflicts: bool = Field(description="Whether conflicting information was found")
    conflict_details: List[str] = Field(description="Specific conflicts identified")
    resolution_strategy: str = Field(description="How to handle the conflicts")


async def detect_research_conflicts(source_str: str, topic: str) -> ConflictAnalysis:
    """Analyze research sources for conflicting information."""
    conflict_prompt = f"""
    Analyze the following research sources for topic '{topic}' and identify any conflicting information:

    {source_str}

    Look for:
    - Contradictory facts or statistics
    - Opposing viewpoints on the same aspect
    - Different conclusions from similar studies
    - Inconsistent timelines or dates
    """
    # LLM analysis for conflict detection...
```
```python
from typing import Literal

from pydantic import BaseModel, Field


class ConflictResolution(BaseModel):
    resolution_type: Literal["present_both", "weight_by_credibility", "seek_additional_sources"]
    resolved_content: str = Field(description="Content with conflicts addressed")
    confidence_level: float = Field(description="Confidence in the resolution")


async def resolve_conflicts(conflicts: ConflictAnalysis, source_str: str) -> ConflictResolution:
    """Resolve identified conflicts in research content."""
    # Strategy names must match the Literal values on resolution_type
    if conflicts.resolution_strategy == "present_both":
        # Create balanced presentation of conflicting viewpoints
        pass
    elif conflicts.resolution_strategy == "weight_by_credibility":
        # Prioritize more authoritative sources
        pass
    elif conflicts.resolution_strategy == "seek_additional_sources":
        # Request more targeted research to resolve conflicts
        pass
```
Enhanced Section Writer with Conflict Awareness:
```python
async def write_section_with_conflict_resolution(state: SectionState, config: RunnableConfig):
    # 1. Detect conflicts in source material
    conflicts = await detect_research_conflicts(state["source_str"], state["topic"])

    # 2. Resolve conflicts if found
    if conflicts.has_conflicts:
        resolution = await resolve_conflicts(conflicts, state["source_str"])
        enhanced_source_str = resolution.resolved_content
    else:
        enhanced_source_str = state["source_str"]

    # 3. Write section with conflict-aware content
    # ... existing section writing logic
```
- Search Intelligence: Move from configuration-driven to reasoning-based API selection using query analysis or LLM routing
- Command Architecture: Commands provide necessary flexibility for complex routing, state updates, and parallel dispatch that conditional edges cannot support
- Conflict Resolution: The current system lacks explicit conflict handling; dedicated conflict detection and resolution nodes are a clear enhancement opportunity
These architectural decisions reflect the complexity of orchestrating multi-source research with quality control and human oversight.