A complete setup guide for integrating Google's Gemini CLI with Claude Code through an MCP (Model Context Protocol) server. This provides automatic second opinion consultation when Claude expresses uncertainty or encounters complex technical decisions.
```bash
# Switch to Node.js 22.16.0
nvm use 22.16.0

# Install Gemini CLI globally
npm install -g @google/gemini-cli

# Test installation
gemini --help

# Authenticate with Google account (free tier: 60 req/min, 1,000/day)
# Authentication happens automatically on first use

# Direct consultation (no container setup needed)
echo "Your question here" | gemini

# Example: Technical questions
echo "Best practices for microservice authentication?" | gemini -m gemini-2.5-pro
```
- Host-Based Setup: Both MCP server and Gemini CLI run on host machine
- Why Host-Only: Gemini CLI requires interactive authentication, and running everything on the host avoids Docker-in-Docker complexity
- Auto-consultation: Detects uncertainty patterns in Claude responses
- Manual consultation: On-demand second opinions via MCP tools
- Response synthesis: Combines both AI perspectives
- Singleton Pattern: Ensures consistent state management across all tool calls
```
├── mcp-server.py                 # Enhanced MCP server with Gemini tools
├── gemini_integration.py         # Core integration module with singleton pattern
├── gemini-config.json            # Gemini configuration
└── setup-gemini-integration.sh   # Setup script
```
All files should be placed in the same directory for easy deployment.
```bash
# Start MCP server directly on host
cd your-project
python3 mcp-server.py --project-root .

# Or with environment variables
GEMINI_ENABLED=true \
GEMINI_AUTO_CONSULT=true \
GEMINI_CLI_COMMAND=gemini \
GEMINI_TIMEOUT=200 \
GEMINI_RATE_LIMIT=2 \
python3 mcp-server.py --project-root .
```
Create `mcp-config.json`:
```json
{
  "mcpServers": {
    "project": {
      "command": "python3",
      "args": ["mcp-server.py", "--project-root", "."],
      "cwd": "/path/to/your/project",
      "env": {
        "GEMINI_ENABLED": "true",
        "GEMINI_AUTO_CONSULT": "true",
        "GEMINI_CLI_COMMAND": "gemini"
      }
    }
  }
}
```
Automatically detects patterns like:
- "I'm not sure", "I think", "possibly", "probably"
- "Multiple approaches", "trade-offs", "alternatives"
- Critical operations: "security", "production", "database migration"
- `consult_gemini` - Manual consultation with context
- `gemini_status` - Check integration status and statistics
- `toggle_gemini_auto_consult` - Enable/disable auto-consultation
- Identifies agreement/disagreement between Claude and Gemini
- Provides confidence levels (high/medium/low)
- Generates combined recommendations
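As a rough illustration of that synthesis step (not the shipped implementation), agreement and confidence could be estimated from the overlap between the two responses; every name below is hypothetical:

```python
def synthesize(claude_response: str, gemini_response: str) -> dict:
    """Hypothetical sketch: compare two responses and produce a combined view."""
    claude_terms = set(claude_response.lower().split())
    gemini_terms = set(gemini_response.lower().split())
    # Crude lexical overlap as a stand-in for real agreement detection
    overlap = len(claude_terms & gemini_terms) / max(len(claude_terms | gemini_terms), 1)
    confidence = "high" if overlap > 0.5 else "medium" if overlap > 0.25 else "low"
    return {
        "agreement": overlap > 0.5,
        "confidence": confidence,
        "recommendation": "Responses align; proceed." if overlap > 0.5
                          else "Responses diverge; review both before deciding.",
    }
```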
```bash
GEMINI_ENABLED=true             # Enable integration
GEMINI_AUTO_CONSULT=true        # Auto-consult on uncertainty
GEMINI_CLI_COMMAND=gemini       # CLI command to use
GEMINI_TIMEOUT=200              # Query timeout in seconds
GEMINI_RATE_LIMIT=5             # Delay between calls (seconds)
GEMINI_MAX_CONTEXT=             # Max context length
GEMINI_MODEL=gemini-2.5-flash   # Model to use
GEMINI_API_KEY=                 # Optional (blank for free tier, keys disable free mode!)
```
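For reference, a minimal sketch of how a server could turn these variables into the integration config, using the 60s timeout and 2s rate-limit defaults from the troubleshooting table below; the helper name itself is illustrative:

```python
import os

def load_gemini_config() -> dict:
    """Illustrative helper: build the integration config from environment variables."""
    return {
        "enabled": os.environ.get("GEMINI_ENABLED", "true").lower() == "true",
        "auto_consult": os.environ.get("GEMINI_AUTO_CONSULT", "true").lower() == "true",
        "cli_command": os.environ.get("GEMINI_CLI_COMMAND", "gemini"),
        "timeout": int(os.environ.get("GEMINI_TIMEOUT", "60")),
        "rate_limit_delay": float(os.environ.get("GEMINI_RATE_LIMIT", "2")),
        "model": os.environ.get("GEMINI_MODEL", "gemini-2.5-flash"),
    }
```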
Create `gemini-config.json`:
```json
{
  "enabled": true,
  "auto_consult": true,
  "cli_command": "gemini",
  "timeout": 300,
  "rate_limit_delay": 5.0,
  "log_consultations": true,
  "model": "gemini-2.5-flash",
  "sandbox_mode": true,
  "debug_mode": false,
  "uncertainty_thresholds": {
    "uncertainty_patterns": true,
    "complex_decisions": true,
    "critical_operations": true
  }
}
```
```python
UNCERTAINTY_PATTERNS = [
    r"\bI'm not sure\b",
    r"\bI think\b",
    r"\bpossibly\b",
    r"\bprobably\b",
    r"\bmight be\b",
    r"\bcould be\b",
    # ... more patterns
]

COMPLEX_DECISION_PATTERNS = [
    r"\bmultiple approaches\b",
    r"\bseveral options\b",
    r"\btrade-offs?\b",
    r"\balternatives?\b",
    # ... more patterns
]

CRITICAL_OPERATION_PATTERNS = [
    r"\bproduction\b",
    r"\bdatabase migration\b",
    r"\bsecurity\b",
    r"\bauthentication\b",
    # ... more patterns
]
```
```python
import re
from typing import Any, Dict, Optional


class GeminiIntegration:
    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}
        self.enabled = self.config.get('enabled', True)
        self.auto_consult = self.config.get('auto_consult', True)
        self.cli_command = self.config.get('cli_command', 'gemini')
        self.timeout = self.config.get('timeout', 30)
        self.rate_limit_delay = self.config.get('rate_limit_delay', 1)

    async def consult_gemini(self, query: str, context: str = "") -> Dict[str, Any]:
        """Consult Gemini CLI for second opinion"""
        # Rate limiting
        await self._enforce_rate_limit()

        # Prepare query with context
        full_query = self._prepare_query(query, context)

        # Execute Gemini CLI command
        result = await self._execute_gemini_command(full_query)

        return result

    def detect_uncertainty(self, text: str) -> bool:
        """Detect if text contains uncertainty patterns"""
        return any(re.search(pattern, text, re.IGNORECASE)
                   for pattern in UNCERTAINTY_PATTERNS)
```
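The private helpers called by `consult_gemini` are omitted above. A rough sketch of what they might look like, assuming the Gemini CLI reads its prompt from stdin as in the shell examples earlier; the method bodies are illustrative, not the actual module code:

```python
import asyncio
import time


class GeminiIntegration:  # illustrative continuation: helper methods referenced above
    async def _enforce_rate_limit(self):
        """Keep at least rate_limit_delay seconds between consecutive CLI calls."""
        now = time.monotonic()
        last = getattr(self, "_last_call", 0.0)
        wait = self.rate_limit_delay - (now - last)
        if wait > 0:
            await asyncio.sleep(wait)
        self._last_call = time.monotonic()

    def _prepare_query(self, query: str, context: str) -> str:
        """Prepend optional context to the question."""
        return f"Context:\n{context}\n\nQuestion: {query}" if context else query

    async def _execute_gemini_command(self, full_query: str) -> dict:
        """Pipe the prompt to the Gemini CLI via stdin and capture its reply."""
        proc = await asyncio.create_subprocess_exec(
            self.cli_command,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            stdout, stderr = await asyncio.wait_for(
                proc.communicate(full_query.encode()), timeout=self.timeout
            )
        except asyncio.TimeoutError:
            proc.kill()
            return {"status": "error", "error": f"Gemini CLI timed out after {self.timeout}s"}
        if proc.returncode != 0:
            return {"status": "error", "error": stderr.decode().strip()}
        return {"status": "success", "response": stdout.decode().strip()}
```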
```python
# Singleton pattern implementation
_integration = None


def get_integration(config: Optional[Dict[str, Any]] = None) -> GeminiIntegration:
    """Get or create the global Gemini integration instance"""
    global _integration
    if _integration is None:
        _integration = GeminiIntegration(config)
    return _integration
```
The singleton pattern ensures:
- Consistent Rate Limiting: All MCP tool calls share the same rate limiter
- Unified Configuration: Changes to config affect all usage points
- State Persistence: Consultation history and statistics are maintained
- Resource Efficiency: Only one instance manages the Gemini CLI connection
```python
from gemini_integration import get_integration

# Get the singleton instance
self.gemini = get_integration(config)
```
```
# In Claude Code
Use the consult_gemini tool with:
  query: "Should I use WebSockets or gRPC for real-time communication?"
  context: "Building a multiplayer application with real-time updates"
```
User: "How should I handle authentication?"
Claude: "I think OAuth might work, but I'm not certain about the security implications..."
[Auto-consultation triggered]
Gemini: "For authentication, consider these approaches: 1) OAuth 2.0 with PKCE for web apps..."
Synthesis: Both suggest OAuth but Claude uncertain about security. Gemini provides specific implementation details. Recommendation: Follow Gemini's OAuth 2.0 with PKCE approach.
```python
# `server` is an mcp.server.Server instance from the MCP Python SDK; `types` is mcp.types
@server.list_tools()
async def handle_list_tools():
    return [
        types.Tool(
            name="consult_gemini",
            description="Consult Gemini for a second opinion",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Question for Gemini"},
                    "context": {"type": "string", "description": "Additional context"}
                },
                "required": ["query"]
            }
        ),
        types.Tool(
            name="gemini_status",
            description="Check Gemini integration status",
            inputSchema={"type": "object", "properties": {}}
        ),
        types.Tool(
            name="toggle_gemini_auto_consult",
            description="Enable/disable automatic consultation",
            inputSchema={
                "type": "object",
                "properties": {
                    "enable": {"type": "boolean", "description": "Enable or disable"}
                }
            }
        )
    ]
```
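The listing above only advertises the tools; a matching call handler routes each invocation to the singleton integration. A condensed sketch in the same SDK decorator style, with error handling trimmed and the dispatch bodies kept illustrative:

```python
@server.call_tool()
async def handle_call_tool(name: str, arguments: dict):
    gemini = get_integration()
    if name == "consult_gemini":
        result = await gemini.consult_gemini(
            arguments["query"], arguments.get("context", "")
        )
        text = result.get("response") or result.get("error", "")
    elif name == "gemini_status":
        text = f"enabled={gemini.enabled}, auto_consult={gemini.auto_consult}"
    elif name == "toggle_gemini_auto_consult":
        gemini.auto_consult = arguments.get("enable", not gemini.auto_consult)
        text = f"auto_consult set to {gemini.auto_consult}"
    else:
        text = f"Unknown tool: {name}"
    return [types.TextContent(type="text", text=text)]
```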
| Issue | Solution |
|---|---|
| Gemini CLI not found | Install Node.js 18+ and `npm install -g @google/gemini-cli` |
| Authentication errors | Run `gemini` and sign in with your Google account |
| Node version issues | Use `nvm use 22.16.0` |
| Timeout errors | Increase `GEMINI_TIMEOUT` (default: 60s) |
| Auto-consult not working | Check that `GEMINI_AUTO_CONSULT=true` |
| Rate limiting | Adjust `GEMINI_RATE_LIMIT` (default: 2s) |
- API Credentials: Store securely, use environment variables
- Data Privacy: Be cautious about sending proprietary code
- Input Sanitization: Sanitize queries before sending (see the sketch after this list)
- Rate Limiting: Respect API limits (free tier: 60/min, 1000/day)
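What sanitization means here depends on your environment; one cautious sketch caps prompt length and strips control characters before anything reaches the CLI. Both the limit and the helper name are illustrative:

```python
import re

MAX_QUERY_CHARS = 8000  # illustrative cap; tune to your context limits

def sanitize_query(query: str) -> str:
    """Strip control characters (keeping tabs and newlines) and truncate overly long prompts."""
    cleaned = re.sub(r"[\x00-\x08\x0b-\x1f\x7f]", "", query)
    return cleaned[:MAX_QUERY_CHARS]
```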
- Host-Based Architecture: Both Gemini CLI and MCP server run on host for auth compatibility and simplicity
- Rate Limiting: Implement appropriate delays between calls
- Context Management: Keep context concise and relevant
- Error Handling: Always handle Gemini failures gracefully
- User Control: Allow users to disable auto-consultation
- Logging: Log consultations for debugging and analysis
- Caching: Cache similar queries to reduce API calls
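One lightweight way to implement that caching point is an in-memory map keyed by a hash of the query plus context; the snippet below is an illustrative sketch, not part of the shipped module:

```python
import hashlib

_consultation_cache: dict[str, dict] = {}

def cache_key(query: str, context: str = "") -> str:
    """Stable key for a query/context pair."""
    return hashlib.sha256(f"{query}\n{context}".encode()).hexdigest()

async def consult_with_cache(gemini, query: str, context: str = "") -> dict:
    """Return a cached result when the same question was already asked this session."""
    key = cache_key(query, context)
    if key not in _consultation_cache:
        _consultation_cache[key] = await gemini.consult_gemini(query, context)
    return _consultation_cache[key]
```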
- Architecture Decisions: Get second opinions on design choices
- Security Reviews: Validate security implementations
- Performance Optimization: Compare optimization strategies
- Code Quality: Review complex algorithms or patterns
- Troubleshooting: Debug complex technical issues