Developer Guide
Learn how to build custom agents and extend Browser Operator’s capabilities.
GitHub Repository: github.com/tysonthomas9/browser-operator-devtools-frontend
Quick Start
Prerequisites
- Node.js 18+ and npm
- Python 3.8+
- Git
Download Pre-built Application
For most developers, we recommend using the pre-built application:
- Download Browser Operator
- Install and configure your AI provider
- Start building custom agents
Building from Source
Step 1: Set up depot_tools
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH="$PATH:/path/to/depot_tools"
Step 2: Fetch DevTools Frontend
mkdir devtools
cd devtools
fetch devtools-frontend
Step 3: Set up the project
cd devtools-frontend
gclient sync
npm run build
Step 4: Switch to Browser Operator fork
git remote add upstream git@github.com:tysonthomas9/browser-operator-devtools-frontend.git
git fetch upstream
git checkout upstream/main
Step 5: Build and run in development mode
# Terminal 1: Build with watch mode
npm run build -- --watch
# Terminal 2: Serve the built files
cd out/Default/gen/front_end
python3 -m http.server 8000
# Terminal 3: Launch Chrome with custom DevTools
# For macOS:
/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary \
--disable-infobars \
--custom-devtools-frontend=http://localhost:8000/
# For Windows/Linux:
<path-to-devtools-frontend>/third_party/chrome/chrome-<platform>/chrome \
--disable-infobars \
--custom-devtools-frontend=http://localhost:8000/
Understanding the Architecture
Browser Operator uses a multi-agent framework where specialized agents work together:
User Request → Orchestrator → Specialized Agents → Tools → Web Actions
Key Components
- Orchestrator: Routes requests to appropriate agents
- Agents: Specialized for tasks (research, shopping, data extraction)
- Tools: Browser automation capabilities (navigate, click, extract)
- LLM Integration: Supports OpenAI, Claude, Gemini via LiteLLM
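To make the request flow concrete, here is a minimal sketch of the routing step. Everything in it is hypothetical: the `routeRequest` function, the keyword matching, and the agent names are illustrations only, not Browser Operator's actual orchestrator, which may well ask the LLM to choose an agent instead.

```javascript
// Hypothetical registry of agents; names and keywords are illustrative only.
const agents = [
  { name: "researcher", keywords: ["research", "paper", "study"] },
  { name: "shopper", keywords: ["price", "buy", "compare"] },
  { name: "extractor", keywords: ["extract", "scrape", "table"] },
];

// Pick the agent whose keywords appear most often in the request.
// Ties keep the earlier agent; no match returns the fallback.
function routeRequest(request, registry = agents, fallback = "researcher") {
  const text = request.toLowerCase();
  let best = { name: fallback, score: 0 };
  for (const agent of registry) {
    const score = agent.keywords.filter((k) => text.includes(k)).length;
    if (score > best.score) best = { name: agent.name, score };
  }
  return best.name;
}
```

With these sample keywords, `routeRequest("Compare prices for this laptop")` returns `"shopper"`, and a request that matches nothing falls back to `"researcher"`.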
Creating Your First Agent
1. Simple Research Agent
// Example: Academic Research Agent
const researchAgentConfig = {
  name: "academic_researcher",
  description: "Searches and analyzes academic papers",
  systemPrompt: `You are an academic research specialist.
    Search for peer-reviewed papers, analyze findings,
    and provide citations in APA format.`,
  tools: ["navigate_url", "extract_content", "search_web"],
  maxIterations: 15
};
2. E-commerce Agent
// Example: Price Comparison Agent
const shoppingAgentConfig = {
  name: "price_tracker",
  description: "Compares prices across multiple stores",
  systemPrompt: `You are a shopping assistant. Find the best
    prices for products across different websites. Include
    shipping costs and availability.`,
  tools: ["navigate_url", "extract_data", "take_screenshot"],
  schema: {
    type: "object",
    properties: {
      product: { type: "string" },
      maxPrice: { type: "number" },
      stores: { type: "array", items: { type: "string" } }
    }
  }
};
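The `schema` field uses JSON Schema keywords (`type`, `properties`, `items`). The source does not say whether the schema describes the agent's input or its output; the sketch below assumes it describes the arguments passed to the agent. `matchesSchema` is a hypothetical helper that covers only the keywords used in this example, not a full JSON Schema validator.

```javascript
// Minimal checker for the subset of JSON Schema used in the example config
// (object / string / number / array). Hypothetical helper, not a library API.
function matchesSchema(value, schema) {
  switch (schema.type) {
    case "string":
      return typeof value === "string";
    case "number":
      return typeof value === "number" && Number.isFinite(value);
    case "array":
      return Array.isArray(value) &&
        value.every((item) => !schema.items || matchesSchema(item, schema.items));
    case "object":
      if (typeof value !== "object" || value === null || Array.isArray(value)) {
        return false;
      }
      // Properties are treated as optional: only keys that are present are checked.
      return Object.entries(schema.properties || {}).every(
        ([key, sub]) => !(key in value) || matchesSchema(value[key], sub)
      );
    default:
      return false; // unsupported keyword in this sketch
  }
}

// Same schema as in the price comparison config.
const priceQuerySchema = {
  type: "object",
  properties: {
    product: { type: "string" },
    maxPrice: { type: "number" },
    stores: { type: "array", items: { type: "string" } },
  },
};
```

Under these assumptions, `{ product: "laptop", maxPrice: 900, stores: ["StoreA"] }` passes, while `{ product: "laptop", maxPrice: "900" }` is rejected because `maxPrice` is a string.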
Available Tools
Browser Operator provides these built-in tools for agents:
Navigation & Interaction
- navigate_url: Go to any website
- perform_action: Click, type, scroll
- navigate_back: Browser back button
- take_screenshot: Capture page visuals
Data Extraction
- get_page_content: Extract text/HTML
- schema_based_extractor: Structured data extraction
- search_content: Find specific information
- get_accessibility_tree: Page structure analysis
Advanced Features
- bookmark_store: Save pages for later
- document_search: Search saved content
- combined_extraction: Multi-format data export
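Because an agent config refers to tools by name, a typo in the `tools` array is easy to miss. The sketch below checks a config against the tool names listed in this section. `findUnknownTools` is a hypothetical helper, and the set covers only the tools named here; any custom tools would need to be added to it.

```javascript
// Tool names copied from the lists in this section.
const BUILT_IN_TOOLS = new Set([
  "navigate_url", "perform_action", "navigate_back", "take_screenshot",
  "get_page_content", "schema_based_extractor", "search_content",
  "get_accessibility_tree", "bookmark_store", "document_search",
  "combined_extraction",
]);

// Return the entries of config.tools that are not in the known set.
function findUnknownTools(config, known = BUILT_IN_TOOLS) {
  return (config.tools || []).filter((name) => !known.has(name));
}
```

For example, `findUnknownTools({ tools: ["navigate_url", "take_screenshots"] })` returns `["take_screenshots"]`, which flags the misspelled name before the agent runs.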
Agent Communication Patterns
Sequential Workflow
Research Agent → gathers data →
Analysis Agent → processes findings →
Report Agent → generates summary
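A sequential workflow amounts to passing each agent's output to the next agent. The sketch below models agents as plain async functions; the three stand-in agents and `runSequential` are hypothetical and only illustrate the chaining.

```javascript
// Stand-in agents: each is an async function from input to output.
const researchAgent = async (topic) => ({ topic, findings: [`note on ${topic}`] });
const analysisAgent = async (data) => ({ ...data, insightCount: data.findings.length });
const reportAgent = async (data) => `${data.topic}: ${data.insightCount} insight(s)`;

// Run agents one after another, feeding each output to the next agent.
async function runSequential(agentFns, input) {
  let result = input;
  for (const agent of agentFns) {
    result = await agent(result);
  }
  return result;
}
```

Calling `runSequential([researchAgent, analysisAgent, reportAgent], "quantum computing")` resolves to `"quantum computing: 1 insight(s)"`.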
Parallel Processing
Agent A: Monitor competitor prices
Agent B: Check product reviews } → Combine results
Agent C: Analyze shipping options
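Parallel processing can be sketched with `Promise.allSettled`, which waits for every task and reports failures individually, so one failing agent does not discard the others' results. `runParallel` and the task names in the usage note are hypothetical.

```javascript
// Run independent agents concurrently and merge their results by name.
// A failed agent is recorded as an error instead of aborting the others.
async function runParallel(tasks) {
  const names = Object.keys(tasks);
  const settled = await Promise.allSettled(names.map((name) => tasks[name]()));
  const combined = {};
  settled.forEach((outcome, i) => {
    combined[names[i]] = outcome.status === "fulfilled"
      ? { ok: true, value: outcome.value }
      : { ok: false, error: String(outcome.reason) };
  });
  return combined;
}
```

A call such as `runParallel({ prices: checkPrices, reviews: checkReviews, shipping: checkShipping })`, where each value is an async function, resolves to one object keyed by task name, with `ok: false` entries for tasks that threw.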
Conditional Handoffs
handoffs: [{
  agent: "deep_analyzer",
  condition: "when detailed analysis needed",
  inputMapping: { data: "research_findings" }
}]
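The source does not spell out how `inputMapping` is interpreted. One plausible reading, assumed in the sketch below, is `{ targetKey: sourceKey }`: the target agent receives `data`, filled from the current agent's `research_findings`. `buildHandoffInput` is a hypothetical helper, and the natural-language `condition` is left out because it would presumably be evaluated by the LLM.

```javascript
// Build the input for the target agent by copying values out of the
// current agent's state. Assumes inputMapping is { targetKey: sourceKey }.
function buildHandoffInput(handoff, state) {
  const input = {};
  for (const [targetKey, sourceKey] of Object.entries(handoff.inputMapping || {})) {
    if (!(sourceKey in state)) {
      throw new Error(`Missing "${sourceKey}" for handoff to ${handoff.agent}`);
    }
    input[targetKey] = state[sourceKey];
  }
  return input;
}

// Handoff entry taken from the example in this section.
const exampleHandoff = {
  agent: "deep_analyzer",
  condition: "when detailed analysis needed",
  inputMapping: { data: "research_findings" },
};
```

Under that reading, `buildHandoffInput(exampleHandoff, { research_findings: ["paper A"], scratch: 1 })` yields `{ data: ["paper A"] }`; keys that are not mapped, such as `scratch`, are not forwarded.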
Testing Your Agents
Local Testing
- Open Browser Operator
- Access AI Chat panel
- Test your agent with various inputs
- Monitor browser actions in real-time
Example Test Cases
// Test research capability
"Research the latest developments in quantum computing"
// Test data extraction
"Extract all product prices from this page"
// Test multi-step workflow
"Compare Python vs JavaScript for web development"
Best Practices
1. Agent Design
- Single Responsibility: Each agent should excel at one task
- Clear Prompts: Be specific about expected behavior
- Error Handling: Account for website variations
2. Performance
- Limit Iterations: Set reasonable maxIterations (10-20)
- Efficient Tools: Use appropriate tools for the task
- Cache Results: Avoid redundant operations
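The iteration limit and caching advice can be sketched as two small helpers. Both `withCache` and `runWithLimit` are hypothetical and independent of Browser Operator's internals; they only illustrate the pattern of bounding an agent loop and reusing earlier tool results.

```javascript
// Wrap a tool call so repeated calls with the same arguments reuse the result.
function withCache(fn) {
  const cache = new Map();
  return async (...args) => {
    const key = JSON.stringify(args);
    if (!cache.has(key)) {
      // Store the promise so concurrent callers share one in-flight call.
      // Note: a rejected promise stays cached in this simple version.
      cache.set(key, Promise.resolve(fn(...args)));
    }
    return cache.get(key);
  };
}

// Run a step function until it reports completion or the cap is reached.
async function runWithLimit(step, maxIterations = 15) {
  for (let i = 1; i <= maxIterations; i++) {
    const { done, value } = await step(i);
    if (done) return { value, iterations: i, completed: true };
  }
  return { value: undefined, iterations: maxIterations, completed: false };
}
```

Wrapping a page fetch in `withCache` means a second request for the same URL does not trigger another navigation, and `runWithLimit` returns `completed: false` instead of looping forever when an agent never finishes.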
3. User Experience
- Progress Updates: Provide clear status messages
- Structured Output: Format results clearly
- Actionable Results: Include links and next steps
Advanced Topics
Multi-Agent Orchestration
Create complex workflows by combining multiple agents:
// Planner agent delegates to specialists
const plannerConfig = {
  name: "workflow_planner",
  description: "Coordinates multi-agent workflows",
  handoffs: [
    { agent: "researcher", condition: "needs research" },
    { agent: "analyst", condition: "needs analysis" },
    { agent: "writer", condition: "needs report" }
  ]
};
Custom Tool Development
Extend capabilities by creating custom tools:
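class CustomDataProcessor {
  name = "process_csv_data";
  description = "Processes CSV files from web pages";

  // args.csvText is a hypothetical argument holding the raw CSV text
  async execute(args) {
    // Your custom logic here; this placeholder splits rows into cells
    // and does not handle quoted fields
    const processedData = String(args.csvText ?? "")
      .trim()
      .split(/\r?\n/)
      .map((row) => row.split(","));
    return processedData;
  }
}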
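How custom tools are registered is not described here. As a rough illustration, a registry keyed by tool name could dispatch calls to any object with the `name`, `description`, and `execute` members shown above. `ToolRegistry` and `wordCountTool` are hypothetical and do not reflect Browser Operator's actual registration API.

```javascript
// Minimal registry keyed by tool name. The { name, description, execute }
// shape mirrors the class above; the registry itself is hypothetical.
class ToolRegistry {
  #tools = new Map();

  register(tool) {
    if (this.#tools.has(tool.name)) {
      throw new Error(`Tool "${tool.name}" is already registered`);
    }
    this.#tools.set(tool.name, tool);
    return this; // allow chained register() calls
  }

  // Look up a tool by name and run it with the given arguments.
  async run(name, args = {}) {
    const tool = this.#tools.get(name);
    if (!tool) throw new Error(`Unknown tool "${name}"`);
    return tool.execute(args);
  }
}

// Stand-in tool used only to exercise the registry.
const wordCountTool = {
  name: "count_words",
  description: "Counts whitespace-separated words in a string",
  async execute({ text = "" }) {
    const trimmed = text.trim();
    return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  },
};
```

After `new ToolRegistry().register(wordCountTool)`, calling `run("count_words", { text: "compare these three prices" })` resolves to `4`, and an unregistered name rejects with an error.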
Resources
Community
- Discord - Get help and share agents
- GitHub Issues - Report bugs
- Twitter - Updates and news
Next Steps
- Download Browser Operator and configure your AI provider
- Try the example agents to understand capabilities
- Build your own agent for your specific use case
- Share with the community and get feedback
Ready to automate the web with AI? Start building your first agent today!