Developer Guide

Learn how to build custom agents and extend Browser Operator’s capabilities.

GitHub Repository: github.com/tysonthomas9/browser-operator-devtools-frontend

Quick Start

Prerequisites

  • Node.js 18+ and npm
  • Python 3.8+
  • Git

Download Pre-built Application

For most developers, we recommend using the pre-built application:

  1. Download Browser Operator
  2. Install and configure your AI provider
  3. Start building custom agents

Building from Source

Step 1: Set up depot_tools

git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH="$PATH:/path/to/depot_tools"

Step 2: Fetch DevTools Frontend

mkdir devtools
cd devtools
fetch devtools-frontend

Step 3: Set up the project

cd devtools-frontend
gclient sync
npm run build

Step 4: Switch to the Browser Operator fork

git remote add upstream git@github.com:tysonthomas9/browser-operator-devtools-frontend.git
git fetch upstream
git checkout upstream/main

Step 5: Build and run in development mode

# Terminal 1: Build with watch mode
npm run build -- --watch

# Terminal 2: Serve the built files on port 8000 (the port used by the Chrome flag below)
cd out/Default/gen/front_end
python3 -m http.server 8000

# Terminal 3: Launch Chrome with custom DevTools
# For macOS:
/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary \
  --disable-infobars \
  --custom-devtools-frontend=http://localhost:8000/

# For Windows/Linux:
<path-to-devtools-frontend>/third_party/chrome/chrome-<platform>/chrome \
  --disable-infobars \
  --custom-devtools-frontend=http://localhost:8000/

Detailed build instructions →

Understanding the Architecture

Browser Operator uses a multi-agent framework where specialized agents work together:

User Request → Orchestrator → Specialized Agents → Tools → Web Actions

Key Components

  • Orchestrator: Routes requests to appropriate agents
  • Agents: Specialized for tasks (research, shopping, data extraction)
  • Tools: Browser automation capabilities (navigate, click, extract)
  • LLM Integration: Supports OpenAI, Claude, and Gemini models through LiteLLM
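
To make the routing step concrete, here is a minimal sketch of how an orchestrator might pick an agent for a request. It is not Browser Operator's actual implementation: the registry, the keyword lists, the fallback agent, and the routeRequest function are all assumptions made for illustration, and a real orchestrator would more likely ask the LLM to choose.

```javascript
// Hypothetical agent registry: names and keywords are illustrative only.
const agentRegistry = [
  { name: "research", keywords: ["research", "paper", "study"] },
  { name: "shopping", keywords: ["price", "buy", "store"] },
  { name: "data_extraction", keywords: ["extract", "table", "scrape"] },
];

// Return the name of the first agent whose keyword appears in the request.
// Falls back to "research" (an assumed default) when nothing matches.
function routeRequest(request, registry = agentRegistry) {
  const text = request.toLowerCase();
  const match = registry.find((agent) =>
    agent.keywords.some((keyword) => text.includes(keyword))
  );
  return match ? match.name : "research";
}

console.log(routeRequest("Find the best price for a USB-C hub")); // shopping
```

Keyword matching keeps the sketch self-contained; the point is only that the orchestrator maps one request to one specialized agent before any tool runs.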

Creating Your First Agent

1. Simple Research Agent

// Example: Academic Research Agent
const researchAgentConfig = {
  name: "academic_researcher",
  description: "Searches and analyzes academic papers",
  systemPrompt: `You are an academic research specialist. 
    Search for peer-reviewed papers, analyze findings, 
    and provide citations in APA format.`,
  tools: ["navigate_url", "extract_content", "search_web"],
  maxIterations: 15
};

2. E-commerce Agent

// Example: Price Comparison Agent
const shoppingAgentConfig = {
  name: "price_tracker",
  description: "Compares prices across multiple stores",
  systemPrompt: `You are a shopping assistant. Find the best 
    prices for products across different websites. Include 
    shipping costs and availability.`,
  tools: ["navigate_url", "extract_data", "take_screenshot"],
  schema: {
    type: "object",
    properties: {
      product: { type: "string" },
      maxPrice: { type: "number" },
      stores: { type: "array", items: { type: "string" } }
    }
  }
};

Available Tools

Browser Operator provides these built-in tools for agents:

Navigation & Interaction

  • navigate_url - Go to any website
  • perform_action - Click, type, scroll
  • navigate_back - Browser back button
  • take_screenshot - Capture page visuals

Data Extraction

  • get_page_content - Extract text/HTML
  • schema_based_extractor - Structured data extraction
  • search_content - Find specific information
  • get_accessibility_tree - Page structure analysis

Advanced Features

  • bookmark_store - Save pages for later
  • document_search - Search saved content
  • combined_extraction - Multi-format data export
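
Because agent configs refer to tools by name, a typo in a tool name is easy to miss. The sketch below checks a config's tools array against the names listed above. The findUnknownTools helper and the draftConfig object are hypothetical, and any custom tools you register would need to be added to the set.

```javascript
// Tool names copied from the built-in tool list in this guide.
const BUILT_IN_TOOLS = new Set([
  "navigate_url", "perform_action", "navigate_back", "take_screenshot",
  "get_page_content", "schema_based_extractor", "search_content",
  "get_accessibility_tree", "bookmark_store", "document_search",
  "combined_extraction",
]);

// Hypothetical helper: return the tool names that are not in the built-in list.
function findUnknownTools(agentConfig) {
  return (agentConfig.tools ?? []).filter((tool) => !BUILT_IN_TOOLS.has(tool));
}

// Illustrative config containing one name that is not a built-in tool.
const draftConfig = {
  name: "page_summarizer",
  tools: ["navigate_url", "get_page_content", "summarize_page"],
};

console.log(findUnknownTools(draftConfig)); // logs the single unknown name
```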

Agent Communication Patterns

Sequential Workflow

Research Agent → gathers data → 
Analysis Agent → processes findings → 
Report Agent → generates summary
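
A sequential workflow amounts to passing each agent's output to the next agent as input. The following sketch models agents as plain async functions. The three stand-in agents and the runSequential helper are assumptions for illustration, not part of Browser Operator's API.

```javascript
// Stand-in agents: each is an async function from input to output.
const researchAgent = async (topic) => ({ topic, findings: [`note on ${topic}`] });
const analysisAgent = async (data) => ({ ...data, insightCount: data.findings.length });
const reportAgent = async (data) => `${data.topic}: ${data.insightCount} insight(s)`;

// Run the agents in order, feeding each result into the next agent.
async function runSequential(agents, input) {
  let result = input;
  for (const agent of agents) {
    result = await agent(result);
  }
  return result;
}

runSequential([researchAgent, analysisAgent, reportAgent], "quantum computing")
  .then((summary) => console.log(summary));
```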

Parallel Processing

Agent A: Monitor competitor prices
Agent B: Check product reviews     } → Combine results
Agent C: Analyze shipping options
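
The parallel pattern can be sketched with Promise.allSettled, which waits for every agent and keeps the results that succeeded even if one agent fails. The stand-in agents and the runParallel helper below are hypothetical; real agents would drive the browser instead of returning placeholder strings.

```javascript
// Stand-in agents for the three tasks in the diagram above.
const agentsByName = {
  prices: async (product) => [`${product}: price lookup placeholder`],
  reviews: async (product) => [`${product}: review lookup placeholder`],
  shipping: async (product) => [`${product}: shipping lookup placeholder`],
};

// Start all agents at once and combine their outcomes under each agent's name.
// A failed agent contributes an { error } entry instead of aborting the whole run.
async function runParallel(agents, input) {
  const names = Object.keys(agents);
  const settled = await Promise.allSettled(names.map((name) => agents[name](input)));
  const combined = {};
  settled.forEach((outcome, index) => {
    combined[names[index]] =
      outcome.status === "fulfilled" ? outcome.value : { error: String(outcome.reason) };
  });
  return combined;
}

runParallel(agentsByName, "USB-C hub").then((combined) => console.log(combined));
```

Promise.all would also work here, but it rejects as soon as any single agent fails, which discards the results of the others.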

Conditional Handoffs

handoffs: [{
  agent: "deep_analyzer",
  condition: "when detailed analysis needed",
  inputMapping: { data: "research_findings" }
}]
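
One plausible reading of inputMapping is that each key names an input of the target agent and each value names a field in the current agent's output. The sketch below implements that reading. The semantics are an assumption rather than documented behavior, and buildHandoffInput and the sample output object are hypothetical.

```javascript
// Assumed semantics: inputMapping = { targetInputName: "sourceOutputField" }.
function buildHandoffInput(handoff, sourceOutput) {
  const input = {};
  for (const [targetKey, sourceKey] of Object.entries(handoff.inputMapping ?? {})) {
    input[targetKey] = sourceOutput[sourceKey];
  }
  return input;
}

const handoff = {
  agent: "deep_analyzer",
  condition: "when detailed analysis needed",
  inputMapping: { data: "research_findings" },
};

// Illustrative output from the agent that is handing off.
const researchOutput = {
  research_findings: ["finding A", "finding B"],
  elapsedMs: 1200,
};

// Only the mapped field is forwarded, under the target's input name.
console.log(buildHandoffInput(handoff, researchOutput));
```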

Testing Your Agents

Local Testing

  1. Open Browser Operator
  2. Open the AI Chat panel
  3. Test your agent with various inputs
  4. Monitor browser actions in real-time

Example Test Cases

// Test research capability
"Research the latest developments in quantum computing"

// Test data extraction
"Extract all product prices from this page"

// Test multi-step workflow  
"Compare Python vs JavaScript for web development"

Best Practices

1. Agent Design

  • Single Responsibility: Each agent should excel at one task
  • Clear Prompts: Be specific about expected behavior
  • Error Handling: Account for website variations

2. Performance

  • Limit Iterations: Set reasonable maxIterations (10-20)
  • Efficient Tools: Use appropriate tools for the task
  • Cache Results: Avoid redundant operations
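
The iteration limit and the caching advice can be illustrated together. This is a minimal sketch under stated assumptions: withCache, runAgentLoop, and the fetchPage stand-in are hypothetical helpers, not Browser Operator functions, and the cache key assumes tool arguments can be serialized with JSON.stringify.

```javascript
// Wrap an async tool so repeated calls with identical arguments reuse the first result.
// Note: a rejected promise is cached too; evict on failure if retries are needed.
function withCache(tool) {
  const cache = new Map();
  return (args) => {
    const key = JSON.stringify(args);
    if (!cache.has(key)) cache.set(key, tool(args));
    return cache.get(key);
  };
}

// Call `step` until it reports done or the iteration budget is spent.
async function runAgentLoop(step, maxIterations = 15) {
  for (let i = 1; i <= maxIterations; i++) {
    const { done, result } = await step(i);
    if (done) return { result, iterations: i, hitLimit: false };
  }
  return { result: null, iterations: maxIterations, hitLimit: true };
}

// Stand-in tool that counts how often the underlying fetch actually runs.
let fetchCount = 0;
const fetchPage = withCache(async ({ url }) => {
  fetchCount += 1;
  return `content of ${url}`;
});

runAgentLoop(async (i) => {
  const page = await fetchPage({ url: "https://example.com" });
  return { done: i === 3, result: page };
}, 10).then((outcome) => console.log(outcome, { fetchCount }));
```

In this run the loop stops after three iterations, and the underlying fetch executes once because the second and third calls hit the cache.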

3. User Experience

  • Progress Updates: Provide clear status messages
  • Structured Output: Format results clearly
  • Actionable Results: Include links and next steps

Advanced Topics

Multi-Agent Orchestration

Create complex workflows by combining multiple agents:

// Planner agent delegates to specialists
const plannerConfig = {
  name: "workflow_planner",
  description: "Coordinates multi-agent workflows",
  handoffs: [
    { agent: "researcher", condition: "needs research" },
    { agent: "analyst", condition: "needs analysis" },
    { agent: "writer", condition: "needs report" }
  ]
};

Custom Tool Development

Extend capabilities by creating custom tools:

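class CustomDataProcessor {
  name = "process_csv_data";
  description = "Processes CSV files from web pages";

  async execute(args) {
    // Your custom logic here. This placeholder returns the input unchanged
    // so the class runs as written.
    const processedData = args;
    return processedData;
  }
}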
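
As a more concrete illustration of the same shape (name, description, and an async execute method), here is a sketch of a tool that parses simple CSV text into row objects. The class name, tool name, and csvText argument are assumptions, and the parser deliberately ignores quoted fields and embedded commas.

```javascript
class CsvRowParser {
  name = "parse_csv_text"; // hypothetical tool name
  description = "Parses simple CSV text into row objects";

  // args.csvText is an assumed argument name for the raw CSV string.
  async execute(args) {
    const lines = (args.csvText ?? "").trim().split(/\r?\n/).filter(Boolean);
    if (lines.length === 0) return { columns: [], rows: [] };

    // The first line is treated as the header row.
    const columns = lines[0].split(",").map((cell) => cell.trim());
    const rows = lines.slice(1).map((line) => {
      const cells = line.split(",").map((cell) => cell.trim());
      return Object.fromEntries(columns.map((column, i) => [column, cells[i] ?? ""]));
    });
    return { columns, rows };
  }
}

new CsvRowParser()
  .execute({ csvText: "name,price\nHub,19.99\nCable,5" })
  .then((parsed) => console.log(parsed.rows));
```

All cell values stay as strings; converting prices to numbers is left to the caller so the parser makes no assumptions about column types.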

Next Steps

  1. Download Browser Operator and configure your AI provider
  2. Try the example agents to understand capabilities
  3. Build your own agent for your specific use case
  4. Share with the community and get feedback

Ready to automate the web with AI? Start building your first agent today!