CeSail

🚀 The AI-native web automation engine – parse, understand, and act on any website with agent-friendly data.


🌟 Why CeSail?

Traditional automation tools like Playwright and Selenium are great at clicking buttons, but terrible at helping AI agents understand the meaning of web pages.

CeSail changes that. Think of it as the missing bridge between the messy web and intelligent AI workflows.


🎥 Demo

Here’s CeSail + Cursor MCP in action, searching flights on Expedia:

CeSail Demo - Flight Search on Expedia


🔑 Features

  1. 🌐 Web Page Analysis – Extracts DOM elements, forms, buttons, and interactive components
  2. 🧠 Agent-Friendly Parsing – Converts raw HTML into structured, semantic data
  3. 🎯 Actionable Intelligence – Identifies clickable items, input fields, and navigation paths
  4. 📊 Structured Output – JSON-like objects that AI can instantly consume
  5. πŸ” Context Preservation – Maintains relationships between elements and their purposes
  6. 📸 Visual Overlays – Screenshots with highlighted action items

Quick Start

The easiest way to get started with CeSail is to install it from PyPI:

# Install CeSail
pip install cesail

# Install Playwright browsers
playwright install

Simple Example

Here’s a quick example that demonstrates CeSail’s core functionality:

import asyncio
from cesail import DOMParser, Action, ActionType

async def quick_demo():
    """Quick demonstration of CeSail's web automation capabilities."""
    async with DOMParser(headless=False) as parser:
        # Navigate to a website
        action = Action(
            type=ActionType.NAVIGATE,
            metadata={"url": "https://www.example.com"}
        )
        await parser._action_executor.execute_action(action)
        
        # Analyze the page and get structured data
        parsed_page = await parser.analyze_page()
        print(f"Found {len(parsed_page.important_elements.elements)} interactive elements")
        
        # Take a screenshot with overlays
        await parser.take_screenshot("demo_screenshot.png")
        
        # Show available actions
        print("Available actions:")
        for element in parsed_page.important_elements.elements[:3]:
            print(f"  - {element.type}: {element.text}")

# Run the demo
asyncio.run(quick_demo())

MCP (Model Context Protocol) Integration

CeSail provides a FastMCP server that enables AI assistants like Cursor to directly interact with web pages through standardized APIs. This allows you to give natural language commands to your AI assistant and have it execute web automation tasks.

Setting up MCP with Cursor

  1. Install CeSail MCP Server:
    pip install cesail fastmcp
    playwright install
    
  2. Configure MCP Settings:
    • Open Cursor
    • Go to Settings → Extensions → MCP
    • Add a new server configuration:
    • Note: Make sure to use the path to your Python executable. You can find it by running which python or which python3 in your terminal.
      {
        "mcpServers": {
          "cesail": {
            "command": "python3",
            "args": ["-m", "cesail.cesail_mcp.fastmcp_server"],
            "env": {
              "PYTHONUNBUFFERED": "1"
            },
            "description": "CeSail MCP Server for comprehensive web automation and DOM parsing",
            "capabilities": {
              "tools": {
                "listChanged": true
              }
            }
          }
        }
      }
      

    Note: This configuration has been tested with Cursor. For best performance, disable the get_screenshot capability, since Cursor can take a while to process screenshots. To disable it, go to Cursor Settings → Tools & Integrations → MCP and turn off get_screenshot for the CeSail server. The server should also work with other MCP-compatible agents, though it hasn’t been tested with them.

    For more help setting up Cursor MCP, see: https://docs.cursor.com/en/context/mcp

  3. Test the FastMCP Server:
    python3 -m cesail.cesail_mcp.fastmcp_server
    

    Run this command to ensure the server launches properly. You should see output indicating the server is starting up.

  4. Use in Cursor: Now you can ask Cursor to perform web automation tasks:
    "Using cesail, Navigate to example.com and do a certain task"
    "Using cesail, ..."
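
The note in step 2 about using the path to your Python executable can also be handled with a short script. The following is a minimal sketch, not part of CeSail: it builds the same mcpServers entry shown in step 2 but fills in "command" with the absolute path of the interpreter that runs the script (sys.executable). The function name build_cesail_mcp_config is hypothetical.

```python
import json
import sys


def build_cesail_mcp_config(python_path: str = sys.executable) -> dict:
    """Return the Cursor MCP entry from step 2, using an explicit interpreter path."""
    return {
        "mcpServers": {
            "cesail": {
                "command": python_path,
                "args": ["-m", "cesail.cesail_mcp.fastmcp_server"],
                "env": {"PYTHONUNBUFFERED": "1"},
                "description": "CeSail MCP Server for comprehensive web automation and DOM parsing",
                "capabilities": {"tools": {"listChanged": True}},
            }
        }
    }


if __name__ == "__main__":
    # Run this inside the environment where cesail is installed,
    # then paste the printed JSON into Cursor's MCP settings.
    print(json.dumps(build_cesail_mcp_config(), indent=2))
```

Running it from inside your virtual environment makes sure Cursor launches the interpreter that actually has cesail installed.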
    

Why Agents Need This

Traditional web scraping provides raw HTML, which is difficult for AI agents to interpret. CeSail solves this by:

This transformation makes it possible for AI agents to:

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Cursor      │    │   MCP Server    │    │  DOM Parser     │
│   (AI Agent)    │◄──►│   (Python)      │◄──►│  (Python)       │
│                 │    │                 │    │                 │
│ • Natural Lang. │    │ • FastMCP APIs  │    │ • Page Analyzer │
│ • Task Planning │    │ • Web Automation│    │ • Action Exec.  │
│ • Execution     │    │ • Screenshots   │    │ • Idle Watcher  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
                                                        │
                                        ┌─────────────────┐
                                        │   Web Browser   │
                                        │  (Playwright)   │
                                        │                 │
                                        │ • Page Control  │
                                        │ • DOM Access    │
                                        │ • Screenshots   │
                                        │ • Actions       │
                                        └─────────────────┘
                                                        │
                                                        │
                                        ┌─────────────────┐
                                        │  JavaScript     │
                                        │  Layer          │
                                        │                 │
                                        │ • Element Ext.  │
                                        │ • Selector Gen. │
                                        │ • Text Analysis │
                                        │ • Action Ext.   │
                                        └─────────────────┘

Key Architecture Points:

Components

1. DOM Parser JavaScript Layer (cesail/dom_parser/src/js/)

Core DOM parsing engine that transforms raw HTML into structured, agent-friendly data.

Language: JavaScript
Features:

Key Components:

Data Transformation Example:

// Raw HTML input
<button class="btn-primary" onclick="submit()">Submit Form</button>
<input type="text" placeholder="Enter email" id="email" />

// CeSail transforms to agent-friendly JSON
{
  "type": "BUTTON",
  "selector": "button.btn-primary",
  "text": "Submit Form",
  "action": "CLICK",
  "importance": 0.9,
  "context": "form submission",
  "metadata": {
    "aria-label": null,
    "disabled": false,
    "visible": true
  }
}

Documentation: See cesail/dom_parser/src/js/README.md
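
To illustrate how an agent might consume records shaped like the JSON above, here is a minimal Python sketch. It assumes each element arrives as a plain dict with the importance and metadata fields shown in the example. The helper actionable_elements and the 0.5 threshold are hypothetical and not part of CeSail's API.

```python
from typing import Any


def actionable_elements(
    elements: list[dict[str, Any]], min_importance: float = 0.5
) -> list[dict[str, Any]]:
    """Keep visible, enabled elements at or above min_importance, most important first."""
    kept = [
        el for el in elements
        if el.get("importance", 0.0) >= min_importance
        and el.get("metadata", {}).get("visible", False)
        and not el.get("metadata", {}).get("disabled", False)
    ]
    return sorted(kept, key=lambda el: el.get("importance", 0.0), reverse=True)


# The first record mirrors the example above; the second is made up for contrast.
elements = [
    {"type": "BUTTON", "selector": "button.btn-primary", "text": "Submit Form",
     "action": "CLICK", "importance": 0.9,
     "metadata": {"aria-label": None, "disabled": False, "visible": True}},
    {"type": "BUTTON", "selector": "button.hidden", "text": "Debug",
     "action": "CLICK", "importance": 0.2,
     "metadata": {"aria-label": None, "disabled": False, "visible": False}},
]

for el in actionable_elements(elements):
    print(f'{el["action"]} {el["selector"]} ({el["text"]})')
```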

2. DOM Parser Python Layer (cesail/dom_parser/src/py/)

Orchestration layer that manages browser interactions and provides high-level APIs.

Language: Python
Features:

Key Components:

Integration Example:

import asyncio
from cesail import DOMParser

async def main():
    async with DOMParser() as parser:
        # Navigate to page
        await parser.navigate("https://example.com")

        # Analyze page structure
        parsed_page = await parser.analyze_page()

        # Execute actions
        await parser.click("button.btn-primary")
        await parser.type("input#email", "user@example.com")

asyncio.run(main())

Documentation: See cesail/dom_parser/src/py/README.md

3. MCP Server (cesail/cesail_mcp/)

FastMCP server that provides standardized APIs for agents to interact with transformed web data.

Language: Python
Features:

Agent-Friendly API Example:

# Agent receives structured data from CeSail
parsed_page = await parser.analyze_page()

# Get the actions data (this is what agents typically work with)
actions = parsed_page.get_actions()

# Example actions data structure
actions_data = [
  {
    "type": "LINK",
    "selector": "2",
    "importantText": "Vintage vibesCreate your weekend moodboard | Vinta | /today/best/create-your-weekend-moodboard/128099/"
  },
  {
    "type": "LINK", 
    "selector": "3",
    "importantText": "Summer hobbiesTry bead embroidery | Summer hobbies | /today/best/try-bead-embroidery/128240/"
  },
  {
    "type": "SELECT",
    "selector": "5", 
    "importantText": "search-box-input | combobox | Search | Search"
  },
  {
    "type": "BUTTON",
    "selector": "8",
    "importantText": "vertical-nav-more-options-button | More options | More options"
  },
  {
    "type": "BUTTON",
    "selector": "10",
    "importantText": "Sign up"
  }
  ]
 

Documentation: See cesail/dom_parser/src/py/README.md for more details about the parsed page data structure.
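
A minimal sketch of how an agent might search an actions list like the one above. It assumes the list is made of plain dicts with type, selector, and importantText keys, and that importantText separates its parts with " | " as in the sample. find_actions is a hypothetical helper, not a CeSail function.

```python
def find_actions(actions, action_type=None, text_contains=None):
    """Return actions matching an optional type and a case-insensitive text fragment."""
    matches = []
    for action in actions:
        if action_type is not None and action["type"] != action_type:
            continue
        # importantText packs several labels into one string, separated by "|".
        parts = [part.strip() for part in action["importantText"].split("|")]
        if text_contains is not None and not any(
            text_contains.lower() in part.lower() for part in parts
        ):
            continue
        matches.append(action)
    return matches


# Two entries copied from the sample data above.
actions_data = [
    {"type": "SELECT", "selector": "5",
     "importantText": "search-box-input | combobox | Search | Search"},
    {"type": "BUTTON", "selector": "10", "importantText": "Sign up"},
]

sign_up = find_actions(actions_data, action_type="BUTTON", text_contains="sign up")
print(sign_up[0]["selector"])
```

The selector of the matched entry is the value an agent would then hand back to CeSail when executing a click or type action.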

Usage: python3 -m cesail.cesail_mcp.fastmcp_server

4. Simple Agent (cesail/simple_agent/) - WIP

AI-powered web automation agent using LLM for task breakdown and execution.

Language: Python
Features:

Documentation: See cesail/simple_agent/README.md for more details.

Usage: python3 -m cesail.simple_agent.simple_agent

Testing

CeSail includes comprehensive test suites to validate functionality and demonstrate capabilities.

Test Categories

Quick Start

# Activate virtual environment
source venv/bin/activate

# Set PYTHONPATH to the repository root (run this from your CeSail checkout)
export PYTHONPATH="$(pwd):$PYTHONPATH"

# Run playground tests (great way to see CeSail in action!)
pytest cesail/dom_parser/tests/playground/test_page_analyzer_integration_pinterest.py -v -s

# Run all tests
pytest cesail/dom_parser/tests/ -v

Playground Tests

The playground tests are an excellent way to see CeSail navigate through real websites:

These tests demonstrate CeSail’s ability to:

Documentation: See cesail/dom_parser/tests/README.md for complete testing guide and examples.

Development Installation

For development or advanced usage:

Prerequisites:

Installation:

  1. Clone the repository:
    git clone https://github.com/AkilaJay/cesail.git
    cd cesail
    
  2. Set up Python environment:
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -e .
    
  3. Set up DOM Parser (optional):
    cd cesail/dom_parser
    npm install
    npm run build
    cd ../..  # back to the repository root
    
  4. Configure environment (for Simple Agent):
    # Create .env file in cesail/simple_agent/ directory
    echo "OPENAI_API_KEY=your_openai_api_key_here" > cesail/simple_agent/.env
    
  5. Playwright browsers are installed automatically during package installation. If you encounter any issues, you can manually install them:
    playwright install
    

Troubleshooting

Common Issues

1. Import Errors

Problem: ModuleNotFoundError: No module named 'dom_parser'
Solution: Ensure you’re in the correct directory and the virtual environment is activated.
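
A quick way to narrow down an import problem is to ask Python which modules it can find in the active environment. The sketch below uses only the standard library. check_modules is a hypothetical helper, and the names passed to it (cesail, playwright) are the packages installed in the Quick Start.

```python
import importlib.util
import sys


def check_modules(names):
    """Map each top-level module name to True if the current interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Interpreter:", sys.executable)  # should point inside your venv
    for name, found in check_modules(["cesail", "playwright"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If the interpreter path is not inside your virtual environment, activate the environment and run the check again.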

2. Playwright Browser Issues

Problem: Browser not found or crashes
Solution: Reinstall Playwright browsers:

playwright install

3. OpenAI API Errors

Problem: API key invalid or rate limited
Solution: Check your API key and usage limits in the OpenAI dashboard.
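
Before checking the dashboard, it can help to confirm that the key is being read at all. This is a minimal sketch, assuming the .env file from the development installation uses simple KEY=value lines. read_env_file is a hypothetical stdlib-only helper, and the Simple Agent may load the file differently.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    env_path = Path("cesail/simple_agent/.env")  # path used in the installation steps
    key = read_env_file(env_path).get("OPENAI_API_KEY", "") if env_path.exists() else ""
    if not key or key == "your_openai_api_key_here":
        print("OPENAI_API_KEY is missing or still the placeholder")
    else:
        print(f"OPENAI_API_KEY found ({len(key)} characters)")
```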

4. Screenshot Failures

Problem: Screenshots fail with “Target page closed” error
Solution: Add proper error handling and retry logic.
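
One way to add retry logic is a small async wrapper with exponential backoff. This is a generic sketch, not part of CeSail: retry_async is a hypothetical helper, and it retries on any Exception by default because the exact exception type raised for a closed page is not specified here. The flaky_screenshot coroutine is a stand-in for a real call such as parser.take_screenshot("demo_screenshot.png").

```python
import asyncio


async def retry_async(make_call, attempts=3, base_delay=0.5, exceptions=(Exception,)):
    """Await make_call() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for a flaky screenshot call: fails twice, then succeeds.
calls = {"count": 0}


async def flaky_screenshot():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Target page closed")
    return "demo_screenshot.png"


result = asyncio.run(retry_async(flaky_screenshot, base_delay=0.01))
print(result, "after", calls["count"], "attempts")
```

In real use, narrow the exceptions argument to the error you actually observe so that unrelated bugs are not silently retried.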

API Reference

For detailed API documentation, see the component-specific README files:

DOM Parser APIs

MCP Server API

Simple Agent API (WIP)

Contributing

We welcome contributions! Here’s how to get started:

Development Setup

  1. Fork the repository
  2. Create a feature branch:
    git checkout -b feature/your-feature-name
    
  3. Make your changes
  4. Add tests for new functionality
  5. Run tests to ensure everything works
  6. Submit a pull request

Code Style

Testing

Project Structure

cesail/
├── cesail/                      # Python package
│   ├── dom_parser/              # JavaScript DOM parser
│   │   ├── src/                 # Source code
│   │   ├── dist/                # Built files
│   │   ├── tests/               # JavaScript tests
│   │   └── README.md            # Component documentation
│   ├── cesail_mcp/              # FastMCP server
│   │   ├── fastmcp_server.py    # Main server file
│   │   ├── server.py            # Alternative server
│   │   └── tests/               # MCP tests
│   └── simple_agent/            # AI web automation agent
│       ├── simple_agent.py      # Main agent file
│       ├── llm_interface.py     # LLM integration
│       └── .env                 # Environment variables
├── venv/                        # Python virtual environment
├── setup.py                     # Python package configuration
├── pyproject.toml               # Project configuration
└── README.md                    # This file

Support

Roadmap

Help needed / Bugs

Contact

For questions, issues, or contributions:

License

This project is licensed under the MIT License - see the LICENSE file for details.