CeSail

DOM Parser JavaScript Layer

Core DOM parsing engine that transforms raw HTML into structured, agent-friendly data.

Overview

The JavaScript layer is the heart of CeSail’s DOM parsing capabilities. It runs directly in the browser context, where it extracts, analyzes, and transforms page elements.

Key Components

Core Files

Features

Data Transformation

The JavaScript layer transforms raw HTML into structured, agent-friendly JSON:

// Raw HTML input
<button class="btn-primary" onclick="submit()">Submit Form</button>
<input type="text" placeholder="Enter email" id="email" />

// CeSail transforms to agent-friendly JSON (output shown for the <button> element)
{
  "type": "BUTTON",
  "selector": "button.btn-primary",
  "text": "Submit Form",
  "action": "CLICK",
  "importance": 0.9,
  "context": "form submission",
  "metadata": {
    "aria-label": null,
    "disabled": false,
    "visible": true
  }
}
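
On the consuming side, a record like the one above can be read with Python's standard json module. The sketch below assumes the record shape shown in the example; describe_action is a hypothetical helper used only for illustration and is not part of CeSail.

```python
import json

# Record copied from the example above.
RAW = """
{
  "type": "BUTTON",
  "selector": "button.btn-primary",
  "text": "Submit Form",
  "action": "CLICK",
  "importance": 0.9,
  "context": "form submission",
  "metadata": {"aria-label": null, "disabled": false, "visible": true}
}
"""


def describe_action(record: dict) -> str:
    """Hypothetical helper: render one record as a short instruction."""
    meta = record.get("metadata", {})
    if meta.get("disabled") or not meta.get("visible", True):
        return f"skip {record['selector']} (not interactable)"
    # Prefer the ARIA label when present, otherwise the visible text.
    label = meta.get("aria-label") or record.get("text", "")
    return f"{record['action']} {record['selector']} ({label})"


element = json.loads(RAW)
print(describe_action(element))
```

Checking the metadata flags before acting keeps an agent from targeting elements that are hidden or disabled.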

Usage

This layer is automatically injected into web pages by the Python DOM Parser and provides APIs for:

Architecture

The JavaScript layer operates as a browser-injected script that:

  1. Analyzes the current DOM structure
  2. Extracts actionable elements and metadata
  3. Scores elements by importance and visibility
  4. Filters and groups elements appropriately
  5. Generates reliable selectors for targeting
  6. Provides APIs for the Python layer to consume
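
Steps 3 and 4 above can be sketched roughly as follows. The real logic runs in JavaScript inside the browser; this version is written in Python over plain dictionaries purely for illustration. The weights in BASE_WEIGHT, the 0.5 threshold, and the function names are assumptions, not CeSail's actual scoring rules.

```python
from collections import defaultdict

# Hypothetical base weights per element type; CeSail's real scoring is not shown here.
BASE_WEIGHT = {"BUTTON": 0.9, "INPUT": 0.8, "LINK": 0.6}


def score(element: dict) -> float:
    """Step 3: score by importance and visibility (illustrative weights only)."""
    if not element.get("visible", True):
        return 0.0
    weight = BASE_WEIGHT.get(element["type"], 0.3)
    # Disabled elements are ranked lower than enabled ones.
    return weight * 0.5 if element.get("disabled") else weight


def filter_and_group(elements: list[dict], threshold: float = 0.5) -> dict[str, list[dict]]:
    """Step 4: drop low-scoring elements, then group the rest by type."""
    groups = defaultdict(list)
    for el in elements:
        s = score(el)
        if s >= threshold:
            groups[el["type"]].append({**el, "importance": s})
    # Highest-scoring elements first within each group.
    for items in groups.values():
        items.sort(key=lambda e: e["importance"], reverse=True)
    return dict(groups)


elements = [
    {"type": "BUTTON", "selector": "button.btn-primary", "visible": True},
    {"type": "INPUT", "selector": "#email", "visible": True},
    {"type": "LINK", "selector": "a.hidden", "visible": False},
]
grouped = filter_and_group(elements)
print(sorted(grouped))
```

Scoring before filtering means the threshold is the only knob that decides how many elements reach the agent.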

Performance

Build Process

The JavaScript layer must be built before it can be used by the Python DOM Parser.

Building the Bundle

cd dom_parser/
npm install  # Installs Rollup and build dependencies
npm run build

This creates the bundled JavaScript file at dom_parser/dist/dom-parser.js, which contains:

Integration with Python Layer

The Python DOMParser automatically injects the built JavaScript bundle into every browser page:

# Excerpt from inside the Python DOMParser (requires: from pathlib import Path)

# The bundle is automatically loaded from:
self.bundle_path = Path(__file__).parent.parent / "dist" / "dom-parser.js"

# And injected as an init script:
await self.context.add_init_script(path=str(self.bundle_path))
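
For readers who want to try the injection outside the DOMParser class, here is a minimal standalone sketch. It assumes self.context is a Playwright BrowserContext (the call above matches Playwright's add_init_script); resolve_bundle and open_page_with_bundle are hypothetical names introduced for this example.

```python
from pathlib import Path


def resolve_bundle(package_dir: Path) -> Path:
    """Return the bundle path, failing early if it has not been built."""
    bundle = package_dir / "dist" / "dom-parser.js"
    if not bundle.is_file():
        raise FileNotFoundError(f"{bundle} not found; run `npm run build` first")
    return bundle


async def open_page_with_bundle(url: str, package_dir: Path) -> str:
    """Open url with the bundle injected and return the page title."""
    # Imported lazily so the path check above works without Playwright installed.
    from playwright.async_api import async_playwright

    bundle = resolve_bundle(package_dir)
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context()
        # Init scripts are evaluated in every page of the context,
        # before any of the page's own scripts run.
        await context.add_init_script(path=str(bundle))
        page = await context.new_page()
        await page.goto(url)
        title = await page.title()
        await browser.close()
        return title


# Example (requires Playwright and a built bundle):
# import asyncio
# asyncio.run(open_page_with_bundle("https://example.com", Path("dom_parser")))
```

Failing early on a missing bundle gives a clearer error than a Playwright exception raised later during injection.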

Build Configuration

The JavaScript is built using Rollup with the following outputs:

Development Workflow

# Watch mode for development
npm run dev

# Clean and rebuild
npm run clean && npm run build

# Simple build (alternative)
npm run build:simple

Note: Always rebuild the JavaScript bundle after making changes to files in src/js/ before testing the Python layer.
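
A forgotten rebuild is easy to miss, so a small staleness check can help. The sketch below compares file modification times and assumes it is run from dom_parser/, with sources under src/js/ and the bundle at dist/dom-parser.js; bundle_is_stale is a hypothetical helper, not an existing CeSail function.

```python
from pathlib import Path


def bundle_is_stale(src_dir: Path, bundle: Path) -> bool:
    """Return True if the bundle is missing or older than any file under src_dir."""
    if not bundle.is_file():
        return True
    built_at = bundle.stat().st_mtime
    return any(
        f.stat().st_mtime > built_at for f in src_dir.rglob("*") if f.is_file()
    )


if bundle_is_stale(Path("src/js"), Path("dist/dom-parser.js")):
    print("Bundle is out of date: run `npm run build`")
```

Calling a check like this at the start of a Python test run surfaces an outdated bundle before it causes confusing failures.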