Documentation
Web Intelligence

Extract Endpoint

Convert one or more pages into schema-shaped JSON for agent workflows.

Status: Planned.

POST /v1/extract applies a schema or natural-language instruction to page content and returns structured JSON. It builds on scrape/crawl output.

Planned Request

code
{
  "url": "https://vendor.example.com/pricing",
  "schema": {
    "plans": [
      {
        "name": "string",
        "price": "string",
        "includedSeats": "number",
        "limits": ["string"]
      }
    ]
  },
  "instruction": "Extract public pricing plans. Do not infer missing prices.",
  "source": "scrape"
}

Guardrails

  • Return warnings for missing, ambiguous, or inferred fields.
  • Keep source URLs and evidence snippets attached to extracted objects.
  • Bill extraction separately from page capture.
  • Let users save schemas in the dashboard for repeat jobs.