/scrape

Extract data from any URL

Send a URL, get back clean markdown, metadata, JSON-LD schemas, and AI-powered extractions. Handles JavaScript-rendered pages out of the box.

$ curl -X POST https://scrapix.meilisearch.dev/scrape \
    -H "Authorization: Bearer sk_live_..." \
    -H "Content-Type: application/json" \
    -d '{ "url": "https://example.com", "formats": ["markdown", "metadata", "schema"], "ai_summary": true }'

# Response — extracted content

{
  "markdown": "# Example Domain\nThis domain is for use in...",
  "metadata": { "title": "Example Domain", "language": "en" },
  "credits_used": 1
}
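
For readers calling the endpoint from code rather than the shell, the same request can be assembled with Python's standard library. This is a minimal sketch, not an official client: the URL, the `Authorization` header, and the body fields are copied from the curl example, while the helper names `build_scrape_request` and `scrape`, the `Content-Type` header, and the 30-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
SCRAPE_URL = "https://scrapix.meilisearch.dev/scrape"


def build_scrape_request(api_key, url, formats, ai_summary=False):
    """Build (but do not send) a POST request mirroring the curl example.

    Hypothetical helper for illustration only.
    """
    payload = {"url": url, "formats": formats, "ai_summary": ai_summary}
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects the JSON body to be declared as JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key, url, formats, ai_summary=False, timeout=30):
    """Send the request and decode the JSON response body into a dict."""
    request = build_scrape_request(api_key, url, formats, ai_summary)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a valid key, `scrape(key, "https://example.com", ["markdown", "metadata"])` would be expected to return a dictionary shaped like the response above, so the page text would be available as `result["markdown"]`.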

From URL to structured data

A single API call handles fetching, parsing, and extraction.

01 Fetch

The page is fetched via HTTP or rendered with a headless browser for JS-heavy sites.

02 Parse

HTML is cleaned, content extracted, and metadata/schemas parsed automatically.

03 Return

Clean markdown, metadata, schemas, and optional AI enrichments are returned in a single response.

Multiple output formats

Markdown

Clean, readable markdown with preserved heading structure, links, and formatting.

HTML

Raw or cleaned HTML. Great for custom parsing pipelines or archival.

Metadata

Title, description, language, Open Graph tags, Twitter cards, and more.

JSON-LD / Schema

Structured data extracted from JSON-LD, microdata, and RDFa embedded in the page.
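
To make the JSON-LD part concrete, here is a rough sketch of how embedded JSON-LD can be pulled out of a page using only Python's standard library. It illustrates the general technique and is not a description of how the service does it; `JsonLdExtractor` and `extract_json_ld` are hypothetical names, and microdata and RDFa are not covered.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # An attribute without a value arrives as None, hence the `or ""`.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_json_ld(html):
    """Return every JSON-LD document found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items
```

Feeding it a page that contains one `application/ld+json` script tag yields a one-element list holding the decoded object; ordinary script tags are ignored.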

AI Summary

LLM-generated summary of the page content. Concise and accurate.

AI Extraction

Extract structured data using a custom prompt. Define your own schema.

Built for reliability

JavaScript rendering with headless browser
Automatic retry on transient failures
Configurable timeout and wait conditions
Custom CSS selector extraction
Content block splitting
robots.txt compliance
Proxy rotation support
Custom headers and cookies
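
As an illustration of what robots.txt compliance involves, the check below uses Python's standard `urllib.robotparser` module to decide whether a URL may be fetched. It is a sketch of the general idea under simple assumptions, not the service's own logic; `is_allowed` is a hypothetical helper and the robots.txt content is passed in as a string so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A crawler would typically download `/robots.txt` once per host, cache it, and run a check like this before each fetch.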

Scrape pricing

HTTP scrape (per page)                    1 cr
JS rendering (per page)                   2 cr
+ each feature (metadata, schema...)     +1 cr
+ AI extraction (per page)               +5 cr
+ AI summary (per page)                  +5 cr
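
The price list can be turned into a quick per-page estimate. The sketch below assumes the listed charges simply add up and leaves it to the caller to decide how many add-on features a request counts as; `estimate_credits` and its parameters are hypothetical names, and actual billing may differ.

```python
# Credit values copied from the price list above.
BASE_HTTP = 1
BASE_JS = 2
PER_FEATURE = 1
AI_EXTRACTION = 5
AI_SUMMARY = 5


def estimate_credits(js_rendering=False, extra_features=0,
                     ai_extraction=False, ai_summary=False):
    """Estimate credits for one page, assuming the listed charges are additive."""
    if extra_features < 0:
        raise ValueError("extra_features must be non-negative")
    total = BASE_JS if js_rendering else BASE_HTTP
    total += PER_FEATURE * extra_features
    if ai_extraction:
        total += AI_EXTRACTION
    if ai_summary:
        total += AI_SUMMARY
    return total
```

Under these assumptions, a JS-rendered page with two add-on features and AI extraction comes to 2 + 2 + 5 = 9 credits.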

View full pricing details

Start scraping in minutes

1,000 free credits. No credit card required.

Get started for free