/crawl

Crawl sites and index everything

Recursively crawl websites with configurable depth and page limits. Content is automatically parsed and indexed into Meilisearch for instant full-text search.

$ curl -X POST https://scrapix.meilisearch.dev/crawl \
  -H "Authorization: Bearer sk_live_..." \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://docs.example.com", "max_pages": 500, "max_depth": 3, "index_uid": "docs" }'

# Response: job started

{
  "job_id": "crawl_8f2a4b1c",
  "status": "running",
  "max_pages": 500,
  "index_uid": "docs"
}

From URL to searchable index

A fully managed crawl pipeline in four stages.

01. Submit

POST a start URL with depth, page limits, and optional config.

02. Crawl

Pages are fetched in parallel, respecting per-domain rate limits and robots.txt.

03. Parse

HTML is cleaned, the main content is extracted, and metadata/schemas are parsed automatically.

04. Index

Documents are indexed into Meilisearch for instant full-text search.
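The crawl stage's robots.txt compliance can be illustrated with Python's standard-library parser. This is a sketch of the general technique, not Scrapix's actual implementation; the bot name ScrapixBot and the rules below are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# A toy robots.txt for the example domain (invented rules).
robots_txt = """
User-agent: *
Disallow: /private/
Crawl-delay: 2
""".splitlines()

rp = RobotFileParser()
rp.parse(robots_txt)

# Public pages are fetchable; anything under /private/ is skipped.
print(rp.can_fetch("ScrapixBot", "https://docs.example.com/guide"))      # True
print(rp.can_fetch("ScrapixBot", "https://docs.example.com/private/x"))  # False

# The crawler also honors the site's requested delay between requests.
print(rp.crawl_delay("ScrapixBot"))  # 2
```

A real crawler fetches each domain's /robots.txt once, caches the parsed rules, and consults them before every request.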

Powerful crawl features

Auto-indexing

Crawled content is automatically indexed into Meilisearch. Search is available immediately.

Configurable depth

Set max depth, max pages, and allowed domains. Control exactly what gets crawled.
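Mapping these controls onto the request body, a depth-limited crawl might look like the following. The url, max_pages, max_depth, and index_uid fields come from the example above; allowed_domains is a hypothetical name for the allowed-domains control.

```json
{
  "url": "https://docs.example.com",
  "max_pages": 500,
  "max_depth": 3,
  "allowed_domains": ["docs.example.com"],
  "index_uid": "docs"
}
```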

Real-time progress

Monitor pages crawled, errors, and progress via the API or the console dashboard.
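A status check on a running job might return a payload along these lines. The job_id, status, max_pages, and index_uid fields appear in the job-start response above; pages_crawled and errors are hypothetical names for the progress counters.

```json
{
  "job_id": "crawl_8f2a4b1c",
  "status": "running",
  "pages_crawled": 214,
  "max_pages": 500,
  "errors": 3,
  "index_uid": "docs"
}
```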

AI enrichment

Add AI summaries and structured extraction to every crawled page.

Polite crawling

Automatic rate limiting per domain, robots.txt compliance, and configurable delays.
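Per-domain rate limiting boils down to remembering when each domain was last hit and waiting out the remainder of a minimum delay. A minimal, self-contained sketch of the idea (illustrative only, not Scrapix's implementation):

```python
from urllib.parse import urlparse

class DomainRateLimiter:
    """Enforce a minimum delay between requests to the same domain (sketch)."""

    def __init__(self, min_delay_s=1.0):
        self.min_delay_s = min_delay_s
        self.last_hit = {}  # domain -> time its most recent fetch happens

    def delay_for(self, url, now):
        """Seconds to wait before fetching this URL; pass now=time.monotonic() in real use."""
        domain = urlparse(url).netloc
        last = self.last_hit.get(domain)
        wait = 0.0 if last is None else max(0.0, self.min_delay_s - (now - last))
        self.last_hit[domain] = now + wait  # the fetch happens after the wait
        return wait

rl = DomainRateLimiter(min_delay_s=2.0)
print(rl.delay_for("https://a.com/1", now=0.0))  # 0.0 (first hit on a.com)
print(rl.delay_for("https://a.com/2", now=0.5))  # 1.5 (too soon, wait out the delay)
print(rl.delay_for("https://b.com/1", now=0.5))  # 0.0 (different domain, no wait)
```

Because domains are tracked independently, many sites can be crawled in parallel while each individual site still sees polite, spaced-out requests.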

JS rendering

Optionally render pages with a headless browser for JavaScript-heavy sites.
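In the request body this would be a per-job toggle; render_js here is a hypothetical field name for that option.

```json
{
  "url": "https://app.example.com",
  "render_js": true,
  "index_uid": "app"
}
```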

Crawl pricing

HTTP crawl (per page)                    1 cr
JS rendering (per page)                  2 cr
+ each feature (metadata, schema...)    +1 cr
+ AI extraction (per page)              +5 cr
+ AI summary (per page)                 +5 cr
Search indexing                          Free
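As a worked example of the table above (the per-page rates come from the table; the scenario and the calculator itself are invented for illustration): crawling 500 pages with JS rendering, one extra feature, and AI summaries costs 500 x (2 + 1 + 5) = 4,000 credits.

```python
# Per-page credit rates from the pricing table above.
RATES = {
    "http_crawl": 1,    # HTTP crawl (per page)
    "js_rendering": 2,  # JS rendering (per page)
    "feature": 1,       # + each feature (metadata, schema...)
    "ai_extraction": 5, # + AI extraction (per page)
    "ai_summary": 5,    # + AI summary (per page)
}                       # search indexing is free

def crawl_cost(pages, js=False, features=0, ai_extraction=False, ai_summary=False):
    """Total credits for a crawl job (illustrative calculator, not an official API)."""
    per_page = RATES["js_rendering"] if js else RATES["http_crawl"]
    per_page += features * RATES["feature"]
    if ai_extraction:
        per_page += RATES["ai_extraction"]
    if ai_summary:
        per_page += RATES["ai_summary"]
    return pages * per_page

print(crawl_cost(500))                                        # 500 (plain HTTP crawl)
print(crawl_cost(500, js=True, features=1, ai_summary=True))  # 4000
```

Note that a plain HTTP crawl of the same 500 pages fits comfortably inside the 1,000 free credits.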

View full pricing details

Start crawling today

1,000 free credits. No credit card required.

Get started for free