Crawl sites and index everything
Recursively crawl websites with configurable depth and page limits. Content is automatically parsed and indexed into Meilisearch for instant full-text search.
# Response — job started
{
  "job_id": "crawl_8f2a4b1c",
  "status": "running",
  "max_pages": 500,
  "index_uid": "docs"
}
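A client can assemble the request body and read the job fields back from a response like the one above. This is a minimal sketch: the endpoint, the request field names, and the `build_crawl_request` / `parse_job_response` helpers are illustrative assumptions, not the documented API surface.

```python
import json

def build_crawl_request(start_url, max_depth=3, max_pages=500, index_uid="docs"):
    """Assemble a JSON body for a hypothetical crawl-start call.
    Field names mirror the sample response but are assumptions."""
    return {
        "start_url": start_url,
        "max_depth": max_depth,
        "max_pages": max_pages,
        "index_uid": index_uid,
    }

def parse_job_response(body):
    """Pull the job identifier and status out of a response
    shaped like the sample above."""
    data = json.loads(body)
    return {"job_id": data["job_id"], "status": data["status"]}

if __name__ == "__main__":
    sample = ('{"job_id": "crawl_8f2a4b1c", "status": "running", '
              '"max_pages": 500, "index_uid": "docs"}')
    print(parse_job_response(sample))
```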
From URL to searchable index
A fully managed crawl pipeline in four stages.
Submit
POST a start URL with depth, page limits, and optional config.
Crawl
Pages are fetched in parallel, respecting domain rate limits and robots.txt.
Parse
HTML is cleaned, content extracted, and metadata/schemas parsed automatically.
Index
Documents are indexed into a Meilisearch index for instant search.
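Because the four stages above run asynchronously, a client typically polls the job until it finishes. A minimal polling loop, assuming a caller-supplied `get_status` callable and hypothetical terminal status names (`"completed"`, `"failed"`):

```python
import time

def wait_for_crawl(get_status, poll_interval=0.0, max_polls=100):
    """Poll a crawl job until it reaches a terminal state.

    `get_status` is any callable returning the job's status string;
    the status names used here are assumptions, not documented values.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("crawl did not finish within the polling budget")

# Example with a stubbed status sequence standing in for the real API:
statuses = iter(["running", "running", "completed"])
result = wait_for_crawl(lambda: next(statuses))
```

In practice `get_status` would wrap an HTTP call to the job-status endpoint and `poll_interval` would be a second or more.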
Powerful crawl features
Auto-indexing
Crawled content is automatically indexed into Meilisearch. Search is available immediately.
Configurable depth
Set max depth, max pages, and allowed domains. Control exactly what gets crawled.
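The depth and domain controls above amount to a scope check applied to each discovered link. One way to express it, with illustrative semantics (exact matching rules in the real crawler may differ):

```python
from urllib.parse import urlparse

def is_in_scope(url, allowed_domains, depth, max_depth):
    """Decide whether a discovered link should be crawled, based on
    the depth and allowed-domain controls described above.
    Subdomains of an allowed domain are treated as in scope here;
    that choice is an assumption for illustration."""
    if depth > max_depth:
        return False
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```

A separate max-pages counter would cap the crawl overall, independent of per-link scope.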
Real-time progress
Monitor pages crawled, errors, and progress via the API or the console dashboard.
AI enrichment
Add AI summaries and structured extraction to every crawled page.
Polite crawling
Automatic rate limiting per domain, robots.txt compliance, and configurable delays.
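The per-domain rate limiting described above can be sketched as a small delay tracker. This is an illustrative model, not the crawler's internals; the clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from urllib.parse import urlparse

class DomainRateLimiter:
    """Track the last fetch time per domain and report how long to
    wait before that domain may be fetched again."""

    def __init__(self, delay_seconds, clock=time.monotonic):
        self.delay = delay_seconds
        self.clock = clock          # injectable for testing
        self.last_fetch = {}        # domain -> timestamp of last fetch

    def wait_time(self, url):
        """Seconds to wait before this URL's domain may be fetched again."""
        domain = urlparse(url).hostname
        last = self.last_fetch.get(domain)
        if last is None:
            return 0.0
        return max(0.0, self.delay - (self.clock() - last))

    def record_fetch(self, url):
        """Mark this URL's domain as just fetched."""
        self.last_fetch[urlparse(url).hostname] = self.clock()
```

A crawler loop would call `wait_time` before each fetch, sleep if needed, then `record_fetch` afterwards; robots.txt rules and crawl-delay directives would layer on top of this.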
JS rendering
Optionally render pages with a headless browser for JavaScript-heavy sites.