API Documentation
CrawlRocket is a web scraping and real-time data API. Scrape any URL, look up people, or tap into live feeds for news, sports, markets, and alerts — all through a single API.
# Quick Start

Three steps to your first scrape: get an API key from the dashboard, submit a job, and poll for results.
```bash
curl -X POST https://api.crawlrocket.com/api/lookup \
  -H "Authorization: Bearer sk_pro_your_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "Jane Smith", "sources": ["linkedin", "github"]}'
```

The API responds with a job:

```json
{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "queued",
  "poll_url": "/api/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```

Poll the job until it completes:

```bash
curl https://api.crawlrocket.com/api/jobs/a1b2c3d4-... \
  -H "Authorization: Bearer sk_pro_your_key"
```
```json
{
  "id": "a1b2c3d4-...",
  "type": "person",
  "status": "completed",
  "result": {
    "name": "Jane Smith",
    "headline": "Staff Engineer at Stripe",
    "photo": "https://...",
    "sources": {
      "linkedin": { "url": "linkedin.com/in/janesmith", ... },
      "github": { "url": "github.com/jsmith", ... }
    },
    "emails": ["jane@example.com"],
    "phones": []
  }
}
```

# Authentication
All requests require a Bearer token. Get your API key from the dashboard.

```
Authorization: Bearer sk_pro_your_api_key_here
```

Keys are prefixed by tier: `sk_free_`, `sk_pro_`, `sk_enterprise_`. Requests with a missing or invalid key return `401`.
# Person Lookup

`POST /api/lookup`

Search for a person across LinkedIn, GitHub, and X. Results from all sources are merged into a single profile with contact info, photos, and headlines.

| Parameter | Required | Description |
|---|---|---|
| `name` | yes | Full name of the person to look up. |
| `sources` | no | Any of `linkedin`, `github`, `twitter`. Defaults to `["linkedin", "github"]`. |

```bash
curl -X POST https://api.crawlrocket.com/api/lookup \
  -H "Authorization: Bearer sk_pro_..." \
  -H "Content-Type: application/json" \
  -d '{"name": "Amer Sarhan", "sources": ["linkedin", "github"]}'
```

# Search & Scrape
`POST /api/search`

Run a Google search and scrape the top N result pages. Returns structured data for each page.

| Parameter | Required | Description |
|---|---|---|
| `query` | yes | Search query to run. |
| `limit` | no | Number of result pages to scrape. Defaults to `3`. |

```bash
curl -X POST https://api.crawlrocket.com/api/search \
  -H "Authorization: Bearer sk_pro_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "best web scraping tools 2024", "limit": 5}'
```

# URL Scrape
`POST /api/scrape`

Scrape a single URL using a headless browser. Returns page title, meta tags, headings, body text, links, and extracted contact info.

| Parameter | Required | Description |
|---|---|---|
| `url` | yes | URL of the page to scrape. |

```bash
curl -X POST https://api.crawlrocket.com/api/scrape \
  -H "Authorization: Bearer sk_pro_..." \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
```

# Live Data Feeds
Real-time data feeds for news, sports, markets, and alerts. Data is fetched on-demand from 12 sources and cached for 1-3 minutes. Requires Pro or Enterprise plan.
| Feed | Sources | Cache |
|---|---|---|
| `/api/feeds/news` | Al Jazeera, CNN, Sky News, Khaleej Times, Fox News | 2 min |
| `/api/feeds/sports` | Goal.com (22+ leagues) | 1 min |
| `/api/feeds/markets` | CNBC, CoinGecko, Exchange Rates | 3 min |
| `/api/feeds/alerts` | USGS Earthquakes, Red Alert Israel | 1 min |

`GET /api/feeds/:category`

Fetch a feed by category. Returns items sorted most recent first.

| Parameter | Required | Description |
|---|---|---|
| `limit` | no | Maximum number of items to return. Defaults to `20`. |
| `source` | no | Restrict results to a single source, e.g. `aljazeera` or `coingecko`. |

```bash
curl "https://api.crawlrocket.com/api/feeds/news?limit=5" \
  -H "Authorization: Bearer sk_pro_..."
```
```json
{
  "items": [
    {
      "id": "aje-breaking-4434655-0",
      "source": "aljazeera",
      "category": "breaking",
      "title": "Breaking headline from Al Jazeera",
      "summary": "Article excerpt...",
      "url": "https://www.aljazeera.com/news/...",
      "image": "https://www.aljazeera.com/wp-content/uploads/...",
      "author": "Reporter Name",
      "publishedAt": "2026-03-26T15:49:26Z",
      "tags": ["breaking", "middle-east"]
    }
  ],
  "sources": [
    { "id": "aljazeera", "name": "Al Jazeera", "count": 5, "cached": false }
  ],
  "fromCache": false,
  "fetchedAt": "2026-03-26T17:22:57.607Z"
}
```

`GET /api/feeds/source/:id`

Fetch from a single source by ID. Source IDs: `aljazeera`, `cnn`, `sky-news`, `sky-news-arabia`, `khaleej-times`, `fox-news`, `goal-scores`, `cnbc`, `coingecko`, `exchange-rates`, `usgs`, `tzeva-adom`.

```bash
curl "https://api.crawlrocket.com/api/feeds/source/coingecko?limit=5" \
  -H "Authorization: Bearer sk_pro_..."
```

`GET /api/feeds`

List all available feeds with their sources and cache TTLs. Public, no API key needed. Use this to discover available feeds.
# Job Polling

`GET /api/jobs/:id`

All endpoints return a job ID. Poll this endpoint to get results. Jobs typically complete in 5-20 seconds.

| Status | Meaning |
|---|---|
| `queued` | Job is waiting to be processed |
| `running` | Job is being processed |
| `completed` | Results available in the `result` field |
| `failed` | Error occurred; check the `error` field |

You can also list all your jobs with `GET /api/jobs`.
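The poll-until-terminal loop is worth wrapping in a helper. A minimal sketch in Python: the function name, `interval`, and `timeout` parameters are illustrative, and `get_job` is any callable that returns the parsed job JSON (e.g. a `requests.get(...).json()` wrapper), which keeps the helper easy to test:

```python
import time

def wait_for_job(get_job, job_id, interval=3.0, timeout=60.0):
    """Poll a job until it reaches a terminal status or the timeout expires.

    get_job: callable taking a job ID and returning the parsed JSON job object.
    Returns the final job object ("completed" or "failed").
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_job(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

With `requests`, `get_job` could be `lambda jid: requests.get(f"https://api.crawlrocket.com/api/jobs/{jid}", headers=headers).json()`.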
# Usage Stats

`GET /api/usage`

Returns your current plan, rate limits, and request counts.

```json
{
  "tier": "pro",
  "limits": { "rate_per_minute": 60, "monthly": 2000 },
  "usage": {
    "monthly": 142,
    "today": 23,
    "byEndpoint": [
      { "endpoint": "/api/lookup", "count": 89 },
      { "endpoint": "/api/search", "count": 41 },
      { "endpoint": "/api/scrape", "count": 12 }
    ]
  }
}
```

# Errors
Errors return a JSON body with an `error` field.

| Code | Meaning |
|---|---|
| 400 | Bad request: missing or invalid parameters |
| 401 | Unauthorized: missing or invalid API key |
| 404 | Not found: job ID doesn't exist or isn't yours |
| 429 | Rate limit exceeded: slow down or upgrade |
| 500 | Server error: try again or contact support |
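A minimal error-handling sketch in Python, surfacing the `error` field on the documented status codes. The exception class and function name are illustrative, not part of the API:

```python
class CrawlRocketError(Exception):
    """Raised when the API returns a 4xx or 5xx response."""
    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message

def check_response(status_code: int, body: dict) -> dict:
    """Raise CrawlRocketError on error responses; return the body otherwise."""
    if status_code >= 400:
        raise CrawlRocketError(status_code, body.get("error", "unknown error"))
    return body
```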
# Rate Limits
| Tier | Per Minute | Per Month | Price |
|---|---|---|---|
| Free | 5 | 5 | $0 |
| Pro | 60 | 2,000 | $29/mo |
| Enterprise | 200 | 50,000 | $199/mo |
When you exceed a limit, you get a `429` response with a `retry_after` field giving the wait time in seconds.
# Caching

Results are cached for 1 hour. If you look up the same person or scrape the same URL within that window, you get the cached result instantly, and the request does not count against your quota.

Cached results include `"_cached": true` in the response so you can distinguish cached from live data.
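Checking for a cached result is then a one-line test (helper name illustrative):

```python
def is_cached(response_body: dict) -> bool:
    """True if the response was served from CrawlRocket's 1-hour cache."""
    return response_body.get("_cached", False) is True
```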
# SDKs & Libraries

CrawlRocket is a REST API, so you can use it from any language. Quick examples in JavaScript and Python:
```javascript
const res = await fetch("https://api.crawlrocket.com/api/lookup", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk_pro_...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Jane Smith",
    sources: ["linkedin", "github"],
  }),
});
const { job_id } = await res.json();

// Poll for the result
const result = await fetch(
  `https://api.crawlrocket.com/api/jobs/${job_id}`,
  { headers: { "Authorization": "Bearer sk_pro_..." } }
).then(r => r.json());
```

```python
import requests, time

headers = {
    "Authorization": "Bearer sk_pro_...",
    "Content-Type": "application/json",
}

# Submit
r = requests.post(
    "https://api.crawlrocket.com/api/lookup",
    json={"name": "Jane Smith", "sources": ["linkedin", "github"]},
    headers=headers,
)
job_id = r.json()["job_id"]

# Poll until the job reaches a terminal status
while True:
    r = requests.get(f"https://api.crawlrocket.com/api/jobs/{job_id}", headers=headers)
    data = r.json()
    if data["status"] in ("completed", "failed"):
        break
    time.sleep(3)

print(data["result"] if data["status"] == "completed" else data["error"])
```