Use the Firecrawl CLI from KosmoKrator to call Firecrawl tools headlessly, return JSON, inspect schemas, and automate workflows from coding agents, scripts, and CI.
Firecrawl can be configured headlessly with `kosmokrator integrations:configure firecrawl`.
```shell
# Install KosmoKrator first if it is not available on PATH.
curl -fsSL https://raw.githubusercontent.com/OpenCompanyApp/kosmokrator/main/install.sh | bash

# Configure and verify this integration.
kosmokrator integrations:configure firecrawl --set api_key="$FIRECRAWL_API_KEY" --enable --read allow --write ask --json
kosmokrator integrations:doctor firecrawl --json
kosmokrator integrations:status --json
```
Credentials
Authentication type: API key (`api_key`). Configure credentials once, then use the same stored profile from
scripts, coding CLIs, Lua code mode, and the MCP gateway.
| Key | Env var | Type | Required | Label |
|---|---|---|---|---|
| `api_key` | `FIRECRAWL_API_KEY` | secret | yes | API Key |
| `url` | `FIRECRAWL_URL` | url | no | API Base URL |
Call Firecrawl Headlessly
Every function below can be called headlessly. The generic call form is stable across all integrations;
the provider shortcut is shorter but specific to Firecrawl. Use the generic form when another coding CLI
or script needs a stable, universal interface.
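As an illustration of the two call shapes, the sketch below builds a JSON payload for `firecrawl_scrape` once and shows both invocations as comments. The `integrations:call` subcommand and the `--args` flag are assumptions for illustration only; check `kosmokrator --help` for the exact spelling of the generic form in your installed version.

```shell
# Build a JSON payload once; both call shapes take the same arguments.
payload='{"url":"https://example.com","onlyMainContent":true}'

# Generic form (hypothetical subcommand name; stable across all integrations):
#   kosmokrator integrations:call firecrawl firecrawl_scrape --args "$payload" --json
# Provider shortcut (hypothetical; shorter but specific to Firecrawl):
#   kosmokrator firecrawl:firecrawl_scrape --args "$payload" --json

echo "$payload"
```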
- `firecrawl.firecrawl_scrape` (Read): Scrape a single URL and extract its content. Returns the page content in the requested format (markdown by default). Supports actions like waiting for JavaScript, taking screenshots, and extracting specific elements.
- `firecrawl.firecrawl_crawl`: Start a crawl job to scrape all pages from a website starting at the given URL. Returns a crawl job ID; use `firecrawl_get_crawl_status` to check progress and retrieve results.
- `firecrawl.firecrawl_get_crawl_status`: Check the status and retrieve results of a crawl job. Returns the current status (scraping, completed, failed, cancelled) and all scraped data once complete.
- `firecrawl.firecrawl_map`: Map a website to discover all linked URLs. Returns a list of all URLs found on the site without scraping full content. Useful for understanding site structure before crawling.
- `firecrawl.firecrawl_extract`: Extract structured data from one or more URLs using AI. Provide a prompt describing what to extract, or a JSON schema for the expected output format. Ideal for pulling specific data points from web pages.
- `firecrawl.firecrawl_get_current_user`: Get the authenticated user's account information, including plan details and usage statistics. Useful for verifying API key validity and checking remaining credits.
Use these parameter tables when building CLI payloads without calling `integrations:schema` first.
firecrawl.firecrawl_scrape

Scrape a single URL and extract its content. Returns the page content in the requested format (markdown by default). Supports actions like waiting for JavaScript, taking screenshots, and extracting specific elements.

| Key | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | The URL to scrape. |
| `onlyMainContent` | boolean | no | Extract only the main content, removing navigation, footers, etc. Default: true. |
| `includeTags` | array | no | CSS selectors to include. Only these elements will be scraped. |
| `excludeTags` | array | no | CSS selectors to exclude. These elements will be removed from the result. |
| `waitFor` | integer | no | Time in milliseconds to wait for dynamic content to load before scraping. |
| `timeout` | integer | no | Timeout in milliseconds for the scrape request. Default: 30000. |
| `actions` | array | no | List of actions to perform before scraping (e.g., click, scroll, wait, screenshot). |
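For a concrete payload shape, the sketch below assembles a scrape request for a JavaScript-heavy page as a multi-line JSON string, combining the parameters above. The URL and selector values are illustrative; passing the payload to the CLI is elided since the exact call form depends on your setup.

```shell
# Scrape a JS-heavy page: wait 2s for content, keep only the main body,
# and strip ads and navigation via excludeTags.
scrape_payload='{
  "url": "https://example.com/pricing",
  "onlyMainContent": true,
  "waitFor": 2000,
  "timeout": 30000,
  "excludeTags": [".ad", "nav"]
}'
echo "$scrape_payload"
```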
firecrawl.firecrawl_crawl
Start a crawl job to scrape all pages from a website starting at the given URL. Returns a crawl job ID — use firecrawl_get_crawl_status to check progress and retrieve results.
| Key | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | The root URL to start crawling from (e.g., "https://example.com"). |
| `limit` | integer | no | Maximum number of pages to crawl. Default: 10. |
| `maxDepth` | integer | no | Maximum crawl depth from the root URL. Default: based on plan. |
| `formats` | array | no | Output formats for each page. Options: "markdown", "html", "rawHtml", "content", "links". Default: ["markdown"]. |
| `excludePaths` | array | no | URL path patterns to exclude from crawling (e.g., ["/blog/*"]). |
| `includePaths` | array | no | Only crawl URLs matching these path patterns (e.g., ["/docs/*"]). |
| `allowBackwardLinks` | boolean | no | Allow crawling links that go back to parent pages. Default: false. |
| `allowExternalLinks` | boolean | no | Allow crawling links to external domains. Default: false. |
| `onlyMainContent` | boolean | no | Extract only main content from each page. Default: true. |
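Since `firecrawl_crawl` returns a job ID rather than results, automation typically pairs it with a polling loop on `firecrawl_get_crawl_status`. The sketch below parses a sample crawl response and shows the loop shape; the CLI invocations are commented out, and the response field names (`id`, `status`) are assumptions based on the tool descriptions above.

```shell
# Sample response shape from firecrawl_crawl (field names assumed).
crawl_response='{"id":"job-123","status":"scraping"}'

# Extract the job id with POSIX parameter expansion (no jq dependency).
job_id=${crawl_response#*\"id\":\"}
job_id=${job_id%%\"*}

# Polling shape (hypothetical CLI call, commented out):
# while true; do
#   status_json=$(kosmokrator ... firecrawl_get_crawl_status --args "{\"id\":\"$job_id\"}" --json)
#   case "$status_json" in *'"status":"completed"'*) break ;; esac
#   sleep 5
# done

echo "$job_id"
```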
firecrawl.firecrawl_get_crawl_status
Check the status and retrieve results of a crawl job. Returns the current status (scraping, completed, failed, cancelled) and all scraped data once complete.
| Key | Type | Required | Description |
|---|---|---|---|
| `id` | string | yes | The crawl job ID returned by the `firecrawl_crawl` tool. |
firecrawl.firecrawl_map
Map a website to discover all linked URLs. Returns a list of all URLs found on the site without scraping full content. Useful for understanding site structure before crawling.
| Key | Type | Required | Description |
|---|---|---|---|
| `url` | string | yes | The root URL to map (e.g., "https://example.com"). |
| `limit` | integer | no | Maximum number of URLs to return. Default: based on plan. |
| `includeSubdomains` | boolean | no | Include URLs from subdomains. Default: false. |
| `search` | string | no | Filter URLs that match a search term (only returns URLs containing this string). |
| `ignoreSitemap` | boolean | no | Skip sitemap.xml discovery and only use on-page links. Default: false. |
| `includePaths` | array | no | Only include URLs matching these path patterns. |
| `excludePaths` | array | no | Exclude URLs matching these path patterns. |
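A common workflow is to map a site's documentation section first, then crawl only what the map reveals. A sketch of such a map payload, using illustrative values (payload only; the call itself is elided):

```shell
# Discover doc URLs only: restrict to /docs/* paths and keep URLs
# containing "api", capped at 200 results.
map_payload='{"url":"https://example.com","includePaths":["/docs/*"],"search":"api","limit":200}'
echo "$map_payload"
```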
firecrawl.firecrawl_extract
Extract structured data from one or more URLs using AI. Provide a prompt describing what to extract, or a JSON schema for the expected output format. Ideal for pulling specific data points from web pages.
| Key | Type | Required | Description |
|---|---|---|---|
| `urls` | array | yes | List of URLs to extract data from (e.g., ["https://example.com/about"]). |
| `prompt` | string | no | Natural language description of what data to extract from the pages. |
| `schema` | object | no | JSON schema defining the expected output structure. The response will conform to this schema. |
| `systemPrompt` | string | no | System prompt to guide the AI extraction behavior. |
| `allowExternalLinks` | boolean | no | Allow following links to external domains during extraction. Default: false. |
| `enableWebSearch` | boolean | no | Enable web search to supplement extraction with additional context. Default: false. |
| `includeSubdomains` | boolean | no | Include subdomains when following links. Default: false. |
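Because `schema` is itself a nested JSON object, a multi-line string keeps an extract payload readable. The schema and URL below are illustrative examples; per the parameter description, the extraction response is expected to conform to whatever schema you supply.

```shell
# Extract two fields from an about page; the response should match the schema.
extract_payload='{
  "urls": ["https://example.com/about"],
  "prompt": "Extract the company name and founding year.",
  "schema": {
    "type": "object",
    "properties": {
      "company": { "type": "string" },
      "founded": { "type": "integer" }
    },
    "required": ["company"]
  }
}'
echo "$extract_payload"
```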
firecrawl.firecrawl_get_current_user
Get the authenticated user's account information, including plan details and usage statistics. Useful for verifying API key validity and checking remaining credits.
Headless calls still follow the integration read/write permission policy. Configure read/write defaults
with `integrations:configure`. Add `--force` only for trusted automation that should bypass that policy.
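In CI, a preflight step can fail fast when the integration is misconfigured, using the `integrations:doctor` command shown in the setup snippet above. The sketch below wraps it in a function and skips gracefully when `kosmokrator` is not on PATH, so the same script runs in environments without the CLI installed.

```shell
set -u

preflight() {
  if ! command -v kosmokrator >/dev/null 2>&1; then
    echo "kosmokrator not installed; skipping Firecrawl preflight."
    return 0
  fi
  # Nonzero exit here fails the CI step when the integration is unhealthy.
  kosmokrator integrations:doctor firecrawl --json
}

preflight && echo "Firecrawl preflight passed."
```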