Firecrawl Lua API for KosmoKrator Agents

Agent-facing Lua documentation and function reference for the Firecrawl KosmoKrator integration.

6 functions · 6 read · 0 write · API key auth

Lua Namespace

Agents call this integration through app.integrations.firecrawl.*. Use lua_read_doc("integrations.firecrawl") inside KosmoKrator to discover the same reference at runtime.
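As a minimal sketch of runtime discovery, an agent can fetch this same reference with the `lua_read_doc` helper mentioned above (printing the result is illustrative):

```lua
-- Fetch the Firecrawl reference from inside a KosmoKrator agent.
local docs = lua_read_doc("integrations.firecrawl")
print(docs)
```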

Agent-Facing Lua Docs

This is the rendered version of the full Lua documentation exposed to agents when they inspect the integration namespace.

Firecrawl — Lua API Reference

scrape

Scrape a single URL and extract its content in the requested format.

Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | The URL to scrape (e.g., `"https://example.com"`) |
| `formats` | array | no | Output formats: `"markdown"`, `"html"`, `"rawHtml"`, `"content"`, `"links"`, `"screenshot"`. Default: `["markdown"]` |
| `onlyMainContent` | boolean | no | Extract only main content, removing nav/footers. Default: `true` |
| `includeTags` | array | no | CSS selectors to include |
| `excludeTags` | array | no | CSS selectors to exclude |
| `waitFor` | integer | no | Milliseconds to wait for dynamic content |
| `timeout` | integer | no | Timeout in ms (default: 30000) |
| `actions` | array | no | Actions to perform before scraping (click, scroll, wait, screenshot) |

Example

```lua
local result = app.integrations.firecrawl.scrape({
  url = "https://example.com",
  formats = {"markdown", "links"},
  onlyMainContent = true
})

print(result.data.markdown)
```

crawl

Start an asynchronous crawl job to scrape all pages from a website. Returns a job ID for status checking.

Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | Root URL to crawl from |
| `limit` | integer | no | Max pages to crawl. Default: 10 |
| `maxDepth` | integer | no | Max depth from root URL |
| `formats` | array | no | Output formats per page. Default: `["markdown"]` |
| `excludePaths` | array | no | URL path patterns to exclude |
| `includePaths` | array | no | Only crawl URLs matching these patterns |
| `allowBackwardLinks` | boolean | no | Allow crawling parent page links. Default: `false` |
| `allowExternalLinks` | boolean | no | Allow crawling external domains. Default: `false` |
| `onlyMainContent` | boolean | no | Extract only main content per page. Default: `true` |

Example

```lua
local job = app.integrations.firecrawl.crawl({
  url = "https://example.com",
  limit = 50,
  formats = {"markdown"}
})

print("Crawl started with ID: " .. job.id)

-- Poll for results
local status = app.integrations.firecrawl.get_crawl_status({
  id = job.id
})

if status.status == "completed" then
  for _, page in ipairs(status.data) do
    print(page.metadata.sourceURL .. ": " .. #page.markdown .. " chars")
  end
end
```

get_crawl_status

Check the status and retrieve results of a crawl job.

Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | yes | The crawl job ID returned by `crawl` |

Status Values

`scraping`, `completed`, `failed`, `cancelled`

Example

```lua
local result = app.integrations.firecrawl.get_crawl_status({
  id = "crawl_abc123"
})

print("Status: " .. result.status)
print("Pages scraped: " .. #result.data)
```

map

Discover all URLs on a website without scraping content.

Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | Root URL to map |
| `limit` | integer | no | Max URLs to return |
| `includeSubdomains` | boolean | no | Include subdomain URLs. Default: `false` |
| `search` | string | no | Filter URLs matching this term |
| `ignoreSitemap` | boolean | no | Skip sitemap.xml. Default: `false` |
| `includePaths` | array | no | Only include URLs matching these patterns |
| `excludePaths` | array | no | Exclude URLs matching these patterns |

Example

```lua
local result = app.integrations.firecrawl.map({
  url = "https://example.com",
  limit = 100,
  includePaths = {"/docs/*"}
})

for _, url in ipairs(result.links) do
  print(url)
end
```

extract

Extract structured data from one or more URLs using AI.

Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `urls` | array | yes | List of URLs to extract from |
| `prompt` | string | no | Natural language description of what to extract |
| `schema` | object | no | JSON schema for expected output structure |
| `systemPrompt` | string | no | System prompt to guide AI behavior |
| `allowExternalLinks` | boolean | no | Follow external domain links. Default: `false` |
| `enableWebSearch` | boolean | no | Supplement with web search. Default: `false` |
| `includeSubdomains` | boolean | no | Include subdomains. Default: `false` |

Example

```lua
local result = app.integrations.firecrawl.extract({
  urls = {"https://example.com/product/1"},
  prompt = "Extract the product name, price, and description"
})

print(result.data.product_name)
print(result.data.price)
```

With JSON schema

```lua
local result = app.integrations.firecrawl.extract({
  urls = {"https://example.com/product/1"},
  schema = {
    type = "object",
    properties = {
      name = {type = "string"},
      price = {type = "number"},
      description = {type = "string"},
      inStock = {type = "boolean"}
    }
  }
})
```

get_current_user

Get the authenticated user’s account information and usage stats.

Parameters

None.

Example

```lua
local user = app.integrations.firecrawl.get_current_user({})
print("Account: " .. user.email)
print("Plan: " .. user.plan)
```

Multi-Account Usage

If you have multiple Firecrawl accounts configured, use account-specific namespaces:

```lua
-- Default account (always works)
app.integrations.firecrawl.scrape({url = "https://example.com"})

-- Explicit default (portable across setups)
app.integrations.firecrawl.default.scrape({url = "https://example.com"})

-- Named accounts
app.integrations.firecrawl.production.scrape({url = "https://example.com"})
app.integrations.firecrawl.staging.scrape({url = "https://staging.example.com"})
```

All functions are identical across accounts — only the credentials differ.
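The reference does not specify how failures (network errors, an invalid API key) are reported, so treating them as raised Lua errors is an assumption. Under that assumption, a defensive sketch wraps calls in `pcall`:

```lua
-- Defensive call: pcall traps a raised error. If the runtime instead
-- returns an error table, inspect `result` directly (assumption noted above).
local ok, result = pcall(app.integrations.firecrawl.scrape, {
  url = "https://example.com"
})

if ok then
  print(result.data.markdown)
else
  print("scrape failed: " .. tostring(result))
end
```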

Raw agent markdown
# Firecrawl — Lua API Reference

## scrape

Scrape a single URL and extract its content in the requested format.

### Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | The URL to scrape (e.g., `"https://example.com"`) |
| `formats` | array | no | Output formats: `"markdown"`, `"html"`, `"rawHtml"`, `"content"`, `"links"`, `"screenshot"`. Default: `["markdown"]` |
| `onlyMainContent` | boolean | no | Extract only main content, remove nav/footers. Default: `true` |
| `includeTags` | array | no | CSS selectors to include |
| `excludeTags` | array | no | CSS selectors to exclude |
| `waitFor` | integer | no | Milliseconds to wait for dynamic content |
| `timeout` | integer | no | Timeout in ms (default: 30000) |
| `actions` | array | no | Actions before scraping (click, scroll, wait, screenshot) |

### Example

```lua
local result = app.integrations.firecrawl.scrape({
  url = "https://example.com",
  formats = {"markdown", "links"},
  onlyMainContent = true
})

print(result.data.markdown)
```

---

## crawl

Start an asynchronous crawl job to scrape all pages from a website. Returns a job ID for status checking.

### Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | Root URL to crawl from |
| `limit` | integer | no | Max pages to crawl. Default: 10 |
| `maxDepth` | integer | no | Max depth from root URL |
| `formats` | array | no | Output formats per page. Default: `["markdown"]` |
| `excludePaths` | array | no | URL path patterns to exclude |
| `includePaths` | array | no | Only crawl URLs matching these patterns |
| `allowBackwardLinks` | boolean | no | Allow crawling parent page links. Default: `false` |
| `allowExternalLinks` | boolean | no | Allow crawling external domains. Default: `false` |
| `onlyMainContent` | boolean | no | Extract only main content per page. Default: `true` |

### Example

```lua
local job = app.integrations.firecrawl.crawl({
  url = "https://example.com",
  limit = 50,
  formats = {"markdown"}
})

print("Crawl started with ID: " .. job.id)

-- Poll for results
local status = app.integrations.firecrawl.get_crawl_status({
  id = job.id
})

if status.status == "completed" then
  for _, page in ipairs(status.data) do
    print(page.metadata.sourceURL .. ": " .. #page.markdown .. " chars")
  end
end
```

---

## get_crawl_status

Check the status and retrieve results of a crawl job.

### Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | yes | The crawl job ID returned by `crawl` |

### Status Values

`scraping`, `completed`, `failed`, `cancelled`

### Example

```lua
local result = app.integrations.firecrawl.get_crawl_status({
  id = "crawl_abc123"
})

print("Status: " .. result.status)
print("Pages scraped: " .. #result.data)
```

---

## map

Discover all URLs on a website without scraping content.

### Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `url` | string | yes | Root URL to map |
| `limit` | integer | no | Max URLs to return |
| `includeSubdomains` | boolean | no | Include subdomain URLs. Default: `false` |
| `search` | string | no | Filter URLs matching this term |
| `ignoreSitemap` | boolean | no | Skip sitemap.xml. Default: `false` |
| `includePaths` | array | no | Only include URLs matching these patterns |
| `excludePaths` | array | no | Exclude URLs matching these patterns |

### Example

```lua
local result = app.integrations.firecrawl.map({
  url = "https://example.com",
  limit = 100,
  includePaths = {"/docs/*"}
})

for _, url in ipairs(result.links) do
  print(url)
end
```

---

## extract

Extract structured data from one or more URLs using AI.

### Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `urls` | array | yes | List of URLs to extract from |
| `prompt` | string | no | Natural language description of what to extract |
| `schema` | object | no | JSON schema for expected output structure |
| `systemPrompt` | string | no | System prompt to guide AI behavior |
| `allowExternalLinks` | boolean | no | Follow external domain links. Default: `false` |
| `enableWebSearch` | boolean | no | Supplement with web search. Default: `false` |
| `includeSubdomains` | boolean | no | Include subdomains. Default: `false` |

### Example

```lua
local result = app.integrations.firecrawl.extract({
  urls = {"https://example.com/product/1"},
  prompt = "Extract the product name, price, and description"
})

print(result.data.product_name)
print(result.data.price)
```

### With JSON schema

```lua
local result = app.integrations.firecrawl.extract({
  urls = {"https://example.com/product/1"},
  schema = {
    type = "object",
    properties = {
      name = {type = "string"},
      price = {type = "number"},
      description = {type = "string"},
      inStock = {type = "boolean"}
    }
  }
})
```

---

## get_current_user

Get the authenticated user's account information and usage stats.

### Parameters

None.

### Example

```lua
local user = app.integrations.firecrawl.get_current_user({})
print("Account: " .. user.email)
print("Plan: " .. user.plan)
```

---

## Multi-Account Usage

If you have multiple Firecrawl accounts configured, use account-specific namespaces:

```lua
-- Default account (always works)
app.integrations.firecrawl.scrape({url = "https://example.com"})

-- Explicit default (portable across setups)
app.integrations.firecrawl.default.scrape({url = "https://example.com"})

-- Named accounts
app.integrations.firecrawl.production.scrape({url = "https://example.com"})
app.integrations.firecrawl.staging.scrape({url = "https://staging.example.com"})
```

All functions are identical across accounts — only the credentials differ.

Metadata-Derived Lua Example

```lua
-- Placeholder values; array parameters take Lua tables, not strings.
local result = app.integrations.firecrawl.firecrawl_scrape({
  url = "https://example.com",
  formats = {"markdown"},
  onlyMainContent = true,
  includeTags = {"article"},
  excludeTags = {"nav"},
  waitFor = 1000,
  timeout = 30000
})
print(result)
```

Functions

firecrawl_scrape

Scrape a single URL and extract its content. Returns the page content in the requested format (markdown by default). Supports actions like waiting for JavaScript, taking screenshots, and extracting specific elements.

Operation: read
Full name: `firecrawl.firecrawl_scrape`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | yes | The URL to scrape (e.g., `"https://example.com"`). |
| `formats` | array | no | Output formats to return. Options: `"markdown"`, `"html"`, `"rawHtml"`, `"content"`, `"links"`, `"screenshot"`, `"actions"`. Default: `["markdown"]`. |
| `onlyMainContent` | boolean | no | Extract only the main content, removing navigation, footers, etc. Default: `true`. |
| `includeTags` | array | no | CSS selectors to include. Only these elements will be scraped. |
| `excludeTags` | array | no | CSS selectors to exclude. These elements will be removed from the result. |
| `waitFor` | integer | no | Time in milliseconds to wait for dynamic content to load before scraping. |
| `timeout` | integer | no | Timeout in milliseconds for the scrape request. Default: 30000. |
| `actions` | array | no | List of actions to perform before scraping (e.g., click, scroll, wait, screenshot). |

firecrawl_crawl

Start a crawl job to scrape all pages from a website starting at the given URL. Returns a crawl job ID — use firecrawl_get_crawl_status to check progress and retrieve results.

Operation: read
Full name: `firecrawl.firecrawl_crawl`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | yes | The root URL to start crawling from (e.g., `"https://example.com"`). |
| `limit` | integer | no | Maximum number of pages to crawl. Default: 10. |
| `maxDepth` | integer | no | Maximum crawl depth from the root URL. Default: based on plan. |
| `formats` | array | no | Output formats for each page. Options: `"markdown"`, `"html"`, `"rawHtml"`, `"content"`, `"links"`. Default: `["markdown"]`. |
| `excludePaths` | array | no | URL path patterns to exclude from crawling (e.g., `["/blog/*"]`). |
| `includePaths` | array | no | Only crawl URLs matching these path patterns (e.g., `["/docs/*"]`). |
| `allowBackwardLinks` | boolean | no | Allow crawling links that go back to parent pages. Default: `false`. |
| `allowExternalLinks` | boolean | no | Allow crawling links to external domains. Default: `false`. |
| `onlyMainContent` | boolean | no | Extract only main content from each page. Default: `true`. |
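Since crawl jobs are asynchronous, callers typically poll the status until it reaches a terminal value. A sketch of that loop, assuming the runtime offers some way to pause between polls (the `sleep` helper here is hypothetical; substitute whatever your agent environment actually provides):

```lua
local job = app.integrations.firecrawl.crawl({
  url = "https://example.com",
  limit = 20
})

local status
repeat
  sleep(5)  -- hypothetical pause primitive; not part of this reference
  status = app.integrations.firecrawl.get_crawl_status({ id = job.id })
until status.status == "completed"
   or status.status == "failed"
   or status.status == "cancelled"

print("Final status: " .. status.status)
```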

firecrawl_get_crawl_status

Check the status and retrieve results of a crawl job. Returns the current status (scraping, completed, failed, cancelled) and all scraped data once complete.

Operation: read
Full name: `firecrawl.firecrawl_get_crawl_status`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | yes | The crawl job ID returned by the `firecrawl_crawl` tool. |

firecrawl_map

Map a website to discover all linked URLs. Returns a list of all URLs found on the site without scraping full content. Useful for understanding site structure before crawling.

Operation: read
Full name: `firecrawl.firecrawl_map`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | yes | The root URL to map (e.g., `"https://example.com"`). |
| `limit` | integer | no | Maximum number of URLs to return. Default: based on plan. |
| `includeSubdomains` | boolean | no | Include URLs from subdomains. Default: `false`. |
| `search` | string | no | Filter URLs that match a search term (only returns URLs containing this string). |
| `ignoreSitemap` | boolean | no | Skip sitemap.xml discovery and only use on-page links. Default: `false`. |
| `includePaths` | array | no | Only include URLs matching these path patterns. |
| `excludePaths` | array | no | Exclude URLs matching these path patterns. |
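Mapping pairs naturally with scraping: discover URLs first, then fetch only the pages you need. A sketch combining the two (the `/docs/*` filter and the three-page cap are illustrative choices, not requirements):

```lua
-- Discover documentation URLs, then scrape the first few of them.
local mapped = app.integrations.firecrawl.map({
  url = "https://example.com",
  includePaths = {"/docs/*"},
  limit = 25
})

for i, url in ipairs(mapped.links) do
  if i > 3 then break end  -- scrape only the first three discovered pages
  local page = app.integrations.firecrawl.scrape({ url = url })
  print(url .. " -> " .. #page.data.markdown .. " chars")
end
```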

firecrawl_extract

Extract structured data from one or more URLs using AI. Provide a prompt describing what to extract, or a JSON schema for the expected output format. Ideal for pulling specific data points from web pages.

Operation: read
Full name: `firecrawl.firecrawl_extract`

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `urls` | array | yes | List of URLs to extract data from (e.g., `["https://example.com/about"]`). |
| `prompt` | string | no | Natural language description of what data to extract from the pages. |
| `schema` | object | no | JSON schema defining the expected output structure. The response will conform to this schema. |
| `systemPrompt` | string | no | System prompt to guide the AI extraction behavior. |
| `allowExternalLinks` | boolean | no | Allow following links to external domains during extraction. Default: `false`. |
| `enableWebSearch` | boolean | no | Enable web search to supplement extraction with additional context. Default: `false`. |
| `includeSubdomains` | boolean | no | Include subdomains when following links. Default: `false`. |

firecrawl_get_current_user

Get the authenticated user's account information, including plan details and usage statistics. Useful for verifying API key validity and checking remaining credits.

Operation: read
Full name: `firecrawl.firecrawl_get_current_user`

No parameters.