arXiv Lua API for KosmoKrator Agents

Agent-facing Lua documentation and function reference for the arXiv KosmoKrator integration.

Lua Namespace

Agents call this integration through app.integrations.arxiv.*. Use lua_read_doc("integrations.arxiv") inside KosmoKrator to discover the same reference at runtime.

Call Lua from the Headless CLI

Use kosmo integrations:lua when a shell script, CI job, cron job, or another coding CLI should run a deterministic arXiv workflow without starting an interactive agent session.

Inline Lua call
kosmo integrations:lua --eval 'dump(app.integrations.arxiv.search_papers({search_query = "cat:cs.AI", start = 0, max_results = 5, sortBy = "submittedDate", sortOrder = "descending"}))' --json
Read Lua docs headlessly
kosmo integrations:lua --eval 'print(docs.read("arxiv"))' --json
kosmo integrations:lua --eval 'print(docs.read("arxiv.search_papers"))' --json

Workflow file

Put repeatable logic in a Lua file, then execute it with JSON output for the calling process.

workflow.lua
local arxiv = app.integrations.arxiv
local result = arxiv.search_papers({search_query = "cat:cs.AI", start = 0, max_results = 5, sortBy = "submittedDate", sortOrder = "descending"})

dump(result)
Run the workflow
kosmo integrations:lua workflow.lua --json
kosmo integrations:lua workflow.lua --force --json

Namespace note. integrations:lua exposes app.integrations.arxiv, app.mcp.*, docs.*, json.*, and regex.*. If you have configured named credential accounts, call them through app.integrations.arxiv.default.* or app.integrations.arxiv.work.*.

MCP-only Lua

If the script only needs configured MCP servers and does not need arXiv, use the narrower mcp:lua command.

MCP Lua command
# Use mcp:lua for MCP-only scripts; use integrations:lua for this integration namespace.
kosmo mcp:lua --eval 'dump(mcp.servers())' --json

Agent-Facing Lua Docs

This is the rendered version of the full Lua documentation exposed to agents when they inspect the integration namespace.

arXiv

Namespace: arxiv

arXiv exposes a public Atom API for searching preprints and an OAI-PMH endpoint for metadata harvesting. This integration normalizes XML into compact Lua-friendly arrays while keeping arXiv and OAI field names recognizable.

Search

Use arxiv_search_papers with the official search_query syntax. Common prefixes include all, ti, au, abs, cat, id, jr, and doi.

local results = arxiv.search_papers({
  search_query = 'cat:cs.AI AND ti:"agent"',
  start = 0,
  max_results = 5,
  sortBy = "submittedDate",
  sortOrder = "descending",
})
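
When a query is assembled from several fields, a small helper can keep the prefix syntax consistent. The following is a minimal sketch, assuming you want to join prefix/value pairs with a single boolean operator; build_query is a hypothetical name and not part of the integration.

```lua
-- build_query is a hypothetical helper, not part of the arXiv integration.
-- terms is an ordered list of { prefix, value } pairs.
local function build_query(terms, operator)
  operator = operator or "AND"
  local parts = {}
  for _, term in ipairs(terms) do
    local prefix, value = term[1], term[2]
    -- Quote values containing whitespace so they are sent as one phrase.
    if value:find("%s") then
      value = '"' .. value .. '"'
    end
    parts[#parts + 1] = prefix .. ":" .. value
  end
  return table.concat(parts, " " .. operator .. " ")
end

local query = build_query({ { "cat", "cs.AI" }, { "ti", "diffusion model" } })
print(query) --> cat:cs.AI AND ti:"diffusion model"
```

The resulting string can be passed as search_query. An ordered list is used instead of a keyed table because Lua does not guarantee iteration order for pairs().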

Focused helpers build common queries for you:

local by_author = arxiv.search_by_author({
  author = "Ada Lovelace",
  max_results = 5,
})

local by_title = arxiv.search_by_title({
  title = "agent",
  max_results = 5,
})

local recent = arxiv.search_recent({
  search_query = "cat:cs.AI",
  max_results = 10,
})

Get By ID

Use arxiv_get_papers when you already have arXiv IDs.

local papers = arxiv.get_papers({
  id_list = { "2103.15348", "1706.03762" },
})
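
arXiv IDs may carry a version suffix such as v1. If an ID list comes from mixed sources, it can help to separate the suffix and drop duplicates before calling get_papers. This is a sketch under that assumption; split_arxiv_id and unique_base_ids are hypothetical helpers, not integration functions.

```lua
-- Hypothetical helper: split "2103.15348v1" into "2103.15348" and 1.
-- IDs without a version suffix are returned unchanged with a nil version.
local function split_arxiv_id(id)
  local base, version = id:match("^(.-)v(%d+)$")
  if base and base ~= "" then
    return base, tonumber(version)
  end
  return id, nil
end

-- Hypothetical helper: keep the first occurrence of each version-less ID.
local function unique_base_ids(ids)
  local seen, out = {}, {}
  for _, id in ipairs(ids) do
    local base = split_arxiv_id(id)
    if not seen[base] then
      seen[base] = true
      out[#out + 1] = base
    end
  end
  return out
end

local ids = unique_base_ids({ "2103.15348v1", "2103.15348", "1706.03762v5" })
print(table.concat(ids, ", ")) --> 2103.15348, 1706.03762
```

Dropping the suffix requests whichever version arXiv serves for the bare ID, so keep the suffix when a specific version matters.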

Return Shape

Results include feed metadata and an entries array. Each entry contains:

  • arxiv_id
  • title
  • summary
  • published
  • updated
  • authors
  • primary_category
  • categories
  • doi
  • journal_ref
  • comment
  • abs_url
  • pdf_url
  • links

The API supports paging with start and max_results. For repeated large fetches, keep slices small and respect arXiv pacing guidance.
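
The paging described above can be sketched as a loop that advances start by the number of entries received. This is an illustration rather than the integration's own pager: fetch_pages is a hypothetical helper, and it assumes the result table exposes the entries array described in this section.

```lua
-- fetch_pages is a hypothetical helper. search_fn is any function with the
-- calling convention of arxiv.search_papers (one table argument in, one
-- result table with an entries array out).
local function fetch_pages(search_fn, params, page_size, limit)
  local collected = {}
  local start = params.start or 0
  while #collected < limit do
    -- Copy the caller's parameters so the original table is not modified.
    local request = {}
    for key, value in pairs(params) do
      request[key] = value
    end
    request.start = start
    request.max_results = math.min(page_size, limit - #collected)

    local result = search_fn(request)
    local entries = (result and result.entries) or {}
    if #entries == 0 then
      break -- no more results
    end
    for _, entry in ipairs(entries) do
      if #collected < limit then
        collected[#collected + 1] = entry
      end
    end
    start = start + #entries
  end
  return collected
end

-- Usage inside KosmoKrator (not run here):
-- local entries = fetch_pages(arxiv.search_papers,
--   { search_query = "cat:cs.AI" }, 25, 100)
```

The helper does not pause between requests; add whatever delay your environment provides if you run many pages in a row.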

OAI-PMH Metadata

Use OAI tools when you need repository metadata, bulk identifiers, records, date ranges, sets, or resumption tokens. arXiv metadata prefixes commonly include arXiv, arXivRaw, and oai_dc.

local info = arxiv.oai_identify({})

local records = arxiv.oai_list_records({
  metadataPrefix = "arXiv",
  from = "2024-01-01",
  ["until"] = "2024-01-31", -- bracket syntax: until is a reserved word in Lua
  set = "cs",
})

local next_page = arxiv.oai_list_records({
  resumptionToken = records.data.ListRecords.children.resumptionToken._text,
})

Use arxiv_oai_get_record for one known OAI identifier:

local record = arxiv.oai_get_record({
  identifier = "oai:arXiv.org:2103.15348",
  metadataPrefix = "arXiv",
})

OAI responses include:

  • response_date
  • request
  • errors
  • data

The data object preserves nested XML nodes, attributes in _attributes, text in _text, and repeated elements as arrays. OAI harvesting can return partial pages; always pass the resumptionToken from the previous response until arXiv stops returning one.
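
The resumption-token rule can be sketched as a loop that keeps requesting pages until no token comes back. The sketch assumes the token lives at data.ListRecords.children.resumptionToken._text, the path used in the example earlier in this section; next_token and harvest are hypothetical helpers, and max_pages is an added safety cap.

```lua
-- Hypothetical helper: walk the node path from the earlier example and
-- return the token, or nil when it is missing or empty.
local function next_token(page)
  local node = page and page.data
  for _, key in ipairs({ "ListRecords", "children", "resumptionToken", "_text" }) do
    if type(node) ~= "table" then
      return nil
    end
    node = node[key]
  end
  if type(node) == "string" and node ~= "" then
    return node
  end
  return nil
end

-- Hypothetical helper: list_fn is the OAI list function, on_page is called
-- once per response, and max_pages bounds the number of requests.
local function harvest(list_fn, first_request, on_page, max_pages)
  local request, pages = first_request, 0
  while request and pages < max_pages do
    local page = list_fn(request)
    pages = pages + 1
    on_page(page)
    local token = next_token(page)
    -- Follow-up requests carry only the token, as in the earlier example.
    request = token and { resumptionToken = token } or nil
  end
  return pages
end
```

Checking the errors field of each page inside on_page is a reasonable place to stop early; its exact shape is not documented here, so inspect a real response first.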

Metadata-derived Lua example
local result = app.integrations.arxiv.search_papers({search_query = "cat:cs.AI", start = 0, max_results = 5, sortBy = "submittedDate", sortOrder = "descending"})
dump(result)

Functions

search_papers Read

Search arXiv papers using the official Atom API. Use arXiv query syntax such as all:electron, ti:"diffusion model", au:"Smith", cat:cs.AI, and boolean operators.

Lua path
app.integrations.arxiv.search_papers
Full name
arxiv.arxiv_search_papers
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| search_query | string | no | arXiv search expression such as all:electron, ti:"transformer", au:"Smith", or cat:cs.AI. |
| id_list | array | no | Optional arXiv IDs, used alone or as a filter with search_query. |
| start | integer | no | Zero-based offset of the first returned result. |
| max_results | integer | no | Maximum number of results to return. arXiv recommends small slices for repeated calls. |
| sortBy | string | no | Sort field. The arXiv API accepts relevance, lastUpdatedDate, and submittedDate. |
| sortOrder | string | no | Sort direction. The arXiv API accepts ascending and descending. |

get_papers Read

Retrieve arXiv paper metadata by one or more arXiv IDs.

Lua path
app.integrations.arxiv.get_papers
Full name
arxiv.arxiv_get_papers
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id_list | array | yes | arXiv IDs such as 2103.15348 or 2103.15348v1. |

search_by_author Read

Search arXiv papers by author name.

Lua path
app.integrations.arxiv.search_by_author
Full name
arxiv.arxiv_search_by_author
Parameters: none listed in the integration metadata. The example in the agent-facing docs passes author and max_results.

search_by_title Read

Search arXiv papers by title text.

Lua path
app.integrations.arxiv.search_by_title
Full name
arxiv.arxiv_search_by_title
Parameters: none listed in the integration metadata. The example in the agent-facing docs passes title and max_results.

search_by_category Read

Search recent arXiv papers by category code.

Lua path
app.integrations.arxiv.search_by_category
Full name
arxiv.arxiv_search_by_category
Parameters: none listed in the integration metadata.

search_recent Read

Search arXiv with newest submissions first.

Lua path
app.integrations.arxiv.search_recent
Full name
arxiv.arxiv_search_recent
Parameters: none listed in the integration metadata. The example in the agent-facing docs passes search_query and max_results.

oai_identify Read

Read arXiv OAI-PMH repository metadata.

Lua path
app.integrations.arxiv.oai_identify
Full name
arxiv.arxiv_oai_identify
Parameters: none. The example in the agent-facing docs calls it with an empty table.

oai_metadata_formats Read

List OAI-PMH metadata formats supported by arXiv.

Lua path
app.integrations.arxiv.oai_metadata_formats
Full name
arxiv.arxiv_oai_list_metadata_formats
Parameters: none listed in the integration metadata.

oai_sets Read

List arXiv OAI-PMH sets.

Lua path
app.integrations.arxiv.oai_sets
Full name
arxiv.arxiv_oai_list_sets
Parameters: none listed in the integration metadata.

oai_identifiers Read

List OAI-PMH identifiers and datestamps.

Lua path
app.integrations.arxiv.oai_identifiers
Full name
arxiv.arxiv_oai_list_identifiers
Parameters: none listed in the integration metadata.

oai_records Read

List OAI-PMH metadata records.

Lua path
app.integrations.arxiv.oai_records
Full name
arxiv.arxiv_oai_list_records
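Parameters: none listed in the integration metadata. The OAI-PMH example in the agent-facing docs (where the call is written as oai_list_records) passes metadataPrefix, from, until, set, and resumptionToken.
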
oai_get_record Read

Retrieve one OAI-PMH metadata record by identifier.

Lua path
app.integrations.arxiv.oai_get_record
Full name
arxiv.arxiv_oai_get_record
Parameters: none listed in the integration metadata. The example in the agent-facing docs passes identifier and metadataPrefix.