
LLM Integration

Jade is built around first-class LLM access. A prompt declaration binds a name to a prompt string; the ? operator sends the prompt to the configured model and returns the response; the |> pipe suffix coerces the response to a typed Jade value, retrying automatically on failure.

Declaring a Prompt

Use the prompt keyword to bind a prompt string to a name. The right-hand side is any expression that evaluates to a str.

prompt p = "What is the capital of France?"

A prompt binding holds the prompt text — it does not call the model. The model is only called when the variable is dereferenced with ?.

let question = "What is 2 + 2?"
prompt p = question

Untyped Dereference — ?p

Prefixing a prompt variable with ? sends the prompt to the configured model and returns the raw response as a str.

prompt p = "Say exactly: Hello from Jade!"
let response = ?p
print(response)

Each ? dereference appends a user turn and the model's reply to the shared conversation history for this program run. Subsequent dereferences see the full prior context, so prompts can naturally build on each other.

prompt p1 = "My name is Alice."
prompt p2 = "What is my name?"

let _ = ?p1 // establishes context
let name = ?p2 // model sees p1 exchange, should reply "Alice"
print(name)

Typed Dereference — ?p |> type

Append |> type after a prompt dereference to coerce the model's response to a Jade value type. The supported target types are:

Type | Accepted LLM output | Result
int | "42", "-7" | int value
float | "3.14", "1e10" | float value
bool | "true", "True", "false" | bool value
str | anything | str value (always succeeds)

prompt p = "What is 3 + 4? Respond with only the number."
let n = ?p |> int
print(n + 1) // 8

prompt p = "Is 5 greater than 3? Respond with only: true or false"
let result = ?p |> bool
if result {
    print("correct!")
}
note

?p |> type must be assigned to a variable — it cannot appear directly inside print(). Use let n = ?p |> int then print(n).
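The float target follows the same pattern as int. A brief sketch — the prompt wording here is illustrative, not prescribed by Jade:

```
prompt p = "What is 2.5 times 2? Respond with only the number."
let x = ?p |> float
print(x)
```

If the model replies with something like "5" or "5.0", the coercion succeeds; anything unparsable as a float triggers the retry behavior described below.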

Retry on Coercion Failure

When the model's response cannot be parsed as the requested type, Jade automatically sends a correction follow-up and tries again. This continues up to max_retries times (default: 3, giving 4 total attempts including the initial one).

If all attempts fail, Jade raises a PromptOverflow runtime error naming the prompt variable and the number of attempts made.

On success, the retry conversation turns are stripped from the shared history — only the final successful exchange is retained.

// With max_retries = 3, Jade will try up to 4 times to get a valid int.
prompt p = "Pick a lucky number."
let n = ?p |> int
print(n)

Configuration

Jade reads LLM settings from jade.toml in the working directory. Environment variables override file values.

jade.toml format

[model]
provider = "anthropic" # or "openai"
model = "claude-haiku-4-5-20251001"
max_retries = 3
api_key = "sk-..." # optional — prefer the env var

Environment variables

Variable | Purpose
JADE_API_KEY | API key (overrides api_key in jade.toml)
JADE_PROVIDER | anthropic or openai
JADE_MODEL | Model name string
JADE_MAX_RETRIES | Integer, overrides max_retries
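For example, the variables above can be exported in the shell before running a program (the key and model values here are placeholders):

```shell
# Environment variables take precedence over jade.toml
export JADE_API_KEY="sk-..."        # keep the key out of the config file
export JADE_PROVIDER="anthropic"
export JADE_MODEL="claude-haiku-4-5-20251001"
export JADE_MAX_RETRIES="5"
```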

Interactive setup

Run jade configure to launch the interactive wizard. It prompts for provider, model, API key, and max retries, then writes jade.toml in the current directory.

jade configure
warning

Security: storing api_key in jade.toml saves it in plaintext. Prefer setting JADE_API_KEY in your shell environment and omitting the key from the file.

Supported providers

Provider string | API used | Default model
anthropic (default) | Anthropic Messages API | claude-haiku-4-5-20251001
openai | OpenAI Chat Completions | gpt-4o-mini
jade-os | Jade OS on-device kernel backend | set by device configuration
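For example, a jade.toml that switches to the OpenAI backend might look like this (model and retry values are illustrative):

```toml
[model]
provider = "openai"
model = "gpt-4o-mini"
max_retries = 3
```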

Session Variables

Jade pre-populates several read-only-by-convention variables in the global scope before your program runs. These are updated after each ? dereference.

Variable | Type | Description
__tokens__ | int | Running total of tokens consumed by all inference calls this run
__model__ | str | Name of the model being used
__max_retries__ | int | Configured maximum retry count for typed dereferences
__retry_log__ | array | Log of retry events (reserved for future structured output)

prompt p = "Say: hi"
let _ = ?p

print(__model__) // e.g. "claude-haiku-4-5-20251001"
print(__max_retries__) // 3
print(__tokens__) // tokens used so far

Runtime Configuration — use "llm"

Import the built-in llm package to adjust LLM settings at runtime:

use "llm"

llm.set_max_tokens(256) // cap responses at 256 tokens

llm.set_max_tokens(n)

Sets the maximum number of tokens the model may generate per inference call. n must be a positive integer. The setting takes effect for all subsequent ? dereferences in the same run.

use "llm"

llm.set_max_tokens(64)

prompt p = "Write a haiku about rain."
let haiku = ?p
print(haiku)
note

set_max_tokens overrides any max_tokens value set in jade.toml or environment variables for the remainder of the program run. There is no way to reset it to the config file value once changed.

Async Inference

Jade supports concurrent LLM inference through async fn definitions and await expressions. Defining a function with async fn allows it to run prompt dereferences concurrently with other async functions.

Within an async fn, prefix any expression with await to wait for its result. When running under jade run, async functions execute concurrently via the Tokio runtime — multiple LLM calls can be in-flight at the same time.

async fn ask_question(q) {
    prompt p = q
    return await ?p
}

let a = ask_question("What is the capital of France?")
let b = ask_question("What is the capital of Germany?")
print(await a)
print(await b)

The two calls above run concurrently — both prompts are sent to the model at the same time.

note

The REPL executes async fn definitions synchronously (one at a time). Use jade run for true concurrent execution. A warning is printed to stderr when an async fn is evaluated in the tree-walk path.

See Async / Await for the full reference.

Error Reference

Error | Cause
MissingApiKey | ?p was evaluated but no API key was configured
NotAPrompt | ?x where x is not a prompt binding
PromptOverflow | Typed dereference exhausted all retries without producing a valid value
InferenceError | HTTP or API error from the provider (non-2xx response, network failure, etc.)
StreamingWithType | Streaming was requested together with a typed dereference (?p |> type)
NotAFuture | await applied to a non-Future value
DoubleAwait | The same Future was awaited more than once
AsyncPanic | A spawned async task panicked; the message and span are captured from the task
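As an illustration, NotAPrompt is raised when ? is applied to an ordinary binding rather than a prompt declaration (sketch):

```
let x = 42
let y = ?x // runtime error: NotAPrompt (x is not a prompt binding)
```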