# Llm

LLM configuration for the voice session. Uses a subset of the responses create schema; fields like `input`, `stream`, `background`, `webhooks`, and `store` are excluded because the voice agent manages them. **Warning:** this object is serialized into the LiveKit JWT metadata, which is transmitted via HTTP headers. Keep payloads small: large `tools` arrays can exceed the ~8KB header size limit imposed by most servers and proxies.

## Example Usage

```typescript
import { Llm } from "@meetkai/mka1/models/components";

let value: Llm = {
  model: "Model S",
};
```
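Because the whole object rides in JWT metadata sent over HTTP headers, it can help to sanity-check the serialized size before minting a token. A minimal sketch in plain TypeScript; the payload values and the 8KB limit are illustrative assumptions, not SDK API:

```typescript
// Sketch: estimate the serialized size of an Llm payload before it is
// embedded in LiveKit JWT metadata. Field values are illustrative only.
const llmPayload = {
  model: "Model S",
  instructions: "You are a concise voice assistant.",
  tools: [{ type: "function", name: "lookup_order", parameters: {} }],
};

// ~8KB is a common header size cap for servers and proxies (assumption).
const HEADER_LIMIT_BYTES = 8 * 1024;

const payloadBytes = new TextEncoder().encode(
  JSON.stringify(llmPayload),
).length;

if (payloadBytes > HEADER_LIMIT_BYTES) {
  throw new Error(`Llm metadata is ${payloadBytes} bytes; trim the tools array`);
}
```

A check like this is cheap to run at token-creation time and fails fast instead of producing opaque 4xx errors from a proxy that rejects oversized headers.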

## Fields

| Field | Type | Required | Description |
| ----- | ---- | -------- | ----------- |
| `model` | *string* | ✔️ | The ID of the model to use for generating the response. |
| `instructions` | *string* | | System or developer instructions inserted into the model's context before user input. Use this to guide the agent's behavior, set personality, define constraints, or provide domain-specific knowledge. Acts as a persistent system message for this response. |
| `conversation` | `components.LivekitTokenRequestConversation` | | The conversation that this response belongs to. Can be a conversation ID string or a conversation object. Used to maintain context and history across multiple agent interactions. Optional; omit for one-off interactions. |
| `previousResponseId` | *string* | | The unique ID of a previous response to continue from. Used for multi-turn conversations to maintain context and history. The agent will have access to all previous interactions in the chain. |
| `prompt` | `components.PromptTemplate` | | N/A |
| `tools` | `components.LivekitTokenRequestTools[]` | | Array of tool definitions that the agent can use during response generation. Supports built-in tools (web search, file search, code interpreter, computer use, image generation) and custom tools (functions, MCP, local shell). Each tool definition specifies its type and configuration. |
| `toolChoice` | `components.ToolChoice` | | N/A |
| `parallelToolCalls` | *boolean* | | Whether to allow the agent to execute multiple tool calls in parallel. When true, the agent can make concurrent tool calls for efficiency. When false, tools are called sequentially. Defaults to true for better performance. |
| `maxToolCalls` | *number* | | The maximum total number of tool calls the agent can make during this response. Useful for controlling execution time and preventing infinite loops. Applies to all built-in tools like web search, file search, code interpreter, etc. Must be a positive integer between 1 and 300. Defaults to 30. |
| `reasoning` | `components.ReasoningConfiguration` | | N/A |
| `maxOutputTokens` | *number* | | The maximum number of tokens the model can generate in its output. Sets an upper bound to control costs and response length. The actual output may be shorter if the model finishes naturally. Must be a positive integer. |
| `temperature` | *number* | | Controls randomness in the model's output. Higher values (e.g., 1.5-2.0) make output more random and creative; lower values (e.g., 0.0-0.5) make it more focused and deterministic. Must be between 0 and 2. |
| `topP` | *number* | | Nucleus sampling parameter. The model considers only the tokens with top_p cumulative probability. Lower values (e.g., 0.1) make output more focused; higher values (e.g., 0.9) allow more diversity. Must be between 0 and 1. Alternative to temperature. |
| `truncation` | `components.LivekitTokenRequestTruncation` | | The truncation strategy for handling inputs that exceed the model's context window. `auto` automatically truncates old messages to fit; `disabled` returns an error if context is too long. Defaults to `auto`. |
| `presencePenalty` | *number* | | Penalizes new tokens based on whether they appear in the text so far. Higher values increase the model's likelihood to talk about new topics. |
| `frequencyPenalty` | *number* | | Penalizes new tokens based on their frequency in the text so far. Higher values decrease the model's likelihood to repeat the same line verbatim. |
| `contextManagement` | `components.CompactionContextManagement[]` | | Array of context management strategies to apply during response generation. |
| `serviceTier` | `components.LivekitTokenRequestServiceTier` | | Specifies the processing tier for serving the request. `auto` lets the system choose, `default` uses standard processing, `flex` allows flexible scheduling for lower priority, `priority` provides faster processing. Defaults to `auto`. |
| `metadata` | `Record<string, string>` | | N/A |
| `text` | `components.TextConfiguration` | | N/A |
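For reference, the optional tuning fields above combine like this. A plain-object sketch; the values are illustrative assumptions drawn from the ranges and defaults in the field descriptions, not recommendations:

```typescript
// Sketch of a fuller configuration using the fields documented above.
// All values are illustrative; ranges come from the field descriptions.
const llmConfig = {
  model: "Model S",
  instructions: "Answer briefly; you are speaking aloud.",
  parallelToolCalls: true, // concurrent tool calls (default: true)
  maxToolCalls: 10,        // must be 1-300 (default: 30)
  temperature: 0.4,        // 0-2; lower = more deterministic
  truncation: "auto",      // trim old messages when context overflows
  serviceTier: "auto",     // let the system pick a processing tier
  metadata: { sessionKind: "support-call" }, // Record<string, string>
};
```

Note that `topP` is omitted deliberately: the table describes it as an alternative to `temperature`, so setting one of the two is typically enough.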