Llm.Speech
Overview
Available Operations
- transcribe - Speech to text transcription
- speak - Text to speech
- speakStreaming - Streaming text to speech
transcribe
Convert audio to text using advanced speech recognition.
Upload Methods:
Complete File Upload (Standard)
- Use `Content-Type: multipart/form-data` - Upload the complete audio file in one request
- Maximum file size: 25MB
- Example (curl):

      curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
        -F "file=@audio.flac"

Chunked Upload (Streaming)
- Use `Transfer-Encoding: chunked` header - Stream audio data in chunks as it's being recorded
- No need to know total file size upfront
- Server buffers chunks until complete before processing
- Maximum total size: 25MB
- Example (curl):

      curl -X POST "http://localhost:3000/api/v1/llm/speech/transcriptions?language=en" \
        -H "Transfer-Encoding: chunked" \
        -H "Content-Type: multipart/form-data" \
        --data-binary @audio.flac

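
Because a chunked upload does not declare its total size upfront, a client may want to track the running total itself so it can stop before the 25MB limit is reached. The sketch below is a hypothetical client-side helper, not part of the SDK; the class name `UploadBudget` and the reading of "25MB" as 25 * 1024 * 1024 bytes are assumptions.

```typescript
// Assumption: "25MB" is read as 25 * 1024 * 1024 bytes; the source does not
// say whether the limit uses binary or decimal megabytes.
const MAX_TOTAL_UPLOAD_BYTES = 25 * 1024 * 1024;

// Hypothetical helper (not part of the SDK): tracks how many bytes have been
// streamed so far and rejects a chunk that would exceed the limit.
class UploadBudget {
  private sentBytes = 0;
  private readonly limitBytes: number;

  constructor(limitBytes: number = MAX_TOTAL_UPLOAD_BYTES) {
    this.limitBytes = limitBytes;
  }

  // Records one chunk and returns the number of bytes still available.
  add(chunk: Uint8Array): number {
    const next = this.sentBytes + chunk.byteLength;
    if (next > this.limitBytes) {
      throw new RangeError(
        `Upload would reach ${next} bytes; the limit is ${this.limitBytes}`,
      );
    }
    this.sentBytes = next;
    return this.limitBytes - next;
  }
}
```

Calling `add(chunk)` before each chunk is written to the request body lets a recorder stop cleanly instead of having the server reject the upload after buffering.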
Supported Formats: FLAC, MP3, MP4, MPEG, MPGA, M4A, OGG, WAV, WebM
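
A client could screen file names against the supported formats before uploading. This is a minimal sketch under the assumption that the file extension matches the actual container format; the function name `isSupportedAudioFile` is hypothetical, and the server remains the authority on what it accepts.

```typescript
// Extensions taken from the "Supported Formats" list above.
const SUPPORTED_EXTENSIONS: readonly string[] = [
  "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
];

// Hypothetical helper (not part of the SDK): returns true when the file name
// ends in one of the supported extensions, compared case-insensitively.
function isSupportedAudioFile(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) {
    return false; // no extension at all
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(extension) !== -1;
}
```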
Query Parameters:
- `language` (optional): ISO-639-1 language code (e.g., "en", "es", "fr"). Auto-detects if not specified.
- `prompt` (optional): Text to guide transcription style
- `temperature` (optional): Sampling temperature 0-1 (higher = more random)
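
For callers who use the HTTP endpoint directly, the query parameters above can be assembled as follows. This is a sketch: the path is copied from the curl examples, while the function name `buildTranscriptionUrl`, the two-letter check on `language`, and the client-side range check on `temperature` are assumptions added for illustration.

```typescript
interface TranscribeQuery {
  language?: string;    // ISO-639-1 code such as "en"
  prompt?: string;      // text to guide transcription style
  temperature?: number; // 0 to 1 inclusive
}

// Hypothetical helper (not part of the SDK): builds the transcription URL and
// omits any parameter that was not supplied.
function buildTranscriptionUrl(baseUrl: string, query: TranscribeQuery = {}): string {
  const pairs: string[] = [];
  if (query.language !== undefined) {
    // Assumption: codes are sent as two lowercase letters.
    if (!/^[a-z]{2}$/.test(query.language)) {
      throw new RangeError(`Not a two-letter language code: ${query.language}`);
    }
    pairs.push(`language=${encodeURIComponent(query.language)}`);
  }
  if (query.prompt !== undefined) {
    pairs.push(`prompt=${encodeURIComponent(query.prompt)}`);
  }
  if (query.temperature !== undefined) {
    if (!(query.temperature >= 0 && query.temperature <= 1)) {
      throw new RangeError("temperature must be between 0 and 1");
    }
    pairs.push(`temperature=${query.temperature}`);
  }
  // Path taken from the curl examples; trailing slashes on the base are dropped.
  const path = `${baseUrl.replace(/\/+$/, "")}/api/v1/llm/speech/transcriptions`;
  return pairs.length > 0 ? `${path}?${pairs.join("&")}` : path;
}
```

Leaving `language` out of the query lets the service auto-detect it, as described above.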
Response: Returns transcribed text in JSON format.
Example Usage

    import { SDK } from "@meetkai/mka1";
    import { openAsBlob } from "node:fs";

    const sdk = new SDK({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const result = await sdk.llm.speech.transcribe({
        requestBody: {
          file: await openAsBlob("example.file"),
        },
      });

      console.log(result);
    }

    run();

Standalone function
The standalone function version of this method:

    import { SDKCore } from "@meetkai/mka1/core.js";
    import { llmSpeechTranscribe } from "@meetkai/mka1/funcs/llmSpeechTranscribe.js";
    import { openAsBlob } from "node:fs";

    // Use `SDKCore` for best tree-shaking performance.
    // You can create one instance of it to use across an application.
    const sdk = new SDKCore({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const res = await llmSpeechTranscribe(sdk, {
        requestBody: {
          file: await openAsBlob("example.file"),
        },
      });
      if (res.ok) {
        const { value: result } = res;
        console.log(result);
      } else {
        console.log("llmSpeechTranscribe failed:", res.error);
      }
    }

    run();

React hooks and utilities
This method can be used in React components through the following hooks and associated utilities.
Check out this guide for information about each of the utilities below and how to get started using React hooks.

    import {
      // Mutation hook for triggering the API call.
      useLlmSpeechTranscribeMutation
    } from "@meetkai/mka1/react-query/llmSpeechTranscribe.js";

Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| request | operations.TranscribeRequest | ✔️ | The request object to use for the request. |
| options | RequestOptions | ➖ | Used to set various options for making HTTP requests. |
| options.fetchOptions | RequestInit | ➖ | Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All Request options, except method and body, are allowed. |
| options.retries | RetryConfig | ➖ | Enables retrying HTTP requests under certain failure conditions. |
Response
Promise<components.TranscriptionResponse>
Errors
| Error Type | Status Code | Content Type |
|---|---|---|
| errors.APIError | 4XX, 5XX | */* |
speak
Convert text to speech with automatic language detection.
Request Body:
- `text`: Input text to convert to speech - required
- `language`: Language code (default: "auto") - "auto" for automatic detection, or ISO 639-1 codes: en, zh, hi, es, ar, bn, pt, ru, ja, pa, de, ko, fr, tr, it, th, pl, nl, id, vi, ur
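
The request body fields above can be checked on the client before the call is made. The sketch below is illustrative only: `buildSpeakRequest` is a hypothetical helper, not an SDK function, and rejecting an empty `text` locally is an assumption about what "required" means here.

```typescript
// "auto" plus the ISO 639-1 codes listed in the request body description.
const SPEAK_LANGUAGES = [
  "auto", "en", "zh", "hi", "es", "ar", "bn", "pt", "ru", "ja", "pa",
  "de", "ko", "fr", "tr", "it", "th", "pl", "nl", "id", "vi", "ur",
] as const;

type SpeakLanguage = (typeof SPEAK_LANGUAGES)[number];

interface SpeakRequestBody {
  text: string;
  language: SpeakLanguage;
}

// Hypothetical helper (not part of the SDK): applies the documented default
// of "auto" and rejects language codes that are not in the list.
function buildSpeakRequest(text: string, language: string = "auto"): SpeakRequestBody {
  // Assumption: an empty string does not satisfy the "required" constraint.
  if (text.length === 0) {
    throw new Error("text is required");
  }
  if ((SPEAK_LANGUAGES as readonly string[]).indexOf(language) === -1) {
    throw new RangeError(`Unsupported language code: ${language}`);
  }
  return { text, language: language as SpeakLanguage };
}
```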
Response: Returns audio file in WAV format with X-Language-Code header
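
Once the audio bytes have been received, a quick sanity check is to read the WAV header. The sketch below assumes the canonical layout in which the `fmt ` chunk directly follows the `RIFF`/`WAVE` tags; WAV files with extra chunks before `fmt ` would need a fuller parser. `readWavHeader` is a hypothetical helper, and how the bytes are obtained from the SDK result is not shown because the source does not describe it.

```typescript
interface WavInfo {
  channels: number;
  sampleRate: number;
  bitsPerSample: number;
}

// Hypothetical helper (not part of the SDK): returns basic format fields from
// a canonical WAV header, or null if the bytes do not look like such a file.
function readWavHeader(bytes: Uint8Array): WavInfo | null {
  if (bytes.byteLength < 36) {
    return null; // too short to hold the RIFF tags and the fmt fields
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Reads a four-character ASCII tag at the given offset.
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3),
    );
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE" || tag(12) !== "fmt ") {
    return null;
  }
  // All multi-byte WAV header fields are little-endian.
  return {
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}
```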
Example Usage

    import { SDK } from "@meetkai/mka1";

    const sdk = new SDK({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const result = await sdk.llm.speech.speak({
        text: "<value>",
      });

      console.log(result);
    }

    run();

Standalone function
The standalone function version of this method:

    import { SDKCore } from "@meetkai/mka1/core.js";
    import { llmSpeechSpeak } from "@meetkai/mka1/funcs/llmSpeechSpeak.js";

    // Use `SDKCore` for best tree-shaking performance.
    // You can create one instance of it to use across an application.
    const sdk = new SDKCore({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const res = await llmSpeechSpeak(sdk, {
        text: "<value>",
      });
      if (res.ok) {
        const { value: result } = res;
        console.log(result);
      } else {
        console.log("llmSpeechSpeak failed:", res.error);
      }
    }

    run();

React hooks and utilities
This method can be used in React components through the following hooks and associated utilities.
Check out this guide for information about each of the utilities below and how to get started using React hooks.

    import {
      // Mutation hook for triggering the API call.
      useLlmSpeechSpeakMutation
    } from "@meetkai/mka1/react-query/llmSpeechSpeak.js";

Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| request | components.TextToSpeechRequest | ✔️ | The request object to use for the request. |
| options | RequestOptions | ➖ | Used to set various options for making HTTP requests. |
| options.fetchOptions | RequestInit | ➖ | Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All Request options, except method and body, are allowed. |
| options.retries | RetryConfig | ➖ | Enables retrying HTTP requests under certain failure conditions. |
Response
Promise<operations.TextToSpeechResponse>
Errors
| Error Type | Status Code | Content Type |
|---|---|---|
| errors.APIError | 4XX, 5XX | */* |
speakStreaming
Convert text to speech with real-time streaming audio delivery.
Key Features:
- Low-latency audio streaming - playback can start immediately as chunks arrive
- Automatic language detection
- Multiple format support: MP3 or PCM/WAV
- High-quality audio: 24kHz sample rate, 16-bit mono
Request Body:
- `text`: Input text to convert to speech - required
- `language`: Language code (default: "auto") - "auto" for automatic detection, or ISO 639-1 codes: en, zh, hi, es, ar, bn, pt, ru, ja, pa, de, ko, fr, tr, it, th, pl, nl, id, vi, ur
- `format`: Audio format (default: "mp3") - "mp3" for compressed MPEG audio (96 kbps) or "pcm" for uncompressed WAV
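
The stated audio parameters allow a rough conversion between received bytes and playback time, which can help when sizing a playback buffer. PCM at 24 kHz, 16-bit mono works out to 24000 × 2 × 1 = 48000 bytes per second, and MP3 at 96 kbps to 96000 / 8 = 12000 bytes per second. The sketch below assumes a constant MP3 bitrate and ignores the WAV header and MP3 frame overhead; `estimateAudioSeconds` is a hypothetical helper.

```typescript
type StreamFormat = "mp3" | "pcm";

// 24 kHz sample rate * 2 bytes per 16-bit sample * 1 channel (mono).
const PCM_BYTES_PER_SECOND = 24000 * 2 * 1;

// 96 kbps = 96,000 bits per second = 12,000 bytes per second.
// Assumption: the MP3 stream is constant bitrate.
const MP3_BYTES_PER_SECOND = 96000 / 8;

// Hypothetical helper (not part of the SDK): approximate playback duration
// for a number of received bytes. Container and frame overhead is ignored.
function estimateAudioSeconds(byteCount: number, format: StreamFormat = "mp3"): number {
  const bytesPerSecond =
    format === "pcm" ? PCM_BYTES_PER_SECOND : MP3_BYTES_PER_SECOND;
  return byteCount / bytesPerSecond;
}
```

Under these assumptions the same duration of speech costs four times as many bytes as PCM than as MP3, which is the trade-off behind the `format` choice.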
Response:
- Streams audio chunks in real-time
- Returns `X-Language-Code` header with detected/used language
- Content-Type: `audio/mpeg` (MP3) or `audio/wav` (PCM)
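
A client that reads the HTTP response directly can use these two headers to decide how to decode the stream. The sketch below accepts any object with a `get(name)` method, which the standard fetch `Headers` class satisfies; `describeSpeechStream` and the `HeaderReader` interface are hypothetical names, and whether the SDK result exposes the headers in this form is not stated in the source.

```typescript
// Minimal shape shared by the fetch `Headers` class and simple test stubs.
interface HeaderReader {
  get(name: string): string | null;
}

interface SpeechStreamInfo {
  language: string | null;           // value of X-Language-Code, if present
  format: "mp3" | "pcm" | "unknown"; // derived from Content-Type
}

// Hypothetical helper (not part of the SDK): maps the documented response
// headers to the format requested and the language that was used.
function describeSpeechStream(headers: HeaderReader): SpeechStreamInfo {
  const rawType = headers.get("Content-Type");
  // Drop any parameters after ";" and normalise case before comparing.
  const mediaType = rawType === null ? "" : rawType.split(";")[0].trim().toLowerCase();
  let format: SpeechStreamInfo["format"] = "unknown";
  if (mediaType === "audio/mpeg") {
    format = "mp3";
  } else if (mediaType === "audio/wav") {
    format = "pcm";
  }
  return { language: headers.get("X-Language-Code"), format };
}
```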
Use Cases:
- Real-time applications requiring immediate audio playback
- Interactive voice responses
- Low-latency text-to-speech scenarios
Example Usage

    import { SDK } from "@meetkai/mka1";

    const sdk = new SDK({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const result = await sdk.llm.speech.speakStreaming({
        text: "<value>",
      });

      console.log(result);
    }

    run();

Standalone function
The standalone function version of this method:

    import { SDKCore } from "@meetkai/mka1/core.js";
    import { llmSpeechSpeakStreaming } from "@meetkai/mka1/funcs/llmSpeechSpeakStreaming.js";

    // Use `SDKCore` for best tree-shaking performance.
    // You can create one instance of it to use across an application.
    const sdk = new SDKCore({
      serverURL: "https://api.example.com",
      bearerAuth: "<YOUR_BEARER_TOKEN_HERE>",
    });

    async function run() {
      const res = await llmSpeechSpeakStreaming(sdk, {
        text: "<value>",
      });
      if (res.ok) {
        const { value: result } = res;
        console.log(result);
      } else {
        console.log("llmSpeechSpeakStreaming failed:", res.error);
      }
    }

    run();

React hooks and utilities
This method can be used in React components through the following hooks and associated utilities.
Check out this guide for information about each of the utilities below and how to get started using React hooks.

    import {
      // Mutation hook for triggering the API call.
      useLlmSpeechSpeakStreamingMutation
    } from "@meetkai/mka1/react-query/llmSpeechSpeakStreaming.js";

Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| request | components.TextToSpeechStreamingRequest | ✔️ | The request object to use for the request. |
| options | RequestOptions | ➖ | Used to set various options for making HTTP requests. |
| options.fetchOptions | RequestInit | ➖ | Options that are passed to the underlying HTTP request. This can be used to inject extra headers, for example. All Request options, except method and body, are allowed. |
| options.retries | RetryConfig | ➖ | Enables retrying HTTP requests under certain failure conditions. |
Response
Promise<operations.TextToSpeechStreamingResponse>
Errors
| Error Type | Status Code | Content Type |
|---|---|---|
| errors.APIError | 4XX, 5XX | */* |