API Walkthrough
In this guide we walk through the specific API features for audio and video.
Create Document
The mode parameter is either an object or a string; the string form is kept for backwards compatibility. In the mode object, static can be hi_res or fast, audio is a boolean, and video can be audio_only, video_only, or audio_video. The audio_video mode processes both the audio and video tracks of the uploaded video file, while the other two options process only their respective tracks.
Static - Backwards Compatibility
Uploading a static file is backwards compatible with the string mode constants.
const file = new File([blob], "presentation.pdf");
const result = await ragie.documents.create({
file: file,
mode: "hi_res", // or "fast"
metadata: {
category: "business",
},
});
Static - Object
const file = new File([blob], "presentation.pdf");
const result = await ragie.documents.create({
file: file,
mode: {
static: "hi_res", // or "fast"
},
metadata: {
category: "business",
},
});
Audio
const file = new File([blob], "presentation.mp3");
const result = await ragie.documents.create({
file: file,
mode: {
audio: true,
},
metadata: {
category: "business",
},
});
Video
const file = new File([blob], "presentation.mp4");
const result = await ragie.documents.create({
file: file,
mode: {
video: "audio_video", // or "audio_only", "video_only"
},
metadata: {
category: "business",
},
});
Setting Multiple Modes
You can set multiple modes at once; the applicable mode will be used for the uploaded file.
const file = new File([blob], "presentation.mp4");
const result = await ragie.documents.create({
file: file,
mode: {
static: "hi_res",
audio: true,
video: "audio_video",
},
metadata: {
category: "business",
},
});
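To illustrate what "the applicable mode" means when several modes are set, here is a rough client-side sketch. The service makes this decision itself; the extension lists and the guessModality and pickApplicableMode helpers are assumptions made purely for illustration.

```typescript
interface MultiMode {
  static?: "hi_res" | "fast";
  audio?: boolean;
  video?: "audio_only" | "video_only" | "audio_video";
}

type Modality = keyof MultiMode;

// Assumed extension lists; the real service may detect file types differently.
const AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a"]);
const VIDEO_EXTENSIONS = new Set(["mp4", "mov", "webm"]);

// Guess which modality a file belongs to from its extension.
function guessModality(filename: string): Modality {
  const ext = (filename.split(".").pop() || "").toLowerCase();
  if (AUDIO_EXTENSIONS.has(ext)) return "audio";
  if (VIDEO_EXTENSIONS.has(ext)) return "video";
  return "static";
}

// Return only the setting that would apply to this file.
function pickApplicableMode(filename: string, mode: MultiMode): MultiMode[Modality] {
  return mode[guessModality(filename)];
}

const combined: MultiMode = { static: "hi_res", audio: true, video: "audio_video" };
console.log(pickApplicableMode("presentation.mp4", combined));
```

With the combined mode above, an .mp4 upload would use the video setting, while the static and audio settings would be ignored for that file.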
Response
The response from the Create Document API is the same for any file type.
{
"status": "partitioning",
"id": "id123",
"created_at": "2025-05-15T19:58:03.483172Z",
"updated_at": "2025-05-15T19:58:03.586856Z",
"name": "presentation.mp4",
"metadata": {},
"partition": "default",
"chunk_count": null,
"external_id": "",
"page_count": null
}
Get Document Chunks
The Get Document Chunks endpoint returns additional information relevant to audio and video files. The request schema is unchanged. The response includes end_time and start_time in the metadata field of each chunk. For audio and video files, the links field contains links to the chunk, its text, a stream URL, and a download URL.
Example response:
{
"pagination": {
"next_cursor": null,
"total_count": 10
},
"chunks": [
{
"id": "id123",
"index": 0,
"text": "The text of the chunk.",
"metadata": {
"end_time": 57.68,
"start_time": 2.44
},
"links": {
"self": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId",
"type": "application/json"
},
"self_text": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=text/plain-text",
"type": "text/plain-text"
},
"document": {
"href": "https://api.ragie.ai/documents/docId",
"type": "application/json"
},
"document_text": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=text/plain-text",
"type": "text/plain-text"
},
"self_audio_stream": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg",
"type": "audio/mpeg"
},
"self_audio_download": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg&download=true",
"type": "audio/mpeg"
},
"document_audio_stream": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=audio/mpeg",
"type": "audio/mpeg"
},
"document_audio_download": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=audio/mpeg&download=true",
"type": "audio/mpeg"
}
}
},
{...
For a video document, the response would be the same, except that the links object would contain self, self_text, self_video_stream, self_video_download, document_video_stream, and document_video_download.
The start_time and end_time values give the points at which the chunk starts and ends, respectively, within the entire audio or video file.
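As a small illustration of working with these values, the sketch below turns a chunk's metadata into a readable time range. It assumes the values are expressed in seconds, which the example numbers suggest but the text does not state; the helper names are hypothetical.

```typescript
interface TimedMetadata {
  start_time: number;
  end_time: number;
}

// Format a number of seconds as m:ss (assumes the API reports seconds).
function formatTimestamp(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const rest = total % 60;
  return `${minutes}:${rest < 10 ? "0" : ""}${rest}`;
}

// Describe where a chunk sits in the file and how long it lasts.
function describeRange(meta: TimedMetadata): string {
  const duration = meta.end_time - meta.start_time;
  return `${formatTimestamp(meta.start_time)} to ${formatTimestamp(meta.end_time)} (${duration.toFixed(2)}s)`;
}

// Metadata taken from the example response above.
console.log(describeRange({ start_time: 2.44, end_time: 57.68 }));
// prints "0:02 to 0:57 (55.24s)"
```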
Get Document Chunk
The Get Document Chunk endpoint returns more granular data for each chunk of an audio or video file. In addition to the metadata and links fields, it returns a modality_data field that contains word-level timestamps.
Example response:
{
"id": "id123",
"index": 0,
"text": "The text of the chunk.",
"metadata": {
"end_time": 57.68,
"start_time": 2.44
},
"links": {
"self": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId",
"type": "application/json"
},
"self_text": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=text/plain-text",
"type": "text/plain-text"
},
"document": {
"href": "https://api.ragie.ai/documents/docId",
"type": "application/json"
},
"document_text": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=text/plain-text",
"type": "text/plain-text"
},
"self_audio_stream": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg",
"type": "audio/mpeg"
},
"self_audio_download": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg&download=true",
"type": "audio/mpeg"
},
"document_audio_stream": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=audio/mpeg",
"type": "audio/mpeg"
},
"document_audio_download": {
"href": "https://api.ragie.ai/documents/docId/content?media_type=audio/mpeg&download=true",
"type": "audio/mpeg"
}
},
"modality_data": {
"type": "audio",
"word_timestamps": [
{
"start_time": 2.44,
"end_time": 3.06,
"word": " President",
"probability": 0.75537109375
},
{
"start_time": 3.06,
"end_time": 3.38,
"word": " Pitzer,",
"probability": 0.67431640625
},
...
For static files, the modality_data field is returned as null.
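Word-level timestamps make it possible to look up what was being said at a given moment, or to flag words the transcription was unsure about. The following is a minimal sketch over the word_timestamps array shown above; the wordAt and lowConfidenceWords helpers are hypothetical, and treating end_time as exclusive is an assumption.

```typescript
interface WordTimestamp {
  start_time: number;
  end_time: number;
  word: string;
  probability: number;
}

// Find the word being spoken at a given time, if any.
// Assumption: a word covers [start_time, end_time).
function wordAt(words: WordTimestamp[], time: number): WordTimestamp | undefined {
  return words.find((w) => time >= w.start_time && time < w.end_time);
}

// List the words whose probability falls below a threshold.
function lowConfidenceWords(words: WordTimestamp[], threshold: number): string[] {
  return words.filter((w) => w.probability < threshold).map((w) => w.word.trim());
}

// Sample data copied from the example response above.
const sampleWords: WordTimestamp[] = [
  { start_time: 2.44, end_time: 3.06, word: " President", probability: 0.75537109375 },
  { start_time: 3.06, end_time: 3.38, word: " Pitzer,", probability: 0.67431640625 },
];

console.log(wordAt(sampleWords, 2.5)?.word.trim()); // prints "President"
console.log(lowConfidenceWords(sampleWords, 0.7)); // prints [ 'Pitzer,' ]
```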
Get Document Chunk Content
The Get Document Chunk Content endpoint returns the content of a document chunk in the requested format. This can be used to stream media of the content for audio and video documents.
Request
The media_type parameter specifies the desired MIME type of the returned content. If the requested media_type is not supported for the chunk's document type, the request returns an error.
Example request:
const response = await ragie.documents.getChunkContent({
documentId: "docId",
chunkId: "chunkId",
mediaType: "audio/mpeg",
});
Response
The response uses the MIME type specified in the request. If the download parameter is false, chunks from audio and video files are returned as a stream of raw data. If download is true, the content is returned as a named file for download. If media_type is set to application/json, this endpoint behaves the same as Get Document Chunk.
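For callers that work with raw URLs rather than the SDK, the sketch below builds a chunk content URL following the pattern of the links shown in the example responses. The chunkContentUrl helper is hypothetical, and omitting download when it is false is an assumption about the default.

```typescript
// Build a chunk content URL with media_type and optional download parameters.
function chunkContentUrl(
  documentId: string,
  chunkId: string,
  mediaType: string,
  download: boolean = false
): string {
  const url = new URL(
    `https://api.ragie.ai/documents/${encodeURIComponent(documentId)}/chunks/${encodeURIComponent(chunkId)}/content`
  );
  url.searchParams.set("media_type", mediaType);
  // Assumption: leaving download out is equivalent to download=false.
  if (download) {
    url.searchParams.set("download", "true");
  }
  return url.toString();
}

// URLSearchParams percent-encodes the "/" in the media type (audio%2Fmpeg),
// which decodes to the same value as the unencoded form in the examples.
console.log(chunkContentUrl("docId", "chunkId", "audio/mpeg", true));
```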
Retrievals
The Retrieve endpoint works across all modalities, including audio and video.
You can get three different types of chunks in a retrieval: text, audio, and video. The retrieval request schema remains the same for all modalities. For audio and video chunks, the response also contains timing metadata and media links.
Example response:
{
"scored_chunks": [
{
"text": "This is the text of a chunk",
"score": 0.19090909090909092,
"id": "chunkId",
"index": 2,
"metadata": {
"end_time": 167.5,
"start_time": 115.37
},
"document_id": "docId",
"document_name": "presentation.mp3",
"document_metadata": {},
"links": {
"self": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId",
"type": "application/json"
},
"self_text": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=text/plain-text",
"type": "text/plain-text"
},
"self_audio_stream": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg",
"type": "audio/mpeg"
},
"self_audio_download": {
"href": "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg&download=true",
"type": "audio/mpeg"
}
}
},
...
Each entry in the scored_chunks array includes a metadata field that contains the end_time and start_time for that chunk. Note that the document's own metadata is in the document_metadata field.
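Because a single retrieval can mix text, audio, and video chunks, a client may want to tell them apart. One possible approach, sketched below, is to inspect which link keys are present. This is an assumption based on the example responses rather than a documented contract, and the types and chunkKind helper are hypothetical.

```typescript
interface ChunkLink {
  href: string;
  type: string;
}

interface ScoredChunk {
  id: string;
  text: string;
  score: number;
  links: Record<string, ChunkLink>;
}

type ChunkKind = "text" | "audio" | "video";

// Infer the modality from the link keys present on the chunk.
// Assumption: video chunks carry self_video_stream and audio chunks carry
// self_audio_stream, as in the example responses.
function chunkKind(chunk: ScoredChunk): ChunkKind {
  if ("self_video_stream" in chunk.links) return "video";
  if ("self_audio_stream" in chunk.links) return "audio";
  return "text";
}

const audioChunk: ScoredChunk = {
  id: "chunkId",
  text: "This is the text of a chunk",
  score: 0.19090909090909092,
  links: {
    self_audio_stream: {
      href: "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg",
      type: "audio/mpeg",
    },
  },
};

console.log(chunkKind(audioChunk)); // prints "audio"
```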
Understanding a Chunk's Links
Various links may be provided with chunks. Here is a breakdown of the links provided:
| | Text | Audio (download) | Audio (stream) | Video (download) | Video (stream) |
|---|---|---|---|---|---|
| Chunk | self_text: The plain text of the chunk. | self_audio_download: Download the audio of the chunk. | self_audio_stream: Stream the audio of the chunk. | self_video_download: Download the video of the chunk. | self_video_stream: Stream the video of the chunk. |
| Document | document_text: The plain text of the document. | document_audio_download: Download the audio of the document. | document_audio_stream: Stream the audio of the document. | document_video_download: Download the video of the document. | document_video_stream: Stream the video of the document. |
Stream the video of a specific chunk:
const streamUrl = chunk.links.self_video_stream.href;
Download the entire video document of which the chunk is a member:
const downloadUrl = chunk.links.document_video_download.href;
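The media link names follow a regular scope_media_action pattern, so a lookup can be built from its three parts. The mediaLink helper below is a hypothetical sketch; it returns undefined when a link is absent, for example when asking for a video link on an audio chunk.

```typescript
type LinkScope = "self" | "document";
type LinkMedia = "audio" | "video";
type LinkAction = "stream" | "download";

type MediaLinks = Record<string, { href: string; type: string } | undefined>;

// Look up a media link by scope, media, and action,
// e.g. ("document", "video", "download") -> links.document_video_download.
function mediaLink(
  links: MediaLinks,
  scope: LinkScope,
  media: LinkMedia,
  action: LinkAction
): string | undefined {
  const entry = links[`${scope}_${media}_${action}`];
  return entry ? entry.href : undefined;
}

const exampleLinks: MediaLinks = {
  self_audio_stream: {
    href: "https://api.ragie.ai/documents/docId/chunks/chunkId/content?media_type=audio/mpeg",
    type: "audio/mpeg",
  },
};

console.log(mediaLink(exampleLinks, "self", "audio", "stream"));
console.log(mediaLink(exampleLinks, "self", "video", "stream")); // prints undefined
```

Checking for undefined before use avoids a runtime error when a chunk does not carry the requested link.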