Update Document File

Path Params
uuid
required

The id of the document.

Body Params
mode

Partition strategy for the document. Textual, audio, and video file types each have their own strategies. You can set a strategy for each file type, or for textual types only.

For textual documents the options are 'hi_res' or 'fast'. When set to 'hi_res', images and tables will be extracted from the document. 'fast' will only extract text. 'fast' may be up to 20x faster than 'hi_res'. hi_res is only applicable for Word documents, PDFs, Images, and PowerPoints. Images will always be processed in hi_res. If hi_res is set for an unsupported document type, it will be processed and billed in fast mode.
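The fallback rules for textual documents can be sketched as a small client-side lookup. This is only an illustration of the rules described above, not part of the API: the function name `effective_static_mode` is hypothetical, and the mapping of "Word documents, PDFs, Images, and PowerPoints" to specific extensions is an assumption based on the supported file types listed for the file parameter.

```python
from pathlib import Path

# Assumed mapping of the categories named above to file extensions.
IMAGE_EXTS = {".png", ".webp", ".jpg", ".jpeg", ".tiff", ".bmp", ".heic"}
HI_RES_EXTS = IMAGE_EXTS | {".doc", ".docx", ".pdf", ".ppt", ".pptx"}


def effective_static_mode(filename: str, requested: str) -> str:
    """Return the mode a textual file would be processed in under the rules above."""
    if requested not in ("hi_res", "fast"):
        raise ValueError(f"unknown textual mode: {requested!r}")
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTS:
        # Images are always processed in hi_res.
        return "hi_res"
    if requested == "hi_res" and ext not in HI_RES_EXTS:
        # Unsupported types are processed (and billed) in fast mode.
        return "fast"
    return requested
```

A helper like this may be useful for estimating how a batch of files will be processed before uploading; the service itself remains the source of truth.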

For audio files, the options are true or false. Set it to true to process audio, or false to skip it.

For video files, the options are 'audio_only', 'video_only', 'audio_video'. 'audio_only' will extract just the audio part of the video. 'video_only' will similarly just extract the video part, ignoring audio. 'audio_video' will extract both audio and video.

To process all media types at the highest quality, use 'all'.

When you specify audio or video strategies, the format must be a JSON object. In this case, textual documents are denoted by the key "static". If you omit a key, that document type won't be processed. See examples below.

Examples

Textual documents only: "fast"

Video documents only: { "video": "audio_video" }

Specify multiple document types: { "static": "hi_res", "audio": true, "video": "video_only" }

Specify only textual or audio document types: { "static": "fast", "audio": true }

Highest quality processing for all media types: "all"

Agentic OCR: "agentic_ocr"

Agentic OCR is in early access. The agentic_ocr mode extracts content using vision models, which can be more accurate, especially on visually complex documents. If you are interested in accessing this feature, please contact us at [email protected].
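The example values above can be produced by a small helper. This is a minimal sketch: the function name `encode_mode` is hypothetical, and it assumes the JSON object form is sent as a JSON-encoded string in the mode field. The literal values "all" and "agentic_ocr" are not covered here and would be passed as plain strings.

```python
import json

STATIC_MODES = {"hi_res", "fast"}
VIDEO_MODES = {"audio_only", "video_only", "audio_video"}


def encode_mode(static=None, audio=None, video=None) -> str:
    """Build a value for the mode field from per-type strategies."""
    if audio is None and video is None:
        # Textual-only strategies are sent as a bare string.
        if static not in STATIC_MODES:
            raise ValueError(f"static must be one of {sorted(STATIC_MODES)}")
        return static
    mode = {}
    if static is not None:
        if static not in STATIC_MODES:
            raise ValueError(f"static must be one of {sorted(STATIC_MODES)}")
        mode["static"] = static
    if audio is not None:
        if not isinstance(audio, bool):
            raise ValueError("audio must be true or false")
        mode["audio"] = audio
    if video is not None:
        if video not in VIDEO_MODES:
            raise ValueError(f"video must be one of {sorted(VIDEO_MODES)}")
        mode["video"] = video
    # Omitted keys are left out, so those document types are not processed.
    return json.dumps(mode)
```

For instance, `encode_mode(static="fast", audio=True)` yields a JSON string equivalent to the "textual or audio" example, while `encode_mode(static="fast")` yields the bare string form.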

file
required

The binary file to upload, extract, and index for retrieval. The following file types are supported:

Plain Text: .eml .html .json .md .msg .rst .rtf .txt .xml
Images: .png .webp .jpg .jpeg .tiff .bmp .heic
Documents: .csv .doc .docx .epub .epub+zip .odt .pdf .ppt .pptx .tsv .xlsx .xls

PDF files over 2000 pages are not supported in hi_res mode.
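A pre-upload check against these constraints might look like the following sketch. The function name `check_upload` and its parameters are hypothetical, and the caller is assumed to know the PDF page count already. Audio and video extensions are not enumerated in this document, so the helper reports them only as "not in the documented list" rather than as definitively unsupported.

```python
from pathlib import Path
from typing import List, Optional

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTS = {
    ".eml", ".html", ".json", ".md", ".msg", ".rst", ".rtf", ".txt", ".xml",
    ".png", ".webp", ".jpg", ".jpeg", ".tiff", ".bmp", ".heic",
    ".csv", ".doc", ".docx", ".epub", ".epub+zip", ".odt", ".pdf",
    ".ppt", ".pptx", ".tsv", ".xlsx", ".xls",
}
MAX_HI_RES_PDF_PAGES = 2000


def check_upload(filename: str, static_mode: str = "fast",
                 pdf_pages: Optional[int] = None) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTS:
        problems.append(f"{ext or '(no extension)'} is not in the documented list")
    if (ext == ".pdf" and static_mode == "hi_res"
            and pdf_pages is not None and pdf_pages > MAX_HI_RES_PDF_PAGES):
        problems.append("PDF exceeds 2000 pages, which hi_res mode does not support")
    return problems
```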

Headers

An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting [email protected]. We strongly recommend that older accounts using the partitions feature scope each request to a partition.
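Building the request headers with an optional partition could be sketched as follows. The header name is not stated in this section, so `partition` is used only as an assumed default and is exposed as a parameter; Bearer token authentication is likewise an assumption, and `build_headers` is a hypothetical helper name.

```python
from typing import Dict, Optional


def build_headers(api_key: str, partition: Optional[str] = None,
                  partition_header: str = "partition") -> Dict[str, str]:
    """Return request headers, adding the partition header only when one is given."""
    # Assumption: the API uses Bearer token authentication.
    headers = {"Authorization": f"Bearer {api_key}"}
    if partition:
        # Assumption: the header name defaults to "partition".
        headers[partition_header] = partition
    return headers
```

Passing a partition explicitly avoids relying on the account-age-dependent default described above.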

Responses
