Partitions
In Ragie, documents may be logically separated into partitions. Retrievals accept an optional partition parameter that, when present, scopes the retrieval to documents in the given partition. If it is omitted, retrievals are scoped to an implicit "default" partition. Partitions serve a number of use cases, such as segregating user data in multi-tenant SaaS applications or defining distinct knowledge bases for use in different contexts. Partitions are optional: if you do not use them, all documents exist in the "default" partition for your tenant.
Partitions are provided as a string that is lowercase alphanumeric and may include the _ and - special characters.
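A client-side check can catch malformed partition names before a request is sent. The following is a minimal sketch that encodes only the rule stated above; the names PARTITION_PATTERN and is_valid_partition are hypothetical, and no length limit is enforced because none is stated here.

```python
import re

# Rule taken from the text above: lowercase letters, digits, "_" and "-".
# Assumption: no length limit is checked, since the text does not give one.
PARTITION_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_partition(name: str) -> bool:
    """Return True if `name` contains only the characters described above."""
    return PARTITION_PATTERN.fullmatch(name) is not None
```

For example, is_valid_partition("org_42-prod") passes, while "Org42" (uppercase) and "acme corp" (space) do not.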
Partitions may also be used to improve retrieval results. Ragie uses a hybrid approach when performing retrievals that includes searching a keyword index. These keyword indexes are also separated by partition. A relevant detail here is that keyword importance (or weight) is partially determined by how frequently the keyword appears in the set of documents (inverse document frequency). For example, legal jargon in a set of law documents would carry relatively less weight than the same jargon appearing in customer service documents, since the legal terms presumably appear far more frequently in the law documents. Partitions are a useful construct for separating documents by domain to improve the quality of the keyword portion of Ragie’s hybrid search.
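To make the inverse document frequency effect concrete, the sketch below computes a term's weight in two small, made-up document sets. It uses a common smoothed IDF formula, log((1 + N) / (1 + df)) + 1, purely for illustration; this is an assumption and not a description of how Ragie scores keywords. The sample documents and the idf function are hypothetical.

```python
import math


def idf(term: str, documents: list[str]) -> float:
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(documents)
    # df = number of documents that contain the term as a whole word
    df = sum(1 for doc in documents if term in doc.lower().split())
    return math.log((1 + n) / (1 + df)) + 1.0


# Hypothetical partition of legal documents: the term appears in every one.
legal_partition = [
    "the plaintiff filed an indemnification claim",
    "indemnification clauses limit liability",
    "the court reviewed the indemnification terms",
]

# Hypothetical partition of customer service documents: the term is rare.
support_partition = [
    "how do i reset my password",
    "our indemnification policy is described in the terms",
    "the app crashes on startup",
]

legal_weight = idf("indemnification", legal_partition)
support_weight = idf("indemnification", support_partition)
print(f"legal: {legal_weight:.3f}, support: {support_weight:.3f}")
```

The term is weighted lower in the legal set, where it appears in every document, than in the support set, where it appears in only one. If both sets shared a single index, the term's weight would fall somewhere in between for every query.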
Working with Partitions
Creating partitions
A partition is created automatically whenever a document is created in it.
Creating documents
Documents can be created in a partition by providing an optional partition string when creating them.
Retrievals
The optional partition parameter can be provided when doing retrievals. When present, the retrieval is scoped to that partition. Fine-grained scoping via metadata filters may be combined with partition scoping.
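A retrieval scoped to a partition and narrowed further by a metadata filter might be built as follows. This is a sketch under stated assumptions: the endpoint URL, the bearer-token header, and the JSON field names ("query", "partition", "filter") are illustrative and should be checked against the API reference. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: illustrative endpoint URL; confirm against the API reference.
RETRIEVALS_URL = "https://api.ragie.ai/retrievals"


def build_retrieval_request(api_key, query, partition=None, metadata_filter=None):
    """Build (but do not send) a retrieval request, optionally partition-scoped."""
    body = {"query": query}
    if partition is not None:
        body["partition"] = partition  # assumed field name
    if metadata_filter is not None:
        body["filter"] = metadata_filter  # assumed field name
    return urllib.request.Request(
        RETRIEVALS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_retrieval_request(
    "YOUR_API_KEY",
    "What is our refund policy?",
    partition="customer_support",
    metadata_filter={"department": "billing"},
)
# To send it: urllib.request.urlopen(request)
```

Leaving partition out of the body corresponds to the omitted-parameter case described above.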
Partition handling for other endpoints
Many other endpoints support scoping the request to a partition using the partition HTTP header or a parameter with the same name in the Ragie SDKs. If partition is omitted, the request is scoped to the "default" partition. One caveat: for accounts created prior to 1/9/2025, requests that omit partition are scoped to all partitions. Those accounts may opt in to stricter partition enforcement by contacting [email protected]. If you're using partitions, you are strongly encouraged to set partition explicitly on requests that support it.
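Setting the partition header explicitly could look like the sketch below, shown for a document lookup. The base URL, the /documents/{id} path, and the bearer-token header are assumptions for illustration; only the header name partition comes from the text above. The request is constructed but not sent.

```python
import urllib.request

# Assumption: illustrative base URL and path; confirm against the API reference.
API_BASE = "https://api.ragie.ai"


def build_get_document_request(api_key, document_id, partition):
    """Build (but do not send) a document lookup scoped by the partition header."""
    return urllib.request.Request(
        f"{API_BASE}/documents/{document_id}",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            # HTTP header names are case-insensitive; urllib stores this
            # one internally as "Partition".
            "partition": partition,
        },
        method="GET",
    )


request = build_get_document_request("YOUR_API_KEY", "DOCUMENT_ID", "customer_support")
# To send it: urllib.request.urlopen(request)
```

Passing the partition as a required argument, rather than an optional one, keeps callers from relying on the account-dependent behavior for omitted partitions.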
Endpoints that support partition scoping
- Get Document
- Delete Document
- Update Document File
- Update Document Raw
- Patch Document Metadata
- Get Document Chunks
- Get Document Chunk
- Get Document Summary
- Get Instruction Extracted Entities
- Get Document Extracted Entities
Common use cases
Multi-tenant SaaS
Multi-tenant applications will generally want to isolate their users’ data to prevent data leakage between users. Apps may want to use a USER_ID as their partition key or potentially an ORG_ID if the app is more multiuser in nature. Isolating user and organization data is as simple as providing the desired partition key when managing documents and doing retrievals. More fine-grained retrieval scoping via metadata filters is still possible and can be combined with partitions.
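One way to derive a partition key from a tenant identifier is sketched below. The helper names, the "org" prefix, and the request body field names are hypothetical; the character rule is the one given earlier on this page. Note that lowercasing would merge IDs that differ only by case, so this approach assumes tenant IDs are case-insensitive.

```python
import re

# Allowed characters per the partition naming rule: lowercase alphanumerics, "_" and "-".
ALLOWED = re.compile(r"[a-z0-9_-]+")


def partition_for_tenant(tenant_id: str, prefix: str = "org") -> str:
    """Map a tenant ID (for example an ORG_ID) to a partition key."""
    # Caveat: lowercasing merges IDs that differ only by case.
    key = f"{prefix}_{tenant_id}".lower()
    if ALLOWED.fullmatch(key) is None:
        raise ValueError(f"cannot derive a valid partition from {tenant_id!r}")
    return key


def scoped_retrieval_body(tenant_id: str, query: str) -> dict:
    """Request body that always carries the tenant's partition (assumed field names)."""
    return {"query": query, "partition": partition_for_tenant(tenant_id)}


body = scoped_retrieval_body("ACME-42", "How do I export my data?")
```

Routing every document and retrieval call through one helper like this makes it harder to issue a request without a partition by accident.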
Isolated Knowledge Bases
If an organization has multiple distinct domains of knowledge that it wants to use as sources for its generative AI applications, creating partitions for those domains will improve the quality of the keyword component of Ragie’s hybrid search approach. Depending on the use case, creating a distinct partition for each function, such as customer support, legal, or HR, may be the ideal approach. This is not a one-size-fits-all recommendation, and it may affect how you structure your retrievals. If you have any questions, we’re always happy to discuss the particulars of your use case and help you design the best approach.