Summary Index

In a typical RAG system, it's entirely possible that all of the top_k chunks returned by retrieval come from a single document. Some applications need the retrieved chunks to span multiple documents. In Ragie, you can do this using the Summary Index feature.

Retrieval

Ragie's retrievals endpoint takes an optional max_chunks_per_document parameter. When this is set to a value > 0, it works in conjunction with the top_k parameter to increase the number of documents that chunks are retrieved from. Behind the scenes, it does the following:

  1. Uses the summary index to retrieve the top n documents, where n is top_k / max_chunks_per_document. Document relevancy is calculated using cosine distance between the query and the summary of the document.
  2. For each of those documents, retrieves the top max_chunks_per_document chunks using the document chunk index.
  3. Sorts the final list of chunks by cosine similarity score.
  4. Optionally, if rerank=true, performs an LLM-based reranking of the results.
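
For example, with top_k set to 24 and max_chunks_per_document set to 2, the following request body retrieves chunks from up to 12 documents:

{
  "query": "what are the most requested features?",
  "top_k": 24,
  "filter": {
    "collection": "sales"
  },
  "rerank": false,
  "max_chunks_per_document": 2
}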
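
The retrieval steps listed above can be illustrated with a small in-memory sketch. This is not Ragie's implementation: the function names, the dictionary-based indexes, the toy 2-dimensional vectors, and the use of integer division for n are all assumptions made for illustration, and the optional reranking step is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_tier_retrieve(query_vec, summary_index, chunk_index, top_k, max_chunks_per_document):
    """Hypothetical sketch of summary-index retrieval.

    summary_index: {doc_id: summary_vector}
    chunk_index:   {doc_id: [(chunk_text, chunk_vector), ...]}
    """
    # Step 1: rank documents by summary similarity and keep the top n.
    # Integer division is an assumption; the source only says top_k / max_chunks_per_document.
    n_docs = max(1, top_k // max_chunks_per_document)
    ranked_docs = sorted(
        summary_index,
        key=lambda doc_id: cosine_similarity(query_vec, summary_index[doc_id]),
        reverse=True,
    )[:n_docs]

    # Step 2: within each selected document, keep its best-scoring chunks.
    results = []
    for doc_id in ranked_docs:
        scored = [
            (cosine_similarity(query_vec, vec), doc_id, text)
            for text, vec in chunk_index.get(doc_id, [])
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        results.extend(scored[:max_chunks_per_document])

    # Step 3: sort the combined list by similarity score.
    results.sort(key=lambda item: item[0], reverse=True)
    return results[:top_k]


# Toy data: document "a" has three chunks that all match the query closely.
summary_index = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
chunk_index = {
    "a": [("a1", [1.0, 0.0]), ("a2", [0.9, 0.1]), ("a3", [0.95, 0.05])],
    "b": [("b1", [0.7, 0.7]), ("b2", [0.6, 0.8])],
    "c": [("c1", [0.0, 1.0])],
}
results = two_tier_retrieve([1.0, 0.0], summary_index, chunk_index,
                            top_k=4, max_chunks_per_document=2)
for score, doc_id, text in results:
    print(f"{doc_id} {text} {score:.3f}")
```

In this toy data, a plain top_k search over all chunks would return three chunks from document "a". Capping each document at two chunks lets document "b" contribute two of the four results.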

Document Summary Creation, Embedding & Indexing

Most RAG systems use a single document chunk index for indexing and retrieval. Ragie's RAG platform implements a more advanced two-tier indexing scheme. In addition to the document chunk index, we also create a second index of document summaries.

  1. For each document that is created or updated within Ragie, we create a detailed summary using state-of-the-art large-context LLMs, which work with documents up to 1M tokens in length. Documents of type xlsx, csv, and json are not supported for summarization.
  2. The document summary is then embedded using an LLM. Because summary chunks tend to be longer and more information-dense than regular document chunks, we create higher-dimensionality (3072) vectors for each summary chunk.
  3. Each summary vector is then stored in a high-performance vector database index along with its document's metadata.
  4. Once created and indexed, document summaries may be retrieved using our document summary API.

Document summary generation is automatic for all compatible document types. It does not require any additional configuration or API parameters to enable.
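
The indexing flow described above can be sketched as follows. Everything here is a hypothetical stand-in rather than Ragie's code: summarize() truncates text instead of calling an LLM, embed() is a toy deterministic hash instead of an embedding model, and SummaryIndex is a plain in-memory dictionary instead of a vector database. The sketch also stores one vector per document, whereas the text describes vectors per summary chunk. Only the unsupported file types and the 3072 dimensions come from the text.

```python
from dataclasses import dataclass, field

# Taken from the text: these types are not summarized, and summary vectors have 3072 dimensions.
UNSUPPORTED_TYPES = {"xlsx", "csv", "json"}
SUMMARY_DIMENSIONS = 3072


@dataclass
class SummaryIndex:
    """Hypothetical in-memory stand-in for the summary vector index."""
    entries: dict = field(default_factory=dict)

    def upsert(self, doc_id, vector, metadata):
        # Re-indexing an updated document overwrites its previous entry.
        self.entries[doc_id] = {"vector": vector, "metadata": metadata}


def summarize(text):
    """Placeholder for the LLM summarization step (assumption: simple truncation)."""
    return text[:200]


def embed(text, dimensions=SUMMARY_DIMENSIONS):
    """Placeholder for the embedding step (assumption: toy byte-hash vector)."""
    vector = [0.0] * dimensions
    for position, byte in enumerate(text.encode("utf-8")):
        vector[(position * 31 + byte) % dimensions] += 1.0
    return vector


def index_document_summary(index, doc_id, file_type, text, metadata):
    """Summarize, embed, and store one document. Returns False if the type is skipped."""
    if file_type.lower() in UNSUPPORTED_TYPES:
        return False
    summary = summarize(text)
    index.upsert(doc_id, embed(summary), metadata)
    return True


index = SummaryIndex()
indexed_pdf = index_document_summary(
    index, "doc-1", "pdf", "Customers most often request SSO and audit logs.",
    {"collection": "sales"},
)
indexed_csv = index_document_summary(
    index, "doc-2", "csv", "feature,votes\nsso,41", {"collection": "sales"},
)
print(indexed_pdf, indexed_csv, len(index.entries["doc-1"]["vector"]))
```

Storing the document's metadata next to the summary vector is what would let a metadata filter, such as the collection filter in the request above, be applied at the document-selection stage.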