Vector Store RAG Workflow Documentation
Overview
This workflow implements a Retrieval-Augmented Generation (RAG) system using a Chroma vector database. It takes user chat input, retrieves relevant context from a document via vector similarity search, generates a response with an OpenAI language model, and displays the response in a chat interface. The system supports both single-turn and multi-turn conversations by leveraging the conversation history. To index a long document, the workflow splits it into chunks, embeds each chunk, and stores the embeddings in the Chroma database, enabling efficient search over large documents.
Components Overview
The workflow utilizes the following components:
- Chat Input: Captures user input from the chat interface, including text, files, and metadata.
- OpenAI Embeddings: Generates embeddings for text using an OpenAI model.
- Chroma DB (Chroma VectorStoreComponent): A vector database that stores document embeddings and performs similarity searches. Two instances are used: one for retrieving context at query time and one for ingesting the document chunks.
- Split Text: Divides a large input text file into smaller, manageable chunks.
- Parse Data: Formats data into a textual representation suitable for prompt engineering.
- Prompt: Constructs the prompt for the language model, combining the user's query and retrieved context.
- OpenAI Model: An OpenAI language model that generates text responses based on the provided prompt.
- Chat Output: Displays the AI's response in the chat interface.
- File: Loads a text file for processing and embedding.
Detailed Component Descriptions
Chat Input
- Description: Retrieves chat inputs (text, files, metadata) from the user interface.
- Input Parameters: Text, files, conversation ID, sender type, sender name, session ID, should store message (boolean).
- Output Parameters: Message object containing the user's input.
- Key Configurations/Conditions: Requires at least the input text or files to be provided.
OpenAI Embeddings (Two Instances)
- Description: Generates embeddings for text data using specified OpenAI models.
- Input Parameters: Text data, model name, chunk size (for long text input).
- Output Parameters: Embeddings representing the input text.
- Key Configurations/Conditions: Requires an OpenAI API key and specification of the embedding model.
Chroma DB (Two Instances)
- Description: Stores and searches vector embeddings.
- Input Parameters: Embeddings, search query, number of results, collection name, persist directory.
- Output Parameters: Search results (Data) containing relevant context from the stored embeddings.
- Key Configurations/Conditions: Requires a collection name and optionally a persist directory for saving the database.
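Chroma handles indexing and nearest-neighbor search internally; the core idea of a similarity search can be illustrated with a plain cosine-similarity ranking over stored vectors (a simplified stand-in for illustration, not Chroma's actual implementation; the toy 3-dimensional embeddings are invented, real OpenAI embeddings have 1,536+ dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(store, query_embedding, n_results=2):
    # store: list of (chunk_text, embedding) pairs, as produced at ingestion time.
    # Rank every stored chunk by similarity to the query and return the top n.
    ranked = sorted(store, key=lambda item: cosine_similarity(item[1], query_embedding),
                    reverse=True)
    return [text for text, _ in ranked[:n_results]]

# Toy embeddings: similar texts get similar vectors.
store = [
    ("Chroma stores embeddings.", [0.9, 0.1, 0.0]),
    ("Bananas are yellow.",       [0.0, 0.2, 0.9]),
    ("Vectors enable search.",    [0.8, 0.3, 0.1]),
]
print(search(store, [1.0, 0.2, 0.0], n_results=2))
# → ['Chroma stores embeddings.', 'Vectors enable search.']
```

The number of results returned here corresponds to the component's "number of results" input parameter.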
Split Text
- Description: Splits a large text input into smaller chunks of specified size with optional overlap.
- Input Parameters: Text data, chunk size, chunk overlap, separator.
- Output Parameters: Chunks of text as Data object.
- Key Configurations/Conditions: The chunk_size and chunk_overlap parameters control the splitting process.
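The effect of chunk_size and chunk_overlap can be sketched with a character-based splitter (an illustrative simplification; the actual component may split on separators such as newlines first):

```python
def split_text(text, chunk_size=20, chunk_overlap=5):
    # Slide a window of chunk_size characters, stepping forward by
    # chunk_size - chunk_overlap so adjacent chunks share context.
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("The quick brown fox jumps over the lazy dog",
                    chunk_size=20, chunk_overlap=5)
print(chunks)
# Each chunk repeats the last 5 characters of the previous one,
# so a sentence cut at a chunk boundary is still partially recoverable.
```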
Parse Data
- Description: Converts Data objects into plain text based on a template.
- Input Parameters: Data, template string, separator.
- Output Parameters: Text representation of the Data.
- Key Configurations/Conditions: The template string specifies how the data is formatted; it should include placeholders for data fields.
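The formatting step amounts to rendering each record through the template and joining the results; a minimal sketch (the field names `text` and `source` are illustrative assumptions, not the component's fixed schema):

```python
def parse_data(records, template="{text}", sep="\n\n"):
    # Render each record (a dict of fields) through the template,
    # then join the rendered strings with the separator.
    return sep.join(template.format(**record) for record in records)

records = [
    {"text": "Chunk one.", "source": "doc.txt"},  # illustrative fields
    {"text": "Chunk two.", "source": "doc.txt"},
]
print(parse_data(records, template="{source}: {text}", sep="\n"))
# → doc.txt: Chunk one.
#   doc.txt: Chunk two.
```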
Prompt
- Description: Creates a prompt for the OpenAI model by combining the context and question.
- Input Parameters: Context (retrieved from Chroma DB), question (user query).
- Output Parameters: A formatted prompt string.
- Key Configurations/Conditions: Uses a template to structure the prompt; the template includes placeholders for context and question.
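Prompt assembly is a template fill with the retrieved context and the user's question; a minimal sketch (the template wording is an assumption for illustration):

```python
# Illustrative template; the actual wording is configured in the component.
PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context, question):
    # Substitute the retrieved context and user query into the template.
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt("Chroma stores embeddings.", "What does Chroma store?")
print(prompt)
```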
OpenAI Model
- Description: Generates text using an OpenAI language model.
- Input Parameters: Prompt, model name, temperature, maximum tokens.
- Output Parameters: Generated text.
- Key Configurations/Conditions: Requires an OpenAI API key and selection of a model.
Chat Output
- Description: Displays a chat message in the user interface.
- Input Parameters: Text message, sender type, sender name, session ID, conversation ID.
- Output Parameters: Displays the message in the UI; no explicit output parameters.
- Key Configurations/Conditions: Sender type and name determine how the message is displayed.
File
- Description: Loads a file from a specified path.
- Input Parameters: File path.
- Output Parameters: Loaded file data as a Data object.
- Key Configurations/Conditions: Supports various file types (listed in the JSON).
Workflow Execution
The workflow has two paths: an ingestion path that indexes the document, and a query path that answers chat messages. Ingestion must complete before queries can retrieve meaningful context.

Ingestion path:
- File loads the document to be indexed.
- Split Text processes the document, creating chunks.
- OpenAI Embeddings (2) creates embeddings for each chunk.
- Chroma DB (2) ingests the chunks and their embeddings.

Query path:
- Chat Input receives user input.
- Chat Input's output feeds into Prompt as the question, and also into Chroma DB (1), which retrieves relevant context from the indexed chunks.
- Parse Data formats the results from Chroma DB (1) and feeds them into Prompt as the context.
- Prompt combines the context and question into a prompt string.
- OpenAI Model generates a response from the prompt.
- Chat Output displays the model's response in the chat interface.
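The steps above can be sketched end-to-end in one place. Here a trivial bag-of-words vector stands in for OpenAI embeddings and the model call is omitted, purely to show how the ingestion and query paths connect (the vocabulary and texts are invented for illustration):

```python
import math

VOCAB = ["chroma", "embeddings", "search", "banana"]  # toy vocabulary

def embed(text):
    # Stand-in for OpenAI Embeddings: count occurrences of toy-vocabulary words.
    words = [w.strip(".,?").lower() for w in text.split()]
    return [float(words.count(v)) for v in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Ingestion path: split (pre-chunked here), embed, store.
chunks = ["Chroma stores embeddings.", "Bananas are fruit.",
          "Vector search finds context."]
store = [(c, embed(c)) for c in chunks]

# Query path: embed the query, retrieve the best chunk, build the prompt.
question = "What stores embeddings?"
q_vec = embed(question)
context = max(store, key=lambda item: cosine(item[1], q_vec))[0]
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
# The OpenAI Model call would go here; we just show the assembled prompt.
print(prompt)
```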
Additional Notes
Successful execution requires a valid OpenAI API key and proper configuration of the Chroma database. Performance depends on the size of the document being processed and on the latency of the OpenAI API and the Chroma database; choose chunk sizes carefully and refine the prompt template for better results. Error handling is crucial, particularly for API requests and file loading, so consider adding error checks and fallback mechanisms.