Vector Store RAG Workflow Documentation
Overview
This workflow implements a Retrieval-Augmented Generation (RAG) system using a Chroma vector database. It takes user chat input, retrieves relevant context from a document via vector similarity search, generates a response with an OpenAI language model, and displays the response in a chat interface. The system supports both single-turn and multi-turn conversations by leveraging the conversation history. To index a long document, the workflow splits it into chunks, embeds each chunk, and stores the embeddings in the Chroma database, enabling efficient search over large documents.
Components Overview
The workflow utilizes the following components:
- Chat Input: Captures user input from the chat interface, including text, files, and metadata.
- OpenAI Embeddings: Generates embeddings for text using an OpenAI model.
- Chroma DB (Chroma VectorStoreComponent): A vector database that stores document embeddings and performs similarity searches. Two instances are used: one for retrieving context at query time and one for ingesting the document chunks.
- Split Text: Divides a large input text file into smaller, manageable chunks.
- Parse Data: Formats data into a textual representation suitable for prompt engineering.
- Prompt: Constructs the prompt for the language model, combining the user's query and retrieved context.
- OpenAI Model: An OpenAI language model that generates text responses based on the provided prompt.
- Chat Output: Displays the AI's response in the chat interface.
- File: Loads a text file for processing and embedding.
Detailed Component Descriptions
Chat Input
- Description: Retrieves chat inputs (text, files, metadata) from the user interface.
- Input Parameters: Text, files, conversation ID, sender type, sender name, session ID, should store message (boolean).
- Output Parameters: Message object containing the user's input.
- Key Configurations/Conditions: Requires at least the input text or files to be provided.
OpenAI Embeddings (Two Instances)
- Description: Generates embeddings for text data using specified OpenAI models.
- Input Parameters: Text data, model name, chunk size (for long text input).
- Output Parameters: Embeddings representing the input text.
- Key Configurations/Conditions: Requires an OpenAI API key and specification of the embedding model.
Chroma DB (Two Instances)
- Description: Stores and searches vector embeddings.
- Input Parameters: Embeddings, search query, number of results, collection name, persist directory.
- Output Parameters: Search results (Data) containing relevant context from the stored embeddings.
- Key Configurations/Conditions: Requires a collection name and optionally a persist directory for saving the database.
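Chroma handles indexing and nearest-neighbor search internally; the core idea of a similarity search can be illustrated with a plain cosine-similarity ranking over stored vectors (a simplified stand-in for illustration, not Chroma's actual implementation; the toy 3-dimensional embeddings are invented, real OpenAI embeddings have 1,536+ dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(store, query_embedding, n_results=2):
    # store: list of (chunk_text, embedding) pairs, as produced at ingestion time.
    # Rank every stored chunk by similarity to the query and return the top n.
    ranked = sorted(store, key=lambda item: cosine_similarity(item[1], query_embedding),
                    reverse=True)
    return [text for text, _ in ranked[:n_results]]

# Toy embeddings: similar texts get similar vectors.
store = [
    ("Chroma stores embeddings.", [0.9, 0.1, 0.0]),
    ("Bananas are yellow.",       [0.0, 0.2, 0.9]),
    ("Vectors enable search.",    [0.8, 0.3, 0.1]),
]
print(search(store, [1.0, 0.2, 0.0], n_results=2))
# → ['Chroma stores embeddings.', 'Vectors enable search.']
```

The number of results returned here corresponds to the component's "number of results" input parameter.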
Split Text
- Description: Splits a large text input into smaller chunks of specified size with optional overlap.
- Input Parameters: Text data, chunk size, chunk overlap, separator.
- Output Parameters: Chunks of text as Data object.
- Key Configurations/Conditions: The chunk_size and chunk_overlap parameters control the splitting process.
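The effect of chunk_size and chunk_overlap can be sketched with a character-based splitter (an illustrative simplification; the actual component may split on separators such as newlines first):

```python
def split_text(text, chunk_size=20, chunk_overlap=5):
    # Slide a window of chunk_size characters, stepping forward by
    # chunk_size - chunk_overlap so adjacent chunks share context.
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("The quick brown fox jumps over the lazy dog",
                    chunk_size=20, chunk_overlap=5)
print(chunks)
# Each chunk repeats the last 5 characters of the previous one,
# so a sentence cut at a chunk boundary is still partially recoverable.
```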
Parse Data
- Description: Converts Data objects into plain text based on a template.
- Input Parameters: Data, template string, separator.
- Output Parameters: Text representation of the Data.
- Key Configurations/Conditions: The template string specifies how the data is formatted; it should include placeholders for data fields.
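The formatting step amounts to rendering each record through the template and joining the results; a minimal sketch (the field names `text` and `source` are illustrative assumptions, not the component's fixed schema):

```python
def parse_data(records, template="{text}", sep="\n\n"):
    # Render each record (a dict of fields) through the template,
    # then join the rendered strings with the separator.
    return sep.join(template.format(**record) for record in records)

records = [
    {"text": "Chunk one.", "source": "doc.txt"},  # illustrative fields
    {"text": "Chunk two.", "source": "doc.txt"},
]
print(parse_data(records, template="{source}: {text}", sep="\n"))
# → doc.txt: Chunk one.
#   doc.txt: Chunk two.
```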
Prompt
- Description: Creates a prompt for the OpenAI model by combining the context and question.
- Input Parameters: Context (retrieved from Chroma DB), question (user query).
- Output Parameters: A formatted prompt string.
- Key Configurations/Conditions: Uses a template to structure the prompt; the template includes placeholders for context and question.
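Prompt assembly is a template fill with the retrieved context and the user's question; a minimal sketch (the template wording is an assumption for illustration):

```python
# Illustrative template; the actual wording is configured in the component.
PROMPT_TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context, question):
    # Substitute the retrieved context and user query into the template.
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_prompt("Chroma stores embeddings.", "What does Chroma store?")
print(prompt)
```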
OpenAI Model
- Description: Generates text using an OpenAI language model.
- Input Parameters: Prompt, model name, temperature, maximum tokens.
- Output Parameters: Generated text.
- Key Configurations/Conditions: Requires an OpenAI API key and selection of a model.
Chat Output
- Description: Displays a chat message in the user interface.
- Input Parameters: Text message, sender type, sender name, session ID, conversation ID.
- Output Parameters: Displays the message in the UI; no explicit output parameters.
- Key Configurations/Conditions: Sender type and name determine how the message is displayed.
File
- Description: Loads a file from a specified path.
- Input Parameters: File path.
- Output Parameters: Loaded file data as a Data object.
- Key Configurations/Conditions: Supports various file types (listed in the JSON).
Workflow Execution
The workflow has two paths: an ingestion path that indexes the document, and a query path that answers chat messages. Ingestion must complete before queries can retrieve meaningful context.

Ingestion path:
- File loads the document to be indexed.
- Split Text processes the document, creating chunks.
- OpenAI Embeddings (2) creates embeddings for each chunk.
- Chroma DB (2) ingests the chunks and their embeddings.

Query path:
- Chat Input receives user input.
- Chat Input's output feeds into Prompt as the question, and also into Chroma DB (1), which retrieves relevant context from the indexed chunks.
- Parse Data formats the results from Chroma DB (1) and feeds them into Prompt as the context.
- Prompt combines the context and question into a prompt string.
- OpenAI Model generates a response from the prompt.
- Chat Output displays the model's response in the chat interface.
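The steps above can be sketched end-to-end in one place. Here a trivial bag-of-words vector stands in for OpenAI embeddings and the model call is omitted, purely to show how the ingestion and query paths connect (the vocabulary and texts are invented for illustration):

```python
import math

VOCAB = ["chroma", "embeddings", "search", "banana"]  # toy vocabulary

def embed(text):
    # Stand-in for OpenAI Embeddings: count occurrences of toy-vocabulary words.
    words = [w.strip(".,?").lower() for w in text.split()]
    return [float(words.count(v)) for v in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Ingestion path: split (pre-chunked here), embed, store.
chunks = ["Chroma stores embeddings.", "Bananas are fruit.",
          "Vector search finds context."]
store = [(c, embed(c)) for c in chunks]

# Query path: embed the query, retrieve the best chunk, build the prompt.
question = "What stores embeddings?"
q_vec = embed(question)
context = max(store, key=lambda item: cosine(item[1], q_vec))[0]
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
# The OpenAI Model call would go here; we just show the assembled prompt.
print(prompt)
```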
Additional Notes
Successful execution requires a valid OpenAI API key and proper configuration of the Chroma database. Performance depends on the size of the document being processed and on the latency of the OpenAI API and the Chroma database; choose chunk sizes carefully and refine the prompt template for better results. Error handling is crucial, particularly for API requests and file loading, so consider adding error checks and fallback mechanisms.