Chat with your documents
Ask the AI questions about a document as if you were having a conversation with it.
graph TD
%%{init: {'theme': 'mc','layout': 'elk'}}%%
ChatOutput-s3yx9[<div><img src="/_astro/messages-square.BaSDmT6g.svg" style="height: 20px !important;width: 20px !important"/></div>Chat Output]
style ChatOutput-s3yx9 stroke:#a170ff
ChatInput-nbeu2[<div><img src="/_astro/messages-square.BaSDmT6g.svg" style="height: 20px !important;width: 20px !important"/></div>Question]
style ChatInput-nbeu2 stroke:#a170ff
GDriveFilesComponent-ield1[<div><img src="/_astro/google_drive.wKmDsV2c.svg" style="height: 20px !important;width: 20px !important"/></div>Get document]
style GDriveFilesComponent-ield1 stroke:#a170ff
Prompt-bqb7h[<div><img src="/_astro/square-terminal.BMOXc-nZ.svg" style="height: 20px !important;width: 20px !important"/></div>Instructions]
style Prompt-bqb7h stroke:#a170ff
ParseData-u86r4[<div><img src="/_astro/braces.Djq0PW4_.svg" style="height: 20px !important;width: 20px !important"/></div>Get text]
style ParseData-u86r4 stroke:#a170ff
Chroma-rr0og[<div><img src="/_astro/chroma.CDTUBZSx.svg" style="height: 20px !important;width: 20px !important"/></div>Upload to DB]
style Chroma-rr0og stroke:#a170ff
OpenAIEmbeddings-8lgaa[<div><img src="/_astro/openAI.BhmuxEs3.svg" style="height: 20px !important;width: 20px !important"/></div>OpenAI Embeddings]
style OpenAIEmbeddings-8lgaa stroke:#a170ff
LanguageRecursiveTextSplitter-60k4u[Text splitter]
style LanguageRecursiveTextSplitter-60k4u stroke:#a170ff
Chroma-v9w2i[<div><img src="/_astro/chroma.CDTUBZSx.svg" style="height: 20px !important;width: 20px !important"/></div>Retrieve from DB]
style Chroma-v9w2i stroke:#a170ff
OpenAIEmbeddings-3cp2j[<div><img src="/_astro/openAI.BhmuxEs3.svg" style="height: 20px !important;width: 20px !important"/></div>OpenAI Embeddings2]
style OpenAIEmbeddings-3cp2j stroke:#a170ff
OpenAIModel-h9hjf[<div><img src="/_astro/openAI.BhmuxEs3.svg" style="height: 20px !important;width: 20px !important"/></div>OpenAI]
style OpenAIModel-h9hjf stroke:#a170ff
ParseData-u86r4 -.- Prompt-bqb7h
linkStyle 0 stroke:#a170ff
ChatInput-nbeu2 -.- Prompt-bqb7h
linkStyle 1 stroke:#a170ff
GDriveFilesComponent-ield1 -.- LanguageRecursiveTextSplitter-60k4u
linkStyle 2 stroke:#a170ff
LanguageRecursiveTextSplitter-60k4u -.- Chroma-rr0og
linkStyle 3 stroke:#a170ff
OpenAIEmbeddings-8lgaa -.- Chroma-rr0og
linkStyle 4 stroke:#a170ff
Chroma-v9w2i -.- ParseData-u86r4
linkStyle 5 stroke:#a170ff
OpenAIEmbeddings-3cp2j -.- Chroma-v9w2i
linkStyle 6 stroke:#a170ff
Prompt-bqb7h -.- OpenAIModel-h9hjf
linkStyle 7 stroke:#a170ff
OpenAIModel-h9hjf -.- ChatOutput-s3yx9
linkStyle 8 stroke:#a170ff
Chat with Your Documents
🧩 Overview
The workflow allows a user to ask natural‑language questions about documents stored in Google Drive. It automatically retrieves the relevant document, splits and embeds the text, stores the embeddings in a vector database, and then uses an OpenAI language model to answer the question in a conversational style. The result is presented as a chat message, providing a seamless, AI‑powered Q&A experience over existing documents.
⚙️ Main Features
- Document retrieval from Google Drive using a user‑selected file.
- Text chunking that respects language structure and avoids breaking sentences.
- Embedding generation with OpenAI models for semantic indexing.
- Vector store integration (Chroma) for fast similarity search.
- Dynamic prompt construction that injects relevant document excerpts.
- LLM response generation via OpenAI, outputting natural language answers.
- Chat‑style output that displays the answer as a message in the playground.
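
To make the chunking feature concrete, here is a minimal sketch of sentence-aware splitting. It is a simplified stand-in, not the algorithm used by the workflow's text splitter component; the function name `split_into_chunks` and the `max_chars` parameter are hypothetical.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    Illustrative assumption: a sentence ends at '.', '!' or '?' followed by
    whitespace. A single sentence longer than max_chars is kept intact as
    its own oversized chunk rather than being cut.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("One. Two. Three.", max_chars=9))
```

Packing whole sentences keeps each chunk readable on its own, which tends to help when excerpts are later pasted into a prompt.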
🔄 Workflow Steps
| Component Name | Role in the Workflow | Key Inputs | Key Outputs |
|---|---|---|---|
| Chat Input | Captures the user’s question. | Text of the question. | Message containing the question. |
| Google Drive File Retrieval | Loads the selected document from Drive. | File ID or selection. | Raw document data (binary/text). |
| Text Splitter | Divides the document into language‑aware chunks. | Raw document data. | List of text chunks. |
| OpenAI Embeddings | Converts text chunks into dense vectors. | List of text chunks. | One embedding vector per chunk. |
| Chroma Vector Store (Add) | Stores embeddings for later retrieval. | Text chunks and their embedding vectors. | Persisted vector store entries. |
| Chroma Vector Store (Search) | Finds the most relevant chunks for the user query. | User query, vector store. | Subset of document chunks (relevant excerpts). |
| Parse Data | Converts retrieved chunks into plain text. | Relevant document chunks. | Concatenated text excerpt. |
| Prompt Builder | Creates a prompt that includes the excerpt and the question. | Extracted text, user question. | Prompt message ready for the LLM. |
| OpenAI Model | Generates an answer to the prompt. | Prompt message. | Generated text response. |
| Chat Output | Presents the answer in the playground as a chat message. | Generated text. | Chat message displayed to the user. |
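
The search, parse, and prompt steps in the table can be sketched in plain Python. This is an illustration only: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAI Embeddings, the in-memory `index` list stands in for the Chroma store, and the prompt wording is an assumption rather than the template the workflow actually uses.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    # Hypothetical stand-in for an embedding model: lowercase word counts.
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query and keep the top k.
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(excerpts: list[str], question: str) -> str:
    # Join the retrieved excerpts and inject them alongside the question.
    context = "\n".join(excerpts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


chunks = [
    "Invoices are due within 30 days.",
    "The office is closed on Sundays.",
    "Late invoices incur a 2% fee.",
]
index = [(c, toy_embed(c)) for c in chunks]  # the "Add" step
question = "When are invoices due?"
prompt = build_prompt(search(index, question, k=1), question)
print(prompt)
```

In the real workflow the resulting prompt would be sent to the OpenAI model; here it is only printed so the data flow between the steps is visible.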
🧠 Notes
- The workflow requires valid Google Drive credentials and an OpenAI API key.
- The vector store persists data under the directory `chat_wiht_documents`; ensure this path is writable.
- Only the most recent 100 documents are indexed when using the “All Files in Drive” mode to avoid exceeding API limits.
- The embedding operation is limited to 10,000 tokens per request to comply with OpenAI’s token restrictions.
- The similarity search uses a cosine similarity threshold of 0.1; this can be tuned for stricter or looser matching.
- The OpenAI model defaults to `gpt-4.1`; users may switch to other models via the `model_name` parameter, which may affect token limits and cost.
- All components operate in batch mode where possible, improving throughput for large documents.
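
The token limit and similarity threshold mentioned in the notes can be illustrated with a short sketch. The helper names are hypothetical, the whitespace word count is only a rough proxy for a real tokenizer, and treating the threshold as inclusive (`>=`) is an assumption; the workflow's components may behave differently.

```python
from typing import Callable


def batch_by_token_budget(
    chunks: list[str],
    max_tokens: int = 10_000,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> list[list[str]]:
    """Group chunks so each batch stays within max_tokens.

    count_tokens defaults to a crude whitespace word count (an assumption);
    swap in a real tokenizer for accurate budgeting.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for chunk in chunks:
        n = count_tokens(chunk)
        if current and used + n > max_tokens:
            # The next chunk would exceed the budget: start a new batch.
            batches.append(current)
            current, used = [], 0
        current.append(chunk)
        used += n
    if current:
        batches.append(current)
    return batches


def filter_by_threshold(
    scored: list[tuple[str, float]], threshold: float = 0.1
) -> list[str]:
    # Keep only chunks whose similarity score meets the threshold.
    return [chunk for chunk, score in scored if score >= threshold]


print(batch_by_token_budget(["a b c", "d e", "f g h i"], max_tokens=5))
print(filter_by_threshold([("x", 0.6), ("y", 0.05), ("z", 0.1)]))
```

Raising the threshold returns fewer but more closely matching excerpts; lowering it admits looser matches at the cost of a longer prompt.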