Document Insights
In this flow, a source document is ingested and an AI agent analyzes its content. Based on the key topics and concepts it identifies, the agent uses a web search tool to find relevant external resources.
```mermaid
graph TD
%%{init: {'theme': 'mc','layout': 'elk'}}%%
ChatOutput-bqka0[<div><img src="/_astro/messages-square.BaSDmT6g.svg" style="height: 20px !important;width: 20px !important"/></div>Chat Output]
style ChatOutput-bqka0 stroke:#a170ff
ParseData-sqlix[<div><img src="/_astro/braces.Djq0PW4_.svg" style="height: 20px !important;width: 20px !important"/></div>Obtener Texto]
style ParseData-sqlix stroke:#a170ff
SearXng-5volz[Web Search SearXng]
style SearXng-5volz stroke:#a170ff
OpenAIModel-nkq0m[<div><img src="/_astro/openAI.BhmuxEs3.svg" style="height: 20px !important;width: 20px !important"/></div>OpenAI]
style OpenAIModel-nkq0m stroke:#a170ff
LanggraphReactAgent-dtv48[Agent]
style LanggraphReactAgent-dtv48 stroke:#a170ff
GDriveFilesComponent-f0or9[<div><img src="/_astro/google_drive.wKmDsV2c.svg" style="height: 20px !important;width: 20px !important"/></div>Obtener Documento]
style GDriveFilesComponent-f0or9 stroke:#a170ff
ParseData-sqlix -.- LanggraphReactAgent-dtv48
linkStyle 0 stroke:#a170ff
LanggraphReactAgent-dtv48 -.- ChatOutput-bqka0
linkStyle 1 stroke:#a170ff
SearXng-5volz -.- LanggraphReactAgent-dtv48
linkStyle 2 stroke:#a170ff
OpenAIModel-nkq0m -.- LanggraphReactAgent-dtv48
linkStyle 3 stroke:#a170ff
GDriveFilesComponent-f0or9 -.- ParseData-sqlix
linkStyle 4 stroke:#a170ff
```
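The links in the diagram can be read as a dependency graph. The sketch below is illustrative only: it copies the five links as (source, target) pairs, drops the random ID suffixes for readability, and assumes each link points from the component that produces a value to the component that consumes it. It then derives one valid execution order with Python's standard `graphlib` module. The function name `execution_order` is hypothetical and not part of the platform.

```python
from graphlib import TopologicalSorter

# Links from the diagram as (source, target) pairs.
# Assumption: the first node of each link feeds the second one.
EDGES = [
    ("ParseData", "LanggraphReactAgent"),
    ("LanggraphReactAgent", "ChatOutput"),
    ("SearXng", "LanggraphReactAgent"),
    ("OpenAIModel", "LanggraphReactAgent"),
    ("GDriveFilesComponent", "ParseData"),
]


def execution_order(edges):
    """Return one valid run order for the nodes, given (source, target) edges."""
    sorter = TopologicalSorter()
    for source, target in edges:
        # The target can only run once its source has produced a value.
        sorter.add(target, source)
    return list(sorter.static_order())


order = execution_order(EDGES)
print(" -> ".join(order))
```

Several orders are valid because the model and the search tool have no dependencies of their own; the only guarantees are that the document is fetched before it is parsed, the agent runs after all three of its inputs, and the chat output comes last.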
Document Insights Workflow Documentation
🧩 Overview
The Document Insights workflow ingests a document from Google Drive, extracts its plain‑text content, and employs an AI agent to analyze the text.
Based on the identified topics, the agent performs a web search using a SearXNG instance and returns a list of relevant external resources in a chat‑style output. This end‑to‑end process streamlines research and knowledge‑summarization for any document.
⚙️ Main Features
- Seamless retrieval of a Google Drive document through a dedicated file‑access component.
- Automatic conversion of rich‑format files into plain text ready for analysis.
- A pre‑built LangGraph ReAct agent that interprets the text, decides what to search for, and manages the calls to external tools.
- Integration with a SearXNG search engine to gather up‑to‑date web references.
- Display of the final list of resources as a conversational chat message.
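To make the SearXNG integration more concrete, here is a minimal sketch of how a search request and its results could be handled outside the workflow. It assumes the instance exposes the standard `/search` endpoint and has JSON output enabled in its settings, which is not the default on every instance. The base URL `https://searx.example.org` and the helper names `build_search_url` and `top_links` are placeholders, not part of the workflow.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, page: int = 1) -> str:
    """Build a SearXNG search URL that asks for JSON results."""
    params = {"q": query, "format": "json", "pageno": page}
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def top_links(payload: dict, max_results: int = 30) -> list[tuple[str, str]]:
    """Extract up to max_results (title, url) pairs from a JSON response body."""
    links = [
        (item.get("title", ""), item["url"])
        for item in payload.get("results", [])
        if "url" in item  # skip malformed entries without a link
    ]
    return links[:max_results]


# Example with a hand-written payload instead of a live request.
sample = {"results": [{"title": "Example", "url": "https://example.org"}]}
print(build_search_url("https://searx.example.org", "document insights"))
print(top_links(sample))
```

The default of 30 mirrors the maximum-results setting of the Web Search component; the actual component may apply the limit differently.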
🔄 Workflow Steps
| Component Name | Role in the Workflow | Key Inputs | Key Outputs |
|---|---|---|---|
| Obtener Documento | Retrieves a specified file from Google Drive. | File to retrieve (by ID or by selection); operation set to Get. | Data – raw file content and metadata. |
| Obtener Texto | Converts the retrieved data into plain text using a template. | Data from Obtener Documento; template {text}. | Text – the extracted document text. |
| OpenAI | Provides the language model that powers the agent’s reasoning. | Model name (e.g., gpt‑4o‑mini), optional parameters (temperature, max tokens). | LanguageModel – ready‑to‑use model instance. |
| Web Search (SearXng) | Builds a search tool that queries a SearXNG instance. | Search query (the agent’s output), maximum results (30). | Tool – a callable web‑search interface. |
| Agent | Orchestrates the conversation: takes the document text, uses the LLM, and calls the search tool when needed. | input_value – the text from Obtener Texto; llm from OpenAI; tools from Web Search. | Response – a message containing a list of web links. |
| Chat Output | Presents the agent’s response in a chat‑style message. | input_value – the Response from Agent. | Message – the final chat output shown to the user. |
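The template step performed by Obtener Texto can be pictured as ordinary string formatting. The following is a rough sketch under assumptions, not the component's actual implementation: the function name `parse_data` and the `file_name` field are hypothetical, and the only detail taken from the table is the `{text}` template.

```python
def parse_data(data: dict, template: str = "{text}") -> str:
    """Render a data record into plain text by filling the template's placeholders."""
    # Fields that the template does not mention are simply ignored.
    return template.format(**data)


# A made-up record standing in for the output of Obtener Documento.
record = {"text": "Quarterly report on solar adoption.", "file_name": "report.docx"}

print(parse_data(record))
print(parse_data(record, "{file_name}: {text}"))
```

With the default template only the document body reaches the agent; a richer template could pass metadata along as well, provided the record actually contains those fields.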
🧠 Notes
- Credentials: A Google Drive credential is required for Obtener Documento; an OpenAI API key is needed for OpenAI.
- SearXNG Accessibility: The SearXNG instance must be reachable from the environment where the workflow runs; otherwise the search tool will fail.
- Iteration Limits: The agent is configured for a maximum of 50 iterations and 10 seconds per execution to prevent runaway loops.
- Fallback: No fallback LLM is connected; the workflow will halt if the primary model is unavailable.
- Data Privacy: The workflow only reads the file content and performs external searches; no data is stored persistently beyond the session unless explicitly configured.
- Extensibility: Additional tools or memory summarizers can be attached to the agent by modifying the tools input or enabling use_summarizer.
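The iteration and time limits mentioned in the notes amount to a bounded loop around the agent's reasoning steps. The sketch below shows the general idea only; it is not the agent's real control loop. The names `run_bounded` and `step` are hypothetical, and the defaults of 50 iterations and 10 seconds are taken from the note above.

```python
import time


def run_bounded(step, max_iterations=50, timeout_s=10.0, clock=time.monotonic):
    """Call step() until it returns a non-None answer or a limit is reached.

    Returns (answer, iterations_used). Raises TimeoutError when the time
    budget is spent and RuntimeError when the iteration cap is hit.
    """
    deadline = clock() + timeout_s
    for iteration in range(1, max_iterations + 1):
        if clock() > deadline:
            raise TimeoutError(f"time budget of {timeout_s}s exceeded")
        answer = step()  # one reasoning or tool-calling step
        if answer is not None:
            return answer, iteration
    raise RuntimeError(f"no final answer after {max_iterations} iterations")


# Usage: a stand-in step that produces an answer on its third call.
attempts = {"count": 0}


def fake_step():
    attempts["count"] += 1
    return "3 links found" if attempts["count"] == 3 else None


print(run_bounded(fake_step))
```

Note that this sketch checks the clock only between steps, so a single slow step can still overrun the budget; a hard cutoff would need the step itself to support cancellation.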