Extractor de datos de facturas
Este flujo automatiza el proceso de extracción de datos de facturas no estructuradas
+2
graph TD
%%{init: {'theme': 'mc','layout': 'elk'}}%%
ParseData-8kr6e[<div><img src="/_astro/braces.Djq0PW4_.svg" style="height: 20px !important;width: 20px !important"/></div>Parse Data]
style ParseData-8kr6e stroke:#a170ff
Prompt-0gsjq[<div><img src="/_astro/square-terminal.BMOXc-nZ.svg" style="height: 20px !important;width: 20px !important"/></div>Extractor de Informacion]
style Prompt-0gsjq stroke:#a170ff
OpenAIModel-m3qyl[<div><img src="/_astro/openAI.BhmuxEs3.svg" style="height: 20px !important;width: 20px !important"/></div>OpenAI]
style OpenAIModel-m3qyl stroke:#a170ff
TextInput-84vxn[<div><img src="/_astro/type.Dy26vmDy.svg" style="height: 20px !important;width: 20px !important"/></div>Text Input]
style TextInput-84vxn stroke:#a170ff
GDriveFilesComponent-7oth8[<div><img src="/_astro/google_drive.wKmDsV2c.svg" style="height: 20px !important;width: 20px !important"/></div>Drive File Manager]
style GDriveFilesComponent-7oth8 stroke:#a170ff
TextOutput-0lnck[<div><img src="/_astro/type.Dy26vmDy.svg" style="height: 20px !important;width: 20px !important"/></div>Text Output]
style TextOutput-0lnck stroke:#a170ff
ParseData-8kr6e -.- Prompt-0gsjq
linkStyle 0 stroke:#a170ff
Prompt-0gsjq -.- OpenAIModel-m3qyl
linkStyle 1 stroke:#a170ff
GDriveFilesComponent-7oth8 -.- ParseData-8kr6e
linkStyle 2 stroke:#a170ff
TextInput-84vxn -.- GDriveFilesComponent-7oth8
linkStyle 3 stroke:#a170ff
OpenAIModel-m3qyl -.- TextOutput-0lnck
linkStyle 4 stroke:#a170ff
Invoice Data Extractor
🧩 Overview
This workflow automates the extraction of structured data from unstructured invoice documents. It ingests an invoice file from a source like Google Drive, processes its content, and uses a large language model to accurately identify and output key fields such as invoice numbers, dates, sender and recipient details, and financial totals. This process transforms raw document data into a clean, structured format suitable for further analysis or record-keeping.
⚙️ Main Features
- Automatically retrieves invoice files from a specified Google Drive folder.
- Converts raw file data into plain text for processing.
- Uses a detailed, structured prompt to guide an AI model in extracting specific invoice fields.
- Outputs the extracted, structured data in a clear, readable format.
🔄 Workflow Steps
| Component Name | Role in the Workflow | Key Inputs | Key Outputs |
|---|---|---|---|
| Text Input | Provides the URL of the Google Drive folder containing the invoice file. | Folder URL | Folder URL |
| Drive File Manager | Retrieves the invoice file from the specified Google Drive folder. | Folder URL | Raw File Data |
| Parse Data | Converts the raw file data into plain text for the AI model to read. | Raw File Data | Invoice Text Data |
| Information Extractor (Prompt) | Constructs a detailed instruction for the AI model, specifying which data points to extract from the invoice text. | Invoice Text Data | Structured Extraction Prompt |
| OpenAI Model | Analyzes the invoice text using the provided prompt and extracts the requested structured data. | Structured Extraction Prompt | Extracted Invoice Data |
| Text Output | Displays the final, structured invoice data extracted by the AI model. | Extracted Invoice Data | Final Structured Output |
🧠 Notes
- The workflow is designed to handle unstructured invoice documents, such as PDFs or images, by first converting them to text.
- The accuracy of data extraction depends on the quality of the source document and the clarity of the text conversion.
- A valid OpenAI API key and Google Drive credentials are required for the respective components to function.
- The model is configured for deterministic output with a low temperature to ensure consistent extraction results.