Invoice Data Extractor
This flow automates the process of extracting data from unstructured invoices.
graph TD
%%{init: {'theme': 'mc','layout': 'elk'}}%%
ParseData-8kr6e[Parse Data]
style ParseData-8kr6e stroke:#a170ff
Prompt-0gsjq[Information Extractor]
style Prompt-0gsjq stroke:#a170ff
OpenAIModel-m3qyl[OpenAI]
style OpenAIModel-m3qyl stroke:#a170ff
ChatOutput-jbpok[Chat Output]
style ChatOutput-jbpok stroke:#a170ff
GDriveFilesComponent-ls7bc[Google Drive File Manager]
style GDriveFilesComponent-ls7bc stroke:#a170ff
ParseData-8kr6e -.- Prompt-0gsjq
linkStyle 0 stroke:#a170ff
Prompt-0gsjq -.- OpenAIModel-m3qyl
linkStyle 1 stroke:#a170ff
OpenAIModel-m3qyl -.- ChatOutput-jbpok
linkStyle 2 stroke:#a170ff
GDriveFilesComponent-ls7bc -.- ParseData-8kr6e
linkStyle 3 stroke:#a170ff
🧩 Overview
The workflow automates the extraction of structured data from unstructured invoice files stored in Google Drive. It converts the file into plain text, generates a targeted extraction prompt, queries an OpenAI model, and delivers the extracted information in a chat‑friendly format.
⚙️ Main Features
- Retrieves an invoice file from Google Drive based on user‑specified parameters.
- Parses the file contents into a single text string for downstream processing.
- Builds a dynamic prompt that lists the specific fields to extract from the invoice.
- Sends the prompt to an OpenAI language model and obtains the extraction results.
- Formats the model output as a chat message for easy review.
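The dynamic prompt described above can be sketched in Python. This is a minimal illustration, not the flow's actual prompt template: the helper `build_extraction_prompt`, the field names in `FIELDS`, and the sample invoice text are all hypothetical, and the instruction to return JSON is an assumption about how one might phrase the prompt.

```python
def build_extraction_prompt(invoice_text: str, fields: list[str]) -> str:
    """Build a prompt that lists each field to extract (hypothetical helper)."""
    # One bullet per requested field, so the model sees an explicit list.
    field_lines = "\n".join(f"- {name}" for name in fields)
    return (
        "Extract the following fields from the invoice text below.\n"
        "Return one JSON object with exactly these keys, using null for any "
        "value that is not present.\n\n"
        f"Fields:\n{field_lines}\n\n"
        f"Invoice text:\n{invoice_text}"
    )


# Illustrative field names and invoice text, not taken from the flow itself.
FIELDS = ["invoice_number", "invoice_date", "vendor_name", "total_amount"]
sample_text = "ACME Corp\nInvoice INV-001\nDate: 2024-01-15\nTotal due: 120.00 USD"

prompt = build_extraction_prompt(sample_text, FIELDS)
print(prompt)
```

Keeping the field list as data rather than hard-coding it in the prompt string makes it easier to add or remove target fields without rewording the instructions.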
🔄 Workflow Steps
| Component Name | Role in the Workflow | Key Inputs | Key Outputs |
|---|---|---|---|
| Google Drive File Manager | Accesses the invoice file in Google Drive and returns its content. | File ID or selection, operation mode (e.g., Get), credentials | Data (file content) |
| Parse Data | Converts the raw file data into plain text using a user‑defined template. | Data (file content) | Text (invoice data as a string) |
| Prompt Component | Creates a detailed extraction prompt that enumerates the desired invoice fields. | Invoice data (text) | Prompt Message (text) |
| OpenAI Model | Generates the extraction results by evaluating the prompt with the selected model. | Prompt Message | Text (model response containing extracted fields) |
| Chat Output | Presents the model’s response as a chat‑style message. | Model response (text) | Chat Message (displayed output) |
Note: The Label Component is only used to display the workflow description and does not participate in data processing.
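The data path in the table can be approximated with plain Python functions. The sketch below is a rough model of the hand-offs between components, not the components' real implementation: `parse_data`, `run_flow`, `fake_fetch`, and `fake_model` are hypothetical names, the `"{text}"` default template is an assumption, and the Google Drive and OpenAI calls are replaced by stand-ins so the example runs offline.

```python
from typing import Callable


def parse_data(data: dict, template: str = "{text}") -> str:
    """Render file data to one string; the '{text}' default is an assumption."""
    return template.format(**data)


def run_flow(
    fetch_file: Callable[[], dict],
    call_model: Callable[[str], str],
    fields: list[str],
) -> dict:
    """Chain the steps: file -> text -> prompt -> model -> chat message."""
    data = fetch_file()                      # Google Drive File Manager
    invoice_text = parse_data(data)          # Parse Data
    prompt = (                               # Prompt Component
        "Extract these fields: " + ", ".join(fields) + "\n\n" + invoice_text
    )
    response = call_model(prompt)            # OpenAI Model
    return {"sender": "AI", "text": response}  # Chat Output


# Stand-ins for the external services (hypothetical, for illustration only).
def fake_fetch() -> dict:
    return {"text": "Invoice INV-001, total 120.00 USD", "file_name": "inv.pdf"}


def fake_model(prompt: str) -> str:
    return f"(model reply to a {len(prompt)}-character prompt)"


message = run_flow(fake_fetch, fake_model, ["invoice_number", "total_amount"])
print(message["text"])
```

Passing the fetch and model steps in as callables mirrors how each component only depends on the output of the one before it, and lets the chain be exercised without credentials.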
🧠 Notes
- The Google Drive component requires appropriate OAuth credentials, and the specified file must be accessible with the selected operation mode.
- Parse Data relies on a template; the default template returns the text unchanged, but it can be customized to filter or reformat the input.
- The Prompt Component must include all target fields; the OpenAI model will only extract what is explicitly requested in the prompt.
- The workflow uses the gpt-4o model by default, but the model name can be changed to suit cost or performance requirements.
- Model output should be reviewed for accuracy; the workflow assumes the prompt is phrased to return results in a clear, structured format.
- Chat Output presents the final text but does not parse or validate the data; downstream systems can ingest the message content if needed.
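Because Chat Output does not parse or validate the data, a downstream consumer may want to check the message itself. The following is a minimal sketch under the assumption that the prompt asks the model for a JSON object; the function `parse_extraction`, the field names, and the sample reply are hypothetical and not part of the flow.

```python
import json


def parse_extraction(message_text: str, required_fields: list[str]) -> dict:
    """Pull a JSON object out of a chat message and check required keys."""
    # Models often wrap JSON in extra prose, so take the outermost braces.
    start = message_text.find("{")
    end = message_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both malformed and incomplete output.
    record = json.loads(message_text[start:end + 1])
    missing = [name for name in required_fields if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record


# Illustrative model reply, not real output from the flow.
reply = 'Here is the result: {"invoice_number": "INV-001", "total_amount": 120.0}'
record = parse_extraction(reply, ["invoice_number", "total_amount"])
print(record["invoice_number"], record["total_amount"])
```

A check like this only confirms that the expected keys are present; it does not confirm that the extracted values match the invoice, so manual review of the output still applies.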