Business Contact Capture Flow
Searches for businesses by niche, filters for official websites, extracts key data through scraping, and automatically adds verified contacts to Google Sheets.
graph TD
%%{init: {'theme': 'mc','layout': 'elk'}}%%
SearXng-ntpo0[Web Search SearXng]
style SearXng-ntpo0 stroke:#a170ff
DeepseekModel-yd7iq[Deepseek]
style DeepseekModel-yd7iq stroke:#a170ff
CreateData-dfb3f[Create Data]
style CreateData-dfb3f stroke:#a170ff
Switch-56w06[Switch]
style Switch-56w06 stroke:#a170ff
WebScraper-p0rr5[Web Scraper]
style WebScraper-p0rr5 stroke:#a170ff
DeepseekModel-4gnpf[Deepseek2]
style DeepseekModel-4gnpf stroke:#a170ff
CreateData-p25ng[Create Data2]
style CreateData-p25ng stroke:#a170ff
TextInput-3uo52[Number of sites]
style TextInput-3uo52 stroke:#a170ff
TextInput-wcn04[Query]
style TextInput-wcn04 stroke:#a170ff
CreateData-taieq[Create Data3]
style CreateData-taieq stroke:#a170ff
Switch-44sfm[Switch2]
style Switch-44sfm stroke:#a170ff
AdvancedAgent-plvkg[Agent]
style AdvancedAgent-plvkg stroke:#a170ff
GSheetCellComponent-usi3o[Sheet Cells ]
style GSheetCellComponent-usi3o stroke:#a170ff
DeepseekModel-c77dx[Deepseek3]
style DeepseekModel-c77dx stroke:#a170ff
SearXng-ntpo0 -.- DeepseekModel-yd7iq
linkStyle 0 stroke:#a170ff
DeepseekModel-yd7iq -.- CreateData-dfb3f
linkStyle 1 stroke:#a170ff
CreateData-dfb3f -.- Switch-56w06
linkStyle 2 stroke:#a170ff
Switch-56w06 -.- WebScraper-p0rr5
linkStyle 3 stroke:#a170ff
WebScraper-p0rr5 -.- DeepseekModel-4gnpf
linkStyle 4 stroke:#a170ff
CreateData-p25ng -.- SearXng-ntpo0
linkStyle 5 stroke:#a170ff
TextInput-3uo52 -.- CreateData-p25ng
linkStyle 6 stroke:#a170ff
TextInput-wcn04 -.- CreateData-p25ng
linkStyle 7 stroke:#a170ff
DeepseekModel-4gnpf -.- CreateData-taieq
linkStyle 8 stroke:#a170ff
CreateData-taieq -.- Switch-44sfm
linkStyle 9 stroke:#a170ff
Switch-44sfm -.- AdvancedAgent-plvkg
linkStyle 10 stroke:#a170ff
GSheetCellComponent-usi3o -.- AdvancedAgent-plvkg
linkStyle 11 stroke:#a170ff
DeepseekModel-c77dx -.- AdvancedAgent-plvkg
linkStyle 12 stroke:#a170ff
🧩 Overview
This workflow automates business contact prospecting and lead generation for a specific niche. It starts from a user-defined business category, runs a web search, uses a language model to keep only official business websites, and scrapes those sites for key contact information. Validated records are then structured and inserted into a Google Sheets spreadsheet, producing a ready-to-use contact list.
⚙️ Main Features
- Initiates a web search based on a user-provided business niche and desired number of results.
- Filters search results using an AI model to identify only official business websites.
- Scrapes the content of validated websites to extract business names, descriptions, emails, and phone numbers.
- Structures the extracted data into a standardized format and routes it based on validation success.
- Automatically adds successfully extracted and validated contact records to a specified Google Sheets spreadsheet.
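The features above can be read as a five-stage pipeline. The following is a minimal sketch of that ordering, not the platform's actual implementation: `run_pipeline` and all of its callable parameters (`search`, `classify`, `scrape`, `extract`, `append_row`) are hypothetical stand-ins for the SearXng, Deepseek, Web Scraper, and Google Sheets components.

```python
from typing import Callable, Iterable

OFFICIAL = "Sí"   # classifier label for an official business site
NO_DATA = "No"    # extractor output when contact data is insufficient


def run_pipeline(
    query: str,
    max_results: int,
    search: Callable[[str, int], Iterable[dict]],
    classify: Callable[[str], str],
    scrape: Callable[[str], str],
    extract: Callable[[str], object],
    append_row: Callable[[dict], None],
) -> int:
    """Run the stages in order and return the number of rows added.

    Every callable is an assumed placeholder for one workflow component.
    """
    added = 0
    for result in search(query, max_results):
        # First filter: keep only results the classifier marks as official.
        if classify(result["title"]) != OFFICIAL:
            continue
        content = scrape(result["url"])
        contact = extract(content)
        # Second filter: drop sites where extraction reported "No".
        if contact == NO_DATA:
            continue
        append_row(contact)
        added += 1
    return added
```

Passing the components in as functions keeps the sketch testable without network access or credentials; in the real workflow each stage is a separate node.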
🔄 Workflow Steps
| Component Name | Role in the Workflow | Key Inputs | Key Outputs |
|---|---|---|---|
| Create Data2 | Creates the initial search parameters by combining the user's niche query and the desired number of results. | User query, Number of results | Structured search parameters |
| Web Search SearXng | Performs a web search using the provided parameters to find websites related to the business niche. | Search parameters (query, max results) | List of search results (titles, URLs) |
| Deepseek | Acts as a website classifier. It analyzes the title of each search result to determine whether it belongs to an official business site. | List of website titles | Binary classification ("Sí" for official, "No" for others) |
| Create Data | Combines the classifier's verdict with the corresponding website name and URL into a single data record. | Classification result, Website name, URL | Data records with label, site name, and URL |
| Switch | Routes the data based on the classifier's label. Only records marked as official sites ("Sí") continue to the next step. | Data records with classification label | Filtered list of official site records |
| Web Scraper | Scrapes the content from the URLs of the filtered official business websites. | List of official website URLs | Raw scraped content from each site |
| Deepseek2 | Acts as a structured data extractor. It analyzes the scraped website content to pull out specific business contact details. | Scraped website content | Extracted business data (name, email, phone, description) or "No" if data is insufficient |
| Create Data3 | Restructures the extracted business information into a standardized data format for the final steps. | Extracted business data | Structured business contact records |
| Switch2 | Routes the structured contact data based on extraction success. It filters out records where extraction failed (marked "No"). | Structured business contact records | Filtered list of valid, complete contact records |
| Agent | An AI agent equipped with a tool to interact with Google Sheets. It receives the valid contact records and is instructed to add them as new rows. | List of valid contact records, System prompt, Language Model, Tool | New rows added to the spreadsheet through the Sheet Cells tool |
| Sheet Cells | Provides the tool that allows the agent to perform the "Add Row" operation on a specified Google Sheets spreadsheet. | Google Sheets configuration | Tool for adding rows to Google Sheets |
| Deepseek3 | Serves as the Language Model that powers the decision-making of the AI agent. | Model configuration | Language Model instance for the agent |
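The first routing step in the table depends on an exact label from the classifier. As an illustration only, the sketch below models the labeled record and the "Sí" route. `SiteRecord`, `normalize_label`, and `route_official` are hypothetical names. The label normalization is an added assumption: it tolerates small variations in model output such as "sí." or " Sí ", which the workflow itself may or may not handle.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    label: str  # raw classifier verdict, e.g. "Sí" or "No"
    name: str
    url: str


def normalize_label(label: str) -> str:
    """Strip accents, whitespace, and edge punctuation, then case-fold."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.strip().strip(".!\"'").casefold()


def route_official(records):
    """Return only the records whose label normalizes to 'si'."""
    return [r for r in records if normalize_label(r.label) == "si"]
```

The comparison is deliberately strict: a verdict that contains extra words does not match and is dropped, which favors precision over recall when building the contact list.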
🧠 Notes
- The workflow is designed for batch processing, handling multiple websites in parallel for efficiency.
- It implements a two-stage validation process: first filtering for official sites, then validating the completeness of extracted contact data.
- The success of data extraction depends on the availability and clarity of contact information on the target websites.
- The workflow requires valid API credentials for the Deepseek language model and Google Sheets.
- The final output destination is a user-selected Google Sheets spreadsheet and worksheet.
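The batch-processing note can be illustrated with a thread pool. This is a sketch under stated assumptions, not a description of how the platform schedules work: `scrape_batch` and its `fetch` parameter are hypothetical, and the choice to skip failing sites rather than abort the batch is a design assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def scrape_batch(
    urls: List[str],
    fetch: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every URL concurrently; URLs whose fetch raises are omitted."""

    def safe_fetch(url: str):
        try:
            return url, fetch(url)
        except Exception:
            # One unreachable site should not abort the whole batch.
            return url, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs.
        pairs = list(pool.map(safe_fetch, urls))
    return {url: content for url, content in pairs if content is not None}
```

Threads suit this step because scraping is dominated by network waits; the `fetch` callable can wrap whatever HTTP client is available.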