Introduction: Why Your Startup Should Own Its Enrichment Stack
Most early-stage startups get stuck in a trap. They sign up for platforms like Apollo, hoping to scale outreach quickly. But those tools quickly become expensive, often charging $49 to $79 per user per month for access to enrichment data. As your team grows, so does your bill — and you own none of the data you collect.
Building your own company enrichment API changes this completely. You escape the “data lock-in” of commercial platforms and achieve total data ownership. You pay only for the API calls you consume. And you get unlimited internal access through a fast, Supabase-powered endpoint.
Who This Guide Is For
This guide is built for startup founders, marketing managers, and growth hackers working in teams of 1 to 20 people. If you are bootstrapped, seed-stage, or simply focused on cost-effective customer acquisition, this approach is for you. According to Gartner (2025), data acquisition costs for early-stage companies have risen by 20% year-over-year. Building your own stack is a direct way to push back.
What You Will Build
You will build a self-hosted lead enrichment API that works like this: it fetches firmographic data from Proxycurl, stores everything in Supabase PostgreSQL, and serves it all through a simple REST endpoint. This is the same infrastructure that powers modern lead generation workflows — but without the recurring subscription fees.
Understanding what comes after lead generation is just as important as collecting data. With your own API, you can route enriched leads into CRMs, email sequences, or scoring models without delay.
In the next section, we will walk through the technical prerequisites you need before writing a single line of code.
Prerequisites: Technical Requirements
Before you start building your company enrichment API, let us gather what you need. The good news is that the bar is low. You do not need a large engineering team or a big budget.
First, you need a Supabase account. The free tier works well here. It includes 500 MB of database storage and 2 GB of bandwidth. That is more than enough for prototyping and early production use. If you already run a lead generation pipeline, you might find our guide on lead generation in sales useful as context.
Second, you need a Proxycurl API key. Register at nubela.co/proxycurl to get one. Proxycurl uses a pay-as-you-go system. This is ideal for startups. You only pay for the API calls you make. No monthly subscription locks you in.
Third, you need basic TypeScript or JavaScript skills. The code in this guide runs on the Deno runtime. If you have written a simple API endpoint before, you are ready. No advanced backend experience is required.
Fourth, grab a REST client like Postman or Insomnia. These tools let you test your API endpoints easily. You should also prepare a small list of 5 to 10 test domains. Use well-known ones like stripe.com or zapier.com. These help you verify that your enrichment function works.
Once you have these four items ready, you can move to the database setup. Let us configure the schema in the next step.
Step 1: Project Configuration and Database Design
Head to the Supabase dashboard and click New Project. Name it something descriptive like company-enrichment-api. This keeps things organized if you run multiple projects later.
Choose a region close to your target audience. If you enrich mostly US-based companies, pick a US data center. This reduces latency for your API calls. Once you hit Create New Project, Supabase takes roughly 2–5 minutes to provision your database instance.
After provisioning finishes, you will land in the project dashboard. This is your control center. From here, you can manage your database, authentication, and edge functions. Next, you need to define the database table that will store all enriched company records. Let’s head to the SQL Editor to get started.
Defining the Company Schema
Now it’s time to design the database table that will store your enriched company data. Head to the SQL Editor in your Supabase dashboard to create the companies table.
We choose the domain as the unique identifier for each company. Company names change frequently due to rebrands or acquisitions. Domains, however, stay stable and predictable. This makes them perfect for deduplication — you never want two rows for the same company cluttering your database.
Your table needs columns for every field Proxycurl returns: name, industry, employee count, location, funding data, and more. We also include a last_enriched_at timestamp to track data freshness. The full SQL statement below sets everything up with proper indexes for fast lookups.
SQL Table Definition
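A minimal sketch of a table definition that matches the description in this section. The columns domain, name, industry, raw_data, and last_enriched_at come from the text; employee_count, country, created_at, and the index names are assumptions to adapt. The DDL is kept in a TypeScript constant next to a matching row type so the Edge Function and the table stay in sync; paste the contents of the string into the SQL Editor to run it.

```typescript
// Illustrative DDL for the companies table. Column names other than domain,
// name, industry, raw_data, and last_enriched_at are assumptions.
const COMPANIES_DDL = `
create table if not exists companies (
  id               uuid primary key default gen_random_uuid(),
  domain           text not null unique, -- UNIQUE also creates the domain index
  name             text,
  industry         text,
  employee_count   integer,
  country          text,
  raw_data         jsonb,                -- full provider response
  last_enriched_at timestamptz,
  created_at       timestamptz not null default now()
);
create index if not exists companies_industry_idx on companies (industry);
create index if not exists companies_country_idx  on companies (country);
`;

// Row shape the Edge Function reads and writes; mirrors the DDL above.
interface CompanyRow {
  id: string;
  domain: string;
  name: string | null;
  industry: string | null;
  employee_count: number | null;
  country: string | null;
  raw_data: unknown;
  last_enriched_at: string | null;
  created_at: string;
}
```

The UNIQUE constraint on domain gives PostgreSQL an index for domain lookups, so together with the two explicit indexes the table has three.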
The SQL above creates a companies table designed for deduplication and flexibility. A UUID primary key keeps the schema portable, while the domain column enforces uniqueness with a UNIQUE constraint. As mentioned earlier, domains are more stable than company names for matching records over time.
The raw_data column uses the JSONB data type. This stores the full Proxycurl response so you can extract new fields later without altering the schema. The last_enriched_at timestamp tracks data freshness, which we will use in the next step to avoid redundant API calls.
Finally, the three indexes speed up common lookups. Queries filtering by domain, industry, or country will run much faster as your database grows. With the schema ready, it is time to secure your API key and connect to Proxycurl.
Step 2: Securing the Proxycurl Integration
Security is critical when building an open-source Apollo alternative. You must never store API keys directly in your code. Hardcoding secrets is a common mistake that can lead to data leaks and unexpected charges.
Instead, store your Proxycurl API key as a Supabase Secret. This keeps it safe and injects it as an environment variable at runtime. Secrets stay encrypted and never appear in your source code or version control.
This approach aligns with best practices for a self-hosted lead enrichment stack. It also makes your API keys easy to rotate without redeploying your entire function.
Let’s walk through exactly how to configure these secrets in your Supabase project below.
Setting Up Environment Secrets
To add your Proxycurl API key, open the Supabase dashboard. Go to Settings → API → Edge Functions Secrets. Click New Secret and name it PROXYCURL_API_KEY. Paste your key as the value and save.
Next, check the current rate limit for Proxycurl’s pay-as-you-go tier. The limit dictates how many requests you can send per second. Knowing this helps your function handle concurrent requests safely. You can add retry logic if you hit the limit.
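The retry logic mentioned above could look roughly like the sketch below. It assumes the provider signals rate limiting with the standard HTTP 429 status; the helper names, the retry count, and the delay values are illustrative, not taken from Proxycurl's documentation.

```typescript
// Delay before retry number `attempt` (0-based): 500 ms, 1 s, 2 s, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `doFetch` and retries while the response status is 429 (assumed to
// mean "rate limited"), waiting longer after each attempt. `doFetch` is
// injected so the helper works with any request, e.g. () => fetch(url, { headers }).
async function fetchWithRetry<T extends { status: number }>(
  doFetch: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await doFetch();
    // Return on success, on any non-429 error, or once retries are used up.
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Capping the delay keeps a burst of retries from pushing a single request toward the function's execution limit.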
With the key secured, you are ready to build the enrichment function.
Step 3: Developing the Enrichment Edge Function
Now it is time to build the brains of your API. Create a new Edge Function in Supabase named enrich-company. This function sits between your database and Proxycurl as the middleware.
It receives a domain name, fetches company data from Proxycurl, and stores the result in your companies table. Because you already stored the Proxycurl secret in Step 2, the function can authenticate securely without exposing any keys.
Edge Functions run on Deno, which means you write TypeScript on the server side. The free tier includes 500,000 invocations per month, more than enough for early-stage enrichment. Let us walk through the core logic next.
Core Enrichment Logic
The enrich-company function follows a clear four-step pipeline. First, it validates the incoming domain. If no domain is provided, the function returns a 400 error immediately. This prevents wasted API calls on malformed requests.
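A slightly stricter version of this validation step might normalize the input before the lookup, so that "https://www.stripe.com/pricing" and "Stripe.com" map to the same row. The helper below is a sketch; its name and the exact rules (lowercasing, dropping a leading "www.", requiring at least one dot) are assumptions rather than part of the function shown later.

```typescript
// Returns a bare lowercase hostname such as "stripe.com", or null when the
// input cannot be interpreted as a domain.
function normalizeDomain(input: unknown): string | null {
  if (typeof input !== "string") return null;
  const trimmed = input.trim().toLowerCase();
  if (!trimmed) return null;

  let host: string;
  try {
    // Accept both "stripe.com" and "https://www.stripe.com/pricing".
    host = new URL(trimmed.includes("://") ? trimmed : `https://${trimmed}`).hostname;
  } catch {
    return null; // not parseable as a URL at all
  }
  host = host.replace(/^www\./, "");

  // Require at least two labels made of letters, digits, and inner hyphens.
  const label = "[a-z0-9]([a-z0-9-]*[a-z0-9])?";
  const valid = new RegExp(`^${label}(\\.${label})+$`).test(host);
  return valid ? host : null;
}
```

In the handler, a null result would map to the 400 response, and the normalized value would be used as the conflict key.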
Next, the function checks the Supabase companies table for existing data. If a record already exists and it was enriched recently, the function can skip the Proxycurl call entirely. This logic saves both time and credits.
If the data is stale or missing, the function calls the Proxycurl API with the given domain. The API returns firmographic data such as industry, employee count, and location. The function then upserts this data into PostgreSQL using the domain as the unique conflict key.
The implementation code below shows exactly how this pipeline works in TypeScript. The structure keeps the function lean and easy to modify.
Implementation Code (TypeScript)
Now let’s walk through the actual code. This Edge Function sits between your Supabase database and Proxycurl. It receives a domain, fetches company data, and stores the result.
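import { serve } from "https://deno.land/std@0.168.0/http/server.ts";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

// Small helper so every response is JSON with the right Content-Type.
const json = (body: unknown, status = 200) =>
  new Response(JSON.stringify(body), { status, headers: { "Content-Type": "application/json" } });

serve(async (req) => {
  // A missing or malformed JSON body falls through to the 400 below.
  const { domain } = await req.json().catch(() => ({}));
  if (!domain) return json({ error: "Domain required" }, 400);

  const supabase = createClient(Deno.env.get("SUPABASE_URL")!, Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!);

  // Proxycurl request. The endpoint path and query parameter are as given in
  // this guide; confirm them against Proxycurl's API reference before deploying.
  const proxycurlRes = await fetch(
    `https://nubela.co/proxycurl/api/v2/linkedin/company?domain=${encodeURIComponent(domain)}`,
    { headers: { Authorization: `Bearer ${Deno.env.get("PROXYCURL_API_KEY")}` } },
  );
  if (!proxycurlRes.ok) return json({ error: "Enrichment lookup failed" }, 502);
  const data = await proxycurlRes.json();

  // Upsert keyed on domain, so repeat lookups update the row instead of duplicating it.
  const { data: company, error } = await supabase.from("companies").upsert({
    domain, name: data.name, industry: data.industry, raw_data: data,
    last_enriched_at: new Date().toISOString(),
  }, { onConflict: "domain" }).select().single();
  if (error) return json({ error: error.message }, 500);

  return json(company);
});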
The function starts by parsing the request body and extracting the domain field. If the domain is missing, it returns a 400 error immediately. This simple validation prevents wasted API calls.
Next, it creates a Supabase client using environment variables. The SUPABASE_SERVICE_ROLE_KEY bypasses Row-Level Security. This is intentional — the function acts as an admin middleware, and you control access at the function level instead.
Then the function calls Proxycurl’s Company Lookup API endpoint. It passes the domain as a query parameter and includes the API key in the Authorization header. The response contains firmographic data like company name, industry, employee count, and location.
Finally, it upserts the data into the companies table. The onConflict: "domain" clause ensures deduplication. If a record with that domain already exists, Supabase updates it. If not, it inserts a new row. The function returns the enriched company object as JSON.
Once enriched data lands in your Supabase database, you can use it for smarter lead scoring. Check out our guide on setting up a lead scoring model in Google Sheets with historical RFQ data to see how enriched attributes like industry and employee count feed into qualification logic. In the next section, we’ll add data freshness rules to prevent unnecessary API spending.
Managing Data Freshness
The core enrichment logic above works well for one-off lookups. But every call to Proxycurl costs credits. You also waste time fetching data you already have.
Company data decays at an estimated 2–3% per month, according to industry research from ZoomInfo. That means a profile you enriched six months ago could be significantly outdated. Stale firmographics lead to wrong targeting and wasted outreach efforts.
A simple freshness gate prevents these redundant API calls. The code below checks the last_enriched_at timestamp before making a new request. If the record is less than 30 days old, the function returns the existing data immediately.
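const THIRTY_DAYS = 30 * 24 * 60 * 60 * 1000;
// Runs after the Supabase client is created and before the Proxycurl request.
// maybeSingle() yields null instead of an error when no row exists yet.
const { data: existing } = await supabase.from("companies").select("*").eq("domain", domain).maybeSingle();
if (existing && (Date.now() - new Date(existing.last_enriched_at).getTime()) < THIRTY_DAYS) {
  return new Response(JSON.stringify(existing), { headers: { "Content-Type": "application/json" } });
}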
This logic keeps your database current without burning through your API budget. It also speeds up response times for frequently queried domains. For more on turning enriched data into actual customers, check out The Complete Guide to Finding Your Best Customers.
Once freshness is handled, the next step is securing your endpoint. Let’s look at authentication and access control.
Step 4: Security and API Exposure
Your Edge Function is now live at https://<project-ref>.supabase.co/functions/v1/enrich-company. But exposing an enrichment endpoint without protection is risky. You need to lock it down before any real usage begins.
Exposed APIs can be abused by bots or competitors. A single malicious user could drain your Proxycurl credits in minutes. The good news? Supabase provides three straightforward security layers that work together.
Authentication and Access Control
Start with Bearer token authentication. Require every incoming request to include a valid Supabase anon key or a custom JWT in the Authorization header. This ensures only your internal tools or authorized users can call the function. Without this gate, anyone who finds your endpoint can enrich any domain on your dime.
Next, enable Row-Level Security (RLS) on the companies table. RLS acts as a database-level filter: even if a request reaches your database, RLS policies determine which rows the caller can see or modify. Because the enrichment function connects with the service role key, which bypasses RLS, you can leave the table without any write policies for other roles. The function can still insert and update rows, while requests made with the anon key cannot tamper with stored data.
Finally, configure CORS headers if you plan to call the API from a browser-based tool. Supabase Edge Functions do not set CORS by default. You must explicitly handle the OPTIONS preflight request and return Access-Control-Allow-Origin headers. This is critical if your frontend application or a no-code tool triggers enrichment workflows directly from a dashboard.
Once these security layers are active, your API is production-ready. In the next step, you will test the entire pipeline end to end.
Authentication and Access Control
Your enrichment function is now live at https://<project-ref>.supabase.co/functions/v1/enrich-company. But an open endpoint is a security risk. You need to lock it down.
Bearer Tokens are the first line of defense. Require a Supabase anon key or a custom token in the Authorization header. This ensures only authenticated requests reach your enrichment logic. Without this, anyone who discovers your endpoint could drain your Proxycurl credits.
Row-Level Security (RLS) adds another layer. Enable RLS on the companies table to control read and write access at the row level. This is especially important if your team expands. Different users or services should only see the data they need. A solid lead scoring model in Google Sheets with RFQ data is only useful if your underlying company data is secure.
CORS Configuration matters if you call the API from a browser-based tool. Set the Access-Control-Allow-Origin header in your function response to allow specific origins. For server-to-server calls, you can skip this. But for in-app enrichment, a missing CORS header will block the request entirely. A quick fix is to handle the OPTIONS preflight request explicitly in your Edge Function.
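The bearer-token check described above can be sketched as a few small helpers. This covers only the custom-token variant and assumes a shared secret stored under a hypothetical name such as ENRICH_API_TOKEN; Supabase can also verify JWTs before the function runs, which this sketch does not replace.

```typescript
// Extracts the token from an Authorization header value, or returns null if
// the header is missing or not in "Bearer <token>" form.
function bearerToken(header: string | null): string | null {
  const match = /^Bearer\s+(\S+)$/i.exec(header?.trim() ?? "");
  return match ? match[1] : null;
}

// Compares two strings without exiting early on the first mismatch.
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

// True when the header carries the expected shared secret. An empty
// `expected` value (secret not configured) never authorizes anything.
function isAuthorized(header: string | null, expected: string): boolean {
  const token = bearerToken(header);
  return token !== null && expected.length > 0 && safeEqual(token, expected);
}
```

Inside the handler this would be called with req.headers.get("Authorization") and the secret read from the environment, returning a 401 response when it yields false.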
Step 5: Testing and Troubleshooting
Now it is time to validate the entire pipeline. Run the following cURL command from your terminal to test the enrichment function:
curl -X POST "https://<project-ref>.supabase.co/functions/v1/enrich-company" \
  -H "Authorization: Bearer <anon-key>" \
  -H "Content-Type: application/json" \
  -d '{"domain": "stripe.com"}'
Replace <project-ref> with your Supabase project reference. Swap <anon-key> with your actual anon key from the Supabase dashboard. A successful response returns a JSON object with company data like name, industry, and employee count.
If you see an error, check the function logs in the Supabase dashboard under Edge Functions. Common issues include missing secrets, incorrect domain formatting, or a mistyped API key.
Common Technical Hurdles
Even a well-built enrichment API can hit a few snags. Here are the most frequent issues and how to fix them.
Handling Timeouts and CORS
Edge Functions on the free tier have a 60-second timeout limit. Proxycurl responses usually arrive in 2–5 seconds, but slow or large payloads can push past that limit. For bulk jobs, consider using async background processing instead of real-time calls.
CORS errors happen when your frontend tries to call the function from a different origin. The browser blocks the request unless your function explicitly allows it. This is a common blocker when integrating enrichment into your own web app or tool.
CORS Errors
To fix CORS issues, add explicit headers to your function response. Include Access-Control-Allow-Origin: * for development or restrict it to your specific domain. Also handle the OPTIONS preflight request that browsers send before the actual POST.
Here is a quick snippet to add to your function:
const headers = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "POST, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type, Authorization"
};
if (req.method === "OPTIONS") return new Response(null, { headers });
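For production, restricting the allowed origin is usually preferable to the wildcard. The sketch below wraps a handler so that the preflight response and the real response both carry the CORS headers; the names corsHeaders and withCors and the example origin are hypothetical.

```typescript
type Handler = (req: Request) => Response | Promise<Response>;

// Builds CORS headers for a request, echoing the origin only when it is allowed.
function corsHeaders(origin: string | null, allowed: string[]): Record<string, string> {
  const headers: Record<string, string> = {
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
    "Vary": "Origin",
  };
  if (origin && allowed.includes(origin)) headers["Access-Control-Allow-Origin"] = origin;
  return headers;
}

// Wraps a handler: answers OPTIONS preflights directly and adds the CORS
// headers to whatever response the wrapped handler produces.
function withCors(handler: Handler, allowed: string[]): Handler {
  return async (req) => {
    const cors = corsHeaders(req.headers.get("Origin"), allowed);
    if (req.method === "OPTIONS") return new Response(null, { status: 204, headers: cors });
    const res = await handler(req);
    const headers = new Headers(res.headers);
    for (const [key, value] of Object.entries(cors)) headers.set(key, value);
    return new Response(res.body, { status: res.status, headers });
  };
}
```

A call such as serve(withCors(handler, ["https://app.example.com"])) would then apply the policy to every request; a disallowed origin simply receives no Access-Control-Allow-Origin header and the browser blocks the call.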
Empty Results
Proxycurl may return null values for companies it has not indexed yet. This is normal for very small or private businesses. Add an enrichment_status column to your companies table to track partial matches. This way, you know which domains need manual review or re-enrichment later.
A good practice is to log enrichment failures into a separate table. This helps you audit the quality of your data over time. If you want to build a more robust lead qualification workflow, check out how to set up a lead scoring model in Google Sheets with historical data.
Once your function runs without errors and returns clean data, you are ready to move forward. The next section covers scaling your enrichment API for production use.
Handling Timeouts and CORS
Even a well-tested enrichment API will hit snags in production. Here are the three most common issues and how to fix them.
Timeouts. Supabase Edge Functions on the free tier have a 60-second execution limit. If you enrich one domain per request, this is rarely a problem. But for bulk jobs, a single slow Proxycurl call can kill the entire function. The fix is simple: use async background processing. Queue domains in a database table and process them one by one via Database Webhooks. This keeps your API responsive.
CORS Errors. If you call your function from a browser-based tool like a CRM plugin or a React app, you will see CORS errors. Your function must explicitly handle the OPTIONS preflight request. Add logic at the top of your handler to return the proper Access-Control-Allow-Origin headers. Without this, browser requests fail silently.
Empty Results. Proxycurl may return null values for small or unindexed companies. For example, a 5-person startup might have no LinkedIn company page. To handle this gracefully, add an enrichment_status flag to your schema. Set it to "partial" when fields are missing. This lets your lead enrichment workflow filter incomplete records later. After all, building a reliable company enrichment API using Proxycurl and Supabase means planning for edge cases — not just the happy path.
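One way to derive the enrichment_status flag is to count how many expected fields came back. In this sketch only the "partial" value comes from the text above; the "complete" and "empty" values and the list of required fields are assumptions to adjust to your own schema.

```typescript
type EnrichmentStatus = "complete" | "partial" | "empty";

// Fields that must be present for a record to count as complete (assumed list).
const REQUIRED_FIELDS = ["name", "industry", "employee_count", "country"] as const;

// Classifies a mapped company record by how many required fields have values.
function enrichmentStatus(record: Record<string, unknown> | null): EnrichmentStatus {
  if (!record) return "empty";
  const present = REQUIRED_FIELDS.filter((field) => {
    const value = record[field];
    return value !== null && value !== undefined && value !== "";
  }).length;
  if (present === 0) return "empty";
  return present === REQUIRED_FIELDS.length ? "complete" : "partial";
}
```

Storing the result alongside each row lets later queries pick out records that need manual review or another enrichment pass.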
These fixes prepare your API for real-world use. Once stable, you can explore scaling options and automation in the next step.
Step 6: Scaling and Advanced Enhancements
Once the core API is functional, you can add upgrades that boost efficiency and data coverage. These enhancements turn a simple enrichment tool into a robust growth engine for your startup. Let’s walk through the most impactful improvements.
Automation and Integration
Manual enrichment calls are fine for testing. But for real-world use, you need automation. Scheduled re-enrichment keeps your data fresh without human effort. Use pg_cron within Supabase to refresh records older than 30 days. This ensures your team always works with current firmographic data. It also prevents stale leads from slipping through the cracks.
Async bulk queue is another powerful upgrade. Instead of enriching domains one by one, POST a list of domains to a queue table. Use Database Webhooks to trigger the enrichment function automatically. This lets you process hundreds of companies in the background. Your team can focus on outreach while the API handles the data work.
Multi-source enrichment adds depth to your records. Proxycurl provides excellent LinkedIn firmographics. But you can supplement it with Clearbit for company logos and Crunchbase for funding details. Combining sources gives you a richer picture of each prospect. This layered approach helps with understanding what comes after lead generation — moving from raw data to actionable sales insights.
Real-World Use Case: Cold Outreach
Let’s look at a concrete example. Manual research for 200 leads takes about 6.7 hours. That’s nearly a full workday of copying, pasting, and formatting data. Your enrichment API processes the same 200 domains in under 5 minutes. The cost? Roughly $2 to $4 in Proxycurl credits.
This speed changes your outreach strategy entirely. Enriched data can flow directly into CRMs like HubSpot. Or you can push it into email tools like Instantly via Supabase’s REST API. The result is a seamless pipeline from enrichment to engagement. For teams exploring the real goal of lead generation, this automation is a game-changer. It lets you focus on personalized messaging instead of data grunt work.
Once your scaling enhancements are in place, head to the next section for a final checklist and answers to common questions.
Automation and Integration
Once your company enrichment API is running, the next step is making it work on autopilot. A scheduled re-enrichment strategy keeps your data fresh without manual intervention. You can use pg_cron in Supabase to automatically refresh records older than 30 days. This pairs perfectly with the freshness logic we set up earlier — no redundant Proxycurl calls, just scheduled updates.
For bulk operations, an async queue is a smarter approach. Instead of calling the API one domain at a time, you can POST domains to a Supabase queue table. Then, use Database Webhooks to trigger your enrichment function as new rows appear. This allows you to enrich hundreds of domains in the background while your team focuses on other tasks.
You can also layer in multiple data sources for richer profiles. Proxycurl handles firmographics well, but you can supplement it with Clearbit for company logos or Crunchbase for funding history. This multi-source approach gives you a more complete picture of each lead — which is essential when you’re trying to find your best customers.
These automation tactics turn a simple API into a self-sustaining enrichment engine. Once your pipeline handles data at scale, you can put it to work on real-world use cases like cold outreach.
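Before posting a bulk list to the queue, it helps to deduplicate it and split it into batches so that no single insert grows too large. The helpers below are a sketch; the enrichment_queue table name and the "pending" status value are hypothetical.

```typescript
// Removes blanks and duplicates (case-insensitive), preserving first-seen order.
function dedupeDomains(domains: string[]): string[] {
  const seen = new Set<string>();
  for (const domain of domains) {
    const key = domain.trim().toLowerCase();
    if (key) seen.add(key);
  }
  return Array.from(seen);
}

// Splits a list into batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) throw new RangeError("size must be a positive integer");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Shapes rows for a hypothetical enrichment_queue table.
function toQueueRows(domains: string[]): { domain: string; status: "pending" }[] {
  return dedupeDomains(domains).map((domain) => ({ domain, status: "pending" as const }));
}
```

Each batch from chunk(toQueueRows(list), 100) could then be inserted into the queue table in its own request, with the webhook picking up rows as they appear.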
Real-World Use Case: Cold Outreach
Let’s put this API to work with a real example. Consider a typical cold outreach campaign targeting 200 companies. Manual research — scraping LinkedIn, checking websites, cross-referencing Crunchbase — takes roughly 6.7 hours. That’s nearly a full workday lost to repetitive data gathering.
Your custom enrichment API changes that math completely. It processes 200 domains in under 5 minutes. The total cost in Proxycurl credits runs about $2 to $4. That is a fraction of the time and money compared to manual work or a paid Apollo subscription.
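The arithmetic behind these figures is easy to rerun for your own list size. The inputs in this sketch are back-calculated from the numbers above and are assumptions: about 2 minutes of manual research per lead, and $0.01 to $0.02 per lookup.

```typescript
// Hours of manual research for a list, given minutes spent per lead.
function manualHours(leads: number, minutesPerLead: number): number {
  return (leads * minutesPerLead) / 60;
}

// Total API spend in USD for a list, given the cost of one lookup.
function apiCostUsd(leads: number, usdPerLookup: number): number {
  return leads * usdPerLookup;
}

const leadCount = 200;
console.log(manualHours(leadCount, 2).toFixed(1));   // "6.7" hours
console.log(apiCostUsd(leadCount, 0.01).toFixed(2)); // "2.00" USD
console.log(apiCostUsd(leadCount, 0.02).toFixed(2)); // "4.00" USD
```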
Once enriched, the data is ready to fuel your outreach pipeline. Supabase’s built-in REST API lets you export records directly to tools like HubSpot, Salesforce, or Instantly. No CSV exports. No manual mapping. Just a GET request that pulls clean firmographic data into your CRM.
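As a rough illustration of that GET request, the sketch below builds a URL and headers for Supabase's auto-generated REST interface, which exposes tables under /rest/v1 and accepts filters such as industry=eq.Software. The helper names and the example filter are hypothetical; the project reference and key are placeholders you supply.

```typescript
// Builds a REST URL for reading rows from the companies table, applying an
// equality filter for each entry in `filters`.
function companiesUrl(
  projectRef: string,
  filters: Record<string, string>,
  columns: string[],
): string {
  const url = new URL(`https://${projectRef}.supabase.co/rest/v1/companies`);
  url.searchParams.set("select", columns.join(","));
  for (const [column, value] of Object.entries(filters)) {
    url.searchParams.set(column, `eq.${value}`);
  }
  return url.toString();
}

// Headers for REST calls: the key is sent both as `apikey` and as a bearer token.
function restHeaders(apiKey: string): Record<string, string> {
  return { apikey: apiKey, Authorization: `Bearer ${apiKey}` };
}
```

A call such as fetch(companiesUrl(ref, { industry: "Software" }, ["domain", "name"]), { headers: restHeaders(key) }) would return matching rows as JSON, subject to the RLS policies on the table.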
This approach directly supports smarter lead generation in sales without the expensive overhead. You avoid common pitfalls like paying for stale contact lists or overspending on unused platform seats.
For teams serious about scaling, combining enriched data with a solid lead scoring model in Google Sheets helps prioritize the accounts most likely to convert. The enrichment API gives you the raw material. Your scoring model turns it into actionable pipeline priority.
Coming up next: we’ll walk through a final checklist to make sure your deployment is production-ready, and answer the most common questions about building this open-source Apollo alternative.
Final Checklist and FAQ
Before you deploy your company enrichment API to production, run through this quick checklist. Each item addresses a common failure point that startups face when building their own tools.
Database schema deployed with indexes. Your companies table should have indexes on domain, industry, and country. We covered this in Step 1. Without indexes, queries slow down as your lead list grows.
Secrets configured (never hardcoded). Your PROXYCURL_API_KEY must live in Supabase Secrets. Hardcoding keys is the fastest way to leak credentials on GitHub. Step 2 shows how to set this up safely.
Freshness logic prevents redundant billing. The 30-day stale check from Step 3 stops you from paying for data you already have. How much this saves depends on how often the same domains are requested; for lists with many repeat lookups, it can cut your Proxycurl costs by 60% or more.
Error handling covers invalid domains gracefully. Your function should return a clear 400 error for missing or malformed input, and it should not crash or expose stack traces. Also test with an unregistered domain like “notacompanyzzz.com” to confirm that a lookup with no match fails cleanly.
Authentication/RLS is active. Your API endpoint must require a bearer token. Row-Level Security on your companies table adds a second layer of protection. Step 4 covers both.
People Also Ask
Q: What is an open-source Apollo alternative? A: It means building your own enrichment stack with open-source tools like Supabase and pay-as-you-go data providers like Proxycurl. This approach gives you full data ownership and avoids expensive per-seat subscriptions. For more on what comes after collecting enriched data, check out our guide on what comes after lead generation.
Q: How much does Proxycurl cost? A: Proxycurl uses a credit-based pricing model. Each company lookup consumes credits, and the exact amount depends on the endpoint you call. For a startup enriching 200 leads per month, the cost is typically under $10. That is far cheaper than a $79-per-user Apollo seat.
Q: Can I run this on the Supabase free tier? A: Yes. The free tier gives you 500 MB of database storage and 2 GB of bandwidth. That is plenty for a prototype or early production use. You can scale to a paid plan only when your lead volume grows.
Building your own company enrichment API is one of the smartest investments a lean startup can make. You avoid vendor lock-in, control your data, and keep costs predictable. The real goal of lead generation is not just collecting emails — it is owning the pipeline that fills your CRM. With Proxycurl and Supabase, that pipeline is yours to build and keep.
People Also Ask
Q: What is an open-source Apollo alternative?
It is a self-hosted system that replaces expensive B2B data platforms. You build it using open-source tools like Supabase and data APIs like Proxycurl. This approach replicates enrichment workflows without per-seat subscriptions. Your team gets unlimited internal access for a fraction of the monthly cost. To learn more about finding the right prospects, check out The Complete Guide to Finding Your Best Customers.
Q: How much does Proxycurl cost?
Proxycurl uses a pay-as-you-go credit model. Each API call consumes credits based on the data endpoint you request. For startups targeting specific lead lists, this is far cheaper than flat-fee platforms. You only pay for the records you enrich, not for unused seats. This fits a cost-conscious lead generation in sales strategy.
Q: Can I run this on the Supabase free tier?
Yes, the Supabase free tier is enough for prototyping and early production. It provides 500 MB of database storage and 2 GB of bandwidth. A small team enriching a few hundred leads per month will not hit those limits. As your volume grows, you can upgrade to a paid plan incrementally. The architecture described in this guide scales with you.

