POST /api/lists/check-duplicates
Check a batch of leads for list duplicates
curl --request POST \
  --url https://app.puffle.ai/api/lists/check-duplicates \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "people": [
    {
      "linkedinUrl": "<string>",
      "fullName": "<string>",
      "company": "<string>",
      "location": "<string>"
    }
  ],
  "searchId": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "excludeListId": "3c90c3cc-0d44-4b50-8888-8dd25736052a"
}
'
{
  "duplicateCount": 12
}

Overview

Read-only counting endpoint that reports how many people in an input batch are duplicates of leads already in the caller’s other lists. No list is mutated — this is purely a “what would I save?” preview for a future import. Dedupe uses the three-tier cascade implemented in lib/lists/cross-list-dedup.ts:
  1. LinkedIn URL — strongest signal, checked first.
  2. Full name + company — tier 2 fallback when LinkedIn URL is missing.
  3. Full name + location — tier 3 fallback.
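The cascade above can be sketched as follows. This is a minimal illustration, not the code in lib/lists/cross-list-dedup.ts, which is not shown here. The trim-and-lowercase normalization, the `|` key separator, and the rule that a candidate with a LinkedIn URL is checked on tier 1 only are all assumptions, and the helper names are hypothetical.

```typescript
interface Person {
  linkedinUrl?: string;
  fullName?: string;
  company?: string;
  location?: string;
}

// Assumed normalization: trim and lowercase. The real rules may differ.
const norm = (s?: string): string => (s ?? "").trim().toLowerCase();

interface DedupeIndex {
  urls: Set<string>;
  nameCompany: Set<string>;
  nameLocation: Set<string>;
}

// Index the leads already stored in the caller's other lists, one set per tier.
function buildIndex(existing: Person[]): DedupeIndex {
  const index: DedupeIndex = {
    urls: new Set(),
    nameCompany: new Set(),
    nameLocation: new Set(),
  };
  for (const p of existing) {
    const url = norm(p.linkedinUrl);
    const name = norm(p.fullName);
    const company = norm(p.company);
    const location = norm(p.location);
    if (url) index.urls.add(url);
    if (name && company) index.nameCompany.add(`${name}|${company}`);
    if (name && location) index.nameLocation.add(`${name}|${location}`);
  }
  return index;
}

function isDuplicate(p: Person, index: DedupeIndex): boolean {
  const url = norm(p.linkedinUrl);
  if (url) return index.urls.has(url); // tier 1: LinkedIn URL
  const name = norm(p.fullName);
  if (!name) return false; // tiers 2 and 3 both need a name
  const company = norm(p.company);
  if (company && index.nameCompany.has(`${name}|${company}`)) return true; // tier 2
  const location = norm(p.location);
  return location !== "" && index.nameLocation.has(`${name}|${location}`); // tier 3
}

function countDuplicates(candidates: Person[], existing: Person[]): number {
  const index = buildIndex(existing);
  return candidates.filter((p) => isDuplicate(p, index)).length;
}
```

Building the three sets once keeps each candidate lookup constant-time, which matters for batches in the thousands.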
The request body supports two input modes (mutually exclusive):
  • Inline batch. Pass people — an array of up to 10,000 { linkedinUrl, fullName, company, location } objects.
  • Batch by search id. Pass searchId — the server loads all lead_search_results for that search (paginated 1000 at a time, no 10k cap) and dedupes against the caller’s lists. The search must be owned by the caller.
Use excludeListId to exclude a specific list from the dedupe set — typical pattern is excluding the list you’re about to import into.
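As an illustration of the two input modes, the sketch below assembles the request for either one. The helper name `buildCheckDuplicatesRequest` and the `CheckInput` type are hypothetical. The URL, headers, and field names come from the example request on this page, and dropping `people` when `searchId` is set mirrors the documented precedence.

```typescript
interface Person {
  linkedinUrl?: string;
  fullName?: string;
  company?: string;
  location?: string;
}

interface CheckInput {
  people?: Person[];
  searchId?: string;
  excludeListId?: string;
}

const ENDPOINT = "https://app.puffle.ai/api/lists/check-duplicates";

// Hypothetical helper: returns a URL and an init object suitable for fetch().
function buildCheckDuplicatesRequest(token: string, input: CheckInput) {
  // searchId wins when both are present, so people is left out in that case.
  const body: CheckInput = input.searchId
    ? { searchId: input.searchId }
    : { people: input.people ?? [] };
  if (input.excludeListId) body.excludeListId = input.excludeListId;
  return {
    url: ENDPOINT,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```

In a runtime with a global `fetch`, the result can be passed straight through: `const { url, init } = buildCheckDuplicatesRequest(token, input); const res = await fetch(url, init);`.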

AI agent notes

Typical flow.
  1. User initiates a new list import (CSV or Sales Nav).
  2. Agent/UI calls checkListDuplicates with the candidate set (or searchId).
  3. UI surfaces duplicateCount so the user can decide whether to proceed as-is, add dedupe, or cancel.
  4. Agent calls importFromSalesNav or the appropriate CSV import route — that route handles the actual dedupe during import.
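For step 3, a UI or agent might turn duplicateCount into a short preview before asking the user to proceed. The sketch below is one way to do that. `summarizePreview` and the `ImportPreview` shape are hypothetical, and it assumes the caller already knows the total number of candidates (true in people mode, not necessarily in searchId mode).

```typescript
interface ImportPreview {
  total: number;
  duplicateCount: number;
  netNew: number; // candidates not found in the caller's other lists
  duplicateRatio: number; // fraction of the batch flagged as duplicate
}

function summarizePreview(total: number, duplicateCount: number): ImportPreview {
  if (!Number.isInteger(total) || total < 0) {
    throw new RangeError("total must be a non-negative integer");
  }
  if (!Number.isInteger(duplicateCount) || duplicateCount < 0 || duplicateCount > total) {
    throw new RangeError("duplicateCount must be an integer between 0 and total");
  }
  return {
    total,
    duplicateCount,
    netNew: total - duplicateCount,
    duplicateRatio: total === 0 ? 0 : duplicateCount / total,
  };
}
```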
Pick the right mode.
  • For Sales Nav / CSV previews where the data is already in the client, use the people mode.
  • For lead search results already persisted in Supabase, use searchId — it’s both more efficient and uncapped.
Limits.
  • people mode: at most 10,000 entries per call; larger batches are rejected with HTTP 400.
  • searchId mode: no cap; the server paginates 1,000 rows at a time.
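A client holding more than 10,000 inline candidates could split them before calling. The `chunk` helper below is a hypothetical sketch of that splitting step only.

```typescript
const MAX_PEOPLE_PER_CALL = 10000;

// Split items into consecutive batches of at most `size` entries.
function chunk<T>(items: readonly T[], size: number = MAX_PEOPLE_PER_CALL): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Summing the duplicateCount of each batch gives the overall count only if the server evaluates every candidate independently of the rest of its batch. This page does not confirm that, so treat the sum as an estimate.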
Cascade nuance.
  • A candidate with only fullName (no company or location) cannot match on tier 2 or tier 3. It can match only through tier 1, which requires a LinkedIn URL, so without one it will effectively never match. Agents should enrich these fields before calling.
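An agent could flag such candidates before calling the endpoint. The sketch below lists which tiers a candidate has enough data for. `matchableTiers` and `needsEnrichment` are hypothetical names, and treating whitespace-only strings as missing is an assumption.

```typescript
interface Person {
  linkedinUrl?: string;
  fullName?: string;
  company?: string;
  location?: string;
}

type Tier = 1 | 2 | 3;

// Returns the cascade tiers this candidate carries enough fields to match on.
function matchableTiers(p: Person): Tier[] {
  const has = (s?: string): boolean => (s ?? "").trim() !== "";
  const tiers: Tier[] = [];
  if (has(p.linkedinUrl)) tiers.push(1);
  if (has(p.fullName) && has(p.company)) tiers.push(2);
  if (has(p.fullName) && has(p.location)) tiers.push(3);
  return tiers;
}

// A candidate with no usable tier can never be counted as a duplicate.
function needsEnrichment(p: Person): boolean {
  return matchableTiers(p).length === 0;
}
```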
No side effects. Pure read.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Provide either people (inline array) or searchId (batch fetched server-side), not both. If both are set, searchId takes precedence and people is ignored.

people
object[]

Explicit batch of candidates to check. Capped at 10,000 entries per call. Ignored when searchId is also provided.

Maximum array length: 10000
searchId
string<uuid>

When provided, the server loads all lead_search_results rows for this search (paginated 1000 at a time) instead of reading people from the body. The search must belong to the caller.

Pattern: ^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$
excludeListId
string<uuid>

Optional list to skip when building the caller's cross-list dedupe set. Use this to check duplicates against other lists while excluding the list you're about to import into.

Pattern: ^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$
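Both searchId and excludeListId can be checked client-side against the documented pattern before sending. The sketch below uses that pattern verbatim; the helper name `isValidUuid` is hypothetical.

```typescript
// The pattern documented for searchId and excludeListId, copied as-is.
const UUID_PATTERN =
  /^([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[1-8][0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}|00000000-0000-0000-0000-000000000000|ffffffff-ffff-ffff-ffff-ffffffffffff)$/;

function isValidUuid(value: string): boolean {
  return UUID_PATTERN.test(value);
}
```

The pattern accepts UUID versions 1 through 8 with the standard variant bits, plus the all-zero and all-f special values.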

Response

Count of candidates that would be suppressed by cross-list dedupe. Returns { duplicateCount: 0 } when people is empty or omitted (and no searchId is given).

duplicateCount
integer
required

Number of candidates in the input that already exist in one of the caller's other lists, under the three-tier cascade (LinkedIn URL → name+company → name+location).

Required range: 0 <= x <= 9007199254740991
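A client may want to validate the response before using it. The sketch below checks the documented shape and range; the upper bound 9007199254740991 equals `Number.MAX_SAFE_INTEGER`, so `Number.isSafeInteger` covers it. The helper name `parseDuplicateCount` is hypothetical.

```typescript
// Validates a decoded JSON response body and returns its duplicateCount.
function parseDuplicateCount(payload: unknown): number {
  if (typeof payload !== "object" || payload === null) {
    throw new TypeError("expected a JSON object");
  }
  const value = (payload as { duplicateCount?: unknown }).duplicateCount;
  if (typeof value !== "number" || !Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("duplicateCount must be an integer in [0, 9007199254740991]");
  }
  return value;
}
```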