42 commits
82e8d82
create logic functions and utils
bosiraphael Jun 5, 2026
e095749
remove comment
bosiraphael Jun 5, 2026
8e56d1d
fixes
bosiraphael Jun 5, 2026
09d68af
Merge branch 'main' of github.com:twentyhq/twenty into r--people-data…
bosiraphael Jun 5, 2026
a21e142
Fixes on selects and add custom errors
bosiraphael Jun 5, 2026
d6ea28b
split files
bosiraphael Jun 5, 2026
eafd31c
Add icons
bosiraphael Jun 5, 2026
fd75c15
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 5, 2026
d7c3f22
fixes and sdk version bump
bosiraphael Jun 5, 2026
5d4424a
Merge branch 'r--people-data-labs-enrichment-mapper' of github.com:tw…
bosiraphael Jun 5, 2026
3dfc7ab
upgrade sdk
bosiraphael Jun 8, 2026
ef4c94a
create enrichment workflows via post-install
bosiraphael Jun 8, 2026
e0bf07b
remove the suffixes
bosiraphael Jun 8, 2026
4205e67
remove indexes
bosiraphael Jun 8, 2026
95e0e2a
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 8, 2026
db470ab
fix post install seeding failure
bosiraphael Jun 8, 2026
292bfbd
update role
bosiraphael Jun 8, 2026
2b69ac4
update enrichment workflows port install
bosiraphael Jun 8, 2026
e1bbe9b
fix
bosiraphael Jun 8, 2026
20b3a50
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 8, 2026
e0479b6
add comments until sdk is fixed
bosiraphael Jun 8, 2026
02440dd
Merge branch 'r--people-data-labs-enrichment-mapper' of github.com:tw…
bosiraphael Jun 8, 2026
0849b36
Merge remote-tracking branch 'origin/main' into r--people-data-labs-e…
bosiraphael Jun 9, 2026
302180b
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 9, 2026
c187e7c
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 9, 2026
2219bcb
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 9, 2026
4c05eb6
logic functions improvements
bosiraphael Jun 9, 2026
ddf03d2
Merge branch 'r--people-data-labs-enrichment-mapper' of github.com:tw…
bosiraphael Jun 9, 2026
8bc4ec3
update functions and add seeded views
bosiraphael Jun 9, 2026
e2524a7
update logic functions
bosiraphael Jun 9, 2026
7ba1fe5
use object signatures
bosiraphael Jun 9, 2026
3478e0f
improvements
bosiraphael Jun 9, 2026
f794acf
improvements
bosiraphael Jun 10, 2026
f832fc4
improvements
bosiraphael Jun 10, 2026
0051996
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 10, 2026
6c1534b
rename variables
bosiraphael Jun 10, 2026
040206c
add new error
bosiraphael Jun 10, 2026
1e5ced3
Merge branch 'main' into r--people-data-labs-enrichment-mapper
bosiraphael Jun 10, 2026
11b84fc
typecheck
bosiraphael Jun 10, 2026
fc6c66d
add navigation menu folder
bosiraphael Jun 10, 2026
f99ac2b
fixes
bosiraphael Jun 10, 2026
cd3927b
capitalize name
bosiraphael Jun 10, 2026
4 changes: 0 additions & 4 deletions packages/twenty-apps/internal/people-data-labs/.oxlintrc.json
@@ -7,10 +7,6 @@
"ignorePatterns": ["node_modules", "dist"],
"rules": {
"func-style": ["error", "declaration", { "allowArrowFunctions": true }],
"no-console": [
"warn",
{ "allow": ["group", "groupCollapsed", "groupEnd"] }
],
"no-control-regex": "off",
"no-debugger": "error",
"no-duplicate-imports": "error",
144 changes: 105 additions & 39 deletions packages/twenty-apps/internal/people-data-labs/README.md
@@ -2,9 +2,61 @@

Enriches **Person** and **Company** records with [People Data Labs](https://www.peopledatalabs.com/) (PDL) data.

> **Status: data-model scaffold.** This package defines the fields, relation, indexes,
> views, role, and manifest. The enrichment **logic function (the "mapper") is not yet
> implemented** — see [What the mapper must do](#what-the-mapper-must-do).
> **Status: data model + enrichment mapper.** This package defines the fields, relation,
> views, role, and manifest, and implements the enrichment **logic functions** that call the
> PDL REST API and map the response onto the standard + `pdl*` fields. The manual "Enrich"
> record-action workflows are currently **created by hand** — automatic post-install seeding is
> implemented but not wired up (see [Seeded workflows](#seeded-workflows-post-install)).

---

## Enrichment logic functions

`enrich-person` / `enrich-company` (bulk workflow actions, for the manual record action) plus
`enrich-person-tool` / `enrich-company-tool` (single-record AI tools) all delegate to a shared,
trigger-agnostic core in `src/logic-functions/handlers/`:

- The workflow-action functions accept a **list of records** (`{ records, force? }`) and loop
the single-record core over each, aggregating the outcome (`total` / `matched` / `notFound` /
`skipped` / `errored`); a per-record failure is captured as `ERROR` without aborting the batch
(`src/logic-functions/utils/run-batch-enrichment.ts`). The AI tools stay single-record.

- Read the record, guard against re-enriching within a TTL (`pdlLastEnrichedAt`), pick a
match identifier (person: `pdlId` → LinkedIn → email → name; company: `pdlId` → domain →
name), and call the PDL Person/Company Enrichment API (`src/logic-functions/utils/`).
- On a match: fill **standard fields only when empty** (never clobber user data), always
(re)write `pdl*` fields, and set `pdlEnrichmentStatus = MATCHED`, `pdlLastEnrichedAt`,
`pdlRawPayload` (+ `pdlLikelihood` for Person). PDL `404` → `NOT_FOUND`; other errors →
`ERROR`. No identifier / fresh TTL → skipped with no writes.
- SELECT/MULTI_SELECT values are normalized and dropped if not in the field's option set
(`src/logic-functions/utils/`); the option sets are the same `src/constants/*-options.ts`
the field definitions use.

Run locally: `yarn twenty dev:function:exec -n enrich-person -p '{"records":[{"id":"<id>"}]}'`.
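
The batch behaviour described above (loop the single-record core, count outcomes, record a thrown error as `ERROR` without aborting) can be sketched as follows. This is a minimal illustration, not the contents of `run-batch-enrichment.ts`: the type names, the `runBatchEnrichment` signature, and the injected `enrichOne` callback are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and signatures are illustrative assumptions,
// not the actual code in src/logic-functions/utils/run-batch-enrichment.ts.
type EnrichmentOutcome = 'MATCHED' | 'NOT_FOUND' | 'SKIPPED' | 'ERROR';

type BatchSummary = {
  total: number;
  matched: number;
  notFound: number;
  skipped: number;
  errored: number;
};

type RecordRef = { id: string };

// Maps each per-record outcome to the counter it increments.
const COUNTER_FOR_OUTCOME: Record<
  EnrichmentOutcome,
  'matched' | 'notFound' | 'skipped' | 'errored'
> = {
  MATCHED: 'matched',
  NOT_FOUND: 'notFound',
  SKIPPED: 'skipped',
  ERROR: 'errored',
};

async function runBatchEnrichment(
  records: RecordRef[],
  enrichOne: (record: RecordRef) => Promise<EnrichmentOutcome>,
): Promise<BatchSummary> {
  const summary: BatchSummary = {
    total: records.length,
    matched: 0,
    notFound: 0,
    skipped: 0,
    errored: 0,
  };

  for (const record of records) {
    let outcome: EnrichmentOutcome;
    try {
      outcome = await enrichOne(record);
    } catch {
      // A failure on one record is counted, and the loop carries on.
      outcome = 'ERROR';
    }
    summary[COUNTER_FOR_OUTCOME[outcome]] += 1;
  }

  return summary;
}
```

Processing records sequentially keeps the sketch simple and avoids bursts against the PDL API; whether the real helper runs records in sequence or in parallel is not stated in this README.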

### Seeded workflows (post-install)

> **Not currently wired up.** `post-install.function.ts` is a no-op
> (`return { seededWorkflows: [] }`); the seeding implementation in
> `src/logic-functions/handlers/post-install.ts` (`postInstallCore`) is **not invoked**. An
> app's `CoreApiClient` only exposes per-object CRUD over the workspace `/graphql` schema, and the
> workflow-builder mutations needed to seed a workflow (`createWorkflowVersionStep` /
> `activateWorkflowVersion`) are core resolvers the app surface does not yet expose. Until the SDK
> exposes them, **create the two "Enrich" workflows by hand**.

When re-enabled, each workflow is a `MANUAL` / `BULK_RECORDS` trigger wired to a single
`LOGIC_FUNCTION` step whose `records` input is bound to the selected records
(`{{trigger.companies}}` / `{{trigger.people}}`):

- **Enrich companies** — runs `enrich-company` over the selected Companies.
- **Enrich people** — runs `enrich-person` over the selected People.

The intended seeding (`postInstallCore`) resolves each function's runtime id from its
`universalIdentifier` via the metadata API, publishes the version
(`activateWorkflowVersion`), and is **idempotent** (skips a workflow whose name already exists).

**Deferred to a later PR:** enrichment metering/billing, and auto-enrichment
triggers (on-create event + cron backfill).

---

@@ -25,16 +77,22 @@ PDL schema v34.1**:
| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------ | ------------------------------------------ |
| `pdlSeniority` (`job_title_levels`, array) | MULTI_SELECT | 10 |
| `pdlFundingStages` (`funding_stages`, array) | MULTI_SELECT | 29 |
| `pdlIndustry` / `pdlJobCompanyIndustry` (`industry`) | SELECT | 147 |
| `pdlIndustry` (`industry`) | SELECT | 147 |
| `pdlJobTitleSubRole` (`job_title_sub_role`) | SELECT | 106 |
| `pdlJobTitleClass`, `pdlInferredSalary`, `pdlSex`, `pdlCompanyType`, `pdlSizeRange`, `pdlJobCompanySize`, `pdlLatestFundingStage`, `pdlLocationContinent`, `pdlLocationMetro`, `pdlMicExchange` | SELECT | 5 / 11 / 2 / 6 / 8 / 8 / 29 / 7 / 384 / 70 |
| `pdlJobTitleClass`, `pdlInferredSalary`, `pdlSex`, `pdlCompanyType`, `pdlSizeRange`, `pdlLatestFundingStage`, `pdlLocationContinent`, `pdlLocationMetro`, `pdlMicExchange` | SELECT | 5 / 11 / 2 / 6 / 8 / 29 / 7 / 384 / 70 |

- Option `value`s are normalized to **GraphQL enum names** (`united states` → `UNITED_STATES`):
uppercase, accents stripped, non-alphanumeric → `_`, digit-leading prefixed.
- Option `universalIdentifier`s are **unique per field** (shared enums like industry get a
separate id-set per field).
- Option `universalIdentifier`s are **unique per field** (shared enums like industry, metro, and
funding stage get a separate id-set per field).
- The large option sets (`metro-options.ts`, `industry-options.ts`, …) and the UUID registry
(`universal-identifiers.ts`) are generated from the PDL taxonomy and checked in. When
regenerating for a newer PDL schema, **never change an existing option or field UUID** — that
orphans stored data; only append ids for new options. `select-option-constants.spec.ts` guards
global UUID uniqueness, value normalization, and per-field id integrity.
- **Stays `TEXT`** (no canonical PDL enum file exists): `pdlIndustryDetail` (`industry_v2`),
`pdlJobOnetCode`, `pdlLocationRegion`.
`pdlJobOnetCode`. PDL `location_region` has no dedicated field — it fills the `state` slot of
the person `pdlLocation` ADDRESS composite.
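
The value normalization and the option-set check described in the bullets above can be sketched like this. It is an illustration under assumptions: the function names are hypothetical, and three details are guesses that the README does not specify, namely that a run of non-alphanumeric characters collapses to a single `_`, that leading and trailing `_` are trimmed, and that a digit-leading value is prefixed with `_`.

```typescript
// Hypothetical sketch of option-value normalization. The collapse of repeated
// separators, the trimming, and the "_" digit prefix are assumptions.
function normalizeOptionValue(raw: string): string {
  const withoutAccents = raw.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
  const enumLike = withoutAccents
    .toUpperCase()
    .replace(/[^A-Z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
  // GraphQL enum names cannot start with a digit.
  return /^[0-9]/.test(enumLike) ? `_${enumLike}` : enumLike;
}

// Returns the normalized value only when the field's option set contains it.
function toSelectValue(
  raw: string | null | undefined,
  allowedValues: ReadonlySet<string>,
): string | undefined {
  if (!raw) return undefined;
  const normalized = normalizeOptionValue(raw);
  return allowedValues.has(normalized) ? normalized : undefined;
}

// MULTI_SELECT variant: unknown values are dropped, known ones are kept once.
function toMultiSelectValues(
  raw: readonly string[] | null | undefined,
  allowedValues: ReadonlySet<string>,
): string[] {
  const kept = new Set<string>();
  for (const item of raw ?? []) {
    const value = toSelectValue(item, allowedValues);
    if (value !== undefined) kept.add(value);
  }
  return [...kept];
}
```

For example, `toSelectValue('united states', new Set(['UNITED_STATES']))` yields `UNITED_STATES`, while a value missing from the set yields `undefined` and would stay available only through `pdlRawPayload`.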

### Standard-field mapping

@@ -58,11 +116,19 @@ into the standard bags).
- `pdlLocationMetro` (both) and `pdlLocationContinent` (company) stay SELECT — ADDRESS has no slot.
_Trade-off:_ ADDRESS `country` is free text, so the country SELECT was dropped.

### Relation
### Current company → standard `company`

PDL's detected current employer (`job_company_*`) is resolved to a Company record
(**find-or-create**, matched by `pdlId` → domain → LinkedIn → name; created with
`name` / `domainName` / `linkedinLink` + `pdlId` / `pdlIndustry` / `pdlSizeRange` when none
matches) and linked via the **standard `company`** relation, **fill-only-if-empty** — it never
overwrites a company the user already set, and the lookup is skipped entirely when the person
already has one (no orphan companies).

Dedicated **`pdlCurrentCompany`** (Person `MANY_TO_ONE` → Company) ↔ inverse
**`pdlCurrentEmployees`** (Company `ONE_TO_MANY` → Person). Deliberately **not** the standard
`company` relation, so PDL's detected employer can't overwrite the user's CRM account link.
Company attributes live on the **Company** record, not denormalized on the Person. The earlier
`pdlCurrentCompany` / `pdlCurrentEmployees` relation and the six `pdlJobCompany*` scalar fields
were **removed** as duplicates of the standard `company` relation and the linked Company's own
fields.
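
As a rough sketch of the matching order described above (`pdlId` → domain → LinkedIn → name, skipped entirely when the person already has a company), the lookup candidates could be built as below. The `CurrentEmployer` shape, the `companyId` property, and both function names are hypothetical; the actual query and create calls are left out because they depend on the SDK client.

```typescript
// Hypothetical sketch: type and function names are illustrative assumptions.
type CurrentEmployer = {
  pdlId?: string | null;
  domain?: string | null;
  linkedinUrl?: string | null;
  name?: string | null;
};

type CompanyLookupField = 'pdlId' | 'domainName' | 'linkedinLink' | 'name';

type CompanyLookup = { field: CompanyLookupField; value: string };

// Fill-only-if-empty: no lookup (and so no orphan company) when already linked.
function shouldResolveCompany(person: { companyId?: string | null }): boolean {
  return !person.companyId;
}

// Returns the lookups to try, strongest identifier first, skipping blanks.
function buildCompanyLookups(employer: CurrentEmployer): CompanyLookup[] {
  const candidates: Array<[CompanyLookupField, string | null | undefined]> = [
    ['pdlId', employer.pdlId],
    ['domainName', employer.domain],
    ['linkedinLink', employer.linkedinUrl],
    ['name', employer.name],
  ];

  const lookups: CompanyLookup[] = [];
  for (const [field, value] of candidates) {
    const trimmed = value?.trim();
    if (trimmed) lookups.push({ field, value: trimmed });
  }
  return lookups;
}
```

A caller would try each lookup in order, stop at the first hit, and create a Company only when every lookup misses.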

### Enrichment metadata

@@ -75,40 +141,40 @@ Dedicated **`pdlCurrentCompany`** (Person `MANY_TO_ONE` → Company) ↔ inverse
### Other

- `pdlTotalFunding` is `CURRENCY` (mapper must convert the bare USD float → micros).
- **Indexes**: `pdlId` and `pdlLastEnrichedAt` on both objects.
- **Views**: a curated "People Data Labs" TABLE view per object.
- **Role**: read/update on Person & Company (object-level; tighten to field-scoped later).

---

## What the mapper must do

The logic function (to be built) must:
## What the mapper does

**Orchestration**
**Orchestration** (`src/logic-functions/`)

1. Trigger via manual command-menu action / record create / batch (TBD).
2. Call PDL Person and/or Company Enrichment with `PDL_API_KEY`; pass `min_likelihood` /
`required_fields` to control match quality.
3. On `200` → write fields + set `pdlEnrichmentStatus = MATCHED`; on `404` →
`NOT_FOUND`; on error → `ERROR`.
4. Respect PDL rate limits (queue / throttle on `429`).
5. **TTL guard**: skip re-enrichment if `pdlLastEnrichedAt` is recent; prefer re-enriching by `pdlId`.
1. Runs from the manual "Enrich" record action (`BULK_RECORDS`) or the single-record AI tools.
2. Calls the PDL Person / Company Enrichment API with `PDL_API_KEY`, passing a `min_likelihood`
chosen by identifier strength (2 with a strong identifier, 6 for a weaker name-based match;
overridable per call).
3. A match → `pdlEnrichmentStatus = MATCHED`; PDL `404` / no match → `NOT_FOUND`; other errors →
`ERROR`. Errored and not-found records are also stamped with `pdlLastEnrichedAt` so the TTL
guard backs off instead of re-submitting them on every run.
4. **TTL guard**: skips re-enrichment when `pdlLastEnrichedAt` is within 7 days (bypass with
`force`), and prefers re-enriching by `pdlId`.
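
The TTL guard and the identifier-dependent `min_likelihood` from the orchestration steps can be sketched as below. The 7-day window, the `force` bypass, the `pdlId` → LinkedIn → email → name order, and the 2 / 6 thresholds come from this README; the function names, the input shapes, and the choice to treat an unparseable timestamp as stale are assumptions.

```typescript
// Hypothetical sketch: names and input shapes are illustrative assumptions.
const ENRICHMENT_TTL_MS = 7 * 24 * 60 * 60 * 1000;

// True when the record was enriched recently enough to skip this run.
function isWithinTtl(
  pdlLastEnrichedAt: string | null | undefined,
  now: Date,
  force = false,
): boolean {
  if (force || !pdlLastEnrichedAt) return false;
  const lastEnrichedMs = Date.parse(pdlLastEnrichedAt);
  // Assumption: an unparseable timestamp counts as stale.
  if (Number.isNaN(lastEnrichedMs)) return false;
  return now.getTime() - lastEnrichedMs < ENRICHMENT_TTL_MS;
}

type PersonIdentifierKind = 'pdlId' | 'linkedin' | 'email' | 'name';

type PersonIdentifier = { kind: PersonIdentifierKind; value: string };

type PersonIdentifierInput = {
  pdlId?: string | null;
  linkedinUrl?: string | null;
  email?: string | null;
  fullName?: string | null;
};

// Picks the strongest available identifier, or undefined when there is none.
function pickPersonIdentifier(
  person: PersonIdentifierInput,
): PersonIdentifier | undefined {
  const ordered: Array<[PersonIdentifierKind, string | null | undefined]> = [
    ['pdlId', person.pdlId],
    ['linkedin', person.linkedinUrl],
    ['email', person.email],
    ['name', person.fullName],
  ];
  for (const [kind, value] of ordered) {
    const trimmed = value?.trim();
    if (trimmed) return { kind, value: trimmed };
  }
  return undefined;
}

// 2 for a strong identifier, 6 for a name-based match, unless overridden.
function minLikelihoodFor(identifier: PersonIdentifier, override?: number): number {
  if (override !== undefined) return override;
  return identifier.kind === 'name' ? 6 : 2;
}
```

A record with no identifier, or one still inside the TTL window, would be reported as skipped with no writes.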

**Field writing**

6. Write **standard fields** (fill-only-if-empty to avoid overwriting user data):
Person `name`, `emails`, `phones`, `linkedinLink`, `jobTitle`; Company `name`,
`domainName`, `linkedinLink`, `address`.
7. Write `pdl*` fields for everything else.
8. **SELECT guard**: only write a SELECT/MULTI_SELECT value if the normalized value exists in
the field's option set; otherwise skip and keep it in `pdlRawPayload` (handles PDL schema
versions newer than v34.1). Use the same normalization as the option `value`s.
9. **MULTI_SELECT** arrays: `job_title_levels` → `pdlSeniority`; `funding_stages` → `pdlFundingStages`.
10. **CURRENCY**: `total_funding_raised` (USD float) → `{ amountMicros: value × 1_000_000, currencyCode: 'USD' }`.
11. **ADDRESS**: split PDL `location.*` into the composite — Company → standard `address`,
Person → `pdlLocation`.
12. **Relation**: resolve `job_company_id` → find/upsert a Company record → link `pdlCurrentCompany`.
13. **Dates**: handle partial PDL dates (`YYYY`, `YYYY-MM`) for `job_start_date`,
`last_funding_date`, `birth_date`.
14. Always set `pdlId`, `pdlLastEnrichedAt`, `pdlRawPayload`, `pdlLikelihood` (person).
5. **Standard fields** are filled **only when empty** (never clobber user data): Person `name`,
`emails`, `phones`, `linkedinLink`, `jobTitle`; Company `name`, `domainName`, `linkedinLink`,
`address`. All `pdl*` fields are (re)written on every match.
6. **SELECT guard**: a SELECT/MULTI_SELECT value is written only if its normalized form is in the
field's option set; otherwise it is skipped and preserved in `pdlRawPayload` (handles PDL
schema versions newer than the bundled one). `job_title_levels` → `pdlSeniority`,
`funding_stages` → `pdlFundingStages`.
7. **CURRENCY**: `total_funding_raised` (USD) → `{ amountMicros: value × 1_000_000, currencyCode: 'USD' }`.
8. **ADDRESS**: PDL `location.*` is split into the composite — Company → standard `address`,
Person → `pdlLocation`.
9. **Current company**: `job_company_*` is resolved to a Company record (find-or-create, matched by
`pdlId` → domain → LinkedIn → name) and linked via the standard `company` relation
(fill-only-if-empty); resolutions are cached within a batch run.
10. **Dates**: partial PDL dates (`YYYY`, `YYYY-MM`) for `job_start_date`, `last_funding_date`,
`birth_date` are expanded and range-validated.
11. Always sets `pdlId`, `pdlLastEnrichedAt`, `pdlRawPayload` (+ `pdlLikelihood` for Person).
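
Two of the field conversions listed above lend themselves to a short sketch: the USD float to micros conversion and the expansion of partial dates. The function names are hypothetical. Rounding to the nearest micro, and expanding `YYYY` / `YYYY-MM` to the first day of the period, are assumptions; the README only says that partial dates are "expanded and range-validated".

```typescript
// Hypothetical sketch: function names and rounding behaviour are assumptions.
type CurrencyValue = { amountMicros: number; currencyCode: 'USD' };

// total_funding_raised (bare USD float) -> CURRENCY composite in micros.
function toUsdCurrency(
  totalFundingRaised: number | null | undefined,
): CurrencyValue | undefined {
  if (totalFundingRaised == null || !Number.isFinite(totalFundingRaised)) {
    return undefined;
  }
  return {
    amountMicros: Math.round(totalFundingRaised * 1_000_000),
    currencyCode: 'USD',
  };
}

// Expands "YYYY" or "YYYY-MM" to a full "YYYY-MM-DD" date.
// Assumption: missing parts default to the first month / first day.
function expandPartialDate(raw: string | null | undefined): string | undefined {
  const match = /^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$/.exec(raw ?? '');
  if (!match) return undefined;

  const year = Number(match[1]);
  const month = Number(match[2] ?? '1');
  const day = Number(match[3] ?? '1');

  // Range validation: an out-of-range month or day rolls over in Date.UTC,
  // so the round-tripped parts no longer match the input.
  const date = new Date(Date.UTC(year, month - 1, day));
  const isValid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;

  return isValid ? date.toISOString().slice(0, 10) : undefined;
}
```

Returning `undefined` for invalid input lets the caller leave the field untouched while the original string remains in `pdlRawPayload`.
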
6 changes: 4 additions & 2 deletions packages/twenty-apps/internal/people-data-labs/package.json
@@ -16,8 +16,10 @@
"twenty": "twenty",
"lint": "oxlint -c .oxlintrc.json .",
"lint:fix": "oxlint --fix -c .oxlintrc.json .",
"test": "vitest run --passWithNoTests",
"test:watch": "vitest"
"typecheck": "tsc --noEmit -p tsconfig.spec.json",
"test": "vitest run --config vitest.unit.config.ts --passWithNoTests",
"test:watch": "vitest --config vitest.unit.config.ts",
"test:integration": "vitest run --passWithNoTests"
},
"dependencies": {
"twenty-client-sdk": "2.10.1",
@@ -6,6 +6,7 @@ export default defineApplication({
universalIdentifier: APPLICATION_UNIVERSAL_IDENTIFIER,
displayName: 'People Data Labs',
description: 'Enrich People and Companies with People Data Labs data.',
logoUrl: 'public/people-data-labs-icon.png',
serverVariables: {
PDL_API_KEY: {
description: 'People Data Labs API key',