PIIvacy is a serverless JavaScript npm package that scrubs personally identifiable information from text before sending it to a Large Language Model, then restores the original values when the LLM responds. It catches 35+ types of PII via regex (emails, phone numbers, credit cards, SSNs, addresses, 25+ provider API keys, JWTs, IP addresses, and more) and provides three substitution modes: opaque tokens, realistic fakes, or pass-through. It has zero runtime dependencies and runs anywhere Node.js 18+ runs.

How is PIIvacy different from a privacy proxy?

PIIvacy is a library, not a service. It runs inside your application as two functions: scrub(text, session) and restore(text, session). No proxy, no separate infrastructure, no new failure modes. The package never makes HTTP calls itself; you wire it up to whatever LLM client you already have.

What does PIIvacy catch that regex usually misses?

Regex catches most structured PII reliably. Names, codenames, project references, and free-form sensitive context need a second pass. PIIvacy ships BYO-LLM helpers (buildPiiCheckPrompt and parsePiiCheckResponse) that let you wire up any chat model — including in-browser models like WebLLM — to catch what regex missed and feed those flagged values back into the next scrub.
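
The second pass described above can be sketched roughly as follows. This is a standalone illustration with hypothetical helper names (`sketchBuildPrompt`, `sketchParseResponse`); the arguments and return shapes of PIIvacy's own buildPiiCheckPrompt and parsePiiCheckResponse are not documented here and may differ. The model call is replaced by a canned reply so the snippet runs on its own.

```javascript
// Hypothetical helpers for illustration only. They are NOT PIIvacy's
// buildPiiCheckPrompt / parsePiiCheckResponse, whose signatures may differ.
function sketchBuildPrompt(scrubbedText) {
  return [
    'List any names, codenames, or other sensitive values still present in the text.',
    'Reply with a JSON array of strings and nothing else.',
    '',
    scrubbedText,
  ].join('\n');
}

function sketchParseResponse(reply) {
  // Models often wrap JSON in extra prose, so grab the outermost [...] span.
  const match = reply.match(/\[[\s\S]*\]/);
  if (!match) return [];
  try {
    const parsed = JSON.parse(match[0]);
    if (!Array.isArray(parsed)) return [];
    // Keep only non-empty strings; anything else is ignored.
    return parsed.filter((v) => typeof v === 'string' && v.trim() !== '');
  } catch {
    return []; // unparseable reply: treat as "nothing flagged"
  }
}

const prompt = sketchBuildPrompt('Dana Whitfield leads Project Falcon.');

// Stand-in for a real chat model response.
const fakeModelReply = 'Here you go:\n["Project Falcon", "Dana Whitfield"]';
const flagged = sketchParseResponse(fakeModelReply);
console.log(flagged);
```

The flagged values would then be handed to the next scrub so they are substituted like any regex match; how that hand-off is configured is up to the library's actual API.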

Can PIIvacy generate realistic fake names?

Yes. PIIvacy ships a 267,072-name fake-name table built from US Social Security Administration baby names and US Census 2010 surnames, both public-domain. Names are bucketed by demographics (gender × decade-of-peak-popularity for first names, dominant Census ethnicity for surnames) so realistic-mode substitutions stay culturally and demographically appropriate. Marcus → Terrance, Bryant, Orlando, Cedric, Terrell. Garcia → Nunez, Valdez, Santiago, Maldonado, Dominguez.

Is PIIvacy free and open source?

Yes. PIIvacy is MIT licensed and free to use, fork, and extend. Install with `npm install piivacy`. Source code is at github.com/callieschneider/piivacy.

PIIvacy — paste PII into LLMs without leaking it

Real name	Bucket	Alternates
Marcus	m:1980s	Terrance, Bryant, Orlando, Cedric, Terrell
Sarah	f:1980s	Amber, Danielle, Brittany, Tiffany, Crystal
Mervyn	m:1930s	Ned, Delmar, Dudley, Arlen, Huey
Liam	m:2020s	Waylon, Griffin, Ellis, Rowan, Alonzo
Garcia	hispanic	Nunez, Valdez, Santiago, Maldonado, Dominguez
Nguyen	eastAsian	Li, Wong, Le, Wang, Park
Patel	southAsian	Sharma, Chu, Ma, Chin, Kumar
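
To make the bucketing idea concrete, here is a minimal sketch that picks an alternate from the same bucket as the real name, using a few rows from the table above. The lookup structure, the `pickAlternate` name, and the hash-based deterministic choice are all assumptions made for illustration; the package's internal format and selection logic may be different.

```javascript
// Tiny excerpt built from rows in the table above. The shipped table is far
// larger, and this layout is an assumption, not the package's internal format.
const BUCKETS = {
  'm:1980s': ['Marcus', 'Terrance', 'Bryant', 'Orlando', 'Cedric', 'Terrell'],
  'f:1980s': ['Sarah', 'Amber', 'Danielle', 'Brittany', 'Tiffany', 'Crystal'],
  hispanic: ['Garcia', 'Nunez', 'Valdez', 'Santiago', 'Maldonado', 'Dominguez'],
};

// Reverse index: name -> bucket key.
const NAME_TO_BUCKET = new Map();
for (const [bucket, names] of Object.entries(BUCKETS)) {
  for (const name of names) NAME_TO_BUCKET.set(name, bucket);
}

// Simple string hash so the same input always maps to the same alternate.
function hashString(s) {
  let h = 0;
  for (const ch of s) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

// Hypothetical picker: choose another name from the same demographic bucket.
function pickAlternate(realName) {
  const bucket = NAME_TO_BUCKET.get(realName);
  if (!bucket) return null; // unknown name: the caller decides the fallback
  const candidates = BUCKETS[bucket].filter((n) => n !== realName);
  return candidates[hashString(realName) % candidates.length];
}

console.log(pickAlternate('Marcus'), pickAlternate('Garcia'));
```

Staying inside one bucket is what keeps a 1980s first name from being swapped for one that peaked in the 1930s.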

Quick start

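import OpenAI from 'openai';
import { scrub, restore, createSession } from 'piivacy';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const session = createSession();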

// Before sending to the LLM
const { text } = await scrub(userInput, session);

// Send to your LLM
const reply = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: text }]
});

// Put the PII back
const restored = restore(reply.choices[0].message.content, session);
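
For readers who want to see what a scrub/restore round trip involves, here is a toy token-mode version. It is a standalone sketch, not PIIvacy's implementation: the session shape, the `toyScrub` and `toyRestore` names, and the single simplified email regex are all assumptions chosen to keep the example short.

```javascript
// Toy illustration only, NOT PIIvacy's implementation.
// The session remembers which token stands for which original value.
function createToySession() {
  return { tokenToValue: new Map(), valueToToken: new Map(), counters: {} };
}

// One simplified pattern; the real package ships many more.
const TOY_EMAIL = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g;

function toyScrub(text, session) {
  return text.replace(TOY_EMAIL, (match) => {
    // Reuse the same token when a value repeats.
    const known = session.valueToToken.get(match);
    if (known) return known;
    const n = (session.counters.EMAIL ?? 0) + 1;
    session.counters.EMAIL = n;
    const token = `[[EMAIL_${n}]]`;
    session.tokenToValue.set(token, match);
    session.valueToToken.set(match, token);
    return token;
  });
}

function toyRestore(text, session) {
  let out = text;
  for (const [token, value] of session.tokenToValue) {
    out = out.split(token).join(value); // replace every occurrence
  }
  return out;
}

const toySession = createToySession();
const scrubbed = toyScrub('Write to ada@example.com, cc ada@example.com', toySession);
console.log(scrubbed); // Write to [[EMAIL_1]], cc [[EMAIL_1]]

// Pretend the LLM echoed the token back in its reply.
const llmReply = 'Sure, I will email [[EMAIL_1]] today.';
const roundTrip = toyRestore(llmReply, toySession);
console.log(roundTrip); // Sure, I will email ada@example.com today.
```

Because the mapping lives in the session object, the same session must be passed to both calls, which matches how the quick start threads `session` through scrub and restore.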

Per-category modes

await scrub(text, session, {
  defaultMode: 'token',
  modes: {
    contact: 'realistic',     // emails, phones → fakes
    location: 'pass-through', // addresses preserved
    secrets: 'token'          // API keys → [[OPENAI_KEY_1]]
  }
});

Per-label overrides

await scrub(text, session, {
  defaultMode: 'token',
  modes: { contact: 'realistic' },
  labels: {
    EMAIL: 'realistic',     // explicit wins
    DOB:   'pass-through',  // we want to discuss age
    ZIP_US: 'token'         // override location-category default
  }
});
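
The precedence implied by the comments above (explicit label, then category, then defaultMode) can be sketched as a small lookup. This is an illustration under assumptions: `resolveMode` is a hypothetical name, and the label-to-category map below only includes pairings the snippets on this page suggest.

```javascript
// Assumed label -> category mapping, limited to what the comments above imply.
const LABEL_CATEGORY = { EMAIL: 'contact', PHONE_US: 'contact', ZIP_US: 'location' };

// Hypothetical resolver, not the package's internals.
function resolveMode(label, options) {
  const { defaultMode = 'token', modes = {}, labels = {} } = options;
  // 1. An explicit per-label override wins.
  if (Object.hasOwn(labels, label)) return labels[label];
  // 2. Otherwise fall back to the mode of the label's category, if one is set.
  const category = LABEL_CATEGORY[label];
  if (category && Object.hasOwn(modes, category)) return modes[category];
  // 3. Otherwise use the default.
  return defaultMode;
}

const exampleOptions = {
  defaultMode: 'token',
  modes: { contact: 'realistic', location: 'pass-through' },
  labels: { ZIP_US: 'token' },
};

console.log(resolveMode('ZIP_US', exampleOptions));   // token
console.log(resolveMode('PHONE_US', exampleOptions)); // realistic
console.log(resolveMode('SSN', exampleOptions));      // token
```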

Built-in presets

import { presets } from 'piivacy';

await scrub(text, session, presets.maximumRedaction);
// → tokens for everything

await scrub(text, session, presets.naturalConversation);
// → contact + location realistic; secrets/financial/identifiers token

await scrub(text, session, presets.localSearch);
// → location pass-through; everything else tokenized

await scrub(text, session, presets.testFriendly);
// → realistic where possible; token for danger categories

Add your own pattern

import { registerPattern } from 'piivacy';

registerPattern({
  label: 'INTERNAL_TICKET',
  regex: /\bTICKET-\d{6}\b/g,        // /g flag REQUIRED
  category: 'custom',
  priority: 25,                      // slots between defaults
  validate: (v) => Number(v.slice(7)) > 100000,
  fake: (_value, { counter }) =>
    `TICKET-${(counter + 1000).toString().padStart(6, '0')}`,
  description: 'Internal Jira-style ticket'
});
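
To show how the regex, validate, and fake fields might interact, here is a standalone sketch that applies one such pattern to a string. The `applyPattern` function and the counter starting at 1 are assumptions, not the package's internals. It also illustrates one plausible reason the /g flag matters: `String.prototype.matchAll` throws a TypeError when given a regex without it.

```javascript
// Same shape as the registration above, trimmed to the fields this sketch uses.
const ticketPattern = {
  label: 'INTERNAL_TICKET',
  regex: /\bTICKET-\d{6}\b/g,
  validate: (v) => Number(v.slice(7)) > 100000,
  fake: (_value, { counter }) =>
    `TICKET-${(counter + 1000).toString().padStart(6, '0')}`,
};

// Hypothetical applier for illustration; PIIvacy's real matching loop may differ.
function applyPattern(text, pattern) {
  let counter = 0; // assumed to start at 1 for the first accepted match
  let out = '';
  let last = 0;
  // matchAll throws a TypeError if the regex lacks the g flag.
  for (const m of text.matchAll(pattern.regex)) {
    // Matches rejected by validate are left in the text untouched.
    if (pattern.validate && !pattern.validate(m[0])) continue;
    counter += 1;
    out += text.slice(last, m.index) + pattern.fake(m[0], { counter });
    last = m.index + m[0].length;
  }
  return out + text.slice(last);
}

const result = applyPattern('See TICKET-123456 and TICKET-000050.', ticketPattern);
console.log(result); // See TICKET-001001 and TICKET-000050.
```

The second ticket survives because 50 is not greater than 100000, so validate rejects it.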

Inspect what was redacted

import { listRedactions } from 'piivacy';

await scrub('email alice@example.com phone (415) 555-0142', session);

console.log(listRedactions(session));
// [
//   { kind: 'token', identifier: '[[EMAIL_1]]', label: 'EMAIL',
//     value: 'alice@example.com', count: 1, firstSeenAt: ..., lastSeenAt: ... },
//   { kind: 'token', identifier: '[[PHONE_US_1]]', label: 'PHONE_US',
//     value: '(415) 555-0142', count: 1, ... }
// ]

Useful for audit trails, downstream feature calls, and debugging.
