PIIvacy

Go ahead. Paste it in.

A serverless PII scrubber so you can actually use the customer's email in your LLM prompt. Two functions. Zero proxies. Zero runtime deps. Runs anywhere Node 18+ runs.

npm install piivacy
35+ PII patterns
3 substitution modes
267k fake-name table
0 runtime deps
218 passing tests
Diagram of the PIIvacy scrubbing pipeline

What is PII, and why does it matter?

PII stands for Personally Identifiable Information — anything that can identify a real person or expose them to harm. That includes the obvious stuff (names, email addresses, phone numbers, Social Security numbers, credit cards, home addresses) and the less-obvious stuff (API keys, IP addresses, dates of birth, internal project codenames, license plates, MAC addresses, BTC/ETH wallets, employee IDs).

When you paste text containing PII into ChatGPT, Claude, Gemini, or any third-party LLM, that data leaves your control. It may be retained, logged, used to fine-tune future models, or surfaced to humans during safety review. For a hobby project that's annoying. For a product handling customer data, that's a leak you can't undo.

PIIvacy fixes this. Two functions: scrub() swaps PII for safe substitutes before the LLM ever sees the text. restore() puts the real values back when the LLM responds. The LLM provider only ever sees substitutes. Your users see the real values. No proxy, no server, no third-party service to trust — it runs inside your app.
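The swap-and-restore roundtrip is easy to picture with a toy version. The sketch below handles a single email pattern with a plain `Map`; it is an illustration of the idea only, not PIIvacy's implementation (the real library ships dozens of patterns, multiple substitution modes, and proper session handling):

```javascript
// Toy substitute-and-restore roundtrip. Illustrative only -- not
// PIIvacy's actual code or API.
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/g;

function toyScrub(text, map = new Map()) {
  const scrubbed = text.replace(EMAIL, (match) => {
    // Reuse the same token when the same value appears again.
    for (const [token, value] of map) if (value === match) return token;
    const token = `[[EMAIL_${map.size + 1}]]`;
    map.set(token, match);
    return token;
  });
  return { text: scrubbed, map };
}

function toyRestore(text, map) {
  let out = text;
  for (const [token, value] of map) out = out.split(token).join(value);
  return out;
}

const { text, map } = toyScrub('Contact jane.doe@example.com please.');
// text === 'Contact [[EMAIL_1]] please.'
const back = toyRestore('We emailed [[EMAIL_1]].', map);
// back === 'We emailed jane.doe@example.com.'
```

The LLM only ever sees `[[EMAIL_1]]`; the mapping that turns it back into the real address never leaves your process.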

What it actually looks like

Real input → what the LLM sees → what your user sees back:

1. The user types
Hi support, I'm Jane Doe. My email is
jane.doe@example.com and my order #18472
shipped to 234 Main Street. The card
ending in 4242 was charged twice.
2. PIIvacy scrubs → goes to LLM
Hi support, I'm Jane Doe. My email is
[[EMAIL_1]] and my order #18472
shipped to [[ADDRESS_US_1]]. The card
ending in 4242 was charged twice.
3. LLM responds, PIIvacy restores
Sorry about the duplicate charge, Jane.
We'll refund the second charge to the card
ending 4242 within 3 business days.
Confirmation will be sent to jane.doe@example.com.

Names, codenames, and oblique references that regex can't catch? Pipe the scrubbed text through any cheap LLM using the included missed-PII helpers, and the next scrub pass catches what the patterns missed.
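The second pass boils down to two small steps: ask a cheap model what identifiers survived scrubbing, then feed its answer back as extra patterns. The helper names below (`buildMissedPiiPrompt`, `parseMissedPii`) are hypothetical stand-ins, not PIIvacy's actual exports; see the Docs tab for the real missed-PII helpers:

```javascript
// Hypothetical sketch of the missed-PII review pass. Helper names are
// illustrative, not PIIvacy's real API.
function buildMissedPiiPrompt(scrubbedText) {
  return [
    'List any personal names, codenames, or other identifiers still',
    'present in the text below. Answer as a JSON array of strings.',
    '---',
    scrubbedText,
  ].join('\n');
}

function parseMissedPii(llmAnswer) {
  // The cheap model answers e.g. '["Jane Doe", "Project Falcon"]';
  // each string becomes a literal pattern for the next scrub pass.
  return JSON.parse(llmAnswer).map((s) => String(s));
}

const prompt = buildMissedPiiPrompt("Hi, I'm Jane Doe ([[EMAIL_1]]).");
const missed = parseMissedPii('["Jane Doe"]');
// missed === ['Jane Doe']
```

Because the review model only ever sees already-scrubbed text, even this second pass stays within the no-raw-PII boundary.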

Who needs this

💬 Customer-support chatbots

Tickets, transcripts, and complaint emails routinely contain SSNs, card numbers, and addresses. Scrub before the model summarizes or replies.

🛠️ Internal AI assistants

Slack bots and Notion plugins that touch HR data, payroll exports, or customer records. Same engineering team, but the LLM provider isn't on your data-handling agreement.

📝 Content + analysis tools

Form processors, log analyzers, code-review bots that send raw text into prompts. Anything where users paste real-world data.

🔐 Regulated industries

Healthcare, fintech, legal, education. Defense in depth — combine PIIvacy with your existing data-handling controls. Not a substitute for compliance review, but a meaningful surface-area reduction.

How it works in three lines

import { createSession, scrub, restore } from 'piivacy';

const session = createSession();
const { text } = await scrub(userInput, session);  // PII → safe substitutes
// ... call your LLM with `text` ...
const restored = restore(llmResponse, session);  // substitutes → real values

That's the whole library. Two functions. Three substitution modes (token, realistic, pass-through). 58 patterns out of the box. Zero runtime dependencies. Full docs in the Docs tab.
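The three substitution modes differ only in what goes in place of each match. This toy chooser approximates the behavior (the mode names come from the docs above; the function and its signature are illustrative, not the library's API):

```javascript
// Conceptual sketch of the three substitution modes. Toy behavior --
// not PIIvacy's implementation.
function substitute(kind, value, index, mode) {
  switch (mode) {
    case 'token':        // opaque placeholder, restored verbatim later
      return `[[${kind}_${index}]]`;
    case 'realistic':    // plausible fake value so prompts read naturally
      return `person${index}@example.com`; // toy stand-in for the fake table
    case 'pass-through': // leave matches of this pattern untouched
      return value;
    default:
      throw new Error(`unknown mode: ${mode}`);
  }
}

substitute('EMAIL', 'jane@real.com', 1, 'token');        // '[[EMAIL_1]]'
substitute('EMAIL', 'jane@real.com', 1, 'pass-through'); // 'jane@real.com'
```

Token mode is the safest default; realistic mode helps when the LLM's output quality depends on seeing natural-looking values instead of bracketed placeholders.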

Try it

Paste any text below or pull from the 500-sample test corpus. The scrubber runs server-side using the actual published package — no LLM calls, no third parties, nothing leaves this server.
