Privacy

Privacy in AgentSync is local-first and configurable per use case. Nothing leaves the submitter's machine until it has passed through the active privacy profile and (by default) a human preview.

TypeScript model: packages/core/src/privacy.ts
Example config: examples/agentsync.config.example.json

The profile model

A profile is a named bundle of privacy rules. Different use cases want different trade-offs, so AgentSync ships several built-ins and lets users define their own. The active profile is chosen by:

agentsync submit --profile strict        # explicit flag
AGENTSYNC_PROFILE=research                # env
defaultProfile in agentsync.config.json   # config
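
The text does not say which source wins when more than one is set. A minimal sketch, assuming the order listed above is also the precedence (flag, then environment, then config) and assuming a fallback to `strict`; the helper name `resolveProfileName` and the `ProfileSources` shape are hypothetical:

```typescript
// Hypothetical helper, not part of AgentSync's documented API.
interface ProfileSources {
  flag?: string;          // value passed to --profile, if any
  env?: string;           // value of AGENTSYNC_PROFILE, if set
  configDefault?: string; // defaultProfile from agentsync.config.json
}

// Assumed precedence: flag > env > config > "strict".
function resolveProfileName(src: ProfileSources): string {
  return src.flag ?? src.env ?? src.configDefault ?? "strict";
}

console.log(resolveProfileName({ env: "research", configDefault: "raw-min" }));
// prints "research"
```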

Profiles compose via extends, so a custom profile can inherit strict and relax one field.
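
As a rough illustration of how `extends` could be resolved, the sketch below walks the inheritance chain and merges each lever shallowly, with the child's fields overriding the parent's. The `Profile` type, the `resolveProfile` helper, and the shallow-merge rule are assumptions made for this example; the actual model in packages/core/src/privacy.ts may differ.

```typescript
// Hypothetical types and helper for illustration only.
type Section = Record<string, unknown>;

interface Profile {
  name: string;
  extends?: string;
  redaction?: Section;
  anonymization?: Section;
  content?: Section;
  scope?: Section;
}

const LEVERS = ["redaction", "anonymization", "content", "scope"] as const;

// Resolve a profile by following `extends`; child fields override parent fields per lever.
function resolveProfile(
  name: string,
  registry: Record<string, Profile>,
  seen: string[] = [],
): Profile {
  if (seen.includes(name)) {
    throw new Error(`extends cycle: ${[...seen, name].join(" -> ")}`);
  }
  const profile = registry[name];
  if (!profile) throw new Error(`unknown profile: ${name}`);

  const base: Profile = profile.extends
    ? resolveProfile(profile.extends, registry, [...seen, name])
    : { name };

  const merged: Profile = { name: profile.name };
  for (const lever of LEVERS) {
    merged[lever] = { ...base[lever], ...profile[lever] };
  }
  return merged;
}

// A custom profile that inherits strict and relaxes one field.
const registry: Record<string, Profile> = {
  strict: { name: "strict", content: { includeReasoning: false, toolOutput: "metadata-only" } },
  "my-research": { name: "my-research", extends: "strict", content: { includeReasoning: true } },
};
const resolved = resolveProfile("my-research", registry);
console.log(resolved.content); // includeReasoning is overridden, toolOutput is inherited
```

A shallow merge means array-valued fields such as `builtins` are replaced wholesale rather than concatenated, which keeps the result easy to predict.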

Built-in profiles (starting point)

Profile	Intended use case	Posture
`strict`	Default / unsure	Apply all secret and PII detectors, anonymize all identifiers, drop file contents, tool output → `metadata-only`, no reasoning.
`research`	Trajectory corpora	Redact secrets + PII, anonymize identifiers, keep truncated tool output and reasoning, drop raw file contents.
`hiring-signal`	Portfolio / hiring	Redact secrets, keep file contents and full tool output; identity-linked (`anonymization.enabled: false`) so work is attributable to the person.
`raw-min`	Self-hosting / trusted sink	Redact only high-confidence secrets; everything else passes through.

These are defaults, not laws. The whole point of the config is that each deployment / use case picks its own posture.

A profile's four levers

jsonc

{
  "name": "research",
  "extends": "strict",            // optional inheritance
  "redaction":     { … },         // 1. remove secrets & PII
  "anonymization": { … },         // 2. de-identify stable identifiers
  "content":       { … },         // 3. how much of the payload to include
  "scope":         { … }          // 4. which sessions/paths are eligible
}

1. `redaction` — secrets & PII (client-side, on by default)

Client-side secret redaction is the baseline. Detectors run over every event's text and tool I/O before upload.

jsonc

{
  "enabled": true,
  "builtins": ["api-keys","aws","gcp","jwt","private-keys","env-values",
               "emails","ip-addresses","credit-cards"],
  "custom": [
    { "id": "internal-host", "pattern": "\\bcorp\\.example\\.com\\b", "type": "pii" }
  ],
  "replacement": "[REDACTED:{type}]",
  "onUncertain": "redact"          // "redact" | "flag" | "ignore"
}

builtins — named detector packs (regex + entropy + context heuristics).
custom — user-supplied regex for org-specific secrets/hostnames.
onUncertain — what to do with medium-confidence hits; strict redacts.
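
To make the `custom` and `replacement` fields concrete, here is a minimal sketch of applying user-supplied regex detectors to a string. It covers only the custom-regex path; the entropy and context heuristics of the built-in packs are not modeled. The `CustomDetector` type and `redactCustom` function are hypothetical names.

```typescript
// Hypothetical shape mirroring one entry of "custom" in the config above.
interface CustomDetector {
  id: string;
  pattern: string; // regex source, as written in JSON
  type: string;    // substituted for {type} in the replacement template
}

function redactCustom(
  text: string,
  detectors: CustomDetector[],
  replacement: string,
): string {
  let out = text;
  for (const d of detectors) {
    const re = new RegExp(d.pattern, "g");
    // Function replacers avoid accidental "$" expansion in the template.
    out = out.replace(re, () => replacement.replace("{type}", () => d.type));
  }
  return out;
}

const detectors: CustomDetector[] = [
  { id: "internal-host", pattern: "\\bcorp\\.example\\.com\\b", type: "pii" },
];
console.log(redactCustom("ssh build@corp.example.com", detectors, "[REDACTED:{type}]"));
// prints "ssh build@[REDACTED:pii]"
```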

2. `anonymization` — de-identify, don't delete

Replaces stable identifiers with deterministic pseudonyms so trajectory structure survives while identity is protected.

jsonc

{
  "enabled": true,
  "fields": ["username","hostname","abs-paths","repo-name","git-remote"],
  "strategy": "hash",              // "hash" → stable pseudonym | "strip" → remove
  "salt": "per-submitter"          // "per-submitter" | "fixed" | "random-per-run"
}

salt: per-submitter lets a consumer group one person's submissions without ever learning who they are.
The hiring-signal profile sets enabled: false so work is attributable.
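
One way to get deterministic pseudonyms is a keyed hash of the identifier. The sketch below uses HMAC-SHA256 from Node's built-in `node:crypto` module, keyed by the salt. The source only says that `hash` yields a stable pseudonym, so the HMAC construction, the 12-character truncation, and the `pseudonym` helper are all assumptions.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper: same field, value, and salt always give the same pseudonym.
function pseudonym(field: string, value: string, salt: string): string {
  const digest = createHmac("sha256", salt)
    .update(`${field}:${value}`)
    .digest("hex");
  return `${field}_${digest.slice(0, 12)}`;
}

// Placeholder salts; a real per-submitter salt would be a secret kept on the submitter's machine.
const saltA = "example-per-submitter-salt";
const saltB = "another-submitters-salt";

const first = pseudonym("username", "alice", saltA);
const again = pseudonym("username", "alice", saltA);
const other = pseudonym("username", "alice", saltB);

console.log(first === again); // true: stable across submissions with the same salt
console.log(first === other); // false: a different salt gives an unlinkable pseudonym
```

Under this reading, `random-per-run` would correspond to generating a fresh salt for each run, which makes pseudonyms unlinkable even across one person's submissions.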

3. `content` — how much payload to include

The privacy/utility dial. Strip expensive or sensitive material entirely.

jsonc

{
  "includeFileContents": false,
  "toolOutput": "truncated",       // "full" | "truncated" | "metadata-only" | "none"
  "includeReasoning": true,
  "maxToolOutputBytes": 16384
}
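
The four `toolOutput` modes could be applied to a single tool result roughly as follows. What `metadata-only` and `none` retain is not spelled out in the text, so the behavior here (keep tool name, exit code, and byte count; keep nothing) is an assumption, as are the `ToolResult` fields and the `filterToolOutput` name.

```typescript
type ToolOutputMode = "full" | "truncated" | "metadata-only" | "none";

// Hypothetical event shapes for illustration.
interface ToolResult { tool: string; exitCode: number; output: string; }
interface FilteredResult { tool?: string; exitCode?: number; output?: string; bytes?: number; }

function filterToolOutput(
  r: ToolResult,
  mode: ToolOutputMode,
  maxBytes: number,
): FilteredResult {
  const bytes = new TextEncoder().encode(r.output);
  switch (mode) {
    case "full":
      return { ...r };
    case "truncated": {
      // Keep the first maxBytes bytes; a multi-byte character cut at the
      // boundary decodes to U+FFFD rather than throwing.
      const kept = new TextDecoder().decode(bytes.slice(0, maxBytes));
      return { ...r, output: kept };
    }
    case "metadata-only":
      return { tool: r.tool, exitCode: r.exitCode, bytes: bytes.length };
    case "none":
      return {};
  }
}

const result: ToolResult = { tool: "grep", exitCode: 0, output: "abcdefgh" };
console.log(filterToolOutput(result, "truncated", 4).output); // prints "abcd"
```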

4. `scope` — what's even eligible to submit

Path/repo gating so whole categories of work never get considered.

jsonc

{
  "includeRepos": ["github.com/me/*"],
  "excludePaths": ["**/secrets/**","**/.env*","/work/**"],
  "denyIfMatch": ["NDA","CONFIDENTIAL"]    // skip a session if these appear
}
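
A simplified eligibility check for two of these fields might look like the sketch below. It handles `includeRepos` with a minimal `*` wildcard and `denyIfMatch` as a case-sensitive substring test; both matching rules are assumptions. `excludePaths` is left out because `**` patterns need a full glob matcher. The `Session` shape and function names are hypothetical.

```typescript
// Hypothetical shapes for illustration.
interface Scope { includeRepos: string[]; denyIfMatch: string[]; }
interface Session { repo: string; text: string; }

// Minimal wildcard matcher: only "*" is supported and it matches any run of characters.
function wildcardToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// A session is eligible if its repo is allow-listed and no deny term appears in its text.
function isEligible(session: Session, scope: Scope): boolean {
  const repoAllowed = scope.includeRepos.some((g) => wildcardToRegExp(g).test(session.repo));
  const denied = scope.denyIfMatch.some((term) => session.text.includes(term));
  return repoAllowed && !denied;
}

const scope: Scope = {
  includeRepos: ["github.com/me/*"],
  denyIfMatch: ["NDA", "CONFIDENTIAL"],
};
console.log(isEligible({ repo: "github.com/me/tool", text: "fix flaky test" }, scope));     // true
console.log(isEligible({ repo: "github.com/me/tool", text: "CONFIDENTIAL draft" }, scope)); // false
```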

The preview gate

Even with redaction on, the default flow shows a preview of the exact payload and requires confirmation:

agentsync submit                 # interactive preview + confirm
agentsync submit --yes           # CI / non-interactive (still redacts)
agentsync submit --dry-run       # write redacted payload locally, upload nothing

--dry-run is the recommended way to inspect what a profile produces before trusting it.

Server-side defense-in-depth

The client is the privacy authority, but the ingestion API independently re-scans for high-confidence secrets and rejects or quarantines anything that still trips them. A leaked secret should fail closed at two layers, not one. The server never sees pre-redaction data.
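
The server-side re-scan can be pictured as a small set of high-confidence patterns run over the already-redacted payload. The two patterns below (an AWS access key ID prefix and a PEM private-key header) are illustrative only; the source does not list the server's detectors, and the `rescan` function and `Verdict` type are hypothetical.

```typescript
// Illustrative detectors; the real server-side set is not specified in the source.
const HIGH_CONFIDENCE: { id: string; re: RegExp }[] = [
  { id: "aws-access-key-id", re: /\bAKIA[0-9A-Z]{16}\b/ },
  { id: "private-key-block", re: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

type Verdict =
  | { action: "accept" }
  | { action: "quarantine"; hits: string[] };

// Fail closed: any hit keeps the payload out of the main store.
function rescan(payload: string): Verdict {
  const hits = HIGH_CONFIDENCE.filter((d) => d.re.test(payload)).map((d) => d.id);
  return hits.length === 0 ? { action: "accept" } : { action: "quarantine", hits };
}

console.log(rescan("token=[REDACTED:secret]").action);  // prints "accept"
console.log(rescan("key=AKIAIOSFODNN7EXAMPLE").action); // prints "quarantine"
```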

Roadmap (not in v0, but the model leaves room for them)

The four levers above cover the current use cases. Heavier future options that the schema is designed to accommodate:

End-to-end encryption — encrypt the payload to consumer-held keys so the store holds only ciphertext.
Aggregate-only / differential privacy — submit derived statistics instead of raw trajectories for sensitive orgs.
Self-hosted sink — point the CLI at your own ingestion URL; data never touches shared infrastructure.
