Architecture

AgentSync has three planes. This repo currently specifies the contracts between them; the client and server implementations come later.

┌──────────────────────────── submitter's machine ────────────────────────────┐
│                                                                              │
│  agent transcripts        AgentSync CLI (npx agentsync)                      │
│  ~/.claude/*.jsonl   ──▶   ┌──────────┐  ┌─────────┐  ┌──────────┐  ┌──────┐ │
│  ~/.codex/*          ──▶   │ adapters │─▶│ redact  │─▶│ preview  │─▶│upload│ │
│  custom agents       ──▶   └──────────┘  └─────────┘  └──────────┘  └──┬───┘ │
│                          detect+parse   privacy profile  confirm        │     │
└─────────────────────────────────────────────────────────────────────────┼────┘
                                                                           │ HTTPS + key
┌──────────────────────────── ingestion plane ─────────────────────────────▼────┐
│  POST /v1/transcripts   →  validate key  →  validate schema  →  store          │
│                            rate-limit        re-scan secrets     object + index │
└────────────────────────────────────────────────────────────────────────────────┘
                                                                           │
┌──────────────────────────── consumption plane ───────────────────────────▼────┐
│  corpora export · hiring dashboards · research queries (out of scope for v0)   │
└────────────────────────────────────────────────────────────────────────────────┘

Trust boundary

The single most important boundary is the submitter's machine. Everything that protects privacy — secret redaction, identifier anonymization, content stripping, and the human preview/confirm step — runs before the first byte leaves that machine.

The server expects incoming payloads to be already redacted but does not trust that they are: it re-runs a conservative secret scan as defense-in-depth and rejects (or quarantines) anything that still trips high-confidence detectors. The server never has access to pre-redaction data.
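
As a rough illustration of the server-side re-scan, the sketch below runs a small set of high-confidence patterns over a payload and returns a verdict. The detector list, the `rescan` function, and the `onHit` policy switch are assumptions made for this example; the actual detectors and the choice between rejecting and quarantining are not specified here.

```typescript
// Hypothetical sketch of a conservative, high-confidence secret re-scan.
// Detector names and the policy switch are illustrative, not part of the spec.
type Verdict = "accept" | "reject" | "quarantine";

interface Detector {
  name: string;
  pattern: RegExp; // no "g" flag, so RegExp.test stays stateless
}

const HIGH_CONFIDENCE: Detector[] = [
  // AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  // PEM private key headers, with or without an algorithm prefix.
  { name: "pem-private-key", pattern: /-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----/ },
];

function rescan(
  payload: string,
  onHit: "reject" | "quarantine" = "reject",
): { verdict: Verdict; hits: string[] } {
  // Collect the names of every detector that still fires on the payload.
  const hits = HIGH_CONFIDENCE.filter((d) => d.pattern.test(payload)).map((d) => d.name);
  return { verdict: hits.length > 0 ? onHit : "accept", hits };
}
```

Keeping the list short and precise matches the "conservative" framing: the server-side scan is a backstop for client-side redaction, so false positives on legitimate transcripts are more costly here than on the client.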

Components

CLI client (`agentsync`, future `packages/cli`)

Distributed via npx agentsync / npm i -g agentsync.
Loads the user's key (agentsync login, AGENTSYNC_KEY, or config file).
Runs the pipeline: detect → parse → redact → preview → upload.
Loads a privacy profile (built-in or from agentsync.config.*).
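
The stage order above can be sketched as a single function with injected stages. Only the order (detect, parse, redact, preview, upload) comes from this document; the `Stages` interface, the `Outcome` values, and the synchronous signatures are assumptions made to keep the example small. A real client would likely use async I/O.

```typescript
// Hypothetical sketch: stage names mirror the pipeline, everything else is illustrative.
interface Stages<Raw, Parsed> {
  detect: () => Raw[];                 // find candidate transcripts
  parse: (raw: Raw) => Parsed;         // normalize one transcript
  redact: (t: Parsed) => Parsed;       // apply the privacy profile
  preview: (ts: Parsed[]) => boolean;  // true means the user confirmed
  upload: (ts: Parsed[]) => void;      // only ever sees redacted data
}

type Outcome = "uploaded" | "cancelled" | "nothing-found";

function runPipeline<Raw, Parsed>(s: Stages<Raw, Parsed>): Outcome {
  const raw = s.detect();
  if (raw.length === 0) return "nothing-found";

  // Redaction happens before the preview, so the user confirms exactly
  // what would be sent.
  const redacted = raw.map((r) => s.redact(s.parse(r)));

  // Nothing is uploaded unless the preview step returns true.
  if (!s.preview(redacted)) return "cancelled";

  s.upload(redacted);
  return "uploaded";
}
```

Passing the stages in as functions keeps the ordering testable without touching the filesystem or the network.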

Shared core (`packages/core`, in this repo)

Pure TypeScript contracts shared by client, server, and consumers.
No I/O, no runtime behavior yet — just the normalized model, the privacy config model, and the adapter interface.
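
To give a feel for what "pure contracts" might look like, here is a minimal sketch of the three pieces as TypeScript types, plus a toy adapter. Every field and type name below is hypothetical; the real shapes live in `packages/core` and the transcript schema, and the toy JSONL format is not the format of any real agent.

```typescript
// Hypothetical shapes for illustration only; not the actual packages/core types.
interface Message {
  role: "user" | "assistant" | "tool";
  text: string;
}

// Normalized model: what every adapter produces.
interface Transcript {
  schemaVersion: string;
  agent: string;
  messages: Message[];
}

// Privacy config model: one switch per protection named under "Trust boundary".
interface PrivacyProfile {
  redactSecrets: boolean;
  anonymizeIdentifiers: boolean;
  stripContent: boolean;
}

// Adapter interface: detect + parse, with no I/O of its own.
interface Adapter {
  agent: string;
  detect(paths: string[]): string[]; // which candidate paths belong to this agent
  parse(raw: string): Transcript;    // raw file contents to normalized model
}

// Toy adapter over a made-up JSONL format with one Message per line.
const toyAdapter: Adapter = {
  agent: "toy",
  detect: (paths) => paths.filter((p) => p.endsWith(".jsonl")),
  parse: (raw) => ({
    schemaVersion: "0.1.0",
    agent: "toy",
    messages: raw
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as Message),
  }),
};
```

Because the adapter receives paths and file contents as arguments instead of reading them itself, it can be tested in isolation with plain strings.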

Ingestion API (future `packages/server`)

Stateless HTTP service. See api.md.
Validates the key, validates against schema/transcript.schema.json, re-scans, stores the object, writes a lightweight index row.
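
The sequence of checks above can be sketched as a function that stops at the first failure. The `Deps` interface, the `IngestResult` type, and all function names are assumptions for this example; the actual request and response shapes are defined in api.md, so the sketch reports which stage failed instead of guessing at status codes.

```typescript
// Hypothetical sketch of the ingestion order; dependencies are injected so the
// handler itself holds no state.
interface Deps {
  isValidKey: (key: string) => boolean;
  matchesSchema: (body: unknown) => boolean;
  hasSecrets: (body: unknown) => boolean;
  storeObject: (body: unknown) => string;   // returns an object id
  writeIndexRow: (objectId: string) => void;
}

type IngestResult =
  | { ok: true; objectId: string }
  | { ok: false; failedAt: "key" | "schema" | "rescan" };

function ingest(key: string, body: unknown, d: Deps): IngestResult {
  // Cheapest checks first; nothing is stored until every check passes.
  if (!d.isValidKey(key)) return { ok: false, failedAt: "key" };
  if (!d.matchesSchema(body)) return { ok: false, failedAt: "schema" };
  if (d.hasSecrets(body)) return { ok: false, failedAt: "rescan" };

  const objectId = d.storeObject(body);
  d.writeIndexRow(objectId);
  return { ok: true, objectId };
}
```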

Why "schema + spec first"

The transcript shape is the contract every other piece depends on. Pinning it (plus the privacy and adapter interfaces) before writing the client/server means:

adapters for new agents can be written and tested in isolation,
the redaction layer has a stable surface to operate on,
the server's validation is just "does it match the schema",
downstream consumers can build against a versioned, documented format.

Versioning

schemaVersion is embedded in every transcript (semver, starts at 0.1.0).
Within a major version, changes to the wire format are additive only; breaking changes bump the major and are negotiated via the Accept/Content-Type version suffix (see api.md).
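
One way a consumer might apply the "breaking changes bump the major" rule is to compare major versions before reading a transcript. The helper names below are hypothetical, and the sketch makes two simplifying assumptions: it accepts only plain `MAJOR.MINOR.PATCH` strings (no pre-release tags), and it does not special-case `0.x` versions. Header-based negotiation is left to api.md.

```typescript
// Hypothetical compatibility check based only on the major version.
interface SemVer {
  major: number;
  minor: number;
  patch: number;
}

function parseSemVer(v: string): SemVer | null {
  // Plain MAJOR.MINOR.PATCH only; anything else is treated as unparseable.
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (m === null) return null;
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

function canRead(readerVersion: string, transcriptVersion: string): boolean {
  const reader = parseSemVer(readerVersion);
  const transcript = parseSemVer(transcriptVersion);
  if (reader === null || transcript === null) return false;
  // Same major: differences are additive, so a reader can ignore
  // fields it does not know about.
  return reader.major === transcript.major;
}
```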
