UNIHODL · Agent Handoff SDK · Technical Specification v1.0
Agent Handoff SDK
The decision-continuity protocol for human-to-agent and agent-to-agent work transfer.
Quickstart · 5 minutes
Install the SDK, mint a scoped key, and pass a UNIHODL resume token into your agent loop. An agent that receives a resume token starts its turn already knowing what the human decided, what they read, and what they intended to do next.
1. Install
npm install @unihodl/agent-sdk
# or
pip install unihodl-agent
2. Mint a scoped resume token (server-side)
curl https://unihodl.app/api/v1/resume_tokens \
-H "Authorization: Bearer $UNIHODL_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"session_id": "ses_8f3aZ91b",
"scopes": ["read:context", "read:reasoning"],
"ttl_seconds": 3600,
"audience": "claude.anthropic.com"
}'
3. Hand it off to an agent
import os
from unihodl_agent import Client
from anthropic import Anthropic
uh = Client(api_key=os.environ["UNIHODL_API_KEY"])
claude = Anthropic()
# Hydrate the session — returns a structured ResumeContext
ctx = uh.sessions.hydrate("ses_8f3aZ91b")
# Pass the serialized context as the first system message
resp = claude.messages.create(
model="claude-sonnet-4-7",
max_tokens=2048,
system=ctx.as_system_prompt(),
messages=[
{"role": "user", "content": "Continue Sarah's research on API v3."}
],
)
print(resp.content[0].text)
What just happened
The agent received the human’s open tabs with scroll positions, the AI-tagged decision thread, partial conclusions, and intended next step — not just a list of links. It can pick up the work mid-thought.
1. Overview & Design Principles
Most agentic AI failures are not reasoning failures — they are cold-start failures. The agent begins each session with no idea what the human already concluded, what tabs they had open, or which option they were leaning toward. UNIHODL’s Agent Handoff SDK closes that gap by exposing the human’s working session as a first-class, machine-readable artifact.
The four primitives
| Primitive | Definition |
|---|---|
| Session | An open work context — tabs, scroll positions, video timestamps, partial notes — captured by UNIHODL on the device. |
| Resume Token | A signed, scoped, time-bound credential that grants an agent read (and optionally write) access to a session. |
| Resume Context | The hydrated payload an agent receives — structured, redacted, serializable into any model's prompt format. |
| Reasoning Thread | The why-layer: partial conclusions, decision stance, blockers, next-step intent. |
Design principles
- Audience-bound by default. Every resume token is bound to a single audience (a model vendor, an MCP server URL, or a known agent identity). Tokens cannot be replayed against other audiences.
- Redaction at the boundary. Sensitive content is redacted server-side before serialization based on per-workspace policies — the agent never sees content it isn't entitled to.
- Reasoning is structured, not narrative. Decision threads are serialized as typed graph nodes, not free text, so agents can ground tool calls in specific intent vectors.
- Format-pluggable. The same Resume Context can serialize to a Claude system prompt, a Gemini system instruction, an OpenAI Agents SDK input, or a raw MCP resource.
- Auditable. Every hydration creates an immutable access record: who, when, what scopes, what was redacted, what was returned.
2. Resume Token Data Schema
A resume token is a JWS-signed JWT (RFC 7519) issued by api.unihodl.app and consumed by either UNIHODL’s hydration endpoint or, when scopes permit, directly by an agent that validates against UNIHODL’s JWKS.
Token claims
| Claim | Type | Description |
|---|---|---|
| iss | string | Always https://api.unihodl.app — issuer. |
| sub | string | Workspace-scoped subject, e.g. wks_3kQ:usr_8aZ. |
| aud | string | Bound audience: model vendor or MCP server URL. |
| jti | string | Token ID. Used to revoke. |
| iat / nbf / exp | int | Standard JWT timing claims. exp ≤ iat + 86400. |
| scope | string[] | Array of capability strings (see §5). |
| session_id | string | Bound session, e.g. ses_8f3aZ91b. |
| redaction_policy | string | Named policy applied during hydration. |
| max_hydrations | int | Replay cap. 1 = single-use. |
| delegator | object? | If issued via handoff, the human who consented. |
| nonce | string? | Required for write-scoped tokens. |
Example token (decoded)
{
"iss": "https://api.unihodl.app",
"sub": "wks_3kQ:usr_8aZ",
"aud": "claude.anthropic.com",
"jti": "rt_01HW7KQ8X2N3C9V0F4R6PD",
"iat": 1746480000,
"nbf": 1746480000,
"exp": 1746483600,
"scope": ["read:context", "read:reasoning", "read:redacted"],
"session_id": "ses_8f3aZ91b",
"redaction_policy": "default-strict",
"max_hydrations": 5,
"nonce": null
}
Resume Context payload (returned by /hydrate)
{
"session_id": "ses_8f3aZ91b",
"version": "1.0",
"captured_at": "2026-05-05T22:14:08Z",
"title": "GraphQL migration for API v3",
"summary": "Researching whether to migrate API v3 from REST to GraphQL.",
"tabs": [
{
"url": "https://blog.apollographql.com/rest-vs-graphql",
"title": "REST vs GraphQL — performance",
"scroll_y_pct": 0.62,
"selected_text": "GraphQL batching outperforms ...",
"tab_role": "primary_source"
}
],
"media": [
{ "kind": "video", "url": "youtu.be/abc", "timestamp_s": 1843,
"transcript_anchor": "...the unified resolver pattern ..." }
],
"reasoning_thread": [
{ "kind": "observation", "text": "REST batching wins on read-heavy endpoints." },
{ "kind": "decision_stance", "text": "Leaning hybrid; not full GraphQL." },
{ "kind": "blocker", "text": "Need finance approval for vendor B." },
{ "kind": "next_step", "text": "Draft hybrid RFC; meet with platform team." }
],
"ai_tags": ["api-v3", "graphql", "hybrid-architecture"],
"redactions": [{ "field": "tabs[2].selected_text", "reason": "PII" }]
}
3. Serialization, Compression & Redaction
Resume Contexts are designed to be small enough to fit comfortably in a 200K-token window alongside the agent’s task, structured enough to reason over, and safe enough to send across trust boundaries. The pipeline is deterministic: capture → normalize → redact → compress → serialize.
Wire formats
| Format | MIME type | When to use |
|---|---|---|
| Canonical JSON | application/vnd.unihodl.context+json | Default. Best for inspection, debugging, and most agent inputs. |
| Compact CBOR | application/vnd.unihodl.context+cbor | Mobile clients, low-bandwidth handoffs, edge MCP servers. |
| Prompt-ready text | text/x-unihodl-prompt | Pre-rendered for direct concatenation into a system prompt. |
| MCP resource | application/vnd.mcp.resource+json | When the consumer is an MCP-capable agent. |
Compression
Payloads larger than 32 KB are compressed with zstd level 9 and content-encoded (Content-Encoding: zstd). Clients that send Accept-Encoding: zstd always receive compressed payloads. For prompt-ready text the SDK transparently decompresses before serialization.
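The two rules above (size threshold, client opt-in) can be summarized as a single decision; this is a sketch of the stated policy, not UNIHODL's server code:

```python
THRESHOLD_BYTES = 32 * 1024  # payloads above 32 KB are always zstd-compressed

def should_compress(payload_size: int, accept_encoding: str) -> bool:
    """Compress when the client advertises zstd support, or when the
    payload exceeds the 32 KB threshold regardless of negotiation."""
    encodings = [e.strip() for e in accept_encoding.split(",")]
    if "zstd" in encodings:
        return True
    return payload_size > THRESHOLD_BYTES
```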
Redaction model
Redaction runs on the server during hydration. Policies are named, versioned, and bound to tokens via the redaction_policy claim. A policy is a list of typed rules applied in order. Field-level redactions are reflected in resume_context.redactions[] so the agent knows what was removed and why.
# default-strict policy (excerpt)
- match: { kind: "selected_text", regex: "\\b\\d{3}-\\d{2}-\\d{4}\\b" }
action: { redact: true, reason: "PII:SSN" }
- match: { kind: "tab_url", host_in: ["banking.*", "payroll.*"] }
action: { drop: true, reason: "domain:financial" }
- match: { kind: "reasoning_thread", contains_class: "PROTECTED_HEALTH" }
action: { redact: true, reason: "PHI" }
- match: { kind: "ai_tags", value_in: ["confidential"] }
action: { drop_all_with_tag: true }
Redaction is auditable
Each /hydrate call produces an audit_record with a hash of the unredacted payload, the policy version, and the list of fields removed. The unredacted payload itself is never logged.
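To make the rule semantics concrete, here is a rough sketch of applying the SSN rule from the default-strict excerpt, recording each hit in redactions[]. Field names follow the Resume Context payload in §2; the helper itself is illustrative, not the server's redaction engine:

```python
import re

# Matches the SSN shape from the default-strict policy excerpt.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def apply_ssn_rule(context: dict) -> dict:
    """Redact SSN-shaped substrings in tab selections and record each redaction."""
    redactions = context.setdefault("redactions", [])
    for i, tab in enumerate(context.get("tabs", [])):
        text = tab.get("selected_text")
        if text and SSN_RE.search(text):
            tab["selected_text"] = "[REDACTED]"
            redactions.append({"field": f"tabs[{i}].selected_text", "reason": "PII:SSN"})
    return context
```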
Determinism & versioning
The serializer guarantees byte-stable output for any given (session_version, policy_version, format). The SDK exposes unihodl.context.fingerprint(ctx) — a SHA-256 over the canonical JSON form — so callers can cache hydrations and detect drift.
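Assuming "canonical JSON" means sorted keys and compact separators (the spec does not pin the canonicalization down), the fingerprint can be sketched as:

```python
import hashlib
import json

def fingerprint(ctx: dict) -> str:
    """SHA-256 over a canonical JSON form: sorted keys, no insignificant whitespace."""
    canonical = json.dumps(ctx, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because key order is normalized, two hydrations of the same (session_version, policy_version, format) hash identically, which is what makes fingerprint-based caching safe.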
4. REST + MCP API Design
All HTTP endpoints are versioned at /v1/* and served from https://unihodl.app/api in v0 (with future graduation to https://api.unihodl.app). Every endpoint accepts and returns JSON unless otherwise noted. The MCP server speaks the Model Context Protocol natively over stdio in v0 and will be hosted at mcp.unihodl.app in v1.
Endpoint reference
| Method | Path | Purpose | Auth |
|---|---|---|---|
| POST | /v1/resume_tokens | Mint a scoped resume token bound to a session and audience. | API key |
| GET | /v1/sessions/{id} | Inspect session metadata. Does not return content. | API key |
| POST | /v1/sessions/{id}/hydrate | Hydrate a session into a Resume Context. Requires resume token. | Resume token |
| POST | /v1/sessions/{id}/handoff | Transfer a session to another principal with consent. | API key + step-up |
| POST | /v1/sessions/{id}/notes | Append an agent-authored note (write scope required). | Resume token (write) |
| POST | /v1/resume_tokens/{jti}/revoke | Revoke a token before exp. Idempotent. | API key |
| GET | /.well-known/jwks.json | Public keys for token validation. | Public |
| MCP | mcp.unihodl.app | Tools: hold, resume, hand_off; Resources: unihodl://session/{id}. | Resume token |
MCP surface
UNIHODL exposes itself as a first-class MCP server. Agents that already speak MCP (Claude, Cursor, Cline, Continue, custom MCP clients) can mount UNIHODL with no other code:
{
"mcpServers": {
"unihodl": {
"command": "npx",
"args": ["-y", "@unihodl/mcp-server"],
"env": { "UNIHODL_API_KEY": "uh_live_..." }
}
}
}
5. Authentication & Permissioning
UNIHODL uses a two-tier credential model. API keys identify a workspace and are used by trusted backend code to mint resume tokens, which are short-lived, scoped, audience-bound credentials sent to agents.
Credential types
| Type | Format | Lifetime | Use |
|---|---|---|---|
| API key (live) | uh_live_… | Until rotated | Server-side, mints tokens, never to agents. |
| API key (test) | uh_test_… | Until rotated | Sandbox environment, no real sessions. |
| Resume token | eyJhbGc… | ≤ 24h, 1h default | Sent to agents/MCP clients. Audience-bound. |
| OAuth (delegated) | Bearer (RFC 6749) | Configurable | Third-party apps acting on behalf of users. |
Scope catalog
| Scope | Grants |
|---|---|
| read:context | Tabs, scroll positions, media timestamps. |
| read:reasoning | The reasoning_thread (intent, blockers, next step). |
| read:redacted | Surfaces redacted-but-acknowledged metadata (field paths only). |
| read:transcript | Full media transcripts when present. |
| write:notes | Append agent-authored notes to a session. |
| write:next_step | Mutate the next_step intent on the reasoning thread. |
| session:hand_off | Initiate a handoff to another principal (requires step-up). |
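A scope check on the consuming side is straightforward. This sketch assumes the scope claim arrives as a list of strings, as in the example token in §2; the exception class mirrors the scope_insufficient error code but is not SDK API:

```python
class ScopeInsufficient(Exception):
    """Mirrors the scope_insufficient error code (HTTP 403)."""

def require_scopes(token_scopes: list[str], required: set[str]) -> None:
    """Raise if any required scope is missing from the token."""
    missing = required - set(token_scopes)
    if missing:
        raise ScopeInsufficient(f"missing scopes: {sorted(missing)}")
```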
Step-up authentication
Mutating endpoints (handoff, write:next_step) require the human to confirm in-app within the last 5 minutes. The SDK exposes session.requireStepUp() to trigger the prompt.
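Client code can wrap mutating calls so that a step_up_required rejection triggers the confirmation prompt and a single retry. The exception class and callback below are illustrative (the SDK's actual surface is session.requireStepUp(), per above):

```python
from typing import Callable, TypeVar

T = TypeVar("T")

class StepUpRequired(Exception):
    """Raised when the API returns the step_up_required error code."""

def with_step_up(call: Callable[[], T], prompt_human: Callable[[], None]) -> T:
    """Run a mutating call; on step_up_required, request in-app confirmation and retry once."""
    try:
        return call()
    except StepUpRequired:
        prompt_human()  # e.g. trigger the SDK's step-up prompt
        return call()
```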
6. Human Reasoning State Model
Most context-passing schemes lose the why. Tab lists tell an agent what the human looked at, not what they were thinking. UNIHODL’s reasoning thread is a typed graph of nodes captured during HOLD by the on-device tagger, optionally enriched server-side, and serialized for agent consumption.
Node taxonomy
| kind | Meaning |
|---|---|
| observation | A factual datum extracted from a source (tab, video, transcript). |
| partial_conclusion | A claim the human is forming but has not committed to. |
| decision_stance | Where the human currently leans on a pending decision. |
| blocker | An unresolved dependency stopping forward motion. |
| question | An open question the human is investigating. |
| next_step | The intended next action — agents should ground tool calls here. |
| reference | A pointer back into tabs[] or media[] for evidence. |
Wire format
{
"reasoning_thread": [
{
"id": "rn_01",
"kind": "observation",
"text": "REST batching wins on read-heavy endpoints.",
"evidence": ["tabs[0]", "tabs[2]"],
"confidence": 0.78
},
{
"id": "rn_02",
"kind": "partial_conclusion",
"text": "Pure GraphQL is overkill for our workload.",
"supports": ["rn_01"],
"confidence": 0.66
},
{
"id": "rn_03",
"kind": "decision_stance",
"text": "Leaning hybrid: GraphQL for write paths, REST for hot reads.",
"supports": ["rn_02"]
},
{
"id": "rn_04",
"kind": "blocker",
"text": "Need finance approval for vendor B's cap on Tier-2 throughput."
},
{
"id": "rn_05",
"kind": "next_step",
"text": "Draft hybrid RFC; meet platform team Thursday.",
"depends_on": ["rn_03"]
}
]
}
Prompt-ready serialization
When X-Unihodl-Format: prompt-ready is requested, the reasoning thread is rendered as a structured natural-language block that LLMs reliably parse:
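One plausible rendering pass over the reasoning-thread wire format, grouping nodes by kind into that layout (a sketch; the production renderer presumably handles more node kinds, evidence pointers, and tabs/media context):

```python
def render_prompt(session_id: str, date: str, title: str, thread: list[dict]) -> str:
    """Group reasoning nodes by kind into the prompt-ready block layout."""
    lines = [f"## CONTEXT (UNIHODL session {session_id} \u00b7 {date})",
             f'The user is researching: "{title}."']
    sections = [
        (("observation", "partial_conclusion"), "What they have concluded:", "\u2022 "),
        (("decision_stance",), "Where they currently lean:", "\u2192 "),
        (("blocker",), "Open blockers:", "! "),
        (("next_step",), "Intended next step:", "\u2192 "),
    ]
    for kinds, header, bullet in sections:
        nodes = [n for n in thread if n["kind"] in kinds]
        if not nodes:
            continue  # omit empty sections entirely
        lines.append(header)
        for n in nodes:
            conf = f' (confidence {n["confidence"]})' if "confidence" in n else ""
            lines.append(f'{bullet}{n["text"]}{conf}')
    lines.append("Continue from here.")
    return "\n".join(lines)
```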
## CONTEXT (UNIHODL session ses_8f3aZ91b · 2026-05-05)
The user is researching: "GraphQL migration for API v3."
What they have concluded:
• REST batching wins on read-heavy endpoints. (confidence 0.78)
• Pure GraphQL is overkill for our workload. (confidence 0.66)
Where they currently lean:
→ Hybrid: GraphQL for write paths, REST for hot reads.
Open blockers:
! Finance approval for vendor B's Tier-2 cap.
Intended next step:
→ Draft hybrid RFC; meet platform team Thursday.
Continue from here.
7. Reference: Anthropic Claude
Claude integrates with UNIHODL in two flavors: (a) system prompt injection — the simplest path, hydrate and pass the prompt-ready text as the system prompt — or (b) MCP server mounting, where Claude’s tool loop calls UNIHODL on demand.
Mode A — System prompt injection
import os
from unihodl_agent import Client
from anthropic import Anthropic
uh = Client(api_key=os.environ["UNIHODL_API_KEY"])
claude = Anthropic()
# Mint an audience-bound, single-use token for this turn.
token = uh.resume_tokens.create(
session_id="ses_8f3aZ91b",
audience="claude.anthropic.com",
scopes=["read:context", "read:reasoning"],
ttl_seconds=600,
max_hydrations=1,
)
# Hydrate; ask for prompt-ready format.
ctx = uh.sessions.hydrate(
session_id="ses_8f3aZ91b",
token=token,
format="prompt-ready",
)
resp = claude.messages.create(
model="claude-sonnet-4-7",
max_tokens=4096,
system=ctx.text,
messages=[
{"role": "user", "content": "Continue Sarah's research and draft the hybrid RFC."}
],
)
print(resp.content[0].text)
Mode B — MCP server mount
Claude Desktop and Claude API both speak MCP. Mount UNIHODL once, then Claude can hold, resume, and hand off sessions on its own.
8. Reference: Google Gemini
Gemini integrates via function calling on the google-genai SDK. Declare UNIHODL’s resume tool, register it with the model, and Gemini will call it when its reasoning needs the human’s prior context.
9. Reference: OpenAI Agents SDK & LangGraph
The OpenAI Agents SDK supports first-class tools. Wrap UNIHODL hydration as a tool and pass it to your agent — the agent decides when to fetch context. In LangGraph, hydrate UNIHODL once at graph entry and inject the Resume Context into the shared state.
Cross-framework guarantee
The Resume Context schema is identical across all integrations. An agent built on Claude can hand a session off to a Gemini-based agent, and the receiving agent sees the same fields, the same reasoning thread, the same evidence pointers.
10. Errors, Rate Limits & Versioning
Error envelope
HTTP/1.1 403 Forbidden
Content-Type: application/json
{
"error": {
"code": "audience_mismatch",
"message": "Token audience claude.anthropic.com does not match request origin.",
"audit_id": "aud_01HW7KQ9...",
"hint": "Mint a new resume_token with the correct aud.",
"doc_url": "https://unihodl.app/sdk/spec#errors"
}
}
Error codes
| code | HTTP | Description |
|---|---|---|
| invalid_token | 401 | Signature, exp, or nbf failed verification. |
| audience_mismatch | 403 | Token aud does not match calling origin. |
| scope_insufficient | 403 | Required scope is not present on the token. |
| session_not_found | 404 | session_id is unknown or has been deleted. |
| hydration_exhausted | 429 | max_hydrations cap reached for this token. |
| rate_limited | 429 | Per-workspace or per-token rate limit exceeded. |
| redaction_failed | 422 | Redaction policy could not be applied; see hint. |
| step_up_required | 401 | Mutating call requires recent in-app human confirmation. |
| payload_too_large | 413 | Estimated context exceeds requested max_tokens. |
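Consumers can branch on code to choose a recovery strategy. The grouping below follows the table; the specific actions (remint, retry, step-up) are illustrative client policy, not part of the API:

```python
# Codes where minting a fresh, correctly-scoped token is the sensible recovery.
REMINT = {"invalid_token", "audience_mismatch", "scope_insufficient", "hydration_exhausted"}
RETRYABLE = {"rate_limited"}

def recovery_action(error_envelope: dict) -> str:
    """Map an error envelope to a coarse client action: remint, retry, step_up, or fail."""
    code = error_envelope["error"]["code"]
    if code in REMINT:
        return "remint"
    if code in RETRYABLE:
        return "retry"
    if code == "step_up_required":
        return "step_up"
    return "fail"
```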
Rate limits
| Endpoint group | Limit | Notes |
|---|---|---|
| Token mint | 300 / minute / workspace | Bursts to 600; enforced via leaky bucket. |
| Hydration | 60 / minute / token | Independent of token's max_hydrations cap. |
| Hydration | 10,000 / day / workspace | Enterprise plans negotiate higher limits. |
| Webhook delivery | 20 / second / endpoint | Beyond this, UNIHODL backs off exponentially. |
Versioning
API path is the major version (/v1). Schema evolutions ship as additive minor versions surfaced in the version field of every payload. SDKs follow semver. Breaking changes require a 12-month deprecation window and a new path prefix.
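Because minor versions are additive, a client pinned to a major version can accept any payload sharing that major; a minimal compatibility check:

```python
def compatible(payload_version: str, client_major: int) -> bool:
    """Additive minors: accept any payload whose major version matches the client's."""
    major = int(payload_version.split(".")[0])
    return major == client_major
```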
Continue building. Sample apps and OpenAPI spec land at unihodl.app/sdk. Spec questions: developers@unihodl.app.
Open gaps tracked at /sdk/roadmap.