Agent API

Stable, dual-auth API + CLI for terminals, scripts, and agents. Base URL: https://api.getpromethic.com/api/v2/public

Quickstart

Issue a key in the web app at app.getpromethic.com → Settings → Developer Keys, then:

curl -H "X-API-Key: pmk_..." \
  https://api.getpromethic.com/api/v2/public/prompts

Or via the CLI:

npm install -g @soulwarestudio/promethic-cli
promethic auth login                              # paste pmk_... key
promethic prompts list
promethic run <prompt-id> --input "summarize this article"

For write-scope keys, an agent can author a whole prompt declaratively via a YAML manifest. Note the nested parameters object — the server requires { model_id, parameters } shape so per-model parameter values stay distinct from envelope fields.

Step one: discover a real model_id via GET /api/v2/public/models. Catalog IDs are opaque (e.g., gpt54nano_2c6f9b4d); the wire name isn't the marketing name. Pick one from the response:

curl -H "X-API-Key: pmk_..." \
  https://api.getpromethic.com/api/v2/public/models \
  | jq '.recommended_defaults'
# → {"model_id": "gpt55_8e2b1d4f", "reasoning_effort": "medium", ...}

Step two: paste it directly into your manifest — the catalog response is round-trippable. Copy model_id into modelSettings.model_id; pick parameter values from each param's values / min / max / provider_default.

# prompt.yaml
name: Article summarizer
promptText: |
  Summarize the input in three bullets.
modelSettings:
  model_id: gpt54nano_2c6f9b4d   # ← a model_id taken from the /models response
  parameters:
    reasoning_effort: low
attachments:
  - type: text
    file: ./examples.txt

promethic prompts create --manifest prompt.yaml

Authentication

All requests carry an API key in the X-API-Key header. Keys start with pmk_ and are scoped to one user. Three scopes are available in V1.0.1:

Scope     Grants
read      list/get prompts, versions, records, attachments, record images, models catalog
execute   run, revise (runId or recordId), finalize, abandon
write     create + update + delete prompts, versions, attachments, records (V1.0.1 + V1.1 Phases 2 & 3)

Scopes are checked literally; an execute-only key cannot list prompts. To do all three, request all three: scopes: ["read", "execute", "write"]. As of V1.1, DELETE is part of write; per-prompt grants limit the blast radius of a leaked key.
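
The literal scope check can be pictured with a minimal sketch. The operation labels in REQUIRED_SCOPE are hypothetical names chosen for illustration; only the three scope names come from the table above, and the real server-side mapping is not shown in these docs.

```python
# Hypothetical operation labels; only the scope names come from the docs.
REQUIRED_SCOPE = {
    "list_prompts": "read",
    "run_prompt": "execute",
    "create_prompt": "write",
}


def is_permitted(key_scopes, operation):
    """True only if the key carries the exact scope the operation needs."""
    return REQUIRED_SCOPE[operation] in set(key_scopes)


# An execute-only key can run a prompt but cannot list prompts.
print(is_permitted(["execute"], "run_prompt"))    # True
print(is_permitted(["execute"], "list_prompts"))  # False
```

No scope implies another: a key that should list, run, and author prompts needs all three.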

Per-prompt grants V1.1

On top of scopes, each key can optionally be restricted to a specific set of prompts. Restrictions are managed in the web app at app.getpromethic.com → Settings → Developer Keys → Manage; agents do not configure their own restrictions.

  • Unrestricted (default) — a freshly-minted key has zero per-prompt grants and can access ALL of your prompts, gated only by its scopes.
  • Restricted — once you add ANY prompt to a key's grant list, the key is restricted: only listed prompts are accessible. Calls to other prompts return 403 grant_required.
  • Removing the last prompt from a restricted key returns it to unrestricted (the web UI confirms this transition).

Enforcement is per-call: every endpoint that resolves to a single prompt (run, revise on runId, revise on recordId, finalize, abandon, prompt and version reads, prompt PATCH/version create/attachment upload, record GET/image/DELETE/PATCH) re-checks the grant list at the time of the call. If you revoke a grant mid-workflow, the next call to the affected prompt 403s.
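
As a rough mental model of the grant rules, the sketch below treats an empty grant list as unrestricted and re-evaluates the list on every call. The function name and the in-memory set are assumptions for illustration, not the server's implementation.

```python
from typing import Iterable, Optional


def check_prompt_access(grants: Iterable[str], prompt_id: str) -> Optional[str]:
    """Return None if the call may proceed, else the reason code."""
    granted = set(grants)
    if not granted:           # zero grants: unrestricted key
        return None
    if prompt_id in granted:  # restricted key, prompt is listed
        return None
    return "grant_required"   # surfaced as HTTP 403


# Revoking a grant mid-workflow changes the outcome of the next call.
grants = {"prompt_a", "prompt_b"}
print(check_prompt_access(grants, "prompt_a"))  # None
grants.discard("prompt_a")
print(check_prompt_access(grants, "prompt_a"))  # grant_required
```

Scopes are still checked separately; this sketch only covers the per-prompt layer.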

Mixed credentials: a request carrying both a session token and an API key MUST resolve to the same user, else the server returns 400 mixed_credentials_principal_mismatch.

Endpoints

Read (read scope)

GET  /prompts?limit=&cursor=                           list (slim DTO)
GET  /prompts/{id}                                     + current version
GET  /prompts/{id}/versions                            paginated history
GET  /prompts/{id}/versions/{vid}                      single version
GET  /prompts/{id}/attachments                         prompt attachments
GET  /attachments/{id}                                 attachment download
GET  /records?promptId=&versionId=&source=&createdBy=  list (cursor-paginated)
GET  /records/{id}                                     slim record DTO
GET  /records/{id}/image?index=N                       image PNG (binary)
GET  /models                                           catalog: model_id, supportedOutputModalities, costs (V1.0.1)

Execute (execute scope)

POST    /prompts/{id}/run           creates RunSession; SSE
POST    /runs/{runId}/revise        append turn; SSE
POST    /runs/{runId}/finalize      session→record (V1.1: ?persist= removed; always commits)
POST    /runs/{runId}/abandon       idempotent
GET     /runs/{runId}/images/{N}    session image (active sessions)
POST    /records/{id}/revise V1.2   rehydrate fresh run from record + revise (replaces V1.1 /revise-again); per-record advisory lock; SSE
DELETE  /records/{id} V1.1          self-delete (ApiKey-owned, 24h window)
PATCH   /records/{id} V1.1          amend notes/tag (RFC 7396; no time window)

Run lifecycle body shapes V1.1

The execute surface mirrors the desktop Avalonia app's Convert / Revise / Copy / Copy & Tag flow. Body shapes for each call:

POST /runs/{runId}/revise
{
  "instruction": "make it more concise",        // required
  "intermediateOutput": "edited prior output"   // optional
}

intermediateOutput is the user's edited prior-turn output. When passed, the model sees it (instead of the prior turn's actual output) as context for this revision, and it lands in the resulting ConversionDelta.IntermediateOutput. This mirrors the Avalonia "user edited the textbox before hitting Revise" flow. Omit it for a vanilla revise-from-prior-output. Image-modality runs reject this field (400 intermediate_output_not_supported_image); image revisions always source from the model's actual prior output. The cap is 32 KB.

POST /runs/{runId}/finalize
{
  "finalText": "edited final output",   // optional; creates edit delta if differs from model output
  "tag": "exemplar",                    // optional; attaches to edit delta (requires finalText)
  "notes": "context for this run"       // optional; record-level
}

A three-axis surface (finalText, tag, notes) that mirrors Avalonia's Copy / Copy & Tag semantics exactly:

  • Plain finalize (no body): equivalent to Avalonia Copy with no edits. Saves the record from the model's last output. No edit delta.
  • finalText only: Avalonia Copy after editing the output box. If finalText differs from the model's last output, server creates an edit delta with empty tag. If it matches, no edit delta (treated as a clean Copy).
  • finalText + tag: Avalonia Copy & Tag. Tag attaches to the edit delta — the Refine wizard signal. Server returns 400 tag_without_delta if finalText is omitted or matches the model output (no edit delta to tag). Avalonia disables the Copy & Tag button in the same situation.
  • notes: independent of the above. Record-level free text. Available without finalText on plain finalize, or alongside finalText/tag.

Image-modality runs reject finalText (400 final_text_not_supported_image) — record.finalCopiedOutput for image records is server-derived from the per-turn effectivePromptForImage accumulation chain (training-data invariant; CLAUDE.md "Image Records: FinalCopiedOutput as Accumulated Prompt").

Caps: finalText 256 KB (413 final_text_too_large); notes 64 KB (413 notes_too_large); tag 256 chars (413 tag_too_large).
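
The finalize rules above can be pre-checked on the client before a request is sent. The following is a sketch under stated assumptions: KB is taken as 1024 bytes, text sizes are measured as UTF-8 bytes, and the order in which checks fire is a guess; the server may differ on all three. validate_finalize is a hypothetical helper, not part of the CLI.

```python
from typing import Optional, Tuple

KB = 1024  # assumption: caps are binary kilobytes, measured on UTF-8 bytes


def validate_finalize(body: dict, model_output: str,
                      is_image: bool = False) -> Optional[Tuple[int, str]]:
    """Return (status, reason_code) for the first violated rule, else None."""
    final_text = body.get("finalText")
    tag = body.get("tag")
    notes = body.get("notes")

    if final_text is not None and is_image:
        return (400, "final_text_not_supported_image")
    if final_text is not None and len(final_text.encode("utf-8")) > 256 * KB:
        return (413, "final_text_too_large")
    if notes is not None and len(notes.encode("utf-8")) > 64 * KB:
        return (413, "notes_too_large")
    if tag is not None and len(tag) > 256:
        return (413, "tag_too_large")

    # An edit delta exists only when finalText differs from the model output.
    creates_edit_delta = final_text is not None and final_text != model_output
    if tag is not None and not creates_edit_delta:
        return (400, "tag_without_delta")
    return None


print(validate_finalize({}, "X"))                                   # None
print(validate_finalize({"tag": "exemplar"}, "X"))                  # (400, 'tag_without_delta')
print(validate_finalize({"finalText": "Y", "tag": "exemplar"}, "X"))  # None
```

A local check like this only saves a round trip; the server response remains the source of truth.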

Record DTO turns[] field V1.2

Records returned by GET /records, GET /records/{id}, and POST /finalize include a turns[] array that is a synthesized linear history of the record's states (run + revisions + optional edit). Each entry has a stable index matching the fromTurn parameter accepted by /revise and /finalize.

{
  "id": "...",
  "promptId": "...",
  "inputText": "Summarize this article: ...",
  "finalCopiedOutput": "Y_edited",
  "turns": [
    { "index": 0, "kind": "run",      "input": "Summarize this article: ...", "output": "X" },
    { "index": 1, "kind": "revision", "instruction": "make formal",
      "intermediateOutput": "X", "output": "Y", "modelId": "...", "costMicroCents": 1234 },
    { "index": 2, "kind": "edit",     "intermediateOutput": "Y",
      "output": "Y_edited", "tag": "user-edit" }
  ],
  ...
}

Three kinds:

  • kind: "run" — always index 0. Carries input (the prompt's user input). NOTE: output for the run turn is "what the next turn saw as its prior context, OR the record's finalCopiedOutput if no later turn." If a user edited the textbox client-side before pressing Revise, the edit was committed forward as the next revision's intermediateOutput; the model's literal output text is not preserved.
  • kind: "revision" — each /revise call appends one. instruction is the user's revise instruction. intermediateOutput is what the model saw as prior context.
  • kind: "edit" — at most one per record; always the last entry. Created when /finalize was called with finalText differing from the model's last output. intermediateOutput is the model's actual last output before the edit; output is the user's edited text (= record's finalCopiedOutput).

Use turns[].index to pass fromTurn on /revise or /finalize for rewind-and-redo (V1.2).
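
To make the turns[] invariants concrete, here is a small sketch that loads the sample record from this section (trimmed to the fields it needs) and checks the rules stated for each kind. The helper names are hypothetical; whether a given index is an acceptable fromTurn value is decided by the server.

```python
import json

record = json.loads("""
{
  "finalCopiedOutput": "Y_edited",
  "turns": [
    {"index": 0, "kind": "run", "input": "Summarize this article: ...", "output": "X"},
    {"index": 1, "kind": "revision", "instruction": "make formal",
     "intermediateOutput": "X", "output": "Y", "modelId": "...", "costMicroCents": 1234},
    {"index": 2, "kind": "edit", "intermediateOutput": "Y",
     "output": "Y_edited", "tag": "user-edit"}
  ]
}
""")


def last_model_turn(turns):
    """Index of the last turn produced by the model (run or revision)."""
    return max(t["index"] for t in turns if t["kind"] in ("run", "revision"))


def check_invariants(rec):
    """Assert the documented shape: run first, at most one edit, edit last."""
    turns = rec["turns"]
    assert turns[0]["kind"] == "run" and turns[0]["index"] == 0
    edits = [t for t in turns if t["kind"] == "edit"]
    assert len(edits) <= 1
    if edits:
        assert turns[-1]["kind"] == "edit"
        assert edits[0]["output"] == rec["finalCopiedOutput"]
    return True


print(check_invariants(record))          # True
print(last_model_turn(record["turns"]))  # 1
```

An agent that wants to redo the revision step could pass the index of the turn it wants to rewind to as fromTurn.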

Record self-management (V1.1)

DELETE /api/v2/public/records/{id} hard-deletes a record if and only if (a) the caller is an API key, (b) the record was created by the same API key (credentialPrincipalType=ApiKey + matching id), and (c) the record is less than 24h old (anchored to record.createdAt). Returns 204 No Content on success, or 403 record_not_owned_by_api_key / 409 record_self_delete_window_expired / 404 record_not_found otherwise. Delete is hard (cascade to deltas; FK ON DELETE SET NULL on RunSession.FinalizedRecordId auto-clears any finalize-replay rendezvous so the next /finalize replay returns 410 record_was_deleted). Image bytes are best-effort cleaned from blob storage. Retries after a successful DELETE return 404 — the row is gone, so HTTP-level idempotency is by-construction (the second call always 404s).
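
The three self-delete conditions can be expressed as a small decision function. This is a sketch only: the credentialPrincipalId field name is an assumption (the docs say only "matching id"), the record is modelled as a plain dict, and the order of the checks is a guess.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

SELF_DELETE_WINDOW = timedelta(hours=24)


def self_delete_outcome(record: Optional[dict], caller_key_id: str,
                        now: datetime) -> Tuple[int, Optional[str]]:
    """Return (http_status, reason_code) for a DELETE by the given API key."""
    if record is None:
        return (404, "record_not_found")
    owned = (record.get("credentialPrincipalType") == "ApiKey"
             and record.get("credentialPrincipalId") == caller_key_id)  # hypothetical field
    if not owned:
        return (403, "record_not_owned_by_api_key")
    if now - record["createdAt"] >= SELF_DELETE_WINDOW:  # anchored to createdAt
        return (409, "record_self_delete_window_expired")
    return (204, None)


rec = {
    "credentialPrincipalType": "ApiKey",
    "credentialPrincipalId": "key_1",
    "createdAt": datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc),
}
print(self_delete_outcome(rec, "key_1", datetime(2026, 1, 2, 0, 0, tzinfo=timezone.utc)))
# (204, None)
```

After a successful delete the record no longer exists, so the same call falls into the first branch and yields 404.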

PATCH /api/v2/public/records/{id} updates notes and/or the record's edit-delta tag. Same ApiKey-owned check as DELETE, but no time window — amends are non-destructive. Body is RFC 7396 JSON Merge Patch (Content-Type: application/merge-patch+json): missing key = unchanged, explicit null = clear, value = set. Setting tag on a record with no edit delta returns 409 record_no_edit_delta — use notes for record-level labels instead. The response shape is { id, notes, tag, lastPatchedAtUtc }. HIPAA §164.312(b) audit row is written with field presence/length/SHA-256 prefix metadata only — never the raw notes/tag content.
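
For readers unfamiliar with JSON Merge Patch, the RFC 7396 algorithm is short enough to sketch in full. This is a generic implementation of the RFC, not Promethic's server code; in particular the server may store a cleared field as null rather than dropping the key.

```python
def merge_patch(target, patch):
    """Apply an RFC 7396 JSON Merge Patch and return the merged value."""
    if not isinstance(patch, dict):
        return patch                      # non-object patch replaces the target
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # explicit null = clear
        else:
            result[key] = merge_patch(result.get(key), value)
    return result                         # missing keys stay unchanged


current = {"notes": "first draft", "tag": "exemplar"}
print(merge_patch(current, {"notes": "reviewed"}))  # {'notes': 'reviewed', 'tag': 'exemplar'}
print(merge_patch(current, {"tag": None}))          # {'notes': 'first draft'}
```

The practical consequence for agents: send only the keys you intend to change, and send null only when you mean to clear a field.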

Write (write scope) V1.0.1

All mutating POSTs honour the Idempotency-Key header — see Idempotency. PATCH /prompts/{id} follows RFC 7396 JSON Merge Patch: missing keys leave server state untouched; explicit null clears.

POST    /prompts                            create prompt + initial version
PATCH   /prompts/{id}                       RFC 7396 merge patch (application/merge-patch+json)
PUT     /prompts/{id}/current-version       switch which version is "current"
POST    /prompts/{id}/versions              append new version (auto-increments versionNumber)
POST    /prompts/{id}/attachments           upload file (multipart/form-data)
DELETE  /prompts/{id} V1.1                  soft-delete prompt; cascades hide records/versions/attachments under it
DELETE  /prompts/{id}/versions/{vid} V1.1   soft-delete version; rejects current version (409 version_is_current)
DELETE  /attachments/{id} V1.1              soft-delete + storage refund; 409 attachment_referenced_by_active_run if any active run's snapshot references the blob

Field caps: name ≤ 256 chars, promptText ≤ 256 KB, modelSettings JSON ≤ 64 KB, text attachments ≤ 5 MB, image attachments ≤ 10 MB, ≤ 20 attachments per prompt, ≤ 50 MB per prompt, 1 GB per user.

Cursor pagination

List endpoints return { items: [...], nextCursor: string|null }. Cursors are signed (HMAC-SHA256) with a server-side root key; tampered cursors return 400 invalid_cursor; cursors issued for one user can't be replayed by another (400 invalid_cursor); changing query filters mid-pagination returns 400 cursor_filter_mismatch.
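
A client-side pagination loop over the { items, nextCursor } shape might look like the sketch below. fetch_page is a hypothetical callable standing in for the HTTP request; the fake pages exist only so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages until nextCursor is null."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("nextCursor")
        if cursor is None:
            break


# Fake two-page listing; a real fetch_page would call GET /prompts?cursor=...
PAGES = {
    None: {"items": [{"id": "p1"}, {"id": "p2"}], "nextCursor": "c1"},
    "c1": {"items": [{"id": "p3"}], "nextCursor": None},
}


def fake_fetch(cursor):
    return PAGES[cursor]


ids = [item["id"] for item in iter_items(fake_fetch)]
print(ids)  # ['p1', 'p2', 'p3']
```

Treat the cursor as opaque and keep the query filters fixed for the whole loop, since changing them mid-pagination is rejected.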

SSE protocol

All run-producing endpoints (/run, /runs/{runId}/revise, /records/{id}/revise) return 200 OK with Content-Type: text/event-stream. Errors are in-stream events, not HTTP status codes — clients should NOT branch on HTTP status for these endpoints.

Event taxonomy (protocol v1)

run_session
  When: First event
  Payload: {protocolVersion, runId, turnIndex, modelId, outputModality?, seededFromRecordId?}
client_hint (V1.1 Phase 0)
  When: Image-modality runs only, immediately after run_session
  Payload: {filter: "partial_image_b64", reason: "image_modality_inline_payloads", canonicalAccess: "GET /api/v2/public/runs/{runId}/images/{n}"}
(upstream events)
  When: Middle
  Payload: OpenAI's response.output_text.delta etc., passed through verbatim
run_completed
  When: Success terminator
  Payload: {runId, turnIndex, modelId, costMicroCents, imageCount?}
run_failed
  When: Failure terminator
  Payload: {runId, reasonCode, message?, charged, costMicroCents?, usageLogId?}
run_replayed (V1.1)
  When: Idempotency-Key replay terminator
  Payload: {runId, turnIndex, modelId, outputModality?, state, streamingInProgress, recordId?, hint}
record_finalized (V1.2)
  When: Chained auto-finalize succeeded; emitted AFTER run_completed on the same stream when ?autoFinalize=true
  Payload: {runId, recordId, turns, costMicroCents?}
record_finalize_failed (V1.2)
  When: Chained auto-finalize failed; the upstream run already succeeded. If retryable, agent calls POST /finalize manually.
  Payload: {runId, reasonCode, retryable}
record_finalize_skipped (V1.2)
  When: Informational; emitted AFTER run_failed when ?autoFinalize=true was set. Agents MUST NOT trigger separate failure handling; the failure is already reported in run_failed.
  Payload: {runId, reason: "run_failed", reasonCode}

Forward-compat: clients MUST ignore unknown event names. Servers MUST NOT remove or rename existing event names within a protocol version. Adding new event names is OK. Check protocolVersion in run_session against your parser's expected version.

Image-modality runs:
  • HTTP callers (direct REST to /api/v2/public): the SSE stream includes inline base64 image payloads on response.image_generation_call.partial_image and response.output_item.done. These can be tens of KB to several MB per frame. Most clients will want to filter them out and fetch the canonical bytes via GET /api/v2/public/runs/{runId}/images/{n} after run_completed. A client_hint event is emitted right after run_session to flag this.
  • MCP callers (mcp.getpromethic.com/v1): the MCP transport drops partial_image_* events and redacts large base64 from kept frames automatically. Final image bytes are returned inline as MCP image content blocks in the tools/call result alongside the text transcript — no follow-up fetch needed. Clients that don't render image content can still call promethic_get_run_image by index, or GET /records/{recordId}/image?index={n} after finalize.
  • run_completed.imageCount reports how many images THIS turn produced (per-turn, not session-aggregate). Iterate n in [0, imageCount). A text-only revise on an image session emits imageCount: 0.
  • Pre-finalize, GET /runs/{runId}/images/{n} reads the LATEST turn's images. To access prior-turn images after the run ends, finalize first and read record.imageStoredPath via GET /records/{id}/image?index=N — every per-turn image is preserved on the record.
  • If image generation completed upstream but blob storage failed (3× retries exhausted), the run terminates with run_failed { reasonCode: "image_upload_failed", charged: true }. Replay with the same Idempotency-Key returns this same failure (no re-attempt, no double-bill).

run_replayed does not reflect post-replay state. When you retry a /run or /revise with the same Idempotency-Key, the server returns the runId from the original call (the billable work is already in flight or done) and terminates this stream with run_replayed instead of run_completed. The payload's state field is a snapshot from when the replay was recorded; it does NOT track later state changes on the same run. For the live state of a replayed run, the agent must derive it from its own bookkeeping of the original call (and, once GET /api/v2/public/runs/{runId} ships in V1.2, poll that). The CLI's RunCallResult surfaces this as {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.
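
A minimal sketch of a client that reads the stream and branches on in-stream terminators rather than HTTP status follows. It assumes standard SSE framing (event: and data: fields, blank line between events) and a JSON data payload; the sample stream, including the protocolVersion value, is invented for illustration. It stops at the first terminator, so it does not observe the record_finalize* events that can follow when auto-finalize is on.

```python
import json

TERMINATORS = {"run_completed", "run_failed", "run_replayed"}
ENVELOPE_EVENTS = TERMINATORS | {"run_session", "client_hint"}


def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                    # blank line dispatches the event
            if event is not None or data:
                yield (event or "message", "\n".join(data))
            event, data = None, []
        elif line.startswith(":"):        # SSE comment / keep-alive
            continue
        elif line.startswith("event:"):
            event = line[6:].strip()
        elif line.startswith("data:"):
            value = line[5:]
            data.append(value[1:] if value.startswith(" ") else value)


def run_outcome(lines):
    """Return the runId plus the first terminator event and its payload."""
    run_id = None
    for name, data in parse_sse(lines):
        if name not in ENVELOPE_EVENTS:
            continue                      # upstream or unknown events: ignore
        payload = json.loads(data)
        if name == "run_session":
            run_id = payload["runId"]
        elif name in TERMINATORS:
            return {"runId": run_id, "terminator": name, "payload": payload}
    return {"runId": run_id, "terminator": None, "payload": None}


STREAM = [
    'event: run_session',
    'data: {"protocolVersion": 1, "runId": "r1", "turnIndex": 0, "modelId": "m"}',
    '',
    'event: response.output_text.delta',
    'data: {"delta": "Hi"}',
    '',
    'event: some_future_event',
    'data: {}',
    '',
    'event: run_completed',
    'data: {"runId": "r1", "turnIndex": 0, "modelId": "m", "costMicroCents": 503}',
    '',
]
outcome = run_outcome(STREAM)
print(outcome["terminator"], outcome["payload"]["costMicroCents"])  # run_completed 503
```

A run_replayed terminator should be treated as "state unknown" rather than success, in line with the CLI behaviour described above.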

Charge visibility

run_failed carries charged: true when the upstream call was billed despite the local failure (usage_log_write_failed or session_lost_mid_* after a successful upstream call). Agents should record the cost from costMicroCents for reconciliation.

Cost units V1.1

All cost fields on the public API surface use microcents (1 cent = 1000 microcents; 1 USD = 100,000 microcents). The integer wire format preserves sub-cent precision for image-token / reasoning-heavy calls that previously truncated. Render as cents-with-decimals via costMicroCents / 1000; as USD via costMicroCents / 100000; e.g. costMicroCents: 503 = 0.503¢ = $0.00503. The field was renamed from costMicros on 2026-05-11 because the Micros suffix incorrectly suggested microUSD (the value is actually 1/1000 of a cent, which is 10 microUSD, so that reading is off by 10×).

Affected fields: record.costMicroCents, run_completed.costMicroCents, run_failed.costMicroCents, finalize.costMicroCents. The pre-V1.1 costCents field is removed from public DTOs; pre-V1.1 records (where the column was NULL) backfill via costCents * 1000 so historical rows still surface a cost.
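
The conversions are simple enough to get wrong by a power of ten, so here is a small sketch using Decimal to avoid float rounding. The function names are illustrative only.

```python
from decimal import Decimal

MICROCENTS_PER_CENT = 1000
MICROCENTS_PER_USD = 100_000


def to_cents(cost_micro_cents: int) -> Decimal:
    """Cents with decimals, e.g. 503 -> 0.503."""
    return Decimal(cost_micro_cents) / MICROCENTS_PER_CENT


def to_usd(cost_micro_cents: int) -> Decimal:
    """US dollars, e.g. 503 -> 0.00503."""
    return Decimal(cost_micro_cents) / MICROCENTS_PER_USD


def backfill_from_cents(cost_cents: int) -> int:
    """Pre-V1.1 rows: whole cents scaled up to microcents."""
    return cost_cents * MICROCENTS_PER_CENT


print(to_cents(503))           # 0.503
print(to_usd(503))             # 0.00503
print(backfill_from_cents(2))  # 2000
```

Summing costs as integers and converting once at display time keeps reconciliation exact.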

Rate-limit headers V1.1

Every public-API response (success + 429) carries:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1746201600
X-RateLimit-Bucket: key=47/60,user=120/300

The standard pair (Limit / Remaining) reports the most-restrictive bucket — the per-key bucket when the caller is API-key-attributed (always smaller than per-user), else the per-user bucket. The diagnostic X-RateLimit-Bucket reports both (key=N/L,user=N/L) so agents can observe per-key vs per-user pressure separately. Session-only callers see X-RateLimit-Bucket: user=N/L.
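
A sketch of parsing the diagnostic header is shown below. It assumes N in key=N/L is the remaining count, which is inferred from the sample headers (Remaining: 47 alongside key=47/60) rather than stated outright.

```python
def parse_bucket(value):
    """Parse 'key=47/60,user=120/300' into {'key': (47, 60), 'user': (120, 300)}."""
    buckets = {}
    for part in value.split(","):
        name, _, usage = part.strip().partition("=")
        remaining, _, limit = usage.partition("/")
        buckets[name] = (int(remaining), int(limit))
    return buckets


def tightest_bucket(buckets):
    """Name of the bucket with the smallest fraction of its limit left."""
    return min(buckets, key=lambda name: buckets[name][0] / buckets[name][1])


headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "47",
    "X-RateLimit-Bucket": "key=47/60,user=120/300",
}
buckets = parse_bucket(headers["X-RateLimit-Bucket"])
print(buckets)                   # {'key': (47, 60), 'user': (120, 300)}
print(tightest_bucket(buckets))  # user
```

In this sample the per-user bucket is proportionally closer to exhaustion even though the standard pair reports the per-key bucket, which is the kind of pressure the diagnostic header is meant to expose.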

CLI

The promethic Node CLI (@soulwarestudio/promethic-cli on npm) ships every public-API endpoint behind ergonomic commands.

npm install -g @soulwarestudio/promethic-cli
promethic auth login                       # paste pmk_... key
promethic auth status
promethic auth logout

promethic prompts list [--limit N] [--cursor C] [--json]
promethic prompts get <id> [--json]
promethic prompts delete <id>                                                # V1.1
promethic prompts delete-version <promptId> <versionId>                      # V1.1

promethic run <prompt-id> [--input "..."] [--input-file path] [--no-accept] [--auto-finalize true|false] [--json]   # V1.2: --auto-finalize
promethic revise <handle> --instruction "..." [--intermediate-output "..."] [--from-turn N] [--no-accept] [--json]
                                                                                                                  # V1.2: handle is run-id or record-id; --from-turn rewinds
promethic finalize <run-id> [--final-text "..."] [--tag "..."] [--notes "..."] [--from-turn N] [--json]
                                                                                                                  # V1.2: --from-turn rewinds; finalize on Finalized session amends in place
promethic abandon <run-id> [--json]

promethic records list [--prompt <id>] [--source API] [--json]
promethic records get <id> [--json]
promethic records image <id> --index N --output path.png
promethic records delete <id>                                                # V1.1
promethic records patch <id> [--notes "..."|--clear-notes] [--tag "..."|--clear-tag] [--json]   # V1.1

promethic run <promptId> --image <file> [--image <file>]                     # V1.1 Phase 3 — vision input

promethic attachments add <promptId> <file> [--type image|text] [--filename ...] [--json]
promethic attachments list <promptId> [--json]                               # V1.1 Phase 3
promethic attachments get <attachmentId> <outputPath>                        # V1.1 Phase 3
promethic attachments delete <id>                                            # V1.1

promethic mcp [--probe]                                                      # V1.1 — Claude Desktop / MCP server

Claude Desktop & other MCP clients V1.1

The CLI doubles as a local MCP (Model Context Protocol) server. promethic mcp exposes 25 tools covering the full Avalonia desktop / Expo web workspace surface:

  • read: list_prompts (returns name + description + outputModality so agents can decide whether to invoke without an extra round-trip), get_prompt, list_records, list_versions, get_version, list_attachments, get_attachment, get_catalog
  • execute: run_prompt (V1.2: optional autoFinalize auto-creates a record on success), revise_run (V1.2: accepts runId XOR recordId — auto-finalized records can be revised by recordId, no need to set autoFinalize=false to "keep editing"; fromTurn rewinds), finalize_run (V1.2: fromTurn rewinds; can amend a Finalized session), abandon_run, get_run_image, delete_record, patch_record (notes / tag merge-patch), create_record (V1.4: agent-curated training data — {promptId, input, output, notes?} creates a finalized record with no LLM call, no spend; for "voice prompt" workflows where the user collaborates with the agent to seed examples before running Generate Prompt on the desktop)
  • write (V1.1 Phase 2 + Phase 3): create_prompt, update_prompt, delete_prompt, create_version (with optional setAsCurrent in one transaction), update_version (versionDescription + description), switch_current_version, delete_version, upload_attachment, delete_attachment

All tools mirror the workspace flow agents would otherwise need a desktop or web browser to drive: author prompts, manage versions, attach reference files, run with vision, revise, finalize, edit notes/tags.

Install once, then add this block to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "promethic": {
      "command": "promethic",
      "args": ["mcp"]
    }
  }
}

Restart Claude Desktop. The 25 Promethic tools appear in its tool tray. The MCP server uses the same pmk_… key as the rest of the CLI (so promethic auth login covers Claude Desktop too); set PROMETHIC_API_KEY in the config block's env to override per-install.

Smoke-test before wiring Claude Desktop:

promethic mcp --probe   # auth + URL + connectivity check, exits

Image input/output V1.1 Phase 3

Output: image-modality runs return image bytes inline as base64 (≤ 16 MB raw) or as a local-file path (larger). Same shape via get_run_image for in-flight runs and get_attachment for prompt attachments.

Input: run_prompt.images accepts the SAME media-ref shape as the output. Agents can pipe a prior run's image straight back in:

{
  "images": [
    { "inline": true,  "base64": "...",      "mimeType": "image/png" },
    { "inline": false, "localPath": "/tmp/photo.png", "mimeType": "image/png" }
  ]
}

Up to 16 images per run, 10 MB each. Requires the prompt's model to declare input_image capability (GPT-5.x via Responses API; gpt-image-1.x as edit inputs). Trust note: localPath is read by the MCP server with the user's permissions — only use paths the agent is authorized to read.
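
Building the images argument from raw bytes or a local path could be sketched as follows. The limits come from the paragraph above; treating MB as 1024 × 1024 bytes and measuring the raw (pre-base64) size are assumptions, and the helper names are hypothetical.

```python
import base64

MAX_IMAGES = 16
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: binary MB, raw bytes


def inline_ref(data: bytes, mime_type: str = "image/png") -> dict:
    """Media ref carrying the image bytes as base64."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB per-image cap")
    return {"inline": True,
            "base64": base64.b64encode(data).decode("ascii"),
            "mimeType": mime_type}


def path_ref(local_path: str, mime_type: str = "image/png") -> dict:
    """Media ref pointing at a file the local MCP server will read."""
    return {"inline": False, "localPath": local_path, "mimeType": mime_type}


def images_argument(refs) -> dict:
    """Wrap refs in the run_prompt images shape, enforcing the count cap."""
    refs = list(refs)
    if len(refs) > MAX_IMAGES:
        raise ValueError("at most 16 images per run")
    return {"images": refs}


args = images_argument([inline_ref(b"\x89PNG\r\n"), path_ref("/tmp/photo.png")])
print(len(args["images"]))  # 2
```

Because output and input share the same media-ref shape, a ref returned by a prior run can be passed to images_argument unchanged.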

Attachment management

upload_attachment takes either inline base64 or a localPath; idempotency keys are derived from (promptId, filename, content) so retries replay the original upload (no double-billing). list_attachments, get_attachment, and delete_attachment round out the surface. Per-file: 10 MB image / 5 MB text. Per-prompt: 50 MB total.

MCP cancellation propagates to a server-side /abandon so cancelled runs release their RunSession within ~1 s.

The same pattern works for Cursor, Zed, Continue, Cline, and any other desktop-class MCP-aware client.

Hosted MCP V1.3

For agents that can't (or shouldn't) run a local stdio server (Claude iOS, claude.ai web, sandboxed automations), Promethic hosts the same 25-tool surface at https://mcp.getpromethic.com/v1 speaking MCP Streamable HTTP transport (spec 2025-03-26). Same tools, same scopes, same pmk_ keys.

Why hosted MCP exists: an agent calling tools via stdio needs a local CLI install + a long-lived process. A hosted endpoint replaces both with one HTTP URL — Claude iOS just adds a connector, no local binary. The wire shape is identical to the local CLI, so existing scripts don't change.

Connect Claude Desktop / Cursor

{
  "mcpServers": {
    "promethic": {
      "url": "https://mcp.getpromethic.com/v1",
      "headers": {
        "Authorization": "Bearer pmk_..."
      }
    }
  }
}

Claude Desktop config path is the same as the stdio install above (claude_desktop_config.json). Cursor: settings JSON, same shape. The pmk_ key supplies auth; the server exchanges it at initialize for a connection-bound short-lived mcps_ session token.

Connect Claude iOS / claude.ai web / ChatGPT (OAuth)

These clients use OAuth 2.1 + PKCE instead of bearer-key paste, because their connector UI doesn't accept a static token. Add a custom connector pointing at https://mcp.getpromethic.com/v1 and leave the OAuth fields blank; the client auto-discovers them via /.well-known/oauth-protected-resource. On Connect, a popup opens to the Promethic consent screen: sign in, click Allow, and the connector activates. Revoke any time at app.getpromethic.com → Settings → Connected Apps. The full 25-tool surface appears in the agent's tool tray. Tools are namespaced promethic_<name> on the wire (underscore separator per Anthropic's Tool API name regex).

Image-modality runs over MCP

promethic_run_prompt on an image-modality prompt returns the generated image bytes inline as MCP image content blocks in the tools/call result, alongside the text transcript. Claude.ai / ChatGPT render these directly. The text transcript shows [image bytes elided ...] for events that originally carried base64 — those bytes live in the image content blocks instead. Use promethic_get_run_image only as a fallback (e.g., when re-fetching after losing the original tool result).

Per-tool grants (opt-in)

Hosted MCP supports an optional per-tool allow-list on top of the read/execute/write scopes. New keys are unconfigured by default and can call any tool the key's scopes permit — useful while the per-tool config UI is being built. Once you opt in (set an explicit allow-list on the key), only those tool names succeed; anything else returns tool_grant_required. The wildcard ["*"] explicitly reverts to allow-all, and the empty array [] blocks every non-discovery tool.

Discovery surfaces (list_prompts, get_catalog) always bypass the per-tool gate so agents can always discover what tools exist.
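
The per-tool gate has four cases (unconfigured, wildcard, empty list, explicit list) plus the discovery bypass. A sketch of that decision, with None standing in for "unconfigured" as an assumption about representation:

```python
from typing import List, Optional

DISCOVERY_TOOLS = {"list_prompts", "get_catalog"}


def tool_allowed(allow_list: Optional[List[str]], tool_name: str) -> bool:
    """Per-tool gate only; scopes are checked separately."""
    if tool_name in DISCOVERY_TOOLS:
        return True               # discovery always bypasses the gate
    if allow_list is None:
        return True               # unconfigured key: allow-all
    if "*" in allow_list:
        return True               # explicit wildcard: allow-all
    return tool_name in allow_list


print(tool_allowed(None, "run_prompt"))          # True
print(tool_allowed([], "run_prompt"))            # False
print(tool_allowed([], "list_prompts"))          # True
print(tool_allowed(["run_prompt"], "revise_run"))  # False
```

A False result here corresponds to the tool_grant_required error.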

Differences from local CLI MCP

  • upload_attachment requires inline bytes_base64; localPath is rejected (the hosted server cannot read the agent's filesystem). The 10 MB raw cap matches the local CLI; chunked upload (upload_id / chunk_index / chunk_total) is reserved for V2.
  • Idempotency keys: vendor-prefixed _meta namespace — _meta["com.getpromethic/idempotency-key"] on the JSON-RPC request. Same byte-identity replay semantics as the HTTP Idempotency-Key header.
  • Streaming tools (run_prompt, revise_run) use notifications/progress (MCP spec) for live model output. Path B durable resume: if the live stream drops, GET /v1/sessions/{sid}/calls/{toolCallId} returns the final result once available.

Run lifecycle (sessions vs records, auto-finalize)

A run is a transient session: the model receives your input, streams an output, and the server tracks state in RunSession (1h sliding TTL). A record is the persisted artifact: input + final output + any revision turns + edits + cost — the structured data Promethic uses to refine your prompt over time.

To go from session to record, call finalize_run. To discard a session instead, call abandon_run, or do nothing and let it expire.

Auto-finalize (default ON since V1.2) chains a finalize_run after a successful run_prompt on the same SSE stream. One call, one round trip, one saved record. The new recordId arrives via the record_finalized SSE event. This is what most agents want — call run_prompt, use the returned record.

Pass autoFinalize: false on a single run_prompt call to opt out — useful when you want to revise_run the output before saving, or just inspect it and decide whether to abandon_run instead. Then revise with the runId and finalize manually with --final-text / --tag / --notes when ready. Use revise with a recordId instead of a runId to rehydrate from a finalized record.

To set the persistent default (so all your runs from any client behave the same way), use one of:

  • Avalonia desktop: Settings → Appearance → Hosted MCP toggle.
  • Expo (web / iOS): Settings → toggle "Auto-save MCP runs as records".
  • CLI: promethic config set auto-finalize-mcp-runs <true|false> (V1.3+).

Per-call autoFinalize always overrides the persistent default. The persistent default in turn overrides the V1.2 server default of true.
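
The precedence chain can be written as a three-step fallback. In this sketch None means "not set" at that level, which is an assumption about how absence is represented.

```python
from typing import Optional


def effective_auto_finalize(per_call: Optional[bool] = None,
                            persistent_default: Optional[bool] = None,
                            server_default: bool = True) -> bool:
    """Per-call value wins, then the persistent default, then the server default."""
    if per_call is not None:
        return per_call
    if persistent_default is not None:
        return persistent_default
    return server_default


print(effective_auto_finalize())                                # True
print(effective_auto_finalize(persistent_default=False))        # False
print(effective_auto_finalize(per_call=True, persistent_default=False))  # True
```

Passing the per-call value explicitly is the only way to get behaviour that does not depend on account settings.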

--no-accept on the CLI saves a JSON artifact at ~/.promethic/runs/<runId>.json (mode 0600) so the runId + state survives across shell sessions.

Override the API URL

export PROMETHIC_API_URL=http://localhost:8080
promethic auth status

Only https://..., http://localhost, and http://127.0.0.1 are accepted — the CLI refuses to send your key to other http:// hosts.
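
The acceptance rule can be approximated with the standard library's URL parser. This is a sketch of the rule as stated, not the CLI's actual code, and is_allowed_api_url is a hypothetical name.

```python
from urllib.parse import urlsplit


def is_allowed_api_url(url: str) -> bool:
    """Accept any https URL, and plain http only for loopback hosts."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return bool(parts.hostname)
    if parts.scheme == "http":
        return parts.hostname in ("localhost", "127.0.0.1")
    return False


print(is_allowed_api_url("http://localhost:8080"))       # True
print(is_allowed_api_url("http://example.com"))          # False
print(is_allowed_api_url("http://localhost.evil.test"))  # False
```

Comparing the parsed hostname, rather than checking a string prefix, is what keeps look-alike hosts such as the last example out.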

Errors

All 4xx/5xx responses follow RFC 7807 application/problem+json. Designed for self-healing agents — every error names what went wrong, what to do about it, and (where applicable) which exact field tripped:

{
  "type": "https://api.getpromethic.com/problems/invalid_model_settings",
  "title": "Model settings reference an unknown or inactive model.",
  "status": 400,
  "detail": "The model_id is not in the catalog, or it has been retired.",
  "reason_code": "invalid_model_settings",
  "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
  "request_id": "req_01HX...",
  "invalid_params": [
    { "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
  ]
}

Each type URL is also a working redirect: GET /problems/{reason_code} 302s to the matching section of these docs (e.g. /problems/idempotency_key_reused). Agents that follow the link land on a human description plus the resolution steps for that specific reason.
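
An agent can pull the actionable parts out of a problem document with a few lines. The sketch below parses the sample error shown earlier in this section; summarize_problem is a hypothetical helper and the output dict shape is only one possible choice.

```python
import json

PROBLEM = """
{
  "type": "https://api.getpromethic.com/problems/invalid_model_settings",
  "title": "Model settings reference an unknown or inactive model.",
  "status": 400,
  "detail": "The model_id is not in the catalog, or it has been retired.",
  "reason_code": "invalid_model_settings",
  "action_hint": "List models via GET /api/v2/public/models, then retry with a current model_id.",
  "request_id": "req_01HX...",
  "invalid_params": [
    { "name": "modelSettings.model_id", "reason": "unknown_or_inactive_model" }
  ]
}
"""


def summarize_problem(body: str) -> dict:
    """Extract the fields an agent needs to decide its next step."""
    problem = json.loads(body)
    return {
        "status": problem.get("status"),
        "reason_code": problem.get("reason_code"),
        "action_hint": problem.get("action_hint"),
        "fields": [p["name"] for p in problem.get("invalid_params", [])],
    }


summary = summarize_problem(PROBLEM)
print(summary["reason_code"])  # invalid_model_settings
print(summary["fields"])       # ['modelSettings.model_id']
```

Branching on reason_code rather than on title or detail keeps the agent stable if the human-readable wording changes.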

Reason codes — each row's id is the redirect target for /problems/{reason_code}:

HTTP  reason_code  Meaning
400  invalid_request  shape error; see invalid_params (subsumes null-not-allowed via required_field_clear reason)
400  invalid_model_settings  unknown / inactive model_id, or missing parameters object; action_hint tells you to GET /api/v2/public/models
400/413  field_too_large  see V1.0.1 caps in the Write section
400  idempotency_key_invalid  missing, > 255 chars, comma-joined, or non-visible-ASCII
400  stream_required  POST /run/revise must include stream: true
400  invalid_cursor  tampered or cross-user cursor
400  cursor_filter_mismatch  filter params changed mid-pagination
400  persist_query_param_removed  V1.1: ?persist query param removed; /finalize always commits. Use DELETE /records/{id} within 24h to undo.
410  record_was_deleted  V1.1: this run finalized to a record that has since been self-deleted. Mint a new run.
500  snapshot_corrupt  V1.1: server-side data integrity: the run session's version snapshot failed to parse. Mint a new run.
409  reopen_limit_exceeded  V1.1: session has been reopened more than 100 times via /revise after /finalize. Mint a new run.
400  intermediate_output_not_supported_image  V1.1: image-modality runs reject intermediateOutput. Mirrors desktop: image revisions always use the model's actual output.
413  intermediate_output_too_large  V1.1: intermediateOutput exceeds 32 KB per-turn cap.
413  final_text_too_large  V1.1: finalText exceeds 256 KB cap.
413  notes_too_large  V1.1: notes exceeds 64 KB cap.
413  tag_too_large  V1.1: tag exceeds 256 character cap.
400  tag_without_delta  V1.1: /finalize received tag but no edit delta was produced (finalText omitted or matches model output). Use notes for record-level labels, or PATCH /records/{id} to re-tag.
403  record_not_owned_by_api_key  V1.1: DELETE/PATCH /records/{id} on the public API is restricted to the API key that created the record. Mutate via the Promethic web/desktop app, or use the original API key.
409  record_self_delete_window_expired  V1.1: DELETE /records/{id} on the public API is restricted to the first 24h after record creation. Delete via the Promethic web/desktop app instead.
409  record_no_edit_delta  V1.1: PATCH /records/{id} attempted to set tag on a record with no edit delta. Tags attach to edit deltas only; use notes for record-level labels.
403  grant_required  V1.1: API key is restricted to a specific set of prompts and the requested prompt is not in that set. Manage at Settings → Developer Keys → Manage prompts, or use an unrestricted key.
409  version_is_current  V1.1: DELETE /prompts/{id}/versions/{vid} attempted on the prompt's current version. Switch the current version first via PUT /prompts/{id}/current-version.
409  attachment_referenced_by_active_run  V1.1: DELETE /attachments/{id} blocked because an active RunSession's snapshot references this attachment. Wait for runs to finalize/expire (max 1h), or POST /runs/{runId}/abandon them.
409  prompt_referenced_by_active_run  V1.1: DELETE /prompts/{id} blocked because at least one non-terminal RunSession is still active for this prompt. Wait or /abandon.
409  version_referenced_by_active_run  V1.1: DELETE /prompts/{id}/versions/{vid} blocked because at least one non-terminal RunSession pins this version. Wait or /abandon.
409 / lifted  image_runs_not_supported_v1  V1.1 Phase 7: lifted. Image-modality runs via API key are now supported on /run, /revise, /revise-again. The accumulated effective prompt is persisted per-turn and surfaces as record.finalCopiedOutput after /finalize. The reason code is kept in the table for back-compat with old SDKs but is no longer emitted.
400  final_text_not_supported_image  V1.1: image-modality runs reject finalText. record.FinalCopiedOutput for image records is server-derived from the image-prompt accumulation chain (training-data invariant).
413  session_deltas_too_large  V1.1: total session.Deltas jsonb exceeds the 2 MB cap. Finalize and start fresh.
413  cost_incurred_no_delta_persisted  V1.1: upstream model call billed but the resulting turn couldn't persist (post-upstream cap exceeded). UsageLog has the charge.
400  invalid_image_base64  bad base64 in images[].data
400  invalid_index  index ≥ 0 violation (e.g. ?index= on record image)
400invalid_source?source= not in {App, Manual, API}
400instruction_requiredPOST /revise needs a non-empty instruction
400from_turn_invalidV1.2 — fromTurn not a valid non-negative integer
400from_turn_out_of_rangeV1.2 — fromTurn exceeds current turn count; re-read turns[]
400mixed_credentials_principal_mismatchsession + key resolve to different users
400mixed_credentials_key_mismatchtwo API keys present that don't match
401key_unauthorizedmissing / invalid / expired / revoked API key
403scope_requiredkey lacks the required scope
403api_key_not_permittedendpoint requires a session, not a key
404prompt_not_foundno prompt with that id is visible to this caller
404version_not_foundno matching version on the prompt
409current_version_missingprompt's currentVersionId points to a deleted version; no fallback could be self-healed
404record_not_foundno record with that id is visible to this caller
404run_not_foundrun expired, never existed, or not yours
404attachment_not_foundno attachment with that id is visible
404no_image_storedthis record has no stored images
404image_index_out_of_range?index past the count of stored images
404invalid_image_referencedefense-in-depth validation refused the path
409idempotency_key_reusedsame key on a different body — won't replay; mint a new key
409idempotency_in_flightsame key still being processed; retry after Retry-After
409session_busyanother /revise or /finalize in flight
409session_not_activerare CAS-race; re-fetch run state
409session_already_finalizedterminal state — fired by /revise on a finalized run. Note: /finalize itself is idempotent and replays the original 200 + recordId on retry; it does NOT 409.
409session_expiredpast 1h TTL
409session_failedterminal — see reason_code
409session_abandonedabandoned manually or by RevokeAsync's bulk abandon
409revision_chain_too_long25 turns/session cap
409revise_again_attachments_unsupportedrecord's prompt has attachments
409record_revise_in_progressV1.2 — another caller holds the per-record rehydrate lock for this recordId; retry after a short backoff
409snapshot_modality_unreadableinternal snapshot data unreadable
409finalize_completion_failedinternal: finalize transaction failed
500finalize_conflictunexpected record conflict during finalize; session reset to Active — retry /finalize
500image_upload_failedV1.1 Phase 0 — image generated upstream but blob storage write failed after retries; upstream charged (charged: true); replay returns this failure for the Idempotency-Key
500image_extraction_overflowV1.1 Phase 0 — upstream produced more images than the 16-per-turn cap; reduce n or split runs
409run_already_terminalcannot abandon a finalized/abandoned/expired/failed run
409version_create_contentionconcurrent version inserts; retry
413storage_quota_exceededper-prompt or per-user storage cap reached
429rate_limitedper-key or per-user bucket overflow; honour Retry-After
500stream_setup_failedSSE response failed to initialize before the proxy call
503auth_store_unavailabletransient idempotency-store race; retry
500idempotency_outcome_unknownV1.3 Phase 4b — process died mid-flight after possibly committing the domain mutation but before recording Complete; retries of the same Idempotency-Key replay this body until 24h TTL. Verify via GET before retry — see Recovery from idempotency_outcome_unknown for per-tool recipes.
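
The table mixes transient failures with terminal ones. Below is a minimal sketch of how a client might sort them, using only reason codes whose descriptions above say "retry", point at Retry-After, or ask for a state check first. The grouping and the names RETRYABLE, VERIFY_FIRST, and classify are illustrative assumptions, not a list published by the server. Both reason_code and reasonCode spellings are accepted because both appear in this document.

```python
# Reason codes whose table entries explicitly invite a retry of the same call.
# This grouping is an illustrative reading of the table, not an official list.
RETRYABLE = frozenset({
    "idempotency_in_flight",      # retry after Retry-After
    "rate_limited",               # honour Retry-After
    "record_revise_in_progress",  # retry after a short backoff
    "version_create_contention",  # concurrent version inserts; retry
    "auth_store_unavailable",     # transient; retry
    "finalize_conflict",          # session reset to Active; retry /finalize
})

# Codes where the table says to inspect server state before doing anything.
VERIFY_FIRST = frozenset({
    "idempotency_outcome_unknown",  # verify via GET before retry
    "session_not_active",           # re-fetch run state
})


def classify(problem: dict) -> str:
    """Return 'retry', 'verify', or 'fail' for a parsed problem+json body."""
    code = problem.get("reason_code") or problem.get("reasonCode") or ""
    if code in RETRYABLE:
        return "retry"
    if code in VERIFY_FIRST:
        return "verify"
    # Everything else is treated as terminal for the current request.
    return "fail"
```

Treating unlisted codes as "fail" is the conservative default: a new reason code added in a minor release is then never retried blindly.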

Idempotency V1.0.1

Every mutating POST (the five Write endpoints above, plus /prompts/{id}/run when its body is the same as a prior attempt) accepts an Idempotency-Key header. This is a Stripe-style guarantee: a network glitch mid-call is safe to retry — the server replays the original response byte-identically instead of double-applying the side effect.

Contract

  • Header value: 1–255 visible-ASCII characters (0x21–0x7E), no commas, sent at most once.
  • Same key + same body + same route → server replays the original status, headers, and body.
  • Same key + different body → 409 idempotency_key_reused (the agent picked a key it already used for a different request — generate a new one).
  • Same key, original still in flight → 409 idempotency_in_flight + Retry-After: 1.
  • Records expire 24 h after the original call completes (Stripe parity). After expiry the same key is fresh again.
  • Replay returns the response shape from the original call. If we ship a new field on (e.g.) POST /prompts between your first call and your retry, the retry returns the OLD shape — not the new one. This is intentional Stripe parity: replays are byte-identical snapshots. The 24 h TTL bounds staleness; for the freshest shape, mint a new key.
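
The header-value rule in the first bullet can be checked client-side before a request is sent, which avoids a round trip that would end in 400 idempotency_key_invalid. The sketch below is a minimal illustration; the function names are hypothetical and not part of the CLI or any SDK.

```python
import uuid


def is_valid_idempotency_key(key: str) -> bool:
    """Check the contract: 1-255 visible-ASCII chars (0x21-0x7E), no commas."""
    if not 1 <= len(key) <= 255:
        return False
    return all(0x21 <= ord(ch) <= 0x7E and ch != "," for ch in key)


def new_idempotency_key() -> str:
    """Mint a fresh key; a UUIDv4 string always satisfies the contract."""
    return str(uuid.uuid4())
```

Note that a space (0x20) falls outside the visible-ASCII range, so keys with embedded whitespace are rejected by this check as well.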

How the CLI uses it

The CLI auto-generates a fresh UUIDv4 per invocation by default — each promethic prompts create call is a distinct attempt. Pass --idempotency-key <uuid> to pin one if you want a manual retry to be a no-op. The attachments add command derives a deterministic key from sha256(promptId + filename + size + content) so a retry of the same upload is naturally idempotent.

Filename participates in attachment identity. Both the CLI's deterministic key and the server's body hash include the filename. A retry of the same content under a different filename (e.g. --filename newname.txt) is treated as a fresh upload and consumes storage twice. If you want to rename an existing attachment, delete the original through the web app first (DELETE on attachments is V1.1; see "Not in V1.1" below).
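
A deterministic key of this kind can be sketched as follows. The text above only says the CLI hashes promptId, filename, size, and content with SHA-256; the byte layout here (UTF-8 fields, each length-prefixed) and the function name are assumptions, so keys produced by this sketch will not necessarily match the ones the CLI sends.

```python
import hashlib


def attachment_idempotency_key(prompt_id: str, filename: str, content: bytes) -> str:
    """Derive a stable key from prompt id, filename, size, and content."""
    digest = hashlib.sha256()
    parts = (
        prompt_id.encode("utf-8"),
        filename.encode("utf-8"),
        str(len(content)).encode("ascii"),  # the "size" component
        content,
    )
    for part in parts:
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    # 64 lowercase hex chars: within the 1-255 visible-ASCII key contract.
    return digest.hexdigest()
```

Because the filename is one of the hashed fields, the same bytes uploaded under a new name yield a different key, which is the behaviour the paragraph above warns about.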

Recovery from idempotency_outcome_unknown V1.3 Phase 4b

If the server process dies between committing the domain mutation and recording the idempotency Complete, a sweep flips the row to state=failed with a synthetic body:

{
  "type": "https://api.getpromethic.com/errors/idempotency_outcome_unknown",
  "title": "Idempotent run outcome unknown",
  "status": 500,
  "reasonCode": "idempotency_outcome_unknown",
  "detail": "The original request died mid-flight (process crash or lease expired without heartbeat). The domain change MAY OR MAY NOT have landed. Verify via a GET before any retry — replaying the same Idempotency-Key returns this body verbatim, and a NEW key may duplicate the original mutation.",
  "route": "POST /api/v2/public/prompts"
}

Replays of the same key continue to return this body until the row's 24 h TTL expires. The atomicity refactor in PR #18 (per-endpoint BeginTransactionAsync wrapping the domain mutation + Complete) makes this case much rarer after 2026-05-09. For tools that landed before PR #18 (or future tools added without the wrapper), this recovery path is still load-bearing.

Per-tool verify recipes (use these BEFORE retrying with the same OR a new key):

Phase 6 timing note (2026-Q3): the recipe for run_prompt below uses a future clientIdempotencyKey field on the record DTO as the authoritative disambiguator. That field is not yet shipped — the explicit "(when available)" framing in the recipe handles this. Until Phase 6 lands, agents will fall back to the heuristic match (createdAt + inputText). The heuristic is unreliable for repeated identical inputs in the same window — verify carefully OR mint a new key + accept the duplicate cost when in doubt.
Tool / route | Verify recipe
POST /prompts + MCP create_prompt | GET /api/v2/public/prompts (or MCP list_prompts), then match by the name field you submitted; names are user-chosen and likely unique within your set. If found: the create succeeded; do NOT retry. If not found: safe to mint a new key + retry.
POST /prompts/{id}/versions + MCP create_version | GET /api/v2/public/prompts/{id}/versions (or MCP list_versions), then match by versionNumber = (highest from your pre-call read) + 1. If a version with that number exists with your prompt text: succeeded. If not: safe to retry with a new key.
POST /prompts/{id}/attachments + MCP upload_attachment | GET /api/v2/public/prompts/{id}/attachments (or MCP list_attachments), then match by originalFilename + fileSize. If found: succeeded. If not: safe to retry. Note: storage quota was reserved at Begin time; a lost call leaves the quota reserved until the idempotency row's 24 h TTL refunds it via the orphan-blob sweep.
POST /prompts/{id}/run + MCP run_prompt | Run records auto-finalize by default. Authoritative disambiguator (when available): filter list_records by clientIdempotencyKey; every record carries the originating Idempotency-Key from the call that created it. If found: the run succeeded and the record exists. Cost was billed; you've paid for it. If not found: the run did not complete; safe to mint a new key + retry. Heuristic fallback (use ONLY when the authoritative path isn't available, e.g. a tool that doesn't yet expose clientIdempotencyKey): match by createdAt in your call window AND inputText. Be aware that an agent calling run_prompt with the same input multiple times in 24h cannot disambiguate via output / cost_micros alone; those are nearly identical for deterministic prompts. The heuristic is a guess; do not blindly retry on a match-of-many.
MCP finalize_run / revise_run | These take a runId. Step 1: GET the run state via GET /api/v2/public/runs/{runId} (or let MCP list_records filter by runSessionId). If a record exists with your runId: finalize succeeded. If RunSession.State == Finalized with a finalizedRecordId: ditto, succeeded. If State == Active: the run is back to a state you can retry from; mint a new key + retry. If State ∈ {Running, Finalizing}: a concurrent attempt is in flight or recovering; wait + re-poll. If State == Failed: terminal; do not retry. Do NOT mint a new key without GET-checking state first: retrying a fresh-key finalize against a Finalizing session 409s (session_busy) or races the Phase 4b finalize-failure→Active reset.
MCP delete_* / patch_record | GET the resource by id. If 404 (delete) or fields match your patch (patch): succeeded. Otherwise safe to retry with a new key.
Do NOT blindly retry with a new key. The crash body explicitly says "verify before retry" because domain state may have landed. A naive retry-with-new-key duplicates whatever did land — the exact silent-double-execute footgun the flip-to-failed semantic exists to surface.
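
The finalize_run / revise_run recipe is essentially a decision table over the run's state, and can be sketched as a pure function. The state names come from the recipe above; how they are spelled on the wire, and the returned action labels, are assumptions made for illustration.

```python
from typing import Optional


def next_step(state: str, finalized_record_id: Optional[str] = None) -> str:
    """Map a fetched run state to the recovery action from the recipe."""
    if state == "Finalized" and finalized_record_id:
        return "done"                # finalize landed; do not retry
    if state == "Active":
        return "retry_with_new_key"  # safe to mint a new key and retry
    if state in ("Running", "Finalizing"):
        return "wait_and_repoll"     # another attempt is in flight
    if state == "Failed":
        return "stop"                # terminal; do not retry
    # States the recipe does not cover are surfaced rather than guessed at.
    return "inspect"
```

The function deliberately has no branch that retries without a fetched state, mirroring the rule that a new key must never be minted before the GET check.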

Rate limits

Per-minute fixed-window buckets, evaluated AFTER auth (so an unauthenticated burst can't drain a per-user bucket the caller doesn't own):

Scope | Per key | Per user
read | 60/min | 300/min
execute | 30/min | 90/min

On overflow: 429 with a Retry-After header and an RFC 7807 problem document carrying reason_code: "rate_limited" plus an action_hint describing whether the key or the user bucket overflowed.
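
A client can honour Retry-After with a small helper like the one below. This is a sketch: the function name and the 1-second default are assumptions. The examples in this document show an integer number of seconds; the HTTP-date branch is included only because the HTTP specification also allows that form.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay_seconds(
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
    default: float = 1.0,
) -> float:
    """Return how long to wait before retrying, based on Retry-After."""
    raw = None
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            raw = value.strip()
            break
    if not raw:
        return default
    if raw.isdigit():
        return float(raw)                  # delta-seconds form, e.g. "1"
    try:
        when = parsedate_to_datetime(raw)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```

Since the buckets are per-minute fixed windows, waiting the advertised delay is usually enough; adding jitter on top is a client-side choice this sketch leaves out.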

Versioning

The URL path carries the major version (/api/v2/public). The SSE protocol carries an in-band protocolVersion for forward-compat extension within the same path version.

  • Removing or renaming an existing endpoint or event = major bump.
  • Adding a new endpoint, event, or response field = minor (no bump).
  • Changing field semantics on an existing field = major bump.

Catalog stability

GET /api/v2/public/models is an agent-facing contract. What's safe (no major bump) for us to do:

  • Add a new model.
  • Add a new value to a parameter's values enum (e.g. reasoning_effort: ["none","low","medium","high"] → [..., "xhigh"]). Strict-validating agents should treat unknown enum values as forward-compat additions, not errors.
  • Add a new capability bit, parameter, or cost field.
  • Retire a model. Once retired the model_id is no longer in the catalog and any prompt referencing it gets 400 invalid_model_settings with action_hint directing the agent to fetch the live catalog and pick a current model.
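
One way for an agent to stay forward-compatible with enum additions is to select from the values it understands and ignore the rest, rather than validating the catalog against a closed set. The sketch below assumes the caller has already pulled a parameter's values list and provider_default out of the /models response; the function name and signature are hypothetical.

```python
from typing import Optional, Sequence


def choose_value(
    preferred: Sequence[str],
    catalog_values: Sequence[str],
    provider_default: Optional[str] = None,
) -> Optional[str]:
    """Pick the first preferred value that the live catalog still offers."""
    offered = set(catalog_values)
    for value in preferred:
        if value in offered:
            return value
    # Nothing we know is offered: defer to the catalog's default, if any.
    return provider_default
```

Unknown entries such as "xhigh" never raise here; they are simply not selected unless the agent lists them in its own preferences.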

Known V1.0.1 limitations resolved in V1.1

  • cost_cents on records is integer-truncated; sub-cent costs round to 0. Resolved in V1.1 Phase 8: public DTOs now expose costMicroCents (microcents, 1/1000 cent) for sub-cent precision. costCents is removed from the public surface; render via costMicroCents / 1000.
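
The conversion described above is a single division, shown here with Decimal so that sub-cent amounts survive formatting. The helper names are illustrative, and the input is assumed to be the integer costMicroCents value from a record.

```python
from decimal import Decimal


def cents_from_micro_cents(cost_micro_cents: int) -> Decimal:
    """costMicroCents is in units of 1/1000 cent, so divide by 1000."""
    return Decimal(cost_micro_cents) / Decimal(1000)


def dollars_from_micro_cents(cost_micro_cents: int) -> Decimal:
    """Convert to dollars: 100 cents per dollar."""
    return cents_from_micro_cents(cost_micro_cents) / Decimal(100)
```

For a value of 30, this gives 0.03 cents; truncating that to an integer number of cents yields 0, which is the precision loss the old field suffered from.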

Not in V1.1

  • DELETE on prompts / versions / records / attachments — leaked-key blast radius too high without per-prompt grants. Resolved in V1.1 Phase 5 (records) + Phase 6b (prompts/versions/attachments): per-prompt grants gate every mutation; record self-delete restricted to the originating API key + 24h window; attachment delete blocked while an active RunSession references the blob.
  • Image-output runs for API-key callers: 409 image_runs_not_supported_v1. Resolved in V1.1 Phase 7 + Phase 0: image-modality runs are supported on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant. V1.1 Phase 0 wired the actual blob upload (Phase 7 lifted the gate but left ImageBlobKeys: null hardcoded, so pre-Phase-0 records came back with imageStoredPath: null). All images now persist to blob storage, retrievable via GET /runs/{runId}/images/{N} in-flight and GET /records/{id}/image?index=N post-finalize. Records preserve every per-turn image (re-finalize merges, never shrinks).
  • GET /api/v2/public/runs/{runId} polling endpoint — V1.2. Until then, agents derive run state from their own bookkeeping of the original call. The run_replayed event on idempotent retries surfaces {succeeded: false, reasonCode: "replayed_state_unknown"} rather than masquerading as success.
  • CLI run --output-dir <dir> — V1.2. Image bytes are fetchable today via GET /runs/{runId}/images/{N} (in-flight) or GET /records/{id}/image?index=N (post-finalize); the auto-save UX is a CLI ergonomics improvement.
  • CLI grants management (keys grants list/add/remove) — V1.2. Per-prompt restrictions are configured by the user via Settings → Developer Keys → Manage in the web/desktop apps; agents do not configure their own restrictions.
  • Searchable prompt picker in Manage view — V1.2. V1.1 ships a plain scrollable checkbox list; search arrives once a user has 30+ prompts.
  • Webhooks, OAuth, PAT, team keys — V2.
  • Streaming on the CLI: the CLI internally buffers SSE for the revise chain. V2 may surface raw streaming.
  • ?fromTurn=N rewind — RunSession.Deltas is turn-indexed today; surface in V2. Resolved in V1.2: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn in the request body. Drops session.Deltas entries with turnIndex > fromTurn before applying the operation. Image blobs orphaned by the rewind enqueue into blob_cleanup_queue (drained by a background worker with reference-count guard).

Changelog

v1.3 — 2026-05-11 — BREAKING (per-prompt MCP tools)

  • Prompt-level description dropped from every wire surface. POST /api/v2/public/prompts no longer accepts description; PATCH /api/v2/public/prompts/{id} accepts only name and abbreviation. PublicPromptCreatedResponse drops the field. The cloud_prompts.Description column is dropped from the database with no data preservation. Capability descriptions live on versions only.
  • Version description is the agent-facing capability description. Every version carries a one-sentence summary that describes what the prompt does — what it expects as input and what it returns. Surfaced in tools/list synthesized tool descriptions and list_prompts, so agents can pick a prompt in one round-trip.
  • New descriptionMode field on versions. Numeric on the wire (0=Auto, 1=Manual). In Auto, the server regenerates Description with gpt-5.4-nano on every PUT version that changes promptText (fire-and-forget worker, ~$0.0003 per fire, conditional UPDATE that no-ops on stale starts). In Manual, the user/agent owns the field.
  • Description-write rule. Writing description at PATCH /api/v2/public/prompts/{id}/versions/{vid} (or update_version on MCP, or PUT version on the private cloud surface) is treated as the caller taking ownership: Mode auto-flips to Manual if it isn't already. Pass descriptionMode: 0 in the same request to revert to Auto and let the server worker resume regenerating; explicit descriptionMode wins over the implicit description-presence flip. JSON null for description is a no-op (send "" to clear deliberately). The earlier (pre-2026-05-13) silent-ignore-in-Auto + explicit-only-flip rule was a footgun and is no longer in force.
  • If-Match precondition (optional) on PUT/PATCH version endpoints. Token format: UpdatedAt.Ticks as lowercase hex (e.g. If-Match: 8db7e12c0e7c100). Mismatch returns 412 Precondition Failed. Absent header keeps last-write-wins legacy semantics. PUT version now returns 200 OK + VersionResponse (was 204) so the client gets the new UpdatedAt for the next If-Match token. Future PUT/PATCH endpoints will follow this convention.
  • Per-prompt MCP tools (opt-in). Toggle via POST /api/v2/prompts/{id}/mcp-toggle with { "expose": true }. Each exposed prompt appears in your agent's MCP tools/list as promethic_{slug} (e.g. promethic_clay_cuties). Agents invoke by name in one round-trip — no list_prompts + get_prompt + run_prompt dance. Cap = 50 per account; cap-hit returns 409 mcp_tool_cap_reached. Tool name is stable across prompt renames so hardcoded agent code keeps working. To re-derive the tool name from the new prompt name, call POST /api/v2/prompts/{id}/mcp-rename; collision returns 409 tool_name_taken or 409 tool_name_reserved.
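
For the If-Match bullet above, a client needs to turn the UpdatedAt it received into the hex token. The sketch below assumes UpdatedAt is a .NET-style tick count (100 ns units since 0001-01-01T00:00:00 UTC) and that the response serializes it as an ISO-8601 UTC string ending in Z with up to seven fractional digits. Both points are assumptions, as is the function name. If the serialized timestamp drops sub-second digits, the computed token will not match the server's.

```python
import re
from datetime import datetime, timezone

_DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,7}))?Z$"
)
TICKS_PER_SECOND = 10_000_000  # one tick = 100 nanoseconds


def if_match_token(updated_at: str) -> str:
    """Convert an ISO-8601 UTC timestamp to ticks, rendered as lowercase hex."""
    match = _TIMESTAMP.match(updated_at)
    if match is None:
        raise ValueError(f"unsupported timestamp format: {updated_at!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    elapsed = whole.replace(tzinfo=timezone.utc) - _DOTNET_EPOCH
    ticks = (elapsed.days * 86_400 + elapsed.seconds) * TICKS_PER_SECOND
    # Fractional part is parsed as text to keep all seven digits exactly.
    ticks += int((match.group(2) or "").ljust(7, "0"))
    return format(ticks, "x")
```

The fraction is handled as a string because Python's datetime stores microseconds only, which would silently lose the last tick digit.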

v1.2 — 2026-05-07

  • Auto-finalize on /run: pass ?autoFinalize=true (now the default; toggle per-user via the autoFinalizeMcpRuns setting) and the server chains an internal /finalize after a successful run. The new recordId arrives via the record_finalized SSE event on the same stream as run_completed. Three new SSE events: record_finalized, record_finalize_failed (chain failed; agent decides whether to call POST /finalize manually based on retryable), record_finalize_skipped (informational, after run_failed).
  • fromTurn rewind primitive: /runs/{runId}/revise and /runs/{runId}/finalize accept fromTurn. Drops session turns > fromTurn, then applies the operation. Out-of-range → 400 from_turn_out_of_range.
  • Finalize-on-Finalized amend: calling /finalize with new content (finalText / tag / notes / fromTurn) on a Finalized session reopens the session, bumps RunGeneration, and re-finalizes — same record ID, same handle. Fresh idempotency boundary for the new gen.
  • Unified turns[] on PublicRecordResponse: every record DTO carries a synthesized turns array (run / revision / edit, indexed contiguously) reconstructed from the input + delta chain + final-copied-output. Resolves the V1.1 stitching gap where agents had to mentally combine inputText + finalCopiedOutput + deltas[].
  • POST /records/{id}/revise replaces /revise-again: rehydrate a fresh RunSession from a finalized record's snapshot and revise. Same body shape as /runs/{runId}/revise (carries intermediateOutput + fromTurn). Per-record advisory lock serializes concurrent rehydrate attempts (409 record_revise_in_progress on contention). Old /revise-again route HARD-REMOVED.
  • Image blob cleanup queue: fromTurn rewinds legitimately shrink record image history. Dropped per-turn blobs enqueue into blob_cleanup_queue (background worker, single-leader via pg_advisory_lock(3), reference-count guard against both ImageStoredPath storage formats before S3 DELETE).
  • Spend audit discriminator: UsageLog.Discriminator column reserved for billing-eligibility tagging; SpendQueryFilters.Billable drives all SUM rollups (admin LIST endpoints intentionally show every row for audit visibility).
  • MCP CLI surface: revise_again tool COLLAPSED into revise_run (accepts runId XOR recordId). run_prompt grows autoFinalize?: boolean. finalize_run + revise_run grow fromTurn?: number. Tool count: 25 → 24.
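
The fromTurn rewind in the list above can be previewed client-side against a record's turns[] before sending the request. This is a sketch of the described semantics, not the server's code: it assumes turns are 0-indexed by position in turns[], and it conservatively rejects any fromTurn that is not an existing index, which may be stricter than the server's own bound. The function name is hypothetical.

```python
def preview_rewind(turns: list, from_turn: int) -> list:
    """Return the turns that would survive a rewind to from_turn."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(from_turn, bool) or not isinstance(from_turn, int) or from_turn < 0:
        raise ValueError("from_turn_invalid")
    if from_turn >= len(turns):
        raise ValueError("from_turn_out_of_range")
    # Turns with index > from_turn are dropped; from_turn itself is kept.
    return turns[: from_turn + 1]
```

Running this check first lets an agent re-read turns[] and pick a valid index instead of discovering the problem through a 400.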

v1.1 — 2026-05-03

  • Per-prompt grants (Phase 6a): API keys can be restricted to a specific set of prompts. Configured via Settings → Developer Keys → Manage in the web/desktop apps. Three new session-only endpoints: GET/POST/DELETE /api/v2/keys/{keyId}/grants. Restricted-key access to non-granted prompts returns 403 grant_required.
  • Server-stateful runs: RunSession table replaces the V1 echo-back signed-blob model. Agents hold an opaque runId; the server keeps prompt + version snapshot frozen at /run time, immune to mid-flight prompt edits. POST /runs/{runId}/revise, /finalize, /abandon, /revise-again; GET /runs/{runId}/images/{N} for in-flight image fetch.
  • Idempotency-Key on the full execute surface (Phase 3e/3f): /run, /revise, /finalize all replay byte-identically on retry. Route-signature composition with @gen{N} on /finalize so reopen-on-revise creates a fresh idempotency boundary. SSE replay protocol via the new run_replayed event (terminal-with-info).
  • Record self-management (Phase 5): DELETE /records/{id} (24h, ApiKey-owned, hard-delete + cascade) and PATCH /records/{id} (notes + tag, no time window). HIPAA §164.312(b) audit row on every mutation with PHI-aware presence/length/SHA-256-prefix metadata.
  • DELETE on prompts/versions/attachments (Phase 6b): write-scope + grant check. DELETE /attachments/{id} blocks if any active RunSession references the blob (via VersionSnapshot OR CurrentImageBlobKeys) → 409 attachment_referenced_by_active_run. DELETE /prompts/{id}/versions/{vid} rejects current version atomically. DELETE /prompts/{id} + versions/{vid} reject when active runs are pinned (409 prompt_referenced_by_active_run / version_referenced_by_active_run).
  • Image-output runs for API-key callers (Phase 7): 409 image_runs_not_supported_v1 gate lifted on /run, /revise, /revise-again. Per-turn effectivePromptForImage accumulation persists into record.finalCopiedOutput on /finalize, restoring the desktop accumulated-prompt invariant.
  • Catalog enforcement (Phase 4): ModelSettingsValidator wired ADDITIVELY into POST /prompts + POST /prompts/{id}/versions + PATCH /prompts/{id}. Out-of-enum values like reasoning_effort: "xtreme" now 400 invalid_model_settings at write time instead of silently failing at /run.
  • Observability + cost precision (Phase 8): X-RateLimit-* headers on every response (both buckets reported). cost_micros (1/1000 cent) replaces cost_cents on public DTOs for sub-cent precision. response.usage SSE event reasoning_tokens fix for the Responses API shape.
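
The two restrictions on DELETE /records/{id} from the record self-management bullet (originating API key only, within 24h of creation) can be modelled as a pre-flight check. This is a sketch under assumptions: the order of the two checks, the treatment of the exact 24h boundary, and the parameter names are illustrative rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SELF_DELETE_WINDOW = timedelta(hours=24)


def self_delete_blocker(
    created_at: datetime,
    creator_key_id: str,
    caller_key_id: str,
    now: datetime,
) -> Optional[str]:
    """Return the reason code that would block the delete, or None if allowed."""
    if creator_key_id != caller_key_id:
        return "record_not_owned_by_api_key"        # 403
    if now - created_at > SELF_DELETE_WINDOW:
        return "record_self_delete_window_expired"  # 409
    return None
```

Either blocker means the record can still be removed through the web or desktop app, as the error descriptions note.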

v1.0.1 — 2026-04-29

  • Write scope + 5 new endpoints: prompt create / patch (RFC 7396) / current-version switch, version create, attachment upload.
  • Idempotency-Key header on all mutating POSTs (Stripe parity, 24 h TTL).
  • RFC 7807 problem+json errors with action_hint + invalid_params for self-healing agents.
  • GET /models — slim catalog endpoint with supportedOutputModalities.
  • CLI: prompts create / prompts patch / prompts switch-current / versions create / attachments add, plus YAML manifest mode.
  • Developer Keys management UI in the web + desktop apps.

v1 (alpha) — 2026-04-27

  • Initial public surface: read + execute scopes.
  • SSE protocol v1 with run_session / run_completed / run_failed taxonomy.
  • promethic CLI alpha (Node 18+).