Everything an agent needs to run
You author agents with the SDK; an Engine runs their reasoning — pluggable, ours or ADK / LangGraph; the Runtime is the harness that governs and persists every step; the Console is where you operate them; the Data plane gives governed access to your lakes; and the AI Cloud is the infrastructure it all runs in — your cloud, sovereign.
The runtime doesn't think — it governs, persists, and operates whatever does. Bring your own engine or use ours; either way every model, data, and tool call crosses the runtime's gateways (admission, budgets, audit), runs durably, and lands in one console — a complete harness around the model, composed and deployed as TypeScript infrastructure-as-code.
You're building a harness — do it right
Putting an LLM in production isn't prompt-writing — it's harness engineering: building the structure, governance, memory, and observability around the model that make it safe. That's the platform engineer's job, and Agentist is how you do it in code you own — not a pile of prompt hacks that drift.
Don't start from scratch: extend a base harness — on-call SRE, support, data analyst — with your own identity, charter, skills, and policies. Harness engineering by composition.
The model reasons; the harness keeps it bounded, observable, and accountable — the line between a demo and production, and between writing prompts and engineering a system.
Quickstart
From zero to a triaged alert in about ten minutes.
1 · Create a project
npx agentist init oncall
cd oncall
2 · Run it locally
agentist dev starts a local runtime (in-memory) and a console at localhost:3000.
agentist dev
agentist run oncall.triage "checkout p99 latency is 5s" # positional prompt, no flag
{ "severity": "sev2" }
One typed contract, every caller
Every agent has a typed input schema. The runtime validates every invocation against it — from code, an event, or a person — so nobody hand-writes JSON.
The Console renders a form straight from the schema, so an operator can trigger a run without touching code.
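To make the "one typed contract" idea concrete, here is a minimal, dependency-free sketch of checking an invocation against a declared input schema before a run is admitted. The names `InputSchema`, `validateInput`, and `triageSchema` are hypothetical and are not the Agentist SDK; the real runtime presumably uses a richer schema library.

```typescript
// Hypothetical sketch: these names are illustrative, not the Agentist SDK.
type FieldType = "string" | "number" | "boolean";
type InputSchema = Record<string, FieldType>;

type ValidationResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; errors: string[] };

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Every declared field must be present with the right primitive type,
// and fields the schema does not declare are rejected.
function validateInput(schema: InputSchema, input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, errors: ["input must be an object"] };
  }
  const record = input as Record<string, unknown>;
  const errors: string[] = [];
  for (const [field, type] of Object.entries(schema)) {
    if (!hasOwn(record, field)) errors.push(`missing field: ${field}`);
    else if (typeof record[field] !== type) errors.push(`${field} must be a ${type}`);
  }
  for (const field of Object.keys(record)) {
    if (!hasOwn(schema, field)) errors.push(`unknown field: ${field}`);
  }
  return errors.length === 0 ? { ok: true, value: record } : { ok: false, errors };
}

const triageSchema: InputSchema = { prompt: "string" };
const accepted = validateInput(triageSchema, { prompt: "checkout p99 latency is 5s" });
const rejected = validateInput(triageSchema, { prompt: 42, extra: true });
```

Because the same check runs for a CLI call, an event, or a Console form, a malformed payload is rejected before any step executes rather than failing mid-run.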
The authoring layer
Define agents as typed TypeScript. The SDK compiles your code into a manifest the engine executes and the runtime governs — three distinct layers, one typed source.
Each gets its own section below — Agents, Tooling, Workflows, Governance, Approvals, Conversations.
Agents have an identity — and a culture
An agent isn't a pile of rules. It has an identity — who it is — and inherits your company's charter: the mission and values every agent shares. That's what gives it judgment and character, not rigidity.
// your company's charter — defined here, or pulled from Agentist Cloud
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Bias to safety", "Explain your reasoning", "Escalate when unsure"],
})
const oncall = agent({
extends: harness("on-call"), // a base harness — extend, don't rebuild
id: "oncall",
identity: { name: "Otto", role: "On-call SRE", persona: "Calm, terse, evidence-driven." },
charter: acme, // shared mission & culture
skills: [k8s, prometheus, runbooks], // reusable capabilities
duties: { triage, remediate },
})
- Identity: name, role, persona. The agent is someone, not a tone.
- Charter: your mission and values, inherited by every agent. Define it in code, or manage it centrally in Agentist Cloud and pull it in, so there is one mission across every team.
- Duties: each is one typed call. Culture gives judgment; policies give hard guardrails. Character and control.
- Skills: reusable capabilities (tools + duties) you compose in. Extend a base harness (SRE, support, data) instead of starting from zero.
Tooling
Typed functions an agent may call. The runtime runs them — the agent never touches a path or a raw query.
const queryMetrics = tool({
id: "queryMetrics",
input: z.object({ service: z.string(), window: z.string() }),
output: z.object({ p99Ms: z.number(), errorRate: z.number() }),
run: ({ service, window }) => prometheus.query(service, window),
})
// inside a duty:
const m = await ctx.tools.queryMetrics({ service: a.service, window: "15m" })
Workflows
Compose duties into a typed graph. The control flow is explicit — not hidden inside the model.
export const incident = workflow("incident")
.input(Alert)
.step("triage", oncall.duties.triage)
.delegate("findings", (alert) =>
["app","db","network"].map((area) => investigate.duties.scan.with({ alert, area })),
) // parallel sub-agents — durable & governed
.step("diagnose", oncall.duties.diagnose, (s) => ({ findings: s.findings }))
.approval("on-call")
.step("remediate", oncall.duties.remediate)
.commit()
.step · .branch on a typed field · .delegate to sub-agents in parallel · .approval for a human · .critic to evaluate.
.delegate runs sub-agents concurrently: each is a durable, admission-checked step, gathered under a shared budget and one trace.
.commit() freezes the graph into the manifest the engine runs and the runtime governs.
Governance
Governance is enforced with policies — hard gates the runtime checks before every step. Agents propose, the runtime commits, and every decision is recorded.
const remediationGate = policy({
id: "remediation-needs-approval",
applies: { duty: "oncall.remediate" },
check: ({ state }) =>
state.remediate.destructive
? { effect: "approval", reason: "Destructive remediation needs approval" }
: { effect: "allow" },
})
runtime({ policies: [remediationGate] }) // applies to every agent, so no one can forget it
Approvals
An approval pauses a run until a human signs off. The workflow suspends durably, waits in the Console, and resumes exactly where it paused — even days later.
workflow("remediate")
.step("plan", oncall.duties.plan)
.approval("on-call") // pauses → waits in the approvals inbox
.step("apply", oncall.duties.apply)
.commit()
await rt.resume(runId, "on-call", { by: "kc@kc.io", decision: "approve" })
Approvals don't have to live in the Console: route one to Slack, Jira, GitHub, or email through connectors, wherever your on-call already is. Same audit trail, anywhere.
Agents you can talk to
Your agents are always on and addressable. You talk to them; they talk back; and when something needs doing, they can start the conversation. But this isn't a chatbot bolted on — the conversation runs on the same governed, durable, audited substrate as everything else. The chat is the control plane.
const otto = rt.agent("oncall") // every agent is addressable, always on
// talk to it — its typed duties run underneath the conversation
await otto.say("why is checkout slow right now?")
// it can open the thread itself — proactive, on its own triggers
await otto.notify("on-call", "p99 is climbing on checkout — I'd roll back 4f2a.")
// a destructive step surfaces as an approval *in the same thread*,
// admission-checked like any other step; reply to resume the run
Because a run is already an event-sourced thread and an agent already has an identity, "talk to your agent" and "continue this run" are the same thing. The approval, the chat, and the audit entry are one journaled trail, not three separate surfaces.
The reasoning engine — pluggable
The engine is the only layer that thinks: it runs the reason → act → observe loop and multi-agent orchestration. Everything around it — authoring, governance, durability, operations — is the runtime. The engine plugs in behind one contract — it proposes a step; the runtime commits it — so whether you run the Agentist engine or bring ADK, every reasoning step, tool call, and delegation is governed the same way.
Reasoning steps
Each turn of the loop is typed and governed: the engine proposes a step, the runtime admits it, the result is journaled. The whole reasoning trace is replayable — you see exactly what the agent thought and did.
- Typed steps — each step has typed inputs and outputs, not free text.
- Governed per step — admitted before it runs; budget enforced as it goes.
- Replayable — the full trace is journaled; re-issue any captured step.
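The propose, admit, journal cycle described above can be sketched as a small loop. This is an illustration under stated assumptions, not the Agentist engine contract: `ProposedStep`, `AdmissionPolicy`, `runGovernedLoop`, and the budget of 10 units are all hypothetical.

```typescript
// Hypothetical sketch of the propose → admit → journal cycle; not the Agentist API.
type ProposedStep = { kind: "tool" | "infer"; label: string; cost: number; destructive?: boolean };
type Verdict = { effect: "allow" } | { effect: "deny"; reason: string };
type JournalRecord = { seq: number; step: ProposedStep; verdict: Verdict };
type AdmissionPolicy = (step: ProposedStep, spent: number) => Verdict;

const budgetCap = 10; // assumed budget units per run

const admissionPolicies: AdmissionPolicy[] = [
  // Budget gate: deny any step that would push the run over its cap.
  (step, spent) =>
    spent + step.cost > budgetCap
      ? { effect: "deny", reason: "budget exceeded" }
      : { effect: "allow" },
  // Safety gate: destructive steps are not admitted automatically.
  (step) =>
    step.destructive
      ? { effect: "deny", reason: "destructive step needs approval" }
      : { effect: "allow" },
];

// Check each proposed step against every policy before it runs.
// The verdict is journaled either way; only admitted steps consume budget.
function runGovernedLoop(proposals: ProposedStep[]): JournalRecord[] {
  const records: JournalRecord[] = [];
  let spent = 0;
  for (const step of proposals) {
    let verdict: Verdict = { effect: "allow" };
    for (const policy of admissionPolicies) {
      verdict = policy(step, spent);
      if (verdict.effect === "deny") break;
    }
    if (verdict.effect === "allow") spent += step.cost;
    records.push({ seq: records.length + 1, step, verdict });
  }
  return records;
}

const trace = runGovernedLoop([
  { kind: "infer", label: "triage", cost: 4 },
  { kind: "tool", label: "queryMetrics", cost: 3 },
  { kind: "tool", label: "rollback", cost: 1, destructive: true },
  { kind: "infer", label: "diagnose", cost: 5 },
]);
```

In this run the first two steps are admitted, the rollback is denied as destructive, and the final inference is denied because it would exceed the budget. All four verdicts land in the trace, which is what makes the run auditable after the fact.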
Delegate to sub-agents
The engine hands work to sub-agents in parallel and gathers typed results. Each branch is a durable, admission-checked step under a shared budget — in one trace.
// note: the result is named logFindings so it does not shadow the `logs` agent used below
const [diagnosis, signals, logFindings] = await ctx.delegate([
  k8s.duties.diagnose(a),
  metrics.duties.analyze(a),
  logs.duties.scan(a),
]) // parallel sub-agents: durable, governed, gathered
Tool use, governed
The engine selects and calls tools — typed functions and MCP servers alike. Every call crosses the gateway, so it's validated, budgeted, and audited; the agent never touches a credential or a path.
const recentDeploys = tool({
input: z.object({ ns: z.string() }),
run: (a) => kube.deploys(a.ns), // server-side, validated I/O
})
// the engine calls it; the runtime admits & journals every invocation
Context engineering
The hardest part of any agent loop is deciding what the model reasons over. The engine gives you tools to shape the working window — pin facts, pull in recall, compact old steps, drop noise — all budget-aware. (Distinct from durable state and long-term recall.)
// shape what the model reasons over, mid-loop
ctx.context.pin(runbook) // keep in the window for the whole run
const hits = await ctx.context.recall("past p99 incidents", { topK: 5 })
ctx.context.compact({ olderThan: 12 }) // summarize old steps
ctx.context.drop("raw.logs") // evict noisy output
Or declare a context policy on the engine, applied every step:
agentist({
context: {
budget: "32k", // token budget for the window
include: [charter, runbooks], // always in context
compact: "summarize", // when it fills, summarize the oldest
},
})
- Pin & include: keep the charter, runbooks, or key facts in the window for the whole run.
- Recall into context: pull only the relevant, policy-checked slices from the data plane.
- Compact & drop: summarize old steps and evict noisy tool output to stay in budget.
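As a rough illustration of budget-aware compaction, the sketch below shrinks the oldest unpinned items until the window fits. Everything here is an assumption for illustration: `ContextItem`, `compactWindow`, the word-count token estimate, and the "keep the first few words" summary. A real engine would use a tokenizer and a model-generated summary.

```typescript
// Hypothetical sketch of budget-aware context shaping; not the Agentist API.
type ContextItem = { id: string; text: string; pinned: boolean };

const words = (text: string): string[] => text.split(/\s+/).filter(Boolean);

// Crude token estimate (assumption): one token per whitespace-separated word.
const estimateTokens = (items: ContextItem[]): number =>
  items.reduce((sum, item) => sum + words(item.text).length, 0);

// Walk from oldest to newest, replacing unpinned items with a short summary
// until the window fits the budget. Pinned items are never touched.
function compactWindow(items: ContextItem[], budget: number, summaryWords = 3): ContextItem[] {
  const out = items.map((item) => ({ ...item }));
  for (const item of out) {
    if (estimateTokens(out) <= budget) break;
    if (item.pinned) continue;
    item.text = words(item.text).slice(0, summaryWords).join(" ");
  }
  return out;
}

const workingWindow: ContextItem[] = [
  { id: "charter", pinned: true, text: "Bias to safety and escalate when unsure" },
  { id: "step-1", pinned: false, text: "queried prometheus for checkout latency over fifteen minutes and found p99 at five seconds" },
  { id: "step-2", pinned: false, text: "listed recent deploys in the checkout namespace and saw release 4f2a" },
  { id: "step-3", pinned: false, text: "p99 is still climbing" },
];

const compacted = compactWindow(workingWindow, 30);
```

With a budget of 30 estimated tokens, only the oldest unpinned step is shortened. The pinned charter and the more recent steps stay intact, which matches the intent of compacting old steps first.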
Bring a proven engine
An adapter conforms an existing framework to the engine port — so it runs inside the Agentist harness, in your cloud, durable and governed, without rewriting your agents.
import { runtime } from "@agentist/sdk"
import { adk } from "@agentist/engine-adk" // also: langgraph · autogen · crewai
runtime({ engine: adk() }) // ADK reasons; the runtime governs every call
- Conforms to the port: propose→commit; the harness, console, and contracts don't change.
- Durable at the boundary: each call is a journaled step (the native engine adds mid-loop durability).
- ADK · LangGraph · AutoGen · CrewAI: swap without touching governance.
The Agentist engine
The first-party engine goes where a wrapper can't — because it owns the loop, governance and durability reach inside the reasoning, not only the egress. It's the default.
import { runtime } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
runtime({ engine: agentist() }) // typed reasoning, governed end to end
- Admission at the reasoning step: shape what the model may propose, not only what it runs.
- Mid-loop durable execution: event-source every step; resume mid-thought.
- Typed end to end: the reasoning graph is typed; delegate is a primitive.
- Charter-native judgment: your mission & values govern the reasoning as it happens.
MCP & A2A
Whichever engine runs, your agents stay reachable and composable: expose them as MCP tools and A2A servers, and consume external MCP tools and A2A agents — all through the governed gateway.
rt.mcp({ port: 8080 }) // every agent → an MCP tool
oncall.a2a() // expose as an A2A server (Agent Card)
const peer = a2a("https://acme.dev/sre") // consume an external A2A agent
The harness: it governs, it doesn't think
The runtime is the harness, not the brain: it hosts the engine behind one contract — the engine proposes, the runtime commits — and runs every step as a durable state machine in your own cloud, committing only if policy, budget, and types allow. Swap the engine; the harness is unchanged.
Every run is a durable state machine
A trigger admits a run under an idempotency key; a worker leases each step, runs it through governance and the gateway, and journals the result before moving on. Crash mid-run — a deploy, an OOM, a lost node — and it resumes from the last journaled step. A step can also suspend for days awaiting an approval, a timer, or a signal, then resume at the exact point. Exactly once, every time.
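The resume behaviour can be sketched with a journal keyed by idempotency key. This is a simplified model, not the Agentist runtime: `DurableRunner` and its `crashAfter` parameter are hypothetical, and an in-memory `Map` stands in for the Postgres journal.

```typescript
// Hypothetical sketch of journal-based resume; an in-memory Map stands in for Postgres.
type StepFn = (state: number) => number;

class DurableRunner {
  // idempotency key → results of the steps that have completed so far
  private journal = new Map<string, number[]>();
  public executions = 0; // counts real step executions, for illustration

  // Run the steps for a key, skipping any that are already journaled.
  // `crashAfter` simulates a worker dying after that many steps in this attempt.
  run(key: string, steps: StepFn[], input: number, crashAfter = Infinity): number {
    const done = this.journal.get(key) ?? [];
    this.journal.set(key, done);
    let state = done.length > 0 ? done[done.length - 1]! : input;
    let ranThisAttempt = 0;
    for (let i = done.length; i < steps.length; i++) {
      if (ranThisAttempt >= crashAfter) throw new Error(`worker lost before step ${i + 1}`);
      state = steps[i]!(state);
      this.executions++;
      done.push(state); // journal the result before moving on
      ranThisAttempt++;
    }
    return state;
  }
}

const pipeline: StepFn[] = [(n) => n + 1, (n) => n * 10, (n) => n - 3];
const runner = new DurableRunner();

let crashed = false;
try {
  runner.run("alert-4821", pipeline, 1, 2); // the worker dies after two steps
} catch {
  crashed = true;
}
const resumed = runner.run("alert-4821", pipeline, 1); // picks up at step 3
const redelivered = runner.run("alert-4821", pipeline, 1); // same key: nothing re-runs
```

Across the crash, the resume, and the redelivery, each step executes once. Note the simplification: here the step and its journal write happen together in memory, whereas a real system has to make the step's side effects idempotent or transactional with the journal append to get the same guarantee.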
Configure once — it applies to every agent
const rt = runtime({
store: postgres(process.env.DATABASE_URL), // durable, resumable state
llm: anthropic("claude-sonnet"), // default model
policies: [remediationGate], // admission control — runs before every step
memory: pgvector(), // scoped, per-tenant recall
isolation: "container", // sandbox each agent
audit: true, // OpenTelemetry + replayable log
})
rt.register(oncall, incident)
rt.serve() // exposes the API, MCP, and the scheduler
Each concern below (durability, admission control, isolation, scale, observability, memory, triggers) is configured once here and enforced uniformly across every agent you register.
Local Development
One command. In-memory runtime, local console, instant feedback — then deploy the same manifest to your cloud.
agentist dev # runtime + console at localhost:3000
Durability & recovery
Every run is event-sourced: each step's result is appended to a journal in Postgres before the next step begins. State lives in the database you already run — there's no separate workflow cluster to operate.
const rt = runtime({ store: postgres(process.env.DATABASE_URL) })
// crash mid-run? it resumes from the last journaled step
await rt.resume(runId) // exactly once, on any worker
Admission control
Most frameworks watch an agent act and escalate afterward. Agentist runs policy as an admission check before every step and state change — the agent proposes an action, and the runtime commits it only if policy, budget, and types allow. Preventive, not forensic.
const gate = policy({
applies: { duty: "oncall.remediate" },
check: (a, ctx) => a.sev === "1"
? ctx.approval("on-call") // pause for a human
: ctx.allow(), // otherwise admit
})
.approval(role) suspends the run durably and routes to Slack, Jira, GitHub, or the Console.
Isolation & security
Agents are untrusted code by default. Each runs sandboxed, scoped to its tenant, with secrets and provider keys held by the runtime — never within agent reach.
runtime({ isolation: "microvm" }) // process · container · microvm
// per-agent sandbox; the runtime holds secrets, never the agent
Scaling & concurrency
Workers are stateless and the queue is durable, so throughput is just a function of how many workers you run. One pool runs both live agents and batch jobs; scale horizontally, autoscale on backlog, and keep tenants fair.
runtime({ workers: { min: 2, max: 50, scaleOn: "queueDepth" } })
// stateless workers lease a durable queue; add more to scale
Observability & tracing
Every step, call, and decision emits a span and a journal entry — so you can watch the fleet live, trace one request across many agents, and reconstruct any past run exactly.
runtime({ audit: true }) // OTel spans + append-only journal
const out = await rt.replay(seq) // re-issue the exact call
// same input, model & route; diff vs what was captured
Durable state
The runtime persists run and agent state with the journal — it survives restarts and is available on resume. Long-term recall lives in the data plane; the engine's short-term working context is separate.
await ctx.state.put("incident.severity", "sev1") // per-run, journaled
const sev = await ctx.state.get("incident.severity")
Triggers & scheduling
A run can start from anywhere — and every entry point is idempotency-keyed, so a redelivered event or a double-click never fires an agent twice.
rt.cron("0 9 * * 1", weekly.report) // schedule
rt.on("deploy.failed", oncall.triage) // internal event
rt.serve() // + webhooks, API & MCP, all idempotency-keyed
The gateway
One governed boundary, both directions: agents reach out to models, data, and tools, and the world reaches in to call your agents. Every call — egress or invocation — is validated, policy-checked, and audited.
Pick a model per agent — or per duty. The gateway routes each to whatever your cloud runs.
const oncall = agent({
id: "oncall",
model: "claude-sonnet", // default for this agent
duties: {
triage: duty({ model: "claude-haiku", /* … */ }), // fast & cheap
diagnose: duty({ model: "claude-opus", /* … */ }), // deep reasoning
},
})
// gateway maps models per cloud: AWS→Bedrock · GCP→Vertex · Azure→Azure OpenAI · bare-metal→vLLM
How it's deployed
The gateway ships inside the runtime — it deploys with it, in your cluster. Provider keys live in the gateway, in your boundary; models connect through each cloud's native service (Bedrock, Vertex, Azure OpenAI) or a self-hosted endpoint (vLLM, Ollama). Self-hosted, nothing routes through us.
With Agentist Cloud
Connect Agentist Cloud and the gateway gains a managed layer — pooled model rates, cross-cloud routing & GPU burst, a shared response cache, automatic provider fallback, and org-wide spend governance — while your prompts and data still never leave your boundary.
Security & compliance
Because every model and data call crosses the gateway, it's the one place compliance is both enforced and evidenced. Provider keys stay in-vault, services talk over mTLS, PII is redacted in-path, and every egress and invocation is journaled to the tamper-evident audit log — so SOC 2 evidence is a query, not a quarter-long scramble. Nothing leaves your boundary.
Invocation — one input, every caller
The same gateway governs the inbound side. An agent or duty declares one typed input schema; every caller — a human, another agent, your app, an API, or an MCP client — satisfies that one contract. Each entry is policy-checked, idempotency-keyed, and audited like any other call.
- Human · CLI: agentist run oncall.triage "p99 5s"; use flags or --json for typed multi-field input
- Agent → agent: triage.with({ … }), traced end to end across agents
- API · SDK: POST /v1/runs/<agent.duty> with a JSON body matching the schema
- MCP: inputSchema; call from Claude or Cursor
// one typed input contract; a single field is positional on the CLI
const triage = duty({ input: z.object({ prompt: z.string() }), /* … */ })
// HUMAN · CLI — the prompt is the default; no flag needed
// agentist run oncall.triage "checkout p99 latency is 5s"
// AGENT → AGENT — typed, traced across agents
const out = await triage.with({ prompt: "checkout p99 5s" })
// API · SDK — same contract, over the network
const run = await client.run(oncall.triage, { prompt: "checkout p99 5s" })
// POST /v1/runs/oncall.triage { "prompt": "checkout p99 5s" }
// MCP — the schema is the tool; expose every agent
rt.mcp({ port: 8080 })
The platform, composed in code
Defining agents is half the job; the other half is standing up what they run on — a runtime, a store, model endpoints, the gateway, a vector store, sandboxes. Agentist gives you that as TypeScript infrastructure-as-code: compose the platform's components in one typed file, and agentist deploy materializes them onto Kubernetes — in your cloud or ours, from the same code.
import { stack, gateway, models, vectors, postgres, image } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
export default stack({
target: "k8s", // the universal substrate
store: postgres({ instance: "db.r6g.large" }),
models: [
models.serve("llama-3.1-70b", {
gpu: "H100:2", // multi-GPU, tensor-parallel
image: image.vllm("0.6.3"), // the serving image, in code
autoscale: { min: 0, max: 8, scaledown: "60s" }, // scale-to-zero
concurrency: 32, // inputs in flight per replica
snapshot: true, // GPU snapshot → seconds-cold-start
}),
models.bind(bedrock("claude-sonnet-4")), // or bind your cloud's
],
gateway: gateway({ egress: "deny", redactPII: true }),
vectors: vectors.pgvector(),
sandboxes: { isolation: "microvm", idleTimeout: "5m" },
})
// agentist deploy --to your-cluster # your cloud or ours, same code
The stack
A stack is the unit of infrastructure — every component an agent platform needs, declared in one typed file: runtime, store, models, gateway, vectors, sandboxes, tool backends, schedules and queues. agentist deploy plans the diff and applies it incrementally; the stack's typed outputs wire straight into runtime({…}).
import platform from "./stack" // the typed infra definition
await platform.plan() // show the diff first
const out = await platform.apply() // incremental & reversible
// the stack's typed outputs wire straight into the runtime
runtime({ store: out.store, llm: out.models.claude })
Compute & autoscaling
Every workload — a model endpoint, a tool backend, a batch job — gets fine-grained, per-component control over resources and scaling. The granularity of a serverless platform, declared in code and run on your own Kubernetes.
models.serve("llama-3.1-70b", {
gpu: "H100:2", cpu: 8, memory: "32Gi", // per-workload resources
autoscale: { min: 0, max: 8, scaledown: "60s" },
concurrency: 32, // inputs in flight / replica
batch: { maxSize: 16, window: "10ms" }, // dynamic batching
snapshot: true, // seconds-cold-start
})
Images & environments
Define the exact runtime a component runs in — base image, system packages, dependencies — in the same TypeScript. Builds are layered and content-addressed, so a one-line change rebuilds one layer, not the world.
const env = image
.debianSlim()
.apt("git", "ffmpeg")
.pip("vllm==0.6.3", "transformers")
.run("python -m warm_cache") // build-time step, cached layer
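One common way to get layered, content-addressed builds is to derive each layer's digest from its parent's digest plus its own instruction. The sketch below assumes that scheme; the source does not say how Agentist computes layer identity. `layerDigests` and `layersToRebuild` are hypothetical names, the instruction strings are illustrative, and FNV-1a stands in for a cryptographic hash such as SHA-256 to keep the example dependency-free.

```typescript
// Hypothetical sketch of content-addressed layers; not the Agentist build system.
// FNV-1a (32-bit) is a stand-in for a cryptographic hash; do not use it for real digests.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// A layer's digest covers its own instruction and everything beneath it.
function layerDigests(instructions: string[]): string[] {
  const digests: string[] = [];
  let parent = "";
  for (const instruction of instructions) {
    parent = fnv1a(`${parent}\n${instruction}`);
    digests.push(parent);
  }
  return digests;
}

// Layers whose digest is already cached are reused; the rest are rebuilt.
function layersToRebuild(instructions: string[], cached: Set<string>): number {
  return layerDigests(instructions).filter((digest) => !cached.has(digest)).length;
}

const original = [
  "base debian-slim",
  "apt git ffmpeg",
  "pip vllm==0.6.3 transformers",
  "run python -m warm_cache",
];
const layerCache = new Set(layerDigests(original));

// Change only the last instruction (illustrative edit): one layer is rebuilt.
const editedLast = [...original.slice(0, 3), "run python -m warm_cache --small"];
const rebuiltLast = layersToRebuild(editedLast, layerCache);

// Change an early instruction: that layer and every layer above it are rebuilt.
const editedEarly = [original[0]!, "apt git ffmpeg curl", ...original.slice(2)];
const rebuiltEarly = layersToRebuild(editedEarly, layerCache);
```

Under this scheme the saving depends on where the change lands: editing the last instruction rebuilds one layer, while editing an early one invalidates everything stacked on top of it. That is why slow-changing steps usually go first.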
Models & GPUs
Serve open models on your own GPUs, or bind to your cloud's hosted models — the same models interface either way, every call routed through the governed gateway.
// self-host on your GPUs — OpenAI-compatible endpoint
const llama = models.serve("llama-3.1-70b", { gpu: "H100" })
// or bind your cloud's hosted model — no servers to run
const claude = models.bind(bedrock("claude-sonnet-4"))
Sandboxes for untrusted code
When an agent writes code, it runs in a disposable sandbox — an isolated container provisioned on demand, scoped, and torn down. The execution substrate for tools that run generated code.
const box = await sandbox.create({ image: env, isolation: "microvm" })
const { stdout, exitCode } = await box.exec("python solve.py")
await box.snapshot("checkpoint") // pause & resume later
await box.terminate()
Jobs & batch
Not everything is a live request. Embedding a corpus, an eval suite, or a nightly backfill is a batch job — run on the same worker pool as live agents, burst up for the work and back to zero when it drains. (Sub-agent parallelism within a run is delegate; this is data-level parallelism over a collection. Scheduling lives in Triggers.)
// map a function across a collection — data parallelism on the worker pool
const vectors = await job.map(corpus, embed) // 10k docs, results gathered
// or long-running work, collected later
const handle = await job.spawn(nightlyBackfill)
Provider abstractions
A component declares an abstract need — object storage, a Postgres, a model. You bind it to a cloud-specific provider. Swap the binding, not the code.
stack({
store: rds("agentist-prod"), // or cloudSql() · alloyDb() · a URL
storage: s3("artifacts"), // or gcs() · azureBlob() · minio()
secrets: awsSecrets(), // or vault() · gcpSecrets()
})
Deploy anywhere
agentist deploy compiles the stack to Kubernetes and rolls it out — your cloud, ours, on-prem, or air-gapped. Kubernetes is the only assumption.
agentist deploy --plan # dry-run: show the diff first
agentist deploy --to prod # compile → apply to your cluster
agentist deploy --to air-gapped --offline
Your AI Cloud, composed in the cloud
The whole platform runs in your cloud — deploy the runtime with one Helm chart into AWS, GCP, Azure, DigitalOcean, or bare metal. Your agents and data never leave your boundary, and it's all stood up as infrastructure-as-code.
Optional: Agentist Cloud
Plug in Agentist Cloud — the managed control plane to govern the whole agentic estate (charter, identities, RBAC, policies, registry, cost) across every team, with connectors and burst capacity. Your data plane still never leaves your boundary.
# deploy to any cloud, then connect Agentist
agentist deploy --target k8s
agentist cloud connect
Self-host vs. Agentist Cloud
| | Self-host (open core) | + Agentist Cloud |
|---|---|---|
| Runtime & data | Your cloud / cluster | Your cloud — unchanged |
| Compute | Your cluster | + multi-cloud burst & GPUs |
| Control plane | Self-hosted Console | Managed / hosted Console |
| Connectors | Open core + DIY | Marketplace + vertical packs |
| Compliance | DIY | SOC2 / HIPAA packs + evidence |
| Support | Community | SLAs · LTS |
Deploy on AWS
Run the runtime on EKS, with RDS for Postgres and S3 for object storage. Models run on Amazon Bedrock (Claude, Llama, Titan); burst GPU on EC2 for self-hosted models.
agentist deploy --target eks \
--set postgres.url=$RDS_URL \
--set objectStore=s3://acme-agentist \
--set gateway.models=bedrock # Claude on Bedrock
Deploy on Google Cloud
Run on GKE, with Cloud SQL for Postgres and GCS for object storage. Models run on Vertex AI (Claude, Gemini); burst to GPU node pools on demand.
agentist deploy --target gke \
--set postgres.url=$CLOUDSQL_URL \
--set objectStore=gs://acme-agentist \
--set gateway.models=vertex # Claude / Gemini on Vertex
Deploy on Azure
Run on AKS, with Azure Database for PostgreSQL and Blob Storage. Models run on Azure OpenAI or Azure-hosted endpoints.
agentist deploy --target aks \
--set postgres.url=$AZURE_PG_URL \
--set objectStore=az://acme-agentist \
--set gateway.models=azure-openai
Deploy on DigitalOcean
Run on DOKS, with Managed Postgres and Spaces for object storage. Use the Anthropic API or self-host models on GPU droplets.
agentist deploy --target doks \
--set postgres.url=$DO_PG_URL \
--set objectStore=spaces://acme-agentist \
--set gateway.models=anthropic
Deploy on bare metal
Air-gapped or on-prem. Run on k3s or kubeadm, with self-managed Postgres and MinIO. Self-host models with vLLM or Ollama — no external calls leave the building.
agentist deploy --target k8s \
--set postgres.url=$PG_URL \
--set objectStore=minio://agentist \
--set gateway.models=vllm \
--set airgapped=true
Multi-cloud & burst
Run the data plane in your primary cloud and burst compute, GPUs, or inference into others on demand — one logical fleet, one control plane, governed centrally.
Why fleets need it: agent load is spiky — one incident or batch can fire hundreds of agents at once, each making GPU-heavy inference calls. Bursting spills that spike to spare GPUs in another cloud instead of queuing or failing, so latency stays flat and you pay for peak capacity only when you actually hit it.
# primary cloud + burst targets — one fleet
agentist deploy --target eks --set primary=true
agentist cloud burst add --target gke --gpu
agentist cloud burst add --target k8s --models=vllm
Operate · observe · coach
The control plane for your agents — watch every run, audit any decision, and coach agents to get better. CLI-first: launch agentist console from your terminal, or host it centrally in your cloud's data plane. Self-hosted, or managed via Agentist Cloud.
| run | agent | state | waiting-on | budget | trigger | age |
|---|---|---|---|---|---|---|
| incident #4821 | Cornelia·SRE | suspended(approval) | on-call · 4m 12s | 41k/60k | webhook·a3f | 4m 12s |
| deploy-gate #903 | Rune·Release | suspended(approval) | release · 1m 02s | 6k/60k | api·77c | 1m 02s |
| backup-verify #88 | Atlas·Ops | suspended(timer) | timer · 17h 51m | 2k/60k | cron·d10 | 6h 09m |
| incident #4822 | Cornelia·SRE | running | — | 22k/60k | webhook·b1e | 1m 40s |
| cost-report #77 | Vega·Cost | running | — | 9k/60k | cron·e22 | 0m 31s |
| reindex #410 | Atlas·Ops | failed | — step 3/6 | 18k/60k | event·f90 | 2m ago |
| incident #4820 | Cornelia·SRE | done | — | 31k/60k | webhook·9c2 | 3m 41s |
kubectl rollout undo deploy/checkout
scale checkout 6 → 10

| agent | runs | p95 step | denied | suspended | budget cut-offs | spend |
|---|---|---|---|---|---|---|
| Cornelia·SRE | 612 | 2.6s | 9 | 3 | 1 | $22 |
| Rune·Release | 418 | 1.9s | 2 | 1 | 0 | $11 |
| Atlas·Ops | 254 | 3.1s | 3 | 1 | 1 | $9 |
| seq | time | agent · tenant | step / call | verdict | model / source | idem |
|---|---|---|---|---|---|---|
| 1184 | 14:02:14 | Cornelia·acme | resume:remediate | approved-by kc | — | a3f-04 |
| 1182 | 14:02:11 | Cornelia·acme | call:diagnose (infer) | admitted | claude-sonnet | a3f-03 |
| 1181 | 14:02:09 | Cornelia·acme | data:metrics.query (data) | redacted PII | prometheus·ro | a3f-03 |
| 1180 | 14:02:08 | Cornelia·acme | call:triage (infer) | admitted | claude-haiku | a3f-02 |
| 1176 | 14:01:40 | Vega·acme | step:scale 50→500 | denied · budget | — | e22-01 |
DiagnoseInput { errorRate: 0.092, p95: 340, deploy: v412 } …
output (captured): Diagnosis { cause: "v412 regression", confidence: 0.86 }
verdict: admitted · policy budget-per-run ok · gateway inference · claude-sonnet
run: incident #4821 · span diagnose

| agent output | your correction |
|---|---|
| severity sev3 "elevated latency, monitor" | severity sev2 "p99 5s on checkout is customer-facing → page" |
| policy | applies to | effect |
|---|---|---|
| remediation-needs-approval | *.remediate · destructive | require-approval(on-call) |
| no-public-buckets | iac.apply | deny |
| budget-per-run | * | cap-budget ≤ $2 |
| pii-redaction | gateway.inference + data | redact |
| prod-data-read-only | gateway.data | allow read · deny write |
| name | kind | version | owner | isolation | last eval |
|---|---|---|---|---|---|
| oncall · Cornelia | agent | 1.4.0 | platform | container | 34/37 ✓ |
| release · Rune | agent | 0.9.2 | deploys | container | 28/28 ✓ |
| k8s | skill | 2.1.0 | platform | — | — |
| runbooks | skill | 1.0.6 | sre | — | — |
| on-call | harness | 3.0.0 | agentist | container | — |
.approval("on-call") on *.remediate (destructive)
Prefer the smallest reversible change.

| route | model | provider | fallback | cache |
|---|---|---|---|---|
| Cornelia · diagnose | claude-sonnet | Bedrock | → haiku | 31% hit |
| Rune · review | gpt-4o | OpenAI | → sonnet | 12% hit |
| * (default) | claude-haiku | Bedrock | — | 44% hit |
| source | access | redaction | query audit |
|---|---|---|---|
| prometheus·ro | read | PII mask | 142 queries → Audit |
| pg · acme | read · deny write | PII mask | 37 queries → Audit |
| member | role | scope | approval authority |
|---|---|---|---|
| kc@kc.io | admin | org | all gates |
| sre-team | operator | oncall · approvals | .approval("on-call") |
| deploys | operator | release · policies | .approval("release") |
| viewer@acme | viewer | read-only | none |
Operate
Run the fleet live — watch executions, clear the approvals agents are waiting on, and talk to them directly. Every action is governed and lands on one trail.
Runs — list, then drill into the trace
A live list of every execution; open one to drop into its typed trace — spans, budgets, and the exact model and data calls.
Approvals
Every run suspended on a human, with just enough to decide in one keystroke — the proposed command, the rationale, and the diff.
Chat
Talk to any agent, or pick a run up as a conversation. A gated step shows up inline — answer it and the run resumes.
Observe
Watch the fleet's health, shape, and spend. Every tile is a real operator question with a threshold and a drill target — not a vanity number.
Signals
What's stuck on a human, where admission is denying, who's near budget — each tile clicks through to the run or policy behind it.
Fleet
The bird's-eye view — agents, how they call one another, and per-agent health and cost. Drill from the map into one agent's dashboard.
Cost
Spend and token throughput per agent and per model, ranked — find the expensive ones before finance does.
Govern
The trust surface — the rules that gate every step, the tamper-evident record of what happened, and who's allowed to do what.
Policies
Every admission policy, what it gates, and its live hit-rate — spot policy fighting the fleet or a misconfiguration at a glance.
Audit
The append-only, tamper-evident journal — every step, gateway call, and verdict, gapless and replayable. Re-issue any captured call.
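A hash chain is one standard way to make an append-only log tamper-evident and gapless: each entry commits to its sequence number, its payload, and the hash of the entry before it. The sketch below assumes that design; the source does not describe the journal's actual mechanism. `AuditEntry`, `appendEntry`, and `verifyChain` are hypothetical names, and FNV-1a stands in for a cryptographic hash such as SHA-256.

```typescript
// Hypothetical sketch of a tamper-evident, append-only journal; not the Agentist API.
// FNV-1a (32-bit) is a stand-in for a cryptographic hash; a real log would use SHA-256.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

type AuditEntry = { seq: number; payload: string; prev: string; hash: string };

const GENESIS = "genesis";

// Append returns a new log; existing entries are never modified.
function appendEntry(log: AuditEntry[], payload: string): AuditEntry[] {
  const prev = log.length > 0 ? log[log.length - 1]!.hash : GENESIS;
  const seq = log.length + 1;
  const hash = fnv1a(`${seq}|${prev}|${payload}`);
  return [...log, { seq, payload, prev, hash }];
}

// Recompute every hash, and check that seq is gapless and each entry links to its predecessor.
function verifyChain(log: AuditEntry[]): boolean {
  let prev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i]!;
    if (entry.seq !== i + 1 || entry.prev !== prev) return false;
    if (entry.hash !== fnv1a(`${entry.seq}|${entry.prev}|${entry.payload}`)) return false;
    prev = entry.hash;
  }
  return true;
}

let auditLog: AuditEntry[] = [];
auditLog = appendEntry(auditLog, "call:triage admitted");
auditLog = appendEntry(auditLog, "step:scale 50→500 denied budget");
auditLog = appendEntry(auditLog, "resume:remediate approved-by kc");

const intact = verifyChain(auditLog);
// Rewriting a verdict after the fact breaks the stored hash.
const rewritten = auditLog.map((e) => (e.seq === 2 ? { ...e, payload: "step:scale 50→500 admitted" } : e));
const rewrittenVerifies = verifyChain(rewritten);
// Deleting an entry leaves a gap in seq and a broken link.
const withGap = auditLog.filter((e) => e.seq !== 2);
const gapVerifies = verifyChain(withGap);
```

Verification fails for both the rewritten verdict and the deleted entry, so an auditor only needs the log itself (and, in practice, a trusted copy of the latest hash) to detect either kind of tampering.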
Access
Who can do what — roles, scopes, and agent identities — with the SSO and SCIM behind them. Every change is itself audited.
Manage
Curate the fleet — the catalog of agents and versions, who each one is, and how they get better over time.
Registry
The catalog of every agent — its version, skills, and the policies bound to it. Deploy, pin, or roll back from here.
Identity
Each agent's identity and charter — name, role, persona, and the mission and values it inherits. The agent is someone.
Coach
Turn corrections into evals: review a run's reasoning, then save the fix as a test case or add it to the charter as a note.
Pluggable connectors
Connectors plug the platform into the tools your teams already use — as triggers, actions, and approval channels. Install one, and your agents and your people reach it through a typed, governed interface.
Secure gateways to your data lakes
Agents are only as intelligent as the data they can reach. The data plane is a governed gateway to your lakes and warehouses — agents query real enterprise data through it, never with raw credentials, and every access is policy-checked and audited.
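As a rough sketch of what "policy-checked and audited" can mean for a single query, the code below wraps a query executor with a read-only check, PII masking, and an audit record. Everything here is an assumption made for illustration: `createGateway`, `isReadOnly`, and `maskRow` are hypothetical names, the keyword check is deliberately crude, and only email addresses are masked.

```typescript
type Row = Record<string, unknown>

interface AuditRecord {
  agent: string
  sql: string
  verdict: "allow" | "deny"
}

const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g
const WRITE = /\b(insert|update|delete|drop|alter|truncate|create|grant)\b/i

// Crude keyword check, for illustration only. A real gateway would parse
// the statement or rely on a read-only database role.
function isReadOnly(sql: string): boolean {
  return /^\s*(select|with)\b/i.test(sql) && !WRITE.test(sql)
}

// Mask email addresses in string columns; other values pass through.
function maskRow(row: Row): Row {
  const masked: Row = {}
  for (const key of Object.keys(row)) {
    const value = row[key]
    masked[key] = typeof value === "string" ? value.replace(EMAIL, "[redacted]") : value
  }
  return masked
}

// `execute` stands in for the real database call, which holds the credentials.
function createGateway(execute: (sql: string) => Row[]) {
  const audit: AuditRecord[] = []
  function query(agent: string, sql: string): Row[] {
    const allowed = isReadOnly(sql)
    audit.push({ agent, sql, verdict: allowed ? "allow" : "deny" }) // denials are recorded too
    if (!allowed) throw new Error("denied: this source is read-only")
    return execute(sql).map(maskRow)
  }
  return { query, audit }
}
```

The agent only ever calls `query`; the credentials stay inside `execute`, and every call, allowed or denied, leaves an audit record.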
The event store
Durable state, the audit journal, and memory all live in one Postgres you run. The runtime owns the schema — you point it at a database.
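For readers new to event sourcing, the idea is that state is never stored directly: it is rebuilt by replaying an ordered log of events. The sketch below illustrates that with hypothetical `RunEvent` and `RunView` shapes; the runtime's actual schema is not shown in this document, so treat the names and fields as assumptions.

```typescript
// Hypothetical event and view shapes, for illustration only.
type RunEvent =
  | { type: "run.started" }
  | { type: "step.completed"; step: string; output: unknown }
  | { type: "run.suspended"; gate: string }
  | { type: "run.resumed" }
  | { type: "run.finished" }

interface RunView {
  status: "pending" | "running" | "suspended" | "finished"
  outputs: Record<string, unknown> // step name -> recorded output
}

const initial: RunView = { status: "pending", outputs: {} }

// Pure reducer: the same events always produce the same view.
function apply(view: RunView, event: RunEvent): RunView {
  switch (event.type) {
    case "run.started":
    case "run.resumed":
      return { ...view, status: "running" }
    case "step.completed":
      return { ...view, outputs: { ...view.outputs, [event.step]: event.output } }
    case "run.suspended":
      return { ...view, status: "suspended" }
    case "run.finished":
      return { ...view, status: "finished" }
  }
}

// Rebuild the current view of a run from its event log.
function replay(events: readonly RunEvent[]): RunView {
  return events.reduce(apply, initial)
}
```

Replaying a prefix of the log yields the state at that point in time, which is why one log can serve both as durable state and as a replayable history.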
runtime({ store: postgres(process.env.DATABASE_URL) })
// event-sourced: state, journal & vectors in one database
// SQLite for local dev · Redis for queues
Memory
Agents remember — durably, and scoped to whoever they're acting for. More than a chat buffer: working memory, semantic recall, and pinned facts, all policy-aware.
await ctx.memory.put("user.tier", "enterprise") // working memory
const tier = await ctx.memory.get("user.tier")
const hits = await ctx.memory.search("past incidents", { topK: 5 })
Retrieval & RAG
A complete retrieval pipeline — chunk, embed, search, rerank — so agents reason over your documents and data with precision, not a similarity guess.
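To show what the chunk step does with a size and an overlap, here is a minimal fixed-window chunker. It is a simplification: it is not the "recursive" strategy, it counts characters (the unit the platform uses for `size` is not stated here), and `chunkText` is a hypothetical helper rather than part of the SDK.

```typescript
// Split text into windows of `size` characters, where each window repeats
// the last `overlap` characters of the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("need size > 0 and 0 <= overlap < size")
  }
  const step = size - overlap
  const chunks: string[] = []
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size))
    if (start + size >= text.length) break // the last window reached the end
  }
  return chunks
}
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of indexing some text twice.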
const kb = retrieval({
chunk: { strategy: "recursive", size: 800, overlap: 100 },
embed: "text-embedding-3-large",
store: vectors.pgvector(),
rerank: "rerank-v3", // precision pass before the model
})
await kb.add(docs)
const hits = await kb.query("why did checkout fail?", { topK: 8 })
Vectors
Vector storage is built in on pgvector — no extra database to run — and pluggable to an external store when you want one.
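Conceptually, a filtered vector query narrows the candidates by metadata and then ranks what is left by similarity. The in-memory sketch below shows that with cosine similarity and a brute-force scan; pgvector and external stores use indexes instead, and `VectorRecord` and `topKByCosine` are hypothetical names chosen for this example.

```typescript
interface VectorRecord {
  id: string
  embedding: number[]
  meta: Record<string, string>
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0)
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0))
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0))
  return normA === 0 || normB === 0 ? 0 : dot / (normA * normB)
}

// Keep records whose metadata matches every filter key, then return the
// topK most similar to the query embedding, best first.
function topKByCosine(
  records: VectorRecord[],
  embedding: number[],
  opts: { topK: number; filter?: Record<string, string> },
): { id: string; score: number }[] {
  const filter = opts.filter ?? {}
  return records
    .filter((r) => Object.keys(filter).every((k) => r.meta[k] === filter[k]))
    .map((r) => ({ id: r.id, score: cosine(r.embedding, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, opts.topK)
}
```

Filtering before ranking is what keeps one tenant's documents out of another tenant's results, regardless of how similar they are.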
const v = vectors.pgvector() // or pinecone() · qdrant()
await v.query(embedding, {
topK: 10,
filter: { tenant: "acme", kind: "runbook" }, // metadata filter
})
Object storage
Large artifacts — files an agent reads or writes, model weights, run outputs — live in object storage, bound to your cloud's bucket.
const bucket = storage(s3("artifacts")) // or gcs · azureBlob · minio
const key = `runs/${runId}/report.pdf` // one key for both the write and the signed URL
await bucket.put(key, bytes)
const url = await bucket.signedUrl(key, { expires: "1h" })
The whole use case, in one file
Each example is a single typed file — the entire implementation, infrastructure and all: charter, agent, tools, workflow, governance, the engine, and the stack that deploys it to your cloud. Not snippets — the real thing, top to bottom. Five builds, each leaning on a different part of the platform. Each ends with how it's invoked — a trigger, a typed call, or a prompt; one input contract, every caller. (More to come, plus a dedicated examples repo.)
Incident
Otto, the on-call SRE agent: a paging alert arrives, he triages it, delegates parallel investigators across app, db & network, diagnoses the root cause, and proposes a fix — but anything destructive waits for a human. Shows: charter · skills · governed tools · parallel delegation · admission control · approvals · memory · engine · infrastructure — one file.
// src/incident.ts — Otto, the on-call SRE: triage → investigate → fix, one file.
import { agent, duty, tool, workflow, policy, charter, harness, runtime, postgres } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
import { stack, gateway, vectors } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
import { z } from "zod"
// ── Charter — the culture every agent inherits ───────────────────────────
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Bias to safety", "Explain your reasoning", "Escalate when unsure"],
})
// ── Schemas ───────────────────────────────────────────────────────────────
const Alert = z.object({ service: z.string(), symptom: z.string(), severity: z.enum(["sev1","sev2","sev3"]) })
// ── Tools — server-side & validated; the agent never sees a credential ──
const queryMetrics = tool({
input: z.object({ service: z.string(), window: z.string().default("30m") }),
output: z.object({ p99: z.number(), errorRate: z.number() }),
run: (a, ctx) => ctx.tool("prometheus.query", a), // via the egress gateway
})
const rollback = tool({
input: z.object({ deploy: z.string() }),
meta: { destructive: true }, // surfaced to admission control
run: (a, ctx) => ctx.tool("k8s.rollback", a),
})
// ── Investigator — one cheap agent, fanned out across subsystems ─────────
const investigate = agent({
id: "investigate",
identity: { name: "Iris", role: "Investigator", persona: "Methodical, fast." },
charter: acme,
tools: [queryMetrics],
duties: {
scan: duty({
input: z.object({ alert: Alert, area: z.enum(["app","db","network"]) }),
output: z.object({ area: z.string(), finding: z.string() }),
run: (a, ctx) => ctx.llm({ prompt: `Inspect ${a.area} for ${a.alert.symptom}` }),
}),
},
})
// ── Otto — the on-call SRE, extends a base harness ───────────────────────
export const oncall = agent({
extends: harness("on-call"), // reusable SRE preset — extend, don't rebuild
id: "oncall",
identity: { name: "Otto", role: "On-call SRE", persona: "Calm, terse, evidence-driven." },
charter: acme,
tools: [queryMetrics, rollback],
duties: {
triage: duty({
input: Alert,
output: z.object({ summary: z.string() }),
run: async (a, ctx) => {
await ctx.context.recall("past incidents on " + a.service, { topK: 5 }) // remember
return ctx.llm({ prompt: `Triage: ${a.symptom}` })
},
}),
diagnose: duty({
input: z.object({ findings: z.array(z.object({ area: z.string(), finding: z.string() })) }),
output: z.object({ rootCause: z.string(), destructive: z.boolean() }),
run: (a, ctx) => ctx.llm({ prompt: `Root cause from: ${JSON.stringify(a.findings)}` }),
}),
remediate: duty({
input: z.object({ rootCause: z.string() }),
run: (a, ctx) => ctx.tool("k8s.rollback", { deploy: "latest" }),
}),
},
})
// ── Policy — admission control: destructive fixes wait for a human ───────
const remediationGate = policy({
id: "remediation-needs-approval",
applies: { duty: "oncall.remediate" },
check: ({ state }) => state.diagnose.destructive
? { effect: "approval", reason: "Destructive remediation needs approval" }
: { effect: "allow" },
})
// ── Workflow — the typed path an incident takes ──────────────────────────
export const incident = workflow("incident")
.input(Alert)
.step("triage", oncall.duties.triage)
.delegate("findings", (alert) =>
(["app","db","network"] as const).map((area) => investigate.duties.scan.with({ alert, area })), // as const keeps each area typed as the enum literal
) // parallel sub-agents — durable & governed
.step("diagnose", oncall.duties.diagnose, (s) => ({ findings: s.findings }))
.approval("on-call") // only fires when remediationGate demands it
.step("remediate", oncall.duties.remediate)
.commit()
// ── Infrastructure — provision the platform into your own cloud (IaC) ─────
const infra = stack({
target: "k8s",
store: postgres({ instance: "db.r6g.large" }),
models: [bedrock("claude-sonnet-4")],
gateway: gateway({ egress: "deny", redactPII: true }),
vectors: vectors.pgvector(),
})
// ── Runtime — engine + governance, wired to the infra above ──────────────
const rt = runtime({
engine: agentist(), // or adk() · langgraph()
store: infra.store,
llm: infra.models.claude,
policies: [remediationGate],
memory: infra.vectors,
})
rt.register(oncall, investigate, incident)
rt.on("deploy.failed", oncall.duties.triage) // page Otto on a failed rollout
rt.serve() // API · MCP · the Console
// agentist deploy --target k8s # provisions infra + rolls out the runtime
Invoke it. Start the local runtime and run the workflow from the CLI; it stops for a human before anything destructive:
Analyst
Dana turns a plain-English question into read-only SQL, runs it through the data gateway, and explains the result — but a critic checks the query first, and a policy makes writes impossible by construction. Shows: the data plane · governed SQL · a critic gate · read-only admission control — one file.
// src/analyst.ts — Dana, the data analyst: ask in English, get a governed answer.
import { agent, duty, tool, workflow, policy, charter, runtime, postgres } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
import { stack, gateway } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
import { z } from "zod"
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Show your work", "Cite the query", "Never guess at a number"],
})
// ── Tool — read-only SQL, executed server-side through the data gateway ──
const runSql = tool({
input: z.object({ sql: z.string() }),
output: z.array(z.record(z.any())),
meta: { readOnly: true }, // surfaced to admission control
run: (a, ctx) => ctx.db.query(a.sql), // PII redacted at the gateway
})
// ── Agent — turns a question into SQL, then reads the result back ────────
export const dana = agent({
id: "analyst",
identity: { name: "Dana", role: "Data analyst", persona: "Precise; cites the query." },
charter: acme,
tools: [runSql],
duties: {
plan: duty({
input: z.object({ question: z.string() }),
output: z.object({ sql: z.string() }),
run: (a, ctx) => ctx.llm({ prompt: `Write read-only SQL for: ${a.question}` }),
}),
explain: duty({
input: z.object({ rows: z.array(z.record(z.any())) }),
output: z.object({ answer: z.string() }),
run: (a, ctx) => ctx.llm({ prompt: `Summarize these rows: ${JSON.stringify(a.rows)}` }),
}),
},
})
// ── Policy — the analyst is read-only by construction; writes can't happen ─
const readOnly = policy({
id: "analyst-read-only",
applies: { tool: "*" },
check: ({ tool }) => tool.meta.readOnly
? { effect: "allow" }
: { effect: "deny", reason: "Analyst is read-only" },
})
// ── Workflow — plan → critic vets the SQL → run → explain ────────────────
export const ask = workflow("ask")
.input(z.object({ question: z.string() }))
.step("plan", dana.duties.plan)
.critic("sql-safe", (s) => `Is this SQL read-only and correct? ${s.plan.sql}`)
.step("run", runSql, (s) => ({ sql: s.plan.sql }))
.step("explain", dana.duties.explain, (s) => ({ rows: s.run }))
.commit()
const infra = stack({
target: "k8s",
store: postgres({ instance: "db.r6g.large" }),
models: [bedrock("claude-sonnet-4")],
gateway: gateway({ egress: "deny", redactPII: true }),
})
const rt = runtime({
engine: agentist(),
store: infra.store,
llm: infra.models.claude,
policies: [readOnly],
})
rt.register(dana, ask)
rt.serve() // ask it in the Console, or over MCP
Invoke it. Run it locally; the question is one field, so it's positional on the CLI:
Review
Rex reviews a pull request — but first it clones the branch and runs the test suite inside a throwaway microVM sandbox, so untrusted code never touches your runtime. A GitHub webhook starts the run; the review posts back through an MCP tool. Shows: sandbox isolation · webhook triggers · MCP tools — one file.
// src/review.ts — Rex, the code reviewer: tests run in a sandbox before review.
import { agent, duty, tool, workflow, charter, runtime, postgres } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
import { stack, gateway, sandbox } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
import { z } from "zod"
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Evidence over opinion", "Be specific", "Be kind"],
})
const PR = z.object({ repo: z.string(), number: z.number(), sha: z.string() })
// ── Tool — run the PR's tests in a fresh microVM; nothing lingers ────────
const runTests = tool({
input: PR,
output: z.object({ pr: PR, passed: z.boolean(), log: z.string() }),
run: async (a, ctx) => {
const box = await sandbox.create({ image: "node:20", isolation: "microvm" })
await box.exec(`git clone ${a.repo} app && cd app && git checkout ${a.sha}`)
const { exitCode, stdout } = await box.exec("cd app && npm ci && npm test")
await box.terminate() // ephemeral — auto-cleaned regardless
return { pr: a, passed: exitCode === 0, log: stdout }
},
})
// ── Agent — reviews the diff, and posts the verdict back to GitHub ───────
export const rex = agent({
id: "reviewer",
identity: { name: "Rex", role: "Code reviewer", persona: "Terse, specific, kind." },
charter: acme,
tools: [runTests],
duties: {
review: duty({
input: z.object({ pr: PR, tests: z.object({ passed: z.boolean(), log: z.string() }) }),
output: z.object({ verdict: z.enum(["approve", "request-changes"]), notes: z.string() }),
run: async (a, ctx) => {
const r = await ctx.llm({ prompt: `Review PR #${a.pr.number}; tests: ${a.tests.log}` })
await ctx.tool("github.comment", { ...a.pr, body: r.notes }) // post via MCP tool
return r
},
}),
},
})
// ── Workflow — test in isolation, then review ────────────────────────────
export const reviewPR = workflow("review-pr")
.input(PR)
.step("tests", runTests)
.step("review", rex.duties.review, (s) => ({ pr: s.tests.pr, tests: s.tests }))
.commit()
const infra = stack({
target: "k8s",
store: postgres({ instance: "db.r6g.large" }),
models: [bedrock("claude-sonnet-4")],
gateway: gateway({ egress: "deny" }), // the sandbox reaches the net only through here
})
const rt = runtime({ engine: agentist(), store: infra.store, llm: infra.models.claude })
rt.mcp({ consume: ["github"] }) // GitHub's MCP server → agent tools
rt.register(rex, reviewPR)
rt.on("github.pull_request.opened", reviewPR) // webhook → typed, governed run
rt.serve()
Invoke it. Test it locally on a PR; the tests run first, inside a sandbox:
Knowledge
Quinn answers questions from your docs with citations — and refuses to answer beyond its sources. It indexes your docs into pgvector, retrieves with hybrid search and reranking, and is always-on: talk to it in the Console, Slack, or over MCP. Shows: retrieval & vectors · grounded answers · a conversational, addressable agent — one file.
// src/docs.ts — Quinn, the knowledge assistant: grounded answers, with citations.
import { agent, duty, charter, runtime, postgres } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
import { stack, gateway, vectors, retrieval } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
import { z } from "zod"
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Cite your sources", "Say 'I don't know'", "Never invent an answer"],
})
// ── Retrieval — chunk, embed & index your docs into pgvector ────────────
const kb = retrieval({
chunk: { strategy: "recursive", size: 800, overlap: 100 },
embed: "text-embedding-3-large",
store: vectors.pgvector(),
rerank: "rerank-v3", // precision pass before the model
})
// ── Agent — grounded Q&A; answers only from what it retrieved ───────────
export const quinn = agent({
id: "docs",
identity: { name: "Quinn", role: "Knowledge assistant", persona: "Helpful, grounded, honest." },
charter: acme,
duties: {
answer: duty({
input: z.object({ question: z.string() }),
output: z.object({ answer: z.string(), sources: z.array(z.string()) }),
run: async (a, ctx) => {
const hits = await kb.query(a.question, { topK: 8 }) // hybrid search + rerank
return ctx.llm({ prompt: a.question, ground: hits }) // answer only from sources
},
}),
},
})
const infra = stack({
target: "k8s",
store: postgres({ instance: "db.r6g.large" }),
models: [bedrock("claude-sonnet-4")],
gateway: gateway({ egress: "deny" }),
vectors: vectors.pgvector(),
})
const rt = runtime({
engine: agentist(),
store: infra.store,
llm: infra.models.claude,
memory: infra.vectors,
})
rt.register(quinn)
await kb.add("./docs/**/*.md") // ingest once; re-runs are incremental
const q = rt.agent("docs") // every agent is addressable & always-on
await q.ask("how do I rotate the gateway's signing key?")
rt.serve() // Console chat · Slack · MCP
Invoke it. Index your docs, then ask from the CLI; answers cite their sources:
Audit
Cass runs every night: it pulls your service catalog, audits every service in parallel with delegate, and a critic throws out the noisy flags before it posts a waste report. Shows: cron triggers · parallel delegation · a critic for quality · a cheap model for batch, all in one file.
// src/audit.ts — Cass, the cost auditor: every night, flag over-provisioned services.
import { agent, duty, tool, workflow, charter, runtime, postgres } from "@agentist/sdk"
import { agentist } from "@agentist/engine"
import { stack, gateway } from "@agentist/infra"
import { bedrock } from "@agentist/infra/aws"
import { z } from "zod"
const acme = charter({
mission: "Keep systems reliable and customer trust intact.",
values: ["Bias to safety", "Show the numbers", "No false alarms"],
})
const Service = z.object({ name: z.string(), monthlySpend: z.number() })
// ── Tools — pull the catalog & post the report (external, via connectors) ─
const listServices = tool({
input: z.object({}),
output: z.object({ services: z.array(Service) }),
run: (a, ctx) => ctx.tool("billing.services", {}),
})
const postReport = tool({
input: z.object({ findings: z.array(z.any()) }),
run: (a, ctx) => ctx.tool("slack.post", { channel: "#finops", ...a }),
})
// ── Agent — judges one service against its utilization ───────────────────
export const cass = agent({
id: "auditor",
identity: { name: "Cass", role: "Cost auditor", persona: "Skeptical, exact." },
charter: acme,
duties: {
audit: duty({
input: Service,
output: z.object({ service: z.string(), wasteUsd: z.number(), reason: z.string() }),
run: async (a, ctx) => {
const util = await ctx.tool("prometheus.query", { service: a.name })
return ctx.llm({ prompt: `Is ${a.name} over-provisioned? util=${JSON.stringify(util)}` })
},
}),
},
})
// ── Workflow — catalog → audit all in parallel → critic → report ─────────
export const nightly = workflow("nightly-audit")
.input(z.object({}))
.step("catalog", listServices)
.delegate("findings", (s) => // every service, in parallel — durable & governed
s.catalog.services.map((svc) => cass.duties.audit.with(svc)))
.critic("no-false-alarms", (s) => // drop flags the evidence doesn't support
`Are these waste flags justified? ${JSON.stringify(s.findings)}`)
.step("report", postReport, (s) => ({ findings: s.findings }))
.commit()
const infra = stack({
target: "k8s",
store: postgres({ instance: "db.r6g.large" }),
models: [bedrock("claude-haiku-4")], // cheap model — it's a big nightly batch
gateway: gateway({ egress: "deny" }),
})
const rt = runtime({ engine: agentist(), store: infra.store, llm: infra.models.claude })
rt.register(cass, nightly)
rt.cron("0 6 * * *", nightly) // 6am daily — timezone-aware, with a missed-run policy
rt.serve()
Invoke it. Run it on demand from the CLI to test it (in production a cron fires it nightly):
CLI
One binary for the whole lifecycle — author, run, deploy, operate.
agentist init <name> # scaffold a project
agentist dev # local runtime + console :3000
agentist run <agent.duty> "<prompt>" # positional prompt (default); --json for typed input
agentist deploy --target k8s|docker # deploy runtime + manifest
agentist cloud connect # plug in Agentist Cloud
agentist logs <run> # tail a run
agentist rollback <agent> <version> # roll back a version
agentist secrets set <key> # manage secrets
How Agentist compares
Today you'd stitch together a handful of tools — Mastra or Google's ADK for typed agents, Trinity or Kagent to run them in your own cloud, Modal for GPU infrastructure. Agentist is the only one that brings them together — everything a platform engineer needs in a single framework.
Marks: ✅ native · ◑ partial · ✕ none · — not applicable.
| Capability | Agentist | Mastra | Trinity | Modal | Kagent | ADK |
|---|---|---|---|---|---|---|
| Zod-typed boundaries everywhere | ✅ | ✅ | ✕ | — | ✕ | ✅ |
| Code-owned, not UI or YAML | ✅ | ✅ | ◑ | — | ◑ | ✅ |
| Typed, deterministic workflows | ✅ | ✅ | ✕ | — | ✕ | ✅ |
| A small primitive set | ✅ | ◑ | ◑ | — | ◑ | ◑ |
| Pluggable models, any provider | ✅ | ✅ | ◑ | ◑ | ✅ | ✅ |
| Critics that don't self-grade | ✅ | ◑ | ✕ | — | ✕ | ◑ |

| Capability | Agentist | Mastra | Trinity | Modal | Kagent | ADK |
|---|---|---|---|---|---|---|
| Durable, event-sourced execution | ✅ | ◑ | ✅ | ◑ | ✕ | ◑ |
| Runs in your cloud (sovereign) | ✅ | ✕ | ✅ | ✕ | ✅ | ◑ |
| Per-agent isolation tiers | ✅ | ✕ | ◑ | ✅ | ◑ | ◑ |
| Append-only, tamper-evident audit | ✅ | ✕ | ✅ | ◑ | ✕ | ◑ |
| Human approvals on one trail | ✅ | ◑ | ✅ | ✕ | ◑ | ◑ |
| Workers · leases · retries | ✅ | ✕ | ✅ | ✅ | ✕ | ✕ |
| Always-on agents you talk to | ✅ | ✕ | ✅ | — | ◑ | ◑ |

| Capability | Agentist | Mastra | Trinity | Modal | Kagent | ADK |
|---|---|---|---|---|---|---|
| Infra-as-code (compute & images) | ✅ | ◑ | ◑ | ✅ | ◑ | ✕ |
| Model endpoints on GPUs | ✅ | ✕ | ✕ | ✅ | ✕ | ✕ |
| Granular compute & autoscaling | ✅ | ◑ | ✕ | ✅ | ◑ | ◑ |
| Sandboxes for untrusted code | ✅ | ✕ | ◑ | ✅ | ◑ | ✅ |
| Scale-to-zero compute | ✅ | ◑ | ✕ | ✅ | ◑ | ◑ |
| Provisions its own components | ✅ | ✕ | ◑ | ◑ | ✕ | ✕ |
| Deploys into your cloud | ✅ | ✕ | ✅ | ✕ | ✅ | ◑ |

| Capability | Agentist | Mastra | Trinity | Modal | Kagent | ADK |
|---|---|---|---|---|---|---|
| Admission control, per step | ✅ | ✕ | ✕ | — | ✕ | ✕ |
| Typed and durable together | ✅ | ✕ | ✕ | — | ✕ | ✕ |
| Durable, no separate cluster | ✅ | ✕ | ✕ | ✕ | ✕ | ✕ |
| One log = audit = replay = SOC 2 | ✅ | ✕ | ◑ | ✕ | ✕ | ✕ |
| Conversation = approval = audit | ✅ | ✕ | ✕ | — | ✕ | ✕ |
| One typed invocation contract | ✅ | ✕ | ✕ | — | ✕ | ✕ |
| Charter inherited by every agent | ✅ | ✕ | ◑ | — | ✕ | ✕ |
Questions, answered
The things engineers ask first — answered plainly. For anything else, reach out.
When can I try it?
How is this different from LangGraph, CrewAI, or a prompt framework?
Can my AI coding assistant build Agentist agents, or just humans?
Do I have to rewrite my agents to adopt Agentist?
What language do I write agents in?
Which models can I use — and can I run my own?
Does my data ever leave my cloud?
What does "admission control" actually do?
What happens if an agent crashes mid-run?
Can I use my existing tools and MCP servers?
How does local development work?
agentist dev runs the full runtime and console on your machine, and agentist run <agent.duty> "<prompt>" invokes a duty straight from the CLI. See the examples.
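To illustrate the shape of that command, here is a small sketch that splits the `<agent.duty>` target and distinguishes a positional prompt from typed JSON input. It is not the CLI's actual parser: `parseRunArgs` and `Invocation` are hypothetical, and the assumption that the JSON payload follows `--json` as a single argument is a guess, since the flag's exact form is not shown here.

```typescript
// Hypothetical result shape; the real CLI's internals are not documented here.
type Invocation =
  | { agent: string; duty: string; kind: "prompt"; prompt: string }
  | { agent: string; duty: string; kind: "json"; input: unknown }

// `argv` is everything after `agentist run`.
function parseRunArgs(argv: string[]): Invocation {
  const match = /^([^.\s]+)\.([^.\s]+)$/.exec(argv[0] ?? "")
  if (!match) throw new Error('expected a target of the form "<agent>.<duty>"')
  const agent = match[1]
  const duty = match[2]

  if (argv[1] === "--json") {
    // Assumption: the JSON payload follows the flag as one argument.
    const payload = argv[2]
    if (payload === undefined) throw new Error("--json needs a JSON argument")
    return { agent, duty, kind: "json", input: JSON.parse(payload) }
  }

  const prompt = argv[1]
  if (prompt === undefined) throw new Error("expected a prompt")
  return { agent, duty, kind: "prompt", prompt }
}
```

For example, `parseRunArgs(["oncall.triage", "checkout p99 latency is 5s"])` yields agent `oncall`, duty `triage`, and the prompt as given; either kind of input would then be validated against the duty's schema by the runtime.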