Release notes · agent-harness Developer Docs

In development (not yet released)

Working notes for the next release. Move entries below into a numbered version after the release ships.

Shipped

0.0.291

Removed the remaining public legacy compatibility aliases in favor of `session/request`: active public facade types, runtime helper surfaces, CLI request listing, and inspection helpers now expect canonical session / request names directly. This is a breaking cleanup pass for downstream code that still imports legacy public facade types, calls legacy request-list helpers, or depends on legacy inspection helper names.
Applications can now persist app-owned request artifacts through the public runtime facade: added recordArtifact(runtime, { sessionId, requestId, kind, path, content }) so long-running workspaces can attach app-owned files to the persisted request surface without reaching into private persistence internals.
Evaluation exports stay generic while carrying app-owned artifacts: exportEvaluationBundle(...) continues to export persisted artifacts and optional artifact contents without imposing harness-owned roles or coding-specific artifact semantics onto the public runtime contract.
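
The recordArtifact(...) entry above names the input fields but not their types. A minimal sketch of the call shape follows. It uses a local in-memory stand-in rather than the real package export, and the field types, the `kind` value, and the `InMemoryRuntime` class are assumptions made for illustration only.

```typescript
// Field names come from the release note; the string types are assumed.
interface RecordArtifactInput {
  sessionId: string;
  requestId: string;
  kind: string;
  path: string;
  content: string;
}

// Local stand-in for a runtime. This is NOT the real AgentHarnessRuntime.
class InMemoryRuntime {
  readonly artifacts: RecordArtifactInput[] = [];
}

// Hypothetical stand-in that mirrors the documented argument order.
function recordArtifactSketch(runtime: InMemoryRuntime, input: RecordArtifactInput): void {
  runtime.artifacts.push({ ...input });
}

const demoRuntime = new InMemoryRuntime();
recordArtifactSketch(demoRuntime, {
  sessionId: "session-1",
  requestId: "request-1",
  kind: "report", // assumed value; the note does not list valid kinds
  path: "reports/summary.md",
  content: "# Summary\n",
});
```

In a real workspace the call would go through the package's own recordArtifact export, so the stand-in types above should be replaced by whatever the package actually publishes.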

0.0.285

Default local checkpointer now uses sqlite under `runtime/`: the built-in checkpointer/default preset now resolves to SqliteSaver at runtime/checkpoints.sqlite, so the default writable data layout keeps runtime records, checkpoint recovery state, durable knowledge, and vector persistence all in sqlite-backed stores instead of leaving checkpoint recovery on runtime/checkpoints.json.

0.0.282

Runtime storage now separates the shared application folder from the writable data folder: kind: Runtime now exposes spec.applicationRoot, spec.dataRoot, and spec.profile, so one packaged config/ + resources/ application folder can stay read-only while different runtime profiles point at different writable data folders. The runtime now treats the loaded workspace root as the application folder and anchors writable persistence under the resolved data root instead of assuming one mixed filesystem root.
Default local persistence now lands in clearer data subdirectories: runtime-owned SQLite records and the default sqlite-backed checkpointer now resolve under runtime/ inside the writable data root, while durable knowledge and vector persistence stay under knowledge/. The built-in defaults now resolve to paths such as knowledge/records.sqlite, knowledge/vectors.sqlite, and runtime/checkpoints.sqlite, which keeps shared application assets separate from profile-specific runtime state.
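
The layout described above can be sketched as a small path resolver. The three relative paths are the ones quoted in the entry; the `resolveStorePaths` helper and the example data roots are hypothetical and only show how one application folder could pair with several writable data roots.

```typescript
// Relative defaults quoted in the release notes.
const DEFAULT_STORE_PATHS = {
  knowledgeRecords: "knowledge/records.sqlite",
  knowledgeVectors: "knowledge/vectors.sqlite",
  checkpoints: "runtime/checkpoints.sqlite",
} as const;

type StoreName = keyof typeof DEFAULT_STORE_PATHS;

// Hypothetical helper: anchors each default under a writable data root.
function resolveStorePaths(dataRoot: string): Record<StoreName, string> {
  const root = dataRoot.replace(/\/+$/, ""); // drop trailing slashes
  return {
    knowledgeRecords: `${root}/${DEFAULT_STORE_PATHS.knowledgeRecords}`,
    knowledgeVectors: `${root}/${DEFAULT_STORE_PATHS.knowledgeVectors}`,
    checkpoints: `${root}/${DEFAULT_STORE_PATHS.checkpoints}`,
  };
}

// Two profiles sharing one application folder but writing to different data roots.
const devPaths = resolveStorePaths("/var/data/dev/");
const prodPaths = resolveStorePaths("/var/data/prod");
```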

0.0.275

Durable memory now ships as an embeddable knowledge module: the package root now exports createKnowledgeModule(...), readKnowledgeRuntimeConfig(...), and the shared KnowledgeRuntimeContext contract, so downstream apps can run the same durable memorize/recall lifecycle outside AgentHarnessRuntime without importing private runtime/harness/system/* internals. The standalone module keeps runtime-governed merge/review, projection, and reindex behavior behind the same stable MemoryRecord surface instead of exposing backend-specific storage details.
Knowledge policy can now be packaged as a standalone singleton YAML object: durable memory policy readers now accept either the existing RuntimeMemory.spec object or a full kind: KnowledgeRuntime singleton document, and the repository now ships a matching config/knowledge/knowledge-runtime.yaml example plus bilingual memory docs that explain how to reuse the same policy in a future knowledge worker or knowledge server.
`agent-harness init` now scaffolds the standalone knowledge-policy mirror by default: newly generated workspaces now include both config/runtime/runtime-memory.yaml for runtime-owned durable-memory defaults and config/knowledge/knowledge-runtime.yaml for the same memorize/recall policy when teams later externalize knowledge handling into a dedicated worker or service.

0.0.274

Examples are now numbered by complexity: top-level example workspaces now use 00_ through 07_ prefixes so the repository reads as a clear progression from the smallest hello-skill starter up through the stock-research workspace. The README, docs, release automation, and example regression tests now point at the numbered paths so example discovery and version-sync automation stay aligned.
Repairable `write_todos` schema failures now retry inside the runtime instead of hard-failing the request: when DeepAgent's built-in write_todos tool is called without the required todo-item content field, the runtime now recognizes that validation failure as a repairable compatibility gap, appends a precise retry instruction telling the model to resend each todo item with both content and status, and re-invokes the request instead of terminating immediately. The same recovery behavior now applies consistently to both direct invoke flows and stream/listener-driven execution paths, while keeping the existing full-entry todo semantics intact instead of introducing status-only patch behavior. Issue #62
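
The repair loop for write_todos can be sketched roughly as below. All type names, the instruction text, and the `maxRetries` limit are assumptions; the notes only say that the runtime appends a retry instruction and re-invokes the request.

```typescript
// Hypothetical shapes; the real tool-call and validation types are not shown in the notes.
interface TodoItemArgs {
  content?: string;
  status?: string;
}

interface ModelTurn {
  todos: TodoItemArgs[];
}

// Assumed wording for the retry instruction.
const RETRY_INSTRUCTION =
  "Resend write_todos with both content and status on every todo item.";

// A todo list is repairable when at least one item lacks the required content field.
function isRepairableTodoFailure(todos: TodoItemArgs[]): boolean {
  return todos.some((item) => typeof item.content !== "string" || item.content.length === 0);
}

// Invoke the model; on a repairable failure, append the retry instruction and re-invoke.
function invokeWithTodoRepair(
  invokeModel: (messages: string[]) => ModelTurn,
  messages: string[],
  maxRetries = 1, // assumed limit; the notes do not state one
): ModelTurn {
  let history = [...messages];
  let turn = invokeModel(history);
  for (let attempt = 0; attempt < maxRetries && isRepairableTodoFailure(turn.todos); attempt++) {
    history = [...history, RETRY_INSTRUCTION];
    turn = invokeModel(history);
  }
  return turn;
}
```

Because every retried item must carry both content and status, this keeps full-entry todo semantics rather than accepting status-only patches.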

0.0.269

Governed tool approvals now support explicit runtime decision modes: runtime-governed tools can now stay on manual approval, auto-approve immediately, or auto-reject / deny-and-continue without opening a human approval inbox item, while governance snapshots and operator overview diagnostics expose the selected decision mode for each tool.
A2A discovery now covers lighter compatibility paths: the A2A bridge now serves discovery-friendly HEAD / OPTIONS responses on the agent-card path, emits supported-version headers, and accepts the JSON-RPC GetAgentCard alias in addition to GetExtendedAgentCard, making mixed A2A clients easier to wire without changing the persisted runtime task model.
A2A card discovery can now advertise registry and detached-signature hints: when configured, the A2A bridge can expose registry URLs plus detached signed-card metadata through agent-card payloads and card-response headers, so surrounding registries or gateways can experiment with signed-card discovery without turning agent-harness into a second identity system.
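
The three governed-approval decision modes from the first 0.0.269 entry can be modeled as a discriminated union. The mode spellings and the outcome shape below are illustrative assumptions; the notes describe the behaviors, not the configuration keys.

```typescript
// Assumed spellings for the three behaviors described in the notes.
type DecisionMode = "manual" | "auto-approve" | "auto-reject";

// Hypothetical outcome shape: only manual mode opens a human approval inbox item.
type ApprovalOutcome =
  | { state: "pending"; inboxItem: true }
  | { state: "approved"; inboxItem: false }
  | { state: "rejected"; inboxItem: false; continueRequest: true };

function decideApproval(mode: DecisionMode): ApprovalOutcome {
  switch (mode) {
    case "manual":
      return { state: "pending", inboxItem: true };
    case "auto-approve":
      return { state: "approved", inboxItem: false };
    case "auto-reject":
      // Deny-and-continue: the tool call is rejected but the request keeps running.
      return { state: "rejected", inboxItem: false, continueRequest: true };
  }
}
```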

0.0.268

Public runtime/session/request outputs now standardize on `sessionId` and `requestId`: public runtime records, stream items, approvals, transcript messages, memory provenance, and protocol adapters now expose sessionId / requestId as the only public identity fields instead of duplicating the legacy thread/run identity aliases on the same payloads. Downstream clients can now consume one stable identity pair across runtime APIs, streaming, A2A, AG-UI, ACP, and persisted runtime records without alias fallback logic. Issue #61
Public stream/result payloads no longer mix duplicate identity aliases: public streamEvents(...), result snapshots, session/request lookups, and projected trace/runtime records now drop the legacy thread/run identity fields entirely, so clients no longer need to guess which identity pair is canonical when reconstructing request state, matching deltas to results, or joining runtime evidence across protocol surfaces. Issue #61

0.0.266

Streaming listeners now preserve live assistant text deltas on the public harness stream path: streamEvents(...) now forwards streamed content, content-blocks, and tool-result items instead of filtering them down to lifecycle/upstream/result records, and successful text chunks now emit persisted output.delta runtime events as they arrive. This restores true incremental assistant streaming for downstream clients that consume the public harness stream directly, including desktop shells that bridge harness stream items into UI token updates.
Runtime YAML now exposes a public tool-module discovery scope: kind: Runtime can now declare toolModuleDiscovery.scope: recursive | top-level, so downstream workspaces can keep the default recursive resources/tools/ scan or intentionally restrict module discovery to files directly under each tool root without patching private installed dist/workspace/object-loader.js internals. The same policy now applies consistently to workspace-local tool modules, attached resource tool modules, and external resource tool-module hydration. Issue #60
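
The difference between the two toolModuleDiscovery.scope values can be sketched as a filter over paths relative to one tool root. The `selectToolModules` helper, the "/" separator, and the accepted file extensions are assumptions of this sketch, not the loader's actual implementation.

```typescript
type DiscoveryScope = "recursive" | "top-level";

// Paths are relative to one tool root and use "/" separators (assumed).
function selectToolModules(relativePaths: string[], scope: DiscoveryScope): string[] {
  return relativePaths.filter((candidate) => {
    const segments = candidate.split("/");
    if (segments.indexOf("node_modules") !== -1) return false; // vendored trees are skipped
    if (!/\.(mjs|js)$/.test(candidate)) return false; // assumed module extensions
    // "top-level" keeps only files directly under the tool root.
    return scope === "recursive" || segments.length === 1;
  });
}

const toolFiles = ["search.mjs", "web/fetch.mjs", "node_modules/dep/index.js", "README.md"];
const recursiveModules = selectToolModules(toolFiles, "recursive");
const topLevelModules = selectToolModules(toolFiles, "top-level");
```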

0.0.263

Workspace tool scanning now follows symlinked local `resources/tools` modules: local workspace tool-module discovery now follows symlinked files and directories while still skipping node_modules, so dev workspaces that project resources/tools/*.mjs into the runtime workspace through symlinks no longer drop those function tools during discovery. This fixes bundle/toolset validation failures such as Tool wallee-web-search-toolset bundle ref webSearch does not exist when the referenced module tool exists but was skipped because the workspace-mounted file was a symlink. Issue #59

0.0.260

Toolset refs now resolve exported function tools even when `tool({...})` omits `name`: workspace tool compilation now falls back to a function tool's implementationName when the parsed tool object does not carry an explicit name, so attached resources/tools/*.mjs exports such as export const webSearch = tool({ ... }) remain addressable from YAML tool bundles/toolsets without duplicating name: webSearch inside the module definition. Issue #59
Workspace startup scans now consistently ignore vendored `node_modules` trees: attached resource tool-module discovery now skips everything under tools/node_modules/ instead of traversing vendored dependency forests as if they were workspace-owned tools or skills. This removes a real startup blocker for downstream desktop shells that package large resource dependency trees under resources/tools/node_modules, and keeps startup timing focused on workspace-owned definitions rather than dependency payloads.
Startup readiness now stops blocking ready on background health/recovery work and exposes deeper phase timings: set AGENT_HARNESS_STARTUP_TIMING=1 to emit stage-level stderr timing for workspace load, validation, binding compilation, MCP hydration, persistence init, and startup recovery so downstream apps can see which bootstrap phase is consuming startup time. Ready no longer waits for health-monitor startup or startup recovery completion before returning from runtime initialization, which shortens clean-start readiness for downstream desktop shells. Workspace MCP hydration also skips remote listTools() enumeration when an agent MCP entry already declares an explicit MCP tool allowlist, preventing slow remote MCP discovery from blocking ready on that path. Internal health/operator reads now query only active request states plus pending approvals instead of scanning the full persisted request catalog for those background summaries. Issue #57
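
The name fallback from the toolset-refs entry above (Issue #59) amounts to a one-line rule. The `ParsedFunctionTool` shape is hypothetical; only the `name` and `implementationName` fields are taken from the notes.

```typescript
// Hypothetical parsed-tool shape used only for this sketch.
interface ParsedFunctionTool {
  name?: string;
  implementationName: string; // e.g. the exported binding, such as webSearch
}

// Prefer an explicit, non-empty name; otherwise fall back to the implementation name.
function resolveToolRefName(tool: ParsedFunctionTool): string {
  const explicit = tool.name?.trim();
  return explicit ? explicit : tool.implementationName;
}

const unnamedToolRef = resolveToolRefName({ implementationName: "webSearch" });
const namedToolRef = resolveToolRefName({ name: "search", implementationName: "webSearch" });
```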

0.0.257

Unfiltered request-list reads now stay on a lightweight SQLite summary path under shared-runtime load: the SQLite catalog now creates a dedicated recency index and no longer deserializes runtime_snapshot_json for historical inactive requests while scanning the full catalog. Active requests still expose live runtime snapshots for operator/governance surfaces, but persisted desktop profiles no longer need to drag every completed request's inspection blob through the startup request-list path. Issue #57

0.0.254

Session list reads no longer depend on per-session scalar subqueries: SQLite session-summary reads now batch first/last message selection with one windowed message pass instead of running correlated subqueries per session, and the runtime schema now keeps recency-ordered session/session-summary reads on an indexed path under shared runtime load. Issue #56

0.0.252

Recursive YAML discovery now skips `node_modules`: readYamlItems(..., { recursive: true }) now ignores nested node_modules/** trees before descending, so attached resource tool scanning no longer walks vendored dependency forests under paths such as resources/tools/node_modules. This removes avoidable filesystem traversal during runtime inventory and session inspection while keeping workspace-owned YAML catalogs and attached resource tool definitions discoverable. Issue #55
Security audit now uses a patched transitive `axios`: package overrides now pin axios to ^1.15.0, so transitive consumers such as mem0ai, ibm-cloud-sdk-core, and retry-axios no longer leave the default-branch CI blocked on the NO_PROXY normalization SSRF advisory.

0.0.250

Request-first Mermaid exports are now the public flow-visualization API: the package root now exposes exportFlow(runtime, { sessionId, requestId }) and exportSequence(runtime, { sessionId, requestId }) as the product-facing helpers for request inspection diagrams, so callers no longer need to build or pass intermediate flow graph objects or visualization options through the public API.
Request trace items are now the canonical persisted inspection surface: completed requests now persist normalized trace items and expose them through listRequestTraceItems(runtime, { sessionId, requestId }), while public live listeners now use onTraceItem(...) so diagrams and post-request inspection read from the same product-facing trace vocabulary instead of a separate runtimeSurface view.
Inspection records no longer duplicate derived `history`: public getRequest(...) and getSession(...) now expose canonical traceItems plus runtimeTimeline without a second derived history array, so request inspection has one persisted trace source instead of parallel projections that can drift.
Trace item persistence is now append-only: file and SQLite persistence now append request trace items outside the mutable request-inspection metadata record, so streaming inspection no longer rereads and rewrites the full persisted trace blob on every trace update. This keeps onTraceItem(...), listRequestTraceItems(...), exportFlow(...), and exportSequence(...) aligned to one persisted trace source without the previous write amplification.
Runtime surface items now expose `action` and optional structured `detail`: live inspection, persisted request/session inspection, and flow graph metadata now publish RuntimeSurfaceItem.action instead of the older display-only label field, plus an optional detail object for future product-specific context such as handoff participants or original upstream step metadata.
Runtime-owned surface actions are now explicit and memory-aware: runtime surface projection now emits stable action verbs such as apply, call, execute, and handoff, while memory steps distinguish recall versus memorize when upstream semantics make that intent observable.
Sub-agent handoff surface items now stay under `kind: "agent"` and read as handoff progress: runtime surface projection now keeps sub-agent calls under kind: "agent" while emitting clearer handoff started/completed labels, and live/persisted inspection now pairs those handoff items with llm completed when chat-model end events arrive.
Removed native `better-sqlite3` runtime dependency: SqliteStore now uses the existing @libsql/client path instead of native better-sqlite3, so default installs no longer pull the native module at all. SqliteSaver checkpointers are temporarily unsupported and now fail fast with an explicit runtime error instead of loading @langchain/langgraph-checkpoint-sqlite.
Runtime PATH normalization for pre-tool git lookups: runtime startup now normalizes the current process PATH with common system, Homebrew, and version-manager executable directories before agent execution begins, so upstream or dependency-managed spawn("git") calls can resolve git even when delegated review flows fail before any explicit tool invocation. Function-tool subprocess launches now use the same normalized runtime env builder instead of a raw { ...process.env, ...env } merge, preventing tool-local PATH overrides from dropping git or other system binaries. Issue #54
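
A rough sketch of why a normalized env builder is safer than a raw spread merge follows. The helper names, the directory ordering, and the POSIX ":" separator are assumptions; the notes do not list the exact directories the runtime adds.

```typescript
type EnvMap = Record<string, string | undefined>;

// Merge PATH-style values in priority order, dropping empty and duplicate entries.
function mergePathDirs(pathValues: Array<string | undefined>, extraDirs: string[]): string {
  const dirs: string[] = [];
  for (const value of pathValues) {
    for (const dir of (value ?? "").split(":")) {
      if (dir.length > 0 && dirs.indexOf(dir) === -1) dirs.push(dir);
    }
  }
  for (const dir of extraDirs) {
    if (dirs.indexOf(dir) === -1) dirs.push(dir);
  }
  return dirs.join(":");
}

// Unlike { ...base, ...overrides }, a tool-local PATH cannot drop base or system directories.
function buildToolEnv(base: EnvMap, overrides: EnvMap, extraDirs: string[]): EnvMap {
  const merged: EnvMap = { ...base, ...overrides };
  merged["PATH"] = mergePathDirs([overrides["PATH"], base["PATH"]], extraDirs);
  return merged;
}

const toolEnv = buildToolEnv(
  { PATH: "/usr/bin:/custom/bin", HOME: "/home/dev" },
  { PATH: "/tool/bin" },
  ["/usr/bin", "/bin"], // illustrative system directories
);
```

With the raw spread merge, the tool would see only /tool/bin on its PATH, which is the failure mode the entry describes for spawn("git").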

0.0.248

Unified runtime surface items for live and persisted inspection: upstream listener items now expose surfaceItems with stable id, display name, label, status, and owning agent metadata for agent, llm, memory, skill, and tool. The same runtime-owned surface also appears in persisted request inspection and flow graph metadata so products can render live events and recovered runs from one canonical shape.

0.0.247

Persisted request handoff lineage and richer agent metadata: runtime persistence now records parentRequestId for later requests in the same session, keeps agent-owned upstream inspection events with both agentId and human-readable agentName, and carries that metadata through flow graphs and public upstream-event items so clients can render delegation and cross-request handoffs without local heuristics. Issue #53

0.0.246

Workspace config wins over attached tool dependencies: loadWorkspace() now ignores YAML catalogs under resources/tools/node_modules/** during conventional object discovery, so dependency-packaged builtin catalogs can no longer override workspace-owned entries such as model/default. This keeps workspace-local model selection stable even when attached tool dependencies vendor @botbotgo/agent-harness artifacts. Issue #50
Terminal stream state persistence and richer persisted flow projection: Streaming requests now persist request.state.changed before exposing the terminal stream result, so readers that inspect the request as soon as they receive the final assistant output no longer see stale running state. Persisted upstream flow projection also classifies middleware-style step names such as SkillsMiddleware.before_agent and MemoryMiddleware.before_agent using semantic hints from camel-case and dotted names, so product-view Mermaid exports no longer collapse to a lone LLM node when the persisted trace already contains meaningful skill and memory stages. Issue #51
Product-view Mermaid connectivity across hidden execution nodes: exportFlowGraphToMermaid(...) now preserves visible connectivity when the product view hides intermediate chain or memory nodes, so non-empty flow graphs no longer render as disconnected subgraphs just because the original edges pass through filtered runtime-only steps. Issue #52
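
One way to preserve connectivity when a view hides intermediate nodes is to rewire each visible node to the nearest visible nodes reachable through hidden ones. The sketch below illustrates that idea with hypothetical `FlowEdge` and `projectVisibleEdges` names; it is not the exporter's actual algorithm.

```typescript
interface FlowEdge {
  from: string;
  to: string;
}

// Rewire edges so visible nodes stay connected when hidden nodes between them are filtered out.
function projectVisibleEdges(edges: FlowEdge[], hidden: string[]): FlowEdge[] {
  const isHidden = (id: string): boolean => hidden.indexOf(id) !== -1;
  const result: FlowEdge[] = [];
  const addEdge = (from: string, to: string): void => {
    if (!result.some((e) => e.from === from && e.to === to)) result.push({ from, to });
  };
  // Follow outgoing edges through hidden nodes until visible targets are reached.
  const visibleTargets = (start: string, visited: string[]): string[] => {
    const found: string[] = [];
    for (const edge of edges) {
      if (edge.from !== start) continue;
      if (!isHidden(edge.to)) {
        found.push(edge.to);
      } else if (visited.indexOf(edge.to) === -1) {
        found.push(...visibleTargets(edge.to, visited.concat(edge.to)));
      }
    }
    return found;
  };
  for (const edge of edges) {
    if (isHidden(edge.from)) continue; // edges leaving hidden nodes are covered by the walk above
    if (!isHidden(edge.to)) {
      addEdge(edge.from, edge.to);
    } else {
      for (const target of visibleTargets(edge.to, [edge.to])) addEdge(edge.from, target);
    }
  }
  return result;
}

const productEdges = projectVisibleEdges(
  [
    { from: "agent", to: "chain" },
    { from: "chain", to: "memory" },
    { from: "memory", to: "llm" },
    { from: "llm", to: "tool" },
  ],
  ["chain", "memory"],
);
```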

0.0.243

Lazy SQLite runtime loading: Default FileStore and FileCheckpointer startup paths no longer import better-sqlite3 or @langchain/langgraph-checkpoint-sqlite at module top level. SQLite drivers now load only when SqliteStore or SqliteSaver is actually instantiated, removing Electron ABI startup hazards for non-SQLite workspaces. Issue #49
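
The lazy-loading pattern described here can be sketched with a memoized async loader. The `lazyOnce` helper and the fake driver are hypothetical; a real implementation would place a dynamic import of the SQLite driver inside the loader instead of returning a stub.

```typescript
// Wrap an async loader so it runs at most once, and only when first requested.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Hypothetical driver shape standing in for a SQLite client module.
interface FakeSqliteDriver {
  open(file: string): string;
}

let driverLoads = 0;
const getSqliteDriver = lazyOnce<FakeSqliteDriver>(async () => {
  driverLoads += 1; // real code would await a dynamic import here instead
  return { open: (file: string) => `opened ${file}` };
});

// Only code paths that actually need SQLite touch the loader.
async function openSqliteStore(file: string): Promise<string> {
  const driver = await getSqliteDriver();
  return driver.open(file);
}
```

Workspaces that never construct a SQLite-backed store never call the loader, so the driver module is never evaluated at startup.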

0.0.233

Governance evidence exports: exportRequestPackage(...) and exportSessionPackage(...) now include governance evidence alongside approvals, events, transcript, artifacts, and optional runtime health so operator tooling can export audit-ready runtime packages without rebuilding approval/risk context from persistence internals. The operator CLI adds agent-harness runtime export request|session for direct JSON evidence export during incident response and review workflows.
A2A v1.0 alignment: serveA2aHttp(runtime) now publishes an A2A v1.0-shaped agent card with supportedInterfaces protocol binding entries and capabilities.extendedAgentCard, accepts v1.0 ROLE_USER text parts, returns the { task } wrapper for SendMessage, emits TASK_STATE_* status enums, supports ListTasks status / pageSize / pageToken, and reports streaming or push-notification operations as unsupported instead of implying that the bridge owns those capabilities.
A2A extended card discovery: GetExtendedAgentCard now returns a real extended agent card instead of an unsupported error, including runtime-owned metadata for protocol surfaces plus agent/subagent inventory so external A2A clients can discover the control-plane shape without relying on out-of-band docs.
A2A service-parameter compatibility: the HTTP bridge now advertises both 1.0 and 0.3 JSON-RPC interfaces on the same endpoint, validates A2A-Version with VersionNotSupportedError, and records A2A-Extensions into runtime invocation metadata so operator tooling can inspect negotiated protocol context.
A2A push notifications: the A2A bridge now exposes CreateTaskPushNotificationConfig, GetTaskPushNotificationConfig, ListTaskPushNotificationConfigs, and DeleteTaskPushNotificationConfig, advertises capabilities.pushNotifications = true, accepts inline push notification configs on send calls, and sends best-effort webhook task snapshots for configured receivers.
A2A capability advertisement fix: the published agent card now correctly sets capabilities.streaming = true so external clients can discover the new SSE bridge instead of being told streaming is unavailable.
A2A streaming event shape: v1 SSE calls now send an initial { task } payload followed by { statusUpdate } events with final terminal hints, keeping bridge discovery, streaming behavior, and subscribe semantics aligned without changing legacy lowercase task snapshots.
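
The A2A-Version validation mentioned in the service-parameter entry above can be sketched as follows. The error class is a local stand-in with the same name as the error in the notes, and the handling of a missing header (falling back to a caller-supplied version) is an assumption.

```typescript
// Versions the bridge advertises according to the notes.
const SUPPORTED_A2A_VERSIONS = ["1.0", "0.3"];

// Local stand-in for the sketch; the real error type is defined elsewhere.
class VersionNotSupportedError extends Error {
  readonly requested: string;
  constructor(requested: string) {
    super(`A2A version ${requested} is not supported`);
    this.name = "VersionNotSupportedError";
    this.requested = requested;
  }
}

// Missing-header behavior is assumed: the caller supplies the fallback version.
function negotiateA2aVersion(headerValue: string | undefined, fallback: string): string {
  const requested = headerValue === undefined ? fallback : headerValue.trim();
  if (SUPPORTED_A2A_VERSIONS.indexOf(requested) === -1) {
    throw new VersionNotSupportedError(requested);
  }
  return requested;
}
```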

Runtime, security, and interoperability

Bulk session history summaries: Added runtime-owned listSessionSummaries(...) so apps can load history/sidebar-ready session summaries in one call without hydrating every session through getSession(...). The new summary projection keeps listSessions(...) lightweight while exposing stable runtime fields such as entryAgentId, title, snippet, messageCount, hasVisibleMessages, and lastMessageRole for conversation lists. Issue #48
Release Node version sync: CI and release workflows now read the runtime Node major from the shared repo-root .node-version file instead of duplicating node-version literals, with regression coverage to keep workflow setup and package.json.engines.node aligned.
Recovered inspection records: Persisted request/session recovery now keeps canonical runtime inspection evidence for completed requests by storing per-request inspectable execution traces and returning runtimeTimeline alongside the existing frozen runtimeSnapshot through getRequest(...) and getSession(...). This removes the need for renderer-local fallback storage just to rebuild a completed request's runtime tab after restart. Issue #47
Upstream dependency alignment: Upgraded langchain to 1.3.1 and deepagents to 1.9.0, updated the deepagents backend compatibility bridge to the structured ls / grep / glob protocol introduced upstream, and added regression coverage for direct VfsSandbox grep plus composite backend read/list behavior after the upgrade.
Async subagent passthrough: Deepagent workspaces can now declare remote async subagents directly in spec.subagents using upstream AsyncSubAgent fields such as graphId, url, and headers; the runtime preserves those specs through compilation and passes them to upstream createDeepAgent(...) without inventing a harness-owned delegation layer.
DeepAgents feature alignment: Declarative middleware now maps completionCallback to upstream createCompletionCallbackMiddleware, builtin middleware tools accept structured backend protocol v2 results, and inline DeepAgents backend catalogs can materialize LangSmithSandbox through the existing runtime backend factory contract.
`runtime.sqlite` errors: error messages now include the database file path, and constraint-class errors, or any run with AGENT_HARNESS_RUNTIME_SQLITE_DEBUG=1 set, also append the failing SQL for easier diagnosis (#45).
Toolchain / CI: TypeScript 6 shared compiler compatibility with regression coverage for baseUrl deprecation acknowledgement; npm run security:ci on pushes/PRs; dependency review for high-severity advisories; dependency overrides as needed for a clean production audit.
Semantics and docs tests: DeepAgents vs LangGraph boundary semantics documented with regression tests (trace vs execution graph).
ACP interoperability validation: Added broader ACP transport regression coverage for HTTP and stdio, including reference-client-style submit/list/get flows, invalid JSON handling, and notification calls without JSON-RPC response envelopes.
ACP reference client: Added createAcpStdioClient(...) as a small public reference client for IDE and CLI sidecars that need ACP stdio request/notification demultiplexing without hand-rolled JSON-line parsing.

0.0.190

Added a stable operator-overview control-plane projection that aggregates runtime health, queue pressure, pending approvals, active requests, and governance risk into one runtime-owned snapshot for dashboards and operator tooling.
Added getOperatorOverview(runtime, { limit }) to the public API and agent-harness runtime overview to the CLI so operators can load one actionable summary before drilling into raw request, session, approval, and event records.

0.0.188

Tightened runtime remote MCP admission control so runtime/default.governance.remoteMcp can now deny remote tools by trust tier, tenant scope, prompt-injection risk, and OAuth scope policy instead of stopping at server and transport filters.
Upgraded the operator CLI summaries so agent-harness runtime health, runtime approvals list|watch, and runtime requests list|tail now show check status, symptom summaries, request timestamps, current delegated agent, and resumability without forcing operators to parse raw JSON first.
Hardened the shipped workspace runtime defaults to deny untrusted, cross-tenant, and high prompt-injection-risk MCP surfaces by default while still keeping approval-by-transport and transport risk metadata explicit in YAML.
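
The admission checks listed in the 0.0.188 entries can be sketched as a pure policy function. Every field name, enum value, and reason string below is hypothetical; the notes name the policy dimensions (trust tier, tenant scope, prompt-injection risk, OAuth scope) but not their YAML or type spelling.

```typescript
// Hypothetical tool metadata for the four policy dimensions named in the notes.
interface RemoteMcpTool {
  trustTier: "trusted" | "untrusted";
  tenantScope: "same-tenant" | "cross-tenant";
  promptInjectionRisk: "low" | "medium" | "high";
  oauthScopes: string[];
}

// Hypothetical policy shape mirroring the hardened defaults described above.
interface RemoteMcpPolicy {
  denyUntrusted: boolean;
  denyCrossTenant: boolean;
  denyHighPromptInjectionRisk: boolean;
  allowedOauthScopes: string[];
}

function admitRemoteTool(
  tool: RemoteMcpTool,
  policy: RemoteMcpPolicy,
): { allowed: boolean; reasons: string[] } {
  const reasons: string[] = [];
  if (policy.denyUntrusted && tool.trustTier === "untrusted") reasons.push("trust-tier");
  if (policy.denyCrossTenant && tool.tenantScope === "cross-tenant") reasons.push("tenant-scope");
  if (policy.denyHighPromptInjectionRisk && tool.promptInjectionRisk === "high") {
    reasons.push("prompt-injection-risk");
  }
  // Any scope outside the allowlist is enough to deny the tool.
  const extraScopes = tool.oauthScopes.filter((s) => policy.allowedOauthScopes.indexOf(s) === -1);
  if (extraScopes.length > 0) reasons.push("oauth-scope");
  return { allowed: reasons.length === 0, reasons };
}
```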

0.0.187

Upgraded serveA2aHttp(runtime) so the A2A bridge now publishes richer agent-card metadata, supports tasks/list and tasks/subscribe, and accepts newer compatibility aliases such as SendMessage, GetTask, ListTasks, CancelTask, and SubscribeToTask without changing the stable runtime-owned session/request model.
Extended runtime MCP governance metadata so MCP server catalogs and inline MCP configs can now declare trust tier, access mode, tenant scope, approval policy, prompt-injection risk, labels, and OAuth scopes, with those fields projected into runtime governance bundles for operator inspection and approval reasoning.
Added thin operator CLI inspection commands: agent-harness runtime health, agent-harness runtime approvals list|watch, and agent-harness runtime requests list|tail, so persisted runtime health, approval queues, and active request state can be inspected without writing ad hoc SDK glue.
Added runtime observability defaults under runtime/default.observability.tracing, including exporter metadata and propagation mode that now flow into frozen runtimeSnapshot.tracing metadata for downstream correlation and export-aware operator tooling.
Extended serveAgUiHttp(runtime) so the AG-UI SSE surface now projects upstream thinking, step progress, and tool-call lifecycle events in addition to request and text messages, making the bridge usable for richer UI state and agent-activity rendering.
Promoted session and request to the canonical runtime-facing terminology in contracts and docs, while keeping thread and run aliases available for compatibility inside the existing runtime implementation.
Extended public memory inputs so memorize(...), recall(...), and listMemories(...) now accept sessionId and requestId alongside the legacy thread/run identifiers.
Added request-first public inspection aliases so listRequestEvents(runtime, { sessionId, requestId }) and exportRequestPackage(runtime, input) now project runtime lifecycle data as request.* events with sessionId and requestId fields, while legacy listRunEvents(...) and exportRunPackage(...) aliases remain available for compatibility.
Updated the public helper-layer listener and cancellation surfaces so subscribe(...), request(..., { listeners }), replayEvaluationBundle(...), and cancelRequest(...) all return session/request-shaped records instead of leaking legacy thread/run field names.
Extended runtime resource loading so runtime.spec.resources can attach multiple direct resource folders, not only package roots that nest resources/. file: resource locators now also accept direct resource-folder targets with their own package.json.
Fixed resource isolation snapshot rebuilds so live polling no longer rebuilds nested skill/reference trees in place under one shared cache directory. Refreshes now publish a new complete snapshot and keep prior snapshots readable for in-flight resource reads. Issue #33
Fixed isolated tool({...}) resource modules so their model-facing tool schemas now preserve required fields like query during runtime tool binding, instead of degrading to empty passthrough schemas when exported from isolated resource packages. Issue #32
Fixed runtime model-facing tool binding for raw-shape tool schemas such as schema: { query: z.string() }, so these tools keep their required object fields instead of being downgraded to passthrough schemas during resolution. Issue #32
Extended strict tool-call recovery so runtime invoke/stream retries also handle structured tool-argument validation failures such as missing required single-field args, not only malformed JSON parse errors. Issue #32
Fixed local function tool replay so single-field schemas such as { query: z.string() } can recover from malformed scalar tool-call arguments instead of silently reaching execution as {} and failing on missing required fields. Issue #32
Fixed local tool({...}) function tools so runtime governance and tool-execution policy now recognize module-defined schemas as schema-bound metadata without requiring duplicated YAML inputSchema.ref, matching the execution path already used by loaded tool modules. Issue #31
Added a minimal public createAcpServer(runtime) JSON-RPC adapter that maps session, request, approval, artifact, and runtime-event flows onto the existing harness persistence records instead of inventing a second protocol state model.
Added public listArtifacts(runtime, { sessionId, requestId }) and getArtifact(runtime, { sessionId, requestId, artifactPath }) helpers so external protocol adapters and operator surfaces can read persisted runtime artifacts without reaching into internal persistence classes.
Added public exportEvaluationBundle(runtime, input) so CI, offline evaluators, and operator tooling can export one stable package of session/request projections, transcript, events, artifacts, and current runtime health without bypassing harness persistence.
Added replayEvaluationBundle(runtime, { bundle, ... }) so exported evaluation packages can be replayed against the stable runtime surface instead of requiring custom harness-private replay scripts.
Added session-scoped LangChain filesystem continuity via config.filesystem.sessionStorage, including per-session runnable cache keys and session-root path rendering such as {runtimeRoot}/sessions/{sessionId}/filesystem.
Extended frozen runtimeSnapshot inspection records and policy-engine decisions with runtime governance bundles that summarize tool approval requirements and coarse risk categories without changing execution semantics.
Added serveAcpStdio(runtime) plus agent-harness acp serve --transport stdio so ACP clients can connect over newline-delimited JSON-RPC stdio instead of only using the in-process adapter.
Added serveAcpHttp(runtime) plus agent-harness acp serve --transport http so ACP clients can connect over HTTP JSON-RPC with SSE runtime events when stdio embedding is not the right deployment shape.
Added serveA2aHttp(runtime) plus agent-harness a2a serve so external agent platforms can discover an agent card and use minimal A2A-style JSON-RPC task submission, polling, and cancellation over the stable runtime surface.
Added serveAgUiHttp(runtime) plus agent-harness ag-ui serve so UI clients can consume a minimal AG-UI-compatible HTTP SSE bridge projected from the existing runtime request/event surface.
Added createRuntimeMcpServer(runtime) plus serveRuntimeMcpOverStdio(runtime) and agent-harness runtime-mcp serve so stateful runtime inspection and approval/export workflows are available as MCP tools instead of only through local SDK calls.
Added runtime-default governance YAML support under runtime/default.governance, including deny rules and tool-policy overrides that feed the existing governance snapshot and policy gate.
Added public request-event and evidence-package helpers so operator tooling can export stable request/session evidence packages without reaching into persistence internals or reimplementing transcript/artifact joins.
Extended runtime/default.governance with remoteMcp policy templates so products can deny or allow specific MCP servers, require approval by transport, and surface transport-based MCP risk tiers inside runtime governance bundles.
Added ACP regression coverage for request submission, session/request lookup, approval resolution, event notifications, and artifact lookup through the new server adapter surface.
Added regression coverage for runtime MCP server tools and CLI serving, AG-UI HTTP transport, ACP HTTP transport, request/session package export, remote MCP governance bundles, policy-engine remote MCP denial, evaluation export packaging, session-scoped filesystem continuity, runtime snapshot governance bundles, and policy-engine governance bundle aggregation.
Added public flow-inspection utilities buildFlowGraph(...) and exportFlowGraphToMermaid(...) so products can turn persisted runtime events plus optional upstream step projections into structured flow graphs or Mermaid exports.
Added exportFlowGraphToSequenceMermaid(...) so the same inspection graph can also be rendered as a Mermaid sequence diagram for runtime message traces.
Extended the flow-inspection utilities to surface best-effort delegation transitions from raw upstream task events, including explicit flowchart delegation nodes and separate sequence-diagram participants for delegated agents.
Changed Mermaid exporters to default to a product view that keeps only user-defined agent, sub-agent, model, tool, and skill paths visible, with view: "debug" available for runtime-heavy inspection output.
Improved best-effort delegation attribution in flow graph construction by recognizing delegated agent names embedded in nested upstream AI-message payloads, reducing missed subagent ownership when raw listener events omit a direct agent field.
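As a rough illustration of the flowchart export and the product versus debug views described above, the following sketch renders a tiny graph as Mermaid flowchart text. The FlowNode, FlowEdge, and FlowGraph shapes and the toMermaidFlowchart helper are hypothetical names invented for this example. They are not the shapes returned by buildFlowGraph(...) or the output of exportFlowGraphToMermaid(...), which may differ.

```typescript
// Hypothetical shapes for illustration only; the real graph returned by
// buildFlowGraph(...) may look different.
type FlowNode = {
  id: string;
  label: string;
  kind: "agent" | "model" | "tool" | "skill" | "runtime";
};
type FlowEdge = { from: string; to: string; label?: string };
type FlowGraph = { nodes: FlowNode[]; edges: FlowEdge[] };

// Render Mermaid flowchart text. The "product" view drops runtime-internal
// nodes and any edge touching them; the "debug" view keeps everything.
function toMermaidFlowchart(
  graph: FlowGraph,
  view: "product" | "debug" = "product",
): string {
  const nodes =
    view === "debug"
      ? graph.nodes
      : graph.nodes.filter((node) => node.kind !== "runtime");
  const visible = new Set(nodes.map((node) => node.id));
  const lines: string[] = ["flowchart TD"];
  for (const node of nodes) {
    // Quote labels so spaces and punctuation stay valid in Mermaid.
    lines.push(`  ${node.id}["${node.label.replace(/"/g, "'")}"]`);
  }
  for (const edge of graph.edges) {
    if (!visible.has(edge.from) || !visible.has(edge.to)) continue;
    lines.push(
      edge.label
        ? `  ${edge.from} -->|${edge.label}| ${edge.to}`
        : `  ${edge.from} --> ${edge.to}`,
    );
  }
  return lines.join("\n");
}

const exampleGraph: FlowGraph = {
  nodes: [
    { id: "planner", label: "Planner", kind: "agent" },
    { id: "checkpoint", label: "Checkpoint write", kind: "runtime" },
    { id: "search", label: "search_docs", kind: "tool" },
  ],
  edges: [
    { from: "planner", to: "checkpoint" },
    { from: "planner", to: "search", label: "calls" },
  ],
};

const productMermaid = toMermaidFlowchart(exampleGraph);
const debugMermaid = toMermaidFlowchart(exampleGraph, "debug");
```

In this sketch the product view contains only the planner and search_docs nodes, while the debug view also keeps the checkpoint node and its edge.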
Added regression coverage for ACP stdio transport, evaluation export and replay packaging, session-scoped filesystem continuity, runtime snapshot governance bundles, and policy-engine governance bundle aggregation.
Added stable public listMemories(runtime, input?), updateMemory(runtime, input), and removeMemory(runtime, input) helpers so products can inspect, curate, and delete learned durable knowledge without importing internal memory-store modules.
Made public memory admin operations rebuild runtime-managed structured projections and semantic vector indexes after updates or removals, keeping MemoryRecord inspection consistent with recall behavior.
Changed workspace config discovery so every YAML document under config/** is now loaded recursively and identified by object kind plus metadata.name or id, instead of relying on special filenames such as models.yaml or special subfolders such as config/agents/.
Kept the documented config/catalogs/* and config/agents/* layouts as recommended organization only, while allowing equivalent Agent, Models, EmbeddingModels, and other config objects anywhere under config/.
Added regression coverage for nested config YAML discovery, .yml recursive loading, and continued rejection of root-level YAML files outside config/ as workspace config sources.
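The discovery rule above can be pictured as an index keyed by kind and name rather than by file path. The sketch below works on already-parsed documents and is only an approximation under stated assumptions: the LoadedDoc shape, the indexConfigObjects helper, the preference for metadata.name over id, the error on duplicates, and the silent skipping of files outside config/ are all hypothetical choices made for this example, not confirmed harness behavior.

```typescript
// Hypothetical minimal shape of one parsed YAML document.
type ConfigObject = {
  kind: string;
  id?: string;
  metadata?: { name?: string };
};
type LoadedDoc = { file: string; object: ConfigObject };

// Index documents by "<kind>/<name>", wherever they live under config/.
function indexConfigObjects(docs: LoadedDoc[]): Map<string, LoadedDoc> {
  const index = new Map<string, LoadedDoc>();
  for (const doc of docs) {
    // Assumption: files outside config/ are skipped rather than raising.
    if (!doc.file.startsWith("config/")) continue;
    // Assumption: metadata.name wins over id when both are present.
    const name = doc.object.metadata?.name ?? doc.object.id;
    if (!name) {
      throw new Error(`${doc.file}: object needs metadata.name or id`);
    }
    const key = `${doc.object.kind}/${name}`;
    const existing = index.get(key);
    if (existing) {
      // Assumption: duplicate identities are treated as an error.
      throw new Error(`Duplicate ${key} in ${existing.file} and ${doc.file}`);
    }
    index.set(key, doc);
  }
  return index;
}

const configIndex = indexConfigObjects([
  { file: "config/models.yaml", object: { kind: "Models", metadata: { name: "default" } } },
  { file: "config/team/research/planner.yml", object: { kind: "Agent", id: "planner" } },
  { file: "agents.yaml", object: { kind: "Agent", id: "stray" } },
]);
```

Here the nested planner.yml document is indexed as Agent/planner even though it does not live in config/agents/, while the root-level agents.yaml is not treated as a config source.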
Added LangMem-style runtime memory formation defaults with explicit formation.hotPath and formation.background policy so durable memory can be captured during the run or reflected after request completion and approval resolution through one harness-owned manager path.
Added a runtime-owned memory manager layer with rules and model strategies so memory candidates can be normalized or rejected before they enter durable storage.
Added background runtime memory reflection that writes structured episodic records from completed requests instead of limiting post-request memory work to markdown-only summaries.
Extended runtime-owned durable memory organization to rebuild structured projections for user- and project-scope memories in the same way as for session, agent, and workspace scopes.
Switched the default durable store preset from file-backed JSON to SqliteStore, so runtime-owned long-term memory now defaults to SQLite-backed storage instead of store.json.
Added optional mem0-backed semantic recall augmentation so recall(...) and prompt memory assembly can blend mem0 search hits into the stable MemoryRecord[] contract without changing SQLite-backed canonical durable storage.
Switched runtime durable memory recall to an embedding-first path backed by the configured vector store, while keeping SQLite-backed structured MemoryRecord storage as canonical truth and lexical scoring as a fail-open fallback.
Added runtime-managed vector indexing for active durable memory records so canonical memory writes now rebuild the semantic recall substrate automatically.
Kept mem0 and Qdrant as optional integrations instead of default runtime dependencies, preserving the zero-sidecar default path based on SQLite + sqlite vector.
Added stable public memorize(runtime, input) and recall(runtime, input) helpers so applications can use runtime-owned durable memory without importing internal runtime/harness/system/* modules.
Upgraded the runtime and example resource packages to zod@^4 so local tool({...}) raw-shape schemas execute correctly inside isolated resource packages without mixed zod3/zod4 parser failures.
Added regression coverage for isolated resource tools that declare zod4 raw-shape validators and then execute through the runtime tool path, matching the failure mode reported in issue #29.
Exposed stable public memory contracts including MemoryRecord, MemoryDecision, MemorizeInput, and RecallInput, while keeping consolidation, maintenance, and storage layout runtime-managed.
Extended runtime memory persistence to record workspaceId, userId, and projectId provenance on durable records so public recall can filter stable memory scopes without reaching into internal store details.
Added first-phase structured runtime memory records so durable memory candidates now persist as typed MemoryRecord and MemoryDecision objects in the runtime store instead of only human-readable digest markdown.
Added canonical-key indexes for structured durable memory records so later consolidation and retrieval work can match persisted knowledge through a stable runtime-owned identity layer.
Added first-phase runtime memory decisions so matching candidates can now store, refresh, merge, or enter review based on exact content, canonical identity, and source reference overlap.
Added lightweight runtime memory consolidation that can mark stale records, archive duplicate active records, and rebuild prompt-oriented projections from the current structured memory state.
Upgraded runtime memory retrieval to rank active structured records by relevance, scope, freshness, and confidence instead of relying only on the older digest-only prompt assembly path.
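One way to picture ranking by relevance, scope, freshness, and confidence is a weighted score over active records. The sketch below is not the runtime's actual scoring: the ScoredMemory shape, the rankMemories helper, the weights, and the seven-day freshness half-life are all assumptions chosen for illustration.

```typescript
// Hypothetical record shape; the public MemoryRecord contract may differ.
type ScoredMemory = {
  id: string;
  status: "active" | "stale" | "archived";
  relevance: number;  // 0..1 similarity to the current request
  scopeMatch: number; // 0..1 closeness of the record scope to the request
  confidence: number; // 0..1
  updatedAt: number;  // epoch milliseconds
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Rank active records by a weighted blend of the four signals.
function rankMemories(
  records: ScoredMemory[],
  now: number,
  limit: number,
): { id: string; score: number }[] {
  const halfLifeMs = 7 * DAY_MS; // assumption: freshness halves every 7 days
  return records
    .filter((record) => record.status === "active")
    .map((record) => {
      const ageMs = Math.max(0, now - record.updatedAt);
      const freshness = Math.pow(0.5, ageMs / halfLifeMs);
      // Assumed weights; they sum to 1 so scores stay within 0..1.
      const score =
        0.5 * record.relevance +
        0.2 * record.scopeMatch +
        0.15 * freshness +
        0.15 * record.confidence;
      return { id: record.id, score };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const rankNow = 14 * DAY_MS;
const rankedMemories = rankMemories(
  [
    { id: "fresh-exact", status: "active", relevance: 0.9, scopeMatch: 1, confidence: 0.8, updatedAt: rankNow },
    { id: "week-old", status: "active", relevance: 0.9, scopeMatch: 0.5, confidence: 0.8, updatedAt: rankNow - 7 * DAY_MS },
    { id: "stale-one", status: "stale", relevance: 1, scopeMatch: 1, confidence: 1, updatedAt: rankNow },
    { id: "off-topic", status: "active", relevance: 0.2, scopeMatch: 1, confidence: 1, updatedAt: rankNow },
  ],
  rankNow,
  2,
);
```

With these inputs the stale record is excluded regardless of its raw signals, and the two highest-scoring active records are returned in order.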
Added normalizeUserChatInput(...) so products can project one chat-style user turn onto the stable request(..., { input, invocation }) surface without introducing a second harness-owned chat API.
Clarified that normal user-visible multimodal chat content belongs in input, while invocation.attachments is for auxiliary invocation-scoped attachment payloads rather than the primary chat-content path.
Added a multimodal image-chat example plus tests showing the canonical text and image_url flow and preserving that user-visible content as the source of truth for persistence and replay.
Shipped this multimodal input guidance and helper for issue #27.
Added structured syntheticFallback continuity metadata to stream results when the runtime recovers from streaming into invokeWithHistory, so downstream products can distinguish recovered versus failed stream-to-invoke transitions.
Emitted the same structured fallback payload on runtime.synthetic_fallback events instead of only a flat reason string, keeping stream continuity inspection aligned across events and final request results.
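A downstream product might branch on the structured fallback payload roughly as sketched below. The field names inside SyntheticFallback (from, to, reason, recovered) and the classifyContinuity helper are assumptions made for this illustration; the notes above only state that the payload is structured and lets consumers tell recovered transitions from failed ones.

```typescript
// Hypothetical payload shape; only the syntheticFallback name comes from
// the release notes, the inner fields are assumed.
type SyntheticFallback = {
  from: "stream";
  to: "invokeWithHistory";
  reason: string;
  recovered: boolean;
};
type StreamResultLike = { output?: string; syntheticFallback?: SyntheticFallback };

type Continuity = "streamed" | "recovered" | "failed";

// Classify how a request finished relative to its original stream.
function classifyContinuity(result: StreamResultLike): Continuity {
  const fallback = result.syntheticFallback;
  if (!fallback) return "streamed"; // no fallback happened
  return fallback.recovered ? "recovered" : "failed";
}

const recoveredExample = classifyContinuity({
  output: "final answer",
  syntheticFallback: {
    from: "stream",
    to: "invokeWithHistory",
    reason: "stream ended without a final message",
    recovered: true,
  },
});
```

Because the same payload is described as appearing on runtime.synthetic_fallback events, a helper of this kind could in principle be shared between event listeners and final-result handling.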

0.0.146

Added stable request-level tracing metadata to frozen runtimeSnapshot records so downstream products can correlate one persisted request with external logs or tracing systems using a runtime-owned correlationId.
Limited snapshot tracing fields to backend-neutral inspection data such as enabled, correlationId, generic tags, and runtime-owned metadata instead of exposing provider API keys, raw spans, or backend-native tracing handles.
Added runtime snapshot coverage so new requests persist tracing correlation data at startup.
Added runtime approval defaults for sensitive durable memory writes and write-like MCP tools so high-risk persistence and remote side effects can pause on the existing approval surface even when tool YAML omits explicit hitl.enabled.
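The request-level tracing fields in this release are meant for correlation with external logs. A minimal sketch of that use follows, assuming a snapshot tracing object with the enabled, correlationId, tags, and metadata fields named above. The exact types, the SnapshotTracing name, and the logFieldsFor helper are hypothetical.

```typescript
// Assumed shape of the tracing section of a frozen runtimeSnapshot.
type SnapshotTracing = {
  enabled: boolean;
  correlationId?: string;
  tags?: string[];
  metadata?: Record<string, string>;
};

// Build structured log fields that carry the runtime-owned correlationId,
// so external log lines can be joined back to one persisted request.
function logFieldsFor(
  tracing: SnapshotTracing | undefined,
  message: string,
): Record<string, unknown> {
  const base: Record<string, unknown> = { message };
  if (!tracing || !tracing.enabled || !tracing.correlationId) {
    return base; // tracing disabled or missing: log without correlation
  }
  return {
    ...base,
    correlationId: tracing.correlationId,
    tags: tracing.tags ?? [],
  };
}

const tracedLogFields = logFieldsFor(
  { enabled: true, correlationId: "corr-123", tags: ["checkout"] },
  "tool call finished",
);
```

Only backend-neutral values are copied into the log fields here, which matches the intent of keeping provider keys and raw spans out of the snapshot.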

0.0.144

Added LangMem-aligned memory kind normalization so runtime memory now rolls candidate knowledge into semantic, episodic, and procedural durable digests instead of keeping only one harness-specific summary bucket.
Added namespace-template-based runtime memory routing with defaults such as memories/sessions/{sessionId} and memories/workspaces/{workspaceId} so durable memory layout aligns more closely with LangGraph and LangMem store organization.
Added ranked runtime memory retrieval that scores stored memories against the current request and limits prompt injection to the most relevant session, workspace, and agent memories before invocation.
Added runtime memory policy tests and public API coverage for taxonomy-specific durable memory persistence and retrieval.
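Namespace templates such as memories/sessions/{sessionId} can be thought of as simple placeholder substitution. The sketch below shows one plausible rendering step. The renderNamespace helper and its behavior on missing values are assumptions for illustration, not the runtime's routing code.

```typescript
// Replace each {placeholder} in a namespace template with its value.
// Assumption: a missing or empty value is an error rather than being
// rendered as an empty path segment.
function renderNamespace(
  template: string,
  ids: Record<string, string | undefined>,
): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = ids[key];
    if (value === undefined || value === "") {
      throw new Error(`Missing value for {${key}} in "${template}"`);
    }
    return value;
  });
}

const sessionNamespace = renderNamespace("memories/sessions/{sessionId}", {
  sessionId: "s-42",
});
const workspaceNamespace = renderNamespace("memories/workspaces/{workspaceId}", {
  workspaceId: "acme",
});
```

Failing loudly on a missing identifier keeps records for different sessions or workspaces from collapsing into one shared namespace by accident.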

0.0.143

Added tool-level memory eligibility so local tool({...}) definitions can declare memory.enabled plus kind, scope, and tag hints for durable runtime memory candidates.
Added memory candidate extraction from local tool execution results, including explicit memoryCandidates payload support and automatic propagation into finalized request metadata.
Added runtime persistence for tool-derived memory candidates plus session, workspace, and agent digest rollups so durable memory can be reused across later requests.
Added memory retrieval injection so later requests can receive relevant durable memory context automatically before planning and execution.
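The interplay between tool-level memory hints and an explicit memoryCandidates payload can be sketched as follows. Everything except the names memory.enabled, kind, scope, tags, and memoryCandidates is assumed: the type shapes, the collectMemoryCandidates helper, the fallback to the raw tool output, and the default kind and scope values are hypothetical.

```typescript
// Assumed shape of the memory hints declared on a local tool definition.
type MemoryHints = { enabled: boolean; kind?: string; scope?: string; tags?: string[] };
type CandidateInput = { content: string; kind?: string; scope?: string; tags?: string[] };
type MemoryCandidate = { content: string; kind: string; scope: string; tags: string[] };
type ToolResultLike = { output: string; memoryCandidates?: CandidateInput[] };

// Collect durable-memory candidates from one tool execution result.
function collectMemoryCandidates(
  hints: MemoryHints | undefined,
  result: ToolResultLike,
): MemoryCandidate[] {
  if (!hints || !hints.enabled) return []; // tool is not memory-eligible
  const active: MemoryHints = hints;
  const explicit = result.memoryCandidates ?? [];
  // Assumption: without explicit candidates, the tool output itself
  // becomes a single candidate.
  const inputs: CandidateInput[] =
    explicit.length > 0 ? explicit : [{ content: result.output }];
  return inputs.map((input) => ({
    content: input.content,
    // Candidate fields win over tool-level hints, which win over defaults.
    kind: input.kind ?? active.kind ?? "semantic",
    scope: input.scope ?? active.scope ?? "session",
    tags: input.tags ?? active.tags ?? [],
  }));
}

const collectedCandidates = collectMemoryCandidates(
  { enabled: true, kind: "semantic", scope: "workspace", tags: ["docs"] },
  {
    output: "Found 3 files",
    memoryCandidates: [{ content: "Repo uses pnpm", kind: "procedural" }],
  },
);
```

In this example the explicit candidate keeps its own kind and inherits scope and tags from the tool-level hints.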

0.0.140

Added a runtime-owned RuntimeMemorySync projection so durable memory summaries can be written automatically on request completion and approval resolution instead of relying only on session status projections.
Added a configurable runtime memory hook parser for ingestion.writeOnRequestCompletion, ingestion.writeOnApprovalResolution, ingestion.backgroundConsolidation, and ingestion.maxMessagesPerRequest.
Added durable run summary records and optional session digest rollups in the runtime memory store so the first memory lifecycle defaults now have a concrete runtime implementation.
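A hook parser for the ingestion keys listed above might normalize raw configuration along these lines. The key names come from these notes. The IngestionPolicy type, the parseIngestion helper, and every default value are assumptions for illustration only, so the real defaults may differ.

```typescript
type IngestionPolicy = {
  writeOnRequestCompletion: boolean;
  writeOnApprovalResolution: boolean;
  backgroundConsolidation: boolean;
  maxMessagesPerRequest: number;
};

// Assumed defaults, chosen only so the sketch has something to fall back to.
const ASSUMED_INGESTION_DEFAULTS: IngestionPolicy = {
  writeOnRequestCompletion: true,
  writeOnApprovalResolution: true,
  backgroundConsolidation: false,
  maxMessagesPerRequest: 40,
};

// Normalize an untrusted raw config value into a complete policy,
// replacing missing or invalid entries with the assumed defaults.
function parseIngestion(raw: unknown): IngestionPolicy {
  const source = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const readBoolean = (key: string, fallback: boolean): boolean => {
    const value = source[key];
    return typeof value === "boolean" ? value : fallback;
  };
  const max = source["maxMessagesPerRequest"];
  return {
    writeOnRequestCompletion: readBoolean(
      "writeOnRequestCompletion",
      ASSUMED_INGESTION_DEFAULTS.writeOnRequestCompletion,
    ),
    writeOnApprovalResolution: readBoolean(
      "writeOnApprovalResolution",
      ASSUMED_INGESTION_DEFAULTS.writeOnApprovalResolution,
    ),
    backgroundConsolidation: readBoolean(
      "backgroundConsolidation",
      ASSUMED_INGESTION_DEFAULTS.backgroundConsolidation,
    ),
    maxMessagesPerRequest:
      typeof max === "number" && Number.isInteger(max) && max > 0
        ? max
        : ASSUMED_INGESTION_DEFAULTS.maxMessagesPerRequest,
  };
}

const parsedIngestion = parseIngestion({
  writeOnRequestCompletion: false,
  maxMessagesPerRequest: -3, // invalid, falls back to the assumed default
});
```

Returning a fully populated policy means the code that writes summaries on request completion or approval resolution never has to check for missing keys.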

0.0.137

Added stable runtime inspection fields for downstream products and operator UIs, including currentAgentId, delegationChain, startedAt, endedAt, and lastActivityAt on persisted request and session inspection records.
Added a frozen per-request runtimeSnapshot so downstream renderers can show the exact model, tools, skills, and memory configuration used by one request instead of reading mutable current inventory.
Updated streaming inspection so delegation-related upstream events can advance the persisted current-agent and delegation-chain projection without inventing a second execution protocol.
Shipped this runtime inspection expansion for issue #25.
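An operator UI could turn the new inspection fields into a one-line status roughly as sketched below. The field names come from these notes, but their types (ISO 8601 strings, an array of agent ids, an optional endedAt) and the summarizeInspection helper are assumptions for illustration.

```typescript
// Assumed types for the inspection fields named in the release notes.
type RequestInspectionLike = {
  currentAgentId: string;
  delegationChain: string[];
  startedAt: string;      // ISO 8601 timestamp (assumed)
  endedAt?: string;       // absent while the request is still running (assumed)
  lastActivityAt: string; // ISO 8601 timestamp (assumed)
};

// Summarize who is handling the request and how long it has been active.
function summarizeInspection(record: RequestInspectionLike): string {
  // For running requests, measure elapsed time up to the last activity.
  const end = record.endedAt ?? record.lastActivityAt;
  const elapsedMs = Date.parse(end) - Date.parse(record.startedAt);
  const chain =
    record.delegationChain.length > 0
      ? record.delegationChain.join(" -> ")
      : record.currentAgentId;
  const state = record.endedAt ? "finished" : "running";
  return `${chain} (${state}, ${Math.round(elapsedMs / 1000)}s)`;
}

const inspectionSummary = summarizeInspection({
  currentAgentId: "researcher",
  delegationChain: ["orchestrator", "researcher"],
  startedAt: "2024-01-01T00:00:00.000Z",
  lastActivityAt: "2024-01-01T00:00:42.000Z",
});
```

For this input the summary reads "orchestrator -> researcher (running, 42s)".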