CortexPrism v0.51.0 — Agent Autonomy: Runtime Tool Forging, Multi-Agent Orchestration, HEXACO Personalities

jacob · June 23, 2026 · 7 min read

CortexPrism v0.51.0 is here: the agent autonomy release. Instead of only executing predefined tools, agents can now create their own tools at runtime, orchestrate multi-agent workflows, and express distinct personalities through a HEXACO-based personality system. The release also adds a standardized memory benchmark runner, brings the built-in agent roster to 10, and ships a checkpoint time-travel UI and streamlined navigation.

Runtime Tool Forging — Agents Create Tools

The headline feature in v0.51.0 is runtime tool forging. Agents can now create, test, and export custom tools on demand through four new built-in tools:

  • tool_forge takes a name, description, and TypeScript code. It runs a static safety scan against unsafe patterns, optionally calls an LLM security judge, executes pure-compute code in a Deno Worker (sandboxed) or shell-touching code in the Docker sandbox, and registers the result in a session-scoped forged-tool registry.

  • forged_call invokes a previously forged tool by name with arbitrary arguments.

  • tool_export promotes a forged tool to the persistent skills system (lifecycle: candidate) so it survives across sessions.

  • tool_list_forged lists all forged tools registered in the current session.

This enables adaptive tooling: when an agent encounters a task that no existing tool handles well, it can create a specialized tool for that task, test it, and if successful, export it as a reusable skill. The system includes comprehensive safety measures: static pattern scanning, optional LLM security review, sandboxed execution, and session-scoped isolation.
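To make the first and last of those safety layers concrete, here is a minimal sketch of a static pattern scan feeding a session-scoped registry. It is not CortexPrism's implementation: the names `scanForUnsafePatterns` and `SessionToolRegistry`, the `ForgedTool` shape, and the deny-list itself are all assumptions made for illustration.

```typescript
// Hypothetical sketch: none of these names or patterns come from CortexPrism.
interface ForgedTool {
  name: string;
  description: string;
  code: string;
}

// Illustrative deny-list; a real scanner would need a far more complete set.
const UNSAFE_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "dynamic eval", pattern: /\beval\s*\(/ },
  { label: "Function constructor", pattern: /\bnew\s+Function\s*\(/ },
  { label: "subprocess spawn", pattern: /\bDeno\.(run|Command)\b/ },
  { label: "environment access", pattern: /\bDeno\.env\b/ },
];

// Returns the label of every unsafe pattern found in the code.
function scanForUnsafePatterns(code: string): string[] {
  return UNSAFE_PATTERNS
    .filter(({ pattern }) => pattern.test(code))
    .map(({ label }) => label);
}

// One registry instance per session keeps forged tools isolated between sessions.
class SessionToolRegistry {
  private tools = new Map<string, ForgedTool>();

  // Registers the tool only if the static scan finds nothing.
  forge(tool: ForgedTool): { ok: boolean; findings: string[] } {
    const findings = scanForUnsafePatterns(tool.code);
    if (findings.length > 0) return { ok: false, findings };
    this.tools.set(tool.name, tool);
    return { ok: true, findings: [] };
  }

  // Names of all tools forged in this session.
  list(): string[] {
    return Array.from(this.tools.keys());
  }
}
```

A regex deny-list like this is only a cheap first filter, which is presumably why the release pairs it with an optional LLM review and sandboxed execution.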

Multi-Agent Orchestration — 6 Composable Strategies

A single orchestrate tool now exposes six composable multi-agent execution strategies, all backed by the spawnSubAgent system:

  • Sequential chains agents; each receives the previous agent's output as context. Perfect for multi-step workflows where each step builds on the previous.

  • Parallel runs agents concurrently via Promise.allSettled; a synthesiser agent merges outputs. Ideal for gathering multiple perspectives or approaches simultaneously.

  • Debate assigns N agents to argue positions for R rounds; an impartial judge synthesises the final answer. Excellent for exploring multiple viewpoints and reaching consensus.

  • Review-loop has a writer agent draft and a reviewer agent critique, iterating up to max_iterations times until the reviewer emits an approval keyword. Perfect for iterative refinement.

  • Hierarchical uses a coordinator agent to decompose the task, worker agents execute sub-tasks in parallel, and the coordinator synthesises results. Great for complex project coordination.

  • Graph accepts user-defined DAGs of {id, task, dependsOn[]} nodes with topological execution and dependency context injection. Maximum flexibility for custom workflows.
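The Graph strategy is the most algorithmic of the six, so a small sketch may help. The node shape `{id, task, dependsOn[]}` comes from the description above; everything else, including the function names `topologicalOrder` and `executeGraph` and the synchronous `run` callback standing in for a sub-agent call, is an assumption. A real orchestrator would be asynchronous and could run independent nodes in parallel.

```typescript
// Node shape taken from the post; the rest is a hypothetical sketch.
interface GraphNode {
  id: string;
  task: string;
  dependsOn: string[];
}

// Kahn's algorithm: returns node ids so that every node comes after all of
// its dependencies. Throws on unknown dependency ids or cycles.
function topologicalOrder(nodes: GraphNode[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const n of nodes) {
    indegree.set(n.id, n.dependsOn.length);
    dependents.set(n.id, []);
  }
  for (const n of nodes) {
    for (const dep of n.dependsOn) {
      const list = dependents.get(dep);
      if (!list) throw new Error(`Unknown dependency: ${dep}`);
      list.push(n.id);
    }
  }
  const ready = nodes.filter((n) => n.dependsOn.length === 0).map((n) => n.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== nodes.length) throw new Error("Cycle detected");
  return order;
}

// Runs nodes in dependency order and injects each dependency's output as
// context. `run` is a stand-in for a real sub-agent invocation.
function executeGraph(
  nodes: GraphNode[],
  run: (task: string, context: string[]) => string,
): Map<string, string> {
  const byId = new Map(nodes.map((n) => [n.id, n] as const));
  const outputs = new Map<string, string>();
  for (const id of topologicalOrder(nodes)) {
    const node = byId.get(id)!;
    const context = node.dependsOn.map((dep) => outputs.get(dep)!);
    outputs.set(id, run(node.task, context));
  }
  return outputs;
}
```

Rejecting cycles up front matters here: without that check, a malformed graph would simply never schedule some nodes.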

HEXACO Personality System

Agents can now be configured with a six-factor HEXACO personality (honesty, emotionality, extraversion, agreeableness, conscientiousness, openness — each ∈ [0, 1]). This scientifically-grounded personality model drives:

  • System prompt injection: buildPersonalityPrompt() generates a natural-language paragraph describing the agent's voice, honesty, emotional tone, extraversion, agreeableness, conscientiousness, and openness, prepended to the system prompt on every turn.

  • Memory retrieval bias: getMemoryBiasWeights() returns per-tier multipliers (episodic, semantic, procedural, preference) and BM25/vector balance weights derived from personality scores.

  • Response style hints: buildResponseStyleHints() produces brief post-processing nudges (structured output, warmth, perspective acknowledgement, creative alternatives).

  • MQM routing hints: getMqmPersonalityHints() returns accuracyWeight, creativityWeight, and preferFast signals for the Model Quartermaster.

The personality field is optional on AgentConfig; absent or neutral scores (0.5) produce no change in behavior.
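As a rough illustration of how scores in [0, 1] could turn into prompt text while leaving neutral agents untouched, consider the sketch below. It is deliberately named `sketchPersonalityPrompt` to avoid implying it matches the real buildPersonalityPrompt(). The field names follow the six factors listed above, but the descriptor wording and the ±0.1 neutral band are assumptions.

```typescript
// Field names follow the six factors in the post; thresholds and wording
// are illustrative assumptions, not CortexPrism's actual logic.
interface Hexaco {
  honesty: number;
  emotionality: number;
  extraversion: number;
  agreeableness: number;
  conscientiousness: number;
  openness: number;
}

const NEUTRAL = 0.5;
const NEUTRAL_BAND = 0.1; // assumed: scores within 0.5 +/- 0.1 add nothing

const DESCRIPTORS: Record<keyof Hexaco, { high: string; low: string }> = {
  honesty: { high: "candid and transparent", low: "diplomatic and guarded" },
  emotionality: { high: "empathetic and expressive", low: "calm and detached" },
  extraversion: { high: "energetic and talkative", low: "reserved and concise" },
  agreeableness: { high: "patient and accommodating", low: "blunt and critical" },
  conscientiousness: { high: "methodical and thorough", low: "flexible and fast-moving" },
  openness: { high: "curious and inventive", low: "conventional and pragmatic" },
};

// Returns a prompt fragment, or "" when the personality is absent or neutral.
function sketchPersonalityPrompt(p?: Partial<Hexaco>): string {
  if (!p) return "";
  const traits: string[] = [];
  for (const key of Object.keys(DESCRIPTORS) as (keyof Hexaco)[]) {
    const score = p[key];
    if (score === undefined) continue;
    if (score < 0 || score > 1) throw new RangeError(`${key} must be in [0, 1]`);
    if (score > NEUTRAL + NEUTRAL_BAND) traits.push(DESCRIPTORS[key].high);
    else if (score < NEUTRAL - NEUTRAL_BAND) traits.push(DESCRIPTORS[key].low);
  }
  return traits.length === 0 ? "" : `Personality: you are ${traits.join("; ")}.`;
}
```

Returning an empty string for a missing or all-neutral personality is what keeps the feature opt-in: the system prompt is byte-for-byte unchanged unless a score moves away from 0.5.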

Memory Benchmark Runner — LongMemEval Compatible

Memory systems need standardized evaluation. v0.51.0 introduces a comprehensive benchmarking subsystem:

  • Core runner (src/eval/memory-bench.ts) supports configurable concurrency, token-overlap + Jaccard scoring, per-category aggregation, and JSON persistence to ~/.cortex/data/memory_bench_results.json and memory_bench_history.json.

  • CLI: the cortex eval memory command supports --suite <file>, --sample <n>, --full, and --json flags.

  • REST API: GET /api/eval/memory/results, GET /api/eval/memory/history, POST /api/eval/memory/run.

  • Web UI: a new Memory Benchmark page with summary stat cards, per-category accuracy bar chart, per-question result table, and historical run trend table. One-click ▶ Run Benchmark button triggers a live run via the API.

  • CI workflow: .github/workflows/memory-bench.yml runs the benchmark weekly (Monday 06:00 UTC) and on manual dispatch; results are uploaded as a GitHub Actions artifact and summarised in the job step summary.

The benchmark is compatible with LongMemEval standards, enabling cross-system comparison of memory performance.
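For readers unfamiliar with the two metrics named above, the sketch below shows one common way to compute token overlap and Jaccard similarity between a generated answer and a reference answer. The post does not describe the tokenizer or exact formulas in src/eval/memory-bench.ts, so the ASCII-only tokenizer, the recall-style definition of overlap, and the function names are all assumptions.

```typescript
// Hypothetical scoring helpers; the real runner's formulas may differ.

// Lowercases and splits on non-alphanumerics (ASCII only, for simplicity).
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Number of tokens present in both sets.
function intersectionSize(a: Set<string>, b: Set<string>): number {
  return Array.from(a).filter((token) => b.has(token)).length;
}

// Jaccard similarity: |A intersect B| / |A union B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  const shared = intersectionSize(a, b);
  return shared / (a.size + b.size - shared);
}

// Token overlap, defined here as the share of reference tokens that
// appear in the answer (a recall-style measure).
function tokenOverlap(answer: Set<string>, reference: Set<string>): number {
  if (reference.size === 0) return 1;
  return intersectionSize(answer, reference) / reference.size;
}

function scoreAnswer(answer: string, reference: string): { overlap: number; jaccard: number } {
  const a = tokenize(answer);
  const r = tokenize(reference);
  return { overlap: tokenOverlap(a, r), jaccard: jaccard(a, r) };
}
```

The two numbers complement each other: overlap rewards an answer for containing the reference facts, while Jaccard also penalizes extra tokens, so a verbose but correct answer scores high on the first and lower on the second.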

10 Built-in Agents (5 New, 5 Refined)

The agent roster now ships with 10 selectable built-in agents. Five new specialist agents join the existing five:

  • Writer ✍️ — technical documentation, changelogs, READMEs, API references

  • DevOps 🚀 — Docker, Kubernetes, Terraform, CI/CD pipelines

  • Security 🔐 — OWASP Top 10 auditing, CVE scanning, compliance review (read-only)

  • Code Reviewer 👁️ — structured BLOCKER/SUGGESTION/NITPICK/QUESTION review format (read-only)

  • QA / Tester 🧪 — test generation, coverage analysis, regression discipline

All five existing agents (Assistant, Developer, Researcher, Architect, Analyst) received deep soul rewrites adding Capabilities, Guardrails, and Limitations sections, explicit sub-agent delegation hints, and improved output format specs.

Checkpoint Time-Travel UI

The Memori page (/memori) now renders a full two-panel timeline: a session-grouped checkpoint list on the left and a rich detail view on the right. Each checkpoint shows turn number, goals, message count, tool calls, and workspace snapshot. Two action buttons — Resume here (restore the checkpoint into the current session) and Branch from here (fork into a new child session) — are available on every checkpoint. This makes it easy to experiment with different approaches from any point in a conversation.
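The difference between the two buttons can be sketched with a small data model. This is a hypothetical illustration only: the `Checkpoint` and `Session` shapes and the `resumeFrom` and `branchFrom` helpers are assumptions, loosely mirroring the fields the timeline displays rather than the real schema.

```typescript
// Hypothetical data model; the real checkpoint schema is not documented here.
interface Checkpoint {
  id: string;
  sessionId: string;
  turn: number;
  goals: string[];
  messages: string[];
}

interface Session {
  id: string;
  parentId?: string;
  messages: string[];
}

// "Resume here": restore the checkpoint's state into the current session.
function resumeFrom(session: Session, checkpoint: Checkpoint): Session {
  return { ...session, messages: [...checkpoint.messages] };
}

// "Branch from here": fork a new child session and leave the parent untouched.
function branchFrom(checkpoint: Checkpoint, newSessionId: string): Session {
  return {
    id: newSessionId,
    parentId: checkpoint.sessionId,
    messages: [...checkpoint.messages],
  };
}
```

In this model, resuming rewinds the session you are in, while branching preserves it and records the parent link so the fork can be traced back.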

Streamlined Navigation

The UI underwent significant consolidation to reduce navigation complexity:

  • Sandbox now includes a Code Runner tab (previously the standalone coderunner page)

  • Remote & Computer merges the former remote (Remote Agents) and computer (Computer Use) pages into one page with two tabs

  • MCP merges mcp (Connections) and mcp-gateway (Gateway) into one page with two tabs

  • System Health (formerly Daemons) merges daemon process monitoring and OS health metrics into one page with two tabs

  • Automation expands to a 5-tab hub: Hooks, Triggers, Workflows, Jobs, and Eval — replacing four separate nav entries

  • Extensions gains a Panels tab, absorbing the standalone Plugin Panels page

  • Activity (Lens) moved from the Knowledge category to System, where audit/observability tooling belongs

Plugin System Enhancements

The plugin system received major improvements:

  • Extensions top-nav category — plugins now have a dedicated Extensions top-nav tab (sixth tab in the header). Plugin-contributed panels appear as first-class sub-nav items under Extensions.

  • Plugin sidebar slot injection — plugins declaring ui:panel now have their panels registered in the ui-slots registry at load time. A new GET /api/plugins/slots endpoint exposes live slot registrations. Sidebar plugins are clickable and open in inline modal iframes.

  • Plugin middleware pipeline hooks — ESM plugins can now export middlewarePre and middlewarePost functions. When loaded, plugins declaring these capabilities have their functions automatically registered as pre-tool/post-tool pipeline hooks.

  • Plugin event bus wiring — the plugin event bus now receives live agent lifecycle events: agent:turn-start, tool:pre-execute, tool:post-execute, and agent:turn-end.
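To show how middlewarePre and middlewarePost functions might slot into a pre-tool/post-tool pipeline, here is a minimal sketch. The two hook names come from the release notes; the hook signatures, the `ToolPipeline` class, and the `PluginModule` shape are assumptions. In a real ESM plugin the two functions would be module exports rather than object properties.

```typescript
// Hook names come from the post; all signatures here are assumptions.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

type PreHook = (call: ToolCall) => ToolCall;
type PostHook = (call: ToolCall, output: string) => string;

// Stand-in for what a loaded ESM plugin module might expose.
interface PluginModule {
  middlewarePre?: PreHook;
  middlewarePost?: PostHook;
}

class ToolPipeline {
  private preHooks: PreHook[] = [];
  private postHooks: PostHook[] = [];

  // Registers whichever hooks the plugin provides.
  register(plugin: PluginModule): void {
    if (plugin.middlewarePre) this.preHooks.push(plugin.middlewarePre);
    if (plugin.middlewarePost) this.postHooks.push(plugin.middlewarePost);
  }

  // Pre hooks may rewrite the call; post hooks may rewrite the output.
  execute(call: ToolCall, tool: (call: ToolCall) => string): string {
    const prepared = this.preHooks.reduce((c, hook) => hook(c), call);
    const output = tool(prepared);
    return this.postHooks.reduce((out, hook) => hook(prepared, out), output);
  }
}

// Example plugin: trims string arguments and truncates long outputs.
const trimAndTruncate: PluginModule = {
  middlewarePre: (call) => {
    const args: Record<string, unknown> = {};
    for (const key of Object.keys(call.args)) {
      const value = call.args[key];
      args[key] = typeof value === "string" ? value.trim() : value;
    }
    return { ...call, args };
  },
  middlewarePost: (_call, output) =>
    output.length > 20 ? output.slice(0, 20) + "..." : output,
};
```

Running hooks in registration order keeps the behaviour predictable when several plugins are loaded, though the actual ordering rules in CortexPrism are not stated in the release notes.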

Technical Excellence

CortexPrism maintains its commitment to technical excellence:

  • Deno 2.x strict TypeScript throughout the codebase — single binary, no Docker required

  • SQLite (WAL mode) via libSQL for reliable data persistence

  • 6 workspace packages with 41 pure TypeScript contract interfaces in a clean dependency graph

  • 24 LLM providers with a unified streaming interface

  • 5-tier persistent memory with hybrid search, automatic learning, and health monitoring

  • Rigorous security — Parallax policy validator + LLM supervisor + 18 recently resolved security issues

  • Zero telemetry — everything runs on your hardware

Get Started

Ready to experience agent autonomy? Installation is simple:

# Install
curl -fsSL https://cortexprism.io/install.sh | bash

# Setup and start
cortex setup
cortex serve

# Open http://localhost:3000

Already running? Upgrade in place:

cortex self update

The project is Apache 2.0 licensed, fully open source, and has zero telemetry. Everything runs on your hardware.

GitHub: github.com/CortexPrism/cortex

Changelog: CHANGELOG.md


Built with Deno. 6 packages. Agent autonomy. Runtime tool forging. Multi-agent orchestration. HEXACO personalities. Zero telemetry.
