WebSocket Protocol

The CortexPrism WebSocket API provides real-time streaming chat, voice/audio communication, file upload, and inspection of tool calls and agent reasoning.

Connection

ws://127.0.0.1:3000/ws          # Client WebSocket
ws://127.0.0.1:3000/ws/node     # Node WebSocket (Hub ↔ Node)

Authentication: when webAuth.requireAuth is enabled, the /ws endpoint checks session cookies before upgrading connections.

Client → Server Messages

{ "type": "chat", "message": "Hello", "sessionId": "sess_abc123", "files": [...] }
{ "type": "ping" }
{ "type": "new_session" }
{ "type": "select_agent", "agentId": "agent-1" }
{ "type": "audio_chunk", "data": "<base64>" }
{ "type": "audio_end" }
{ "type": "speak", "text": "Hello world" }

Chat Message Fields

Field	Type	Required	Description
`type`	`"chat"`	Yes	Message type
`message`	string	Yes	User message text
`sessionId`	string	No	Resume existing session
`files`	array	No	Uploaded files `[{filename, mimeType, data (base64)}]`
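
As a rough illustration of the table above, a client might model the chat frame with a small type and a builder. The names `UploadedFile`, `ChatMessage`, and `buildChatMessage` are hypothetical and not part of CortexPrism; only the field names come from the table.

```typescript
// Hypothetical types mirroring the chat message fields; not an official API.
interface UploadedFile {
  filename: string;
  mimeType: string;
  data: string; // base64-encoded file contents
}

interface ChatMessage {
  type: "chat";
  message: string;
  sessionId?: string;
  files?: UploadedFile[];
}

// Build a chat frame, leaving out optional fields that were not supplied.
function buildChatMessage(
  message: string,
  sessionId?: string,
  files?: UploadedFile[]
): ChatMessage {
  const frame: ChatMessage = { type: "chat", message: message };
  if (sessionId !== undefined) frame.sessionId = sessionId;
  if (files !== undefined && files.length > 0) frame.files = files;
  return frame;
}

// Serialized form, ready to pass to a WebSocket send call.
const wire = JSON.stringify(buildChatMessage("Hello", "sess_abc123"));
```

Leaving `sessionId` out entirely, rather than sending an empty value, keeps a first message distinguishable from a resume request.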

File Upload

Files are sent as base64-encoded data over the WebSocket alongside chat messages. They are saved to the working directory and the agent workspace. Text is extracted automatically from PDFs, and images are included as multimodal content blocks for providers that support them.
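
The sketch below shows one way a client could prepare a `files` entry from raw bytes. It assumes an environment with a global `btoa` (browsers and recent Node.js versions); `toBase64` and `toUploadedFile` are illustrative helper names, and the file name and contents are made up.

```typescript
interface UploadedFile {
  filename: string;
  mimeType: string;
  data: string; // base64-encoded file contents
}

// Encode raw bytes as base64 by way of a binary string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Wrap the bytes in the {filename, mimeType, data} shape used by chat messages.
function toUploadedFile(filename: string, mimeType: string, bytes: Uint8Array): UploadedFile {
  return { filename: filename, mimeType: mimeType, data: toBase64(bytes) };
}

// Example: a two-byte text file containing "hi".
const attachment = toUploadedFile("notes.txt", "text/plain", new Uint8Array([104, 105]));
const uploadFrame = JSON.stringify({
  type: "chat",
  message: "Summarize this file",
  files: [attachment],
});
```

Base64 inflates the payload by roughly a third, so a client may want to check file sizes before sending. Any server-side size limit is not stated here.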

Server → Client Messages

{ "type": "connected" }
{ "type": "session", "sessionId": "sess_abc123" }
{ "type": "start" }
{ "type": "chunk", "delta": "Hello" }
{ "type": "reasoning", "content": "Agent is considering..." }
{ "type": "tool_call", "tool": "web_search", "args": {"query": "..."} }
{ "type": "tool_result", "tool": "web_search", "result": "..." }
{ "type": "done", "tokensIn": 100, "tokensOut": 50, "costUsd": 0.001, "durationMs": 800 }
{ "type": "error", "error": "Something went wrong" }
{ "type": "pong" }
{ "type": "audio", "data": "<base64 mp3>", "format": "mp3" }
{ "type": "voice_state", "listening": true, "enabled": true }
{ "type": "file_change", "path": "/workspace/file.ts" }
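
One way to consume this stream on the client is a reducer that folds each frame into the state of the current turn. This is only a sketch covering a subset of the message types listed above. The `TurnState` shape and the choices that `start` resets the turn and `error` ends it are assumptions, not documented behavior.

```typescript
// Subset of the server frames listed above; unknown types are ignored.
type ServerMessage =
  | { type: "start" }
  | { type: "chunk"; delta: string }
  | { type: "reasoning"; content: string }
  | { type: "done"; tokensIn: number; tokensOut: number; costUsd: number; durationMs: number }
  | { type: "error"; error: string };

// Hypothetical client-side view of one agent turn.
interface TurnState {
  text: string;
  reasoning: string[];
  finished: boolean;
  error: string | null;
}

function emptyTurn(): TurnState {
  return { text: "", reasoning: [], finished: false, error: null };
}

// Fold one server frame into the turn state without mutating the input.
function applyMessage(state: TurnState, msg: ServerMessage): TurnState {
  switch (msg.type) {
    case "start":
      return emptyTurn(); // assumption: start marks a fresh turn
    case "chunk":
      return { ...state, text: state.text + msg.delta };
    case "reasoning":
      return { ...state, reasoning: state.reasoning.concat(msg.content) };
    case "done":
      return { ...state, finished: true };
    case "error":
      return { ...state, finished: true, error: msg.error }; // assumption: errors end the turn
    default:
      return state;
  }
}

// Example: replay a short sequence of raw frames.
const rawFrames = [
  '{"type":"start"}',
  '{"type":"chunk","delta":"Hel"}',
  '{"type":"chunk","delta":"lo"}',
  '{"type":"done","tokensIn":100,"tokensOut":50,"costUsd":0.001,"durationMs":800}',
];
let turn = emptyTurn();
for (let i = 0; i < rawFrames.length; i++) {
  turn = applyMessage(turn, JSON.parse(rawFrames[i]) as ServerMessage);
}
```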

Done Message Fields

Field	Type	Description
`tokensIn`	number	Input tokens used
`tokensOut`	number	Output tokens generated
`costUsd`	number	Estimated cost in USD
`durationMs`	number	Total turn duration in milliseconds
`modelMode`	`'manual' | 'auto'`	Model selection mode
`resolvedProvider`	string	LLM provider used
`resolvedModel`	string	LLM model used
`autoFallback`	boolean	Whether Auto mode fell back to heuristic
`autoFallbackReason`	string	Reason for fallback
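
As a small illustration, a client could turn a done message into a one-line usage summary. The sketch treats the model-selection fields as optional because the example done frame earlier omits them; that optionality is an assumption. `describeDone` is a hypothetical helper name.

```typescript
// Field names follow the table above; optionality of the last five is assumed.
interface DoneMessage {
  type: "done";
  tokensIn: number;
  tokensOut: number;
  costUsd: number;
  durationMs: number;
  modelMode?: "manual" | "auto";
  resolvedProvider?: string;
  resolvedModel?: string;
  autoFallback?: boolean;
  autoFallbackReason?: string;
}

// Render a short, human-readable summary of one completed turn.
function describeDone(done: DoneMessage): string {
  let line =
    `${done.tokensIn} in / ${done.tokensOut} out tokens, ` +
    `$${done.costUsd.toFixed(4)}, ${done.durationMs} ms`;
  if (done.resolvedProvider && done.resolvedModel) {
    line += ` via ${done.resolvedProvider}/${done.resolvedModel}`;
  }
  if (done.autoFallback) {
    line += ` (auto fallback: ${done.autoFallbackReason || "no reason given"})`;
  }
  return line;
}

const summary = describeDone({
  type: "done",
  tokensIn: 100,
  tokensOut: 50,
  costUsd: 0.001,
  durationMs: 800,
});
// summary is "100 in / 50 out tokens, $0.0010, 800 ms"
```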

Reasoning Message

The reasoning message type delivers the agent's internal decision-making process as a separate stream. In the Web UI, this appears in a collapsible panel toggled by a 🔬 Reasoning button.

Voice/Audio Messages

Client → Server: audio_chunk / audio_end / speak
Server → Client: audio / voice_state

Transcribed speech is dispatched directly into the agent loop as a user message. Auto-TTS synthesizes agent responses to audio before the done signal.
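
A client-side sketch of the upload half of this flow: recorded audio is split into pieces, each sent as an `audio_chunk` frame, followed by a single `audio_end`. The chunk size and the helper names are assumptions, as is the availability of a global `btoa`. The audio encoding the server expects is not specified here.

```typescript
// Encode raw bytes as base64 by way of a binary string.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Produce the serialized frames for one utterance: N audio_chunk frames, then audio_end.
function audioFrames(audio: Uint8Array, chunkSize: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const frames: string[] = [];
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    const piece = audio.subarray(offset, offset + chunkSize);
    frames.push(JSON.stringify({ type: "audio_chunk", data: toBase64(piece) }));
  }
  frames.push(JSON.stringify({ type: "audio_end" }));
  return frames;
}

// Example: five bytes in chunks of two give three audio_chunk frames plus audio_end.
const utterance = audioFrames(new Uint8Array([1, 2, 3, 4, 5]), 2);
```

Each returned string can be sent over the socket in order; the trailing `audio_end` signals that the utterance is complete.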

Session Resume

Include an existing sessionId in a chat message to resume across WebSocket reconnects:

{ "type": "chat", "message": "Continue our conversation", "sessionId": "sess_abc123" }
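
To make resumption automatic, a client can remember the `sessionId` from the server's session message and attach it to later chat messages. `SessionTracker` below is a hypothetical helper sketched under that assumption, not part of CortexPrism.

```typescript
// Remembers the most recent sessionId announced by the server.
class SessionTracker {
  private sessionId: string | null = null;

  // Inspect a raw server frame and record the session ID if it carries one.
  observe(raw: string): void {
    const msg = JSON.parse(raw);
    if (msg && msg.type === "session" && typeof msg.sessionId === "string") {
      this.sessionId = msg.sessionId;
    }
  }

  // Forget the session, e.g. after the client sends new_session.
  reset(): void {
    this.sessionId = null;
  }

  // Serialize a chat frame, adding sessionId only when one is known.
  chat(message: string): string {
    if (this.sessionId === null) {
      return JSON.stringify({ type: "chat", message: message });
    }
    return JSON.stringify({ type: "chat", message: message, sessionId: this.sessionId });
  }
}

const tracker = new SessionTracker();
const firstFrame = tracker.chat("Hello"); // no sessionId yet
tracker.observe('{"type":"session","sessionId":"sess_abc123"}');
const resumeFrame = tracker.chat("Continue our conversation");
```

Because the tracker lives outside the socket object, it survives a reconnect and the next chat message picks the session up again.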

Protocol Notes

Tool call XML (<tool_call>) and bare JSON are stripped from chunks using a brace-depth walker algorithm
Streaming is buffered internally when tools are registered; only clean prose reaches the client
Tool calls split across multiple WebSocket chunks are properly buffered and stripped
The file_change event broadcasts on file edits, renames, and deletes
WebSocket connections are upgraded from standard HTTP at /ws
Node WebSocket at /ws/node uses token-based registration with heartbeat/ACK protocol
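
The stripping described in these notes can be sketched as a small streaming filter. This is not the CortexPrism implementation, only a simplified illustration of a brace-depth walk: it removes `<tool_call>...</tool_call>` blocks and any balanced `{...}` object, and holds back incomplete fragments until later chunks complete them. Treating every `{` as the start of tool JSON is a deliberate simplification, and `ToolCallStripper` and its helpers are hypothetical names.

```typescript
const OPEN_TAG = "<tool_call>";
const CLOSE_TAG = "</tool_call>";

// Walk brace depth from `start`; return the index just past the matching "}",
// or -1 if the object is still open. Braces inside JSON strings are ignored.
function endOfJsonObject(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return -1;
}

// Length of the longest suffix of `text` that is a proper prefix of OPEN_TAG.
function partialTagLength(text: string): number {
  const max = Math.min(text.length, OPEN_TAG.length - 1);
  for (let n = max; n > 0; n--) {
    if (text.slice(text.length - n) === OPEN_TAG.slice(0, n)) return n;
  }
  return 0;
}

class ToolCallStripper {
  private pending = "";

  // Accept one streamed chunk and return only the prose that is safe to show.
  feed(chunk: string): string {
    this.pending += chunk;
    let clean = "";
    while (this.pending.length > 0) {
      const tagAt = this.pending.indexOf(OPEN_TAG);
      const braceAt = this.pending.indexOf("{");
      if (tagAt === -1 && braceAt === -1) {
        // Plain prose: emit it, but keep a possible split "<tool_c..." tail.
        const hold = partialTagLength(this.pending);
        clean += this.pending.slice(0, this.pending.length - hold);
        this.pending = this.pending.slice(this.pending.length - hold);
        break;
      }
      const useTag = tagAt !== -1 && (braceAt === -1 || tagAt < braceAt);
      const start = useTag ? tagAt : braceAt;
      clean += this.pending.slice(0, start);
      let end: number;
      if (useTag) {
        const closeAt = this.pending.indexOf(CLOSE_TAG, start);
        end = closeAt === -1 ? -1 : closeAt + CLOSE_TAG.length;
      } else {
        end = endOfJsonObject(this.pending, start);
      }
      if (end === -1) {
        this.pending = this.pending.slice(start); // incomplete: wait for more chunks
        break;
      }
      this.pending = this.pending.slice(end); // drop the tool call and keep scanning
    }
    return clean;
  }
}

// Example: a tool call split across three chunks never reaches the output.
const stripper = new ToolCallStripper();
const shown =
  stripper.feed("Let me check. <tool") +
  stripper.feed('_call>{"tool":"web_search",') +
  stripper.feed('"args":{"query":"x"}}</tool_call> Done');
```

A production filter would also need a policy for text still held back when the turn ends, and a stricter test for what counts as tool-call JSON so that ordinary braces in prose or code samples survive.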