WebSocket Protocol

The CortexPrism WebSocket protocol provides real-time streaming chat, audio communication, file upload, and inspection of tool calls and agent reasoning.

Connection

ws://127.0.0.1:3000/ws          # Client WebSocket
ws://127.0.0.1:3000/ws/node     # Node WebSocket (Hub ↔ Node)

Authentication: when webAuth.requireAuth is enabled, the /ws endpoint checks session cookies before upgrading connections.
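A browser client usually starts from the page's HTTP origin rather than a hard-coded WebSocket URL. The sketch below maps an origin to the two endpoints listed above. The helper name wsEndpoint is hypothetical, and the https to wss mapping is an assumption, since only ws:// URLs are documented here.

```typescript
// Hypothetical helper: derive a WebSocket endpoint URL from an HTTP origin.
type Endpoint = "client" | "node";

function wsEndpoint(httpOrigin: string, endpoint: Endpoint = "client"): string {
  // http -> ws, https -> wss (the wss case is assumed, not documented above).
  const wsOrigin = httpOrigin.replace(/^http/, "ws").replace(/\/+$/, "");
  return wsOrigin + (endpoint === "node" ? "/ws/node" : "/ws");
}

console.log(wsEndpoint("http://127.0.0.1:3000"));          // ws://127.0.0.1:3000/ws
console.log(wsEndpoint("http://127.0.0.1:3000/", "node")); // ws://127.0.0.1:3000/ws/node
```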

Client → Server Messages

{ "type": "chat", "message": "Hello", "sessionId": "sess_abc123", "files": [...] }
{ "type": "ping" }
{ "type": "new_session" }
{ "type": "select_agent", "agentId": "agent-1" }
{ "type": "audio_chunk", "data": "<base64>" }
{ "type": "audio_end" }
{ "type": "speak", "text": "Hello world" }

Chat Message Fields

Field      Type    Required  Description
type       "chat"  Yes       Message type
message    string  Yes       User message text
sessionId  string  No        Resume existing session
files      array   No        Uploaded files [{filename, mimeType, data (base64)}]
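The client messages and chat fields above can be written down as TypeScript types so that malformed payloads are caught at compile time. This is a minimal sketch: the type names and the helpers buildChat and encode are hypothetical, and only the field names come from this page.

```typescript
// Shapes taken from the message list and the chat field table above.
interface UploadedFile {
  filename: string;
  mimeType: string;
  data: string; // base64-encoded file contents
}

interface ChatMessage {
  type: "chat";
  message: string;
  sessionId?: string;
  files?: UploadedFile[];
}

type ClientMessage =
  | ChatMessage
  | { type: "ping" }
  | { type: "new_session" }
  | { type: "select_agent"; agentId: string }
  | { type: "audio_chunk"; data: string }
  | { type: "audio_end" }
  | { type: "speak"; text: string };

// Hypothetical helper: optional fields are left out entirely when unused.
function buildChat(message: string, sessionId?: string, files: UploadedFile[] = []): ChatMessage {
  const msg: ChatMessage = { type: "chat", message };
  if (sessionId) msg.sessionId = sessionId;
  if (files.length > 0) msg.files = files;
  return msg;
}

// Serialize any client message to the JSON text sent over the socket.
function encode(msg: ClientMessage): string {
  return JSON.stringify(msg);
}
```

A call such as encode(buildChat("Hello", "sess_abc123")) produces the same shape as the first sample message, minus the files array.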

File Upload

Files are sent as base64-encoded data over the WebSocket, alongside chat messages. The server saves each file to the working directory and to the agent workspace. Text is extracted automatically from PDFs, and images are included as multimodal content blocks for providers that support them.
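To illustrate the upload shape, the sketch below turns raw bytes into one entry of the files array. The base64 encoder is written out by hand only to keep the example free of runtime-specific APIs; the helper names toBase64 and toFileEntry and the file name in the usage line are hypothetical.

```typescript
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding (RFC 4648 alphabet, "=" padding).
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2];
    out += B64[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 0x3f] : "=";
  }
  return out;
}

// Build one entry for the "files" array of a chat message.
function toFileEntry(filename: string, mimeType: string, bytes: Uint8Array) {
  return { filename, mimeType, data: toBase64(bytes) };
}

const entry = toFileEntry("note.txt", "text/plain", new Uint8Array([77, 97, 110]));
console.log(JSON.stringify({ type: "chat", message: "See attachment", files: [entry] }));
```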

Server → Client Messages

{ "type": "connected" }
{ "type": "session", "sessionId": "sess_abc123" }
{ "type": "start" }
{ "type": "chunk", "delta": "Hello" }
{ "type": "reasoning", "content": "Agent is considering..." }
{ "type": "tool_call", "tool": "web_search", "args": {"query": "..."} }
{ "type": "tool_result", "tool": "web_search", "result": "..." }
{ "type": "done", "tokensIn": 100, "tokensOut": 50, "costUsd": 0.001, "durationMs": 800 }
{ "type": "error", "error": "Something went wrong" }
{ "type": "pong" }
{ "type": "audio", "data": "<base64 mp3>", "format": "mp3" }
{ "type": "voice_state", "listening": true, "enabled": true }
{ "type": "file_change", "path": "/workspace/file.ts" }
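One way a client might consume these messages is to fold each incoming frame into a small per-turn state. The sketch below assumes a turn runs from start through chunk messages to done or error, which matches the order of the list above but is not stated explicitly; TurnState and applyFrame are hypothetical names.

```typescript
interface TurnState {
  text: string;          // concatenated chunk deltas
  reasoning: string[];   // reasoning messages, in arrival order
  toolCalls: string[];   // names of tools the agent invoked
  status: "idle" | "streaming" | "done" | "error";
  error?: string;
}

const emptyTurn = (): TurnState => ({ text: "", reasoning: [], toolCalls: [], status: "idle" });

// Fold one raw JSON frame into the state and return the new state.
function applyFrame(state: TurnState, raw: string): TurnState {
  const msg = JSON.parse(raw) as { type: string; [key: string]: unknown };
  switch (msg.type) {
    case "start":
      return { ...emptyTurn(), status: "streaming" };
    case "chunk":
      return { ...state, text: state.text + String(msg.delta ?? "") };
    case "reasoning":
      return { ...state, reasoning: [...state.reasoning, String(msg.content ?? "")] };
    case "tool_call":
      return { ...state, toolCalls: [...state.toolCalls, String(msg.tool ?? "")] };
    case "done":
      return { ...state, status: "done" };
    case "error":
      return { ...state, status: "error", error: String(msg.error ?? "") };
    default:
      // connected, session, pong, audio, voice_state, file_change, tool_result
      return state;
  }
}
```

Keeping the reducer pure makes it easy to replay a recorded sequence of frames when debugging a client.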

Done Message Fields

Field               Type               Description
tokensIn            number             Input tokens used
tokensOut           number             Output tokens generated
costUsd             number             Estimated cost in USD
durationMs          number             Total turn duration
modelMode           'manual' | 'auto'  Model selection mode
resolvedProvider    string             LLM provider used
resolvedModel       string             LLM model used
autoFallback        boolean            Whether Auto mode fell back to heuristic
autoFallbackReason  string             Reason for fallback
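As an illustration of how the done fields might be used, the sketch below formats them into a one-line usage summary. The model-related fields are typed as optional because the sample done message on this page omits them; that optionality is an assumption, as is the helper name describeTurn.

```typescript
interface DoneMessage {
  type: "done";
  tokensIn: number;
  tokensOut: number;
  costUsd: number;
  durationMs: number;
  // Assumed optional: these do not appear in the sample done message.
  modelMode?: "manual" | "auto";
  resolvedProvider?: string;
  resolvedModel?: string;
  autoFallback?: boolean;
  autoFallbackReason?: string;
}

// Hypothetical helper: render a short, human-readable summary of a finished turn.
function describeTurn(done: DoneMessage): string {
  const parts = [
    `${done.tokensIn} in / ${done.tokensOut} out`,
    `$${done.costUsd.toFixed(4)}`,
    `${done.durationMs} ms`,
  ];
  if (done.resolvedProvider && done.resolvedModel) {
    parts.push(`${done.resolvedProvider}/${done.resolvedModel}`);
  }
  if (done.autoFallback) {
    parts.push(`auto fallback: ${done.autoFallbackReason ?? "no reason given"}`);
  }
  return parts.join(", ");
}
```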

Reasoning Message

The reasoning message type delivers the agent's internal decision-making process as a separate stream. In the Web UI, this appears in a collapsible panel toggled by a 🔬 Reasoning button.

Voice/Audio Messages

  • Client → Server: audio_chunk / audio_end / speak
  • Server → Client: audio / voice_state

Transcribed speech is dispatched directly into the agent loop as a user message. Auto-TTS synthesizes agent responses to audio before the done signal.
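A recording can be streamed as a series of audio_chunk messages followed by a single audio_end. The sketch below splits an already base64-encoded payload at multiples of four characters so that each chunk decodes on its own. Whether the server requires independently decodable chunks, the chunk size limit, and the expected audio format are not specified here, so the default size and the helper name toAudioMessages are assumptions.

```typescript
type AudioMessage = { type: "audio_chunk"; data: string } | { type: "audio_end" };

// Split base64 audio into audio_chunk messages and terminate with audio_end.
function toAudioMessages(base64Audio: string, maxChars = 16384): AudioMessage[] {
  // Round down to a multiple of 4 so every chunk is valid base64 by itself.
  const size = Math.max(4, maxChars - (maxChars % 4));
  const messages: AudioMessage[] = [];
  for (let i = 0; i < base64Audio.length; i += size) {
    messages.push({ type: "audio_chunk", data: base64Audio.slice(i, i + size) });
  }
  messages.push({ type: "audio_end" });
  return messages;
}
```

Each element can then be passed through JSON.stringify and sent in order over the client socket.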

Session Resume

Include an existing sessionId in a chat message to resume across WebSocket reconnects:

{ "type": "chat", "message": "Continue our conversation", "sessionId": "sess_abc123" }
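A client can support resume by remembering the sessionId from the most recent session message and attaching it to every later chat message, including the first one after a reconnect. The class below is a minimal sketch of that bookkeeping; SessionTracker and its method names are hypothetical.

```typescript
class SessionTracker {
  private sessionId: string | undefined;

  // Pass every raw server frame through this; it keeps the latest session id.
  observe(raw: string): void {
    const msg = JSON.parse(raw) as { type?: string; sessionId?: unknown };
    if (msg.type === "session" && typeof msg.sessionId === "string") {
      this.sessionId = msg.sessionId;
    }
  }

  // Build the JSON text for a chat message, adding sessionId once one is known.
  chat(message: string): string {
    const payload: { type: "chat"; message: string; sessionId?: string } = { type: "chat", message };
    if (this.sessionId !== undefined) payload.sessionId = this.sessionId;
    return JSON.stringify(payload);
  }
}

const tracker = new SessionTracker();
tracker.observe('{"type":"session","sessionId":"sess_abc123"}');
console.log(tracker.chat("Continue our conversation"));
```

Because the tracker lives outside the socket object, it survives reconnects as long as the page or process does.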

Protocol Notes

  • Tool call XML (<tool_call>) and bare JSON are stripped from chunks using a brace-depth walker algorithm
  • Streaming is buffered internally when tools are registered; only clean prose reaches the client
  • Tool calls split across multiple WebSocket chunks are properly buffered and stripped
  • The file_change event broadcasts on file edits, renames, and deletes
  • WebSocket connections are upgraded from standard HTTP at /ws
  • Node WebSocket at /ws/node uses token-based registration with heartbeat/ACK protocol
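The brace-depth walker mentioned in the notes above can be pictured with a small sketch. This is an illustration of the general technique, not the server's implementation: it removes every balanced {...} span from a complete string, ignores braces inside JSON string literals, and does not handle <tool_call> XML or buffering across chunks. The function name stripBraceSpans is hypothetical.

```typescript
// Remove balanced top-level {...} spans, tracking nesting depth as we walk.
function stripBraceSpans(text: string): string {
  let out = "";
  let depth = 0;        // current brace nesting level
  let inString = false; // inside a "..." literal within a span
  let escaped = false;  // previous character was a backslash in a string
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (depth === 0) {
      if (ch === "{") depth = 1; // span starts: stop emitting
      else out += ch;
      continue;
    }
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") depth--; // reaching 0 resumes normal output
  }
  return out; // an unclosed span is simply dropped in this sketch
}

console.log(stripBraceSpans('Hi {"a": {"b": "}"}} there')); // "Hi  there"
```

A real filter would also need to decide whether a brace begins a tool call or ordinary prose, since this version removes any braced span it encounters.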

See Also