Each agent now maintains its own session cache file instead of sharing
a single agent-sessions.json. This prevents session ID conflicts when
multiple agents operate on the same thread+role pair.
Changes:
- getCachePath() now takes agentName parameter
- getCachedSessionId/setCachedSessionId require agentName as first param
- Cache files named <agent>-sessions.json (e.g., hermes-sessions.json)
- Agent wrappers inject their agent name into cache calls
- Add comprehensive tests for session cache isolation
- Handle malformed JSON gracefully (treat as empty cache)
Fixes#461
Changed uwf thread read to wrap role prompts and agent outputs in XML tags
(<prompt> and <output>) instead of markdown headings (### Prompt, ### Content).
This prevents Claude Code from treating step outputs as structural headings.
- Updated formatStepPrompt to use <prompt>...</prompt> tags
- Updated formatStepContent to use <output>...</output> tags
- Added comprehensive test suite in thread-read-xml-tags.test.ts
- Updated existing tests to verify XML tag behavior
Fixes#459
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactored runBuiltinLoop function to reduce cognitive complexity from 30 to below 15 by extracting helper functions:
- shouldInjectDeadlineWarning: checks if deadline warning should be shown
- shouldProcessToolCalls: determines if tool calls should be processed
- extractFinalText: extracts last assistant message content
- injectDeadlineWarning: injects deadline warning message
- handleTextOnlyTurn: handles text-only turn logic
- handleToolCallTurn: handles tool call turn logic
- processLoopIteration: processes a single loop iteration
Added 24 new unit tests for the extracted helper functions, bringing total test count to 41 (all passing). All existing behavior is preserved.
Fixes#444
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit implements issue #456, adding two related capabilities to the uwf CLI:
1. **Background execution mode** for `uwf thread step` (via `--background` flag)
- Spawns agent execution in a detached background process
- Returns immediately with thread ID and background status
- Maintains marker files to track running processes
- Supports `--count` option to run multiple steps in background
- Prevents concurrent execution of the same thread
2. **Running threads query** command (`uwf thread running`)
- Lists all threads currently executing in background
- Returns thread ID, workflow, current role, PID, and start time
- Automatically filters out stale markers (dead processes)
- Empty list when no threads are running
**Key changes:**
- **workflow-protocol**: Added `RunningThreadItem`, `RunningThreadsOutput` types
Updated `StepOutput` to include `background: boolean | null` field
- **cli-workflow/background**: New module for process management
- Marker file creation/deletion (atomic operations)
- PID liveness checking
- Stale marker cleanup
- Running threads query
- **cli-workflow/commands/thread**:
- Updated `cmdThreadStep` to support `--background` and `--_background-worker` flags
- Added `cmdThreadStepBackground` for spawning detached processes
- Added `cmdThreadRunning` to list running threads
- Updated `cmdThreadKill` to terminate background processes
- **cli-workflow/cli**: Added CLI routing for new commands and flags
**Integration:**
- `uwf thread kill` now terminates background processes before archiving
- Foreground execution checks for existing background process and fails if found
- Background worker creates/cleans up marker files automatically
- Marker files stored in `~/.uncaged/workflow/running/*.json`
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The turns array items in CLAUDE_CODE_DETAIL_SCHEMA were missing
format: 'cas_ref', so expandDeep in step-details couldn't resolve
turn hashes to their payloads. Hermes schema already had this.
小橘 <xiaoju@shazhou.work>
Per CLAUDE.md convention, use `string | null` instead of `?:` in the
isFirstConditionalSibling helper function parameter types.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Extract helper functions (resolveThreadId, getThreadHead, listThreadSteps,
displayStepDetails, displayThreadRead) to reduce nesting and improve
readability. Also adds test coverage for the refactored functions.
Fixes#446
- Root README: add all 9 packages to table, update architecture diagram,
refresh CLI reference from uwf --help
- New READMEs for 8 packages (cli-workflow, workflow-protocol,
workflow-moderator, workflow-agent-kit, workflow-agent-hermes,
workflow-agent-builtin, workflow-agent-claude-code, workflow-dashboard)
- Updated workflow-util README to match current exports
- All API sections verified against src/index.ts exports
小橘 🍊(NEKO Team)
- Nudge turns don't consume turn budget (up to MAX_NUDGES=3), prevents
wasting agent work capacity on bookkeeping
- Inject deadline warning when 3 turns remain, telling agent to wrap up
- Agent can use status:failed to gracefully exit if it can't finish
- Inject user message when 3 turns remain, telling agent to wrap up
- Prompt tells agent to use status:failed if it can't finish in time
- Prevents wasting all turns without producing any frontmatter output
- Remove stale test file from dogfood agent run
copilot-api returns tool_calls even when tools field is omitted from
the request (infers from message history). Now the loop explicitly
nullifies tool_calls when noTools=true.
When the agent's first run output fails frontmatter extraction, the
retry loop (via options.continue) would replace agentResult entirely,
causing the 1-turn continuation detail to overwrite the original
multi-turn detail containing all tool-call history.
Now we capture primaryDetailHash from the first run and always use it
for the persisted StepNode, regardless of how many retries occur.
Fixes#439
LLM sometimes emits plain text (e.g. 'Now I'll write the tests...')
without calling tools, which the loop treated as final output. Now
the loop detects this and injects a user message nudging the LLM
to either continue using tools or output frontmatter with ---.
Agent was using all continue turns to keep calling tools instead of
outputting the required frontmatter. Now continue runs with noTools=true,
forcing LLM to emit text-only response.
Also supports null tools in chatCompletionWithTools to omit tools from
the API request entirely.
Agent was wasting turns exploring the filesystem because it didn't
know its working directory. Now the system prompt includes:
'Your working directory is: /path/to/cwd'
- Add stripPreamble() to handle LLM output with text before ---
- Strengthen system prompt: CRITICAL instruction for --- at position 0
- Fixes frontmatter parsing failures on first output turn
Previously runBuiltinWithMessages deleted the session jsonl after each
run/continue call. This meant the createAgent retry mechanism (which
calls continue on frontmatter validation failure) would lose all
previous turn data — each continue started with an empty jsonl.
Now the session jsonl accumulates across run + continue calls, so the
final storeBuiltinDetail captures all turns. The jsonl file is left
behind for debugging; it's small and can be cleaned up on next startup.
Also add a workflow hint to the system prompt reminding the LLM to use
tools before outputting frontmatter, preventing premature text-only
responses on the first turn.
- Add 4-strategy resolution priority: CAS hash → file path → local discovery → global registry
- Add helper functions: isFilePath, workflowFileExists, findWorkflowInDir, findWorkflowInParents
- Refactor resolveWorkflowCasRef to support direct hash, explicit paths, and parent traversal
- Add comprehensive test suite with 24 tests covering all strategies and edge cases
- Support .workflow/ and .workflows/ directories with .yaml/.yml extensions
- All 60 tests pass across 5 test files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Disable aliasDuplicateObjects in YAML stringify to prevent &a1/*a1
anchors when multiple steps have identical output
- Fix unused discoverAgents function (prefixed with _) and format issue
in setup.ts
Skip acp-client 'prompt() collects structured messages' and
resume-e2e 'resume() after close' — both require live LLM calls
and fail intermittently in CI.
Each turn (assistant response / tool result) is appended to a JSONL file
at ~/.uncaged/workflow/sessions/<sessionId>.jsonl during the loop.
On completion, the JSONL is read back, each turn is stored as a CAS node,
and the detail payload references them as a flat turns[] array in
chronological order. The session file is then deleted.
Benefits:
- Real-time observability: tail -f the JSONL to watch loop progress
- Crash recovery: partial JSONL survives process death
- Zero write contention: one file per session
- Detail stays a flat array for easy consumption by CLI/dashboard
Changes:
- New session.ts: initSessionDir, appendSessionTurn, readSessionTurns, removeSession
- loop.ts: append JSONL each turn instead of accumulating in-memory
- detail.ts: reads session JSONL → persists turns to CAS → stores detail
- agent.ts: passes storageRoot/sessionId to loop, cleans up session on completion
- types.ts: remove index from TurnPayload (order is implicit in JSONL/array)
- schemas.ts: sync with type changes
Ref: #433
buildClaudeCodePrompt was dropping ctx.edgePrompt entirely — the graph
transition instruction (e.g. 'Implement the plan') never reached the agent.
Now appended as '## Current Instruction' at the end of the prompt.
- StepRecord adds edgePrompt field (backward compat: defaults to "")
- StepNode CAS schema includes edgePrompt
- writeStepNode persists ctx.edgePrompt
- buildHistory exposes edgePrompt in StepContext
- buildBuiltinMessages reconstructs multi-turn moderator↔agent conversation:
system = role prompt + output format (stable prefix)
per prior visit: user (edgePrompt + inter-step summary) + assistant (output)
current: user (edgePrompt + recent summary)
- Zero extra persistence — pure function of CAS chain
- Stable prefix for LLM prompt cache hits
- 10 builtin tests pass, all other package tests pass
System message = agent identity (role prompt + output format instruction)
User message = moderator speech (task + edge prompt + history)
This reflects the workflow's core model: moderator speaks to agent
via the graph's edge prompt. Previously all content was in a single
system message with no user message, causing Claude API 400 errors.
- buildBuiltinPrompt now returns { system, user } instead of string
- agent.ts sends system + user as separate messages
- Tests updated accordingly
Switch from --output-format json to stream-json --verbose to capture
per-turn data. Detail now includes:
- model name
- usage (input/output/cache tokens)
- stopReason
- turns[] as individual CAS nodes with role, content, tool calls
Also addresses PR #421 review fixes:
- sessionId guard: skip cache write when sessionId is empty/undefined
- silent catch: log resume failures with debug tag 5VKR8N3Q
- atomic write: session cache uses temp+rename for crash safety
Closes#422
- Replace resolvePathInWorkspace with simple resolvePath (no boundary check)
- Remove UWF_BUILTIN_ALLOW_SHELL env gate from run_command
- Update tests accordingly
Per review: sandbox was false security with shell=true, and path
restrictions are unnecessary for a trusted agent environment.
Move getCachedSessionId/setCachedSessionId from workflow-agent-hermes
into workflow-agent-kit so all agent adapters can share the same
session cache logic.
Add cross-process session resume to workflow-agent-claude-code:
on re-entry (isFirstVisit=false), look up the cached sessionId and
use 'claude --resume' to continue with full conversation history.
Cache file renamed from hermes-sessions.json to agent-sessions.json
to reflect its shared nature.
Refs #418
Built-in role agent that uses workflow config models directly,
with its own tool-calling run loop. No external agent dependency.
- OpenAI-compatible chat completion client with tool_calls support
- P0 toolkit: read_file, write_file, run_command
- Integrates via createAgent factory from workflow-agent-kit
- CAS detail recording for each turn
- Path sandboxing and shell opt-in (UWF_BUILTIN_ALLOW_SHELL)
Hermes ACP _restore fails for custom providers — resolve_runtime_provider
throws and base_url/api_mode are lost, causing resume to silently create a
new session with no history. Prompt then returns empty text or refusal.
Disable resume by default. Set UWF_HERMES_RESUME=1 to opt back in.
Includes investigation notes in docs/investigations/.
Refs #418