- evaluate() reads $status instead of status, defaults to _ when missing
- Update all YAML examples and .workflows to use $status
- Update cli-workflow resolveEvaluateArgs to use $status
- 10 moderator tests pass including new default _ test
- Single-exit roles no longer need to declare status field
Phase 1 of #499 (closes#500)
Replace all JSONata/ConditionDefinition/ConditionalEdge references with
status-based routing terminology across 8 files:
- README.md, CLAUDE.md: moderator description, key terms
- docs/architecture.md: dependency jsonata→mustache, evaluate signature
- docs/wf-stateless-design.md: type definitions, routing context
- packages/workflow-moderator/README.md: full rewrite for new API
- packages/workflow-protocol/README.md: Target type, remove Transition
- packages/workflow-dashboard/context.md: StatusEdge, graph type
- docs/builtin-agent-research.md: stale JSONata references
- Rename ConditionalEdge → StatusEdge, condition → status throughout
- Rename conditional.tsx → status.tsx, edge label shows status value
- Update trans-in/trans-out to use status field instead of condition
- Update validate to check status edges
- Align server/workflow.ts with new WorkflowPayload.graph format
- 20 dashboard tests pass
Phase 3 of #490 (closes#493)
- Migrate solve-issue.yaml, analyze-topic.yaml, debate.yaml to new format
- Add status enum field to all role frontmatter schemas
- Use {{{ }}} (triple mustache) for prompt templates with user content
- Disable mustache HTML escaping globally (prompts are plain text, not HTML)
- Add 2 new tests for HTML escape behavior
- 9 moderator tests pass
Phase 2 of #490 (closes#492)
Extract four helper functions from cmdStepRead to reduce cognitive
complexity from 27 to ≤15:
- loadStepDetail: Load and validate step detail node
- loadTurnData: Load all turn nodes and extract content
- selectTurnsForQuota: Select turns within quota (≥1 always shown)
- formatStepMarkdown: Assemble final markdown output
All 6 existing tests pass. Zero Biome warnings. CLAUDE.md compliant.
Fixes#487
Replace custom buildHistorySummary with shared buildContinuationPrompt
from workflow-agent-kit. This aligns claude-code agent with hermes agent:
- First visit: includes step content (within 32k quota)
- Re-entry: shows only steps since last visit (meta only, session has context)
Previously developer could not see reviewer's detailed feedback on
re-entry, only {"approved":false}. Now gets full review text.
Fixes#486
Implements the `uwf step read` command to render a single step's turns
as human-readable markdown with quota enforcement.
Changes:
- Implement cmdStepRead() in step.ts with quota enforcement
- Renders step metadata (hash, role, agent)
- Loads and formats turns from detail node
- Enforces quota by selecting most recent turns
- Always shows at least one turn even if it exceeds quota
- Gracefully handles steps with no detail or no turns
- Register `step read` command in cli.ts with --quota flag (default 4000)
- Add comprehensive test suite in step-read.test.ts (6 tests covering
basic functionality, quota enforcement, edge cases, and special chars)
- Update README.md CLI Reference table to include `step read`
- Update package-level README.md with command documentation and example
Closes#484
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Issue #480: The --quota flag on 'uwf thread read' was not properly
limiting output size due to an off-by-one error in selectByQuota().
Root cause:
- Items were added to selected array BEFORE checking if they would
exceed the quota
- This meant the last item that exceeded quota was still included
- Prompt deduplication tracking was mutated during quota calculation,
causing prompts to not render in final output
Fix:
- Check quota BEFORE adding items to selected array
- Always include at least one step even if it exceeds quota
- Calculate step lengths using actual rendering format
- Account for start section and separators in quota calculation
- Use temporary Set during length calculation to avoid mutating
the prompt deduplication tracking
Tests:
- Added comprehensive test suite (thread-read-quota.test.ts)
- Covers quota enforcement, boundary conditions, edge cases
- Tests interaction with --before and --start flags
- All tests pass
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed the exit behavior of 'uwf cas has' to return exit code 1 when
a hash doesn't exist, while preserving the JSON output {exists:false}.
This enables proper use in shell conditionals like 'if uwf cas has $HASH'.
Fixes#481
Fixed issue #469 where `uwf step list`, `uwf step show`, and `uwf thread read`
failed with "thread not active" error when called on completed threads.
The root cause was that resolveHeadHash() in shared.ts only checked threads.yaml
(active threads index) but never fell back to history.jsonl (completed threads log).
Changes:
- Updated resolveHeadHash() in shared.ts to check history.jsonl as fallback
- Changed error message from "thread not active" to "thread not found"
- Added comprehensive test coverage:
- Unit tests for resolveHeadHash() with active/completed/missing threads
- Integration tests for cmdStepList() with completed threads
- Integration tests for cmdStepShow() with completed threads
- Regression tests for cmdThreadRead() with completed threads
All commands now work identically for active and completed threads.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a role participates for the first time (e.g. committer), it previously
only received the system prompt + last step output, missing the full thread
history. This caused hallucination as the role had to guess what happened.
Changes:
- build-continuation-prompt.ts: detect first-time roles and include all
steps' meta + content for last 2-3 steps (within quota)
- context.ts: add isFirstVisit detection helper
- types.ts: add isFirstVisit field to AgentContext
- hermes.ts: pass isFirstVisit through to prompt builder
Fixes#473
Implements enhanced filtering and pagination for the `uwf thread list` command
to support workflows with large numbers of threads.
Changes:
- Add --page, --page-size parameters for pagination (default: page 1, size 20)
- Add --since, --until time filters supporting multiple formats (ISO8601, relative like "2h", "1d")
- Add --workflow filter to show threads for specific workflow
- Add --sort parameter (newest-first, oldest-first, alphabetical)
- Add pagination metadata in JSON output (page, pageSize, totalThreads, totalPages, hasMore)
- Implement parseRelativeTime() for human-friendly time expressions (1h, 30m, 2d, 1w)
- Add comprehensive unit tests for filters, pagination, and time parsing
- Update CLI help text with new parameters and examples
Fixes#471
- Add `content: string | null` to RoleStep type
- Resolve contentHash → text for the last step when building ThreadContext
- Update buildAgentPrompt to include <output> tag with step content
- Add 16k content quota with truncation
- Update tests
Previously step list showed raw CAS refs for detail fields.
Now detail is recursively expanded (like output already was),
since every turn is individually hashed and walkable.
Refs #463
Each agent now maintains its own session cache file instead of sharing
a single agent-sessions.json. This prevents session ID conflicts when
multiple agents operate on the same thread+role pair.
Changes:
- getCachePath() now takes agentName parameter
- getCachedSessionId/setCachedSessionId require agentName as first param
- Cache files named <agent>-sessions.json (e.g., hermes-sessions.json)
- Agent wrappers inject their agent name into cache calls
- Add comprehensive tests for session cache isolation
- Handle malformed JSON gracefully (treat as empty cache)
Fixes#461
Changed uwf thread read to wrap role prompts and agent outputs in XML tags
(<prompt> and <output>) instead of markdown headings (### Prompt, ### Content).
This prevents Claude Code from treating step outputs as structural headings.
- Updated formatStepPrompt to use <prompt>...</prompt> tags
- Updated formatStepContent to use <output>...</output> tags
- Added comprehensive test suite in thread-read-xml-tags.test.ts
- Updated existing tests to verify XML tag behavior
Fixes#459
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactored runBuiltinLoop function to reduce cognitive complexity from 30 to below 15 by extracting helper functions:
- shouldInjectDeadlineWarning: checks if deadline warning should be shown
- shouldProcessToolCalls: determines if tool calls should be processed
- extractFinalText: extracts last assistant message content
- injectDeadlineWarning: injects deadline warning message
- handleTextOnlyTurn: handles text-only turn logic
- handleToolCallTurn: handles tool call turn logic
- processLoopIteration: processes a single loop iteration
Added 24 new unit tests for the extracted helper functions, bringing total test count to 41 (all passing). All existing behavior is preserved.
Fixes#444
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit implements issue #456, adding two related capabilities to the uwf CLI:
1. **Background execution mode** for `uwf thread step` (via `--background` flag)
- Spawns agent execution in a detached background process
- Returns immediately with thread ID and background status
- Maintains marker files to track running processes
- Supports `--count` option to run multiple steps in background
- Prevents concurrent execution of the same thread
2. **Running threads query** command (`uwf thread running`)
- Lists all threads currently executing in background
- Returns thread ID, workflow, current role, PID, and start time
- Automatically filters out stale markers (dead processes)
- Empty list when no threads are running
**Key changes:**
- **workflow-protocol**: Added `RunningThreadItem`, `RunningThreadsOutput` types
Updated `StepOutput` to include `background: boolean | null` field
- **cli-workflow/background**: New module for process management
- Marker file creation/deletion (atomic operations)
- PID liveness checking
- Stale marker cleanup
- Running threads query
- **cli-workflow/commands/thread**:
- Updated `cmdThreadStep` to support `--background` and `--_background-worker` flags
- Added `cmdThreadStepBackground` for spawning detached processes
- Added `cmdThreadRunning` to list running threads
- Updated `cmdThreadKill` to terminate background processes
- **cli-workflow/cli**: Added CLI routing for new commands and flags
**Integration:**
- `uwf thread kill` now terminates background processes before archiving
- Foreground execution checks for existing background process and fails if found
- Background worker creates/cleans up marker files automatically
- Marker files stored in `~/.uncaged/workflow/running/*.json`
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The turns array items in CLAUDE_CODE_DETAIL_SCHEMA were missing
format: 'cas_ref', so expandDeep in step-details couldn't resolve
turn hashes to their payloads. Hermes schema already had this.
小橘 <xiaoju@shazhou.work>
Per CLAUDE.md convention, use `string | null` instead of `?:` in the
isFirstConditionalSibling helper function parameter types.
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Extract helper functions (resolveThreadId, getThreadHead, listThreadSteps,
displayStepDetails, displayThreadRead) to reduce nesting and improve
readability. Also adds test coverage for the refactored functions.
Fixes#446
- Root README: add all 9 packages to table, update architecture diagram,
refresh CLI reference from uwf --help
- New READMEs for 8 packages (cli-workflow, workflow-protocol,
workflow-moderator, workflow-agent-kit, workflow-agent-hermes,
workflow-agent-builtin, workflow-agent-claude-code, workflow-dashboard)
- Updated workflow-util README to match current exports
- All API sections verified against src/index.ts exports
小橘 🍊(NEKO Team)
- Nudge turns don't consume turn budget (up to MAX_NUDGES=3), prevents
wasting agent work capacity on bookkeeping
- Inject deadline warning when 3 turns remain, telling agent to wrap up
- Agent can use status:failed to gracefully exit if it can't finish
- Inject user message when 3 turns remain, telling agent to wrap up
- Prompt tells agent to use status:failed if it can't finish in time
- Prevents wasting all turns without producing any frontmatter output
- Remove stale test file from dogfood agent run
copilot-api returns tool_calls even when tools field is omitted from
the request (infers from message history). Now the loop explicitly
nullifies tool_calls when noTools=true.
When the agent's first run output fails frontmatter extraction, the
retry loop (via options.continue) would replace agentResult entirely,
causing the 1-turn continuation detail to overwrite the original
multi-turn detail containing all tool-call history.
Now we capture primaryDetailHash from the first run and always use it
for the persisted StepNode, regardless of how many retries occur.
Fixes#439
LLM sometimes emits plain text (e.g. 'Now I'll write the tests...')
without calling tools, which the loop treated as final output. Now
the loop detects this and injects a user message nudging the LLM
to either continue using tools or output frontmatter with ---.