Fixes hallucination issues observed in thread 06F7FSTXQGY3D5CY5YPQFK2Y3W:
1. Developer self-verification (critical): Added step 12 requiring
mandatory verification of branch, file existence, and git status
before reporting done status. Prevents hallucinated completions
without actual tool execution.
2. Reviewer hard-check enforcement (critical): Added critical warning
and step 0 requiring cd/pwd verification before review. Prevents
false rejections based on assumptions without actual path checks.
3. Test debugging escalation (medium): Added structured debugging
guidance with escalation path after 3 test cycles. Prevents
infinite retry loops by providing strategy and fail-fast guidance.
Also added 3 test cases to verify the new procedure steps exist.
Based on change plan 9EVZPDTS16PMG analyzing execution anomalies
that resulted in 58% waste (13 of 23 minutes).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle the case where Claude Code exits without producing a result line
(timeout, OOM, signal kill). Previously returned null and threw an error;
now returns incomplete result with best-effort output extraction.
Changes:
- Add "incomplete" as new ClaudeCodeResultSubtype value
- Extract output from last assistant turn when no result line exists
- Enhanced error messages distinguish incomplete vs unparseable output
- Store incomplete results in CAS with appropriate metadata
- Add 10 comprehensive test cases for incomplete result handling
Fixes#574
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This resolves issue #573 by moving uwf's CAS directory from
~/.uncaged/workflow/cas/ to the shared ~/.uncaged/json-cas/ location.
Changes:
- Added getGlobalCasDir() function with UNCAGED_CAS_DIR support
- Updated createUwfStore() to use global CAS directory
- Added comprehensive test coverage (11 new tests)
- Updated all existing tests for environment isolation
- Updated documentation (CLAUDE.md, README.md)
Benefits:
- Cross-tool visibility: json-cas CLI can read uwf-created nodes
- Schema sharing: both tools access same schema registry
- Future-proofing: enables json-cas render/verbose for uwf data
Fixes#573
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This resolves issue #573 by moving uwf's CAS directory from
~/.uncaged/workflow/cas/ to the shared ~/.uncaged/json-cas/ location.
Changes:
- Added getGlobalCasDir() function that respects UNCAGED_CAS_DIR env var
- Updated createUwfStore() to use the global CAS directory
- Updated all tests to set UNCAGED_CAS_DIR for test isolation
- Added comprehensive test suite for global CAS functionality
- Updated documentation (CLAUDE.md, README.md) to reflect new architecture
Benefits:
- Cross-tool visibility: json-cas CLI can now read uwf-created nodes
- Schema sharing: both tools access the same schema registry
- Future-proofing: enables json-cas render/verbose features for uwf data
Workflow metadata (threads.yaml, registry.yaml, history.jsonl) remains in
~/.uncaged/workflow/ as intended.
All tests pass. No breaking changes to existing functionality.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add GROUND RULES section to all role procedures: require real command output, no fabrication
- Add 'skipped' status for roles where everything is already configured
- Add skipped routing in graph so workflow continues normally
- Add build artifact detection in committer: scan for .d.ts/.js.map/.js before commit
- Add verification enforcement notes to all roles
Fixes hallucination issue where agents reported completing work without actually writing files.
Store the fully assembled prompt sent to each agent in CAS as a text
node, referenced from StepNodePayload.assembledPrompt. This enables
exact reproduction of what the agent received for debugging hallucinations.
Changes:
- workflow-protocol: StepRecord + STEP_NODE_SCHEMA add assembledPrompt field
- workflow-util-agent: AgentRunResult includes assembledPrompt, run.ts stores it
- workflow-util-agent: schemas register TEXT_SCHEMA for prompt storage
- workflow-agent-claude-code: return assembled prompt from buildClaudeCodePrompt
- workflow-agent-hermes: return assembled prompt from buildHermesPrompt
- workflow-agent-builtin: return empty prompt (no prompt assembly)
- cli-workflow: step read --prompt renders the stored prompt
- All test fixtures updated for new field
Legacy steps without assembledPrompt show 'Prompt not recorded' message.
小橘 🍊
- Add git ls-remote verification after push
- Switch from tea pr create to Gitea API (tea fails in worktrees)
- Add PR creation verification (check response JSON has number field)
- Explicitly mark steps 4 and 6 as verification gates
小橘 🍊
Root cause: committer role had to parse owner/repo from git remote URL,
which failed in worktrees with token-embedded URLs. Agent hallucinated
a fake PR URL instead of reporting the error.
Fix:
- planner extracts repoRemote from git remote, stores in frontmatter
- repoRemote flows through all roles via graph prompts
- committer uses repoRemote directly for tea/API calls
- Added Gitea API fallback when tea CLI fails
小橘 <xiaoju@shazhou.work>
- Place workflow YAML in .workflows/ (dot-prefix convention)
- Do not run uwf workflow add — use directly via uwf thread start
- Fixes agent hallucination issue where registration was faked
小橘 <xiaoju@shazhou.work>
- Enhanced Biome config with test file override for noConsole
- Applied Biome auto-fixes (8 files: formatting, template literals, optional chains)
- Updated all package repository URLs to git.shazhou.work/uncaged/workflow.git
- Added workflow-agent-claude-code to publish order in scripts/publish-all.mjs
- Added --ignore-scripts flag to publish command to bypass prepublishOnly guard
- Installed vitest in root devDependencies for test infrastructure
- Created vitest.config.ts for all 8 packages with passWithNoTests: true
- Fixed 3 test files to use vitest imports instead of bun:test
- Added test and test:ci scripts to packages missing them
- Added missing build step to .gitea/workflows/ci.yml
- Renamed CI job from 'test' to 'check' for clarity
- Created workflows/solve-issue.yaml with TDD-driven issue resolution workflow
- Registered solve-issue workflow with uwf (hash: 084YVM60BR8G6)
- Added packageManager: bun@1.3.14 to root package.json
- Added preinstall guard to block npm/pnpm/yarn
- Added prepublishOnly guard to root and all 7 public packages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When mustache variables in edge prompts resolve to empty strings (because
upstream output lacks the fields), the engine now returns a Result.error
instead of passing an empty --prompt to the agent.
- evaluate.ts: check rendered prompt is non-empty after mustache.render()
- run.ts: improve parseArgv error message for empty --prompt
- Export parseArgv for testability
- Add 7 tests covering all cases from the spec
Add comprehensive tests verifying that `uwf step show` produces valid
JSON output even when step detail nodes contain control characters
(newlines, tabs, carriage returns, etc.) in tool call args and content
fields.
Tests cover:
- Basic control characters (newlines, tabs, CR+LF)
- Backslashes and quotes
- Unicode control characters (U+0001-U+001F)
- Nested CAS refs with control characters
- Large steps with multiple tool calls
- Empty/null values
- YAML output format (unaffected by escaping)
The tests confirm that JSON.stringify() already handles control
character escaping correctly when serializing JavaScript objects
to JSON. No code changes needed - these tests serve as regression
guards to ensure the behavior remains correct.
Fixes#557
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add ThreadStatus type to workflow-protocol
- Update StepOutput type to include status field alongside deprecated done/background fields
- Implement status computation in cmdThreadShow (idle/running/completed/cancelled)
- Update cmdThreadStepOnce to include status in return values
- Add comprehensive test suite for thread show status scenarios
Fixes#559
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Exposes the existing cwd parameter from cmdThreadStart to the CLI layer,
allowing users to specify a custom working directory for thread execution.
- Added --cwd <path> option to uwf thread start
- Option defaults to process.cwd() when not provided
- Added comprehensive test suite with 4 test cases
- All 328 tests in cli-workflow package pass
Fixes#561
Implement thread-level and edge-level working directory management:
- Thread-level cwd (required, defaults to process.cwd())
- Captured at uwf thread start time
- Stored in StartNodePayload
- Inherited by all steps unless overridden
- Edge-level location (optional, supports mustache templates)
- New location: string | null field on Target type
- Resolved by moderator using previous step's output
- Example: location: "{{{repoPath}}}"
- Step audit trail
- Each StepNodePayload records actual cwd where agent executed
Changes:
- workflow-protocol: Add cwd to StartNodePayload & StepRecord, location to Target
- cli-workflow: Thread start captures cwd, moderator resolves location, step execution uses resolved cwd
- workflow-util-agent: Expose cwd in agent context
Tests:
- Protocol type tests (3 scenarios)
- Moderator location resolution tests (5 scenarios)
- Thread-location integration tests (3 scenarios)
All tests pass. Build successful. Backward compatible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add agentOverrides (minDepth 3) and modelOverrides (minDepth 2) to VALID_CONFIG_KEYS
- Support per-key minDepth instead of hardcoded 3
- No knownFields for either key (sub-keys are user-defined)
- Add 5 new tests covering valid/invalid paths for both keys
小橘 <xiaoju@shazhou.work>
- Add engines.bun >= 1.0.0 to workflow-agent-hermes package.json
- Update README to explain uwf-hermes is an adapter, not hermes itself
- Update uwf setup --agent help text to mention adapter concept
- Add tests for engines field, shebang, and adapter docs
- Patch uncaged-workflow-cli skill with Agent Adapters section
Adds 'uwf skill user' command for agents/humans using the uwf CLI.
Covers setup, workflow management, thread lifecycle, step operations,
CAS queries, logging, and global options with a Quick Start guide.
Refs #538