united-workforce

Author	SHA1	Message	Date
xiaoju	a47871ec4e	chore: remove unused moderator-reference and yaml-reference. These generate* functions were exported from util but never consumed by any code. Dead exports are maintenance burden. Refs #101	2026-06-05 09:44:50 +00:00
xiaoju	5851e5d162	docs: update wf-stateless-design.md to reflect new/resume semantics. Refs #101	2026-06-05 09:38:01 +00:00
xiaomo	61dfb40933	Merge pull request 'feat: replace $START `_` status with `new`/`resume` semantics' (#102) from feat/101-start-new-resume into main	2026-06-05 09:35:35 +00:00
xiaoju	fbfd31a042	feat: replace $START `_` status with `new`/`resume` semantics. BREAKING: All workflow YAML files must update $START._ to $START.new + $START.resume. The resume edge prompt replaces the previously hardcoded resume message. - evaluate.ts: remove START_ROLE/START_STATUS special case, use $status like all nodes - thread.ts: resolveEvaluateArgs passes 'new', cmdThreadResume passes 'resume' - validate.ts: reject '_' everywhere (no longer valid) - validate-semantic.ts: require 'new' and 'resume' edges on $START - All workflow YAMLs and test fixtures updated Fixes #101	2026-06-05 09:30:09 +00:00
xiaomo	d99a376b60	Merge pull request 'fix: simplify prompt subcommands, framework-agnostic bootstrap' (#100) from fix/99-prompt-cleanup into main	2026-06-05 09:03:56 +00:00
xiaoju	a536efee00	fix: simplify prompt subcommands, framework-agnostic bootstrap - `uwf prompt usage` now outputs only the usage skill (was three combined) - `uwf prompt bootstrap` replaces `setup` with framework-agnostic instructions - Remove `usage-reference` and `setup` subcommands - Remove `generateBootstrapReference` from util (moved to cli) Fixes #99 小橘 🍊（NEKO Team）	2026-06-05 08:52:35 +00:00
xiaoju	9260d81084	chore: version bump for --version fix: agent-hermes@0.1.2 agent-claude-code@0.1.1 agent-builtin@0.1.1 agent-mock@0.1.1 eval@0.1.3 util@0.1.1 小橘 🍊（NEKO Team） agent-builtin-v0.1.1 agent-claude-code-v0.1.1 agent-hermes-v0.1.2 agent-mock-v0.1.1 eval-v0.1.3 util-v0.1.1	2026-06-05 08:12:50 +00:00
xiaomo	c8d884072a	Merge pull request 'fix: acp-client reports agent-hermes own version in MCP clientInfo' (#98) from fix/acp-client-own-version into main	2026-06-05 08:10:57 +00:00
xiaoju	abeb465f46	fix: acp-client reports own package version, not util VERSION. Address review nit from PR #97: clientInfo.version should be agent-hermes's own version for correct identification under independent versioning. 小橘 🍊（NEKO Team）	2026-06-05 07:50:03 +00:00
xiaomo	28427a973f	Merge pull request 'fix: add --version to adapter CLIs, read VERSION from package.json' (#97) from fix/adapter-version into main	2026-06-05 07:36:15 +00:00
xiaoju	794f9db568	fix: add --version to adapter CLIs, read VERSION from package.json - All uwf-* adapter CLIs now support --version / -V - util VERSION constant reads from package.json at runtime - agent-hermes ACP clientInfo uses dynamic VERSION 小橘 🍊（NEKO Team）	2026-06-05 07:29:54 +00:00
xiaoju	cd585a26f1	Merge pull request 'fix: read eval CLI version from package.json' (#96) from fix/95-eval-version into main	2026-06-05 06:46:32 +00:00
xiaoju	1cf8f350d0	fix: read eval CLI version from package.json. Fixes #95 小橘 🍊（NEKO Team）	2026-06-05 06:43:27 +00:00
xiaoju	427568a21d	chore: version bump agent-hermes@0.1.1 cli@0.1.1 eval@0.1.2 小橘 🍊（NEKO Team） agent-hermes-v0.1.1 cli-v0.1.1 eval-v0.1.2	2026-06-05 06:29:25 +00:00
xiaomo	d3a2353acf	Merge pull request 'fix: read token usage from ACP response instead of DB' (#94) from fix/usage-tokens-from-acp into main	2026-06-05 06:18:05 +00:00
xiaoju	8085d1d6e0	fix: read token usage from ACP response instead of DB. Tokens (inputTokens, outputTokens) now come from ACP PromptResponse.usage which is populated synchronously from run_conversation() — no WAL race. Turns still come from DB before/after snapshot. Previously both were read from hermes state.db after ACP prompt returned, but WAL write lag caused incomplete token data (e.g. 235 vs actual 26,080). Refs #91	2026-06-05 06:08:11 +00:00
xiaomo	8764d7bda3	Merge pull request 'chore: add changeset for #92 agent override alias fix' (#93) from chore/changeset-agent-override into main	2026-06-05 05:17:36 +00:00
xiaoju	850a3b2f25	chore: add changeset for #92 agent override alias fix	2026-06-05 04:36:41 +00:00
xiaomo	3d6a517e83	Merge pull request 'fix: resolve --agent override via config alias before raw command' (#92) from fix/agent-override-alias into main	2026-06-05 04:31:50 +00:00
xiaoju	825f0c641a	fix: resolve --agent override via config alias before raw command. When --agent is passed to uwf thread exec, try config.agents[alias] first (e.g. 'hermes' → config.agents.hermes = {command: 'uwf-hermes'}), then fall back to parseAgentOverride for raw command names. Also change eval CLI default --agent from 'hermes' to 'uwf-hermes' so it works without config alias lookup. Refs #91	2026-06-05 04:20:09 +00:00
xiaoju	81bbe1178f	chore: release @united-workforce/eval@0.1.1 eval-v0.1.1	2026-06-05 03:02:05 +00:00
xiaoju	a0e139935e	Merge pull request 'fix: frontmatter judge handles parsed object output' (#90) from fix/frontmatter-judge-object-output into main	2026-06-05 03:01:30 +00:00
xiaoju	a08775896f	fix: frontmatter judge handles parsed object output. The extract pipeline stores step output as a JSON object in CAS, but the frontmatter judge only checked for raw markdown strings. Now accepts both formats: parsed objects check $status directly, raw strings go through YAML frontmatter extraction. Fixes eval frontmatter-compliance scoring 0 on valid outputs.	2026-06-05 02:55:58 +00:00
xiaoju	c892b9125b	chore: remove prepublishOnly guards (proman handles release) v0.1.0	2026-06-05 02:29:53 +00:00
xiaoju	8c5e12c5c8	Merge pull request 'chore: prepare 0.1.0 release' (#89) from chore/prepare-release into main	2026-06-05 02:28:08 +00:00
xiaoju	5edb67b79d	chore: prepare 0.1.0 release - Remove legacy .changeset/ directory (no longer used) - Add eval package to proman.yaml - Set eval package to public for npm publishing	2026-06-05 02:21:24 +00:00
xiaoju	3d8df5c8e2	Merge pull request 'fix: remove _ single-exit for user roles' (#88) from fix/86-remove-single-exit-underscore into main	2026-06-05 02:09:50 +00:00
xiaoju	63cb4d3645	fix: remove _ single-exit for user roles. $START keeps _ (special entry node). All user-defined roles now require explicit $status enum in frontmatter + matching graph keys. - moderator: remove UNIT_STATUS fallback, error on missing $status - validate: reject _ graph keys for non-$START roles - validate-semantic: remove checkSingleExitRole(), require $status enum - update all test fixtures to use explicit status values - fix examples/analyze-topic.yaml Fixes #86	2026-06-05 02:00:45 +00:00
xiaomo	f373945304	Merge pull request 'feat: eval package scaffold — CLI + schemas + types + task loader' (#85) from feat/69-eval-scaffold into main	2026-06-05 00:23:56 +00:00
xiaoju	ae81e4b5ac	feat: eval report, diff, list commands. Implement the 3 read commands for eval framework: - report: read eval-run from CAS, render formatted text (task, overall, config, judges table, thread ID) - diff: side-by-side comparison with ▲/▼ delta indicators and config change markers - list: scan @uwf/eval/*/latest variables, sort by timestamp desc, --task filter, --limit pagination Architecture: pure formatting functions (format.ts) + data access (read.ts) + thin CLI handlers. Types in types.ts. 11 new tests (formatReport, formatDiff, formatList, selectEntries) Refs #72	2026-06-05 00:19:25 +00:00
xiaoju	8c26f16716	feat: builtin judges — frontmatter + token-stats (deterministic) + upstream/hallucination (stubs). Implement 4 builtin judges for eval framework: - frontmatter-compliance: validates YAML frontmatter with $status field, score = stepsValid / stepsTotal - token-stats: aggregates Usage from step nodes, always score 1.0 (informational only) - upstream-consumption: LLM-as-judge stub (score 0, TODO) - hallucination: LLM-as-judge stub (score 0, TODO) Infrastructure: - judge/builtin/read-steps.ts — shell out to uwf step list - judge/builtin/types.ts — BuiltinJudge, BuiltinJudgeOutput - runner/collect.ts — dispatch builtin judges by name 9 new tests (frontmatter validation + token aggregation) Refs #71	2026-06-05 00:09:06 +00:00
xiaoju	fae9e9ed3a	feat: eval run command — prepare, execute, collect pipeline. Implement the uwf-eval run <task-dir> command with 3-phase pipeline: - prepare: read task.yaml, copy fixture/ to temp workdir - execute: shell out to uwf thread start + exec - collect: run judges, compute weighted score, store CAS node, set @uwf/eval/<task>/latest variable Changes: - src/runner/ — types, prepare, execute, collect, index - src/storage/store.ts — createEvalStore(), setEvalLatest() - src/commands/run.ts — full pipeline wiring with --agent/--model/--count - 9 new tests (prepare + collect + weighted scoring) Builtin judges return placeholder score 0 (Phase 1c). Refs #70	2026-06-04 23:59:21 +00:00
xiaoju	99619d85db	feat: eval package scaffold with CLI, schemas, types, task loader. New package @united-workforce/eval (uwf-eval CLI): - CLI skeleton: run/report/diff/list subcommands (stubs) - 5 OCAS schemas: eval-run, judge-frontmatter, judge-upstream, judge-hallucination, judge-token-stats - TaskManifest type + parser/validator for task.yaml - JudgeOutput/JudgeInput types for judge contract - EvalRunPayload/EvalRunConfig/EvalJudgeRecord storage types - 19 unit tests: task loader validation + schema definitions Refs #69	2026-06-04 23:42:16 +00:00
xiaomo	b94234652a	Merge pull request 'feat: agent-hermes reads real token counts from session DB' (#84) from feat/76-hermes-real-tokens into main	2026-06-04 23:31:09 +00:00
xiaoju	1593dbb521	fix: compute usage as delta for session re-entry. On session resume, turns/inputTokens/outputTokens were cumulative (entire session history) instead of per-step increments. Now we snapshot metrics before prompt, compare after, and report the delta. Changes: - acp-client: add getSessionId() accessor - hermes: extract snapshotUsage() + computeUsageDelta() pure functions - hermes: runPrompt/runHermes/continueHermes use before/after snapshots - 9 new unit tests for usage delta computation Refs #68	2026-06-04 23:22:16 +00:00
xiaoju	d1c523c442	feat: agent-hermes reads real token counts from session DB - Add inputTokens/outputTokens to HermesSessionJson type - Query input_tokens, output_tokens from sessions table in loadHermesSessionFromDb - Update test fixture schema with token columns - runPrompt now reports real token counts from Hermes state.db Refs #76, #68	2026-06-04 23:06:52 +00:00
xiaomo	4283e6766b	Merge pull request 'feat: agent-claude-code reports real $usage from stream-json' (#83) from feat/77-claude-code-usage into main	2026-06-04 22:55:15 +00:00
xiaomo	4e4fb61ff5	Merge pull request 'feat: agent-hermes reports $usage (turns + duration)' (#82) from feat/76-hermes-usage into main	2026-06-04 22:55:13 +00:00
xiaoju	be92cb2dd2	feat: agent-claude-code reports real $usage from stream-json output - Map parsed numTurns, inputTokens, outputTokens, durationMs to Usage type - Add @united-workforce/protocol dependency + tsconfig reference - 747 tests pass Fixes #77 Refs #68	2026-06-04 22:36:44 +00:00
xiaoju	7681e8b8e2	feat: agent-hermes reports $usage (turns + duration) - Count assistant turns from session messages - Measure wall-clock duration per prompt call - inputTokens/outputTokens remain 0 (ACP protocol doesn't expose token data yet) - Both runPrompt and continueHermes report usage Fixes #76 Refs #68	2026-06-04 22:30:14 +00:00
xiaomo	780005ad65	Merge pull request 'feat: agent-mock emits fixed $usage stats' (#81) from feat/75-mock-usage into main	2026-06-04 22:23:42 +00:00
xiaoju	248ac710fd	feat: agent-mock emits fixed $usage stats - Mock agent returns {turns:1, inputTokens:0, outputTokens:0, duration:0} - E2E test 1 (linear workflow) asserts usage in CAS step nodes - 747 tests pass Fixes #75 Refs #68	2026-06-04 22:19:29 +00:00
xiaomo	172c232e61	Merge pull request 'feat: add $usage field to adapter protocol' (#80) from feat/74-usage-in-protocol into main	2026-06-04 22:14:12 +00:00
xiaomo	5fe97591de	Merge pull request 'fix: agent bin fields point to dist/cli.js instead of src/cli.ts' (#79) from fix/agent-bin-78 into main	2026-06-04 15:41:45 +00:00
xiaoju	99f40c2488	feat: add $usage field to adapter protocol - Add Usage type to protocol (turns, inputTokens, outputTokens, duration) - Add usage to StepRecord, StepNodePayload, StepEntry, STEP_NODE_SCHEMA - Thread usage through util-agent extract pipeline (writeStepNode → persistStep → createAgent) - All adapters return usage: null as placeholder (mock, hermes, claude-code, builtin) - 746 tests pass, no breaking changes (usage not in schema required array) Fixes #74 Refs #68	2026-06-04 15:41:07 +00:00
xingyue	bf489c59a5	fix: agent bin fields point to dist/cli.js instead of src/cli.ts. All three agent packages had bin pointing to ./src/cli.ts (bun-era leftover). Node cannot execute .ts files directly, causing ERR_MODULE_NOT_FOUND when spawning agents. Closes #78	2026-06-04 23:25:39 +08:00
xiaomo	9908d069ec	Merge pull request 'refactor(prompt): rename subcommands and add frontmatter output' (#67) from feat/prompt-refactor-66 into main	2026-06-04 14:51:12 +00:00
xingyue	83bcda60ff	refactor(prompt): rename subcommands and add frontmatter output - Rename: user→usage-reference, author→workflow-authoring, adapter→adapter-developing - Remove: developer (content lives in CLAUDE.md) - All prompts output complete SKILL.md with YAML frontmatter - Setup instructions simplified: uwf prompt bootstrap > SKILL.md - Remove all bun references, use pnpm/npm - Fix CLAUDE.md: fixed→independent versioning - Delete old reference files (user/author/developer/adapter) Closes #66	2026-06-04 22:46:11 +08:00
xiaomo	17f7f44c43	Merge pull request 'chore: rebranding cleanup — reset versions to 0.1.0, bun→pnpm in docs' (#64) from chore/rebranding-cleanup into main	2026-06-04 13:13:03 +00:00
xiaoju	3401873051	chore: rebranding cleanup — reset versions to 0.1.0, bun→pnpm in docs - All 9 packages reset to version 0.1.0 - CLAUDE.md: bun→pnpm, fixed→independent versioning, proman commands - docs/architecture.md: bun→pnpm in toolchain table - docs/sync-readme.md: bun→pnpm in conventions	2026-06-04 13:05:26 +00:00
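
Commit 1593dbb521 describes snapshotting cumulative session metrics before a prompt and reporting the difference afterwards. A minimal sketch of that idea follows. The `Usage` field names come from commit 99f40c2488; the `UsageSnapshot` shape, the `usageDelta` name and signature, the zero clamp, and the millisecond unit for `duration` are assumptions made for illustration, not the repository's actual code.

```typescript
// Field names taken from commit 99f40c2488; the duration unit is assumed.
interface Usage {
  turns: number;
  inputTokens: number;
  outputTokens: number;
  duration: number; // assumed: wall-clock milliseconds for this step
}

// Hypothetical shape: cumulative session counters at one point in time.
interface UsageSnapshot {
  turns: number;
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical helper: subtract the "before" snapshot from the "after"
// snapshot so a resumed session reports only this step's increment.
// Clamping at zero is an added safeguard, not something the log states.
function usageDelta(
  before: UsageSnapshot,
  after: UsageSnapshot,
  durationMs: number,
): Usage {
  const diff = (a: number, b: number): number => Math.max(0, a - b);
  return {
    turns: diff(after.turns, before.turns),
    inputTokens: diff(after.inputTokens, before.inputTokens),
    outputTokens: diff(after.outputTokens, before.outputTokens),
    duration: durationMs,
  };
}
```

Keeping the subtraction in a pure function, as the commit message suggests, lets it be unit-tested without a live session database.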


1072 Commits
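
Commit 825f0c641a describes resolving `--agent` by trying `config.agents[alias]` first and falling back to the raw command name. The lookup order can be sketched roughly as below. The `Config` and `AgentConfig` shapes and the `resolveAgent` function are hypothetical, and the fallback is a simplified stand-in for `parseAgentOverride`, whose real behavior is not shown in this log.

```typescript
// Hypothetical config shapes, modeled on the example in the commit message:
// config.agents.hermes = { command: 'uwf-hermes' }
interface AgentConfig {
  command: string;
}

interface Config {
  agents: Record<string, AgentConfig>;
}

// Hypothetical resolver: an alias defined in the config wins; otherwise the
// override is treated as a raw command name (a simplified stand-in for
// parseAgentOverride).
function resolveAgent(config: Config, override: string): AgentConfig {
  const hasAlias = Object.prototype.hasOwnProperty.call(config.agents, override);
  if (hasAlias) {
    return config.agents[override];
  }
  return { command: override };
}

const exampleConfig: Config = { agents: { hermes: { command: "uwf-hermes" } } };
const viaAlias = resolveAgent(exampleConfig, "hermes");
const viaRaw = resolveAgent(exampleConfig, "uwf-mock");
```

Under these assumptions, `viaAlias.command` resolves to `uwf-hermes` through the alias table, while `viaRaw.command` stays `uwf-mock` because no such alias is configured.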