eval: usage/turns statistics are severely inaccurate #91
Problem
After an eval run, the usage data recorded in the CAS StepNode is severely inconsistent with the actual data in the hermes session DB.
Reproduction
Run uwf-eval run fix-add-bug, then inspect the StepNode in the CAS detail.
Analysis
uwf-hermes drives the hermes agent over the ACP protocol (session/prompt). When the prompt completes, it calls loadHermesSessionFromDb() to read the after snapshot from ~/.hermes/state.db, then computes the delta against the before snapshot.
Suspected cause:
(… input_tokens = input_tokens + ?) may not all be committed yet when the after snapshot is read, so the read can return an intermediate state.
Impact
The data the token-stats judge receives is unreliable.
Files involved
packages/agent-hermes/src/hermes.ts: runPrompt() + computeUsageDelta()
packages/agent-hermes/src/session.ts: loadHermesSessionFromDb()
packages/eval/src/runner/execute.ts: agent spawn + usage collection
Next steps
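Add debug logging to runPrompt() in hermes.ts that prints the actual before/after snapshot values, then run one eval to determine whether this is a race condition or a data-source problem.
小橘 🍊 (NEKO Team)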
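A minimal sketch of the debug step proposed above, written in TypeScript. Everything named here is an assumption for illustration: the UsageSnapshot shape (inputTokens, outputTokens, turns) and the helpers usageDelta(), formatSnapshotDebug(), and detectLateWrites() are hypothetical, and the real fields and signatures in hermes.ts and session.ts may differ. The second helper re-reads the snapshot after a short delay; if the values still change, writes were probably landing after the first read, which would point to a race rather than a data-source problem.

```typescript
// Hypothetical snapshot shape; the real fields in session.ts may differ.
interface UsageSnapshot {
  inputTokens: number;
  outputTokens: number;
  turns: number;
}

// Field-by-field delta between two snapshots (after minus before).
function usageDelta(before: UsageSnapshot, after: UsageSnapshot): UsageSnapshot {
  return {
    inputTokens: after.inputTokens - before.inputTokens,
    outputTokens: after.outputTokens - before.outputTokens,
    turns: after.turns - before.turns,
  };
}

// Build one debug line showing the raw before/after values and their delta.
function formatSnapshotDebug(
  label: string,
  before: UsageSnapshot,
  after: UsageSnapshot,
): string {
  const delta = usageDelta(before, after);
  return (
    `[${label}] before=${JSON.stringify(before)} ` +
    `after=${JSON.stringify(after)} delta=${JSON.stringify(delta)}`
  );
}

// Read the snapshot twice, delayMs apart, and report whether it changed.
// `load` is a hypothetical stand-in for whatever reads the session DB.
async function detectLateWrites(
  load: () => Promise<UsageSnapshot>,
  delayMs = 500,
): Promise<boolean> {
  const first = await load();
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const second = await load();
  const drift = usageDelta(first, second);
  return drift.inputTokens !== 0 || drift.outputTokens !== 0 || drift.turns !== 0;
}
```

In a debugging run, the formatted line could be logged right after the after snapshot is read, and detectLateWrites() called once with the same loader. A result of true suggests the after snapshot was taken too early; false with a still-wrong delta suggests the data source itself is the problem.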