fix: read token usage from ACP response instead of DB

Token counts (inputTokens, outputTokens) now come from the ACP PromptResponse.usage field, which is populated synchronously from the return value of run_conversation(), so there is no WAL race. Turns still come from a before/after snapshot of the DB.

Previously both values were read from the Hermes state.db after the ACP prompt returned, but WAL write lag often left the token data incomplete at read time (e.g. 235 tokens recorded vs. the actual 26,080).

Refs #91
2026-06-05 06:07:39 +00:00
parent 8764d7bda3
commit 8085d1d6e0
5 changed files with 125 additions and 70 deletions
@@ -0,0 +1,16 @@
+---
+"@united-workforce/agent-hermes": patch
+---
+
+fix: read token usage from ACP PromptResponse instead of DB
+
+Token counts (`inputTokens`, `outputTokens`) now come from the ACP
+`PromptResponse.usage` field, which is populated synchronously from the
+`run_conversation()` return data, so there is no WAL race condition.
+
+Turns (assistant message count) still come from the DB, computed as the
+delta between `snapshotTurns()` calls made before and after the prompt.
+
+Previously both tokens and turns were read from the Hermes state DB
+after the ACP prompt returned, but WAL write lag often left the token
+data incomplete at read time (e.g. 235 tokens vs. the actual 26,080).
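
The split described in this changeset (tokens from the response, turns from a DB delta) might look roughly like the sketch below. It is illustrative only, not the code in this commit: `statsFromResponse`, `RunStats`, and the interface shapes are hypothetical names, and falling back to zero when `usage` is absent is an assumption rather than confirmed behavior. Only `inputTokens`, `outputTokens`, `PromptResponse.usage`, and the before/after `snapshotTurns()` idea come from the description above.

```typescript
// Hypothetical shapes. Only the field names inputTokens/outputTokens and
// the `usage` property are taken from the changeset text.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface PromptResponse {
  usage?: Usage;
}

interface RunStats {
  inputTokens: number;
  outputTokens: number;
  turns: number;
}

// Combine the two sources:
// - tokens come straight from the prompt response, so no DB read is involved;
// - turns are the difference between two turn snapshots taken from the DB
//   (in the real adapter these would be snapshotTurns() results captured
//   before and after the prompt call).
function statsFromResponse(
  response: PromptResponse,
  turnsBefore: number,
  turnsAfter: number,
): RunStats {
  return {
    // Assumption: report 0 if the response carries no usage block.
    inputTokens: response.usage?.inputTokens ?? 0,
    outputTokens: response.usage?.outputTokens ?? 0,
    // Clamp at 0 so a stale "after" snapshot cannot yield a negative count.
    turns: Math.max(0, turnsAfter - turnsBefore),
  };
}
```

Keeping the function pure (plain values in, plain values out) means the token path can be checked without a database, which is the property the fix relies on.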