fix: detect crashed threads as failed instead of stuck running #171

Closed
xiaoju wants to merge 0 commits from fix/170-thread-status-detection into main

What

Fix thread status detection so that crashed or failed workflows show the correct status instead of being stuck as 'running'.

Why

When a worker crashed (e.g. on a module resolution failure), its .running marker file was never cleaned up, so the dashboard showed the thread as 'running' forever. (#170)

Changes

  • thread-scan.ts — new resolveThreadListStatus() checks CAS chain head for __end__ node; new readThreadTerminalFromHead() utility; listRunningThreads filters out stale markers; HistoricalThreadRow gains head field
  • routes-thread.ts — uses resolveThreadListStatus() instead of raw .running file check
  • worker.ts — SIGINT/SIGTERM handlers call unlinkSync on all .running markers; finally block cleans up before removing from threads map
  • worker-spawn.ts — resolveRunningHashForThread checks CAS terminal state before reporting running
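
The worker.ts change above could look roughly like the sketch below. This is not the code from the PR: `removeRunningMarkers` and `installMarkerCleanup` are hypothetical names, and how the worker tracks its marker paths is an assumption. Only `unlinkSync` and the SIGINT/SIGTERM handlers come from the description.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: delete each .running marker, tolerating markers
// that are already gone. Returns how many files were actually removed.
function removeRunningMarkers(markerPaths: Iterable<string>): number {
  let removed = 0;
  for (const markerPath of markerPaths) {
    try {
      fs.unlinkSync(markerPath);
      removed += 1;
    } catch (err) {
      // A missing marker is fine; anything else is a real error.
      if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
    }
  }
  return removed;
}

// Hypothetical wiring: listMarkers stands in for however the worker
// enumerates the marker files of its live threads.
function installMarkerCleanup(listMarkers: () => Iterable<string>): void {
  // Conventional exit codes: 128 + signal number.
  const exitCodes = { SIGINT: 130, SIGTERM: 143 } as const;
  for (const signal of ["SIGINT", "SIGTERM"] as const) {
    process.on(signal, () => {
      try {
        removeRunningMarkers(listMarkers());
      } finally {
        process.exit(exitCodes[signal]);
      }
    });
  }
}
```

Signal handlers do not run on SIGKILL or a hard crash, so markers can still go stale; that is presumably why the scan side also checks the CAS chain rather than trusting the marker alone.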

Status values

  • running — actively executing (marker + no terminal node)
  • completed — finished successfully (returnCode === 0)
  • failed — finished with error (returnCode !== 0)
  • active — registered but not currently running
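
The four status values above can be read as a small decision function. The sketch below is an illustration under stated assumptions, not the actual resolveThreadListStatus(): the `ThreadProbe` and `TerminalInfo` shapes and the name `resolveStatusSketch` are hypothetical, and it assumes the terminal `__end__` node takes precedence over a leftover marker.

```typescript
type ThreadStatus = "running" | "completed" | "failed" | "active";

// Hypothetical shape: what was read from an __end__ node at the CAS chain head.
interface TerminalInfo {
  returnCode: number;
}

// Hypothetical shape: the two facts the status decision depends on.
interface ThreadProbe {
  hasRunningMarker: boolean; // a .running marker file exists for the thread
  terminal: TerminalInfo | null; // non-null when the chain head is an __end__ node
}

function resolveStatusSketch(probe: ThreadProbe): ThreadStatus {
  // A terminal node wins over the marker, so a stale marker
  // cannot keep a finished thread listed as running.
  if (probe.terminal !== null) {
    return probe.terminal.returnCode === 0 ? "completed" : "failed";
  }
  // No terminal node: the marker decides between running and active.
  return probe.hasRunningMarker ? "running" : "active";
}
```

Under these assumptions, a thread with a stale marker and an `__end__` node carrying a non-zero returnCode resolves to `failed`, which is the case the issue describes as previously stuck on 'running'.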

Ref

Fixes #170


小橘 🍊(NEKO Team)

xiaoju added 1 commit 2026-05-09 12:28:55 +00:00
- resolveThreadListStatus() checks CAS chain for __end__ node
- Stale .running markers no longer cause false 'running' status
- Distinguish 'failed' (returnCode != 0) from 'completed'
- Worker signal handlers (SIGINT/SIGTERM) clean up .running files
- listRunningThreads filters out terminated threads with stale markers

Fixes #170

小橘 <xiaoju@shazhou.work>
xiaoju closed this pull request 2026-05-11 06:28:03 +00:00


Reference: uncaged/workflow#171