nerve/.knowledge/adapter-isolation.md
xiaoju 9c832b0e21 docs(knowledge): update cards via knowledge-extraction workflow (5q/round)
7 cards updated, 4 new cards added. Topics: signal-routing,
worker-isolation, storage-layer, adapter-isolation, sense contracts,
workflow runtime enforcement, coding conventions details.

小橘 <xiaoju@shazhou.work>
2026-04-30 05:56:29 +00:00


Adapter Process Isolation

Describes sandboxing, process isolation, resource limits, and timeout enforcement for adapter invocations in the Nerve workflow system.

Process Isolation Model

Adapters run in a two-tier isolation model:

  1. Workflow Worker Process — Each workflow runs in a dedicated Node.js worker process (workflow-worker.ts) forked from the main daemon
  2. Adapter Child Process — Each adapter spawns CLI tools as child processes via spawnSafe() with shell: false

Resource Limits & Timeouts

Adapter-Level Timeouts

  • Default timeout: 300 seconds (300,000 ms) for both cursor and hermes adapters
  • Configurable via AgentConfig.timeout in adapter factory functions
  • Wall-clock enforcement using setTimeout() — kills child process with SIGTERM on timeout
  • AbortSignal support — external cancellation triggers immediate SIGTERM
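The mechanism described above (a wall-clock timer plus an optional AbortSignal, both ending in SIGTERM) can be sketched as follows. This is a minimal illustration and not the actual spawnSafe() implementation: the function name `runWithTimeout`, the `RunResult` type, and the `"ok"` result kind are assumptions; only the `{ kind: "timeout", stdout, stderr }` shape comes from this card.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result type; only the "timeout" variant is described in this card.
type RunResult =
  | { kind: "ok"; code: number | null; stdout: string; stderr: string }
  | { kind: "timeout"; stdout: string; stderr: string };

function runWithTimeout(
  cmd: string,
  args: string[],
  timeoutMs: number,
  signal?: AbortSignal,
): Promise<RunResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, {
      shell: false,
      stdio: ["ignore", "pipe", "pipe"],
    });
    let stdout = "";
    let stderr = "";
    let stopped = false;
    child.stdout.setEncoding("utf8");
    child.stderr.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => { stdout += chunk; });
    child.stderr.on("data", (chunk: string) => { stderr += chunk; });

    // Single SIGTERM, no grace period and no SIGKILL follow-up.
    const stop = () => {
      stopped = true;
      child.kill("SIGTERM");
    };
    const timer = setTimeout(stop, timeoutMs);
    if (signal?.aborted) stop();
    signal?.addEventListener("abort", stop, { once: true });

    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("close", (code) => {
      clearTimeout(timer);
      signal?.removeEventListener("abort", stop);
      resolve(
        stopped
          ? { kind: "timeout", stdout, stderr }
          : { kind: "ok", code, stdout, stderr },
      );
    });
  });
}
```

In this sketch an external abort is reported with the same `"timeout"` kind as an expired timer, which is an assumption about how the two cases are surfaced.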

Timeout Behavior

// Timeout resolution priority (packages/core/src/spawn-safe.ts):
// 1. Explicit timeoutMs value
// 2. AbortSignal presence → no internal timer (relies on external abort)
// 3. DEFAULT_TIMEOUT_MS (300_000) fallback
  • Child process terminated with SIGTERM on timeout/abort
  • Returns { kind: "timeout", stdout, stderr } error result
  • No grace period — immediate kill
  • No SIGKILL escalation — relies entirely on SIGTERM effectiveness
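The three-step resolution order listed above can be written as a small pure function. This is a sketch under stated assumptions, not the code in spawn-safe.ts: the name `resolveTimeoutMs` and the `TimeoutOptions` shape are hypothetical, while `DEFAULT_TIMEOUT_MS` and its value are taken from the card.

```typescript
const DEFAULT_TIMEOUT_MS = 300_000;

// Hypothetical options shape; the real spawnSafe() signature may differ.
interface TimeoutOptions {
  timeoutMs?: number;
  signal?: AbortSignal;
}

// Returns the duration of the internal timer in milliseconds, or undefined
// when no internal timer should be armed and only the caller's AbortSignal
// can stop the child.
function resolveTimeoutMs(opts: TimeoutOptions): number | undefined {
  if (opts.timeoutMs !== undefined) return opts.timeoutMs; // 1. explicit value
  if (opts.signal !== undefined) return undefined; // 2. signal only, no timer
  return DEFAULT_TIMEOUT_MS; // 3. fallback
}
```

One consequence worth noting: a caller that passes a signal but never aborts it gets no upper bound on run time at all.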

SIGTERM Limitations

If a child process ignores or blocks SIGTERM (e.g., signal handlers, blocked delivery):

  • No fallback to SIGKILL — process may remain alive indefinitely
  • No escalation timer — spawnSafe() does not implement progressive signal escalation
  • Orphan risk: an unresponsive child is not a zombie; it stays alive and keeps consuming resources
  • OS-level cleanup only: the process persists until it exits on its own or is killed externally (for example by an operator or at system shutdown)
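For contrast, the escalation that spawnSafe() does not perform could look roughly like the sketch below. Everything here is hypothetical: the `Killable` interface, the function name, the 5-second grace period, and the injectable `schedule` parameter (present only so the logic can be exercised without real timers) are illustrative choices, not part of Nerve.

```typescript
// Structural subset of Node's ChildProcess that the sketch needs.
interface Killable {
  exitCode: number | null;
  signalCode: string | null;
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

type Schedule = (fn: () => void, ms: number) => unknown;

// Sends SIGTERM, then SIGKILL if the child is still alive after graceMs.
function terminateWithEscalation(
  child: Killable,
  graceMs = 5_000,
  schedule: Schedule = (fn, ms) => setTimeout(fn, ms),
): void {
  child.kill("SIGTERM");
  schedule(() => {
    // exitCode and signalCode both stay null while the process is running.
    const stillRunning = child.exitCode === null && child.signalCode === null;
    if (stillRunning) child.kill("SIGKILL");
  }, graceMs);
}
```

SIGKILL cannot be caught or ignored by the target process, which is why it is the usual last step of such a sequence on POSIX systems.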

Sandboxing Characteristics

What's Isolated

  • File system: Child process runs in specified cwd (workflow working directory)
  • Environment: Controlled env vars via nerveCommandEnv() + optional overrides
  • Network: No explicit restrictions (inherits parent process network access)
  • Process tree: Child processes are direct children, not containerized

What's NOT Sandboxed

  • No resource quotas (CPU, memory, disk I/O limits)
  • No filesystem chroot/containers — full filesystem access within user permissions
  • No network isolation — can make arbitrary network calls
  • No syscall filtering — no seccomp or similar restrictions

Runtime Resource Enforcement

No active resource monitoring or constraints:

  • No cgroups (Linux) — no CPU, memory, or I/O limits enforced
  • No job objects (Windows) — no resource quotas or process tree limits
  • No worker_threads resource tracking — Node.js worker processes run unrestricted
  • Pure timeout-based enforcement — only wall-clock time limits via setTimeout()
  • OS-scheduled resource sharing — relies entirely on operating system process scheduling

Adapters can consume unlimited:

  • CPU time (until timeout)
  • Memory (until OOM)
  • Disk I/O (no quotas)
  • Network bandwidth (no throttling)
  • File descriptors (until ulimit)

Environment Variable Security

The nerveCommandEnv() function provides minimal sanitization:

// spawn-safe.ts lines 47-55
export function nerveCommandEnv(): SpawnEnv {
  const home = homedir();
  const pnpmHome = join(home, ".local/share/pnpm");
  return {
    ...process.env,           // ← Full parent environment inherited
    PNPM_HOME: pnpmHome,
    PATH: `${pnpmHome}:${process.env.PATH ?? ""}`,
  };
}
  • No filtering of sensitive keys: NODE_OPTIONS, LD_PRELOAD, and PYTHONPATH are passed through unchanged
  • Full environment inheritance — all parent process environment variables copied
  • Injection risk — malicious env vars (e.g., NODE_OPTIONS=--require=evil.js) affect Node.js child processes
  • Path manipulation — sensitive PATH entries remain accessible to adapters
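To make the gap concrete, here is a sketch of the kind of filtering that nerveCommandEnv() currently does not do. The function `filteredEnv` and the `SENSITIVE_KEYS` deny-list are hypothetical and are not part of spawn-safe.ts; the three keys are simply the ones named above.

```typescript
// Hypothetical deny-list built from the keys mentioned in this card.
const SENSITIVE_KEYS: readonly string[] = ["NODE_OPTIONS", "LD_PRELOAD", "PYTHONPATH"];

type Env = Record<string, string | undefined>;

// Returns a copy of the parent environment without the denied keys.
// The parent object itself is left untouched.
function filteredEnv(parent: Env, denied: readonly string[] = SENSITIVE_KEYS): Env {
  const result: Env = {};
  for (const [key, value] of Object.entries(parent)) {
    if (!denied.includes(key)) result[key] = value;
  }
  return result;
}

// Usage: derive the child's environment from a filtered copy.
const childEnv = filteredEnv(process.env);
```

A deny-list only covers the keys someone thought of; an allow-list that copies a fixed set of variables is stricter, at the cost of breaking tools that expect others.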

Security Model

Execution Context

  • Uses shell: false to prevent shell injection attacks
  • Arguments passed as separate array elements (not shell-parsed)
  • PATH includes ~/.local/share/pnpm for tool discovery
  • Inherits parent process user/group permissions
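The effect of shell: false with array arguments can be checked with a short experiment. The sketch below uses Node's own binary as a stand-in for an adapter CLI; the helper name `echoArg` and the sample string are illustrative.

```typescript
import { spawnSync } from "node:child_process";

// A string that a shell would split into two commands.
const hostile = "hello; echo injected";

// Passes `arg` to a child Node.js process and returns what the child saw.
// With `node -e <script> <arg>`, the extra argument arrives as process.argv[1].
function echoArg(arg: string): string {
  const result = spawnSync(
    process.execPath,
    ["-e", "process.stdout.write(process.argv[1])", arg],
    { shell: false, encoding: "utf8" },
  );
  return result.stdout;
}

// The child receives the string as one literal argument; nothing is executed.
const seen = echoArg(hostile);
```

This protects against shell metacharacters only. A CLI that itself interprets its arguments (for example as option flags) still needs its own validation.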

File Descriptor Management

// spawn-safe.ts line 122
stdio: ["ignore", "pipe", "pipe"]
  • stdin closed: Child receives no input (stdio[0]: "ignore")
  • stdout/stderr captured: Piped to parent for collection (stdio[1,2]: "pipe")
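  • No explicit fd closing: spawnSafe() does not close other descriptors itself and relies on Node.js defaults
  • Parent sockets/pipes: Node.js normally opens its own descriptors with close-on-exec, so only the stdio entries listed above are passed to the child; descriptors opened without that flag (for example by native addons) can still be inherited
  • Security risk: Adapter processes may access any parent file descriptor that was opened without close-on-exec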
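The stdio triple can be observed directly: with stdin set to "ignore" the child reads end-of-file immediately, while stdout and stderr come back to the parent as separate captures. A minimal sketch, again using the Node.js binary as a stand-in child; `runCaptured` and the child script are illustrative.

```typescript
import { spawnSync } from "node:child_process";

// Child script: count the bytes readable from stdin (fd 0), then write one
// message to each output stream.
const childScript = `
  const n = require("node:fs").readFileSync(0).length;
  process.stdout.write("stdin bytes: " + n);
  process.stderr.write("warning");
`;

function runCaptured(): { stdout: string; stderr: string } {
  const result = spawnSync(process.execPath, ["-e", childScript], {
    shell: false,
    stdio: ["ignore", "pipe", "pipe"], // same triple as in spawn-safe.ts
    encoding: "utf8",
  });
  return { stdout: result.stdout, stderr: result.stderr };
}

const captured = runCaptured();
```

Because stdin is never connected, adapter CLIs that prompt interactively will see an empty input rather than block on the parent's terminal.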

Attack Surface

  • CLI tools have full user-level filesystem access
  • Can spawn additional processes (not tracked/limited)
  • Network requests unrestricted
  • Resource consumption relies on OS-level limits

Worker Process Management

Workflow Isolation

  • Each workflow type gets a dedicated worker process
  • Worker processes handle multiple concurrent threads (runIds)
  • Kill flags enable per-thread cancellation without killing worker
  • Graceful shutdown waits up to 10 seconds for in-flight operations
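One way to picture per-thread kill flags is a registry keyed by runId that the worker consults between steps. The sketch below is an assumption about the general pattern, not the structure used in workflow-worker.ts; `KillFlags` and `runSteps` are hypothetical names.

```typescript
// Hypothetical registry of cancelled runIds inside one worker process.
class KillFlags {
  private readonly killed = new Set<string>();

  kill(runId: string): void {
    this.killed.add(runId);
  }

  isKilled(runId: string): boolean {
    return this.killed.has(runId);
  }

  clear(runId: string): void {
    this.killed.delete(runId);
  }
}

// Runs steps in order and checks the flag before each one, so a cancelled
// run stops at the next step boundary while the worker keeps running.
function runSteps(runId: string, steps: Array<() => void>, flags: KillFlags): number {
  let completed = 0;
  for (const step of steps) {
    if (flags.isKilled(runId)) break;
    step();
    completed++;
  }
  flags.clear(runId);
  return completed; // number of steps that actually ran
}
```

Cancellation in this style is cooperative: a step that is already executing is not interrupted, which is where the adapter-level SIGTERM path takes over.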

Cross-RunId Contamination Risks

Shared mutable state poses contamination risks between concurrent runIds:

  • process.env mutations: Environment changes affect all subsequent runIds in same worker
  • require.cache pollution: Module cache shared across all runIds — side effects persist
  • Global variables: Any global state mutations from one runId visible to others
  • process.cwd() changes: Working directory changes affect entire worker process
  • File descriptors: Open files/sockets shared between runId executions

No runId-specific scoping implemented:

  • Worker reuses single Node.js process for efficiency
  • Each role execution sees the cumulative environment left behind by previous runIds
  • Mitigation relies on adapter discipline — clean implementations avoid global mutations
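The process.env case is easy to reproduce in a single Node.js process. In the sketch below two functions stand in for role executions belonging to different runIds; the variable name `NERVE_DEMO_FLAG` and all function names are made up for the illustration.

```typescript
// First "run": mutates the shared process environment.
function runA(): void {
  process.env.NERVE_DEMO_FLAG = "set-by-run-a";
}

// Second "run": reads the same global and sees the leftover value.
function runB(): string | undefined {
  return process.env.NERVE_DEMO_FLAG;
}

// A disciplined alternative: build a per-run copy and hand it to the child
// process instead of writing to process.env.
function scopedEnv(overrides: Record<string, string>): Record<string, string | undefined> {
  return { ...process.env, ...overrides };
}

runA();
const leaked = runB(); // "set-by-run-a": the mutation outlived run A
const isolated = scopedEnv({ NERVE_SCOPED_DEMO: "1" });
```

The scoped copy carries the override, while process.env itself is unchanged, so later runs in the same worker do not observe it.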

Error Handling

  • Adapter failures don't crash the worker process
  • Timeout/abort errors are isolated to specific role execution
  • Worker process survives adapter failures and continues serving other threads

Configuration

# Example nerve.yaml configuration for timeout overrides
workflows:
  my-workflow:
    roles:
      coder:
        adapter:
          type: cursor
          timeout: 600000  # 10 minutes in milliseconds

Timeout configuration happens at the adapter creation level, not as a system-wide sandbox policy.