Guides

OpenClaw Fixes OAuth Token Refresh Race in Multi-Agent Setups

A new cross-agent file lock in OpenClaw serializes OAuth token refreshes, eliminating the refresh_token_reused storms that plagued large Codex deployments.

Filed under Guides 3 min read Updated Jun 3, 2026

Filed under Guides fixes oauth token

OpenClaw archive Editorial standards RSS feed

Cody April 17th 2026

OpenClaw Fixes OAuth Token Refresh Race in Multi-Agent Setups

Multi-agent OpenClaw deployments — particularly those running large pools of Codex agents against a shared GitHub Copilot OAuth profile — have long been plagued by a subtle but disruptive failure mode. A new fix merged today puts an end to it.

PR #67876, contributed by visionik and co-authored with HeroSizy, lands a cross-agent OAuth refresh serialization layer that resolves the long-tracked issue #26322.

The Problem: Refresh Token Storms

When a shared OAuth token expires and multiple agents hold it simultaneously, every agent races to refresh it at the same time. Providers like GitHub rotate the refresh token on each successful call, which means only the first agent's refresh wins. Every other agent receives a refresh_token_reused HTTP 401 — and cascades into model fallback, even though fresh credentials are now available.

For anyone running 10–20 Codex agents sharing a single Copilot profile, this created a token-expiry storm roughly every 12 hours: a burst of cascading fallback errors that required manual recovery.

The Fix: Three Layers of Serialization

The solution stacks three distinct serialization mechanisms:

1. Cross-process file lock
A new lock path at $STATE_DIR/locks/oauth-refresh/sha256-<hex> ensures that agents across separate OS processes queue up on a single coordination point before touching a shared profile's token. The lock key now includes both the provider name and the profile ID (NUL-separated to prevent concatenation collisions), so distinct providers never needlessly block each other.

2. In-process Promise queue
Within a single OpenClaw process, a keyed Promise chain prevents concurrent async calls from slipping past the file lock simultaneously — a scenario that's easy to hit in the async JS runtime when many tool calls land at once.

3. Credential mirroring with identity validation
After a successful refresh, the fresh credential is mirrored back into the main-agent store. Peers that acquire the lock afterward skip their own HTTP refresh and adopt the already-fresh credential instead. This collapses N serialized refreshes into 1 real refresh + (N-1) cheap adoptions.

A new isSafeToCopyOAuthIdentity gate guards the mirror and adoption paths: it allows credential copies only when there is no positive identity mismatch and the incoming credential carries at least as much identity evidence (accountId, email) as the existing entry. This prevents a misconfigured sub-agent from overwriting the main store with foreign-account tokens — closing a CWE-284-class authorization issue that Aisle flagged during review.

Lock Timeout Safety

The file lock carries a 3-minute stale timeout (stale = 180,000ms). A hard OAUTH_REFRESH_CALL_TIMEOUT_MS = 120,000ms cap on the underlying HTTP call guarantees the invariant: every legitimate refresh completes before the lock can be reclaimed by a waiting peer. The two constants are explicitly tested to enforce the call_timeout < stale relationship.

Who Is Affected?

This fix is most impactful for:

Multi-agent Codex setups — 10+ openai-codex agents sharing a GitHub Copilot OAuth profile
Team deployments — isolated agent sessions that share a single provider account
Automated pipelines — long-running agent pools with periodic token expiration

Single-agent installs will see no behavioral change. The lock overhead is negligible for the common case.

Test Coverage

The PR ships 80+ new tests across 11 files, including seeded-RNG fuzz suites covering ~4,500 adversarial inputs. The headline test — oauth.concurrent-20-agents.test.ts — fires 20 agents simultaneously against a single profile and asserts that exactly one HTTP refresh call fires while all 20 agents receive the same fresh token. Lock path safety is validated against 2,700+ adversarial profile ID inputs.

The fix is available in the current main branch and will ship in the next OpenClaw release.

OpenClaw Fixes OAuth Token Refresh Race in Multi-Agent Setups

The Problem: Refresh Token Storms

The Fix: Three Layers of Serialization

Lock Timeout Safety

Who Is Affected?

Test Coverage

Latest in Guides

OpenClaw on YouTube This Week: Upgrades, Local LLMs, and Autonomous Agents

HolaClaw Brings OpenClaw to Mac in One Click — No Terminal Required

OpenClaw Beta Brings Grok OAuth, Localization, and 50+ Fixes

OpenClaw v2026.5.22: Model Listing Gets 4,100x Faster

OpenClaw v2026.5.12-beta.6: Gateway v4 Protocol, Copilot Image Fixes, and Cron Inspection API

Continue Reading

OpenClaw release coverage

Security alerts and hardening guides

OpenClaw setup and migration guides

Browse the full OpenClaw archive

Get the Open-Source Briefing

OpenClaw Fixes OAuth Token Refresh Race in Multi-Agent Setups

The Problem: Refresh Token Storms

The Fix: Three Layers of Serialization

Lock Timeout Safety

Who Is Affected?

Test Coverage

Latest in Guides

OpenClaw on YouTube This Week: Upgrades, Local LLMs, and Autonomous Agents

HolaClaw Brings OpenClaw to Mac in One Click — No Terminal Required

Related OpenClaw coverage

OpenClaw Beta Brings Grok OAuth, Localization, and 50+ Fixes

OpenClaw v2026.5.22: Model Listing Gets 4,100x Faster

OpenClaw v2026.5.12-beta.6: Gateway v4 Protocol, Copilot Image Fixes, and Cron Inspection API

Continue Reading

OpenClaw release coverage

Security alerts and hardening guides

OpenClaw setup and migration guides

Browse the full OpenClaw archive

Get the Open-Source Briefing