Applies to:
- Plan -
- Deployment -
Summary
Invoke calls hung for 400–600s while model generation completed in seconds. The root cause is an HTTP/2 streaming bug in the Node.js runtime that affects certain Node versions. Upgrade Node to a version that includes the bug fix (recommended) or apply a temporary client-side workaround.
What is happening
The SDK’s invoke path can block for minutes when Node’s HTTP/2 streaming path misbehaves. Model generation and first-token timings remain fast. The long delay happens before or after the model spans, in the SDK invocation/streaming layer, so client-side timeouts (for example 120s) fire even though the gateway and model succeeded.
Conditions that trigger it:
- Running a Node.js runtime that contains a known HTTP/2 streaming bug (see Node.js issue #63989).
- Using the SDK invoke function or streaming responses over HTTP/2.
Symptoms in traces and logs:
- Long-lived blocking in the invocation/streaming layer on the client process.
- Short LLM/generation spans (seconds).
- Long overall invoke span (minutes).
- Client-side timeout errors (e.g., “timed out after 120000ms”) from your wrapper.
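To see where the time goes on the client, it can help to wrap the call in a deadline and a wall-clock timer. The following is a minimal sketch, not the SDK's own API: `withTimeout` and `timed` are hypothetical helpers introduced here for illustration, and the call being wrapped is whatever invoke function your application already uses.

```javascript
// Hypothetical helper: races a promise against a client-side deadline.
// The error message format mirrors the example log line, not a real SDK error.
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: logs wall-clock time around any async call,
// whether it resolves or rejects.
async function timed(label, fn) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${label} took ${elapsedMs.toFixed(0)}ms`);
  }
}
```

A call such as `timed("invoke", () => withTimeout(yourInvokeCall(), 120000))` (where `yourInvokeCall` stands in for your existing code) gives a client-side duration to compare against the model span in your traces. A large gap between the two points at the invocation/streaming layer rather than the model.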
Fix or suggestion
Actionable steps:
- Check your Node version:
  - node -v
- Upgrade to a Node release that includes the fix (use the latest LTS or a release newer than the problematic version):
  - With nvm:
    - nvm install --lts
    - nvm use --lts
  - Or use your distro/package manager to install a supported release.
- Restart your application process(es) after upgrading.
- Re-run requests that previously hung.
Reported versions:
- Node.js 24.17.0 exhibited the problem.
- Downgrading to Node.js 24.16.0 resolved the issue for some users.
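If you want the application itself to flag a risky runtime at startup, a small check along these lines may help. This is a sketch under stated assumptions: the list contains only the version this article reports as affected, it is not an authoritative or exhaustive list, and `isReportedAffected` is a hypothetical helper name.

```javascript
// Versions reported as affected in this article.
// Assumption: not exhaustive; other releases may also contain the bug.
const REPORTED_AFFECTED = ["24.17.0"];

// Hypothetical helper: accepts "v24.17.0" (as printed by `node -v`)
// or "24.17.0" (as exposed by process.versions.node).
function isReportedAffected(version) {
  const normalized = version.trim().replace(/^v/, "");
  return REPORTED_AFFECTED.includes(normalized);
}

// Warn at startup if the running runtime matches a reported version.
if (isReportedAffected(process.versions.node)) {
  console.warn(
    `Node.js ${process.versions.node} is reported to exhibit the invoke hang; consider switching versions.`
  );
}
```

The check is advisory only. It does not detect the bug itself, so confirming against traces is still needed after changing versions.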
How to confirm it worked
- Reproduce the prior request and verify the overall invoke span is now short (matching the model span). Time-to-first-token should be in seconds, not minutes.
- Confirm your client logs no longer show the client-side timeout error (e.g., “timed out after 120000ms”) for the same requests.
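The comparison between the invoke span and the model span can be made explicit with a small helper. The sketch below assumes a hypothetical trace shape (`invokeMs` and `modelMs`, both in milliseconds) and an arbitrary 10-second threshold. Neither comes from a real tracing API, and the sample durations are illustrative values, not measurements.

```javascript
// Hypothetical trace shape: { invokeMs, modelMs }, durations in milliseconds
// for the outer invoke span and the inner model (LLM) span of one request.
function invokeOverheadMs(trace) {
  return trace.invokeMs - trace.modelMs;
}

// Assumption: 10 seconds of overhead is an arbitrary illustrative threshold.
function looksHung(trace, thresholdMs = 10000) {
  return invokeOverheadMs(trace) > thresholdMs;
}

// Illustrative values only: a hung request before the fix, a healthy one after.
const before = { invokeMs: 480000, modelMs: 4000 };
const after = { invokeMs: 4500, modelMs: 4000 };

console.log(looksHung(before), looksHung(after)); // prints: true false
```

After the fix, the overhead for previously hanging requests should drop to roughly the same order as the model span itself.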
Notes
- The issue has been linked to a Node.js runtime bug (Node.js issue #63989). If possible, monitor Node.js release notes for the specific fix and choose a patched LTS release.
- If problems persist after upgrading, collect a minimal repro and the relevant traces showing invoke vs LLM spans and submit them for further analysis.