[aw-failures] P0: Smoke Gemini + Smoke Crush both at 100% failure rate (2026-05-02) #29816

@github-actions

Executive Summary

Two smoke test workflows, Smoke Gemini and Smoke Crush, have failed on every engine-resolved run today (2026-05-02): 8 of 8 Smoke Gemini runs and 9 of 9 Smoke Crush runs, 17 in total. Both failures are systemic infrastructure issues and are not caused by any individual PR.

Failure Clusters

| Cluster | Workflow | Runs Failed | First Seen | Root Cause | Priority |
|---|---|---|---|---|---|
| A | Smoke Gemini | 8/8 (100%) | 2026-05-02T00:57Z | GEMINI_API_KEY invalid/expired | P0 |
| B | Smoke Crush | 9/9 (100%) | 2026-05-02T00:57Z | Firewall blocks generativelanguage.googleapis.com + catwalk.charm.sh | P0 |

Evidence

Cluster A — Smoke Gemini: Invalid API Key

All 8 engine-resolved Smoke Gemini runs return HTTP 400 from the Gemini API:

{
  "error": {
    "code": 400,
    "message": "API key not valid. Please pass a valid API key.",
    "status": "INVALID_ARGUMENT",
    "details": [{"reason": "API_KEY_INVALID", "service": "generativelanguage.googleapis.com"}]
  }
}

The error occurs in both BaseLlmClient.generateJson (the routing call) and GeminiChat.makeApiCallAndProcessStream (the streaming call), so the key is rejected on both API paths rather than by a single endpoint. There is no recent successful Smoke Gemini run to compare against.
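To separate this failure mode from other HTTP 400 responses during triage, the error body can be checked for the API_KEY_INVALID reason. The following is a minimal sketch, assuming the response body is available as a JSON string shaped like the one above; the function name `is_invalid_api_key` is hypothetical and is not part of any workflow in this repository.

```python
import json


def is_invalid_api_key(body: str) -> bool:
    """Return True if an error body reports the API_KEY_INVALID reason.

    Hypothetical helper for log triage; it only inspects the JSON shape
    shown in this issue and makes no network calls.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    details = error.get("details")
    if not isinstance(details, list):
        return False
    # Any detail entry carrying the reason is enough to classify the failure.
    return any(
        isinstance(d, dict) and d.get("reason") == "API_KEY_INVALID"
        for d in details
    )


# Error body copied from the Smoke Gemini evidence in this issue.
SAMPLE = """
{
  "error": {
    "code": 400,
    "message": "API key not valid. Please pass a valid API key.",
    "status": "INVALID_ARGUMENT",
    "details": [{"reason": "API_KEY_INVALID", "service": "generativelanguage.googleapis.com"}]
  }
}
"""

if __name__ == "__main__":
    print(is_invalid_api_key(SAMPLE))
```

Matching on the structured `reason` field rather than the human-readable message is a design choice here: the message text is more likely to change than the reason code.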

Affected runs (sample): §25259396994, §25250656971, §25250269467

Cluster B — Smoke Crush: Firewall Blocking Gemini Calls

The Crush CLI v0.59.0 makes undocumented outbound calls to generativelanguage.googleapis.com:443 for session title generation and model routing (using gemini-3-flash-preview and gemini-3.1-pro-preview-customtools), and to catwalk.charm.sh:443 for telemetry. Neither domain is in the Smoke Crush firewall allowlist.

Firewall summary across 9 affected runs:

| Domain | Requests Blocked | Cause |
|---|---|---|
| generativelanguage.googleapis.com:443 | 15 | Not in allowlist |
| catwalk.charm.sh:443 | 5 | Not in allowlist |
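The comparison behind this summary can be sketched as a set difference between the hosts a run requested and the hosts the allowlist permits. This is an illustrative sketch only: `blocked_domains` is a hypothetical helper, the allowlist contents in the example are invented for demonstration, and matching is by exact hostname, which may not reflect how the actual firewall treats subdomains or wildcards.

```python
def strip_port(entry: str) -> str:
    """Drop an optional ':port' suffix and lowercase the hostname."""
    host, sep, port = entry.rpartition(":")
    if sep and port.isdigit():
        return host.lower()
    return entry.lower()


def blocked_domains(requested, allowlist):
    """Return the sorted hostnames in `requested` that are not in `allowlist`.

    Exact-match comparison only (an assumption); wildcard or subdomain
    rules are not modelled.
    """
    allowed = {strip_port(a) for a in allowlist}
    return sorted({strip_port(r) for r in requested} - allowed)


if __name__ == "__main__":
    # Requested hosts come from the firewall summary in this issue.
    requested = [
        "generativelanguage.googleapis.com:443",
        "catwalk.charm.sh:443",
    ]
    # Hypothetical allowlist; the real Smoke Crush allowlist is not shown here.
    allowlist = ["api.github.com"]
    print(blocked_domains(requested, allowlist))
```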

Crush CLI error from workflow logs:

ERRO Error generating title with small model; trying big model err="...Forbidden"
ERRO Error generating title with large model err="...Forbidden"
Agent processing failed: failed to start agent processing stream: ...Forbidden.
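For triage across many runs, log lines of this shape can be filtered mechanically. The sketch below assumes the `ERRO <message> err="..."` layout seen in the excerpt above holds for other Crush log lines, which is not confirmed; `forbidden_errors` is a hypothetical helper name.

```python
import re

# Matches lines such as: ERRO <message> err="<error text>"
# The layout is inferred from the excerpt in this issue, not from Crush docs.
ERRO_LINE = re.compile(r'^ERRO\s+(?P<msg>.*?)\s+err="(?P<err>.*)"$')


def forbidden_errors(log_text: str) -> list:
    """Return the messages of ERRO lines whose err field ends in 'Forbidden'."""
    messages = []
    for line in log_text.splitlines():
        match = ERRO_LINE.match(line.strip())
        if match and match.group("err").endswith("Forbidden"):
            messages.append(match.group("msg"))
    return messages


# Lines copied from the workflow log excerpt; the "..." elisions are kept as-is.
LOG = '''ERRO Error generating title with small model; trying big model err="...Forbidden"
ERRO Error generating title with large model err="...Forbidden"
Agent processing failed: failed to start agent processing stream: ...Forbidden.
'''

if __name__ == "__main__":
    for message in forbidden_errors(LOG):
        print(message)
```

The final "Agent processing failed" line is deliberately not matched, since it lacks the `ERRO` prefix and `err="..."` field; a real triage script would likely need a second pattern for it.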

Affected runs (sample): §25259396970, §25249581392, §25248852737

Proposed Fix Roadmap

| Priority | Fix | Effort |
|---|---|---|
| P0 | Rotate or re-validate the GEMINI_API_KEY repository secret | Low (ops/secret rotation) |
| P0 | Add generativelanguage.googleapis.com + catwalk.charm.sh to the Smoke Crush firewall allowlist | Low (config change, tracked in #29817) |