[aw-failures] [aw] Failure Report 2026-05-03: Smoke Gemini/Crush systematic failures + Changeset Generator safe_outputs #29851

@github-actions

Executive Summary

Investigation of agentic workflow runs over the last 6 hours (2026-05-02T19:20 to 2026-05-03T01:20 UTC) identified 3 failed runs out of 31 total. Two failures are systematic and will recur on every run (Smoke Gemini, Smoke Crush). The third (Changeset Generator) failed in the safe_outputs delivery stage after a successful 31-minute agent execution. A fourth run (Smoke Codex) succeeded but reported a capability gap: the web-fetch MCP tool was missing.

⚠️ Note: Existing open agentic-workflows issues could not be read due to API permission constraints (403). Sub-issues below may duplicate existing tracking — please verify.


Failure Clusters

| Run ID | Workflow | Engine | Status | Priority | Root Cause |
| --- | --- | --- | --- | --- | --- |
| §25263690512 | Smoke Gemini | Google Gemini CLI | FAILED | P0 | 94% firewall block rate; all traffic routed through localhost:8080 / 172.30.0.30:10003 (internal proxy incompatible with firewall) |
| §25263690516 | Smoke Crush | Crush | FAILED | P0 | IMDS probe 169.254.169.254 (4× blocked) + catwalk.charm.sh:443 (1× blocked) |
| §25263690514 | Changeset Generator | Codex | FAILED | P1 | safe_outputs job failed after 31-minute successful agent run; ab.chatgpt.com telemetry blocked (30×) |
| §25263690507 | Smoke Codex | Codex | SUCCESS | P1 | Missing web-fetch MCP tool (run succeeded, capability gap reported via missing_tool) |

All 4 runs above were triggered from PR branch copilot/bump-firewall-version-to-v0-25-35 (firewall v0.25.35 bump).


Evidence

Smoke Gemini (P0) — 94% firewall block rate
  • Run: §25263690512, duration 7.2m, firewall v0.25.35
  • Block breakdown: localhost:8080 = 279 blocked, 172.30.0.30:10003 = 15 blocked, unknown = 1 blocked (295/315 total = 94%)
  • Only allowed traffic: play.googleapis.com:443 (20 requests)
  • Client error artifacts: gemini-client-error-Turn.run-sendMessageStream (50KB), gemini-client-error-generateJson-api (34KB)
  • Root cause: Gemini CLI internally routes all requests through a local proxy (localhost:8080) that serves as its MCP gateway. The firewall blocks loopback and RFC 1918 private-range traffic that is not explicitly allow-listed, so nearly every request is rejected. This is a fundamental architectural incompatibility.
  • Failure stage: agent job (5.6m) → FAILURE; detection, safe_outputs cascaded as skipped
  • blocked_request_at_cap: true — blocked request counter hit its cap
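The 94% figure follows directly from the block breakdown above. A minimal sketch of the arithmetic, assuming the per-destination counts are available as plain dictionaries (the `block_rate` helper and the dictionary layout are illustrative, not part of the firewall's log format):

```python
def block_rate(blocked: dict[str, int], allowed: dict[str, int]) -> tuple[int, int, float]:
    """Return (blocked_total, request_total, blocked fraction)."""
    blocked_total = sum(blocked.values())
    total = blocked_total + sum(allowed.values())
    rate = blocked_total / total if total else 0.0
    return blocked_total, total, rate


# Counts copied from the Smoke Gemini evidence above.
gemini_blocked = {"localhost:8080": 279, "172.30.0.30:10003": 15, "unknown": 1}
gemini_allowed = {"play.googleapis.com:443": 20}

blocked_total, total, rate = block_rate(gemini_blocked, gemini_allowed)
print(f"{blocked_total}/{total} blocked = {rate:.0%}")  # 295/315 blocked = 94%
```

Because blocked_request_at_cap is true, the blocked counts here should be read as a lower bound.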
Smoke Crush (P0) — IMDS probe + charm.sh blocked
  • Run: §25263690516, duration 2.5m, firewall v0.25.35
  • Block breakdown: 169.254.169.254 = 4 blocked (link-local IMDS), catwalk.charm.sh:443 = 1 blocked (Charm CLI telemetry/rendering)
  • Block rate: 71% (5/7 total requests)
  • Root cause: The Crush engine probes the instance metadata service (IMDS) at 169.254.169.254, likely for cloud credentials or instance identity, and attempts to reach catwalk.charm.sh:443 (Charm CLI telemetry/rendering). Neither destination is allow-listed.
  • Failure stage: agent job (1.1m) → FAILURE before agent activation (no agent logs available)
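The blocked destinations in the two P0 runs fall into distinct address classes (loopback, RFC 1918 private, link-local, external hostname), which is what drives the different fixes proposed later. A rough sketch of that classification using Python's standard `ipaddress` module; the `classify_destination` helper is hypothetical and does not reflect how the firewall itself categorizes traffic:

```python
import ipaddress


def classify_destination(dest: str) -> str:
    """Classify a 'host' or 'host:port' string by address class (IPv4 and names only)."""
    host, sep, port = dest.rpartition(":")
    if not sep or not port.isdigit():
        host = dest  # no port suffix present
    if host == "localhost":
        return "loopback"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return "hostname"  # not an IP literal
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:  # checked before is_private: 169.254.0.0/16 matches both
        return "link-local"
    if ip.is_private:
        return "private"
    return "public"


for dest in ["localhost:8080", "172.30.0.30:10003", "169.254.169.254", "catwalk.charm.sh:443"]:
    print(dest, "->", classify_destination(dest))
```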
Changeset Generator (P1) — safe_outputs failure after successful agent run
  • Run: §25263690514, duration 34.0m, engine: Codex v0.128.0 / gpt-5.4-mini
  • Agent stage: SUCCESS (31.1m) — heavy exploratory session using github.search_repositories, github.get_release_by_tag, github.list_releases, safeoutputs.push_to_pull_request_branch, safeoutputs.update_pull_request
  • Failed stage: safe_outputs job (1.1m) → FAILURE
  • Blocked traffic: ab.chatgpt.com:443 = 30 blocked (Codex telemetry), chatgpt.com:443 = 1 blocked, ghcr.io:443 = 2 blocked
  • Anomaly: 1 anomalous log template in tool_call stage (score 0.65, rare cluster), resource_heavy_node flagged
  • Pattern: Same ab.chatgpt.com telemetry blocks also seen in Smoke Codex run 25263690507
Smoke Codex (P1 capability gap) — missing web-fetch MCP tool
  • Run: §25263690507, duration 8m — overall SUCCESS
  • Gap: Smoke test explicitly probes for web-fetch MCP tool; tool not present in session config → safeoutputs.missing_tool reported
  • Secondary: ab.chatgpt.com:443 = 3 blocked (same Codex telemetry pattern); cache_memory_miss (first-run, expected)
  • All MCP servers healthy: safeoutputs, serena, github, playwright ✅
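The ab.chatgpt.com pattern was spotted by comparing blocked destinations across the two Codex runs. A small sketch of that cross-run comparison, assuming blocked counts are keyed by run ID (the `recurring_blocks` helper and the data layout are assumptions for illustration; the counts are the ones reported above):

```python
from collections import defaultdict

# Blocked-request counts per run, copied from the evidence above.
blocked_by_run = {
    "25263690514": {"ab.chatgpt.com:443": 30, "chatgpt.com:443": 1, "ghcr.io:443": 2},
    "25263690507": {"ab.chatgpt.com:443": 3},
}


def recurring_blocks(runs: dict[str, dict[str, int]], min_runs: int = 2) -> dict[str, dict[str, int]]:
    """Return destinations blocked in at least `min_runs` runs, with per-run counts."""
    per_dest: dict[str, dict[str, int]] = defaultdict(dict)
    for run_id, counts in runs.items():
        for dest, n in counts.items():
            per_dest[dest][run_id] = n
    return {dest: hits for dest, hits in per_dest.items() if len(hits) >= min_runs}


recurring = recurring_blocks(blocked_by_run)
print(recurring)  # {'ab.chatgpt.com:443': {'25263690514': 30, '25263690507': 3}}
```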

Existing Issue Correlation

Could not query existing issues (API 403). Sub-issues created below may overlap with existing tracking — please cross-reference manually.


Proposed Fix Roadmap

| Priority | Issue | Owner Area | Fix |
| --- | --- | --- | --- |
| P0 | Smoke Gemini: Gemini proxy localhost:8080 not reachable | Firewall / Gemini integration | Add firewall allow-list entry for Gemini's internal MCP proxy OR reconfigure Gemini CLI to route through allowed endpoint |
| P0 | Smoke Crush: IMDS probe + catwalk.charm.sh blocked | Firewall / Crush integration | Determine if IMDS access is required by Crush; allow-list catwalk.charm.sh:443 if Charm telemetry is expected |
| P1 | Changeset Generator: safe_outputs failure | Codex / safeoutputs | Investigate safe_outputs job logs for error; verify safeoutputs.push_to_pull_request_branch and safeoutputs.update_pull_request tool availability |
| P1 | Smoke Codex: web-fetch MCP tool missing | Smoke workflow config | Add web-fetch MCP server to Smoke Codex session configuration |
| P2 | Codex telemetry: ab.chatgpt.com blocked in all Codex runs | Firewall / Codex | Decide: allow ab.chatgpt.com telemetry OR suppress Codex A/B telemetry at the Codex level |
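Before changing the allow-list, it can help to check which observed blocks a candidate change would leave in place. The sketch below is a what-if calculation only: `remaining_blocks` and `hypothetical_allow_list` are illustrative names, the set-of-strings format is not the firewall's real configuration syntax, and it says nothing about whether the remaining blocks would still fail the run.

```python
def remaining_blocks(blocked: dict[str, int], allow_list: set[str]) -> dict[str, int]:
    """Return the blocked destinations that an exact-match allow-list would not cover."""
    return {dest: n for dest, n in blocked.items() if dest not in allow_list}


# Smoke Crush blocks from the evidence section.
crush_blocked = {"169.254.169.254": 4, "catwalk.charm.sh:443": 1}

# Hypothetical change: allow Charm's endpoint but keep IMDS blocked.
hypothetical_allow_list = {"catwalk.charm.sh:443"}

still_blocked = remaining_blocks(crush_blocked, hypothetical_allow_list)
print(still_blocked)  # {'169.254.169.254': 4}
```

In this scenario the four IMDS probes would still be blocked, which is why the Crush row asks first whether IMDS access is actually required.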

Sub-Issues Created

Links will be updated below as sub-issues are created.

Generated by [aw] Failure Investigator (6h) · ● 448.8K

  • expires on May 10, 2026, 1:32 AM UTC
