
date: 2026-06-21
tags: [ci, qa-artifacts, secrets, github, bot-account]
status: archived
graduated_to:

Inline PR screenshots vanish when the bot session token expires — fails SOFT, so look at the token first (RETIRED — pipeline removed)

Retired 2026-06-24 — the gh-image inline-screenshot pipeline this learning is about (qa-artifacts, qaScreenshot(), the bot account + GH_SESSION_TOKEN secret, the "Inline QA screenshots into the PR body" CI step) was removed in the Storybook/VRT epic (#643, Sprint 6). The visual-regression (VRT) review surface supersedes it: every component/page state is a Storybook story captured per-PR into a reg-viz report — no bot account, no expiring token, no soft-fail. Kept for provenance; the footgun no longer exists.

Symptom — A PR's body carried a <!-- qa-image:NAME --> marker and a matching qaScreenshot() capture, but the image never appeared after CI. The frontend run was green. The "Inline QA screenshots into the PR body" step log showed:

uploadToken not found on repo page — do you have write access to Tomat-Labs/Tempo?
(or, if Tomat-Labs enforces SAML SSO, authorize at https://github.com/orgs/Tomat-Labs/sso)

It had worked on the same mechanism roughly an hour earlier (the #631 dogfood), so nothing about the code or the markers had regressed.

Root cause — Inlining an image into a private repo's PR body only works through GitHub's user-attachments upload, which requires a real logged-in session. The upload therefore runs through gh-image, authenticated by a dedicated bot account's user_session cookie, which is stored in the GH_SESSION_TOKEN repo secret. That cookie expires roughly fortnightly, and the org's SAML SSO authorization lapses on a similar cadence. Either one going stale breaks the upload with the uploadToken not found error: it is an auth-expiry problem, not a code or gh problem.

The trap: the inline step is continue-on-error (images are a bonus, never a gate), so an expired token does not turn CI red. The only visible symptom is the screenshot silently missing from the PR body — the marker stays an invisible HTML comment. Easy to misread as a marker/capture bug.
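Because the failure is soft, the step log is the only reliable signal. As a minimal sketch of that triage, the helper below checks a step log for the error text quoted above. The function name and the returned labels are hypothetical; only the "uploadToken not found" string comes from the log in this note.

```python
# Signature string taken from the step log quoted in this note.
AUTH_EXPIRY_SIGNATURE = "uploadToken not found"


def classify_inline_failure(step_log: str) -> str:
    """Return a coarse diagnosis for a missing inline screenshot.

    Hypothetical labels: "auth-expired" when the log carries the
    known signature, "unknown" for anything else.
    """
    if AUTH_EXPIRY_SIGNATURE in step_log:
        return "auth-expired"
    return "unknown"


if __name__ == "__main__":
    sample = "uploadToken not found on repo page"
    print(classify_inline_failure(sample))  # prints "auth-expired"
```

Only an "unknown" result would justify looking at the marker or the capture; an "auth-expired" result points at the session token or the SSO authorization.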

Fix — Refresh the bot session, then re-trigger the affected PRs. The exact, don't-guess steps live in the runbook: docs/agents/bot-session-refresh.md. In short: log in as the bot in a private browser → authorize SSO at https://github.com/orgs/Tomat-Labs/sso → copy the fresh user_session cookie → update the GH_SESSION_TOKEN secret → push a no-op commit (or re-run the frontend workflow) on each PR with unfilled markers so the inline step runs again.
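To decide which PRs need a re-trigger, one option is to scan each PR body for markers that still have no image. The sketch below is illustrative only. It assumes that a filled marker is immediately followed by a Markdown image or an <img> tag, and that marker names use letters, digits, dots, underscores and hyphens; the note does not specify either detail, and the function name is hypothetical.

```python
import re

# Marker syntax from this note: <!-- qa-image:NAME -->
# ASSUMPTION: NAME is limited to letters, digits, '.', '_' and '-'.
MARKER = re.compile(r"<!--\s*qa-image:([A-Za-z0-9_.-]+)\s*-->")

# ASSUMPTION: a filled marker is directly followed (ignoring whitespace)
# by a Markdown image or an <img> tag. The real pipeline may differ.
IMAGE_NEXT = re.compile(r"\s*(!\[|<img\b)")


def unfilled_markers(pr_body: str) -> list[str]:
    """Return the names of qa-image markers with no image after them."""
    names = []
    for match in MARKER.finditer(pr_body):
        # Pattern.match(string, pos) anchors the check right after the marker.
        if not IMAGE_NEXT.match(pr_body, match.end()):
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    body = (
        "<!-- qa-image:login -->\n"
        "![login](https://example.invalid/login.png)\n"
        "<!-- qa-image:dashboard -->\n"
        "Notes for reviewers."
    )
    print(unfilled_markers(body))  # prints ['dashboard']
```

A PR whose body returns a non-empty list is a candidate for the no-op commit or workflow re-run described above.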

Guard — Convention + a planned warning job (the token has no expiry alarm today):

  • When an inline screenshot doesn't appear but CI is green, suspect the token first — read the "Inline QA screenshots" step log for uploadToken not found before touching the marker or capture. Don't guess; the symptom looks like a code bug but is almost always auth expiry.
  • The failure is soft by design — never "fix" it by making the step a hard gate; a green PR must not depend on a fortnightly-expiring cookie. The real guard is a proactive "session expiring soon" check (follow-up) that warns before a PR silently drops its image.
  • user_session grants full account access to the bot, so treat it like a password. It belongs only in the GH_SESSION_TOKEN repo secret, never in chat, a commit, a log, or a file.
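The planned "session expiring soon" check was never specified in this note, so the following is only a sketch of the date arithmetic such a warning job might use. It assumes the date of the last refresh is recorded somewhere non-secret, that 14 days approximates the "roughly fortnightly" lifetime, and that a 3-day warning window is acceptable; all names and both constants are hypothetical. It never touches the cookie value itself.

```python
from datetime import date, timedelta

# ASSUMPTION: 14 days stands in for the "roughly fortnightly" expiry.
SESSION_LIFETIME_DAYS = 14
# ASSUMPTION: warn when 3 or fewer days remain (hypothetical threshold).
WARN_WITHIN_DAYS = 3


def days_until_expiry(refreshed_on: date, today: date,
                      lifetime_days: int = SESSION_LIFETIME_DAYS) -> int:
    """Days left before the assumed expiry; negative once it has passed."""
    expires_on = refreshed_on + timedelta(days=lifetime_days)
    return (expires_on - today).days


def should_warn(refreshed_on: date, today: date,
                warn_within_days: int = WARN_WITHIN_DAYS) -> bool:
    """True when the session is close to, or past, its assumed expiry."""
    return days_until_expiry(refreshed_on, today) <= warn_within_days


if __name__ == "__main__":
    refreshed = date(2026, 6, 1)
    print(days_until_expiry(refreshed, date(2026, 6, 10)))  # prints 5
    print(should_warn(refreshed, date(2026, 6, 13)))        # prints True
```

A job like this would only warn, in keeping with the soft-fail design: it should post a notice rather than turn any PR red.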