Production Switch
Cutover is the moment your production browser GA4 events stop going directly to www.google-analytics.com and start routing through your Cloud Run tagging service. The destination GA4 property doesn't change (same property, same measurement ID, same reports); only the path the events take changes. The app's Cutover page (/p/<slug>/go-live) renders a six-section guided card list driven by a state machine: one start action, one manual change you make in GTM, then a sequence of operator-clicked verify, cleanup, validate, attest, and finalize actions.
Pick a low-traffic window if you can — it makes anomalies easier to spot and emergency rollback simpler. Plan ~30-60 minutes of active attention. Once cutover is finalized, the project is permanently in live mode: re-cutover is not supported (the canonical path for a redo is to archive this project and create a successor).
Page overview
The page header reads Production Switch while cutover is active. The intro paragraph explains that going live moves visitor traffic onto the server-side path you've been testing. Below the header, the page renders the six section cards in a guided list.
Two non-default views replace the section list when needed:
- Emergency rollback checklist — replaces the section cards when the cleanup step fails after the manual GTM change is already published ("failure mode B"), or when the user enters Cutover via Settings → Roll back this Go Live. Covered in Rollback Readiness on the stage overview.
- Cutover record — after Finalize, a decision-record panel labeled "Cutover record" renders above the section list with the durable production-switch evidence. See After Finalize below.
A bottom-of-page footer holds links to Health and Monitoring after finalize. Before finalize, the footer shows a disabled link reading "Health and Monitoring unlock after cutover".
Happy path
- Click Start the cutover; GSS captures pre-cutover baseline and a web container fingerprint.
- In GTM, edit the production Google Tag's server_container_url configuration parameter to your tagging URL; submit and publish the workspace to environment Live.
- Click Verify change; GSS confirms the container fingerprint changed and the new URL is in place.
- Click Remove test tags; GSS removes the test-pipeline tags from your web container and publishes a cleanup version.
- Click Validate server routing; GSS hits the Cloud Run endpoint and re-reads the web container to confirm the test tags are gone.
- For each critical event, trigger it on your live site, watch for it in GA4 Realtime, check the box, then click Save.
- Click Finalize the cutover; GSS captures post-cutover baseline plus setup snapshot, retires GTM staging workspaces, and marks go-live complete.
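The happy path can be read as a small state machine. The sketch below is illustrative only and is not GSS's implementation: the status strings are the ones quoted on this page, while `not_started`, the action names, and the `advance` helper are hypothetical names introduced for the example. It also assumes that saving event confirmations and the redo/re-verify buttons do not change the status, so they are left out.

```python
from enum import Enum


class CutoverStatus(str, Enum):
    # "not_started" is a hypothetical placeholder for "no go_live_state row yet".
    NOT_STARTED = "not_started"
    STARTED = "started"
    MANUAL_CHANGE_VERIFIED = "manual_change_verified"
    CLEANUP_PUBLISHED = "mirror_pipeline_cleanup_published"
    AWAITING_EVENT_VERIFICATION = "awaiting_event_verification"
    COMPLETED = "completed"
    FAILED = "failed"


S = CutoverStatus

# Hypothetical action names, one per operator click that changes the status.
TRANSITIONS = {
    ("start", S.NOT_STARTED): S.STARTED,
    ("verify_change", S.STARTED): S.MANUAL_CHANGE_VERIFIED,
    ("cleanup", S.MANUAL_CHANGE_VERIFIED): S.CLEANUP_PUBLISHED,
    ("validate_routing", S.CLEANUP_PUBLISHED): S.AWAITING_EVENT_VERIFICATION,
    ("finalize", S.AWAITING_EVENT_VERIFICATION): S.COMPLETED,
}


def advance(status: CutoverStatus, action: str, ok: bool = True) -> CutoverStatus:
    """Return the next status for an operator action, or raise if out of order."""
    if (action, status) not in TRANSITIONS:
        raise ValueError(f"action {action!r} not allowed in status {status.value!r}")
    if not ok:
        # Only a failed cleanup moves the cutover to "failed" (failure mode B);
        # other failed actions keep the status so the operator can retry.
        return S.FAILED if action == "cleanup" else status
    return TRANSITIONS[(action, status)]
```

Walking the five actions in order from `NOT_STARTED` ends at `COMPLETED`; calling an action out of order raises instead of silently skipping a section.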
Glossary
- Manual change: the GTM tag edit you make in section 2, that is, adding or updating the server_container_url configuration parameter on each production Google tag, then publishing the workspace. GSS never writes to your live web container during this section.
- Test route / test pipeline: the set of forwarder tags GSS staged in your web container during Build to forward data to a test server property for validation. Cutover removes these in section 3; they were scaffolding, not permanent. The app lists the exact tag names that will be removed.
- Pre-baseline: a snapshot of your web + server GTM container version IDs taken at Start the cutover. Used as the rollback evidence anchor.
- Post-baseline: a snapshot taken at Finalize. Records the live container versions confirmed at completion.
- Setup snapshot: a frozen JSONB record of every binding, profile_config field, and check_run row at finalize time. Written to setup_snapshot with UNIQUE(project_id), so there is exactly one per project, ever.
- Tier 3 lock: the locked state of the project that prevents Tier 2 cascade. It is cumulative: projects become Tier 3 locked from the Build stage forward, not specifically at cutover.
Not covered on this page: post-cutover operations (Health and Monitoring) or emergency rollback (covered on the Go Live overview).
Before you start
- Confirm that live traffic has been passing recently and that Publish is complete (with content-readiness validation green). If Setup isn't ready, section 1 shows a "Setup not ready" pill and a "Review Setup" link instead of the Start button.
- Have your test-baseline reference values handy — you'll mentally compare post-switch primary data against them.
- Make sure you have Edit access on the GTM web container plus Publish access on both containers.
- The browser GA4 measurement ID is the SAME before and after cutover. Confirm this so you don't accidentally type a different MID into the GTM tag during the manual change.
1 Start the production cutover
Section 1 has three visual states. If Setup isn't ready, the card shows an attention pill and links back to Setup. If Setup is ready and no cutover row exists yet, it shows a Start the cutover button with the action note "Records the start of the cutover." If cutover has already started, the card is marked Done and shows a Redo the cutover start button (note: "Cancels the current attempt and records a fresh start.") that's available until cleanup runs.
Clicking Start the cutover does several things in one transaction:
- Creates a new go_live_state row, status started, with the bound cloudrun_tagging URL stored as expected_server_url.
- Reads the live web container via GTM API to capture a fingerprint (pre_go_live_web_fingerprint); this is what GSS will compare against in section 2's verify to confirm the change actually happened.
- Captures the pre-baseline (go_live_baseline row, phase='pre') with both containers' live containerVersionIds.
- Writes an audit_log event go_live_started.
Pre-baseline capture is best-effort: GTM API failures there are logged and Start continues rather than failing. The fingerprint, however, must be captured (without it, section 2's verify can't compare). After Start, section 2 becomes Active.
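To make the difference between the mandatory fingerprint and the best-effort pre-baseline concrete, here is a minimal sketch. It rests on several assumptions: the page does not say how the fingerprint is computed (a SHA-256 over canonical JSON is used here purely for illustration), the two reader callables stand in for GTM API reads, and storing the fingerprint on the go_live_state record is a guess about layout.

```python
import hashlib
import json
import logging
from typing import Callable, Optional

log = logging.getLogger("cutover")


def container_fingerprint(container_version: dict) -> str:
    """Hypothetical fingerprint: SHA-256 over canonical JSON of the container."""
    canonical = json.dumps(container_version, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def start_cutover(
    tagging_url: str,
    read_live_web_container: Callable[[], dict],
    read_live_version_ids: Callable[[], dict],
) -> dict:
    """Build the records Start writes; the callables stand in for GTM API reads."""
    # Mandatory: if this read raises, Start fails and nothing is recorded.
    fingerprint = container_fingerprint(read_live_web_container())

    # Best-effort: a failure here is logged and Start continues.
    baseline: Optional[dict]
    try:
        baseline = {"phase": "pre", **read_live_version_ids()}
    except Exception as exc:
        log.warning("pre-baseline capture failed, continuing: %s", exc)
        baseline = None

    return {
        "go_live_state": {
            "status": "started",
            "expected_server_url": tagging_url,
            "pre_go_live_web_fingerprint": fingerprint,
        },
        "go_live_baseline": baseline,
        "audit_log": {"event": "go_live_started"},
    }
```

Sorting keys before hashing makes the illustrative fingerprint independent of key order, so only a real content change in the container produces a different value.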
2 Switch production to the server-side path
This card walks you through the only manual GTM change in the cutover, then verifies it. The body opens with an instruction block that calls out your workspace options (recommended workspace + any other current workspaces with summaries), the routing-tag candidates GSS detected in your live container, and the exact server_container_url value to paste.
In GTM:
- Open your GTM web container and create or open a workspace based on the current live version (the app lists the workspace candidates inline and flags the recommended one).
- For each production Google tag identified in the candidate list, click the tag → Configuration settings → Add Parameter, then add Configuration Parameter server_container_url with Value set to your tagging URL (or to a variable like {{GSS - Server URL}} if you maintain one).
- Submit (publish) the workspace as a new version to environment Live.
- Back in GSS, click Verify change. The action note reads "Validate changes that you made are correct."
The candidate list is a best-effort discovery from the live container snapshot. Custom HTML tags and Custom Template tags will not appear there — if your production GA4 tags use those, edit them manually in GTM.
The Verify change action reads the live web container via GTM API and runs two checks:
- Fingerprint changed: the live container's fingerprint must differ from pre_go_live_web_fingerprint. If it hasn't changed, you didn't publish.
- Server URL present: at least one tag in the live container must have serverContainerUrl (or legacy transportUrl) equal to expected_server_url after slash normalization and variable resolution.
Pass advances state to manual_change_verified and surfaces a Re-verify button (note: "Re-checks the routing. Useful if you republished GTM after verification.") on the now-Done card. Fail returns 409 with a specific error and the card flips to a "Verify failed" state so you can re-publish in GTM and try again. Common 409 messages:
- No live GTM change detected — the live fingerprint still matches the pre-cutover fingerprint. The app calls out any workspaces that have draft changes but were never published.
- Container changed but routing not updated — you published, but the expected URL didn't end up in a production tag. The app shows the exact URL it was looking for.
Rollback during the active destructive state. Section 2 also exposes a How to revert this change disclosure inline on the card while it's active. The disclosure documents two options (republish the prior GTM version, or manually remove the server_container_url field) and is visible at the moment the operator is about to make or has just made the change — not gated behind completion. If you decide to back out of cutover here, this is the affordance you use; the dedicated Emergency rollback flow takes over only once cleanup has failed mid-flight.
Trailing-slash variations on the URL are tolerated by the verifier. Variable references (e.g., a Constant variable holding the URL) are also resolved — you can use a variable instead of a literal if you prefer.
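The two checks and the URL tolerance described in this section could look roughly like the sketch below. It is an approximation under stated assumptions: tags are modeled as flat dicts with a serverContainerUrl or transportUrl key, variables as a simple name-to-value mapping, and the helper names are hypothetical. The real verifier reads the GTM API's container version, whose shape is richer than this.

```python
import re

# Matches a value that is exactly one GTM variable reference, e.g. {{GSS - Server URL}}.
VARIABLE_REF = re.compile(r"^\{\{(.+)\}\}$")


def normalize_url(url: str) -> str:
    """Strip whitespace and trailing slashes so slash variants compare equal."""
    return url.strip().rstrip("/")


def resolve(value: str, variables: dict) -> str:
    """Replace a {{Variable}} reference with its value; leave literals untouched."""
    match = VARIABLE_REF.match(value.strip())
    if match:
        return variables.get(match.group(1).strip(), value)
    return value


def verify_change(pre_fingerprint, live_fingerprint, live_tags, variables, expected_server_url):
    """Return (http_status, message) for the two checks described above."""
    if live_fingerprint == pre_fingerprint:
        return 409, "No live GTM change detected"
    expected = normalize_url(expected_server_url)
    for tag in live_tags:
        raw = tag.get("serverContainerUrl") or tag.get("transportUrl")
        if raw and normalize_url(resolve(raw, variables)) == expected:
            return 200, "ok"
    return 409, "Container changed but routing not updated"
```

For example, a tag whose value is `{{GSS - Server URL}}` passes when that variable holds the tagging URL with a trailing slash and expected_server_url has none.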
3 Clean up the test route
Section 3 is an operator-clicked action, not an automatic section. The card lists the exact tag names that will be removed (pulled from the cutover log's tag-removal manifest), then exposes a Remove test tags button.
Clicking Remove test tags creates a scratch GTM workspace, removes the listed test-route tags from your web container, publishes that workspace, and retires the workspace. The test-route tags were the staged forwarders from the Build stage — they're no longer needed once production traffic flows through the real server-side pipeline.
On success: state advances to mirror_pipeline_cleanup_published; an audit_log event is written. On failure: state goes to failed and the entire page swaps from the section list to the Emergency rollback checklist. This is "failure mode B" — the production routing was already changed externally in section 2, so you can't just discard the cleanup. The rollback flow walks you back through reverting the manual change and republishing.
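The cleanup sequence and its two outcomes can be sketched as follows. The `gtm` argument is a hypothetical wrapper object, not the GTM API client itself, and its four method names are invented for the example. Treating a failure at any step, including workspace retirement, as a move to failed is also an assumption; the page only says that cleanup failure leads to the failed state.

```python
def run_cleanup(gtm, tag_names):
    """Remove the listed test-route tags via a scratch workspace.

    `gtm` is a hypothetical wrapper exposing create_workspace, delete_tag,
    publish_workspace and retire_workspace. Returns (new_status, error).
    """
    try:
        workspace = gtm.create_workspace("GSS cutover cleanup")
        for name in tag_names:
            gtm.delete_tag(workspace, name)
        gtm.publish_workspace(workspace)
        gtm.retire_workspace(workspace)
    except Exception as exc:
        # Failure mode B: routing was already changed in section 2, so the
        # page swaps to the Emergency rollback checklist.
        return "failed", str(exc)
    return "mirror_pipeline_cleanup_published", None
```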
4 Verify production routing
Section 4 is also operator-clicked. Click Validate server routing. GSS does three things:
- Hits the Cloud Run tagging endpoint and confirms it responds.
- Reads the live web container again and confirms no test-route tags remain (cleanup landed).
- Runs validation and diagnose-issues against the captured observations vs. the frozen expectations from the test-pipeline selection's expectations_data.
Results are stored in a go_live_verification_run row with three JSONB fields: validation_data, issues_data, observations_data. State advances to awaiting_event_verification. After it's Done, the button label flips to Re-validate server routing so you can re-run the same check.
This step does not gate progression on validation pass/fail — it always advances and surfaces results for display. The reason: validation is best-effort interpretation of routing observations; you (the human) are the final arbiter via the next step.
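The non-gating behavior is easy to misread, so here is a small sketch of it. The three top-level field names come from this page; everything inside them (endpoint_ok, test_tags_removed, remaining_test_tags) and the function itself are hypothetical shapes chosen for the example.

```python
def validate_server_routing(endpoint_ok: bool, remaining_test_tags: list, issues: list):
    """Record a verification run and advance, whatever the results say."""
    run = {
        "validation_data": {
            "endpoint_ok": endpoint_ok,
            "test_tags_removed": not remaining_test_tags,
        },
        "issues_data": list(issues),
        "observations_data": {"remaining_test_tags": list(remaining_test_tags)},
    }
    # Not a gate: the status advances even when validation found problems.
    # The results are stored for display and the operator decides in section 5.
    return "awaiting_event_verification", run
```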
5 Verify critical events
This card lists the critical events from your frozen expectations as a checkbox table. For each event, the row shows the event name with an attention or pass LED dot plus a checkbox labeled "I see this event in GA4 Realtime" (or "Confirmed at <time>" once saved).
The card opens with a guidance paragraph linking to GA4 Realtime for your production property and to your live site, plus a Don't see an event in GA4 Realtime? disclosure with the most common reasons (realtime lag, wrong property selected, browser blockers, server container not forwarding). It points back at section 4's re-validate button and at the section 1 Emergency Rollback path for genuine misconfiguration.
Open GA4 Realtime, trigger each listed event on your live site (load a page for page_view, complete a purchase for purchase, etc.), check the box for each one as you see it land, then click Save. The Save button is disabled until at least one checkbox state differs from what's persisted, and the label only flips to "Confirmed" after the Save commits.
Each Save POST mutates go_live_verification_run.observations_data["mirror_pipeline_events"], adding (or updating) a record for each confirmed event with count > 0. The card also shows a running tally — e.g., "2 of 5 critical events confirmed as present" — directly above the Finalize card.
This step makes no GTM or GA4 changes. It is operator attestation only. Finalize re-enforces the gate server-side, so half-attested state can't slip through.
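A Save could be modeled like the sketch below. The key observations_data["mirror_pipeline_events"] and the "count > 0" rule are from this page; storing the records as a dict keyed by event name, the confirmed_at field, and both function names are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def save_event_confirmations(
    observations_data: dict, checked_events: list, now: Optional[datetime] = None
) -> dict:
    """Add or update one record per confirmed event, each with count > 0."""
    now = now or datetime.now(timezone.utc)
    events = observations_data.setdefault("mirror_pipeline_events", {})
    for name in checked_events:
        record = events.setdefault(name, {})
        record["count"] = max(1, record.get("count", 0))
        record["confirmed_at"] = now.isoformat()  # hypothetical field
    return observations_data


def tally(observations_data: dict, critical_events: list) -> str:
    """Build the running tally line shown above the Finalize card."""
    events = observations_data.get("mirror_pipeline_events", {})
    confirmed = sum(1 for n in critical_events if events.get(n, {}).get("count", 0) > 0)
    return f"{confirmed} of {len(critical_events)} critical events confirmed as present"
```

With five critical events and two boxes saved, `tally` returns "2 of 5 critical events confirmed as present", matching the example text on the card.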
6 Finalize the cutover and record the baseline
The Finalize card opens with a three-item summary of what finalize will do:
- Freeze the cutover record (the page locks; cutover does not re-run on this project).
- Record a production baseline snapshot for ongoing health monitoring.
- Clean up the GTM staging workspaces GSS created during cutover — except where the operator added their own drafts. Those exceptions surface in a "Workspace cleanup details" disclosure after Finalize runs.
The Finalize the cutover button is disabled until every critical event from section 5 has been confirmed; the button's title attribute reads "Complete the sections above to enable this section." When section 6 is the active section, the button is enabled and clicking it does the following in one transaction:
- Captures the post-baseline (go_live_baseline row, phase='post'): current live container version IDs at finalize time.
- Sets go_live_state.status = 'completed' and stamps completed_at.
- Retires every active test-pipeline workspace (sets retired_at, status 'retired') and removes them from GTM where possible.
- Writes the setup snapshot: a frozen JSONB record of every binding, every profile_config field, and every latest check_run, FK'd to the latest audit_run + publish_audit_record + post-baseline + pre-baseline.
- Writes audit_log event go_live_completed.
Finalize returns 422 "Missing critical events" if any critical event hasn't been confirmed (return to section 5 and confirm the missing ones). Finalize returns 500 if the snapshot already exists or if underlying audit/publish rows are missing — both are engineering escalations; the canonical path is archive + new project.
After finalize, the project is permanently in completed go-live state. The setup snapshot table has UNIQUE(project_id) — exactly one per project, ever.
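The server-side gate that Finalize enforces can be sketched as a precheck. The 422 and 500 status codes and the first two messages follow this page; the order of the checks, the dict-keyed shape of mirror_pipeline_events, the third message, and the function name are assumptions for illustration.

```python
def finalize_precheck(
    critical_events: list,
    observations_data: dict,
    snapshot_exists: bool,
    has_audit_rows: bool,
):
    """Return (http_status, message) before any finalize write happens."""
    events = observations_data.get("mirror_pipeline_events", {})
    missing = [n for n in critical_events if events.get(n, {}).get("count", 0) <= 0]
    if missing:
        # Operator-fixable: go back to section 5, confirm, and Save.
        return 422, "Missing critical events: " + ", ".join(missing)
    if snapshot_exists:
        # UNIQUE(project_id) on setup_snapshot: a second snapshot cannot be written.
        return 500, "setup snapshot already exists"
    if not has_audit_rows:
        # Hypothetical wording; the snapshot's foreign keys could not resolve.
        return 500, "missing audit_run or publish_audit_record"
    return 200, "ok"
```

Only the 422 case is something the operator can fix from the page; both 500 cases are the engineering escalations described above.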
Tier 3 lock — already in effect by the time you reach this page
The project is locked against Discover-page primary-binding changes well before cutover. has_tier3=True as soon as any of these exist: a test_validation_run, a published test-pipeline workspace (mirror_pipeline_workspace table), a managed_artifact with a published version, GSS-created Cloud Run bindings, a domain mapping, or a go_live_state / go_live_baseline / go_live_verification_run row. By Build → Cloud → Connect → Publish you've already crossed several of those thresholds. Cutover doesn't cause Tier 3; it just adds more rows.
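Because the lock is an "any of these exist" rule, it reduces to a single boolean over row counts. The sketch below assumes the counts have already been queried; the dataclass and its field names are hypothetical stand-ins for the tables and conditions listed above.

```python
from dataclasses import astuple, dataclass


@dataclass
class ProjectEvidence:
    """Row counts for each condition that triggers the Tier 3 lock."""
    test_validation_runs: int = 0
    published_test_pipeline_workspaces: int = 0
    published_managed_artifacts: int = 0
    gss_created_cloud_run_bindings: int = 0
    domain_mappings: int = 0
    go_live_rows: int = 0  # go_live_state, go_live_baseline, go_live_verification_run


def has_tier3(evidence: ProjectEvidence) -> bool:
    """True as soon as any single condition has at least one row."""
    return any(count > 0 for count in astuple(evidence))
```

A project with only a domain mapping is already locked, and adding go-live rows at cutover does not change the result.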
Common errors & failure modes
| Symptom | Likely cause | Where to fix |
|---|---|---|
| Section 2 — verify change fails | ||
| Returns 409 "No live GTM change detected" | Live container fingerprint matches pre-cutover — you didn't publish the GTM workspace after editing. The app lists any workspaces with unpublished drafts. | In GTM, click Submit + Publish on the workspace, then click Verify change (or Re-verify) again. |
| Returns 409 "Container changed but routing not updated" | You published a workspace with the tag edit, but the expected URL didn't end up in the live container's tags. Common reason: you edited a draft in the wrong workspace, or the wrong tag type. | In GTM, open the live published version. Confirm a Google Tag has server_container_url equal to your tagging URL. Re-edit and re-publish if needed. |
| Section 3 — cleanup failure | ||
| Cleanup fails, page swaps to Emergency rollback checklist | GSS couldn't create the cleanup workspace, remove the test-route tags, or publish — usually GTM API permission, quota, or transient issue. | Walk the Emergency rollback checklist (revert your website snippet, republish prior versions, verify traffic restored, then archive). See the Rollback Readiness section of the Go Live overview. |
| Section 5 — verifying events | ||
| Critical event doesn't appear in GA4 Realtime even after waiting | Realtime lag, wrong GA4 property selected, browser blockers (ad blockers / private windows), or the event genuinely isn't firing post-cutover (server container routing issue). | Wait 60-90 seconds and refresh; confirm Realtime is on the production property; try a non-private browser. For real misconfiguration, click Re-validate server routing in section 4, or use Emergency Rollback (via Settings) to revert. |
| Finalize returns 422 "Missing critical events: ..." | Some critical event in the frozen expectations hasn't been confirmed (or the confirmation wasn't saved). | Return to section 5, locate the missing event(s), check each box, then click Save before re-clicking Finalize. |
| Section 6 — finalize | ||
| Finalize returns 500 "setup snapshot already exists" | A previous finalize partially succeeded and committed a setup snapshot. Re-cutover isn't supported. | Engineering escalation. Canonical path is archive this project and create a successor. |
| Finalize returns 500 with no snapshot row visible | Underlying audit_run or publish_audit_record is missing — the FKs into setup_snapshot can't resolve. Indicates earlier-stage data loss. | Engineering escalation. Re-running audit/publish on a partially completed go-live is risky. |
After Finalize: the read-only cutover record
Once Finalize commits, /p/<slug>/go-live doesn't go away — it switches into a permanent read-only "cutover record" view.
What the page renders after finalize:
- The header still reads Production Switch, but the intro paragraph changes to "This is the completed cutover record. Monitoring continues automatically. Use Health for current status and Monitoring for ongoing observation setup." with the cutover completion timestamp inline.
- A new Cutover record decision-record panel appears above the steps, listing durable evidence rows (timestamps, container version IDs, the bound tagging URL).
- All six section bodies render in their completed state. The event log under each step stays visible as the durable record of what happened during cutover.
- The step action buttons (Start, Verify change, Remove test tags, Validate server routing, Save event confirmations, Finalize) are hidden.
- The section 5 event-acknowledgement checkboxes stay rendered but with disabled set: you can see what was confirmed, but you can't change it.
- The bottom footer surfaces direct links to Open Health and Open Monitoring.
- A green "The project is live" banner appears above the page body (rendered project-wide once go_live_state.status == 'completed').
Behavior to note:
- All six cutover-step POST endpoints (/go-live/start, /verify, /cleanup, /verify-production, /event-ack, /finalize) return 409 live_project post-completion. There's no way to re-run a cutover from the UI; rollback is the only path out of live state.
- The section 6 "Workspace cleanup details" disclosure still shows the GTM-cleanup outcomes for every staging workspace GSS created: deleted, skipped (operator had own drafts), failed, or errored. This is the durable cleanup record.
- If cutover ever needs to be undone, use Settings → Roll back this Go Live. That entry brings up the same Emergency rollback checklist that failure-mode B surfaces inline; the cutover record stays accessible during the rollback flow.
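The post-completion behavior of the cutover endpoints amounts to a guard that runs before each handler. This is a hedged sketch: the action suffixes are taken from the endpoint list above, while the function name and the response body shape are hypothetical.

```python
CUTOVER_ACTIONS = frozenset(
    {"start", "verify", "cleanup", "verify-production", "event-ack", "finalize"}
)


def guard_cutover_post(action: str, go_live_status: str):
    """Return a (status, body) rejection for a completed project, else None."""
    if action not in CUTOVER_ACTIONS:
        raise ValueError(f"unknown cutover action: {action!r}")
    if go_live_status == "completed":
        # Hypothetical body shape; the page only names the 409 live_project code.
        return 409, {"error": "live_project"}
    return None  # the handler may proceed
```

A client that receives this 409 should stop retrying and point the operator at the rollback entry in Settings.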
Next step
After Finalize, the project is in completed go-live state. Use Health for current operating status and Monitoring for uptime checks, alert policies, notification channels, and Cloud Monitoring links. The Go Live overview covers cross-page post-live behavior and the full Rollback Readiness reference.