13 Monitoring and Cost Controls

Add uptime checks, alerting, and cost monitoring for the Cloud Run tagging service.

Note: This step was written for the original dual-run architecture and is being updated for the current setup workflow. The guide overview reflects the current approach.

This step is still being finalized. The content is accurate, but it will be updated and expanded before full release.

Deliverable

Server tagging endpoint monitored for availability, errors, and cost.

Validation

Use this final step to make the deployment operationally safe. Confirm that an uptime check exists for the tagging endpoint, that an alerting policy fires when the check fails, and that the team has a monitoring view to use after go-live. The goal is a system that can be trusted and supported, not just configured.

Check the exact resources that will notify humans when the endpoint fails or degrades. Confirm the notification channels, alert thresholds, and dashboard coverage are all reasonable for the expected traffic and support model. If cost controls or traffic expectations matter for the environment, document them here as part of the operational baseline.

Do not leave this step as a vague promise to set up monitoring later. The deliverable is a tagging endpoint that is both technically working and operationally monitored.

Step Values

Field                  Current value
Uptime check name      Pending
Alert policy names     Pending
Notification channels  Pending
Baseline monthly cost  Pending

13.1 Create Uptime Check for Tagging Service

In Google Cloud Monitoring > Uptime Checks, create a check that targets the /healthy endpoint of your tagging service.

Expected: A GCP uptime check pings the tagging service /healthy endpoint.

Checks

  • API: GCP uptime check is configured for the tagging service endpoint.
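
If you prefer to create the check through the API rather than the console, the request body can be sketched as follows. This is a minimal sketch, not a definitive configuration: the project ID, host name, display name, 60-second period, and 10-second timeout are all placeholder assumptions, and the check is assumed to run over HTTPS on port 443. The resulting JSON matches the shape of an UptimeCheckConfig resource, which the Cloud Monitoring API v3 accepts on projects/PROJECT_ID/uptimeCheckConfigs.

```python
import json


def build_uptime_check(project_id, host, display_name):
    """Return an UptimeCheckConfig request body for a check on /healthy."""
    return {
        "displayName": display_name,
        "monitoredResource": {
            "type": "uptime_url",
            "labels": {"project_id": project_id, "host": host},
        },
        "httpCheck": {
            "requestMethod": "GET",
            "path": "/healthy",   # health endpoint named in this step
            "port": 443,          # assumption: service is reached over HTTPS
            "useSsl": True,
            "validateSsl": True,
        },
        "period": "60s",   # assumption: probe once a minute
        "timeout": "10s",  # assumption: a probe fails after 10 seconds
    }


if __name__ == "__main__":
    # Hypothetical values; replace with your own project and host.
    body = build_uptime_check("my-project", "sgtm.example.com", "tagging-healthy")
    print(json.dumps(body, indent=2))
```

Record the display name you choose under "Uptime check name" in the step values.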

13.2 Create Alert Policy for Failures

In Cloud Monitoring > Alerting, create a policy that triggers on uptime check failures or on a sustained rise in Cloud Run errors. Attach at least one notification channel (for example email, Slack, or PagerDuty).

Expected: A GCP alert policy triggers on uptime check failures with a notification channel configured.

Checks

  • API: GCP alert policy is configured for uptime check failures.
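
The alert policy for uptime failures can be sketched as an AlertPolicy request body. This is an illustrative sketch under stated assumptions: the check ID, channel resource name, and display names are hypothetical, and the condition (more than one checker region failing, held for 60 seconds) is one common pattern rather than a required setting. Tune the threshold and duration to your traffic and support model.

```python
import json


def build_uptime_alert_policy(check_id, channels, display_name):
    """Return an AlertPolicy body that fires when an uptime check fails."""
    filter_ = (
        'metric.type="monitoring.googleapis.com/uptime_check/check_passed" '
        'AND resource.type="uptime_url" '
        f'AND metric.label.check_id="{check_id}"'
    )
    return {
        "displayName": display_name,
        "combiner": "OR",
        "conditions": [
            {
                "displayName": "Uptime check failing",
                "conditionThreshold": {
                    "filter": filter_,
                    "aggregations": [
                        {
                            "alignmentPeriod": "1200s",
                            "perSeriesAligner": "ALIGN_NEXT_OLDER",
                            # Count the checker regions that report a failure.
                            "crossSeriesReducer": "REDUCE_COUNT_FALSE",
                            "groupByFields": ["resource.label.*"],
                        }
                    ],
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": 1,   # assumption: more than one region failing
                    "duration": "60s",     # assumption: condition must hold for 60 s
                    "trigger": {"count": 1},
                },
            }
        ],
        # Full resource names of existing notification channels.
        "notificationChannels": list(channels),
    }


if __name__ == "__main__":
    # Hypothetical identifiers; replace with your own.
    policy = build_uptime_alert_policy(
        "tagging-healthy",
        ["projects/my-project/notificationChannels/1234567890"],
        "Tagging service down",
    )
    print(json.dumps(policy, indent=2))
```

The notification channels must already exist before the policy references them; record their names under "Notification channels".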

13.3 Set Up Metrics Dashboard

Create a Cloud Run metrics dashboard showing request count, latency, error rate, and billable instance time for the tagging service.

Expected: A Cloud Run metrics dashboard shows request count, latency, error rate, and billable instance time.

Checks

  • Manual: Metrics dashboard is set up for Cloud Run monitoring.
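
The four charts listed above can be sketched as a dashboard definition. This is a rough sketch, assuming a hypothetical service name and a simple two-column grid; the helper names chart and build_dashboard are illustrative only. The "error rate" chart here plots the rate of 5xx responses rather than a ratio of errors to total requests, and latency is shown as the 99th percentile, both of which are choices you may want to change.

```python
import json


def chart(title, service, metric, aligner, extra_filter=""):
    """Build one xyChart widget for a Cloud Run metric of the given service."""
    filter_ = (
        f'metric.type="{metric}" '
        'AND resource.type="cloud_run_revision" '
        f'AND resource.label.service_name="{service}"'
    )
    if extra_filter:
        filter_ += f" AND {extra_filter}"
    return {
        "title": title,
        "xyChart": {
            "dataSets": [
                {
                    "timeSeriesQuery": {
                        "timeSeriesFilter": {
                            "filter": filter_,
                            "aggregation": {
                                "alignmentPeriod": "60s",
                                "perSeriesAligner": aligner,
                            },
                        }
                    }
                }
            ]
        },
    }


def build_dashboard(service):
    """Return a dashboard body with the four charts named in this step."""
    return {
        "displayName": f"{service} Cloud Run overview",
        "gridLayout": {
            "columns": "2",
            "widgets": [
                chart("Request count", service,
                      "run.googleapis.com/request_count", "ALIGN_RATE"),
                chart("Latency (p99)", service,
                      "run.googleapis.com/request_latencies", "ALIGN_PERCENTILE_99"),
                chart("5xx responses", service,
                      "run.googleapis.com/request_count", "ALIGN_RATE",
                      'metric.label.response_code_class="5xx"'),
                chart("Billable instance time", service,
                      "run.googleapis.com/container/billable_instance_time",
                      "ALIGN_RATE"),
            ],
        },
    }


if __name__ == "__main__":
    # "tagging-server" is a placeholder service name.
    print(json.dumps(build_dashboard("tagging-server"), indent=2))
```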

Documentation

Fields recorded during this step.

Field                  Description
Uptime check name      Name of the GCP uptime check resource.
Alert policy names     Names of configured alert policies.
Notification channels  Notification channels attached to alert policies.
Baseline monthly cost  Estimated monthly cost for Cloud Run services.
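
One way to arrive at the baseline monthly cost figure is to multiply the billable instance time from the dashboard by your per-second rates. The sketch below is a rough estimate only: the function name and every number in the usage example are placeholders, not real prices. It ignores free-tier allowances, networking, and any difference between active and idle billing, so take the actual rates from the current Cloud Run pricing page for your region and billing mode.

```python
def estimate_monthly_cost(
    billable_instance_seconds,
    vcpus_per_instance,
    memory_gib_per_instance,
    rate_per_vcpu_second,
    rate_per_gib_second,
    requests=0,
    rate_per_million_requests=0.0,
):
    """Rough Cloud Run estimate: compute time plus an optional request fee."""
    # Cost of one instance for one second at the configured CPU and memory size.
    per_second = (
        vcpus_per_instance * rate_per_vcpu_second
        + memory_gib_per_instance * rate_per_gib_second
    )
    compute_cost = billable_instance_seconds * per_second
    request_cost = (requests / 1_000_000) * rate_per_million_requests
    return compute_cost + request_cost


if __name__ == "__main__":
    # All values below are placeholders for illustration, not real prices.
    cost = estimate_monthly_cost(
        billable_instance_seconds=3 * 30 * 24 * 3600,  # three always-on instances
        vcpus_per_instance=1,
        memory_gib_per_instance=0.5,
        rate_per_vcpu_second=0.00002,
        rate_per_gib_second=0.000002,
        requests=10_000_000,
        rate_per_million_requests=0.40,
    )
    print(f"Estimated monthly cost: {cost:.2f}")
```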
