Cost & Provisioning Method

How the Platform Radar cost area is intended to support both a management view and an SRE view without mixing explanation into the working screens themselves.


Purpose

The cost area should make cloud spend operationally useful. The management view (the Jacco view) should prioritize product cost, focus areas, and shared-versus-product clarity. The SRE view should prioritize workload scope, namespaces, node pools, and deterministic follow-up checks. Both views should stay honest about what is direct, what is shared, and what still depends on fallback allocation.

What the current BigQuery export already gives us

The current billing export already contains repeated Kubernetes identity fields such as cluster name, node pool name, namespace, workload name, and workload type on many Compute Engine rows. The product itself should be derived from labels such as gcp-costcenter=pos or gcp-costcenter=wms, while namespaces remain drill-down context inside that product view.
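As a minimal sketch of that derivation, the snippet below reads the product from the gcp-costcenter label and keeps the namespace as drill-down context inside the product. The row shape is a simplifying assumption for illustration: a dict with a `labels` list of key/value records, a `namespace` field, and a `cost` field. The function names and the "unlabeled" bucket are hypothetical, not part of the export.

```python
from typing import Optional

# The source names gcp-costcenter as the product label; treating it as the
# only product key is an assumption of this sketch.
PRODUCT_LABEL_KEY = "gcp-costcenter"


def labels_to_dict(labels) -> dict:
    """Flatten a list of {"key": ..., "value": ...} records into a dict."""
    return {item["key"]: item["value"] for item in labels or []}


def product_for_row(row: dict) -> Optional[str]:
    """Return the product from the cost-center label, or None if absent."""
    value = labels_to_dict(row.get("labels")).get(PRODUCT_LABEL_KEY)
    return value or None


def product_view(rows) -> dict:
    """Sum cost per product, then per namespace as drill-down context."""
    view: dict = {}
    for row in rows:
        # Rows without the label stay visible in a separate bucket
        # (hypothetical name) instead of being guessed into a product.
        product = product_for_row(row) or "unlabeled"
        namespace = row.get("namespace") or "(none)"
        by_namespace = view.setdefault(product, {})
        by_namespace[namespace] = by_namespace.get(namespace, 0.0) + row["cost"]
    return view
```

Keeping the product as the outer key and the namespace as the inner key mirrors the intent above: the namespace never decides the product, it only explains it.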

The same sample also shows supporting GCP services such as Secret Manager. Those should be visible in the product story where helpful, but visually secondary to the main workload-backed spend.

How attribution should work

The first rule is direct attribution first. If a billing row already carries enough workload and product identity, Platform Radar should show that cost directly with high confidence.

The second rule is explicit fallback allocation. If a row does not map cleanly to one deployment or product, but does contain node pool, cluster, instance, or namespace context, the dashboard may still allocate it deterministically. That allocation should remain visibly lower confidence.

The third rule is to keep shared cost visible. Rows that are genuinely platform-level, unallocated, or unsupported should not be forced into fake deployment precision.
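The three rules can be sketched as a single ordered check. This is an illustration under stated assumptions, not the dashboard's actual logic: the field names (`product`, `workload`, `namespace`, `node_pool`, `cluster`, `instance`), the confidence labels, and the order in which fallback context is tried are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribution:
    kind: str               # "direct", "fallback", or "shared"
    confidence: str         # "high", "lower", or "none"
    target: Optional[str]   # what the cost is attributed to, if anything


# Hypothetical field names, tried from most to least specific.
CONTEXT_FIELDS = ("namespace", "node_pool", "cluster", "instance")


def attribute(row: dict) -> Attribution:
    """Apply the three attribution rules in a fixed order."""
    product = row.get("product")
    workload = row.get("workload")

    # Rule 1: direct attribution when product and workload identity exist.
    if product and workload:
        return Attribution("direct", "high", f"{product}/{workload}")

    # Rule 2: explicit fallback when only coarser context is available.
    # The result is marked lower confidence so the view can show it as such.
    for field in CONTEXT_FIELDS:
        if row.get(field):
            return Attribution("fallback", "lower", f"{field}:{row[field]}")

    # Rule 3: everything else stays visible as shared cost.
    return Attribution("shared", "none", None)
```

Because the checks run in a fixed order and use only fields on the row, the same row always produces the same result, which is what keeps the fallback deterministic.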

How recommendations should be produced

Provisioning recommendations should remain deterministic-first. The billing export in BigQuery shows what costs money, but not yet whether resource requests are too high or too low. For that reason, rightsizing guidance should later combine billing data with Kubernetes resource requests and limits, pod placement, and Prometheus usage over time.
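A deterministic check of this kind might look like the sketch below, which compares a workload's CPU request with its observed usage and ranks the findings by cost. Everything specific here is an assumption made for illustration: the use of a 95th-percentile usage figure, the 0.4 and 0.9 thresholds, and the `WorkloadSample` shape. The inputs are expected to come from Kubernetes and Prometheus, which this sketch does not query.

```python
from dataclasses import dataclass

# Assumed thresholds on the usage-to-request ratio; real values would need
# to be agreed per platform.
OVER_PROVISIONED_BELOW = 0.4
UNDER_PROVISIONED_ABOVE = 0.9


@dataclass(frozen=True)
class WorkloadSample:
    workload: str
    cpu_request_cores: float    # from the Kubernetes resource request
    cpu_p95_usage_cores: float  # from usage metrics over a chosen window
    monthly_cost: float         # from the billing export


def rightsizing_finding(sample: WorkloadSample) -> str:
    """Classify a workload using fixed thresholds only."""
    if sample.cpu_request_cores <= 0:
        return "no-request-set"
    ratio = sample.cpu_p95_usage_cores / sample.cpu_request_cores
    if ratio < OVER_PROVISIONED_BELOW:
        return "requests-likely-too-high"
    if ratio > UNDER_PROVISIONED_ABOVE:
        return "requests-likely-too-low"
    return "within-range"


def prioritized_findings(samples):
    """Return (workload, finding) pairs needing follow-up, costliest first."""
    flagged = [
        (s, rightsizing_finding(s))
        for s in samples
        if rightsizing_finding(s) != "within-range"
    ]
    flagged.sort(key=lambda pair: pair[0].monthly_cost, reverse=True)
    return [(s.workload, finding) for s, finding in flagged]
```

The finding is a fixed string produced from numbers, so any later summary or explanation can be layered on top without becoming the source of the signal.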

LLM support can still help, but only afterward. It can summarize the finding, explain trade-offs, and phrase a sensible next step. It should not be the primary source of truth for the cost or rightsizing signal itself.

What extra engineering work would improve things later

The dashboard should already try to extract the maximum value from the current export. Additional labels or mapping tables should only be requested where they create a concrete improvement.

Likely later improvements include optional owner or team labels for presentation quality, clearer handling of shared networking cost, and deterministic mapping rules for rows whose identity stops at cluster or node level.