Step 1: Start by Gathering the Right Cloud Data
Everything starts with visibility. But to give GenAI real context, the data must go beyond spend summaries.
Here’s the foundational data to collect:
→ Billing exports: AWS Cost and Usage Reports (CUR), Azure cost exports (for example, under an Enterprise Agreement), or GCP Cloud Billing export
→ Resource metadata: tags, labels, account IDs, environment
→ Usage metrics: CPU, memory, and network utilization
→ Infrastructure state: live inventory via cloud APIs or IaC
→ Business mappings: team ownership, projects, environments
This data is best stored in a queryable system like BigQuery, Athena, or a structured lakehouse.
What matters most isn’t where it’s stored, but how well it’s cleaned and connected: ideally, every billing line can be joined to a resource, an environment, and an owning team.
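To make "connected" concrete, here is a minimal sketch of joining billing rows to resource metadata and business mappings. It uses an in-memory SQLite database as a stand-in for BigQuery or Athena. The table names, column names, and sample rows are all hypothetical, and a real billing export has a much richer schema.

```python
import sqlite3

# Hypothetical sample rows; real data would come from billing exports,
# cloud APIs, and an internal ownership registry.
BILLING = [
    ("i-001", "2024-05-01", 12.50),
    ("i-002", "2024-05-01", 7.25),
    ("vol-009", "2024-05-01", 3.00),
]
RESOURCES = [
    ("i-001", "prod", "checkout"),
    ("i-002", "dev", "search"),
    # vol-009 is untagged, so it has no metadata row.
]
OWNERS = [
    ("checkout", "payments-team"),
    ("search", "discovery-team"),
]


def build_db():
    """Load the three sample datasets into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE billing (resource_id TEXT, usage_date TEXT, cost REAL);
        CREATE TABLE resources (resource_id TEXT PRIMARY KEY,
                                environment TEXT, project TEXT);
        CREATE TABLE owners (project TEXT PRIMARY KEY, team TEXT);
        """
    )
    conn.executemany("INSERT INTO billing VALUES (?, ?, ?)", BILLING)
    conn.executemany("INSERT INTO resources VALUES (?, ?, ?)", RESOURCES)
    conn.executemany("INSERT INTO owners VALUES (?, ?)", OWNERS)
    return conn


def cost_by_team(conn):
    """Total cost per team and environment; unjoinable spend is kept visible."""
    return conn.execute(
        """
        SELECT COALESCE(o.team, 'unmapped')        AS team,
               COALESCE(r.environment, 'unknown')  AS environment,
               ROUND(SUM(b.cost), 2)               AS total_cost
        FROM billing b
        LEFT JOIN resources r ON r.resource_id = b.resource_id
        LEFT JOIN owners o    ON o.project = r.project
        GROUP BY COALESCE(o.team, 'unmapped'),
                 COALESCE(r.environment, 'unknown')
        ORDER BY total_cost DESC
        """
    ).fetchall()


if __name__ == "__main__":
    for team, environment, total in cost_by_team(build_db()):
        print(f"{team:16} {environment:8} {total:8.2f}")
```

The LEFT JOINs are a deliberate choice in this sketch: spend that cannot be tied to a tagged resource or an owner shows up as "unmapped" instead of silently dropping out, which gives a rough measure of how clean and connected the data is.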