[Hubs] Data ingestion transform changes #1111

@flanakin

⚠️ Problem

Data transforms in FinOps hubs are critical to improving data quality and filling gaps that get in the way of FinOps operations. A number of issues and open questions have come up over time about the data coming from various systems. This issue tracks requested changes, ideas for consideration, and open questions that need an answer.

Items are categorized in the following order:

  1. Bugs that should be fixed in the data
  2. Questions that may be a bug or require a feature or change request
  3. Changes that are internal only and should not impact end users
  4. Features that are external-facing and will impact end users

🛠️ Solution

HubScopes

  • Feature: Detect any scopes that are duplicated via the drop-by folder path
  • Feature: Create an array of missing data based on the configured retention setting
  • Feature: Add min/max data timestamps for each scope
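
To make the "array of missing data" idea concrete, here is a minimal sketch. It assumes retention is expressed in whole months, that the current month counts toward retention, and that ingested data can be summarized as a set of `YYYY-MM` strings per scope. The function names are hypothetical and not part of FinOps hubs.

```python
from datetime import date


def expected_months(today: date, retention_months: int) -> list[str]:
    """Return the months covered by retention as YYYY-MM strings, oldest first.

    Assumption: the current month is included in the retention window.
    """
    months = []
    year, month = today.year, today.month
    for _ in range(retention_months):
        months.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return list(reversed(months))


def missing_months(ingested: set[str], today: date, retention_months: int) -> list[str]:
    """Return the months inside the retention window that have no ingested data."""
    return [m for m in expected_months(today, retention_months) if m not in ingested]


# Example: a scope with a 3-month retention window and a gap in December.
print(missing_months({"2025-01", "2024-11"}, date(2025, 1, 15), 3))
```

The same per-scope month set could also yield the min/max data timestamps, but the exact shape of the HubScopes output is still an open design choice.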

CommitmentDiscountUsage (reservation details)

  • Change: Switch to use parse_resourceid()
  • Question: What do these columns represent: ReservedHours, TotalReservedQuantity, UsedHours?
  • Question: Is SkuName the same as x_SkuSize?
  • Question: What are the possible values for Kind?
  • Question: Does this dataset include non-hour units (namely the ReservedHours column)?
  • Feature: Consider adding ServiceSubcategory (1.1)
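
For readers unfamiliar with what resource ID parsing involves, the sketch below splits an Azure resource ID into a few parts. It is an illustration only: `split_resource_id` is a hypothetical helper, not the hubs' `parse_resourceid()` function, and the output keys are assumptions rather than the columns that function returns.

```python
def split_resource_id(resource_id: str) -> dict:
    """Split an Azure resource ID into subscription, resource group, type, and name.

    Assumption: the type/name pairs of interest follow the last "providers" segment.
    """
    parts = [p for p in resource_id.split("/") if p]
    lowered = [p.lower() for p in parts]

    def value_after(keyword: str):
        # Return the segment that follows the first occurrence of `keyword`, if any.
        if keyword in lowered:
            i = lowered.index(keyword)
            if i + 1 < len(parts):
                return parts[i + 1]
        return None

    result = {
        "SubscriptionId": value_after("subscriptions"),
        "ResourceGroupName": value_after("resourcegroups"),
        "ResourceType": None,
        "ResourceName": None,
    }

    if "providers" in lowered:
        last = len(lowered) - 1 - lowered[::-1].index("providers")
        tail = parts[last + 1:]
        if tail:
            namespace = tail[0]
            types = tail[1::2]   # type segments alternate with name segments
            names = tail[2::2]
            result["ResourceType"] = "/".join([namespace] + types).lower()
            result["ResourceName"] = "/".join(names) if names else None

    return result


# Example: a reservation ID has no subscription or resource group.
print(split_resource_id(
    "/providers/Microsoft.Capacity/reservationOrders/order-1/reservations/res-1"
))
```

Lowercasing the resource type but keeping names as-is is a design choice for this sketch; the real function may normalize differently.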

Costs (FOCUS cost)

  • Bug: Set ListCost and ContractedCost when not set for Marketplace usage/purchases
  • Bug: Set ListCost for reservation purchases
  • Bug: Change ListCost to be ContractedCost for rounding errors where ListCost - ContractedCost < 0.00000001
  • Bug: Change ListCost to be ContractedCost when not set for unused commitment discount rows
  • Bug: Change ChargeFrequency from "Usage-Based" to "One-Time" or "Recurring" for purchases (e.g., savings plan, M365)
  • Bug: Change PricingUnit from "Units" to "Operations" when applicable
  • Bug: Change ContractedCost/UnitPrice from 0 for commitment discount purchases
  • Bug: Change PublisherName from "Microsoft Corporation" to "Microsoft"
  • Bug: Change PricingUnit when incorrect for reservation purchases
  • Bug: Change PublisherName from null for EA
  • Bug: Change ServiceName from null to "Azure Savings Plan for Compute" for savings plan purchases
  • Bug: Change ServiceName from null for MCA rounding adjustments
  • Bug: Change SkuId/SkuPriceId for Microsoft EA rows
  • Bug: Change columns from "-2" and "Unassigned" to null for EA (also check MCA)
  • Bug: Change columns documented in FOCUS conformance gaps
  • Bug: Change columns to be consistent for EA and MCA rounding adjustments
  • Bug: Change x_BilledUnitPrice from 0 for commitment discount purchases
  • Bug: Change x_BillingExchangeRate from null for MCA (also check EA)
  • Bug: Change x_BillingProfileId to always be lowercase
  • Bug: Change x_PublisherId from null for MCA
  • Bug: Handle commitment tiers in the Cost/Prices join to populate missing prices/costs
  • Question: ChargePeriodStart/End: Should commitment discount purchases have a time for ChargePeriodStart/End?
  • Question: ContractedCost: Should we fix historical data where ContractedCost is off by the x_PricingBlockSize?
  • Question: ContractedCost/PricingQuantity: Should we fix historical data where PricingQuantity and ContractedCost had bad values due to the wrong PricingQuantity scale?
  • Question: ListCost: Why is ListCost getting populated for rows that have a $0 Contracted/Billed/EffectiveCost?
  • Question: ListCost: Verify ListCost for commitment discount usage/purchases
  • Question: ListCost: Verify ListCost for Marketplace purchases (MCA only?)
  • Question: ListCost: Verify ListCost for Microsoft purchases (MCA only?)
  • Question: ListCost/ContractedCost: Should List/ContractedCost be 0 for reservation usage with Effective/BilledCost == 0?
  • Question: ListCost/ContractedCost: Should Adjustment costs be on List/ContractedCost?
  • Question: ListCost/ContractedCost: Consider replacing spot ListCost and ContractedCost with the on-demand equivalent cost
  • Question: ListCost/ContractedCost/ListUnitPrice/ContractedUnitPrice: What should the ListCost/UnitPrice and ContractedCost/UnitPrice be for commitment discount purchases?
  • Question: ListCost/ContractedCost/ListUnitPrice/ContractedUnitPrice: What should the ListCost/UnitPrice and ContractedCost/UnitPrice be for rounding adjustments?
  • Question: ListUnitPrice/ContractedUnitPrice/x_EffectiveUnitPrice: Why do price sheet savings plan usage prices not match cost effective prices?
  • Question: PricingCategory/x_PricingSubcategory: Should rounding adjustments have a value?
  • Question: ResourceType/ServiceCategory/ServiceName: Is there a resource type for microsoft.network/dnsresolvers/inboundendpoints? If so, why is ResourceType not set?
  • Feature: Create a process to backfill fixes and changes to the data ingestion
  • Feature: Add ServiceModel
  • Feature: Lowercase Microsoft 1.0-preview ResourceId values
  • Feature: Add CommitmentDiscountSpend/UsageEligibility
  • Feature: Create a pattern for identifying (and filtering out) commitment discount purchases (so they aren't double-counted)
  • Feature: Consider setting BillingAccountType/SubAccountType for non-Azure datasets
  • Feature: Add SkuPriceIdv2 for Microsoft rows
  • Feature: Add warnings for any changes applied to the data
  • Feature: Add warnings (errors?) when there's a difference between meter metadata in cost and price datasets
  • Feature: Create alerts when there are differences between cost/price data (???)
  • Feature: Confirm ProviderName for AWS and GCP and update it in the ProviderName backfill
  • Feature: Consider adding a unique charge ID per row
  • Feature: Add AHB columns: x_SkuLicenseCategory (Cloud, On-Premises?), x_SkuLicenseType (Windows/SQL), x_SkuLicenseStatus (Enabled/Eligible/Not Eligible), x_SkuLicenseQuantity, x_SkuLicenseUnit (Cores), SkuPriceDetails.CoreCount?
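
A few of the Costs fixes above are simple per-row rules, and the sketch below shows how they might be combined with the "warnings for any changes applied" feature. It is a rough illustration under assumptions: rows are plain dictionaries, the rounding rule is read as an absolute difference below 0.00000001, and the function name and warning labels are hypothetical. The actual hub transforms may be implemented quite differently.

```python
ROUNDING_TOLERANCE = 0.00000001  # threshold taken from the rounding-error bug above


def apply_cost_fixes(row: dict) -> tuple[dict, list[str]]:
    """Return a fixed copy of a cost row plus a list of warnings for each change."""
    fixed = dict(row)
    warnings = []

    # Rounding errors: align ListCost with ContractedCost when they differ negligibly.
    # Assumption: the comparison uses the absolute difference.
    list_cost = fixed.get("ListCost")
    contracted_cost = fixed.get("ContractedCost")
    if (
        list_cost is not None
        and contracted_cost is not None
        and list_cost != contracted_cost
        and abs(list_cost - contracted_cost) < ROUNDING_TOLERANCE
    ):
        fixed["ListCost"] = contracted_cost
        warnings.append("ListCostRoundingFixed")  # hypothetical warning label

    # PublisherName: "Microsoft Corporation" becomes "Microsoft".
    if fixed.get("PublisherName") == "Microsoft Corporation":
        fixed["PublisherName"] = "Microsoft"
        warnings.append("PublisherNameNormalized")  # hypothetical warning label

    # x_BillingProfileId: always lowercase.
    profile_id = fixed.get("x_BillingProfileId")
    if isinstance(profile_id, str) and profile_id != profile_id.lower():
        fixed["x_BillingProfileId"] = profile_id.lower()
        warnings.append("BillingProfileIdLowercased")  # hypothetical warning label

    return fixed, warnings


row = {
    "ListCost": 10.000000001,
    "ContractedCost": 10.0,
    "PublisherName": "Microsoft Corporation",
    "x_BillingProfileId": "ABCD-1234",
}
fixed_row, applied = apply_cost_fixes(row)
print(fixed_row, applied)
```

Returning the warnings alongside the row keeps every change traceable, which would also help with any later backfill of historical data.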

Prices (price sheet)

  • Bug: Change PricingCategory from "Standard" to "Committed" for CommitmentDiscountCategory == "Usage" (also x_PricingSubcategory)
  • Bug: Change SkuPriceId from "_" for unused savings plan rows
  • Bug: x_CommitmentDiscountSpendEligibility is not set correctly
  • Question: How should we handle null MeterId? Why are they null?
  • Question: Consider adding x_CommitmentDiscountMinimum (as in the minimum commitment amount; e.g., 100TB, 1PB, 10PB for storage)
  • Question: Consider adding x_CommitmentDiscountNormalizedRatio
  • Question: Consider adding x_ServiceModel
  • Question: Consider cleaning up x_SkuRegion
  • Question: Consider creating new rows for reservation usage
  • Question: Why is MeterName missing for some records (e.g., reservations)?
  • Question: Why are 309 meters missing all meter metadata for MCA?
  • Question: Why is MarketPrice not the on-demand list price for savings plan prices?
  • Question: Why is MarketPrice ~= UnitPrice (but not ==) for savings plan prices?
  • Question: Why are there Az Stack and Az Comm Services meters with "Unassigned" ProductName, MeterCategory, MeterSubcategory, MeterName, MeterRegion?
  • Question: Does ListUnitPrice/x_EffectiveUnitPrice match usage data for savings plans?
  • Question: Does x_SkuMeterName match usage data?
  • Question: Does the effective start date for reservation prices change based on the export date?
  • Question: How can we differentiate between Microsoft and third-party meters?
  • Question: How do we create a unique ID for third-party meters for MCA?
  • Question: How do we differentiate dev/test for MCA?
  • Change: Compare join vs. current per-row approach for populating commitment discount eligibility
  • Change: Compare join vs. lookup performance for populating savings plan prices
  • Change: Compare join vs. lookup performance for populating pricing unit columns
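
As an illustration of the first Prices bug, the sketch below rewrites PricingCategory for usage-based commitment discount rows. It assumes price rows are plain dictionaries; `fix_pricing_category` is a hypothetical name. The matching x_PricingSubcategory value is not stated in this issue, so the sketch leaves that column untouched.

```python
def fix_pricing_category(price_row: dict) -> dict:
    """Return a copy of a price row with PricingCategory corrected for usage commitments."""
    fixed = dict(price_row)
    if (
        fixed.get("CommitmentDiscountCategory") == "Usage"
        and fixed.get("PricingCategory") == "Standard"
    ):
        fixed["PricingCategory"] = "Committed"
        # x_PricingSubcategory likely needs a matching change; the target value
        # is not specified here, so it is intentionally left as-is.
    return fixed


print(fix_pricing_category(
    {"CommitmentDiscountCategory": "Usage", "PricingCategory": "Standard"}
))
```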

Recommendations (reservation recommendations)

  • Bug: Why is SubscriptionId not included?
  • Feature: Add support for ingesting multiple types of recommendations (not a transform issue)
  • Feature: Add Advisor recommendation columns
  • Change: Move ProviderName upstream

Transactions (reservation transactions)

  • Question: Is the Currency column the pricing or billing currency?
  • Question: A "Cancel" ChargeClass is not valid in FOCUS -- should we track this differently or leave it as an acceptable conformance gap?
  • Question: How does BillingFrequency map to ChargeFrequency?
  • Question: Should we include ChargePeriodEnd?
  • Change: Move ProviderName upstream
  • Feature: Consider looking up the billing account ID from cost data to fix the invalid BillingAccountId column

🙋‍♀️ Ask for the community

We could use your help:

  1. Please vote this issue up (👍) to prioritize it.
  2. Are there any open questions or concerns you have about the Cost Management data? How can we improve it?
  3. Leave comments to help us solidify the vision.

Labels

  • OKR: 2.1 Adoption (Issues that contribute to customer adoption)
  • Tool: Cost Management (External issues related to Microsoft Cost Management)
  • Tool: FinOps hubs (Data pipeline solution)