Telos Global Telecom
Case Study
Problem Statement
Telos Global Telecom's "Customer 360" program was being derailed by poor data quality. Years of unvalidated ingestion into the Bronze layer had produced inconsistent and unreliable customer records. Addresses contained multiple suffix variations ("Street", "St", "Str."), state names were a mix of abbreviations and spellings ("CA", "Calif.", "California"), and phone numbers were riddled with non-numeric characters. As a result, marketing campaigns suffered from undeliverable mailers, and analytics teams could not trust their models or metrics. The company needed a reliable, repeatable way to cleanse and standardize this data before it reached downstream consumers.
Solution
A data-cleansing pipeline was developed using Microsoft Fabric Dataflow Gen2 to automatically standardize and validate customer records. The solution ingests raw data from the Customer_Bronze table, applies cleansing and lookup transformations, and writes standardized outputs to the Customer_Silver layer in the Lakehouse.
Key capabilities include:
- Standardization of address and state fields through lookup mappings.
- Removal of invalid characters from phone numbers.
- Splitting of composite address fields into standardized components.
- Assignment of a data quality flag to classify clean vs. suspect records.
The design is fully automated, low-code, and reusable—enabling consistent daily data quality without manual intervention.
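The data quality flag mentioned above is not detailed further in this case study, so the rule below is purely an assumption for illustration: a record is "clean" only when its standardized fields are all populated.

```python
def quality_flag(record: dict) -> str:
    """Hypothetical classification rule (not the actual Dataflow logic):
    mark a record 'clean' when every required standardized field is
    populated, otherwise 'suspect'."""
    required = ("State", "Phone", "Address")
    return "clean" if all(record.get(field) for field in required) else "suspect"

print(quality_flag({"State": "California", "Phone": "415-555-0100",
                    "Address": "12 Main Street"}))   # clean
print(quality_flag({"State": "California", "Phone": None,
                    "Address": "12 Main Street"}))   # suspect
```

In practice the flag would be computed as a conditional column at the end of the dataflow, so downstream consumers can filter or route suspect records without re-running validation.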


The workflow begins by connecting to the Lakehouse and ingesting the customer_bronze.csv source file. Column headers are promoted, and data types are explicitly defined to ensure consistent schema enforcement across all fields.
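In Dataflow Gen2 these are low-code steps; as an illustrative sketch only (column names and sample values are assumptions, not taken from the actual dataset), the equivalent schema-enforcement logic in Python/pandas might look like:

```python
import pandas as pd

# Assumed sample rows standing in for customer_bronze.csv.
raw = pd.DataFrame({
    "CustomerID": ["101", "102"],
    "State": ["Calif.", "CA"],
    "Phone": ["(415) 555-0100", "415.555.0101"],
})

# Promoting headers is implicit in pandas; explicitly declaring types
# ensures every downstream step sees the same schema.
typed = raw.astype({"CustomerID": "int64", "State": "string", "Phone": "string"})
```

Declaring types up front surfaces bad rows at ingestion, rather than letting them fail silently in later joins.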


State names are standardized through a left join with the State_Lookup table, replacing raw values with their corresponding entries from the standardized reference field.
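A minimal sketch of this lookup join, assuming hypothetical column names ("RawState", "StandardState") for the State_Lookup table:

```python
import pandas as pd

customers = pd.DataFrame({"CustomerID": [1, 2, 3],
                          "State": ["Calif.", "CA", "California"]})
state_lookup = pd.DataFrame({"RawState": ["Calif.", "CA", "California"],
                             "StandardState": ["California"] * 3})

# A left join keeps every customer row even when no mapping exists;
# unmatched raw values fall back to their original spelling.
merged = customers.merge(state_lookup, left_on="State",
                         right_on="RawState", how="left")
customers["State"] = merged["StandardState"].fillna(customers["State"])
```

The left join (rather than an inner join) is what makes the step safe: an unmapped state never drops a customer record, it simply passes through for the quality flag to catch.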
All special characters are removed from the phone numbers, and the Add Column from Example feature reformats them into a standardized 123-456-7890 pattern. Validation logic ensures that only entries containing exactly ten digits are retained.
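The strip-validate-format logic can be sketched as follows (the helper name is hypothetical; the actual pipeline performs this via low-code Dataflow steps):

```python
import re

def standardize_phone(raw: str):
    """Strip non-digit characters, validate, and format as 123-456-7890."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:   # keep only entries with exactly ten digits
        return None         # suspect entry, handled downstream
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

print(standardize_phone("(415) 555-0100"))  # 415-555-0100
print(standardize_phone("555-0100"))        # None (too few digits)
```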

The Address column is split into two components:
Unit Component: Extracts identifiers such as apartments, suites, or flat numbers (e.g., “Apt 5”, “Suite 12”).
Street Component: Removes unit references and standardizes suffixes (e.g., “Rd” → “Road”, “St” → “Street”) via a left join with the Address_Lookup table.
Both cleaned columns are then recombined into a single, unified address field.
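The split, suffix standardization, and recombination described above can be sketched as below; the suffix map stands in for a small slice of the Address_Lookup table, and the unit-matching pattern is an assumption for illustration:

```python
import re

SUFFIX_MAP = {"St": "Street", "Rd": "Road", "Ave": "Avenue"}  # sample of Address_Lookup
UNIT_RE = re.compile(r"\b(?:Apt|Suite|Flat)\s*\w+$", re.IGNORECASE)

def split_address(address: str):
    """Split an address into street and unit parts, standardize the
    street suffix, then recombine into a unified field."""
    match = UNIT_RE.search(address)
    unit = match.group(0) if match else ""
    street = UNIT_RE.sub("", address).rstrip(", ")
    parts = street.split()
    if parts and parts[-1] in SUFFIX_MAP:     # standardize trailing suffix
        parts[-1] = SUFFIX_MAP[parts[-1]]
    street = " ".join(parts)
    full = f"{street}, {unit}" if unit else street
    return street, unit, full

print(split_address("12 Main St, Apt 5"))
# ('12 Main Street', 'Apt 5', '12 Main Street, Apt 5')
```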
Business Impact
- Delivered a consistent, reliable customer dataset that serves as a single source of truth across analytics and operations.
- Replaced manual cleansing tasks with an automated workflow, cutting daily effort by more than 90%.
- Standardized address and phone data improved campaign reach and accuracy in customer communications.
- Reliable, well-structured data enabled faster reporting and more accurate churn and marketing models.
- Established a scalable, low-code data transformation layer in Microsoft Fabric that supports additional domains and use cases.