Metropolis City
Case Study
Problem Statement
The City of Metropolis centralized its departmental data into a shared “central inbox” to break down long-standing data silos. While this improved data submission, it introduced a new challenge: the inbox quickly became an ungoverned dumping ground for CSV and JSON files from multiple departments, including Police, Transportation, and 311 Citizen Services.
Analysts were manually inspecting this folder every day, visually matching filenames like Police_Incidents.csv, Parking_Violations.json, and 311_Service_Requests.csv to the correct ingestion pipelines.
This manual triage created severe operational risks:
- Misnamed files (e.g., Parking_Violations_report_final.json) were skipped entirely.
- Files were frequently double-loaded because nothing was archived or quarantined.
- Unrecognized files lingered in the inbox, becoming “digital ghosts” that were neither processed nor reviewed.
The city needed an automated, intelligent, and scalable file-routing solution to restore trust and reliability in its new data lakehouse.
Solution
A dynamic file-routing pipeline was implemented in Microsoft Fabric to automatically classify, ingest, archive, and quarantine all files entering the city’s central inbox. The workflow uses a metadata-driven design to evaluate each file’s extension and naming pattern, ensuring accurate routing into the correct departmental tables.


Get Metadata (inbox_files)
Retrieves the complete list of files in the city’s shared inbox using the childItems field.
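As a rough illustration, the shape of the Get Metadata activity's childItems output can be emulated in Python. The folder and filenames below are invented for the demo; the real activity runs against the city's inbox in Fabric, not a local directory:

```python
import os
import tempfile

def get_child_items(folder: str) -> list[dict]:
    """Mimic Get Metadata's childItems: one {name, type} dict per entry."""
    return [
        {"name": entry.name, "type": "File" if entry.is_file() else "Folder"}
        for entry in os.scandir(folder)
    ]

# Demo: a throwaway inbox containing two sample files.
inbox = tempfile.mkdtemp()
for fname in ("Police_Incidents.csv", "Parking_Violations.json"):
    open(os.path.join(inbox, fname), "w").close()

items = get_child_items(inbox)
print(sorted(item["name"] for item in items))
```

Downstream activities reference each entry's name and type fields, which is why the sketch returns that same two-field structure.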

ForEach Loop
Iterates through each file, enabling dynamic per-file evaluation. Inside the loop, a Switch activity inspects each file’s format and routes it to the appropriate handler for CSV, JSON, or unknown formats. This ensures every file is evaluated and provides the framework for format-specific routing and subsequent cleanup.
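The branch decision inside the loop can be sketched as a small classifier. This is a hypothetical stand-in for the Switch expression (the real pipeline evaluates the extension with a pipeline expression, not Python), but the logic is the same:

```python
def classify(filename: str) -> str:
    """Return the Switch branch a file would take, based on its extension."""
    ext = filename.lower().rsplit(".", 1)[-1] if "." in filename else ""
    return {"csv": "csv", "json": "json"}.get(ext, "unknown")

print(classify("Police_Incidents.csv"))     # csv
print(classify("Parking_Violations.json"))  # json
print(classify("notes.txt"))                # unknown
```

Anything that is not CSV or JSON falls through to the "unknown" branch, which is what feeds the quarantine path described next.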

Switch Activity
The Switch activity inspects the format of each incoming file (CSV, JSON, or unknown). Based on this validation, the pipeline routes the file to the corresponding departmental folder, or to a quarantine location if the format is unrecognized. After routing, the file is deleted from the inbox to keep it clean and prevent double-loading. This ensures effective segregation, format validation, and reliable downstream ingestion.
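Combining the format check with the naming patterns gives a routing table. The sketch below is an assumed shape (the `ROUTES` patterns and staging folder names are hypothetical, not taken from the city's actual configuration); note that a pattern match, rather than an exact filename match, is what lets a misnamed file like Parking_Violations_report_final.json still reach the right folder instead of being skipped:

```python
import re

# Hypothetical routing table: filename pattern -> staging folder.
ROUTES = [
    (re.compile(r"police_incidents.*\.csv$", re.I), "staging/police_incidents"),
    (re.compile(r"parking_violations.*\.json$", re.I), "staging/parking_violations"),
    (re.compile(r"311_service_requests.*\.csv$", re.I), "staging/311_service_requests"),
]

def route(filename: str) -> str:
    """Return the destination folder; anything unmatched goes to quarantine."""
    for pattern, folder in ROUTES:
        if pattern.search(filename):
            return folder
    return "quarantine"

# In the pipeline, a copy-then-delete (move) to route(filename) would follow;
# here we only compute the destination so the decision logic stays testable.
print(route("Parking_Violations_report_final.json"))
print(route("mystery.xml"))
```

Onboarding a new department then means appending one entry to the routing table, which matches the scalability claim in the Business Impact section.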

Table creation in Lakehouse
Following the completion of the ForEach loop, three dedicated Copy Data activities manage the ingestion process. Each independently loads validated files from its department-specific staging folder (Police Incidents, Parking Violations, and 311 Service Requests) into the corresponding Lakehouse table. This ensures clean, schema-aligned data loading, which is critical for reliable analytics and operational reporting across city services.
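The schema-aligned loading a Copy Data activity performs can be sketched in miniature: source columns are mapped onto named fields before landing in the target table. The sample rows and column names below are invented for illustration, not taken from the actual Police Incidents schema:

```python
import csv
import io

# Hypothetical staged CSV content for the Police Incidents table.
staged_csv = io.StringIO(
    "incident_id,category,reported_at\n"
    "1001,Theft,2024-01-05\n"
    "1002,Vandalism,2024-01-06\n"
)

# A Copy Data activity maps source columns onto the target table's schema;
# DictReader emulates that by producing one schema-aligned dict per row.
rows = list(csv.DictReader(staged_csv))
print(len(rows), rows[0]["category"])  # 2 Theft
```

Because each staging folder holds only files already validated by the Switch branch, the loader can assume a consistent header row per department.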

Business Impact
- Eliminated 100% of manual triage work, saving analysts several hours every day.
- Reduced ingestion errors and data loss to zero, thanks to quarantine-based exception handling.
- Improved data governance and auditability, with every processed file archived and traceable.
- Increased trust in the city’s unified data platform, enabling consistent cross-department analytics.
- Enabled scalable onboarding of new departments, as adding new file types now requires only updating routing logic, not building new pipelines.