db 4 Data Quality and MDM Overview
Author: Csoda81
Uploaded: 2026-03-08
Views: 26
Description:
This video provides an educational overview of Data Quality (DQ) and Master Data Management (MDM), focusing on concepts and practical demonstrations using SQL Server tools.
Core Concepts of Data Quality
Types of Data: The video categorizes data as transactional data, hierarchical data and taxonomies, data warehouse data, semi-structured data (XML/JSON), and unstructured data (emails/PDFs) [00:04].
Dimensions of Quality:
Hard Dimensions: Completeness (all data present), accuracy (correct values), and consistency (no contradictions across systems) [00:28].
Soft Dimensions: User-perceived qualities like "trust" [00:46].
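A hard dimension such as completeness can be measured directly. The following is a minimal sketch, not taken from the video: the function name `completeness` and the sample `customers` rows are made up for illustration, and "present" is assumed to mean neither NULL nor blank.

```python
def completeness(rows, column):
    """Fraction of rows whose value in `column` is present (not None, not blank)."""
    if not rows:
        return 1.0  # assumption: an empty data set is treated as fully complete
    present = sum(
        1 for row in rows
        if row.get(column) is not None and str(row.get(column)).strip() != ""
    )
    return present / len(rows)


# Hypothetical sample data, for illustration only.
customers = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Bob", "email": None},
    {"name": "Cid", "email": ""},
    {"name": "Dee", "email": "dee@example.com"},
]
email_completeness = completeness(customers, "email")
name_completeness = completeness(customers, "name")
```

Accuracy and consistency usually need a reference source or a second system to compare against, so they are harder to reduce to a single-table ratio like this one.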
Master Data Management (MDM): MDM is the remedy applied when data scores poorly on these quality dimensions. It requires central storage, data governance, and Data Stewards, the people responsible for the quality of specific data types (e.g., customer data) [00:55].
Data Profiling & SQL Analysis
Functional Dependencies: The video demonstrates how to use SQL queries on the AdventureWorks database to check if one column's value can be determined by another (e.g., checking if StateProvinceCode determines CountryRegionCode) [01:34].
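The dependency check described above can be sketched as a GROUP BY query that looks for determinant values mapped to more than one dependent value. This is an illustrative sketch, not the query shown in the video: it runs against an in-memory SQLite table with made-up rows rather than AdventureWorks, and the helper name `fd_violations` is hypothetical.

```python
import sqlite3


def fd_violations(conn, table, determinant, dependent):
    """Return (determinant value, distinct dependent count) pairs that break
    the functional dependency determinant -> dependent."""
    query = (
        f'SELECT "{determinant}", COUNT(DISTINCT "{dependent}") '
        f'FROM "{table}" '
        f'GROUP BY "{determinant}" '
        f'HAVING COUNT(DISTINCT "{dependent}") > 1 '
        f'ORDER BY "{determinant}"'
    )
    return conn.execute(query).fetchall()


# Toy data (not real AdventureWorks rows): "WA" appears under two countries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StateProvince (StateProvinceCode TEXT, CountryRegionCode TEXT)"
)
conn.executemany(
    "INSERT INTO StateProvince VALUES (?, ?)",
    [("WA", "US"), ("CA", "US"), ("BC", "CA"), ("WA", "AU")],
)
violations = fd_violations(
    conn, "StateProvince", "StateProvinceCode", "CountryRegionCode"
)
```

An empty result means the dependency holds for the data at hand; any returned row is a determinant value that needs review.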
Profiling Tools: Data profiling automatically computes column statistics and identifies candidate keys and potential foreign keys. In SQL Server this can be done with the SSIS Data Profiling task [03:37].
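To make the idea concrete, here is a small sketch of what a profiler computes. It is not how the SSIS task is implemented; the function names `profile` and `fk_candidate` and the sample rows are assumptions made for illustration, and only single-column keys are considered.

```python
def profile(rows):
    """Per-column null count, distinct count, and candidate-key flag."""
    columns = list(rows[0].keys()) if rows else []
    report = {}
    for col in columns:
        values = [row[col] for row in rows]
        non_null = [v for v in values if v is not None]
        distinct = len(set(non_null))
        report[col] = {
            "nulls": len(values) - len(non_null),
            "distinct": distinct,
            # A single-column candidate key has no nulls and no repeats.
            "candidate_key": len(non_null) == len(values)
            and distinct == len(values),
        }
    return report


def fk_candidate(child_rows, child_col, parent_rows, parent_col):
    """True if every non-null child value also appears in the parent column."""
    parent_values = {row[parent_col] for row in parent_rows}
    return all(
        row[child_col] in parent_values
        for row in child_rows
        if row[child_col] is not None
    )


# Hypothetical sample tables.
customers = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": None},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 3},
]
customer_profile = profile(customers)
orders_reference_customers = fk_candidate(orders, "customer_id", customers, "id")
```

A flag raised this way is only a candidate: uniqueness or inclusion in a sample does not prove the constraint holds for all future data.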
SQL Server Data Quality Services (DQS)
The second half of the video focuses on DQS, a tool for cleaning and matching data [04:10].
DQS Components: Includes the DQS Engine, Knowledge Bases (KB), and various user roles (Administrator, KB Editor, etc.) [04:19].
Knowledge Bases (KB): A KB contains "domains" (reference value sets). Domains can have rules similar to check constraints and "leading values" with associated synonyms [06:20].
Cleansing Projects:
Basic Cleansing: Correcting individual field values [05:32].
Matching/Deduplication: Identifying and removing duplicate records [05:39].
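The domain concept described above (leading values with synonyms, plus rules that act like check constraints) can be modeled roughly as follows. This is a conceptual sketch only and does not use or mirror any DQS API; the class `Domain`, its methods, and the sample values are hypothetical.

```python
class Domain:
    """A reference value set with synonyms and one validation rule."""

    def __init__(self, name, rule=None):
        self.name = name
        # The rule plays the role of a check constraint; True means allowed.
        self.rule = rule or (lambda value: True)
        # Maps every known term (lowercased) to its leading value.
        self.leading_of = {}

    def add_leading_value(self, leading, *synonyms):
        for term in (leading, *synonyms):
            self.leading_of[term.lower()] = leading

    def cleanse(self, value):
        """Return the leading value for a known term, the trimmed value if it
        is unknown but passes the rule, or None if the rule rejects it."""
        trimmed = value.strip()
        key = trimmed.lower()
        if key in self.leading_of:
            return self.leading_of[key]
        return trimmed if self.rule(trimmed) else None


# Illustrative domain: country names must not contain digits.
country = Domain("Country", rule=lambda v: not any(ch.isdigit() for ch in v))
country.add_leading_value("United States", "USA", "US", "U.S.")

known = country.cleanse(" usa ")
unknown_valid = country.cleanse("Canada")
rejected = country.cleanse("C4nada")
```

Here `known` resolves to the leading value, `unknown_valid` passes through unchanged, and `rejected` is None because the rule fails.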
Demos and Practices
Cleaning Last Names: A demonstration of using a pre-installed "US Last Name" KB to cleanse 18,000 records. It shows how values are categorized as Correct, Corrected, Suggested, New, or Invalid [07:19].
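One way to picture the five statuses is as the outcome of a lookup, a validity rule, and a similarity score compared against two thresholds. The sketch below is an assumption-laden illustration, not the DQS algorithm: the similarity measure (Python's `difflib.SequenceMatcher`), the threshold values, the order of the checks, and the tiny name list are all chosen for the example.

```python
from difflib import SequenceMatcher


def classify(value, known_values, auto_correct=0.85, suggest=0.70):
    """Return (status, output_value) for one input value.

    Thresholds are illustrative, not DQS defaults."""
    term = value.strip().lower()
    if term in known_values:
        return "Correct", term
    # Assumed rule: last names contain only letters, apostrophes, and hyphens.
    if not term or not all(ch.isalpha() or ch in "'-" for ch in term):
        return "Invalid", term
    # Find the closest known value and its similarity score.
    best = max(known_values, key=lambda k: SequenceMatcher(None, term, k).ratio())
    score = SequenceMatcher(None, term, best).ratio()
    if score >= auto_correct:
        return "Corrected", best   # close enough to fix automatically
    if score >= suggest:
        return "Suggested", best   # plausible fix, left for a human to approve
    return "New", term             # valid but unknown to the knowledge base


# Hypothetical miniature knowledge base of last names.
known_names = ["smith", "johnson", "miller"]
results = [
    classify(v, known_names)
    for v in ["Smith", "jonson", "smyth", "zhang", "12345"]
]
```

Values that end up as Suggested or New are the ones a reviewer would typically inspect by hand before the results are exported.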
Custom Knowledge Base: The video concludes by showing how to build a custom KB through Knowledge Discovery, which involves sampling existing data to define new domains and rules [09:30].
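The sampling idea behind Knowledge Discovery can be approximated by counting normalized values in a sample and proposing the frequent ones as domain values. This is a rough sketch under stated assumptions, not what DQS does internally: the function name `discover_domain_values`, the `min_count` cutoff, and the sample values are invented for illustration.

```python
from collections import Counter


def discover_domain_values(sample, min_count=2):
    """Split sampled values into frequent candidates and rare ones to review."""
    # Normalize whitespace and casing; skip missing and blank entries.
    counts = Counter(v.strip().title() for v in sample if v and v.strip())
    candidates = sorted(v for v, n in counts.items() if n >= min_count)
    review = sorted(v for v, n in counts.items() if n < min_count)
    return candidates, review


# Hypothetical sample drawn from an existing column.
sample = ["red", "Red ", "BLUE", "blue", "gren", None, ""]
candidates, review = discover_domain_values(sample)
```

Frequent values become proposed domain members, while rare ones such as a likely misspelling are set aside for a person to accept, correct, or map to a leading value.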