by Danish Ali Detho | O365 & Power Platform Solution Architect
Every organization relies on many sources of data, often scattered across cloud and on-premises systems. These sources keep growing over time, and without an efficient data governance system it becomes difficult for IT teams to keep track of all the data assets in use across the organization. That is when an enterprise data catalog becomes a necessity. Over the past few years, Microsoft has launched various products for enterprise data cataloging, data governance, and compliance, including Azure Purview and Microsoft 365 services such as Records Management and the Microsoft 365 Compliance Center. It has now packaged them together under a new umbrella brand for its data governance and compliance products and services: Microsoft Purview. Microsoft Purview is a unified data governance service that helps organizations manage and govern on-premises, multi-cloud, and software-as-a-service (SaaS) data in one place. In this blog, we will take a quick look at Microsoft Purview and what it offers in terms of data governance, compliance, and records management.
What is Microsoft Purview?
Microsoft Purview, previously known as Azure Purview, is a set of tools and services that provides centralized data governance for an organization’s entire environment, including on-premises databases, cloud databases, SaaS data, and virtually any other data source or platform. Microsoft Purview reached general availability (GA) in April 2022, with a focus on improving data governance across the organization by enabling data discovery, traceability, and searchability. This allows employees to search for and discover organizational data, which helps streamline data operations and prevents duplicate or redundant projects across teams. It also gives you visibility into assets across your entire data estate and uses that visibility to manage end-to-end data risks and regulatory compliance. The three main components of Microsoft Purview are:
Data Map, which provides the foundation for data discovery and effective data governance. The Microsoft Purview Data Map is a cloud-native PaaS service that captures metadata about enterprise data held in analytics and operational systems, on-premises and in the cloud.
Data Catalog, which helps users find trusted data sources by browsing and searching data assets. The Data Catalog aligns your assets with business terms and data classifications so that data sources are easier to identify.
Data Insights, which gives you an overview of your data estate to help you discover what kinds of data you have and where it resides.
Key Features
Unified map of data across hybrid sources
Microsoft Purview provides a unified map of all the data assets within the organization and the relationships between them, which makes data governance more effective. The Microsoft Purview Data Map stores metadata, annotations, and relationships associated with data assets in a searchable knowledge graph. It helps automate the management of metadata from hybrid sources and classifies data using built-in and custom classifiers as well as Microsoft Information Protection sensitivity labels. It can integrate with other data catalogs and systems through Apache Atlas APIs.
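To make the Apache Atlas integration point more concrete, here is a minimal sketch of the JSON shape that the Apache Atlas v2 entity API works with: an entity has a type name and a set of attributes, including a unique `qualifiedName`. The type name `custom_dataset`, the example values, and the helper `build_atlas_entity` are hypothetical and only for illustration. The sketch builds the payload without sending it; the endpoint URL and authentication for a given Purview account should be taken from the official documentation.

```python
import json


def build_atlas_entity(type_name, qualified_name, name, description=""):
    """Build an Atlas-style entity payload (illustrative helper, not a Purview SDK call)."""
    return {
        "entity": {
            "typeName": type_name,  # assumed to be a type already defined in the catalog
            "attributes": {
                "qualifiedName": qualified_name,  # unique identifier of the asset
                "name": name,
                "description": description,
            },
        }
    }


# Hypothetical asset: a sales dataset living in an on-premises system.
payload = build_atlas_entity(
    type_name="custom_dataset",
    qualified_name="onprem://erp/sales/orders",
    name="orders",
    description="Order lines exported from the ERP system",
)

# Serialize to JSON, as it would be sent in the body of an HTTP request.
body = json.dumps(payload, indent=2)
print(body)
```

Keeping the payload construction separate from the HTTP call makes it easy to validate or log what would be registered before anything is written to the catalog.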
Improved data utilization
Data is spread across cloud services, on-premises systems, SaaS applications, databases, and more, which limits its usability. Microsoft Purview automatically discovers and classifies data without having to move it across systems or formats. This increases the business value of data for your data consumers through the Data Catalog. Interactive data lineage visualization helps you understand the origin of your data, and data scientists and analysts can find the data they need for BI, analytics, AI, and machine learning.
Enhanced discovery and protection of sensitive data
One of the key features of Microsoft Purview is the ability to identify sensitive data (including during data discovery) and mark it as sensitive in its metadata. A function called Classification comes with an extensive range of predefined patterns that are used to search for specific types of data normally associated with sensitive or private information, such as driver’s license numbers and credit card numbers. Organizations can then apply security group policies on top of Microsoft Purview to restrict who can search for that data and so maintain regulatory compliance. This enables the management of sensitive data across your entire data estate.
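The idea of pattern-based classification can be illustrated with a small sketch. This is not how Microsoft Purview implements its classifiers; it is a simplified example, under the assumption that a classifier combines a regular expression with a validity check (here the Luhn checksum for card numbers) and labels a column when enough sampled values match. The function names, the label text, and the 50% threshold are all hypothetical.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_card_number(value):
    """Return True if the text contains a pattern match that also passes Luhn."""
    for match in CARD_PATTERN.finditer(value):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return True
    return False


def classify_column(samples, min_ratio=0.5):
    """Label a column as sensitive when enough sampled values look like card numbers."""
    if not samples:
        return []
    hits = sum(1 for s in samples if contains_card_number(s))
    return ["Credit Card Number"] if hits / len(samples) >= min_ratio else []


# 4111 1111 1111 1111 is a widely used test card number that passes the Luhn check.
print(classify_column(["4111 1111 1111 1111", "4111-1111-1111-1111"]))
print(classify_column(["1234 5678 9012 3456", "order 42"]))
```

The checksum step matters because a bare digit pattern would flag many ordinary identifiers; combining a pattern with a validation rule reduces false positives.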
Reduced data duplication by Data Sharing
Microsoft Purview provides a streamlined user interface that enables data producers and consumers to collaborate. Microsoft Purview Data Sharing (in preview) makes it easy to share data within or between organizations. This reduces data duplication by improving how data is used across teams and departments, and it lets you collaborate with external business partners while keeping the data secure in your own environment.
Improved data governance using Data Estate Insights
The Data Estate Insights application is purpose-built for governance stakeholders and provides actionable insights into the organization’s data estate, catalog usage, adoption, and processes. As organizations scan and populate their Microsoft Purview Data Map, the Data Estate Insights application automatically identifies governance gaps and highlights them in its top metrics. All the reports within the Data Estate Insights application are generated and populated automatically.
How much does it cost?
Purview is billed on usage, so pricing varies with the amount of data, the number of systems scanned, and how many people use the instance. A likely low-end scenario is a single platform with approximately four systems scanned weekly, generating less than 2 GB of metadata. This would cost between USD $3,500 and USD $7,000 per year.
| Category | Item | Price |
|---|---|---|
| Data Map Population | For Power BI online | Free for a limited time |
| Data Map Population | For SQL Server on-prem | Free for a limited time |
| Data Map Population | For other data sources | $0.63 per 1 vCore Hour |
| Data Map Enrichment | Advanced Resource Set | $0.21 per 1 vCore Hour |
| Data Map Enrichment | Insights Generation | $0.82 per 1 vCore Hour |
| Data Map Consumption | Capacity Unit | $0.411 per Capacity Unit per Hour |
Example
Data Map (always on): 1 capacity unit x $0.411 per capacity unit per hour x 730 hours, for up to 10 GB of metadata storage and 25 operations per second
Scanning (pay as you go): total [M] minutes of all scans in a month / 60 minutes per hour x 32 vCores per scan x $0.63 per vCore per hour
Resource Set: total [H] hours of processing Advanced Resource Set data assets in a month x $0.21 per vCore per hour
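The three formulas above can be turned into a small estimator. The rates, the 730 hours per month, and the 32 vCores per scan come from the example; the function names and the usage figures in the sample call (120 minutes of scanning and 5 hours of Advanced Resource Set processing) are hypothetical and only show how the pieces add up. Actual bills depend on current pricing and region.

```python
HOURS_PER_MONTH = 730        # always-on hours used in the example above
CAPACITY_UNIT_RATE = 0.411   # USD per capacity unit per hour
SCAN_VCORE_RATE = 0.63       # USD per vCore per hour
RESOURCE_SET_RATE = 0.21     # USD per vCore per hour
VCORES_PER_SCAN = 32         # vCores per scan, as stated in the example


def data_map_cost(capacity_units=1, hours=HOURS_PER_MONTH):
    """Always-on Data Map cost for one month."""
    return capacity_units * CAPACITY_UNIT_RATE * hours


def scanning_cost(total_scan_minutes, vcores_per_scan=VCORES_PER_SCAN):
    """Pay-as-you-go scanning cost for the total scan minutes in a month."""
    return total_scan_minutes / 60 * vcores_per_scan * SCAN_VCORE_RATE


def resource_set_cost(total_hours):
    """Advanced Resource Set processing cost for the total hours in a month."""
    return total_hours * RESOURCE_SET_RATE


def monthly_estimate(scan_minutes, resource_set_hours, capacity_units=1):
    """Sum of the three components for one month."""
    return (
        data_map_cost(capacity_units)
        + scanning_cost(scan_minutes)
        + resource_set_cost(resource_set_hours)
    )


# Hypothetical usage: 120 scan minutes and 5 resource set hours in a month.
estimate = monthly_estimate(scan_minutes=120, resource_set_hours=5)
print(f"Estimated monthly cost: ${estimate:,.2f}")
```

With one capacity unit, the always-on Data Map alone comes to about $300 per month, so scanning frequency is the main lever for controlling the rest of the bill.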
Conclusion
Microsoft Purview is a strong option for simplifying and improving data governance, especially for organizations that rely heavily on the Microsoft ecosystem and run a hybrid mix of cloud and on-premises enterprise applications. It helps you understand and govern data across your estate, safeguard that data wherever it lives, and manage risk and compliance more simply, and it lets you search for data using technical or business terms. It is also a good foundation for an enterprise data catalog, with a business glossary for business users, classification capabilities for IT security, and data flow and lineage tracking from source to target for IT architects and analysts.