Manage data quality with metadata

What is metadata
       The contents of any file system are divided into data and metadata. Data is the actual content of ordinary files. Metadata is the system data that describes the characteristics of a file, such as access permissions, the file owner, and the distribution information (inode...) of the file's data blocks. In a cluster file system, the distribution information includes both the location of the file on a disk and the location of that disk in the cluster. To operate on a file, a user must first obtain its metadata in order to locate the file and read its content or attributes.
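       As a small illustration of the split between data and metadata, the sketch below reads a file's metadata on a local file system without touching its contents. It uses Python's standard `os.stat`; the helper name `describe_file` and the temporary sample file are assumptions made for this example and are not part of any cluster file system.

```python
import os
import stat
import tempfile


def describe_file(path):
    """Return a few metadata fields for `path` without reading its data."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                 # identifier of the metadata record
        "size_bytes": info.st_size,           # length of the file's data
        "mode": stat.filemode(info.st_mode),  # access permissions, e.g. -rw-------
        "owner_uid": info.st_uid,             # numeric id of the file owner
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    sample_path = handle.name

metadata = describe_file(sample_path)
os.remove(sample_path)
```

       Here the five bytes written to the file are the data, while everything returned by `describe_file` is metadata about them.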
Metadata management methods
       There are two methods for metadata management: centralized and distributed. Centralized management means that one node in the system is dedicated to metadata management, and all metadata is stored on that node's storage device. Every client request for a file must first request the metadata from this metadata manager. Distributed management means that metadata can be stored on any node of the system and can be migrated dynamically, so responsibility for metadata management is also spread across different nodes. Most cluster file systems employ centralized metadata management. Because centralized management is simple to implement and consistency is easy to maintain, it can provide satisfactory performance up to a certain frequency of operations. The disadvantage is the single point of failure: if the metadata server fails, the entire system stops working properly. Moreover, when metadata operations become too frequent, the centralized metadata manager becomes the performance bottleneck of the whole system.
       The benefit of distributed metadata management is that it removes the single point of failure of centralized management and avoids a performance bottleneck under frequent operations. The disadvantages are that it is complex to implement, consistency is harder to maintain, and this complexity has some impact on performance.
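       The difference between the two approaches can be sketched with a toy model. In the code below, every class name, path, and location record is hypothetical, and hashing the path to choose the responsible node is only one assumed placement rule; real systems may place and migrate metadata quite differently. The sketch only shows how lookups concentrate on one node in the centralized case and spread across nodes in the distributed case.

```python
import hashlib


class CentralMetadataServer:
    """A single node that holds every metadata record (hypothetical)."""

    def __init__(self):
        self.records = {}
        self.requests_served = 0

    def put(self, path, location):
        self.records[path] = location

    def lookup(self, path):
        self.requests_served += 1
        return self.records[path]


class DistributedMetadata:
    """Metadata spread over several nodes by hashing the path (assumed rule)."""

    def __init__(self, node_count):
        self.nodes = [CentralMetadataServer() for _ in range(node_count)]

    def _owner(self, path):
        # A stable hash, so every client agrees on which node owns a path.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:4], "big") % len(self.nodes)]

    def put(self, path, location):
        self._owner(path).put(path, location)

    def lookup(self, path):
        return self._owner(path).lookup(path)


paths = [f"/data/file{i}" for i in range(100)]
central = CentralMetadataServer()
spread = DistributedMetadata(node_count=4)

for p in paths:
    location = {"node": "storage-1", "block": 0}  # made-up location record
    central.put(p, location)
    spread.put(p, location)

for p in paths:
    central.lookup(p)
    spread.lookup(p)

# Requests handled by each metadata node in the distributed case.
per_node_load = [node.requests_served for node in spread.nodes]
```

       After the loop, `central.requests_served` accounts for every lookup, while `per_node_load` shows the same requests divided among the four nodes. The sketch leaves out the hard parts named above, namely keeping the nodes consistent and migrating records between them.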

 

