How to classify big data analysis standards

  It is commonly assumed that data standards are just a few documents describing specifications and requirements to be followed. We take a broader view: data standards are not merely a set of specifications, but a system composed of management specifications, control processes, and technical tools, through which information is gradually standardized. Data standardization is the process of using data specifications, management and control processes, and technical tools to ensure that important information, such as products, customers, institutions, and accounts, is used and exchanged consistently and accurately inside and outside the company.

  Data standards can be classified along three dimensions: data structure, data content source, and technical versus business perspective.

  1. Data standard classification from the perspective of data structure

  Structured data standards are standards developed for structured data, usually including: information item classification, type, length, definition, value range, etc.
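  As a rough illustration of the structured data standard described above, the sketch below records the classification, type, length, definition, and value range of one information item and checks a value against them. The class `FieldStandard`, the function `check_value`, and the `gender_code` item with its values are hypothetical names chosen for this example, not part of any particular standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldStandard:
    """One information item in a hypothetical structured data standard."""
    name: str                                   # item name
    category: str                               # information item classification
    data_type: type                             # expected Python type
    max_length: Optional[int]                   # maximum string length, None if not applicable
    definition: str                             # definition of the item
    allowed_values: Optional[frozenset] = None  # value range, None if unrestricted


def check_value(std: FieldStandard, value) -> list:
    """Return a list of violations of the standard (empty if the value conforms)."""
    if not isinstance(value, std.data_type):
        return [f"{std.name}: expected {std.data_type.__name__}"]
    problems = []
    if (std.max_length is not None and isinstance(value, str)
            and len(value) > std.max_length):
        problems.append(f"{std.name}: longer than {std.max_length}")
    if std.allowed_values is not None and value not in std.allowed_values:
        problems.append(f"{std.name}: value outside the allowed range")
    return problems


# Illustrative item: a one-character customer gender code (values are made up).
gender_code = FieldStandard(
    name="gender_code",
    category="customer",
    data_type=str,
    max_length=1,
    definition="Gender of the customer as a one-character code",
    allowed_values=frozenset({"M", "F", "U"}),
)

print(check_value(gender_code, "M"))
print(check_value(gender_code, "XX"))
```

  Keeping the standard as data rather than as scattered validation code is one possible design; it lets the same definition drive documentation and checks.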

  Unstructured data standards are standards developed for unstructured data, usually including: file name, format, resolution, etc.

  2. Data standard classification based on data content sources

  Basic data standards apply to the detailed data and related code data generated directly by business systems. They ensure the consistency and accuracy of data related to business activities.

  Derived data standards apply to data obtained by processing and calculating basic data to meet the needs of management and operations, such as statistical indicators and entity tags.
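  The relationship between basic and derived data can be sketched as follows. The detailed records stand in for basic data, the per-customer total is a statistical indicator, and the label computed from it is an entity tag. All field names, values, and the threshold are invented for illustration.

```python
from collections import defaultdict

# Basic data: detailed records as a business system might produce them
# (field names and values are invented for illustration).
transactions = [
    {"customer_id": "C001", "amount": 1200.0},
    {"customer_id": "C001", "amount": 800.0},
    {"customer_id": "C002", "amount": 150.0},
]

# Hypothetical rule that a derived data standard would have to define.
HIGH_VALUE_THRESHOLD = 1000.0


def total_amount_by_customer(records):
    """Statistical indicator: total transaction amount per customer."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["customer_id"]] += rec["amount"]
    return dict(totals)


def tag_customers(totals, threshold=HIGH_VALUE_THRESHOLD):
    """Entity tag: label each customer based on the derived indicator."""
    return {
        cid: "high_value" if total >= threshold else "regular"
        for cid, total in totals.items()
    }


totals = total_amount_by_customer(transactions)
tags = tag_customers(totals)
print(totals)
print(tags)
```

  A derived data standard would pin down exactly the parts that are assumptions here: which records count, how the indicator is calculated, and where the tag boundary lies.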

  3. Data standard classification from a technical and business perspective

  Business data standards are standards developed for business communication, usually including: business definition, managing department, business topic, etc.

  Technical data standards are the unified specifications and definitions of data from an information technology perspective, usually including: data type, field length, precision, data format, etc.
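  One way to picture the two perspectives is as two views of the same data item. The sketch below pairs a business definition with a technical definition and checks a decimal value against the technical one. It assumes that "field length" means the total number of digits and "precision" means the number of digits after the decimal point; the source does not define either term, and all class, function, and field names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BusinessStandard:
    business_definition: str
    managing_department: str
    business_topic: str


@dataclass(frozen=True)
class TechnicalStandard:
    data_type: str     # e.g. "DECIMAL"
    field_length: int  # assumed: total number of digits
    precision: int     # assumed: digits after the decimal point


@dataclass(frozen=True)
class DataStandard:
    name: str
    business: BusinessStandard
    technical: TechnicalStandard


def fits_technical_standard(value: Decimal, tech: TechnicalStandard) -> bool:
    """Check that a decimal value fits the declared length and precision."""
    _, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):  # NaN or Infinity
        return False
    fractional_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    return (fractional_digits <= tech.precision
            and integer_digits <= tech.field_length - tech.precision)


# Illustrative item; the definition and department are made up.
balance = DataStandard(
    name="account_balance",
    business=BusinessStandard(
        business_definition="Current balance of an account",
        managing_department="Finance",
        business_topic="Accounts",
    ),
    technical=TechnicalStandard(data_type="DECIMAL", field_length=18, precision=2),
)

print(fits_technical_standard(Decimal("1234.56"), balance.technical))
print(fits_technical_standard(Decimal("1.234"), balance.technical))
```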

  Data standards are aimed mainly at the business. In an enterprise, much of the business semantics depends on manual sorting by business personnel, which is difficult and inefficient, so business semantics may go undiscovered and unmanaged when that sorting is not done in time. In enterprise data governance, moreover, any data standard is difficult to implement without corresponding technical means.

Origin blog.csdn.net/qq_30187071/article/details/127963277