Archive data preservation, data storage and registration backup

In February of this year, the author had the honor of visiting the Suda Suhang Archives Data Preservation Center and exchanging views with its team, which gave a deeper understanding of archive data preservation and data storage. Nearly five years have passed since the center was officially launched in 2018. It has served more than 30 archives, schools, hospitals, and other organizations, and the volume of preserved data has reached 200 TB. According to the report "Soochow University launched the first foreign service archive data preservation platform", originally published in "China Archives News", this is the first archive data preservation platform in China to offer its services to outside organizations. It addresses problems that arise when electronic archive data is stored, such as obsolete storage media, data loss and damage, and unknown data status, and thereby enables security supervision of archive data.

/Archive data preservation/

The Suda Suhang team has also submitted a proposal for an archives industry standard, the "Electronic Archives Evidence Effectiveness Maintenance Specification", which defines an "electronic archive data preservation system" as follows:

An information system that maintains the evidentiary validity of electronic archives, with functions such as electronic archive evidence storage, monitoring of stored data, error repair, and traceability of the storage process.

With slight reworking, this definition yields a definition of "archive data preservation":

★Archive data preservation: the process of maintaining the evidentiary validity of electronic archives through technical means such as electronic archive evidence storage, monitoring of stored data, error repair, and traceability of the storage process.

This article does not discuss whether the above concepts are accurate. After all, the "Electronic Archives Evidence Effectiveness Maintenance Specification" has not yet been officially released, and its definitions may still be revised. However, the definition above leads to a second concept: "electronic archive storage", or, in a broader and more general sense, "electronic data storage". Storage here means fixing and keeping data as evidence, not merely saving it.

/Archive data storage/

The "Electronic Archives Evidence Effectiveness Maintenance Specification (Draft for Comment)" defines "electronic archive storage" as follows:

The operation of fixing, preserving, and verifying the original state of electronic archives through legally recognized technical means such as electronic signatures, time stamps, extracting and fixing the integrity check value of the electronic archives, and recording it on a blockchain.

The judicial administration industry standard SF/T 0076-2020 "Technical Specifications for Electronic Data Storage" defines "electronic data storage" as follows:

Slightly adapting the above definition and combining it with the technical measures mentioned in the "Electronic Archives Evidence Effectiveness Maintenance Specification (Draft for Comment)", such as electronic signatures, time stamps, integrity verification, and blockchain, gives the following definition of "archive data storage":

★Archive data storage: the operations of fixing, storing, and verifying archive data as evidence through technical measures such as integrity verification, electronic signatures, trusted time stamps, and blockchain.

From the above, archive data storage can be regarded as one technical means of archive data preservation. The purpose of archive data preservation is to maintain the evidentiary validity of electronic archives, which corresponds to the requirement stated in 12 Chinese characters in the new "Archives Law": reliable source, standardized procedure, and compliant elements. Archive data storage itself can in turn be implemented through technical measures such as integrity verification, electronic signatures, trusted time stamps, and blockchain.
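To make the idea of "fixing" and later "verifying" archive data more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method of any system described in this article: the function names and record fields are hypothetical, SHA-256 stands in for the integrity check value, and the time stamp comes from the local clock rather than from a trusted time stamp service. Electronic signatures and blockchain anchoring are left out.

```python
import hashlib
from datetime import datetime, timezone


def make_evidence_record(data: bytes, name: str) -> dict:
    """Fix the current state of `data` as a small evidence record (hypothetical format)."""
    return {
        "name": name,
        "algorithm": "sha256",
        "digest": hashlib.sha256(data).hexdigest(),
        # Assumption: local clock only; a real deployment would rely on a
        # trusted time stamp service instead.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_against_record(data: bytes, record: dict) -> bool:
    """Return True if `data` still matches the digest stored in `record`."""
    return hashlib.sha256(data).hexdigest() == record["digest"]


original = b"archive item 001: meeting minutes"
record = make_evidence_record(original, "item-001")

print(verify_against_record(original, record))         # unchanged copy
print(verify_against_record(original + b"!", record))  # altered copy
```

The record holds only the digest and some metadata, not the archive data itself, which is why storage in this sense does not by itself require a full copy of the data.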

Archive data preservation vs. archive data storage

The author further explains the meaning of the two through two typical application systems.

Blockchain-based business archive data storage system

Take the business archives of public resource trading as an example. Blockchain technology is introduced into the tendering and bidding business, and a public resource trading blockchain alliance is established by public resource trading centers, tendering agencies, and regulatory authorities. Drawing on the transparency, tamper resistance, and traceability of the blockchain, a business archive data storage platform is built and connected to the electronic bidding system and the electronic archives management system. As shown below:

For the tender documents, bid documents, and transaction data that tendering agencies and bidders upload to the electronic bidding system during the bidding process, the system records both the transaction behavior and the transaction data on the chain. Because electronic tender and bid documents are large, what is actually stored on the chain is the hash value of each document. After the transaction is completed, all archiving, arrangement, and usage information produced in the archiving and archive management stages is also stored on the chain, so that every business step can be traced. Tenderers, bidders, and regulatory authorities can query the on-chain records through the system at any time. When a dispute occurs, the judiciary can collect evidence through the blockchain platform, and the data on the platform can be used as electronic evidence. Blockchain-based management of public resource trading archives can provide business archive data storage, anti-counterfeiting verification, archive traceability, data inspection, document forensics, and other functions. This keeps transaction data and materials intact and tamper-resistant, puts transaction behavior and transaction data on the chain, and makes the whole process traceable, so that public resource trading becomes fairer, more just, and more open.
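The point that only the hash value of a large document goes on the chain, and that each step stays traceable, can be illustrated with a simple hash chain. The following Python sketch is a single-process stand-in for a blockchain, not a description of the platform above: the class `HashChainLedger`, its field names, and the event labels are all hypothetical, and there is no consensus or distribution among alliance members.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """Hash an entry body in a stable (sorted-key) JSON encoding."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class HashChainLedger:
    """Append-only list in which every entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, event: str, document: bytes) -> dict:
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "index": len(self.entries),
            "event": event,
            # Only the digest is recorded, never the document itself.
            "document_digest": hashlib.sha256(document).hexdigest(),
            "prev_hash": prev,
        }
        entry = dict(body, entry_hash=entry_hash(body))
        self.entries.append(entry)
        return entry

    def is_intact(self) -> bool:
        """Recompute every hash and check the links between entries."""
        prev = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev or entry_hash(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True


ledger = HashChainLedger()
ledger.append("bid submitted", b"bid document bytes")
ledger.append("bid archived", b"bid document bytes")
print(ledger.is_intact())                      # chain as written

ledger.entries[0]["event"] = "bid withdrawn"   # simulate tampering
print(ledger.is_intact())                      # tampering is detected
```

Because each entry includes the hash of the previous one, changing an earlier record invalidates the check unless every later entry is recomputed as well, which is what a real blockchain makes impractical by replicating the chain across participants.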

The electronic archive data preservation system of an archive

The electronic archive data preservation system combines data backup, data preservation, data storage, intelligent recovery, and other technical means into a management model that supervises the security of massive archive data throughout the whole process. The system provides automatic, policy-based data storage and backup, and monitors the target archive data storage (such as the management database, that is, the production library) in real time, so that data can be monitored, flagged, preserved, and repaired in real time. This keeps archive data safely stored, simplifies backup and recovery of key data, and improves the security of the production environment. The system also supports visual supervision, statistics, and analysis of archive data, which improves the ability of management departments to manage the security of massive data and makes data protection more efficient.

The data preservation mechanism among the four libraries is as follows:

1. Production library A: the archive management library, which is also the resource library of the archive management system;

2. Preservation library B: monitors data changes in production library A, automatically backs up the specified archive data from production library A according to policy, keeps different backup versions, and runs four-property testing (authenticity, integrity, usability, and security) on the backed-up archive data;

3. Local backup library C: backs up the data in preservation library B and checks the readability of the backup data;

4. Off-site backup library D: the data in preservation library B is backed up to storage media, and the media are stored in off-site backup library D;

5. When the data of A is damaged, restore A from B;

6. When the data of A and B are damaged, restore B from C, and then restore A from B;

7. In the extreme case where local A, B, and C are all damaged, first restore B from D, then restore A from B, and finally generate C from B.
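The recovery rules in items 5 to 7 can be written down as a small decision procedure. The Python sketch below is only an illustration: the function `restore_plan` is hypothetical, it works on library names rather than real data, and its handling of cases the list does not mention (for example, only C being damaged) is an assumption that follows the same pattern.

```python
def restore_plan(damaged: set) -> list:
    """Return ordered (source, target) restore steps for the damaged libraries.

    Libraries: "A" production, "B" preservation, "C" local backup,
    "D" off-site backup. B is always restored first, because A and C
    are both rebuilt from B.
    """
    steps = []
    if "B" in damaged:
        if "C" not in damaged:
            source = "C"
        elif "D" not in damaged:
            source = "D"
        else:
            raise RuntimeError("no intact backup of B is available")
        steps.append((source, "B"))
    if "A" in damaged:
        steps.append(("B", "A"))
    if "C" in damaged:
        # Assumption: C is always regenerated from B, as in item 7.
        steps.append(("B", "C"))
    return steps


print(restore_plan({"A"}))            # item 5
print(restore_plan({"A", "B"}))       # item 6
print(restore_plan({"A", "B", "C"}))  # item 7
```

Restoring B before anything else keeps the preservation library as the single source from which both the production library and the local backup are rebuilt.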

/Archive registration backup/

Finally, let's look at the registration and backup work that was widely promoted in the archives departments of Zhejiang Province more than 10 years ago. The concept is similar to "archive data preservation" and "archive data storage".

In December 2009, the General Office of the Zhejiang Provincial Party Committee of the Communist Party of China and the General Office of the Zhejiang Provincial People's Government issued the "Notice on Carrying out the Registration and Backup of Electronic Documents and Digital Archives" (Zhe Wei Ban [2009] No. 140), launching the registration and backup of electronic files and digital archives (hereinafter referred to as "registration and backup") across the province. The notice states:

★Registration and backup: the archives administrative department at or above the county level registers and certifies electronic files and digital archives that have important preservation value for the country and society, and the national comprehensive archives at the same level makes data backups of the certified electronic files and digital archives.

There are three main forms of electronic files and digital archives registered for backup management:

1. Electronic business data.

Refers to the current electronic business data (database) formed by various professional competent departments and subordinate units through special business information management systems;

2. Electronic documents.

Refers to official electronic documents (generally formatted official documents, documents, charts, etc., including relevant metadata and machine-readable catalogs, etc.) formed by various units through computers and other electronic equipment in the process of processing official business;

3. Archive digitization results.

Refers to the data results after the digital conversion of the traditional carrier archives of each unit, including full text and machine-readable catalogues, etc.

Priority is given to registering and backing up important archives related to people's livelihood and the archives of key construction projects. At the same time, the construction of digital archives in government agencies, groups, enterprises, and institutions is to be strengthened, and state-owned enterprises and institutions are to be actively guided in carrying out archive registration and backup.

The registration backup process is shown in the following figure:

Three similarities

There are obvious differences in the implementing subjects, service models, and technical means of archival data preservation, archival data storage and registration backup, but the author believes that these three have at least the following three aspects in common:

1. The management objects are the same:

All three deal with archive data, or the "electronic files and digital archives" mentioned in registration and backup;

2. All three include a "backup" function:

Archive data preservation includes at least one complete backup of the managed data; archive data storage may not keep a complete set of backup data, but it keeps at least the evidence-related data and its digital digest (hash value); registration and backup is by definition the registration and backup of electronic files and digital archives;

3. The ownership of the data belongs to the original unit:

Organizations that provide preservation, storage, or registration backup services only play a role similar to a bank safe deposit box: they keep the data safe and guard it against tampering. The service provider itself may not create, delete, modify, query, or otherwise operate on the data without the permission of the original unit.

The Digital Rosetta Project is committed to expressing its views on archives informatization objectively and impartially, as a neutral third party. The truth is becoming clearer and clearer, and we sincerely welcome more people to take up research on the management and preservation of archival digital resources, share their insights, and work together to pass on human civilization!

Origin blog.csdn.net/weixin_56245650/article/details/129611438