Small data storage algorithm

Recording a question: there is a 10 GB file with one integer per line. Given 2 GB of usable memory, find the number that appears most frequently.

1. Streaming data processing (I will write this up separately later; leaving a placeholder here ....)

2. Split-file processing

      Read the file in segments and take each number modulo 10. Numbers with the same remainder are written to the same file. Then process the 10 files one at a time and track the number with the largest count.

      I think the above scheme solves the case where the values in the file do not repeat, or repeat only rarely.
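The split-file idea above can be sketched roughly as follows. This is a minimal illustration, not the original author's code: the function names `split_by_remainder` and `most_frequent`, the `bucket_N.txt` file names, and the use of `collections.Counter` are all assumptions made for the example. It relies on the fact that equal numbers always share a remainder, so each bucket can be counted independently.

```python
import os
from collections import Counter

def split_by_remainder(src_path, out_dir, buckets=10):
    """Stream src_path line by line and write each integer to the file for n % buckets."""
    os.makedirs(out_dir, exist_ok=True)
    # Hypothetical naming scheme for the bucket files.
    paths = [os.path.join(out_dir, f"bucket_{i}.txt") for i in range(buckets)]
    files = [open(p, "w") for p in paths]
    try:
        with open(src_path) as src:
            for line in src:
                line = line.strip()
                if line:
                    # Python's % is non-negative for a positive modulus,
                    # so negative integers also map to a valid bucket.
                    files[int(line) % buckets].write(line + "\n")
    finally:
        for f in files:
            f.close()
    return paths

def most_frequent(paths):
    """Count one bucket at a time and keep only the best (value, count) seen so far."""
    best_value, best_count = None, 0
    for path in paths:
        counts = Counter()
        with open(path) as f:
            for line in f:
                counts[int(line)] += 1
        if counts:
            value, count = counts.most_common(1)[0]
            if count > best_count:
                best_value, best_count = value, count
    return best_value, best_count
```

Only one bucket's counts are held in memory at a time, so peak memory depends on the number of distinct values in the largest bucket rather than on the size of the whole file.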

Assume an extreme case: every value in the file has the same remainder, so everything lands in one file, or one of the split files is still larger than 2 GB. The method above does not apply then. What matters more is that identical numbers always end up in the same file.
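One possible way to handle that extreme case, sketched here as an assumption rather than as the author's solution, is to cap the number of distinct values held in memory and split an oversized file again on the next decimal digit. The function name `count_or_resplit` and the `max_distinct` parameter are hypothetical. Identical numbers still always land in the same sub-file, and a file that contains one value repeated many times is counted directly because it only needs a single counter entry.

```python
import os
from collections import Counter

def count_or_resplit(path, max_distinct, level=0, buckets=10):
    """Return (value, count) for the most frequent integer in path.

    If more than max_distinct different values are seen, stop counting and
    split the file on decimal digit number `level`, then recurse. Pass
    level=1 for a file that was already split on n % 10.
    """
    if max_distinct < 1:
        raise ValueError("max_distinct must be at least 1")
    counts = Counter()
    overflow = False
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            counts[int(line)] += 1
            if len(counts) > max_distinct:
                overflow = True
                break
    if not overflow:
        return counts.most_common(1)[0] if counts else (None, 0)

    # Too many distinct values for the memory budget: re-partition on the next digit.
    counts.clear()
    sub_paths = [f"{path}.{i}" for i in range(buckets)]
    subs = [open(p, "w") for p in sub_paths]
    try:
        with open(path) as f:
            for line in f:
                if line.strip():
                    n = int(line)
                    subs[(n // buckets ** level) % buckets].write(f"{n}\n")
    finally:
        for s in subs:
            s.close()

    best = (None, 0)
    for p in sub_paths:
        candidate = count_or_resplit(p, max_distinct, level + 1, buckets)
        if candidate[1] > best[1]:
            best = candidate
        os.remove(p)  # temporary sub-file is no longer needed
    return best
```

In practice `max_distinct` would be chosen from the memory budget, for example by estimating the per-entry cost of the counter. The sketch trades extra disk passes for a hard bound on memory.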

     Different situations call for different solutions. There is no silver bullet.
---------------------
Author: Joe sail
Source: CSDN
Original: https://blog.csdn.net/weixin_40596063/article/details/82895458
Disclaimer: This is an original article by the blogger. When reposting, please include a link to the original post!

Origin www.cnblogs.com/stone531/p/10992547.html