Candle Education: how to handle data-loading problems in machine learning when the data is too large?


An instructor at Candle Education notes that when applying machine learning algorithms, several recurring problems arise because the dataset is too large to fit into memory: the algorithm crashes when run on the full data; large data files cannot be loaded; and memory keeps running out. How can these problems be solved?

To address these problems, the Candle Education instructor offers seven recommendations:

1. Allocate more memory
Some machine learning tools and libraries ship with conservative default memory limits. Check whether the limit can be raised manually.

2. Use a smaller sample
Confirm whether you really need to process all the data. Before the final fit, experiment with a random sample of the dataset instead.
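One standard-library way to draw such a sample without ever holding the full dataset in memory is reservoir sampling; the sketch below (function name and parameters are illustrative, not from the original article) keeps a uniform random sample of k items from a stream of unknown length:

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Uniform random sample of k items from an iterable, O(k) memory
    (Algorithm R). Works on streams too large to fit in memory."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)           # fill the reservoir first
        else:
            j = rng.randint(0, i)         # pick a slot in [0, i]
            if j < k:
                sample[j] = item          # replace with decreasing probability
    return sample

# Sample 5 "rows" from a stream of a million without materializing it.
rows = reservoir_sample(range(1_000_000), k=5)
```

In practice `stream` would be a line iterator over a large file, so only the k sampled rows are ever resident in memory.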

3. Use a machine with more memory
You can rent a large-memory server, solving the problem with more computing power at the hardware level.

4. Change the data format
Switching to a different data format, such as a binary format, can speed up loading and reduce memory usage.
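As a minimal illustration of the idea, the standard-library `array` module can pack a column of floats into a compact binary file (8 bytes per value, no text parsing on reload); the file name here is hypothetical:

```python
import array
import os
import tempfile

values = [1.5, 2.25, 3.0]

# Pack the floats as raw 64-bit doubles and write them in one call.
packed = array.array("d", values)
path = os.path.join(tempfile.mkdtemp(), "column.bin")
with open(path, "wb") as f:
    packed.tofile(f)

# Reloading is a single bulk read -- no per-line string parsing.
loaded = array.array("d")
with open(path, "rb") as f:
    loaded.fromfile(f, len(values))
```

Compared with a text format like CSV, a binary layout both shrinks the file and removes the parsing cost that dominates load time on large datasets.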

5. Stream the data or load it progressively
Load the data into memory gradually, reading only as much as is needed at each step.
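A generator that yields small batches is the simplest way to do this; the sketch below (batch size and helper name are illustrative) reads a CSV in fixed-size chunks so memory stays bounded regardless of file size:

```python
import csv
import io

def iter_batches(file_obj, batch_size):
    """Yield lists of at most batch_size CSV rows at a time."""
    reader = csv.reader(file_obj)
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:                     # final, possibly short, batch
        yield batch

# An in-memory stand-in for a large file on disk.
data = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")
batches = list(iter_batches(data, batch_size=2))
```

This pattern pairs naturally with algorithms trained incrementally (e.g. via a `partial_fit`-style interface), since each batch can be consumed and discarded before the next is read.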

6. Use a relational database
A relational database stores the data on disk, lets it be loaded incrementally, and can be queried with a standard language (SQL).
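The standard-library `sqlite3` module shows the pattern; here an in-memory database stands in for an on-disk file (pass a filename instead of `":memory:"`), and the query pulls back only the rows actually needed:

```python
import sqlite3

# ":memory:" keeps the example self-contained; use a filename for real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (feature REAL, label INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(0.1, 0), (0.9, 1), (0.5, 1)],
)

# SQL filters on disk, so only the matching rows reach Python.
cur = conn.execute("SELECT feature FROM samples WHERE label = 1")
positives = [row[0] for row in cur]
```

Iterating over the cursor streams results row by row, so even a query matching millions of rows never requires the whole table in memory at once.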

7. Use a big data platform
For example, the Mahout machine learning library on Hadoop, or Spark with its MLlib library: these platforms are designed specifically for handling very large datasets.

The Candle Education instructor concludes that if you run into problems because a dataset is too large to fit in memory, the seven methods above are good places to look for a solution.


Origin blog.51cto.com/14355900/2401928