Good Programmers shares 10 big data jargon terms

  In the big data job market, supply falls short of demand: talent is scarce and large enterprises are hiring heavily. The IT industry is huge and has many kinds of engineers, so why choose big data? The big data era is on the rise, and students at the forefront of the times should get ready for the future. Today Good Programmers takes stock of 10 big data jargon terms. Beginners, come and take a look!


1. Algorithm. What do algorithms have to do with big data? "Algorithm" is actually a generic term, but big data analysis has made it far more popular and widely used.


2. Analysis. Major credit card companies mail out a year-end statement listing your transactions for the full year. If you go through that list to work out your annual spending and the share taken by each kind of expense, you are doing analysis. What you learn from the data can help you make better-informed spending decisions in the future.


3. Descriptive analysis. If you conclude from a consumer's credit card statement for one year that 25% was spent on food, 35% on clothing, 20% on entertainment, and the rest on other things, that is descriptive analysis.
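
As a rough illustration of the idea above, the sketch below computes each category's share of total spending. The transaction list and the function name `spending_shares` are hypothetical, and the amounts are made up so that the shares match the percentages quoted in the text.

```python
from collections import defaultdict

# Hypothetical transactions as (category, amount) pairs. The amounts are
# invented so the resulting shares match the percentages in the text.
transactions = [
    ("food", 1500.0),
    ("food", 1000.0),
    ("clothing", 3500.0),
    ("entertainment", 2000.0),
    ("other", 2000.0),
]

def spending_shares(rows):
    """Return each category's share of total spending, as a percentage."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    grand_total = sum(totals.values())
    return {category: 100.0 * amount / grand_total
            for category, amount in totals.items()}

shares = spending_shares(transactions)
for category, pct in shares.items():
    print(f"{category}: {pct:.0f}%")
```

The output only summarizes what has already happened, which is what makes the analysis descriptive.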


4. Cloud. We will not repeat here what cloud computing is. In essence, cloud computing means software and/or data that are hosted and run on remote servers and can be accessed from anywhere on the Internet.


5. Cluster computing. This is a fancy term for computing that uses the pooled resources of a "cluster" of multiple servers. Going deeper into the technology, we might also discuss nodes, cluster management, load balancing, and parallel processing.
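
A single-machine analogy may help here. In the sketch below, worker threads stand in for the servers of a cluster: the work is dealt out round-robin (a very simple form of load balancing), each "node" processes its share in parallel, and the partial results are combined. The names `NODE_COUNT`, `partition_round_robin`, and `node_task` are hypothetical, and a real cluster would involve networking and a cluster manager that this sketch leaves out.

```python
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 3  # hypothetical cluster size

def partition_round_robin(items, node_count):
    """Simple load balancing: deal items out to the nodes in turn."""
    return [items[i::node_count] for i in range(node_count)]

def node_task(chunk):
    """Work done by one node: here, sum the squares of its chunk."""
    return sum(x * x for x in chunk)

def cluster_sum_of_squares(items, node_count=NODE_COUNT):
    chunks = partition_round_robin(items, node_count)
    # Each worker thread stands in for one server in the cluster.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(node_task, chunks))
    # Combine the per-node results into the final answer.
    return sum(partial_results)

total = cluster_sum_of_squares(list(range(1, 11)))
print(total)
```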


6. Dark data. This kind of data has a very particular nature. In essence, dark data is data that a company collects and processes but never uses for any meaningful purpose, which is why it is called "dark"; it may stay buried forever. It may be social network feeds, call center logs, meeting notes, and so on. There are many estimates that 60-90% of all enterprise data may be "dark data", but no one really knows.


7. Data lake. A data lake is a large repository of enterprise-wide data in its raw format. Here we also need to mention the data warehouse, because the two are conceptually very similar: a data warehouse is also a repository for enterprise-wide data, the difference being that its data is kept in a structured format after cleaning and integration with other data sources.


  Data warehouses are commonly used for conventional data (but not exclusively). A data lake is said to give users easy access to enterprise-wide data, but users really need to know what they are looking for, how to process it, and how to use it intelligently. Understanding the data lake is a prerequisite for embracing open source technologies: do you know what a data lake (DATALAKE) is?


8. Data mining. Data mining uses sophisticated pattern recognition techniques to find meaningful patterns in large amounts of data and extract insights. It is closely related to the term "analysis" discussed earlier, where we analyzed personal data. To extract meaningful patterns, data miners use statistics (yes, good old math), machine learning algorithms, and artificial intelligence.
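
To make "finding patterns" slightly more concrete, here is a minimal sketch of one very simple technique: counting which pairs of items occur together often. It is only an illustration and not the specific method the text has in mind; the shopping baskets, the function name `frequent_pairs`, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

patterns = frequent_pairs(baskets, min_support=3)
print(patterns)
```

With real data the counting would run over millions of records, which is where the statistical and machine learning methods mentioned above come in.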


9. Distributed file system. Because big data is too large to be stored on a single system, a distributed file system provides a storage system that makes it easy to store large amounts of data across multiple storage devices, and it helps reduce the cost and complexity of storing that much data.
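
The toy sketch below shows the basic idea of spreading one file across several storage devices: the data is cut into fixed-size blocks and each block is placed on more than one node. The node names, `BLOCK_SIZE`, and all function names are hypothetical, and keeping replicated copies is an extra assumption borrowed from how such systems commonly work; the text above does not mention it.

```python
BLOCK_SIZE = 4                          # bytes per block; tiny on purpose
REPLICATION = 2                         # assumed number of copies per block
NODES = ["node-a", "node-b", "node-c"]  # hypothetical storage devices

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Cut the file contents into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(blocks, nodes=NODES, replication=REPLICATION):
    """Return {node: {block_index: block}} with each block on distinct nodes."""
    storage = {node: {} for node in nodes}
    for index, block in enumerate(blocks):
        for copy in range(replication):
            node = nodes[(index + copy) % len(nodes)]
            storage[node][index] = block
    return storage

def read_file(storage, block_count, failed_nodes=()):
    """Reassemble the file from any surviving copy of each block."""
    parts = []
    for index in range(block_count):
        for node, blocks in storage.items():
            if node not in failed_nodes and index in blocks:
                parts.append(blocks[index])
                break
        else:
            raise IOError(f"block {index} is unavailable")
    return b"".join(parts)

data = b"big data is too large for one disk"
blocks = split_into_blocks(data)
storage = place_blocks(blocks)
restored = read_file(storage, len(blocks), failed_nodes={"node-b"})
print(restored == data)
```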


10. ETL. ETL is an acronym for extract, transform, load. It refers to the whole process of "extracting" raw data, "transforming" it through cleaning and modification into data that is "fit for use", and then "loading" it into a suitable repository for the system to use. Although the concept of ETL originated with data warehouses, it also applies to other scenarios, such as ingesting data from external data sources into a big data system.
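
The three steps can be sketched with nothing but the Python standard library. This is a minimal illustration under stated assumptions, not a production pipeline: the raw CSV text, the `payments` table, and the cleaning rules (trim and lowercase names, drop rows with no amount) are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an external source; note the messy values.
RAW_CSV = """name,amount
 alice ,10.5
BOB,
carol,7
"""

def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and lowercase names, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append((row["name"].strip().lower(), float(amount)))
    return cleaned

def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    connection.commit()

connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)
loaded = connection.execute(
    "SELECT name, amount FROM payments ORDER BY name"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions mirrors the acronym and makes it easy to swap the source or the target repository independently.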



Origin blog.51cto.com/14249543/2404273