Hadoop learning roadmap

Follow this roadmap to learn Hadoop.
    1. Three articles by M. Tim Jones:
    Distributed Data Processing with Hadoop, Part 1 (Introduction): http://www.ibm.com/developerworks/cn/linux/l-hadoop-1/index.html
    Distributed Data Processing with Hadoop, Part 2 (Advanced): http://www.ibm.com/developerworks/cn/linux/l-hadoop-2/index.html
    Distributed Data Processing with Hadoop, Part 3 (Application Development): http://www.ibm.com/developerworks/cn/linux/l-hadoop-3/index.html

    2. The "Stars in the Galaxy" blog, which includes a Google paper series (among them the groundbreaking paper "MapReduce: Simplified Data Processing on Large Clusters") and introductions to search and distributed systems
    [Google paper 3] MapReduce: Simplified Data Processing on Large Clusters: http://duanple.blog.163.com/blog/static/709717672010923203501/
    A Map/Reduce program for word frequency statistics can be found here: http://blog.csdn.net/shijinupc/article/details/7522446
    Google paper series: http://duanple.blog.163.com/blog/#m=0&t=3&c=google
    Links organized by Hadoop component: http://duanple.blog.163.com/blog/static/7097176720119791920962/
    
    3. Other Hadoop articles on IBM developerWorks. Searching developerWorks (dw) for the keyword "Hadoop" turns up many more articles.
    Here are a few worth reading:
    Introduction to the Hadoop Distributed File System: http://www.ibm.com/developerworks/cn/web/wa-introhdfs/index.html
    Processing data with Apache Pig: http://www.ibm.com/developerworks/cn/bigdata/basic.html
        
    4. The introduction in "The Architecture of Open Source Applications"
    (Volume 1, Chapter 8) HDFS: The Hadoop Distributed File System: http://www.ituring.com.cn/article/4299
    English original: http://www.aosabook.org/en/index.html (Volume 1, Chapter 8)
    
    5. The official blog of Alibaba Group Data Platform, which contains a lot of Hadoop research and application experience
    http://www.alidata.org/archives
    
    6. The official blog of the Baidu Search R&D Department, mainly covering experience with distributed systems (Hadoop), search technology, data mining, and large-scale website architecture
    http://baidu-tech.com/

    7. Dong's blog, with research on Hadoop and distributed systems
    http://dongxicheng.org/recommend/

    8. Of course, the official documentation is indispensable. It mainly covers setting up a Hadoop cluster, using MapReduce, and the HDFS architecture.
    The stable version (read this first): http://hadoop.apache.org/docs/stable/
    The latest version (including an introduction to YARN, the next generation of MapReduce): http://hadoop.apache.org/docs/current/
        
    9. caibinbupt's blog, Hadoop source code analysis series
    http://caibinbupt.iteye.com/?page=6
    shirdrn's column, Hadoop-0.20.0 source code analysis
    http://blog.csdn.net/shirdrn/article/category/595039/3
    
    10. spork's blog, in particular the series on Hadoop
    http://www.cnblogs.com/spork/category/226077.html
    
    11. chinacloud's blog, some experience in Hadoop architecture and distributed system design
    http://www.cnblogs.com/chinacloud/archive/2010/12/03/1895369.html
    
    12. beanmoon's blog, the Hadoop series
    http://www.cnblogs.com/beanmoon/
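
Item 2 in the list above points to a Map/Reduce program for word frequency statistics. For readers who want a feel for the programming model before following that link, here is a minimal sketch in plain Python. It is not the linked program and does not use any Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, chosen only for illustration, and the shuffle step simply imitates in memory the grouping by key that a MapReduce framework performs between the map and reduce stages.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values by key (done in memory here, an assumption
    standing in for the framework's distributed sort and grouping)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["hadoop stores data", "Hadoop processes data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

In a real Hadoop job the map and reduce functions run on different machines and the input comes from HDFS. The official documentation in item 8 describes how to write and submit such a job.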

Reprinted from: http://blog.csdn.net/zhoudaxia/article/details/8801769
