The Path to Becoming a New-Generation CTO, All-in-AI Series: Big Data + AI Driving Technological Change. Organizational Structure and Team Building of the Big Data Department


With the rapid development of big data and artificial intelligence, CTOs of the older generation need more than excellent engineering skills: they also have to keep learning big data and AI. These technologies have become an indispensable part of a company's technical system and a source of its core competitiveness, and they play a pivotal role in driving product innovation, change, and upgrades. A new-generation CTO must master them. In short: engineering ability + big data + AI = a new-generation CTO. Those who do not learn will fall behind.

For Internet companies, technology is the core competitiveness. Deeper modeling and analysis of massive user behavior data can take a product to a higher level: data drives product design, supports scientific decision-making, and guides the product's direction. None of this works without coordination with other departments, nor without close collaboration among the groups and positions inside the big data department itself.

1.2.1 Organizational Structure of the Big Data Department

The big data department can be roughly divided into three groups: the big data platform group, the algorithm group, and the data analysis group. All three are led by the big data VP (vice president of big data), who usually reports to the CTO and, in some companies, directly to the CEO. Each group is generally headed by a director; in some companies an architect, a manager, or a team leader fills that role instead. The three group heads report to the big data VP. The organizational structure is shown in Figure 1.1:

Figure 1.1 Organizational Chart of Big Data Department

Based on Figure 1.1, the sections below describe the division of labor among the groups and the responsibilities of each position.

The big data platform group provides the algorithm group and the data analysis group with the basic data platform, the data warehouse, event tracking and data collection, general-purpose tools, and platform support.

The algorithm group builds on the big data platform to do extensive data mining and analysis and to develop algorithm-driven company products such as personalized recommendation systems, search engines, and user profiles. Its work leans toward engineering applications further upstream.

The data analysis group builds on the big data platform to do data analysis, statistics, mining, data visualization, report development, and similar work. It overlaps with the algorithm group to some extent but leans toward data analysis applications, management decision support, and the discovery of data insights.
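Since the figure itself is not reproduced here, the reporting lines and group dependencies described above can be sketched as a small data structure. This is only an illustration reconstructed from the text: the names `ORG`, `DEPENDS_ON`, `render`, and `dependents_of` are hypothetical, and the tuple layout is just one possible encoding.

```python
# Reporting tree reconstructed from the text above (illustrative encoding).
# Each node is (title, list of direct reports).
ORG = ("Big data VP", [
    ("Director, big data platform group", []),
    ("Director, algorithm group", []),
    ("Director, data analysis group", []),
])

# Which group builds on which, as described in the overview paragraphs.
DEPENDS_ON = {
    "big data platform group": [],
    "algorithm group": ["big data platform group"],
    "data analysis group": ["big data platform group"],
}


def render(node, depth=0):
    """Return the reporting tree as a list of indented lines."""
    title, reports = node
    lines = ["  " * depth + title]
    for child in reports:
        lines.extend(render(child, depth + 1))
    return lines


def dependents_of(group):
    """Return the groups that build directly on the given group."""
    return sorted(g for g, deps in DEPENDS_ON.items() if group in deps)
```

Calling `"\n".join(render(ORG))` gives a text version of the chart, and `dependents_of("big data platform group")` lists the two groups that rely on the platform.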

1. Big Data Platform Group

As outlined above, the big data platform group provides the algorithm group and the data analysis group with the basic data platform, the data warehouse, event tracking and data collection, general-purpose tools, and platform support.

The positions in the group cooperate, each with its own duties, to build the big data platform.

1) Director of Big Data Platform

The general task is department management and the architecture design of the big data platform. The specific tasks are as follows:

(1) Design the big data architecture and review iteration work based on business requirements;

(2) Design models and build the data asset system on top of the big data processing platform;

(3) Participate in data warehouse modeling and ETL architecture design, and in research on difficult big data technical problems;

(4) Handle data approval and data integration for the team's external data cooperation, and promote cooperation and exchange;

(5) Evaluate and select big data technologies, and develop the team's skills;

(6) Take responsibility for applying the core strategies of the company's big data platform, and use machine learning to support business growth;

(7) Write code for the core parts of the system, mentor and train engineers, and continuously optimize the system.

2) Hadoop platform operation and maintenance engineer

The general task is to build, operate, and maintain Hadoop clusters. Large Internet companies often create this dedicated position because their clusters may reach thousands of nodes and are split into production and test clusters. If the cluster is not very large, a separate position is generally unnecessary, and the big data platform engineer takes on the work. The specific work is as follows:

(1) Develop and maintain the big data platform architecture;

(2) Operate, maintain, and manage the Hadoop clusters.

3) Big data platform engineer

The general task covers cluster setup, operation, and maintenance, data warehouse construction, general-purpose tools, and data collection and event tracking services.

The specific work is as follows:

(1) Develop and maintain the big data platform architecture;

(2) Operate, maintain, and manage the Hadoop clusters;

(3) Build the data warehouse;

(4) Implement event tracking, data collection, and data processing;

(5) Build company-wide general-purpose BI tools.

4) Big data ETL engineer

The general task covers ETL data processing, configuring job dependencies, and collecting and processing targeted data.

The specific work is as follows:

(1) Develop ETL data processing and design workflow scheduling;

(2) Handle script deployment and configuration management, workflow exception handling, and the daily management, maintenance, and monitoring of batch runs;

(3) Complete day-to-day tasks such as collecting and crawling targeted data, parsing and processing it, and loading it into storage.
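Configuring job dependencies essentially means making sure each job runs only after the jobs it depends on have finished. A minimal sketch of that idea, using the topological sorter from Python's standard library (Python 3.9+), might look as follows. The job names and the `run_order` helper are hypothetical; a real workflow scheduler adds retries, timing, and monitoring on top.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical ETL jobs; each job maps to the set of jobs it must wait for.
JOB_DEPENDENCIES = {
    "extract_logs": set(),
    "extract_orders": set(),
    "clean_logs": {"extract_logs"},
    "join_orders_logs": {"clean_logs", "extract_orders"},
    "load_warehouse": {"join_orders_logs"},
}


def run_order(dependencies):
    """Return one valid execution order for the jobs.

    Raises graphlib.CycleError (a ValueError subclass) if the
    dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

`run_order(JOB_DEPENDENCIES)` returns a list in which every job appears after all of its prerequisites. Several orders can be valid, so callers should rely only on that relative ordering.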

5) Stream computing engineer

The general task is real-time online data analysis with stream processing frameworks such as Storm and Flink.

The specific work is as follows:

(1) Analyze online user behavior data in real time and identify abnormal users;

(2) Process users' real-time behavior and update databases such as HBase in real time;

(3) Track the progress of mainstream computing technologies in the industry and integrate them into the current business.
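To make the first task concrete, here is a minimal sketch of one simple way to flag abnormal users: count each user's events inside a sliding time window and flag the user once the count exceeds a threshold. This is plain Python rather than Storm or Flink code, and the class name, the window length, and the threshold are all assumptions chosen for illustration. It also assumes that each user's timestamps arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class AbnormalUserDetector:
    """Flags users whose event count in a sliding time window exceeds a threshold."""

    def __init__(self, window_seconds=60, max_events=100):
        self.window_seconds = window_seconds
        self.max_events = max_events
        # user_id -> timestamps of that user's events still inside the window
        self._events = defaultdict(deque)

    def observe(self, user_id, timestamp):
        """Record one event and return True if the user now looks abnormal."""
        window = self._events[user_id]
        window.append(timestamp)
        # Evict timestamps that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.max_events
```

In a real stream job the same logic would typically live inside a keyed, windowed operator of the streaming framework, with the per-user state managed by the framework instead of an in-memory dictionary.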

6) Data warehouse engineer

The general task is data warehouse modeling and data processing.

The specific work is as follows:

(1) Understand the company's existing data, identify where the current data system fits the customers' business poorly, and keep improving it;

(2) Build and improve the data management system, covering the standards, models, quality, and access processes across the data life cycle;

(3) Design the layers of the data warehouse, process the data, and manage and integrate the various kinds of data effectively.

7) Spark engineer

The general task is data processing with Spark.

The specific work is as follows:

(1) Develop both streaming and offline data processing in a unified way;

(2) Process data with Spark and provide data support for algorithm models.

8) Back-end web/front-end engineer

This role is not drawn in the organizational chart, but in practice it is often needed to develop the big data department's back-office management tools and general web tools, such as data warehouse management tools and data quality management tools, as well as some web interface services. Because this is web development, a separate front-end engineer position is usually split off. A dedicated UI designer position is generally not created; the company's central design department can do the UI work.

2. Algorithm Group

The algorithm group builds on the big data platform to do extensive data mining and analysis and to develop algorithm-driven company products such as personalized recommendation systems, search engines, and user profiles. Its work leans toward engineering applications further upstream. The specific job responsibilities are as follows.

1) Director of Algorithms

The general task is to lead the algorithm team and own the algorithm system architecture. The specific tasks are as follows:

(1) Lead the algorithm product and R&D teams, plan the direction of algorithm R&D, and control its overall progress;

(2) Understand product business requirements in depth, and combine algorithms with the business based on those requirements;

(3) Build an excellent algorithm team and raise its technical level to first class;

(4) Take responsibility for applying the recommendation system, search engine, face recognition, dialogue robot, knowledge graph, and other algorithms in products.

2) Recommendation algorithm engineer

The general task is to develop and optimize recommendation algorithms. The specific tasks are as follows:

(1) Research and develop recommendation algorithms, and improve the overall click-through rate and conversion rate of recommendations through algorithm optimization;

(2) Abstract business scenarios into models of user and item information according to the characteristics of each scenario, and design effective recall algorithms; at the same time, continuously optimize the ranking prediction algorithm along the dimensions of samples, features, and models.
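The two metrics and the recall-then-rank flow mentioned above can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production recommender: conversion rate is taken here as conversions divided by clicks (companies define it differently), recall is a simple tag overlap, and the ranking scores are assumed to come from some already-trained click prediction model. All function names are hypothetical.

```python
def click_through_rate(impressions, clicks):
    """Clicks divided by impressions; 0.0 when nothing was shown."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(clicks, conversions):
    """Conversions divided by clicks (one common definition, assumed here)."""
    return conversions / clicks if clicks else 0.0


def recall(user_tags, item_tags, limit=100):
    """Recall stage: keep items that share at least one tag with the user."""
    candidates = [item for item, tags in item_tags.items() if user_tags & tags]
    return candidates[:limit]


def rank(candidates, predicted_ctr, top_k=10):
    """Ranking stage: order candidates by predicted click probability."""
    ordered = sorted(
        candidates,
        key=lambda item: predicted_ctr.get(item, 0.0),
        reverse=True,
    )
    return ordered[:top_k]
```

The split mirrors the description above: recall cheaply narrows a large catalog to a candidate set, and ranking spends more effort ordering only those candidates.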

3) Natural language processing engineer

The general task is the design, development, and optimization of NLP algorithm products. The specific tasks are as follows:

(1) Design, develop, and optimize NLP algorithm products, including keyword extraction, text classification, sentiment analysis, semantic analysis, named entity recognition, text summarization, and intelligent question answering;

(2) Use and improve basic NLP tools, including word segmentation, part-of-speech tagging, named entity recognition, new word discovery, and syntactic and semantic analysis and recognition;

(3) Work on domain intent recognition, entity extraction, semantic slot filling, and similar tasks;

(4) Participate in text intent analysis, including text classification and clustering, spelling correction, entity recognition and disambiguation, central word extraction, and short text understanding.

4) Machine learning engineer

The general task is to turn data analysis, data mining, and artificial intelligence techniques into engineered systems. The specific tasks are as follows:

(1) Propose artificial intelligence solutions and models for product applications;

(2) Turn artificial intelligence techniques into production-grade engineering;

(3) Research and implement intent recognition, intelligent search, and personalized recommendation algorithms for dialogue scenarios.

5) Data mining engineer

The general task is data modeling and analysis. The specific tasks are as follows:

(1) Carry out the data mining work within the data analysis of the product business;

(2) Based on the analysis and diagnosis results, build and optimize mathematical models, write reports, and provide data support for operational decisions, product direction, sales strategy, and so on.

6) Deep learning engineer

The general task is the research and application of deep learning algorithms. The specific work is as follows:

(1) Research and implement deep learning algorithms;

(2) Implement algorithms efficiently on a variety of platforms and frameworks, and keep optimizing the implementations of algorithms and models based on an understanding of the internal mechanisms of those platforms and frameworks;

(3) Optimize deep learning networks and their applications on mobile phones;

(4) Research and apply deep learning algorithms for image classification, object detection, tracking, semantic segmentation, and similar tasks;

(5) Integrate the algorithms with the product.

7) Spark engineer

The general task is similar to Spark development in the big data platform group, and the position can be shared between the two groups. Here, however, it focuses more on data processing and support for algorithm developers.

8) Back-end web/front-end engineer

This role is not drawn in the organizational chart either. In practice, the algorithm group also needs many back-office management tools, such as a recommendation management platform, a search management console, an algorithm A/B testing platform, and visualizations of optimization data. It also has to provide business interfaces to other departments, such as the recommendation engine web service and the search service.

3. Data Analysis Group

The data analysis group builds on the big data platform to do data analysis, statistics, mining, data visualization, report development, and similar work. It overlaps with the algorithm group to some extent but leans toward data analysis applications, management decision support, and the discovery of data insights. The positions are as follows:

1) Director of Data Analysis

The general task covers managing the data analysis group, researching business needs, managing and executing data projects, and providing industry reports. The specific tasks are as follows:

(1) Write reports based on insights from massive data, support marketing and operations decisions, promptly discover and analyze real business problems, and give targeted optimization suggestions;

(2) Participate in business needs research, design big data solutions that fit the needs and the characteristics of the industry, and follow up on the implementation of specific projects;

(3) Design and implement systematic support for BI analysis, data product development, and algorithm development, so that data mining models can be built and engineered;

(4) Manage and execute data projects, meet customer requirements and goals, and meet the KPI targets;

(5) Stay familiar with industry developments, master the latest data analysis techniques, and provide industry reports on a regular basis.

2) User profile engineer

The general tasks are user data analysis, user profile modeling, and user tag extraction. The specific tasks are as follows:

(1) Build and optimize user profiles from massive user behavior data, and generate user tags that improve recommendation and search results and provide data support for operations;

(2) Build a complete user profile mining system, including data processing, profile mining, and accuracy evaluation;

(3) Lead the analysis of user profile requirements, steer the direction of profile construction, and design and build platform-level profile services based on user behavior characteristics;

(4) Unify data standards, and establish an evaluation mechanism and a monitoring system for user profile products.
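As a rough illustration of generating user tags from behavior data, the sketch below tags a user with every content category that makes up at least a given share of that user's events. The event format, the `min_share` threshold, and the function name are assumptions made for this example; real profile systems combine many more signals and evaluate tag accuracy separately.

```python
from collections import Counter


def build_user_tags(events, min_share=0.3):
    """Derive interest tags from behavior events.

    events: iterable of (user_id, category) pairs, one per user action.
    A category becomes a tag when it accounts for at least `min_share`
    of that user's events. Returns {user_id: sorted list of tags}.
    """
    per_user = {}
    for user_id, category in events:
        per_user.setdefault(user_id, Counter())[category] += 1

    tags = {}
    for user_id, counts in per_user.items():
        total = sum(counts.values())
        tags[user_id] = sorted(
            category for category, n in counts.items() if n / total >= min_share
        )
    return tags
```

Tags produced this way can then be stored in a profile table and read by the recommendation and search services.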

3) Data Analyst

The general tasks are data analysis and modeling, data visualization, and providing industry reports. The specific tasks are as follows:

(1) Collect, process, analyze, and visualize business data;

(2) Analyze, mine, and model data from multiple sources, and deliver useful analysis reports;

(3) Discover new market trends and different customer application scenarios through data analysis, and provide decision support.

4) Report development engineer

The general tasks are business data analysis, report development, and data visualization. The specific tasks are as follows:

(1) Clean, analyze, monitor, and evaluate data according to the needs of each business department, produce analysis reports, and make effective recommendations for business activities;

(2) Monitor and optimize visualization tools such as Tableau, and manage their permissions and performance, so that data analysts and report users can keep using and extending them;

(3) Organize the data according to the analysis, usage, and performance requirements of data analysts and report users, help optimize data structures, enrich the database content, improve data quality, and improve the data management system.

5) Data Product Manager

The data product manager is a position that has emerged in the past few years. It requires an understanding of data analysis and of algorithms, and it is usually filled by people who move over from traditional product manager roles. The general task is the planning and design of data products and the analysis, design, and implementation of business data requirements. The specific tasks are as follows:

(1) Plan and design data products, and analyze, design, and implement business data requirements;

(2) Coordinate data sources and data development engineers, and use process-driven, standardized approaches to make data integration flexible, efficient, and accurate;

(3) Understand the business in depth and coordinate the data development team to deliver the requirements.

4. A Finer-Grained Division of the Big Data Department

The sections above introduced each group and its positions. This structure is fairly common and generally suits a big data department of about 20 to 50 people. With more people, say more than 50, the department can be divided more finely. For example, recommendation and search are core teams at Internet companies, so it makes sense to split them out of the algorithm group as a recommendation system group and a search group. The user profile team is also very important and can be split out of the data analysis group. Web development, front-end, and back-end interface engineering positions can likewise be pulled out of each group to form a separate engineering group. The big data department is then divided into the following groups:

1) Big data platform group

2) Algorithm group

3) Recommendation system group

4) Search group

5) User profile group

6) Data Analysis Group

7) Engineering group

How is the work divided among these groups? Based on experience, it can be summarized as follows:

(1) The big data platform group is the foundation; it provides the data that all the other groups use.

(2) The recommendation system group is often independent of the algorithm group, but the two can also be a single group, depending on headcount.

(3) Recommendation systems generally rely on search technology, so many Internet companies combine search and recommendation into one group, often separated from the big data department and set up as a search and recommendation group alongside it. Personal opinion: if the head of the big data department has search and recommendation experience, it is better to place search and recommendation under the big data department, because the product will turn out better. After all, search and recommendation are the most classic applications of big data.

(4) The user profile group relies on the big data platform group and can build a separate user profile data mart. The search, recommendation, and data analysis groups also need user profile data.

(5) The engineering group can be embedded in the other groups or set up as a separate group. Its most important job is to provide web services to other parts of the company, such as the front-end website and the apps. Examples include the event tracking collection interface, the user profile interface, the search interface, the recommendation interface, and other data interfaces.

Summary

This article has a companion video. The companion book is "Distributed Machine Learning in Action" (Artificial Intelligence Science and Technology Series), edited by Chen Jinglei and published by Tsinghua University Press. The book explains distributed machine learning frameworks and their applications step by step, with hands-on projects including a personalized recommendation algorithm system, face recognition, and dialogue robots.

Origin blog.csdn.net/weixin_52610848/article/details/111994389