How can ordinary programmers move into big data roles, where talent is currently in short supply?

For programmers, technology advances much faster than most people imagine; if you do not keep up with the times, you will fall behind.

In fact, many people have told me the same thing, though each uses different vocabulary: big data, data mining, machine learning, artificial intelligence... These currently popular concepts differ from one another and also overlap. In short, they all concern technologies for managing vast amounts of data and extracting valuable information from it.

 


Programmers are eager to learn these technologies, and questions such as "How can an ordinary programmer move closer to artificial intelligence?" attract a great deal of attention. We also see in the recruitment market that a growing number of technical candidates, when they think about leaving their jobs, consider whether to move into positions related to this work.

Data from the 100offer platform also shows that data-related positions account for a growing share of interview invitations.

 

It is no accident that many candidates currently favor big data positions

Faster processors and maturing large-scale data processing technology have made it possible to extract valuable information from big data quickly. When neural network algorithms were first proposed decades ago, limited computing power made it hard for such computationally intensive algorithms to play their due role. Now, machine learning models can be trained on PB-scale data in a short time. This lets companies such as DeepGlint and iFLYTEK, whose image and speech recognition depend heavily on deep learning, iterate on their products rapidly.

With the rapid development of the Internet industry, many companies now hold data on tens of millions of users, and each wants to mine this rich vein of gold and extend the value of its data to different scenarios in its own business. E-commerce sites such as JD.com and Taobao use user profiles to make personalized recommendations; Internet finance companies such as PayPal and CreditEase carry out risk control by identifying the features of high-risk behavior; ride-hailing and delivery companies such as Didi and Dada use transaction data for real-time pricing to maximize profit...

Some companies use big data technology to create new business models. Toutiao and Yidian Zixun deliver content through personalized recommendation algorithms; TalkingData integrates massive amounts of data to provide monitoring services and monetize data; and providers of underlying infrastructure such as Alibaba Cloud and UCloud have launched hosted clusters, machine learning platforms, and other services.

Overall, these companies have a very large demand for big data and data mining talent, and supply across the industry is comparatively short. As a result, pay is often relatively high.

 

In addition, compared with traditional software engineering, these positions offer more room for difficult challenges, which naturally attracts more people into the field.

Recently, to understand how engineers are being recruited for big data roles, we visited several companies with large and urgent needs for data-related talent and talked with their technical leaders about the state of recruitment.

For engineers, which big data positions are worth considering?

From a recruiting point of view, engineers whose core work is big data usually fall into two categories:

Big Data Platform / Development Engineer

Their focus is to collect, store, manage and process data.

They usually lean toward developing and maintaining the underlying infrastructure. These engineers need a clear understanding of the Hadoop / Spark ecosystem and of how to develop and maintain distributed clusters. They should be familiar with building NoSQL stores, understand ETL and data warehouses, and may also be involved in building machine learning platforms and similar systems.

Some big data development engineers may instead work mainly at the application layer, implementing models that algorithm engineers have trained inside application-layer logic. Some companies place such engineers in the software development team rather than the big data team.

Algorithm & Data Mining Engineers

These engineers focus on mining value from data.

They usually use algorithms and machine learning to mine valuable information from massive data or to solve business problems. Although the skill sets are similar, different teams face different business scenarios, so the skills required of algorithm and data mining engineers have different emphases. This category can therefore be subdivided into two subclasses:

Algorithm engineer

The problems these teams face are clear and usually difficult, such as face recognition or intercepting risky online payments. These problems have been clearly defined and highly abstracted, yet remain difficult in themselves. Engineers need to concentrate on researching the problem and understand the relevant algorithms deeply enough to tune the model to its limit and solve it. These engineers generally hold the title "algorithm engineer."

Data mining engineer

Some teams face challenges that are not limited to one specific problem; the challenge is rather how to turn complex business logic into algorithm and model problems and use massive data to solve them. Engineers working on such problems do not need to explore algorithms as deeply, but they need breadth and cross-disciplinary skills. They need to understand the common machine learning algorithms and know the strengths and weaknesses of each. They also need to grasp the business quickly, know where data comes from, how it is processed, and where it goes, and be highly sensitive to data. Most such engineers hold the title "data mining engineer."

What do technical leaders look for in candidates, and where are the opportunities to transfer?

Every technical leader wants a team of tigers. They hope every engineer on the team is an all-rounder who can work independently.

Basic qualities such as logic and English are a must; intelligence and learning ability guarantee room to grow; a solid computer science foundation is required; ideally the candidate has developed and tuned large-scale clusters, knows data processing, and is familiar with common algorithms for clustering, classification, recommendation, NLP, and neural networks; and it is even better if they have previously implemented and optimized data applications on top of all that...

Well, that is the perfect big data candidate in a technical leader's mind.

However, if teams recruited only to that perfect standard, few could hire anyone. Big data and data mining have been popular for only a few years, so finding an all-rounder with many years of experience is extraordinarily difficult. Technical leaders understand this clearly.

However, the difficulty of hiring all-rounders does not mean leaders will lower their requirements. They will not tolerate a hire who weakens the whole team's effectiveness. Faced with recruiting difficulties, they take some corresponding measures:

Perfection is not required, but each team member needs distinct strengths so that the team forms a complementary whole

As just mentioned, it is very hard to find someone who is strong on every dimension for a big data position. Technical leaders are therefore more pragmatic and recruit "the more suitable person," bringing in people with different strengths for different positions.

Take DeepGlint, a big data company in the computer vision field, as an example. The team needs algorithm specialists who have researched image recognition thoroughly and can tune models to the extreme. It also needs strong engineers who can implement trained models with high performance in products, or help the team build a platform covering video and image data collection, labeling, machine learning, automated testing, and productization.

The former engineer needs deep knowledge of deep learning algorithms, and perhaps in-depth research experience in computer vision, while their programming may be somewhat weaker. The latter engineer, given strong engineering ability, can take over the corresponding work quickly even without in-depth research on deep learning algorithms. The two need to cooperate closely and jointly drive the delivery and optimization of the company's products.

Even among engineers within an algorithm team, different members may emphasize different skills.

Take the algorithm team at Yidian Zixun, a personalized content recommendation platform. Some of its core algorithm engineers focus on research and solve very specific problems (such as how to classify articles through semantic analysis, or how to identify clickbait headlines), which they must understand in depth. Other engineers focus on applying algorithm models in the product. They need strong business sense and analytical ability so that they can untangle complex business problems, abstract them into algorithmic problems, and solve them with an appropriate model. One group emphasizes core algorithm research and the other business analysis and implementation; they complement each other and together improve the personalized recommendation experience.

For the latter group, the requirement for core algorithm ability is lower than for the former, and coding ability and business sense matter more. The team can therefore accept people from more varied backgrounds, such as engineers who have taught themselves common algorithm knowledge, or new graduates who gained some understanding of algorithms in graduate school.

Employers are fairly open about the experience and background of big data candidates, which gives candidates from unrelated backgrounds a chance to join big data and algorithm teams. At this point it is important to work out what value your existing skills offer the new team, because that is the key to persuading the team to take you on.

Song Xiang now works at the cloud computing provider UCloud and has spent the past four or five years studying low-level computer systems. At Baidu, he supported deep learning algorithms by optimizing hardware and underlying systems to speed up machine learning. When he first joined UCloud, his main research direction was how to use GPUs to accelerate server computing.

Later, as more and more companies came to rely on machine learning and data mining, UCloud planned to launch a PaaS compatible with mainstream open source machine learning systems, so that engineers using the platform could focus on training the model itself without worrying about model deployment, system performance, scalability, computing resources, and similar issues.

Song Xiang's expertise in optimizing underlying systems was exactly what this work needed, so he was immediately put in charge of building the platform.

Making algorithms run fast enough on the machine shortens model iteration time and accelerates model optimization. Algorithm engineers most likely know little about this, but Song Xiang can play to his strengths, using the underlying hardware and systems to accelerate machine learning algorithms.

When the training data is especially large, for example tens of terabytes or even petabytes, I/O or the network may become the bottleneck in a distributed system. Systems engineers then need to step in: to optimize data transfer so that I/O utilization rises; to decide how to store the data, whether in HDFS, a key-value store, or some other storage that lets the computation fetch data quickly; and to decide whether to use disk, SSD, or in-memory storage. Throughout, they also need to balance cost against efficiency.
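To make the I/O bottleneck concrete, here is a rough back-of-envelope sketch in Python. The per-node throughput figures are hypothetical round numbers chosen only for illustration, not measurements from UCloud or any vendor, and the function names are invented for this example. It estimates how long one full pass over the training data takes on different storage media.

```python
# Hypothetical per-node sequential read throughput in MB/s (assumed values).
ASSUMED_THROUGHPUT_MB_S = {"hdd": 150.0, "ssd": 500.0, "memory": 10_000.0}


def read_hours(data_tb: float, per_node_mb_s: float, nodes: int) -> float:
    """Hours needed to read data_tb terabytes once, spread evenly over nodes."""
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    seconds = total_mb / (per_node_mb_s * nodes)
    return seconds / 3600


def compare_storage(data_tb: float, nodes: int) -> dict:
    """Estimated hours per full pass over the data for each assumed medium."""
    return {
        name: round(read_hours(data_tb, mb_s, nodes), 2)
        for name, mb_s in ASSUMED_THROUGHPUT_MB_S.items()
    }


if __name__ == "__main__":
    # 50 TB of training data read by 20 nodes in parallel.
    print(compare_storage(50, 20))
```

Under these assumed numbers, a single pass over 50 TB on 20 nodes takes hours on spinning disks and minutes from memory, which is why the choice of storage medium has to be weighed against its cost.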

Systems engineers can also help design a system that lets algorithm engineers submit tasks quickly, or easily train several models at once and try many parameter settings.

Systems engineers are also good at splitting work that was originally serial into parallel work. For example, data preprocessing can run concurrently with the deep learning computation.

Besides his in-depth understanding of the underlying system, Song Xiang now also understands machine learning algorithms. He leads a small team of two systems engineers and two algorithm engineers, and encourages the two kinds of engineers to learn from each other and improve together, so as to maximize the efficiency of the whole team. A systems engineer who does not understand the algorithms may not know how to make them run more efficiently; an algorithm engineer should likewise have a rough idea of how fast different models run on CPU and GPU machines, which helps in designing more efficient algorithms.

For ordinary engineers hoping to move into big data work, once they enter a new team through the skills they are already good at, they have more opportunities for lateral development, which helps them become more competitive in big data fields.
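The idea of overlapping preprocessing with computation can be sketched with a small producer/consumer pipeline. This is a minimal illustration in plain Python, not the design of any platform described in the article; `preprocess` and `train_step` are hypothetical stand-ins for the real work.

```python
import queue
import threading


def preprocess(batch):
    # Stand-in for decoding or augmenting raw data (hypothetical).
    return [x * 2 for x in batch]


def train_step(batch):
    # Stand-in for the compute-heavy model update (hypothetical).
    return sum(batch)


def run_pipelined(batches, prefetch=2):
    """Preprocess batches in a background thread while the caller trains."""
    ready = queue.Queue(maxsize=prefetch)  # bounded, so memory stays limited
    done = object()  # sentinel marking the end of the input

    def producer():
        for batch in batches:
            ready.put(preprocess(batch))  # blocks when the queue is full
        ready.put(done)

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        item = ready.get()
        if item is done:
            break
        results.append(train_step(item))
    return results


if __name__ == "__main__":
    print(run_pipelined([[1, 2], [3, 4]]))
```

With a single producer the batches reach the training loop in their original order. In CPython, threads overlap usefully only when the work releases the global interpreter lock, as file I/O and most native numerical code do; for pure-Python preprocessing a process pool would be the more likely choice.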

A solid foundation and growth potential matter more than demanding current skills

Whatever the role, employers want people of strong overall quality rather than focusing narrowly on current skill level. This is especially true in big data, where supply falls well short of demand and people with real accomplishments in big data and algorithms are, after all, a minority. Engineers with good fundamentals and great potential are very popular with companies. They can use their existing engineering strength to handle part of the basic work and, after one to two years of training, take over more complex problems.

We can abstract the capability model of a big data engineer into the following pyramid of core skills:

 

The closer a quality sits to the bottom of the pyramid, the more important it is to the company. The bottom layer, basic qualities, represents future growth potential. The Internet is developing rapidly and every company is racing ahead; an engineer whose current skills are good but whose room for future growth is limited may become a burden on the company.

One level up are computer fundamentals: basic algorithms and data structures and proficiency in a programming language, capabilities that matter for almost every engineering job. A programmer with a shaky foundation may make a company doubt their ability to learn. A solid foundation removes obstacles to learning practical skills and makes deep understanding easier; a mathematical foundation is also very helpful for understanding algorithms.

These bottom two layers make up an engineer's basic qualities. If that foundation is solid, mastering the skills required at the application layer may take less time than we expect.

Deng Yafeng, vice president of technology at DeepGlint, said:

For computer vision algorithm engineers, we certainly hope to recruit candidates whose skills are perfect at both the foundational level and the application level.

But if your algorithms and data structures are stronger and you understand C++ better, you may learn the application layer much faster than others. For example, you might spend 1-2 years studying deep learning, and six months to a year on image domain knowledge, to reach a basic understanding.

In fact, now that computer vision relies more on deep learning, the domain knowledge threshold that feature selection used to impose has come down. I have seen many people with a good foundation, including a number of new graduates, achieve good results after six months to a year of working in the image field.

On recruiting big data engineers, TalkingData's VP of technology Yan Zhitao and chief data scientist Zhang Xiatian also said:

Big data engineering work at TalkingData depends heavily on Spark skills, but understanding Spark itself is not that difficult, so a candidate's Spark skills are not what attracts me most.

Rather than people who know Spark well, I prefer to recruit people who have learned Java well. Spark's interfaces have a relatively gentle learning curve, but becoming proficient in Java is very difficult.

If you have learned Java or C++ thoroughly, your understanding of computer technology is on a different level. This is really a question of principle (dao) versus technique (shu).

The two TalkingData leaders also gave me an example from their own team:

They once hired an engineer who had graduated from university in 2014. He had done a little recommendation algorithm work at his previous company and could write Hadoop MapReduce jobs, but had not studied big data in depth. At the time his big data skills did not meet TalkingData's hiring standard, but he thought clearly and had his own views on problems. Together with a good Java foundation and solid work at his previous company, that got him hired.

Here the two leaders admitted: "Fortunately we were not very picky about resumes back then; by our later standards we might not have hired this engineer."

Unexpectedly, this engineer turned out to be highly self-driven: the leader only had to set a direction, and he would push himself to learn what was needed and complete the goal quickly. Two years later his Spark skills were very strong; in the leader's words, he is "worth ten people." He also developed a keen interest in big data and machine learning, so after laying a solid Spark foundation he moved to the algorithm team and wrote the core code of TalkingData's machine learning platform, which greatly improved the efficiency of the machine learning team.

The example above also offers a bonus insight: compared with changing roles by changing companies, an internal transfer is easier. Inside the company, the employer has had enough time to assess an engineer's ability and potential. Once the company's trust in an engineer has grown, it is more comfortable handing them new challenges.

Zhao Ping is an engineer at CreditEase's technology R&D center. Before joining CreditEase, he helped convert the back-end architecture of China Mobile's set-top box business to services. He joined CreditEase out of a strong interest in underlying platform architecture. His first project at the company was to design and develop a distributed storage system. After that project finished successfully, his learning ability and fundamentals were widely praised. When CreditEase began building a big data platform team, Zhao Ping saw his ideal career direction and applied to transfer; on the strength of his past performance, he got the position.

After the transfer, Zhao Ping also met some challenges. One was knowledge: big data work uses a much richer set of tools, such as Spark, Scala, HBase, and MongoDB, and there were countless skills to cram while using them on the job. Another was the way of thinking, which had to shift from the original scheduled (batch) processing of data to the real-time stream processing paradigm represented by Spark. But with his solid foundation, his earlier experience building a distributed storage system, and a team atmosphere of strong technical mutual help, he made a smooth transition and ultimately completed the development of a big data project.
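The shift in thinking from scheduled batch processing to stream processing can be illustrated with a toy example. This is plain Python rather than the Spark API, and the event format and function names are assumptions made only for this sketch: the batch version computes one answer over the whole stored data set, while the streaming version keeps running state and emits an updated result as each event arrives.

```python
from collections import defaultdict


def batch_counts(events):
    """Batch style: scan the full stored data set, then return one result."""
    counts = defaultdict(int)
    for user, _payload in events:
        counts[user] += 1
    return dict(counts)


def stream_counts(event_iter):
    """Streaming style: update state per event and emit the new count at once."""
    counts = defaultdict(int)
    for user, _payload in event_iter:
        counts[user] += 1
        yield user, counts[user]


if __name__ == "__main__":
    events = [("a", 1), ("b", 2), ("a", 3)]
    print(batch_counts(events))         # one answer after the whole run
    print(list(stream_counts(events)))  # an updated answer after every event
```

In the streaming version the input never has to be complete: results are available as soon as each event is seen, at the price of having to manage state that lives across events.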

Some practical suggestions for ordinary engineers who want to move into big data work

To close, based on the cases mentioned in this article, here are a few tips to help ordinary engineers move into big data positions:

Value fundamentals. Whatever the job, fundamentals are the cornerstone of growth.

Use your expertise. Start from a position where your existing expertise can be put to use, so that the new team welcomes you. Work such as engineering algorithm models into products, business-focused data mining, big data platform development, and machine learning system development is easier for an average engineer to get started with. Moving directly into a research-oriented algorithm engineer role is more difficult for an ordinary engineer.

Prepare well. Study the relevant knowledge in advance; hands-on practice is even better. Without any preparation, how will employers believe you are really interested in this field?

Consider transferring within your company. Changing roles inside the same company meets less resistance. You can also consider joining a company that values big data and then transferring.

Finally, if you do have a strong interest in big data and data mining, the best approach is to start practicing right away. You may not make it your career, but an extra skill never hurts.

For future programmers, these skills may become as commonplace as MS Office is for office workers today.

 


Origin blog.csdn.net/huasdsadsa/article/details/94210096