Six reasons machine learning fails: have you run into them?

Learning generally means making mistakes and taking wrong turns first, then working out how to avoid those pitfalls in the future. Machine learning is no exception.

When you deploy machine learning in your enterprise, be careful: some technology marketing will tell you that machine learning makes processes faster and better, but that is an unrealistic expectation of the technology. The fact is that a machine learning process will make mistakes, and for a considerable period of time those mistakes will be encoded into business processes. As a result, the errors occur at scale, and usually outside direct human control.

Ray Johnson, chief data scientist at consulting firm SPR, said: "Blind enthusiasm, combined with a lack of pragmatism and hard work, can make the benefits of machine learning almost worthless."

Detecting errors in the machine learning process and dealing with them will help you succeed with the technology and meet your expectations for machine learning.

Here are some of the problems that cause machine learning tools to make mistakes as they learn. These problems can increase the number of errors and prolong the time they persist, and the machine learning tool itself may never be able to identify and correct them.

 


 

A lack of understanding of the business problem causes machine learning to fail

 

Some of the data workers who use machine learning models do not really understand the business problem that machine learning is trying to solve, and this can introduce errors into the process.

Akshay Tandon, vice president and director of strategic analysis at financial services website LendingTree, said that when his team uses a machine learning tool, he encourages them to start from a hypothesis statement. The statement should ask what problem you want to solve and what model you want to build to solve it.

Tandon said that, from a statistical point of view, the machine learning tools available today are very powerful. That makes using them properly a greater responsibility, because these powerful tools, if not used carefully, can lead to bad decisions with far-reaching consequences. If the data analysis team is not careful, the model it ends up with may not fit the specific data the team is trying to learn from. The results deteriorate quickly, he said, and can soon turn into a major incident.

In addition, many business users do not understand that a model's quality starts to decline to some extent from the moment it goes into production, Tandon said. With that in mind, users need to monitor it constantly, as they would a car or any other machine, and watch how it influences decision-making.
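One simple way to picture this kind of monitoring is a rolling accuracy check. The following is only a minimal sketch, not something the article or Tandon prescribes: the class name AccuracyMonitor, the window size, and the tolerance are hypothetical choices, and it assumes that the true outcome of each prediction eventually becomes known.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation.

    Hypothetical helper for illustration; thresholds are assumptions.
    """

    def __init__(self, baseline_accuracy, window_size=100, tolerance=0.05):
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # Store 1 for a correct prediction and 0 for an incorrect one.
        # The deque drops the oldest entry once the window is full.
        self.outcomes = deque(maxlen=window_size)

    def record(self, predicted, actual):
        """Record whether one production prediction matched the real outcome."""
        self.outcomes.append(1 if predicted == actual else 0)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def is_degraded(self):
        """Return True when recent accuracy falls below baseline minus tolerance."""
        # Only judge once the window is full, to avoid noisy early alerts.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance
```

In practice the baseline would be the accuracy measured at validation time, and a degraded signal would prompt a human review or retraining rather than any automatic action.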

 

Poor-quality data leads to machine learning errors

 

Garbage in, garbage out. If data quality is not up to standard, machine learning will suffer. Poor data quality is one of the biggest worries of data managers. No matter how well-intentioned the data scientists and other professionals who work with the information are, poor data quality can jeopardize big data analytics and ruin their efforts. It can easily throw machine learning models into chaos.

Organizations in every industry often overestimate the resilience of machine learning algorithms and underestimate the impact of bad data. Johnson said that poor data quality produces poor results, which lead organizations to make unwise business decisions. The consequences of those decisions damage business performance and make it hard to win support for future plans.

You may discover that data is of low quality from the machine learning results themselves, when those results make no sense in light of past and present experience.

Johnson said that exploratory data analysis (EDA) is a proactive way to address this problem. EDA can identify basic data quality issues such as outliers, missing values, and inconsistent values. You can also use statistical sampling techniques to determine whether there are enough data instances to adequately reflect the overall distribution, and to define data quality remediation rules and policies.
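The three basic checks mentioned here (outliers, missing values, and inconsistent values) can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not Johnson's method: the function names are hypothetical, outliers are defined by the common 1.5 x IQR rule, and "inconsistent" is narrowed to spellings that differ only by case or surrounding whitespace.

```python
import statistics


def missing_count(values):
    """Count entries that are None or an empty/whitespace-only string."""
    return sum(
        1 for v in values
        if v is None or (isinstance(v, str) and not v.strip())
    )


def iqr_outliers(values, k=1.5):
    """Return numeric values outside [Q1 - k*IQR, Q3 + k*IQR].

    Non-numeric entries (such as None) are ignored. The k=1.5 rule is a
    common convention, assumed here for illustration.
    """
    numbers = [v for v in values if isinstance(v, (int, float))]
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in numbers if v < low or v > high]


def inconsistent_labels(values):
    """Group distinct spellings that match after trimming and lowercasing."""
    groups = {}
    for v in values:
        if isinstance(v, str) and v.strip():
            groups.setdefault(v.strip().lower(), set()).add(v)
    # Keep only labels that appear in more than one spelling.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }
```

For a made-up column such as [34, 29, None, 41, 38, 35, 420, 31], missing_count reports one missing entry and iqr_outliers flags 420. Real projects would more likely run equivalent checks in a data analysis library, but the logic is the same.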

 

Using machine learning incorrectly

 

Sally Epstein, a machine learning specialist engineer at consulting firm Cambridge Consultants, said: "The most common problem we still see is companies eager to use machine learning for no other reason than that it is fashionable." But she said the tool must be used properly to succeed. Traditional engineering methods may provide a solution faster and at much lower cost.

Johnson said that using machine learning when it may not be the best option, and when the problem and use case are not fully understood, can lead to solving the wrong problem.

In addition, solving the wrong problem leads to lost opportunities, because the organization struggles to fit its specific use case to a customized but inappropriate model. That includes wasting the personnel and infrastructure resources deployed to achieve a result that a simpler alternative could have delivered.

To avoid misusing machine learning, consider the desired business outcome, the complexity of the problem, the volume of data, and the number of attributes. Johnson said that relatively simple problems, such as classification, clustering, and association rules over a small amount of data with few attributes, can be handled with visual or statistical analysis. In these cases, using machine learning may take more time and resources than necessary.

When the volume of data becomes large, machine learning may be more appropriate. Even so, it is not uncommon to complete a machine learning exercise only to find that the business outcome was never clearly defined, so the wrong problem was solved.

 

Machine learning models may be biased

 

Using a poor-quality data set can lead to misleading conclusions. It introduces not only inaccuracies and missing data but also bias. People certainly can be biased, so models that people create or inspire can contain bias too.

Epstein said that each machine learning algorithm has a different sensitivity to unbalanced class distributions. If you do not address these problems, you may end up with, for example, facial recognition tools whose results depend on skin color, or models with gender bias. In fact, this has already happened many times in commercial services.
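Two quick checks can make this concern concrete: how unevenly the classes are distributed, and whether accuracy differs between groups of people. The sketch below is only an illustration under stated assumptions; the function names are hypothetical, and per-group accuracy is just one of several possible fairness measures, not one the article specifies.

```python
from collections import Counter, defaultdict


def class_shares(labels):
    """Return each class's share of the data set, as a fraction of 1."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def accuracy_by_group(groups, actual, predicted):
    """Return accuracy computed separately for each group.

    groups, actual, and predicted are parallel sequences: one group tag,
    one true label, and one predicted label per example.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for group, y_true, y_pred in zip(groups, actual, predicted):
        seen[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / seen[group] for group in seen}
```

A class that makes up only a small share of the training data, or a group whose accuracy is clearly lower than the others, is a signal to look more closely before the model goes into production.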

The accuracy of a conclusion, whether reached by an algorithm or by a human, depends on the breadth and quality of the information processed. Vic Katyal, head of analytics services at consulting firm Deloitte Consulting, said that the financial, legal, and reputational risks that algorithmic bias poses to organizations and individuals are a prime example of why any company using machine learning should make ethics an organizational requirement.

Katyal said that signs of algorithmic bias have been well documented in the public sphere, in credit scoring, educational curricula, recruitment, and criminal justice decisions. Improper collection, curation, or application of data can introduce bias into even the most sophisticated and well-designed machine learning applications.

He noted that the inherent bias of machine learning systems can put some customer groups or community stakeholders at a disadvantage, and can lead to or perpetuate unfair outcomes.

Consulting firm McKinsey noted in a 2017 report that algorithmic bias is one of the biggest risks of machine learning because it undermines the actual purpose of machine learning. The firm said it is an often-overlooked flaw that can lead to costly mistakes and, if left unchecked, can take projects and organizations in entirely the wrong direction.

McKinsey said that effectively addressing this problem at the outset pays off handsomely and makes it possible to realize the true potential of machine learning.

 

Insufficient resources to do machine learning well

 

When starting a machine learning program, it is easy for an organization to underestimate the resources it needs in terms of personnel and infrastructure. Machine learning can place heavy demands on infrastructure, especially for image, video, and audio processing.

Johnson said that without the required processing power, developing machine learning solutions in a timely manner is difficult at best and fundamentally impossible at worst.

There is also the question of deployment and consumption. What is the use of developing a machine learning solution if the infrastructure needed to deploy it and let users consume its results is not in place?

Deploying scalable infrastructure to support machine learning can be both expensive and difficult to maintain. However, several cloud services provide scalable machine learning platforms that can be provisioned on demand. Johnson said the cloud approach lets machine learning run at scale without the constraints of acquiring, configuring, and deploying physical hardware.

Some organizations want to keep their infrastructure in-house. In that case, cloud services can serve as a stepping stone and a learning experience, so that these organizations understand what machine learning requires from an infrastructure perspective before they invest heavily in it.

On the staffing side, a lack of knowledgeable resources such as data scientists and machine learning engineers can derail the development and deployment of machine learning. It is crucial to have people who understand machine learning concepts and how to apply and interpret them, in order to determine whether a specific business outcome has been achieved.

Johnson said that the importance of having solid machine learning skills cannot be underestimated. Knowledgeable people can help identify data quality issues, ensure that machine learning tools are used and deployed properly, and help establish best practices and governance strategies.

 

Poor planning and a lack of governance will derail machine learning

 

Machine learning efforts may start with enthusiasm, then lose momentum and stall. This is a sign of poor planning and a lack of governance.

Without appropriate guidelines and limits, machine learning work can continue indefinitely, consuming huge resources without delivering any benefit, Johnson said.

Organizations need to keep in mind that machine learning is an iterative process, and models may need to be modified continually over time to support changing needs. As a result, the people doing the machine learning work may lose interest in completing it, which can lead to poor results. The project sponsor may move on to other work, and the machine learning effort ultimately stagnates.

Johnson said that machine learning work needs regular monitoring to make sure things stay on track. If progress starts to slow, it may be time to take a break and re-examine the project.


Origin blog.csdn.net/ShuYunBIGDATA/article/details/91369342