Feature Engineering for Machine Learning


The book "Feature Engineering for Machine Learning", published by O'Reilly Media, Inc. (Chinese edition title: "Proficient in Feature Engineering"), is one of the most valuable resources on feature engineering. This article is based on the translation of the English edition by the well-known open source organization apachecn. The text has been converted to Jupyter notebook format, some code has been added or modified, and all of the code has been tested. The material is well worth recommending.


About the Material

"Feature Engineering for Machine Learning" was translated by the well-known open source organization apachecn. The original English book can be read online with a free 10-day trial at:

https://www.oreilly.com/library/view/feature-engineering-for/9781491953235/

The book is a comprehensive treatment of feature engineering and is well worth recommending.

With apachecn's consent, this site polished the translation and implemented the code: the text was converted to Jupyter notebook format, some code was added or modified, all of the code was tested, and all of the datasets have been made available for download on Baidu Cloud.

The translation and code can be downloaded from the Data-Science-Notes GitHub repository:

https://github.com/fengdu78/Data-Science-Notes/tree/master/9.feature-engineering

Note: The translation used in this article was completed independently and differs from "Proficient in Feature Engineering", published by People's Posts and Telecommunications Press.

File Directory

  • 1. Introduction

  • 2. Fancy tricks with simple numbers

  • 3. Text data: flattening, filtering, and chunking

  • 4. The effects of feature scaling: from bag-of-words to TF-IDF

  • 5. Categorical variables: counting eggs in the age of robotic chickens

  • 6. Dimensionality reduction: compressing the dataset with PCA

  • 7. Nonlinear feature extraction and model stacking

  • 8. Automating the feature extractor: image feature extraction and deep learning

  • 9. Back to the features: putting it all together (updated)

  • Appendix: linear models and linear algebra basics

Brief Introduction

Chapter 2 starts with basic feature engineering for numeric data: filtering, binning, scaling, log transforms and power transforms, and interaction features.
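To make these numeric techniques concrete, here is a minimal sketch using only the Python standard library. The data and the helper names (`fixed_width_bin`, `log_transform`, `min_max_scale`, `interaction`) are hypothetical illustrations, not code from the book or the notebooks.

```python
import math

# Hypothetical toy data: review counts for five businesses (made up).
review_counts = [1, 9, 99, 999, 9999]

def fixed_width_bin(x, width=10):
    """Fixed-width binning: map a count to the index of its bin."""
    return x // width

def log_transform(x):
    """Log transform; log10(1 + x) keeps zero counts defined."""
    return math.log10(1 + x)

def min_max_scale(values):
    """Rescale values linearly into [0, 1] (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def interaction(a, b):
    """Pairwise interaction feature: element-wise product of two features."""
    return [x * y for x, y in zip(a, b)]

binned = [fixed_width_bin(x) for x in review_counts]
logged = [log_transform(x) for x in review_counts]
scaled = min_max_scale(logged)
crossed = interaction(binned, scaled)
```

The log transform compresses the heavy right tail of the counts before scaling, which is the usual reason for applying the two in that order.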

Chapter 3 delves into feature engineering for natural-language text: bag-of-words, n-grams, and phrase detection.
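A bag-of-words or bag-of-n-grams representation can be sketched in a few lines. The tokenizer below is deliberately naive (lowercase, split on whitespace), the sample sentence is made up, and phrase detection is not covered; treat it as an illustration rather than the book's implementation.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_ngrams(text, n=1):
    """Count how often each n-gram occurs; ordering beyond n tokens is lost."""
    return Counter(ngrams(tokenize(text), n))

doc = "it is a puppy and it is extremely cute"
unigrams = bag_of_ngrams(doc, n=1)  # plain bag-of-words
bigrams = bag_of_ngrams(doc, n=2)   # bag of 2-grams
```

Bigrams retain some local word order that unigrams discard, at the cost of a much larger vocabulary.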

Chapter 4 uses tf-idf (term frequency-inverse document frequency) as an example of feature scaling and discusses how it works.
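As a rough illustration of tf-idf as a scaling step, the sketch below multiplies each raw word count by log(N / df), where N is the number of documents and df is the number of documents containing the word. This is one common variant; libraries often add smoothing or normalization. The three documents are made up.

```python
import math
from collections import Counter

# Hypothetical three-document corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: number of documents that contain each word.
df = Counter(word for tokens in tokenized for word in set(tokens))

def tf_idf(tokens):
    """Scale each raw count by log(N / df); no smoothing or normalization."""
    counts = Counter(tokens)
    return {w: c * math.log(n_docs / df[w]) for w, c in counts.items()}

weights = [tf_idf(t) for t in tokenized]
```

A word that appears in every document, such as "the" here, gets weight zero, while a word confined to one document is scaled up relative to more common ones.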

Chapter 5 discusses efficient encoding techniques for categorical variables, including feature hashing and bin counting.
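Both encodings can be sketched with the standard library. The function names, the bucket count, the use of MD5 as the hash, and the click log are all assumptions made for illustration; bin counting is shown here in its simplest form, replacing each category with the observed rate of the positive label.

```python
import hashlib
from collections import defaultdict

def hash_features(categories, n_buckets=8):
    """Feature hashing: count categories into a fixed number of buckets."""
    vec = [0] * n_buckets
    for cat in categories:
        # MD5 is used only because it is deterministic across runs,
        # unlike Python's built-in hash() for strings.
        digest = hashlib.md5(cat.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % n_buckets] += 1
    return vec

def bin_counts(rows):
    """Bin counting: map each category to its observed positive-label rate."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for category, label in rows:
        totals[category] += 1
        positives[category] += label
    return {c: positives[c] / totals[c] for c in totals}

# Hypothetical click log: (advertiser id, clicked?) pairs.
clicks = [("ad_a", 1), ("ad_a", 0), ("ad_b", 0), ("ad_b", 0), ("ad_a", 1)]
hashed = hash_features([c for c, _ in clicks])
rates = bin_counts(clicks)
```

Feature hashing keeps the vector length fixed no matter how many categories exist, accepting occasional collisions; bin counting collapses a category to a single statistic.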

By the time we reach principal component analysis (PCA) in Chapter 6, we are deep in machine learning territory.

Chapter 7 treats k-means as a featurization technique, which illustrates the useful concept of model stacking.
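One way to picture k-means as a feature extractor: fit centroids, then replace each point with a one-hot vector marking its nearest centroid, which a downstream model can consume. The sketch below is a plain Lloyd's algorithm with hand-picked initial centroids and made-up points; it is an assumption-laden illustration, not the book's code.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def k_means(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members; keep it if empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids

def featurize(point, centroids):
    """One-hot encode the index of the closest centroid."""
    onehot = [0] * len(centroids)
    onehot[nearest(point, centroids)] = 1
    return onehot

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
features = [featurize(p, centroids) for p in points]
```

Feeding `features` into a separate classifier is the stacking step: the clustering model's output becomes the next model's input.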

Chapter 8 is all about images, which are much more challenging than text data in terms of feature extraction. We look at two manual feature extraction techniques, SIFT and HOG, before concluding with an explanation of deep learning as the latest feature extraction technique for images.

In Chapter 9, we combine several different techniques in an end-to-end example, building a recommender for a dataset of academic papers.

Summary

This article presents "Feature Engineering for Machine Learning" converted to Jupyter notebook format, with all code tested and available for download.

The repository containing the translation and code:

https://github.com/fengdu78/Data-Science-Notes/tree/master/9.feature-engineering

References

https://www.oreilly.com/library/view/feature-engineering-for/9781491953235/
https://github.com/alicezheng/feature-engineering-book
https://github.com/apachecn/feature-engineering-for-ml-zh

Origin blog.51cto.com/15064630/2578644