feature_column related interface

In TensorFlow, a feature column is the interface between the raw data and the Estimator: it tells the Estimator how to use the data.

The original data set contains a variety of features, some of which are numeric, such as age, length, and speed; some are text, such as addresses, email content, database query statements, etc.

A neural network can only accept numeric input, and that input must already be organized into a consistent form.

Therefore, a bridge is needed between the raw data and the input requirements of the neural network, and this bridge is the feature column.

  • Use feature columns to convert categorical features into one-hot encoded features, turn continuous features into bucketized features, generate crossed features from multiple features, and so on.

  • To create feature columns, call functions from the tf.feature_column module. The nine functions commonly used in this module are shown in the figure below. All nine return either a Categorical-Column or a Dense-Column object, except bucketized_column, which inherits from both classes.

  • Note: all Categorical Column types must be converted into Dense Column types with indicator_column before they can be passed into a DNN model!
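To make the three transformations in the list above concrete, here is a minimal, library-free sketch of what one-hot encoding, bucketizing, and crossing compute. It does not call TensorFlow. The function names are made up for illustration, and the MD5 hash is only a stand-in: TensorFlow uses its own fingerprint function, so real crossed IDs will differ.

```python
import hashlib
from bisect import bisect_right


def one_hot(value, vocabulary):
    """Return a one-hot list for value; all zeros if value is not in vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def bucketize(value, boundaries):
    """Map a number to a bucket index; there are len(boundaries) + 1 buckets."""
    # A value equal to a boundary falls into the bucket on its right,
    # so bucket i covers [boundaries[i - 1], boundaries[i]).
    return bisect_right(boundaries, value)


def cross(values, num_buckets):
    """Combine several categorical values into one ID in [0, num_buckets)."""
    # Stand-in hash (assumption): not the hash TensorFlow actually uses.
    key = "_X_".join(str(v) for v in values)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


pet_vector = one_hot("dog", ["cat", "dog", "rabbit"])
age_bucket = bucketize(35, [18, 30, 50])
crossed_id = cross(("dog", age_bucket), num_buckets=10)
print(pet_vector, age_bucket, crossed_id)
```

The bucket index can itself be one-hot encoded, and a crossed ID is just another categorical ID, so it still needs an indicator-style or embedding-style encoding before a DNN can consume it.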

To create a feature column, call a function of the tf.feature_column module. TensorFlow V1.8 has eight types of feature columns, and the corresponding nine functions are:

1. Numeric column ( tf.feature_column.numeric_column )

2. Bucketized column ( tf.feature_column.bucketized_column )

3. Categorical identity column ( tf.feature_column.categorical_column_with_identity )

4. Categorical vocabulary column ( tf.feature_column.categorical_column_with_vocabulary_list  or  tf.feature_column.categorical_column_with_vocabulary_file )

5. Hashed column ( tf.feature_column.categorical_column_with_hash_bucket )

6. Crossed column ( tf.feature_column.crossed_column )

7. Indicator column ( tf.feature_column.indicator_column )

8. Embedding column ( tf.feature_column.embedding_column )

Notes on passing feature columns to an Estimator:

As the list below shows, not all Estimators support all types of feature_columns parameters:

1. LinearClassifier  and  LinearRegressor : accept all types of feature columns.

2. DNNClassifier  and  DNNRegressor : only accept dense columns. Columns of other types must be wrapped in indicator_column or embedding_column.

3. DNNLinearCombinedClassifier  and  DNNLinearCombinedRegressor :

       a). The linear_feature_columns parameter accepts any feature column type.

       b). The dnn_feature_columns parameter only accepts dense columns.
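The rules above can be written down as a small lookup, which is handy as a checklist before wiring columns into a model. This is only a sketch of the rules as stated in this article, not TensorFlow code: the labels 'linear' and 'dnn' and the function name accepts are hypothetical, and the column kinds are simply the names of the tf.feature_column functions. Note that bucketized_column sits in both groups, since it inherits from both classes.

```python
# Column kinds, named after the tf.feature_column functions listed in this article.
DENSE = {
    "numeric_column",
    "bucketized_column",  # belongs to both groups
    "indicator_column",
    "embedding_column",
}
CATEGORICAL = {
    "bucketized_column",  # belongs to both groups
    "categorical_column_with_identity",
    "categorical_column_with_vocabulary_list",
    "categorical_column_with_vocabulary_file",
    "categorical_column_with_hash_bucket",
    "crossed_column",
}


def accepts(estimator_kind, column_kind):
    """Return True if a column of this kind can be passed directly.

    estimator_kind is a hypothetical label: 'linear' stands for the linear
    estimators and linear_feature_columns, 'dnn' stands for the DNN
    estimators and dnn_feature_columns.
    """
    if column_kind not in DENSE | CATEGORICAL:
        raise ValueError("unknown column kind: %s" % column_kind)
    if estimator_kind == "linear":
        return True
    if estimator_kind == "dnn":
        return column_kind in DENSE
    raise ValueError("unknown estimator kind: %s" % estimator_kind)


print(accepts("dnn", "crossed_column"))    # must be wrapped first
print(accepts("dnn", "indicator_column"))  # already dense
```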

https://www.tensorflow.org/api_docs/python/tf/feature_column/sequence_categorical_column_with_vocabulary_file?hl=en

categorical_column_with_vocabulary_list
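import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.feature_column.input_layer)

# Raw input: one string feature named 'pets' with five examples.
pets = {'pets': ['rabbit', 'pig', 'dog', 'mouse', 'cat']}

# Map each string to an integer ID via the vocabulary. Values outside the
# vocabulary ('mouse' here) are hashed into one of 3 out-of-vocabulary buckets.
column = tf.feature_column.categorical_column_with_vocabulary_list(
    key='pets',
    vocabulary_list=['cat', 'dog', 'rabbit', 'pig'],
    dtype=tf.string,
    default_value=-1,
    num_oov_buckets=3)

# Wrap the categorical column in an indicator column to get a dense encoding.
indicator = tf.feature_column.indicator_column(column)
tensor = tf.feature_column.input_layer(pets, [indicator])

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    # The vocabulary lookup table must be initialized before it is used.
    session.run(tf.tables_initializer())
    print(session.run([tensor]))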
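For readers without a TensorFlow 1.x environment, the following library-free sketch mimics what the example above computes: a vocabulary lookup with 3 out-of-vocabulary (OOV) buckets, followed by a one-hot "indicator" encoding. The helper names lookup_id and indicator_rows are made up for illustration, and the MD5 hash is an assumption standing in for TensorFlow's own hash, so the exact OOV bucket chosen for 'mouse' may differ from the real output.

```python
import hashlib

VOCABULARY = ["cat", "dog", "rabbit", "pig"]
NUM_OOV_BUCKETS = 3


def lookup_id(value):
    """Return the vocabulary index, or an OOV bucket ID placed after the vocabulary."""
    if value in VOCABULARY:
        return VOCABULARY.index(value)
    # Stand-in hash (assumption): not the hash TensorFlow actually uses.
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return len(VOCABULARY) + int(digest, 16) % NUM_OOV_BUCKETS


def indicator_rows(values):
    """One-hot encode each value into a row of width vocabulary size + OOV buckets."""
    width = len(VOCABULARY) + NUM_OOV_BUCKETS
    rows = []
    for value in values:
        row = [0.0] * width
        row[lookup_id(value)] = 1.0
        rows.append(row)
    return rows


rows = indicator_rows(["rabbit", "pig", "dog", "mouse", "cat"])
for row in rows:
    print(row)
```

Each of the five inputs becomes a row of width 7 (4 vocabulary entries plus 3 OOV buckets). The in-vocabulary pets land in columns 0 to 3, and 'mouse' lands somewhere in columns 4 to 6.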

references

This article is very well written

TensorFlow pitfalls collection 1: feature_column - Xiaoqi in the wind and rain - cnblogs. https://www.cnblogs.com/gogoSandy/p/12435372.html

Learn the API of feature engineering in TensorFlow - Tencent Cloud community. https://cloud.tencent.com/developer/article/1467606

Exploration on feature processing of tf.feature_column - Zhihu. https://zhuanlan.zhihu.com/p/73701872

tensorflow2.x study notes twenty: feature column feature_column. https://www.freesion.com/article/2669486257/

[1024 Happy Programmer's Day] TensorFlow feature engineering: feature_column. https://mp.weixin.qq.com/s/lYFRuw0V67QTbY5ytTCBaA

Origin blog.csdn.net/u013385018/article/details/120373761