A short summary of a classification problem I ran into while running deep learning models with Keras.
I was given a 2-class model that needed to be changed to 5 classes. I had no idea how to do this at first, so I looked around, solved the problem, and wrote it up here for later reference.
I found a post that solved my problem. The asker wrote:
I'm new to Machine Learning, thought I'll start with keras. Here I'm classifying movie reviews as three class classification (positive as 1, neutral as 0 and negative as -1) using binary crossentropy. So, when I'm trying to wrap my keras model with tensorflow estimator, I get the error.
In other words, he wants to train a three-class model, but the model he was given raises an error when he runs it. The code is as follows:
import tensorflow as tf
import numpy as np
import pandas as pd
csvfilename_train = 'train(cleaned).csv'
csvfilename_test = 'test(cleaned).csv'
# Read .csv files as pandas dataframes
df_train = pd.read_csv(csvfilename_train)
df_test = pd.read_csv(csvfilename_test)
train_sentences = df_train['Comment'].values
test_sentences = df_test['Comment'].values
# Extract labels from dataframes
train_labels = df_train['Sentiment'].values
test_labels = df_test['Sentiment'].values
vocab_size = 10000
embedding_dim = 16
max_length = 30
trunc_type = 'post'
oov_tok = '<OOV>'
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
tokenizer = Tokenizer(num_words = vocab_size, oov_token = oov_tok)
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences, maxlen = max_length, truncating = trunc_type)
test_sequences = tokenizer.texts_to_sequences(test_sentences)
test_padded = pad_sequences(test_sequences, maxlen = max_length)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length = max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation = 'relu'),
    tf.keras.layers.Dense(2, activation = 'sigmoid'),
])
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
num_epochs = 10
model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))
Error message:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
Here is the reply from a helpful user:
It pinpoints the problem right away. First, the loss has to change from binary cross-entropy to a multi-class loss. Second, the final Dense layer needs 3 units, one per class. Third, the labels have to be converted to one-hot vectors so that their shape matches the model output.
Change the loss from binary_crossentropy to categorical_crossentropy:
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
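Then change the number of units in the last Dense layer to 3. Because the three classes are mutually exclusive, softmax is the usual activation to pair with categorical_crossentropy, rather than sigmoid:
tf.keras.layers.Dense(3, activation = 'softmax'),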
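The reply also calls for one-hot labels, but the post does not show that step. The following is only a minimal NumPy sketch of the idea, assuming the labels are -1, 0 and 1 as described in the question. The names `to_one_hot`, `softmax` and `categorical_crossentropy` are hypothetical helpers written for illustration, not Keras functions, and the random predictions are a stand-in for real model output.

```python
import numpy as np

# Labels as described in the question: -1 (negative), 0 (neutral), 1 (positive).
labels = np.array([-1, 0, 1, 1, -1])

def to_one_hot(labels, num_classes=3):
    """Hypothetical helper: shift labels to 0..num_classes-1, then one-hot encode."""
    class_ids = labels + 1                  # -1, 0, 1 -> 0, 1, 2
    return np.eye(num_classes)[class_ids]

def softmax(logits):
    """Row-wise softmax, so each row sums to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

one_hot = to_one_hot(labels)
print(one_hot.shape)  # (5, 3), the same layout as a Dense(3) output of shape (None, 3)

rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(5, 3)))    # stand-in for model predictions
print(categorical_crossentropy(one_hot, probs))
```

With the real data, the one-hot array would be passed to model.fit in place of the raw labels. tf.keras.utils.to_categorical performs the same encoding, but it expects class ids starting at 0, so the -1/0/1 labels would still need to be shifted first.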
Another author's article is also a very useful reference, so I am including it below.
The role of the Dense layer
In Keras, Dense is a fully connected layer. Each Dense layer takes the output of the previous layer as its input, and the last Dense layer produces the model's output, so its number of units has to match the number of classes.
That's all.
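To make the description concrete, here is a rough NumPy sketch of what a fully connected layer computes, namely activation(x @ weights + bias). The `dense` and `relu` functions are hypothetical stand-ins written for this example, not the Keras implementation. The input width of 480 assumes the flattened embedding from the code above (30 tokens times 16 embedding dimensions), and the weights are random rather than trained.

```python
import numpy as np

def dense(x, weights, bias, activation=None):
    """Sketch of a fully connected layer: activation(x @ weights + bias)."""
    out = x @ weights + bias
    return activation(out) if activation is not None else out

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 480))   # 4 samples, 30 tokens x 16 dims flattened

# Hidden layer with 6 units, then an output layer with 3 units (one per class).
w1, b1 = rng.normal(size=(480, 6)), np.zeros(6)
w2, b2 = rng.normal(size=(6, 3)), np.zeros(3)

hidden = dense(batch, w1, b1, activation=relu)  # output of the first Dense
output = dense(hidden, w2, b2)                  # fed into the last Dense

print(hidden.shape)  # (4, 6)
print(output.shape)  # (4, 3)
```

The second dimension of each output equals that layer's number of units, which is why the last Dense layer decides the shape the labels have to match.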