I am using GBDT (PySpark's GBTClassifier), and the prediction results are shown below. Going straight to the code, with only the key parts:
from pyspark.ml.classification import GBTClassifier

gbdt = GBTClassifier(featuresCol="features", labelCol="y", predictionCol="prediction")
model = gbdt.fit(df_train)
df_train_eval = model.transform(df_train)
df_train_eval.select('y', 'rawPrediction', 'probability', 'prediction').show(truncate=False)
+---+----------------------------------------+-----------------------------------------+----------+
|y |rawPrediction |probability |prediction|
+---+----------------------------------------+-----------------------------------------+----------+
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
|0 |[1.5435020027249835,-1.5435020027249835]|[0.9563534785727067,0.043646521427293306]|0.0 |
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
|1 |[-1.5435020027249835,1.5435020027249835]|[0.04364652142729318,0.9563534785727068] |1.0 |
+---+----------------------------------------+-----------------------------------------+----------+
- y: the original label
- rawPrediction: the raw prediction values (margin scores, before conversion to probabilities)
- probability: the predicted class probabilities
- prediction: the predicted label
Why do I say the probabilities come from softmax rather than sigmoid?
Reason 1:
Look at the probability column: in every row the two entries sum to 1, which matches the mutually exclusive classes principle of softmax. A quick check of this is sketched below.
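As a minimal sanity check (assuming Spark 3.0+, where pyspark.ml.functions.vector_to_array is available), we can confirm that the two probability entries sum to 1 for every row:

from pyspark.sql import functions as F
from pyspark.ml.functions import vector_to_array  # available since Spark 3.0

# unpack the probability vector and sum its two entries per row
(df_train_eval
    .withColumn("p", vector_to_array("probability"))
    .select((F.col("p")[0] + F.col("p")[1]).alias("prob_sum"))
    .show(5))
# every prob_sum is 1.0 (up to floating-point rounding)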
Reason 2:

import numpy as np

def softmax(x):
    # exponentiate and normalize so the outputs sum to 1
    e_x = np.exp(x)
    return e_x / e_x.sum()

def sigmoid(x):
    # element-wise logistic function
    return 1. / (1. + np.exp(-x))
Feeding the raw prediction vector rawPrediction into both functions shows that the probability column matches the softmax output, not the sigmoid output, which proves the probabilities are the softmax of rawPrediction.
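To make this concrete, here is a minimal sketch using one rawPrediction row copied from the output above:

import numpy as np

raw = np.array([-1.5435020027249835, 1.5435020027249835])  # one rawPrediction row

print(softmax(raw))   # [0.04364652 0.95635348] -> reproduces the probability column
print(sigmoid(raw))   # ~[0.1760 0.8240]        -> does not match

Note that for a symmetric vector [-m, m], softmax([-m, m])[1] = 1 / (1 + e^(-2m)) = sigmoid(2m), so an element-wise sigmoid of rawPrediction can only agree with the probability column if the margin is doubled first.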