PySpark Isotonic Regression

from pyspark.ml.regression import IsotonicRegression
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("dataFrame") \
    .getOrCreate()

# Loads data.
dataset = spark.read.format("libsvm")\
    .load("/home/luogan/lg/softinstall/spark-2.2.0-bin-hadoop2.7/data/mllib/sample_isotonic_regression_libsvm_data.txt")

# Trains an isotonic regression model.
model = IsotonicRegression().fit(dataset)
print("Boundaries in increasing order: %s\n" % str(model.boundaries))
print("Predictions associated with the boundaries: %s\n" % str(model.predictions))

# Makes predictions.
model.transform(dataset).show()

Output:

Boundaries in increasing order: [0.01,0.17,0.18,0.27,0.28,0.29,0.3,0.31,0.34,0.35,0.36,0.41,0.42,0.71,0.72,0.74,0.75,0.76,0.77,0.78,0.79,0.8,0.81,0.82,0.83,0.84,0.85,0.86,0.87,0.88,0.89,1.0]

Predictions associated with the boundaries: [0.15715271294117644,0.15715271294117644,0.189138196,0.189138196,0.20040796,0.29576747,0.43396226,0.5081591025000001,0.5081591025000001,0.54156043,0.5504844466666667,0.5504844466666667,0.563929967,0.563929967,0.5660377366666667,0.5660377366666667,0.56603774,0.57929628,0.64762876,0.66241713,0.67210607,0.67210607,0.674655785,0.674655785,0.73890872,0.73992861,0.84242733,0.89673636,0.89673636,0.90719021,0.9272055075,0.9272055075]

+----------+--------------+-------------------+
|     label|      features|         prediction|
+----------+--------------+-------------------+
|0.24579296|(1,[0],[0.01])|0.15715271294117644|
|0.28505864|(1,[0],[0.02])|0.15715271294117644|
|0.31208567|(1,[0],[0.03])|0.15715271294117644|
|0.35900051|(1,[0],[0.04])|0.15715271294117644|
|0.35747068|(1,[0],[0.05])|0.15715271294117644|
|0.16675166|(1,[0],[0.06])|0.15715271294117644|
|0.17491076|(1,[0],[0.07])|0.15715271294117644|
| 0.0418154|(1,[0],[0.08])|0.15715271294117644|
|0.04793473|(1,[0],[0.09])|0.15715271294117644|
|0.03926568| (1,[0],[0.1])|0.15715271294117644|
|0.12952575|(1,[0],[0.11])|0.15715271294117644|
|       0.0|(1,[0],[0.12])|0.15715271294117644|
|0.01376849|(1,[0],[0.13])|0.15715271294117644|
|0.13105558|(1,[0],[0.14])|0.15715271294117644|
|0.08873024|(1,[0],[0.15])|0.15715271294117644|
|0.12595614|(1,[0],[0.16])|0.15715271294117644|
|0.15247323|(1,[0],[0.17])|0.15715271294117644|
|0.25956145|(1,[0],[0.18])|        0.189138196|
|0.20040796|(1,[0],[0.19])|        0.189138196|
|0.19581846| (1,[0],[0.2])|        0.189138196|
+----------+--------------+-------------------+
only showing top 20 rows
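
The output above shows why inputs such as 0.05 and 0.19, which are not boundaries themselves, still receive the prediction of a neighbouring boundary. As far as the Spark documentation describes it, the fitted model is a piecewise-linear function: an input equal to a boundary returns that boundary's prediction, an input outside the boundary range returns the first or last prediction, and anything in between is linearly interpolated. The sketch below reproduces that rule in plain Python. The `predict` helper is hypothetical (it is not part of the PySpark API), and only the first five boundary/prediction pairs from the output are copied in, so it is valid only for inputs up to 0.28.

```python
from bisect import bisect_left


def predict(x, boundaries, predictions):
    """Hypothetical helper: piecewise-linear lookup over sorted boundaries."""
    # Inputs at or outside the boundary range are clamped to the end predictions.
    if x <= boundaries[0]:
        return predictions[0]
    if x >= boundaries[-1]:
        return predictions[-1]
    # Index of the first boundary that is >= x.
    i = bisect_left(boundaries, x)
    if boundaries[i] == x:
        return predictions[i]
    # Linear interpolation between the two surrounding boundaries.
    b0, b1 = boundaries[i - 1], boundaries[i]
    p0, p1 = predictions[i - 1], predictions[i]
    return p0 + (p1 - p0) * (x - b0) / (b1 - b0)


# First five boundary/prediction pairs copied from the output above.
boundaries = [0.01, 0.17, 0.18, 0.27, 0.28]
predictions = [0.15715271294117644, 0.15715271294117644,
               0.189138196, 0.189138196, 0.20040796]

for x in (0.05, 0.17, 0.175, 0.18, 0.19):
    print(x, predict(x, boundaries, predictions))
```

For 0.05 and 0.19 this gives 0.15715271294117644 and 0.189138196, matching the corresponding rows in the table. An input such as 0.175, which falls between two boundaries with different predictions, gets a value between the two.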

Reposted from blog.csdn.net/luoganttcc/article/details/80635063