Converting sparse vectors to dense vectors in Spark

After multiple feature columns are combined with VectorAssembler, Spark's storage format means that a row with many zero values is stored as a sparse vector (SparseVector). However, the subsequent calculation needs a dense vector, so we have to convert the sparse vector to a dense vector (DenseVector).

1. First use VectorAssembler to convert the required columns into a vector column.
2. After converting to an RDD, use a map operation to convert the elements in the feature column to DenseVector.

Origin blog.csdn.net/weixin_44695793/article/details/109135659