In GraphLab, .apply only processes the first 100 values

After running the code below, I found that the newly added great column in products had only been computed for the first 100 values.

def great_count(word_count_vector):
    if 'great' in word_count_vector:
        return word_count_vector['great']
    else:
        return 0
products['great'] = products['word_count'].apply(great_count)
products['great'].show()

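After some digging, the cause turned out to be how apply is called: when its second parameter, dtype, is not set, only the first 100 values are looked at (the GraphLab documentation describes this as using the first 100 elements to guess the output type). So pass dtype explicitly when calling apply, as follows: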
def great_count(word_count_vector):
    if 'great' in word_count_vector:
        return word_count_vector['great']
    else:
        return 0
products['great'] = products['word_count'].apply(great_count, int)
products['great'].show()
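As a quick sanity check that does not need GraphLab, the counting function can be tried on plain Python dicts. This is only a minimal sketch: `sample_rows` is made-up data standing in for the word_count column, and `dict.get` is used as a shorter equivalent of the if/else above.

```python
def great_count(word_count_vector):
    # Return the count stored under 'great', or 0 when the word is absent.
    return word_count_vector.get('great', 0)

# Hypothetical rows standing in for products['word_count'].
sample_rows = [{'great': 2, 'product': 1}, {'bad': 1}, {}]

counts = [great_count(row) for row in sample_rows]
print(counts)  # [2, 0, 0]
```

Every result is an int, which is why int is a reasonable dtype to pass to apply here. If the word counts were stored as floats, float would presumably be the better choice.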



Reposted from blog.csdn.net/weixin_41770169/article/details/80808393