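The APOC plugin can be used to bulk-import data into Neo4j.
Neo4j 3.x introduced user-defined procedures and functions, and APOC makes them convenient to use.
The APOC library contains many (about 450) procedures and functions that help with tasks such as data integration, graph algorithms, and data conversion.
Environment used in this article: Windows 10, neo4j-community-3.5.8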
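Installing the Neo4j APOC plugin
1. Download a release from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/ (pick the one that matches your Neo4j version, here 3.5.x)
2. Copy apoc-<version>-all.jar to $NEO4J_HOME/plugins/
3. Edit $NEO4J_HOME/conf/neo4j.conf and add: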
dbms.security.procedures.unrestricted=apoc.*
If the setting already lists other entries, separate them with commas:
dbms.security.procedures.unrestricted=algo.*,apoc.*
4. Restart Neo4j: $NEO4J_HOME/bin/neo4j restart
5. Verify the installation with the following query:
RETURN apoc.version();
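The neo4j.conf edit in step 3 can also be scripted. Below is a minimal Python sketch of the comma-separation rule described there. The function name add_unrestricted is made up for illustration, and it assumes the setting is written without spaces around the equals sign.

```python
def add_unrestricted(conf_text, pattern="apoc.*"):
    """Return conf_text with `pattern` listed in the unrestricted-procedures setting.

    Hypothetical helper: assumes the line has the form `key=value` with no
    spaces around `=`, as in the examples in this article.
    """
    key = "dbms.security.procedures.unrestricted"
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and unrelated settings.
        if stripped.startswith("#") or not stripped.startswith(key + "="):
            continue
        # Split the existing comma-separated value and add the pattern once.
        current = [p.strip() for p in stripped.split("=", 1)[1].split(",") if p.strip()]
        if pattern not in current:
            current.append(pattern)
        lines[i] = key + "=" + ",".join(current)
        return "\n".join(lines) + "\n"
    # Setting not present yet: append it as a new line.
    lines.append(key + "=" + pattern)
    return "\n".join(lines) + "\n"
```

Running it twice on the same text leaves the line unchanged, so it is safe to re-run after installing another plugin.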
Importing the Yelp sample data
Define the graph schema (the first map lists indexes, the second lists unique constraints)
CALL apoc.schema.assert(
{Category:['name']},
{Business:['id'],User:['id'],Review:['id']});
Download business.json from https://www.yelp.com/dataset/download or https://www.kaggle.com/yelp-dataset/yelp-dataset/version/9#yelp_academic_dataset_business.json
Note: the load commands below only run correctly against round 10 of the Yelp dataset.
Place business.json in $NEO4J_HOME/import (Neo4j's data import directory)
In neo4j.conf, add:
apoc.import.file.enabled=true
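Before loading, it can help to check what shape the file actually has, since the note above ties the commands to round 10. My understanding, which I have not checked against every release, is that the Yelp files hold one JSON object per line and that some later releases store categories as one comma-separated string rather than an array, while the UNWIND step in the load expects a list. The sketch below is a hedged illustration of that check. The helper names and the three sample rows are invented, not taken from the dataset.

```python
import json

def read_json_lines(lines):
    """Yield one dict per non-empty line (one-object-per-line layout assumed)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

def categories_as_list(value):
    """Normalize `categories` to a list of names, whatever shape it arrives in."""
    if value is None:
        return []
    if isinstance(value, str):
        # Assumed later-release shape: "Food, Bakeries"
        return [c.strip() for c in value.split(",") if c.strip()]
    return list(value)

# Invented sample rows standing in for lines of business.json.
sample = [
    '{"business_id": "b1", "categories": ["Food", "Bakeries"]}',
    '{"business_id": "b2", "categories": "Food, Bakeries"}',
    '{"business_id": "b3", "categories": null}',
]
records = list(read_json_lines(sample))
normalized = [categories_as_list(r["categories"]) for r in records]
```

To inspect a real download, pass an open file handle (for example open("business.json", encoding="utf-8")) to read_json_lines instead of the sample list.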
Load the business data
CALL apoc.periodic.iterate("
CALL apoc.load.json('file:///business.json') YIELD value RETURN value
","
MERGE (b:Business{id:value.business_id})
SET b += apoc.map.clean(value, ['attributes','hours','business_id','categories','address','postal_code'],[])
WITH b,value.categories as categories
UNWIND categories as category
MERGE (c:Category{name:category})
MERGE (b)-[:IN_CATEGORY]->(c)
",{batchSize: 10000, iterateList: true});
The other files are loaded the same way: download them and place them in $NEO4J_HOME/import.
Tip data
CALL apoc.periodic.iterate("
CALL apoc.load.json('file:///tip.json') YIELD value RETURN value
","
MATCH (b:Business{id:value.business_id})
MERGE (u:User{id:value.user_id})
MERGE (u)-[:TIP{date:value.date,likes:value.likes}]->(b)
",{batchSize: 20000, iterateList: true});
Review data
CALL apoc.periodic.iterate("
CALL apoc.load.json('file:///review.json')
YIELD value RETURN value
","
MERGE (b:Business{id:value.business_id})
MERGE (u:User{id:value.user_id})
MERGE (r:Review{id:value.review_id})
MERGE (u)-[:WROTE]->(r)
MERGE (r)-[:REVIEWS]->(b)
SET r += apoc.map.clean(value, ['business_id','user_id','review_id','text'],[0])
",{batchSize: 10000, iterateList: true});
User data
CALL apoc.periodic.iterate("
CALL apoc.load.json('file:///user.json')
YIELD value RETURN value
","
MERGE (u:User{id:value.user_id})
SET u += apoc.map.clean(value, ['friends','user_id'],[0])
WITH u,value.friends as friends
UNWIND friends as friend
MERGE (u1:User{id:friend})
MERGE (u)-[:FRIEND]-(u1)
",{batchSize: 100, iterateList: true});