Neo4j: importing data and relationships with neo4j-import

Background

In the previous section we covered what a graph database is, the characteristics, advantages and disadvantages of Neo4j (our subject of study), and the basic environment setup.
Now we will import the call records stored in CSV files into Neo4j, and then query the imported data and relationships with CQL (Cypher).

1. Choosing an import method

There are many ways to import data into Neo4j. Roughly summarized:

  1. Cypher CREATE statements: write one CREATE for each record
  2. The Cypher LOAD CSV statement: convert the data to CSV and read it with LOAD CSV
  3. Batch Inserter, the official Java API
  4. The Batch Import tool, written by a community expert
  5. The official neo4j-import tool

Comparison of their advantages and disadvantages:

CREATE statement
  • Applicable scale: 1 to 10,000 nodes
  • Speed: very slow (1,000 nodes/s)
  • Advantages: easy to use; real-time insertion
  • Disadvantages: slow

LOAD CSV statement
  • Applicable scale: 10,000 to 100,000 nodes
  • Speed: moderate (5,000 nodes/s)
  • Advantages: easy to use; can load local or remote CSV files; real-time insertion
  • Disadvantages: the data must be converted to CSV

Batch Inserter
  • Applicable scale: a million nodes or more
  • Speed: very fast (tens of thousands of nodes/s)
  • Disadvantages: the data must be converted to CSV; Java only; Neo4j must be stopped during insertion

Batch Import
  • Applicable scale: a million nodes or more
  • Speed: very fast (tens of thousands of nodes/s)
  • Advantages: built on Batch Inserter; the compiled jar can be run directly; can import into an existing database
  • Disadvantages: the data must be converted to CSV; Neo4j must be stopped

neo4j-import
  • Applicable scale: a million nodes or more
  • Speed: very fast (tens of thousands of nodes/s)
  • Advantages: official tool; uses fewer resources than Batch Import
  • Disadvantages: the data must be converted to CSV; Neo4j must be stopped; it can only create a new database and cannot insert into an existing one

As you can see, there are many ways to import data. Because we have a large amount of data to import, I chose the last one, neo4j-import. You can of course choose another method.
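
To make the speed gap concrete, here is a rough back-of-the-envelope sketch in Python. The rates for CREATE and LOAD CSV are the figures quoted in the comparison; the 20,000 nodes/s used for neo4j-import is only an assumed stand-in for "tens of thousands", and the dataset size of 10 million nodes is hypothetical.

```python
# Rough import-time estimate from the throughput figures quoted above.
RATES_NODES_PER_SEC = {
    "CREATE": 1_000,         # figure from the comparison
    "LOAD CSV": 5_000,       # figure from the comparison
    "neo4j-import": 20_000,  # ASSUMPTION: stand-in for "tens of thousands"
}


def estimate_seconds(node_count: int, method: str) -> float:
    """Return the estimated import time in seconds for the given method."""
    return node_count / RATES_NODES_PER_SEC[method]


if __name__ == "__main__":
    nodes = 10_000_000  # hypothetical dataset size
    for method in RATES_NODES_PER_SEC:
        minutes = estimate_seconds(nodes, method) / 60
        print(f"{method}: about {minutes:.0f} minutes")
```

Under these assumptions the same dataset would take roughly 167 minutes with CREATE, 33 minutes with LOAD CSV, and 8 minutes with the bulk importer. Real numbers depend heavily on hardware and data shape.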

Using neo4j-import

On the neo4j-import usage page of the official website we find this excerpt:

Super Fast Batch Importer For Huge Datasets

LOAD CSV is great for
importing small – medium sized data, i.e. up to the 10M records range.
For large data sets, i.e. in the 100B records range, we have access to
a specialized bulk importer.

We want to use it to import similar order data into Neo4j: customers,
orders and contained products.

The tool is located in path/to/neo4j/bin/neo4j-import and is used as
follows:

The gist of this passage is that LOAD CSV cannot handle the data volumes our business needs, so we have to pick a different import method. Here we choose neo4j-import; an example import follows:

bin/neo4j-import --into retail.db --id-type string \
                 --nodes:Customer customers.csv --nodes products.csv  \
                 --nodes orders_header.csv,orders1.csv,orders2.csv \
                 --relationships:CONTAINS order_details.csv \
                 --relationships:ORDERED customer_orders_header.csv,orders1.csv,orders2.csv

   
   

Structure of the example data:
If you call the neo4j-import script without parameters, it lists a comprehensive help page.

--into retail.db is the target database, which must not contain an existing database.

The repeated --nodes and --relationships parameters are groups of multiple (possibly split) CSV files of the same entity, i.e. files with the same column structure.

All files in a group are treated as if they were concatenated into one big file. A header row is required, either as the first line of the group's first file or in a separate single-line file; a single-line header file is easier to handle and edit than a multi-GB text file. Compressed files are also supported.
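
As a rough illustration of how a file group behaves, the following Python sketch reads several CSV files as if they were one, taking the header from the first line of the first file. It only mimics the idea described above and is not how the import tool is implemented; `read_group`, `demo`, and the column names in the sample files are made up for the example.

```python
import csv
import os
import tempfile
from typing import Dict, Iterator, List


def read_group(paths: List[str]) -> Iterator[Dict[str, str]]:
    """Yield rows from a group of CSV files as if they were concatenated.

    The first line of the first file is the header; every other line in
    every file is data with the same column layout.
    """
    header = None
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                if header is None:
                    header = row  # header line (may live in its own file)
                    continue
                if row:  # skip blank lines
                    yield dict(zip(header, row))


def demo() -> List[Dict[str, str]]:
    """Write a one-line header file plus two data files, then read the group."""
    contents = {
        "orders_header.csv": "orderId:ID,total:INT\n",  # hypothetical columns
        "orders1.csv": "o1,10\n",
        "orders2.csv": "o2,25\n",
    }
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in contents.items():
            path = os.path.join(tmp, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(text)
            paths.append(path)
        return list(read_group(paths))


if __name__ == "__main__":
    for record in demo():
        print(record)
```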

  1. customers.csv is imported directly as nodes with the :Customer label; the properties are taken directly from the file.
  2. The same goes for the products, except that the node labels are taken from the :LABEL column.
  3. The order nodes come from three files: one header file and two content files.
  4. The line-item relationships of type :CONTAINS are created from order_details.csv, linking orders to the products they contain via their IDs.
  5. Orders are connected to customers by reusing the order CSV files, this time with a different header that marks the irrelevant columns with :IGNORE.

--id-type string indicates that all :ID columns contain alphanumeric values (numeric-only IDs are handled with an optimization).
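
When there are many file groups, the example command above can be assembled programmatically. The sketch below is one possible way to do that in Python; it only uses the flags that appear in the example, and `build_import_args` and the `Group` type are hypothetical helpers, not part of Neo4j.

```python
from typing import List, Optional, Sequence, Tuple

# (label or relationship type, or None when it comes from the CSV; files in the group)
Group = Tuple[Optional[str], Sequence[str]]


def build_import_args(into: str, id_type: str,
                      node_groups: Sequence[Group],
                      rel_groups: Sequence[Group]) -> List[str]:
    """Assemble the argument list for the neo4j-import example above."""
    args = ["bin/neo4j-import", "--into", into, "--id-type", id_type]
    for flag, groups in (("--nodes", node_groups), ("--relationships", rel_groups)):
        for name, files in groups:
            # "--nodes:Customer" fixes the label; plain "--nodes" leaves it to the file
            args.append(f"{flag}:{name}" if name else flag)
            # the files of one group are passed as a single comma-separated value
            args.append(",".join(files))
    return args


if __name__ == "__main__":
    argv = build_import_args(
        "retail.db", "string",
        node_groups=[
            ("Customer", ["customers.csv"]),
            (None, ["products.csv"]),
            (None, ["orders_header.csv", "orders1.csv", "orders2.csv"]),
        ],
        rel_groups=[
            ("CONTAINS", ["order_details.csv"]),
            ("ORDERED", ["customer_orders_header.csv", "orders1.csv", "orders2.csv"]),
        ],
    )
    print(" ".join(argv))
```

Returning a list rather than one string makes it easy to hand the result to `subprocess.run` without worrying about shell quoting.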

The column names are used as the property names of nodes and relationships. Some columns carry additional markup:

  • name:ID - global ID column, used to look the node up again later when connecting relationships
    • If you leave off the property name, the ID is not stored (it is temporary); this is what --id-type refers to
    • If IDs repeat across entities, you must give the entity (id-group) in parentheses, e.g. :ID(Order)
    • If your IDs are globally unique, you can leave the id-group off
  • :LABEL - label column for nodes; multiple labels can be separated by a delimiter
  • :START_ID, :END_ID - relationship-file columns that reference node IDs; with id-groups, use e.g. :END_ID(Order)
  • :TYPE - relationship type column
  • All other columns are treated as properties, but are skipped if empty or annotated with :IGNORE
  • Type conversion is done by appending a type marker to the name, such as :INT or :BOOLEAN
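
To show how this markup decomposes, here is a small Python sketch that splits a header line into property name, marker, and id-group. It is a simplified reading of the rules listed above, not the importer's actual parser; `Column` and `parse_header` are hypothetical names, the sample header is invented, and array types are not handled.

```python
import re
from typing import List, NamedTuple, Optional


class Column(NamedTuple):
    name: Optional[str]   # property name, or None when it is left off
    kind: str             # ID, LABEL, START_ID, END_ID, TYPE, IGNORE, INT, ...
    group: Optional[str]  # id-group such as "Order", or None


# name, then an optional ":MARKER", then an optional "(id-group)"
_FIELD = re.compile(
    r"^(?P<name>[^:]*)(?::(?P<kind>[A-Za-z_]+)(?:\((?P<group>[^)]*)\))?)?$"
)


def parse_header(line: str) -> List[Column]:
    """Split a header line into columns and decode the markup of each one."""
    columns = []
    for field in line.strip().split(","):
        match = _FIELD.match(field.strip())
        if match is None:
            raise ValueError(f"unrecognised header field: {field!r}")
        columns.append(Column(
            name=match.group("name") or None,
            # ASSUMPTION: a column without a marker is a plain string property
            kind=match.group("kind") or "STRING",
            group=match.group("group"),
        ))
    return columns


if __name__ == "__main__":
    # hypothetical relationship header built from the markers listed above
    header = ":START_ID(Customer),:END_ID(Order),quantity:INT,note:IGNORE"
    for column in parse_header(header):
        print(column)
```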

Importing the call-record data

After cleaning up our call records, the CSV data consists of:

  1. phones.csv: the list of phone numbers, imported as nodes

  2. phone_header.csv: a header file containing a single line
    phone:ID

  3. call.csv: the call records, from which the relationships and their properties are created
    Reading the first row from left to right:
    150****0743 called 136****5301, with a total call duration of 125 minutes and an average of 125 minutes per call

  4. call_header.csv: the header for the call records
    Here :START_ID refers to the start node of the relationship and :END_ID to its end node
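
For readers who need to produce these four files from raw data, here is a minimal Python sketch. It assumes the raw input is a list of (caller, callee, minutes) records, which the original post does not specify. The `phone:ID` header comes from the post, but the property names and types in the call header (`total_minutes:INT`, `avg_minutes:FLOAT`) are hypothetical, since the real header was only shown as a screenshot.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def _to_csv(rows: Iterable[Iterable[object]]) -> str:
    """Serialize rows to CSV text with Unix line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def build_csv_files(calls: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """Turn raw (caller, callee, minutes) records into the four import files."""
    minutes_by_pair: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for caller, callee, minutes in calls:
        minutes_by_pair[(caller, callee)].append(minutes)

    # every number that appears as caller or callee becomes one node row
    phones = sorted({number for pair in minutes_by_pair for number in pair})

    # one relationship row per (caller, callee) pair: total and average minutes
    call_rows = [
        [caller, callee, sum(durations), round(sum(durations) / len(durations), 2)]
        for (caller, callee), durations in sorted(minutes_by_pair.items())
    ]
    return {
        "phone_header.csv": "phone:ID\n",
        "phones.csv": _to_csv([p] for p in phones),
        # ASSUMPTION: the property names and types below are hypothetical
        "call_header.csv": ":START_ID,:END_ID,total_minutes:INT,avg_minutes:FLOAT\n",
        "call.csv": _to_csv(call_rows),
    }


if __name__ == "__main__":
    sample = [("1001", "1002", 125), ("1001", "1003", 10), ("1001", "1003", 20)]
    for name, text in build_csv_files(sample).items():
        print(f"--- {name}\n{text}", end="")
```

Keeping the headers in separate one-line files means the large data files never have to be edited when a column name or type changes.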

Once these CSV files are ready, we write a shell script to import them.

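import()
{
    # Import command: Neo4j must be stopped before importing
    neo4j stop
    # Remove the existing database so the import starts from an empty target
    cd /usr/local/Cellar/neo4j/3.5.0/libexec/data/databases
    rm -rf graph.db
    cd /Documents/归档/data
    # Each group lists its header file first, then the data file
    neo4j-admin import \
    --database=graph.db \
    --nodes:phone="../phone_header.csv,phones.csv" \
    --ignore-duplicate-nodes=true \
    --ignore-missing-nodes=true \
    --relationships:call="../call_header.csv,call.csv"
    neo4j start
}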

   
   
  • The import cannot target a database that already exists, so we delete the existing database before importing
  • Remember to stop Neo4j first
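
The script passes --ignore-duplicate-nodes and --ignore-missing-nodes, which, as their names suggest, let the import carry on past duplicate node IDs and relationships that point at unknown nodes. To see beforehand how many rows would be affected, a pre-check along these lines can help. This is a sketch in Python under two assumptions: the data files have no header row (headers live in separate files, as in this post), and the first two columns of call.csv are the start and end IDs. `check_import_files` is a hypothetical helper.

```python
import csv
import io
from typing import List, Set, Tuple


def check_import_files(phones_csv: str,
                       calls_csv: str) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Return (duplicate node IDs, call rows that reference unknown IDs)."""
    seen: Set[str] = set()
    duplicates: Set[str] = set()
    for row in csv.reader(io.StringIO(phones_csv)):
        if not row:
            continue  # skip blank lines
        phone = row[0]
        if phone in seen:
            duplicates.add(phone)
        seen.add(phone)

    dangling: List[Tuple[str, str]] = []
    for row in csv.reader(io.StringIO(calls_csv)):
        if not row:
            continue
        # ASSUMPTION: columns 0 and 1 hold :START_ID and :END_ID
        start, end = row[0], row[1]
        if start not in seen or end not in seen:
            dangling.append((start, end))
    return duplicates, dangling


if __name__ == "__main__":
    phones = "1001\n1002\n1001\n"                     # made-up sample data
    calls = "1001,1002,5,5.0\n1001,1009,3,3.0\n"
    dups, missing = check_import_files(phones, calls)
    print("duplicate node IDs:", sorted(dups))
    print("relationships with missing nodes:", missing)
```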

View Results

After the import completes, we open the Neo4j Browser to check the result at http://localhost:7474/browser/
First, let's look at the Database Information panel.
Here we can see the number of nodes, the number of relationships, and the storage space the database occupies.
Next, let's look at the social circle of one phone number:

match (p:phone{phone:"13825259929"})-[r]->(o) return p,o,r;

   
   

When you hover over a node or relationship, its properties appear at the bottom.
Our data import is now complete.
Next, we will use Spring Boot + Neo4j + D3 to visualize a person's call-record circle.

Previous article: Neo4j (1): A first look at the graph database Neo4j

Original address: https://blog.csdn.net/qq_32519415/article/details/87942379


Origin www.cnblogs.com/jpfss/p/11289669.html