Background
In the previous section we covered what a graph database is, along with Neo4j's characteristics as a graph database, its advantages and disadvantages, and the basic environment setup.
In this section we import call records stored as CSV into Neo4j, and then query the imported data and relationships with CQL.
1. Choosing an import method
There are many ways to import data into Neo4j. Roughly summarized:
- Cypher CREATE statements: write one CREATE per record
- Cypher LOAD CSV statement: convert the data to CSV format and read it with LOAD CSV
- Batch Inserter, the official Java API
- Batch Import, a third-party tool
- neo4j-import, the official tool
Comparison of advantages and disadvantages:
| | CREATE statement | LOAD CSV statement | Batch Inserter | Batch Import | neo4j-import |
|---|---|---|---|---|---|
| Use case | 1 to 10,000 nodes | 10,000 to 100,000 nodes | Million or more nodes | Million or more nodes | Million or more nodes |
| Speed | Very slow (1,000 nodes/s) | Moderate (5,000 nodes/s) | Very fast (tens of thousands of nodes/s) | Very fast (tens of thousands of nodes/s) | Very fast (tens of thousands of nodes/s) |
| Advantages | Easy to use; real-time insertion | Easy to use; can load local or remote CSV; real-time insertion | Faster than CREATE and LOAD CSV | Based on Batch Inserter; the compiled jar can be run directly; can import into an existing database | Official tool; uses fewer resources than Batch Import |
| Disadvantages | Slow | Data must be converted to CSV | Data must be converted to CSV; Java only; Neo4j must be stopped during insertion | Data must be converted to CSV; Neo4j must be stopped | Data must be converted to CSV; Neo4j must be stopped; can only create a new database, cannot insert into an existing one |
As you can see, there are many ways to import. Because we are importing a large amount of data, I chose the last one, neo4j-import; you can pick another method if it suits you better.
Using neo4j-import
The neo4j-import page on the official website contains this excerpt:
Super Fast Batch Importer For Huge Datasets
LOAD CSV is great for importing small – medium sized data, i.e. up to the 10M records range. For large data sets, i.e. in the 100B records range, we have access to a specialized bulk importer.
We want to use it to import similar order data into Neo4j: customers, orders and contained products.
The tool is located in path/to/neo4j/bin/neo4j-import and is used as follows:
The gist of this passage is that LOAD CSV is not suited to very large data sets and cannot meet our needs, so we need a different import method. Here we choose neo4j-import. The following is an example import:
bin/neo4j-import --into retail.db --id-type string \
--nodes:Customer customers.csv --nodes products.csv \
--nodes orders_header.csv,orders1.csv,orders2.csv \
--relationships:CONTAINS order_details.csv \
--relationships:ORDERED customer_orders_header.csv,orders1.csv,orders2.csv
Data structure of the example:
If you call the neo4j-import script without parameters, it prints a comprehensive help page.
--into retail.db is the target database; the target directory must not contain an existing database.
The repeated --nodes and --relationships parameters each name a group of (possibly split) CSV files for the same entity, i.e. files with the same column structure.
All files in a group are treated as if they were concatenated into one large file. A header row is required, either in the first file of the group or in a separate single-line file; a separate header file can be easier to handle and edit than a multi-gigabyte text file. Compressed files are also supported.
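The separate-header layout described above can be produced with standard shell tools. The following is a minimal sketch, assuming a hypothetical orders.csv whose first line is the header; the file names, sample rows, and chunk size are illustrative and not taken from the original data.

```shell
# Hypothetical sample: a small orders.csv with a header line (assumed data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'orderId:ID,total:INT' 'o1,10' 'o2,20' 'o3,30' 'o4,40' > orders.csv

# Put the header row into its own single-line file.
head -n 1 orders.csv > orders_header.csv

# Split the remaining rows into parts of 2 lines each
# (split names them orders_part_aa, orders_part_ab, ...).
tail -n +2 orders.csv | split -l 2 - orders_part_

# Build the comma-separated file group, header first, for a --nodes argument.
group=$(printf '%s,' orders_header.csv orders_part_*)
group=${group%,}
echo "$group"
```

The resulting string could then be passed as one group, for example `--nodes "$group"`, so the importer reads the header once and treats the parts as one large file.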
- customers.csv is imported directly as nodes with the :Customer label; the properties are taken directly from the file.
- The same goes for products, where the node labels are taken from the :LABEL column.
- Order nodes come from three files: one header file and two content files.
- Line-item relationships of type :CONTAINS are created from order_details.csv, linking orders to the products they contain via their IDs.
- Orders are connected to customers by using the orders CSV files again, this time with a different header that marks the irrelevant columns with :IGNORE.
--id-type string indicates that all :ID columns contain alphanumeric values (there is an optimization for numeric-only IDs).
Column names become the property names of nodes and relationships. Some columns carry extra markup:
- name:ID - global ID column, used to look the node up when reconnecting it later
  - If the property name is left off, the ID is not stored (it is temporary); this is what --id-type refers to
  - If IDs repeat across entities, provide the entity (id-group) in parentheses, e.g. :ID(Order)
  - If your IDs are globally unique, you can leave the id-group off
- :LABEL - label column for nodes; multiple labels can be separated by a delimiter
- :START_ID, :END_ID - relationship-file columns referring to node IDs; with id-groups use e.g. :END_ID(Order)
- :TYPE - relationship type column
- All other columns are treated as properties, but are skipped if empty or annotated with :IGNORE
- Type conversion is possible by appending a suffix to the name, such as :INT, :BOOLEAN, etc.
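To make the column markup concrete, here is a minimal sketch that writes two header files and prints how each column would be interpreted. The column names (orderId, total, date), the Customer and Order id-groups, and the describe_header helper are hypothetical illustrations, not part of the original example data.

```shell
hdr_dir=$(mktemp -d)

# Node header: orderId is stored as a property and is the ID within the Order id-group,
# total is converted to an integer, and :LABEL supplies the node labels.
echo 'orderId:ID(Order),total:INT,:LABEL' > "$hdr_dir/orders_header.csv"

# Relationship header: the start node is looked up in the Customer group and the end node
# in the Order group; the date column is skipped via :IGNORE; :TYPE supplies the type.
echo ':START_ID(Customer),:END_ID(Order),date:IGNORE,:TYPE' > "$hdr_dir/customer_orders_header.csv"

# Hypothetical helper: print "name<TAB>markup" for each comma-separated header field.
# A missing name is shown as "-"; a field without markup is a plain property.
describe_header() {
    tr ',' '\n' < "$1" |
        awk -F: '{ printf "%s\t%s\n", ($1 == "" ? "-" : $1), ($2 == "" ? "(plain property)" : $2) }'
}

describe_header "$hdr_dir/orders_header.csv"
describe_header "$hdr_dir/customer_orders_header.csv"
```

Printing the header this way is only a reading aid: it splits on commas and colons and does not validate the markup against what the importer accepts.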
Importing the call record data
After tidying, our call record data consists of the following CSV files:
- phones.csv: the list of phone numbers, imported as nodes
- phone_header.csv: the header file, containing a single line: phone:ID
- call.csv: the call records, imported as relationships together with their properties. Reading the first row from left to right: 150****0743 called 136****5301 for a total of 125 minutes, averaging 125 minutes per call
- call_header.csv: the header for the call records. Here :START_ID refers to the start node of the relationship and :END_ID refers to its end node
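The four files above might look like the following minimal sketch. Only the phone:ID header and the :START_ID/:END_ID columns come from the description; the phone numbers, the total_minutes and avg_minutes column names, and the check_columns helper are assumptions made for illustration.

```shell
# Hypothetical sample data in a scratch directory.
data_dir=$(mktemp -d)

# Node files: a one-line header plus one phone number per row.
echo 'phone:ID' > "$data_dir/phone_header.csv"
printf '%s\n' '15000000001' '13600000002' '13800000003' > "$data_dir/phones.csv"

# Relationship files: caller, callee, then two assumed integer properties.
echo ':START_ID,:END_ID,total_minutes:INT,avg_minutes:INT' > "$data_dir/call_header.csv"
printf '%s\n' '15000000001,13600000002,125,125' '13800000003,15000000001,40,20' > "$data_dir/call.csv"

# Print the number of data rows whose field count differs from the header's.
# Naive comma splitting: this does not handle quoted fields that contain commas.
check_columns() {
    # $1 = header file, $2 = data file
    want=$(awk -F, '{ print NF; exit }' "$1")
    awk -F, -v want="$want" 'NF != want { bad++ } END { print bad + 0 }' "$2"
}

check_columns "$data_dir/phone_header.csv" "$data_dir/phones.csv"
check_columns "$data_dir/call_header.csv" "$data_dir/call.csv"
```

A non-zero count from check_columns suggests a row that does not line up with its header and is worth inspecting before running the import.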
Once these CSV files are ready, we write a shell script to run the import.
import()
{
    # Import command: Neo4j must be stopped before importing
    neo4j stop
    # Remove the existing database, because the import target must not already exist
    cd /usr/local/Cellar/neo4j/3.5.0/libexec/data/databases || return 1
    rm -rf graph.db
    cd /Documents/归档/data || return 1
    neo4j-admin import \
        --database=graph.db \
        --nodes:phone="../phone_header.csv,phones.csv" \
        --ignore-duplicate-nodes=true \
        --ignore-missing-nodes=true \
        --relationships:call="../call_header.csv,call.csv"
    neo4j start
}
- The target database must not already exist, so we delete the existing database before importing
- Remember to stop Neo4j first
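Because the script passes --ignore-duplicate-nodes and --ignore-missing-nodes, rows with repeated IDs or dangling endpoints are skipped rather than causing the import to fail. It can help to know beforehand how many rows that affects. The following is a rough sketch of such a pre-check; the helper names and the sample data are hypothetical, and it assumes header-less data files with the start and end IDs in the first two comma-separated columns.

```shell
# Count IDs that appear more than once in a node file (one ID per line).
count_duplicate_nodes() {
    sort "$1" | uniq -d | wc -l | tr -d ' '
}

# Count relationship rows whose start or end ID is absent from the node file.
count_missing_endpoints() {
    # $1 = node file (one ID per line), $2 = relationship file (start,end,...)
    awk -F, 'NR == FNR { seen[$1] = 1; next }
             !($1 in seen) || !($2 in seen) { n++ }
             END { print n + 0 }' "$1" "$2"
}

# Hypothetical sample: one duplicated phone number and one call to an unknown number.
check_dir=$(mktemp -d)
printf '%s\n' '15000000001' '13600000002' '15000000001' > "$check_dir/phones.csv"
printf '%s\n' '15000000001,13600000002,125,125' '15000000001,13900000009,10,10' > "$check_dir/call.csv"

echo "duplicate IDs: $(count_duplicate_nodes "$check_dir/phones.csv")"
echo "calls with missing endpoints: $(count_missing_endpoints "$check_dir/phones.csv" "$check_dir/call.csv")"
```

If either count is unexpectedly large, it is probably worth cleaning the CSV files rather than relying on the flags to drop the rows silently.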
Viewing the results
After the import completes, we open the Neo4j Browser at http://localhost:7474/browser/ to check the result.
First, let's look at Database Information.
Here we can see the number of nodes, the number of relationships, and the storage space the database occupies.
Next, let's look at the social circle of one phone number:
match (p:phone{phone:"13825259929"})-[r]->(o) return p,o,r;
When you hover over a node or relationship, its properties appear at the bottom.
Our data import is now complete.
Next, we will use Spring Boot + Neo4j + D3 to display a person's call record circle.