NoSQL Databases: The Neo4j Graph Database (py2neo: a third-party Python library for creating, reading, updating, and deleting data in Neo4j)

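I. Introduction to Neo4j

Neo4j is an open-source NoSQL graph database implemented in Java. Development began in 2003 and the first version was released in 2007; the latest version at the time of writing was 3.3.5. Neo4j is now used by hundreds of thousands of companies and organizations across many industries.

Neo4j stores data natively as a graph, with the guarantees expected of a production database. Unlike ordinary graph-processing libraries or in-memory databases, it provides full database features, including ACID transactions, clustering, backup, and failover, which makes it suitable for enterprise production workloads.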

Example use case: a character relationship graph for the novel Dream of the Red Chamber.

Neo4j tutorial: the Neo4j tutorial on W3CSchool

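Neo4j editions:

  • Enterprise Edition: requires an expensive paid license and provides features such as high availability and hot backup.
  • Community Edition (open source): free to use, but runs only as a single instance.

Core concepts of the Neo4j graph database: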

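Nodes

  • Nodes are the primary data elements. A node is connected to other nodes through relationships and can carry properties, stored as key/value pairs.
  • A node can have one or more labels that describe its role in the graph, for example a Person node.
  • In relational terms, a node is roughly a row, its label is the name of the table the row belongs to, and its properties are the columns.

Relationships

  • A relationship connects two nodes, always has a direction, and can carry properties, stored as key/value pairs.

Properties

  • A property is a named value whose name (key) is a string. Properties can be indexed and constrained, and a composite index can be built from several properties.

Labels

  • Labels group nodes into sets. A node can have several labels, and labels are indexed to speed up finding nodes in the graph.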
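
The four concepts above can be sketched as plain Python data structures. This is only an illustrative model, not how Neo4j stores data; the class names, the sample nodes, and the BUY relationship type are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node: a set of labels plus key/value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection from `start` to `end` with its own properties."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data
alice = Node({"Person"}, {"name": "Alice", "age": 30})
shop = Node({"Company", "Seller"}, {"name": "Acme"})
buys = Relationship(alice, "BUY", shop, {"amount": 120})

# Labels group nodes into sets: collect every node carrying the "Person" label.
people = [n for n in (alice, shop) if "Person" in n.labels]
```

Note that the relationship records its start and end nodes separately, which is what gives it a direction.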

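II. Installing the Neo4j Graph Database

Neo4j depends on Java, so install a JDK first.

Installation steps:

  • Step 1: Import the Neo4j package signing key.
  • Step 2: Add the Neo4j repository to yum's repository list.
  • Step 3: Install Neo4j with yum install.
  • Step 4: Edit the configuration file /etc/neo4j/neo4j.conf.
  • Step 5: Start the Neo4j database.

1. Step 1: Import the Neo4j package signing key

On CentOS, the Neo4j yum repository has to be set up manually.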

(base) [root@whx ~]# cd /tmp
(base) [root@whx tmp]# wget http://debian.neo4j.org/neotechnology.gpg.key
(base) [root@whx tmp]# sudo rpm --import neotechnology.gpg.key
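  • cd /tmp changes to the system's /tmp directory.
  • wget downloads the repository signing key neotechnology.gpg.key into the current directory.
  • sudo rpm --import neotechnology.gpg.key imports that key into the system so that yum can verify the Neo4j packages.

2. Step 2: Use a text editor to create /etc/yum.repos.d/neo4j.repo with the following content: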

(base) [root@whx ~]# vim /etc/yum.repos.d/neo4j.repo
[neo4j] 
name=Neo4j RPM Repository
baseurl=http://yum.neo4j.org/stable
enabled=1
gpgcheck=1

3. Step 3: Install Neo4j with yum.

yum install neo4j-3.3.5

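Neo4j is now installed on CentOS. The installation uses the following paths:

  1. Neo4j installation directory: /usr/share/neo4j
  2. Neo4j configuration directory: /etc/neo4j
  3. Default Neo4j data directory: /var/lib/neo4j

To start the database, change to /usr/share/neo4j/bin and run neo4j start.

4. Step 4: Edit the configuration file

The configuration file is /etc/neo4j/neo4j.conf by default. The settings changed for this setup are shown here for convenience: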

# Locations of the database store, logs, and other directories
dbms.directories.data=/var/lib/neo4j/data
dbms.directories.plugins=/var/lib/neo4j/plugins
dbms.directories.certificates=/var/lib/neo4j/certificates
dbms.directories.logs=/var/log/neo4j
dbms.directories.lib=/usr/share/neo4j/lib
dbms.directories.run=/var/run/neo4j

# Import directory
dbms.directories.import=/var/lib/neo4j/import

# Initial heap size
dbms.memory.heap.initial_size=512m

# Bolt connector address
dbms.connector.bolt.enabled=true
dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=0.0.0.0:7687

The complete configuration file after these changes:

#*****************************************************************
# Neo4j configuration
#
# For more details and a complete list of settings, please see
# https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
#*****************************************************************

# The name of the database to mount
#dbms.active_database=graph.db

# Paths of directories in the installation.
# Locations of the database store, logs, and other directories
dbms.directories.data=/var/lib/neo4j/data
dbms.directories.plugins=/var/lib/neo4j/plugins
dbms.directories.certificates=/var/lib/neo4j/certificates
dbms.directories.logs=/var/log/neo4j
dbms.directories.lib=/usr/share/neo4j/lib
dbms.directories.run=/var/run/neo4j

# This setting constrains all `LOAD CSV` import files to be under the `import` directory. Remove or comment it out to
# allow files to be loaded from anywhere in the filesystem; this introduces possible security problems. See the
# `LOAD CSV` section of the manual for details.
# Import directory
dbms.directories.import=/var/lib/neo4j/import

# Whether requests to Neo4j are authenticated.
# To disable authentication, uncomment this line
#dbms.security.auth_enabled=false

# Enable this to be able to upgrade a store from an older version.
#dbms.allow_upgrade=true

# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size.
dbms.memory.heap.initial_size=512m
#dbms.memory.heap.max_size=10g

# The amount of memory to use for mapping the store files, in bytes (or
# kilobytes with the 'k' suffix, megabytes with 'm' and gigabytes with 'g').
# If Neo4j is running on a dedicated server, then it is generally recommended
# to leave about 2-4 gigabytes for the operating system, give the JVM enough
# heap to hold all your transaction state and query context, and then leave the
# rest for the page cache.
# The default page cache memory assumes the machine is dedicated to running
# Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
#dbms.memory.pagecache.size=10g

#*****************************************************************
# Network connector configuration
#*****************************************************************

# With default configuration Neo4j only accepts local connections.
# To accept non-local connections, uncomment this line:
dbms.connectors.default_listen_address=0.0.0.0

# You can also choose a specific network interface, and configure a non-default
# port for each connector, by setting their individual listen_address.

# The address at which this server can be reached by its clients. This may be the server's IP address or DNS name, or
# it may be the address of a reverse proxy which sits in front of the server. This setting may be overridden for
# individual connectors below.
dbms.connectors.default_advertised_address=0.0.0.0

# You can also choose a specific advertised hostname or IP address, and
# configure an advertised port for each connector, by setting their
# individual advertised_address.

# Bolt connector
# Bolt connector address
dbms.connector.bolt.enabled=true
dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=0.0.0.0:7687

# HTTP Connector. There must be exactly one HTTP connector.
dbms.connector.http.enabled=true
dbms.connector.http.listen_address=0.0.0.0:7474

# HTTPS Connector. There can be zero or one HTTPS connectors.
#dbms.connector.https.enabled=true
#dbms.connector.https.listen_address=:7473

# Number of Neo4j worker threads.
#dbms.threads.worker_count=

#*****************************************************************
# SSL system configuration
#*****************************************************************

# Names of the SSL policies to be used for the respective components.

# The legacy policy is a special policy which is not defined in
# the policy configuration section, but rather derives from
# dbms.directories.certificates and associated files
# (by default: neo4j.key and neo4j.cert). Its use will be deprecated.

# The policies to be used for connectors.
#
# N.B: Note that a connector must be configured to support/require
#      SSL/TLS for the policy to actually be utilized.
#
# see: dbms.connector.*.tls_level

#bolt.ssl_policy=legacy
#https.ssl_policy=legacy

#*****************************************************************
# SSL policy configuration
#*****************************************************************

# Each policy is configured under a separate namespace, e.g.
#    dbms.ssl.policy.<policyname>.*
#
# The example settings below are for a new policy named 'default'.

# The base directory for cryptographic objects. Each policy will by
# default look for its associated objects (keys, certificates, ...)
# under the base directory.
#
# Every such setting can be overridden using a full path to
# the respective object, but every policy will by default look
# for cryptographic objects in its base location.
#
# Mandatory setting

#dbms.ssl.policy.default.base_directory=certificates/default

# Allows the generation of a fresh private key and a self-signed
# certificate if none are found in the expected locations. It is
# recommended to turn this off again after keys have been generated.
#
# Keys should in general be generated and distributed offline
# by a trusted certificate authority (CA) and not by utilizing
# this mode.

#dbms.ssl.policy.default.allow_key_generation=false

# Enabling this makes it so that this policy ignores the contents
# of the trusted_dir and simply resorts to trusting everything.
#
# Use of this mode is discouraged. It would offer encryption but no security.

#dbms.ssl.policy.default.trust_all=false

# The private key for the default SSL policy. By default a file
# named private.key is expected under the base directory of the policy.
# It is mandatory that a key can be found or generated.

#dbms.ssl.policy.default.private_key=

# The public certificate for the default SSL policy. By default a file
# named public.crt is expected under the base directory of the policy.
# It is mandatory that a certificate can be found or generated.

#dbms.ssl.policy.default.public_certificate=

# The certificates of trusted parties. By default a directory named
# 'trusted' is expected under the base directory of the policy. It is
# mandatory to create the directory so that it exists, because it cannot
# be auto-created (for security purposes).
#
# To enforce client authentication client_auth must be set to 'require'!

#dbms.ssl.policy.default.trusted_dir=

# Client authentication setting. Values: none, optional, require
# The default is to require client authentication.
#
# Servers are always authenticated unless explicitly overridden
# using the trust_all setting. In a mutual authentication setup this
# should be kept at the default of require and trusted certificates
# must be installed in the trusted_dir.

#dbms.ssl.policy.default.client_auth=require

# A comma-separated list of allowed TLS versions.
# By default only TLSv1.2 is allowed.

#dbms.ssl.policy.default.tls_versions=

# A comma-separated list of allowed ciphers.
# The default ciphers are the defaults of the JVM platform.

#dbms.ssl.policy.default.ciphers=

#*****************************************************************
# Logging configuration
#*****************************************************************

# To enable HTTP logging, uncomment this line
#dbms.logs.http.enabled=true

# Number of HTTP logs to keep.
#dbms.logs.http.rotation.keep_number=5

# Size of each HTTP log that is kept.
#dbms.logs.http.rotation.size=20m

# To enable GC Logging, uncomment this line
#dbms.logs.gc.enabled=true

# GC Logging Options
# see http://docs.oracle.com/cd/E19957-01/819-0084-10/pt_tuningjava.html#wp57013 for more information.
#dbms.logs.gc.options=-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution

# Number of GC logs to keep.
#dbms.logs.gc.rotation.keep_number=5

# Size of each GC log that is kept.
#dbms.logs.gc.rotation.size=20m

# Size threshold for rotation of the debug log. If set to zero then no rotation will occur. Accepts a binary suffix "k",
# "m" or "g".
#dbms.logs.debug.rotation.size=20m

# Maximum number of history files for the internal log.
#dbms.logs.debug.rotation.keep_number=7

#*****************************************************************
# Miscellaneous configuration
#*****************************************************************

# Enable this to specify a parser other than the default one.
#cypher.default_language_version=3.0

# Determines if Cypher will allow using file URLs when loading data using
# `LOAD CSV`. Setting this value to `false` will cause Neo4j to fail `LOAD CSV`
# clauses that load data from the file system.
#dbms.security.allow_csv_import_from_file_urls=true


# Value of the Access-Control-Allow-Origin header sent over any HTTP or HTTPS
# connector. This defaults to '*', which allows broadest compatibility. Note
# that any URI provided here limits HTTP/HTTPS access to that URI only.
#dbms.security.http_access_control_allow_origin=*

# Value of the HTTP Strict-Transport-Security (HSTS) response header. This header
# tells browsers that a webpage should only be accessed using HTTPS instead of HTTP.
# It is attached to every HTTPS response. Setting is not set by default so
# 'Strict-Transport-Security' header is not sent. Value is expected to contain
# directives like 'max-age', 'includeSubDomains' and 'preload'.
#dbms.security.http_strict_transport_security=

# Retention policy for transaction logs needed to perform recovery and backups.
dbms.tx_log.rotation.retention_policy=1 days

# Enable a remote shell server which Neo4j Shell clients can log in to.
#dbms.shell.enabled=true
# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces).
#dbms.shell.host=127.0.0.1
# The port the shell will listen on, default is 1337.
#dbms.shell.port=1337

# Only allow read operations from this Neo4j instance. This mode still requires
# write access to the directory for lock purposes.
#dbms.read_only=false

# Comma separated list of JAX-RS packages containing JAX-RS resources, one
# package name for each mountpoint. The listed package names will be loaded
# under the mountpoints specified. Uncomment this line to mount the
# org.neo4j.examples.server.unmanaged.HelloWorldResource.java from
# neo4j-server-examples under /examples/unmanaged, resulting in a final URL of
# http://localhost:7474/examples/unmanaged/helloworld/{nodeId}
#dbms.unmanaged_extension_classes=org.neo4j.examples.server.unmanaged=/examples/unmanaged

#********************************************************************
# JVM Parameters
#********************************************************************

# G1GC generally strikes a good balance between throughput and tail
# latency, without too much tuning.
dbms.jvm.additional=-XX:+UseG1GC

# Have common exceptions keep producing stack traces, so they can be
# debugged regardless of how often logs are rotated.
dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow

# Make sure that `initmemory` is not only allocated, but committed to
# the process, before starting the database. This reduces memory
# fragmentation, increasing the effectiveness of transparent huge
# pages. It also reduces the possibility of seeing performance drop
# due to heap-growing GC events, where a decrease in available page
# cache leads to an increase in mean IO response time.
# Try reducing the heap memory, if this flag degrades performance.
dbms.jvm.additional=-XX:+AlwaysPreTouch

# Trust that non-static final fields are really final.
# This allows more optimizations and improves overall performance.
# NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
# serialization to change the value of final fields!
dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
dbms.jvm.additional=-XX:+TrustFinalNonStaticFields

# Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
dbms.jvm.additional=-XX:+DisableExplicitGC

# Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
# jmx.password files are required.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
#     http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access

# Some systems cannot discover host name automatically, and need this line configured:
#dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME

# Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
# This is to protect the server from any potential passive eavesdropping.
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

# This mitigates a DDoS vector.
dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true

#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
#  using this configuration file has been installed as a service.
#  Please uninstall the service before modifying this section.  The
#  service can then be reinstalled.

# Name of the service
dbms.windows_service_name=neo4j

#********************************************************************
# Other Neo4j system properties
#********************************************************************
dbms.jvm.additional=-Dunsupported.dbms.udc.source=rpm


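5. Step 5: Start the Neo4j database

Start the graph database and check its status:
neo4j start
neo4j status

Output like the following means the database started successfully: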

(base) [root@whx ~]# neo4j start
Active database: graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /etc/neo4j
  logs:         /var/log/neo4j/
  plugins:      /var/lib/neo4j/plugins
  import:       /var/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/run/neo4j
Starting Neo4j.
Started neo4j (pid 5246). It is available at http://0.0.0.0:7474/
There may be a short delay until the server is ready.
See /var/log/neo4j//neo4j.log for current status.
(base) [root@whx ~]# neo4j status
Neo4j is running at pid 5246
(base) [root@whx ~]# 

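6. Accessing the Neo4j Browser remotely at x.x.x.x:7474/browser

In neo4j.conf, make sure the connectors listen on all interfaces:

dbms.connector.bolt.listen_address=0.0.0.0:7687
dbms.connector.http.listen_address=0.0.0.0:7474

Open the corresponding ports in the firewall:

firewall-cmd --zone=public --permanent --add-port=7474/tcp
firewall-cmd --zone=public --permanent --add-port=7687/tcp
firewall-cmd --reload      # required for the permanent rules to take effect
firewall-cmd --list-ports  # verify that the ports are open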
(base) [root@whx ~]# firewall-cmd --list-ports
20/tcp 21/tcp 22/tcp 80/tcp 8888/tcp 39000-40000/tcp 888/tcp 7474/tcp 7687/tcp
(base) [root@whx ~]# 
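
firewall-cmd --list-ports only shows the server's own view. To confirm from a client machine that ports 7474 and 7687 are reachable, a small TCP check can help. This is a minimal sketch using only the Python standard library; the helper name port_open is made up for the example.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_open("<server IP address>", 7474) and port_open("<server IP address>", 7687) should both return True once Neo4j is running and the firewall rules are in place.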

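Enter http://<server IP address>:7474/browser/ in your browser's address bar to open the Neo4j Browser.

7. Step 6: Logging in to the Neo4j Browser

  • URL: http://0.0.0.0:7474 (use the server's IP address in place of 0.0.0.0 when connecting from another machine)
  • Connect URL: bolt://0.0.0.0:7687
  • Username: neo4j
  • Password: neo4j (default)


  • Section summary:

    • Installing the Neo4j graph database:
      • Step 1: Import the Neo4j package signing key.
      • Step 2: Add the Neo4j repository to yum's repository list.
      • Step 3: Install Neo4j with yum install.
      • Step 4: Edit the configuration file /etc/neo4j/neo4j.conf.
      • Step 5: Start the Neo4j database.

    • Logging in to the Neo4j Browser:
      • URL: http://0.0.0.0:7474
      • Connect URL: bolt://0.0.0.0:7687
      • Username: neo4j
      • Password: neo4j (default)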

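III. Introduction to Cypher and Its Usage

Cypher basics: Cypher is the query language of the Neo4j graph database. It plays the role that SQL plays for MySQL, but it is designed for expressive and efficient querying and updating of graphs.

Basic Cypher commands and syntax:

  • CREATE
  • MATCH
  • MERGE
  • relationships
  • WHERE
  • DELETE
  • sorting (ORDER BY)
  • string functions
  • aggregate functions
  • indexes

1. CREATE: creating nodes in the graph

1.1 CREATE syntax, form one

CREATE (e:Employee{id:222, name:'Bob', salary:6000, deptno:12})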

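  • CREATE is the keyword, e is the variable name for the node (a node corresponds to one row of a MySQL table), and Employee is the node label (corresponding to a MySQL table). e and Employee go inside the parentheses (), separated by a colon.

  • The node's properties (corresponding to MySQL column names) go inside braces {} as name: value pairs, separated by commas.

  • The command above creates a node e with the label Employee and four properties: id, name, salary, and deptno.

  • The node variable e is a temporary variable that exists only within the current statement; the label Employee is what is actually stored in the graph database.

  • If the statement does nothing further with the node, the variable e can be omitted, as in this CREATE (the label 程序员 means "programmer"):

    create(:程序员 {name:"小东",age:23,birthday:"1995/12/06"})
    >>Added 1 label, created 1 node, set 3 properties, completed after 21 ms.
    
  • To operate on a matched or newly created node, you must give it a variable name, because you need a handle on the node in order to act on it. For example, to return the result of a search: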

    match(person:程序员) where person.name="小东" return person
    ╒══════════════════════════════════════════════╕
    │"person"                                      │
    ╞══════════════════════════════════════════════╡
    │{"birthday":"1995/12/06","name":"小东","age":23}│
    └──────────────────────────────────────────────┘
    
    match(p1:程序员) where p1.name="小东" return p1
    ╒══════════════════════════════════════════════╕
    │"p1"                                          │
    ╞══════════════════════════════════════════════╡
    │{"birthday":"1995/12/06","name":"小东","age":23}│
    └──────────────────────────────────────────────┘
    

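As these examples show, the name you give a node variable does not matter. It works like a variable name in Python: it refers to an object that you can then operate on.

The node variable is therefore optional. You need it only when the statement operates on the node it refers to.

Labels are also optional in Cypher syntax, but you should almost always write them: labels are how Neo4j categorizes nodes, and searches are normally scoped by label.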
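
When CREATE statements are generated from Python, it is usually safer to pass property values as query parameters than to paste them into the query text. The following is a minimal sketch of that idea; the helper build_create is hypothetical and not part of any Neo4j library. It only builds the query string and the parameter dictionary, which could then be handed to a driver's run method.

```python
import re


def build_create(label, properties):
    """Return (cypher, params) for creating one node with the given label.

    Property values travel as $parameters. The label and property names are
    interpolated into the query text, so they are validated as plain
    identifiers first.
    """
    for name in [label, *properties]:
        if not re.fullmatch(r"\w+", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    assignments = ", ".join(f"{key}: ${key}" for key in properties)
    cypher = f"CREATE (n:{label} {{{assignments}}}) RETURN n"
    return cypher, dict(properties)


cypher, params = build_create(
    "Employee", {"id": 222, "name": "Bob", "salary": 6000, "deptno": 12}
)
```

Here the node variable n plays the same role as e in the examples above: it exists only so that RETURN can refer to the created node.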

1.2 CREATE syntax, form two

CREATE (e:Employee) set e.id=222, e.name='Bob', e.salary=6000, e.deptno=12

For example, to create the node and return it:

CREATE (e:Employee)  set e.id=222, e.name='Bob', e.salary=6000, e.deptno=12 return e

2. MATCH: matching (querying) existing data

# MATCH is used for queries. The pattern variable:Label again goes inside parentheses, and RETURN returns the result, much like SQL.
MATCH (e:Employee) RETURN e.id, e.name, e.salary, e.deptno

3. MERGE: if a matching node exists, MERGE behaves like MATCH; if not, it behaves like CREATE.

MERGE (e:Employee {id:146, name:'Lucer', salary:3500, deptno:16})

Running the same MERGE a second time does not add anything to the database: the data already exists, so MERGE simply matches it.

MERGE (e:Employee {id:146, name:'Lucer', salary:3500, deptno:16})
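
The match-or-create behaviour of MERGE can be pictured with a small in-memory analogy. This sketch is not how Neo4j implements MERGE; the list-based store and the function merge_node are made up purely to show the semantics.

```python
def merge_node(store, label, **properties):
    """Return (node, created): reuse a node with this label and these property values, else add one."""
    for node in store:
        same_label = node["label"] == label
        same_props = all(node["properties"].get(k) == v for k, v in properties.items())
        if same_label and same_props:
            return node, False  # behaves like MATCH
    node = {"label": label, "properties": dict(properties)}
    store.append(node)
    return node, True  # behaves like CREATE


store = []
first, created_first = merge_node(store, "Employee", id=146, name="Lucer", salary=3500, deptno=16)
again, created_again = merge_node(store, "Employee", id=146, name="Lucer", salary=3500, deptno=16)
```

The second call finds the node added by the first, so the store still holds exactly one node.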

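4. Creating relationships with CREATE

CREATE requires a directed relationship; omitting the direction is an error.

# Create a directed relationship r of type Buy from node p1 to node p2, meaning that p1 bought p2
CREATE (p1:Profile1)-[r:Buy]->(p2:Profile2)

5. Creating relationships with MERGE

MERGE accepts a pattern with or without a direction.

# Merge a relationship r of type miss between p1 and p2 without specifying a direction.
# Stored relationships always have a direction: the undirected pattern matches either direction,
# and if no such relationship exists, Neo4j chooses the direction of the one it creates.
MERGE (p1:Profile1)-[r:miss]-(p2:Profile2)

6. WHERE: adds filter conditions to a query, as in SQL

# Find the Employee node whose id equals 123
MATCH (e:Employee) WHERE e.id=123 RETURN e

7. DELETE: deletes nodes and relationships together with their properties

# Note: when deleting nodes, delete the relationships attached to them as well
MATCH (c1:CreditCard)-[r]-(c2:Customer) DELETE c1, r, c2

match(t:Teacher) delete t

match(s:Student)-[r]-(t:Teacher) delete r,s,t

Deleting a node that still has relationships raises an error. DETACH DELETE removes the node together with its relationships:

match(t:Teacher) detach delete t

To empty the database quickly: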

MATCH (n)  DETACH DELETE n
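
The difference between DELETE and DETACH DELETE can be illustrated with a toy in-memory graph. This is only an analogy under simplified assumptions; the class TinyGraph and the LEARNS_FROM relationship type are invented for the example and have nothing to do with Neo4j's internals.

```python
class TinyGraph:
    """In-memory stand-in for a graph; nodes are names, relationships are (start, type, end)."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()

    def delete(self, node):
        """Like DELETE: refuse to remove a node that still has relationships."""
        if any(node in (start, end) for start, _, end in self.relationships):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        """Like DETACH DELETE: drop the node's relationships first, then the node."""
        self.relationships = {r for r in self.relationships if node not in (r[0], r[2])}
        self.nodes.discard(node)


g = TinyGraph()
g.nodes |= {"teacher", "student"}
g.relationships.add(("student", "LEARNS_FROM", "teacher"))

try:
    g.delete("teacher")
    delete_failed = False
except ValueError:
    delete_failed = True

g.detach_delete("teacher")
```

The plain delete fails because the teacher node is still connected, while detach_delete removes the relationship and the node in one step.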

8. Sorting: Cypher sorts with ORDER BY

# Match the label Employee and return all results sorted by id in ascending order
MATCH (e:Employee) RETURN e.id, e.name, e.salary, e.deptno ORDER BY e.id

# For descending order, append DESC, for example ORDER BY e.salary DESC
MATCH (e:Employee) RETURN e.id, e.name, e.salary, e.deptno ORDER BY e.salary DESC

9. String functions

  • toUpper()
  • toLower()
  • substring()
  • replace()

9.1 toUpper()

Converts the input string to upper case.

MATCH (e:Employee) RETURN e.id, toUpper(e.name), e.salary, e.deptno

9.2 toLower()

Converts the input string to lower case.

MATCH (e:Employee) RETURN e.id, toLower(e.name), e.salary, e.deptno

9.3 substring()

Returns a substring.

# Returns the part of input_str that starts at the zero-based index start_index and is length characters long
substring(input_str, start_index, length)

# Example: return the first two letters of each employee's name
MATCH (e:Employee) RETURN e.id, substring(e.name,0,2), e.salary, e.deptno

9.4 replace()

Replaces occurrences of a substring.

# Replaces every part of input_str that matches origin_str with new_str
replace(input_str, origin_str, new_str)

# Example: append the suffix _HelloWorld to each employee's name
MATCH (e:Employee) RETURN e.id, replace(e.name,e.name,e.name + "_HelloWorld"), e.salary, e.deptno
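
For readers who know Python better than Cypher, the four string functions map onto familiar string operations. The sketch below mirrors them on a sample name; cypher_substring is a made-up helper that assumes a non-negative start index and treats the third argument as a length, as Cypher's substring() does.

```python
def cypher_substring(text, start, length=None):
    """Mirror Cypher's substring(original, start, length): the third argument is a length."""
    if length is None:
        return text[start:]
    return text[start:start + length]


name = "Lucer"

upper = name.upper()                                   # toUpper(e.name)
lower = name.lower()                                   # toLower(e.name)
first_two = cypher_substring(name, 0, 2)               # substring(e.name, 0, 2)
middle = cypher_substring(name, 1, 3)                  # substring(e.name, 1, 3)
suffixed = name.replace(name, name + "_HelloWorld")    # replace(e.name, e.name, e.name + "_HelloWorld")
```

The middle example shows why the distinction matters: starting at index 1 with length 3 yields three characters, not the characters between index 1 and index 3.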

10. Aggregate functions

  • count()
  • max()
  • min()
  • sum()
  • avg()

10.1 count()

Returns the number of rows matched by MATCH.

# Return the number of nodes with the label Employee
MATCH (e:Employee) RETURN count( * )

10.2 max()

Returns the largest value among the rows matched by MATCH.

# Return the highest salary among all Employee nodes
MATCH (e:Employee) RETURN max(e.salary)

10.3 min()

Returns the smallest value among the rows matched by MATCH.

# Return the lowest salary among all Employee nodes
MATCH (e:Employee) RETURN min(e.salary)

10.4 sum()

Returns the sum of a property over the rows matched by MATCH.

# Return the total salary of all Employee nodes
MATCH (e:Employee) RETURN sum(e.salary)

10.5 avg()

Returns the average of a property over the rows matched by MATCH.

# Return the average salary of all Employee nodes
MATCH (e:Employee) RETURN avg(e.salary)
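
The five aggregates behave like their everyday Python counterparts applied to the matched rows. The sketch below uses a made-up list of employee records purely for illustration; the values are not taken from the database in this article.

```python
# Hypothetical rows standing in for the result of MATCH (e:Employee)
employees = [
    {"id": 123, "name": "Bob", "salary": 6000},
    {"id": 145, "name": "Lucy", "salary": 7500},
    {"id": 146, "name": "Lucer", "salary": 3500},
]

salaries = [e["salary"] for e in employees]

count = len(employees)      # count(*)
highest = max(salaries)     # max(e.salary)
lowest = min(salaries)      # min(e.salary)
total = sum(salaries)       # sum(e.salary)
average = total / count     # avg(e.salary)
```

In Cypher the database computes these values on the server, so only the single aggregated row travels back to the client.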

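11. Indexes

Neo4j supports indexes on node and relationship properties to improve query performance.

An index can be created on a property for all nodes that share a label.


11.1 Creating an index

Use CREATE INDEX ON to create an index.

# Create an index on the id property of nodes labeled Employee
CREATE INDEX ON:Employee(id)

11.2 Dropping an index

Use DROP INDEX ON to drop an index.

# Drop the index on the id property of nodes labeled Employee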
DROP INDEX ON:Employee(id)
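
Why an index speeds up lookups can be shown with a rough Python analogy: without an index every row has to be inspected, while with one the matching row is found directly. Neo4j's real indexes are persistent structures maintained by the database, so the dictionary below is only a mental model, and the sample rows are made up.

```python
# Hypothetical rows standing in for Employee nodes
employees = [
    {"id": 123, "name": "Bob"},
    {"id": 145, "name": "Lucy"},
    {"id": 146, "name": "Lucer"},
]


def find_by_scan(rows, wanted_id):
    """Without an index: check every row until one matches."""
    for row in rows:
        if row["id"] == wanted_id:
            return row
    return None


# With an index: build the id -> row lookup once, then each query is a dictionary access.
id_index = {row["id"]: row for row in employees}

scanned = find_by_scan(employees, 146)
indexed = id_index.get(146)
```

Both lookups return the same row; the difference is how much work is needed to find it as the number of rows grows.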

Section summary:

Cypher basics: Cypher is the query language of the Neo4j graph database. It plays the role that SQL plays for MySQL, but it is designed for expressive and efficient querying and updating of graphs.

Basic Cypher commands and syntax:
* CREATE
* MATCH
* MERGE
* relationships
* WHERE
* DELETE
* sorting (ORDER BY)
* string functions
* aggregate functions
* indexes


  • CREATE: creates nodes in the graph.
    • CREATE (e:Employee{id:222, name:'Bob', salary:6000, deptno:12})

  • MATCH: matches (queries) existing data.
    • MATCH (e:Employee) RETURN e.id, e.name, e.salary, e.deptno

  • MERGE: if a matching node exists, behaves like MATCH; otherwise behaves like CREATE.
    • MERGE (e:Employee {id:145, name:'Lucy', salary:7500, deptno:12})

  • Creating relationships with CREATE: the relationship must have a direction, otherwise an error is raised.
    • CREATE (p1:Profile1)-[r:Buy]->(p2:Profile2)

  • Creating relationships with MERGE: the pattern may be written with or without a direction.
    • MERGE (p1:Profile1)-[r:miss]-(p2:Profile2)

  • WHERE: adds filter conditions to a query, as in SQL.
    • MATCH (e:Employee) WHERE e.id=123 RETURN e

  • DELETE: deletes nodes and relationships together with their properties.
    • MATCH (c1:CreditCard)-[r]-(c2:Customer) DELETE c1, r, c2

  • Sorting: Cypher sorts with ORDER BY.
    • MATCH (e:Employee) RETURN e.id, e.name, e.salary, e.deptno ORDER BY e.id

  • String functions:
    • toUpper()
    • toLower()
    • substring()
    • replace()

  • toUpper(): converts the input string to upper case.
    • MATCH (e:Employee) RETURN e.id, toUpper(e.name), e.salary, e.deptno

  • toLower(): converts the input string to lower case.
    • MATCH (e:Employee) RETURN e.id, toLower(e.name), e.salary, e.deptno

  • substring(): returns a substring.
    • MATCH (e:Employee) RETURN e.id, substring(e.name,0,2), e.salary, e.deptno

  • replace(): replaces occurrences of a substring.
    • MATCH (e:Employee) RETURN e.id, replace(e.name,e.name,e.name + "_HelloWorld"), e.salary, e.deptno

  • Aggregate functions:
    • count()
    • max()
    • min()
    • sum()
    • avg()

  • count(): returns the number of rows matched by MATCH.
    • MATCH (e:Employee) RETURN count( * )

  • max(): returns the largest value among the matched rows.
    • MATCH (e:Employee) RETURN max(e.salary)

  • min(): returns the smallest value among the matched rows.
    • MATCH (e:Employee) RETURN min(e.salary)

  • sum(): returns the sum of a property over the matched rows.
    • MATCH (e:Employee) RETURN sum(e.salary)

  • avg(): returns the average of a property over the matched rows.
    • MATCH (e:Employee) RETURN avg(e.salary)

  • Indexes
    • Neo4j supports indexes on node and relationship properties to improve query performance.
    • An index can be created on a property for all nodes that share a label.

  • Creating an index: use CREATE INDEX ON.
    • CREATE INDEX ON:Employee(id)

  • Dropping an index: use DROP INDEX ON.
    • DROP INDEX ON:Employee(id)

IV. Using Neo4j from Python

About neo4j-driver: neo4j-driver is a Python package that serves as the Neo4j driver for Python, making it easier to work with the graph database from Python programs.

1. Installing neo4j-driver:

pip install neo4j-driver

2. Using neo4j-driver:

from neo4j import GraphDatabase

# The Neo4j connection settings (address, username, password) live in config.py in the same directory
from config import NEO4J_CONFIG

driver = GraphDatabase.driver(**NEO4J_CONFIG)

# Create a Company node from Python and return its name
with driver.session() as session:
    cypher = "CREATE(c:Company) SET c.name='黑马程序员' RETURN c.name"
    record = session.run(cypher)
    # Each record holds one row; take its first (and only) column
    result = [row[0] for row in record]
    print("result:", result)

Output:

result: ['黑马程序员']
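
The article does not show config.py. A plausible minimal version, assuming NEO4J_CONFIG simply holds the keyword arguments passed to GraphDatabase.driver, might look like the sketch below. The URI and the password are placeholders chosen for illustration, not values from the original setup.

```python
# config.py (hypothetical contents; adjust the URI and credentials to your server)
NEO4J_CONFIG = {
    "uri": "bolt://localhost:7687",      # Bolt address of the Neo4j server
    "auth": ("neo4j", "your-password"),  # (username, password) tuple
}
```

Keeping the connection details in a separate module means scripts can share them and credentials stay out of the main code.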

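3. Transactions

A transaction is a group of database operations that either all take effect or none do. Transactions are what guarantee database consistency.

Using a transaction:

def _some_operations(tx, cat_name, mouse_name):
    # The three clauses are concatenated into one query, so each line ends with a space.
    tx.run("MERGE (a:Cat{name: $cat_name}) "
           "MERGE (b:Mouse{name: $mouse_name}) "
           "MERGE (a)-[r:And]-(b)",
           cat_name=cat_name, mouse_name=mouse_name)


with driver.session() as session:
    # write_transaction runs the function inside a single write transaction
    session.write_transaction(_some_operations, "Tom", "Jerry")

Result: the graph now contains a Cat node named Tom and a Mouse node named Jerry, connected by an And relationship.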

V. Connecting Python to Neo4j with py2neo

# -*- coding: utf-8 -*-
from py2neo import Node, Graph, Relationship, NodeMatcher


# Loads data extracted from an Excel file into Neo4j
class DataToNeo4j(object):
    def __init__(self):
        link = Graph("http://localhost:7474", username="neo4j", password="123456")  # connect
        self.graph = link
        self.buy = 'buy'    # label for buyer nodes
        self.sell = 'sell'  # label for seller nodes
        self.graph.delete_all()  # WARNING: wipes the whole database
        self.matcher = NodeMatcher(link)    # used to look up existing nodes

        """ Example
        node3 = Node('animal' , name = 'cat')
        node4 = Node('animal' , name = 'dog')
        node2 = Node('Person' , name = 'Alice')
        node1 = Node('Person' , name = 'Bob')
        r1 = Relationship(node2 , 'know' , node1)
        r2 = Relationship(node1 , 'know' , node3)
        r3 = Relationship(node2 , 'has' , node3)
        r4 = Relationship(node4 , 'has' , node2)
        self.graph.create(node1)
        self.graph.create(node2)
        self.graph.create(node3)
        self.graph.create(node4)
        self.graph.create(r1)
        self.graph.create(r2)
        self.graph.create(r3)
        self.graph.create(r4)
        """

    # Create the nodes
    def create_node(self, node_buy_key, node_sell_key):
        for name in node_buy_key:
            buy_node = Node(self.buy, name=name)
            self.graph.create(buy_node)
        for name in node_sell_key:
            sell_node = Node(self.sell, name=name)
            self.graph.create(sell_node)

    # Create the relationships
    def create_relation(self, df_data):
        for m in range(len(df_data)):
            try:
                # Look up both endpoints by label and name; first() returns None if nothing matches
                buy_node = self.matcher.match(self.buy, name=df_data['buy'][m]).first()
                sell_node = self.matcher.match(self.sell, name=df_data['sell'][m]).first()
                print(buy_node, sell_node)
                # The invoice amount is used as the relationship type
                relation = Relationship(buy_node, df_data['money'][m], sell_node)
                self.graph.create(relation)
            except AttributeError as e:
                print(e, m)
            
# -*- coding: utf-8 -*-
from dataToNeo4jClass.DataToNeo4jClass import DataToNeo4j
import pandas as pd
# pip install py2neo==5.0b1  (the version matters, otherwise the API will not match; see the docs at https://py2neo.org/v4/index.html)

invoice_data = pd.read_excel('./Invoice_data_Demo.xls', header=0)
print("invoice_data = {}".format(invoice_data))


# Extract the node data from the Excel file
def data_extraction():
    # The column names are the spreadsheet's Chinese headers: buyer name and seller name
    node_buy_key = list(set(invoice_data['购买方名称']))   # unique buyer names
    node_sell_key = list(set(invoice_data['销售方名称']))  # unique seller names
    return node_buy_key, node_sell_key


# Extract the relationship data from the Excel file
def relation_extraction():
    # Convert every value to a string so names and amounts are handled uniformly
    links_dict = {
        'buy': [str(v) for v in invoice_data[invoice_data.columns[6]]],     # buyer name
        'money': [str(v) for v in invoice_data[invoice_data.columns[19]]],  # amount
        'sell': [str(v) for v in invoice_data[invoice_data.columns[10]]],   # seller name
    }
    df_data = pd.DataFrame(links_dict)  # combine the three lists into a DataFrame
    print("df_data= {}".format(df_data))
    return df_data


create_data = DataToNeo4j()
node_buy_key, node_sell_key = data_extraction()
create_data.create_node(node_buy_key, node_sell_key)
create_data.create_relation(relation_extraction())


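VI. Common Neo4j Problems

1. If the Neo4j Browser is already open in a web browser, starting Neo4j on the server fails with the error below. The message itself says that another process holds the lock on the store: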

Store and its lock file has been locked by another process: /var/lib/neo4j/data/databases/graph.db/store_lock

Starting Neo4j failed: Component ‘org.neo4j.server.database.LifecycleManagingDatabase@1458ed9c’ was successfully initialized, but failed to start. Please see the attached cause exception “Externally locked: /var/lib/neo4j/data/databases/graph.db/neostore”.


Reposted from blog.csdn.net/u013250861/article/details/115057564