Fraud Detection with Neo4j

This example comes from Chapter 7 of the book Neo4j权威指南.

The node labels in the graph are:

    Account holder: AccountHolder

    Address: Address

    Phone number: PhoneNumber

    Credit card account: CreditCard

    Social security number: SSN

    Bank account: BankAccount

    Unsecured loan: UnsecureLoan

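Every relationship in the graph points from an account holder to one of the other nodes and is named after what the holder has, for example HAS_SSN.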

The Cypher statement that builds the graph is:

// Create three account holders
create (accountHolder1:AccountHolder{
FirstName:"John",
LastName:"Doe",
UniqueId:"JohnDoe"
})

create (accountHolder2:AccountHolder{
FirstName:"Jane",
LastName:"Appleseed",
UniqueId:"JaneAppleseed"
})

create (accountHolder3:AccountHolder{
FirstName:"Matt",
LastName:"Smith",
UniqueId:"MattSmith"
})

// Create an address
create (address1:Address{
Street:"123 NW 1st Street",
City:"San Francisco",
State:"California",
ZipCode:"94101"
})

// Link account holders 1, 2, and 3 to the address
create (accountHolder1)-[:HAS_ADDRESS]->(address1),
		(accountHolder2)-[:HAS_ADDRESS]->(address1),
		(accountHolder3)-[:HAS_ADDRESS]->(address1)
		
// Create phone number 1
create (phoneNumber1:PhoneNumber{
PhoneNumber:"555-555-555"
})

// Link account holders 1 and 2 to phone number 1
create (accountHolder1)-[:HAS_PHONENUMBER]->(phoneNumber1),
		(accountHolder2)-[:HAS_PHONENUMBER]->(phoneNumber1)


// Create social security number SSN1
create (ssn1:SSN{
SSN:"241-23-1234"
})
// Link account holders 2 and 3 to SSN1
create (accountHolder2)-[:HAS_SSN]->(ssn1),
		(accountHolder3)-[:HAS_SSN]->(ssn1)

// Create social security number SSN2 and link it to account holder 1
create (ssn2:SSN{SSN:"241-23-4567"})<-[:HAS_SSN]-(accountHolder1)

// Create credit card 1 and link it to account holder 1
create (creditCard1:CreditCard{
AccountNumber:"1234567890123456",
Limit:5000,
Balance:1442.23,
ExpirationDate:'02-20',
SecurityCode:'456'
})<-[:HAS_CREDITCARD]-(accountHolder1)

// Create credit card 2 and link it to account holder 2
create (creditCard2:CreditCard{
AccountNumber:"2345678901234567",
Limit:4000,
Balance:2345.56,
ExpirationDate:'02-20',
SecurityCode:'456'
})<-[:HAS_CREDITCARD]-(accountHolder2)

// Create bank account 1 and link it to account holder 1
create (bankAccount1:BankAccount{
AccountNumber:"2345678901234567",
Balance:7054.43
})<-[:HAS_BANKACCOUNT]-(accountHolder1)

// Create bank account 2 and link it to account holder 2
create (bankAccount2:BankAccount{
AccountNumber:"3456789012345678",
Balance:4231.12
})<-[:HAS_BANKACCOUNT]-(accountHolder2)

// Create bank account 3 and link it to account holder 3
create (bankAccount3:BankAccount{
AccountNumber:"4567890123456789",
Balance:12345.45
})<-[:HAS_BANKACCOUNT]-(accountHolder3)

// Create unsecured loan 2 and link it to account holder 2
create (unsecuredLoan2:UnsecureLoan{
AccountNumber:'4567890123456789-0',
Balance:90453,
APR:0.0541,
LoanAmount:12000
})<-[:HAS_UNSECUREDLOAN]-(accountHolder2)

// Create unsecured loan 3 and link it to account holder 3
create (unsecuredLoan3:UnsecureLoan{
AccountNumber:'5678901234567890-0',
Balance:16341.95,
APR:0.0341,
LoanAmount:22000
})<-[:HAS_UNSECUREDLOAN]-(accountHolder3)

// Create phone number 2 and link it to account holder 3
create (phoneNumber2:PhoneNumber{
PhoneNumber:'555-555-1234'
})<-[:HAS_PHONENUMBER]-(accountHolder3)
return *
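
Before running the fraud queries it can help to check that the graph was built as intended. The following is a minimal sketch, assuming the statement above has been run against an otherwise empty database; the aliases Holder, Relationship, and InfoLabels are names chosen here for illustration and do not come from the book.

```cypher
// List every piece of information attached to each account holder,
// so that shared nodes and relationship types can be checked by eye.
MATCH (holder:AccountHolder)-[r]->(info)
RETURN holder.UniqueId AS Holder,
       type(r)         AS Relationship,
       labels(info)    AS InfoLabels
ORDER BY Holder, Relationship
```

Each row is one outgoing relationship of an account holder, so a holder with a missing or duplicated link stands out immediately.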

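Fraudsters and fraud rings can be found by looking for account holders who share the same piece of account information. The queries are: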

// The version from the book
match (accountHolder:AccountHolder)-[]->(contractInformation)
with contractInformation,count(accountHolder) as RingSize
match (contractInformation)<-[]-(accountHolder)
with collect(accountHolder.UniqueId) as AccountHolders,
	contractInformation,RingSize
where RingSize>1
return AccountHolders as FraudRing,
		labels(contractInformation) as ContractType,
		RingSize
order by RingSize desc



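// A more concise version
match (accountHolder:AccountHolder)-[]->(contractInformation)
// contractInformation is the only non-aggregated expression in the with clause,
// so it acts as the grouping key, like GROUP BY contractInformation in SQL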
with contractInformation,count(accountHolder) as RingSize,collect(accountHolder.UniqueId) as AccountHolders
where RingSize>1
return AccountHolders as FraudRing,
		labels(contractInformation) as ContractType,
		RingSize
order by RingSize desc

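On the data above, the query returns three rows: the shared Address node with all three account holders (RingSize 3), the shared PhoneNumber node with JohnDoe and JaneAppleseed (RingSize 2), and the shared SSN node with JaneAppleseed and MattSmith (RingSize 2).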

For the record, here is a query that produces the wrong result:

match (accountHolder:AccountHolder)-[]->(contractInformation)
// Grouping by the labels of contractInformation is wrong here:
// different contractInformation nodes can carry the same label, e.g. ssn1 and ssn2
with count(accountHolder) as RingSize,labels(contractInformation) as ContractType,collect(accountHolder.UniqueId) as AccountHolders
where RingSize>1
return RingSize,ContractType,AccountHolders
order by RingSize desc
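
One way to see why grouping by label goes wrong is to count how many distinct nodes end up in each label group. The query below is a diagnostic sketch, not part of the book's example; the alias DistinctNodes is a name introduced here.

```cypher
// For each label group, count the distinct nodes that were merged together.
// Any group with DistinctNodes > 1 mixes unrelated nodes into one "ring".
MATCH (accountHolder:AccountHolder)-[]->(contractInformation)
WITH labels(contractInformation)         AS ContractType,
     count(DISTINCT contractInformation) AS DistinctNodes,
     count(accountHolder)                AS RingSize
WHERE DistinctNodes > 1
RETURN ContractType, DistinctNodes, RingSize
ORDER BY RingSize DESC
```

On the sample graph the SSN group, for instance, combines ssn1 and ssn2, so JohnDoe is reported in the same ring as JaneAppleseed and MattSmith even though he shares no SSN with them.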

Computing the financial risk exposure

// Compute the financial risk exposure of each ring
match (accountHolder:AccountHolder)-[]->(contractInformation)
with contractInformation,count(accountHolder) as RingSize
match (contractInformation)<-[]-(accountHolder),
		(accountHolder)-[r:HAS_CREDITCARD|HAS_UNSECUREDLOAN]->(unsecuredAccount)
with collect(distinct accountHolder.UniqueId) as AccountHolders,contractInformation,RingSize,
	sum(case type(r)
		when "HAS_CREDITCARD" then unsecuredAccount.Limit
		when "HAS_UNSECUREDLOAN" then unsecuredAccount.Balance
		else 0
		end) as FinancialRisk
where RingSize>1
return AccountHolders as FraudRing,
		labels(contractInformation) as ContractType,
		RingSize,
		round(FinancialRisk) as FinancialRisk
order by RingSize desc

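On the data above, the Address ring (RingSize 3) has a FinancialRisk of 115795, the PhoneNumber ring (RingSize 2) 99453, and the SSN ring (RingSize 2) 110795.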

Statements for deleting the relationships and nodes:

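// Delete the data: relationships first, then nodes.
// These are two separate statements; Cypher does not allow MATCH to follow
// DELETE directly inside a single statement.
MATCH (n:AccountHolder)-[r]->(contractInformation)
DELETE r;
// Note: this removes only the AccountHolder nodes; the Address, SSN, and other nodes remain.
MATCH (n:AccountHolder)
DELETE n;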
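
As an alternative, Cypher's DETACH DELETE removes a node together with all of its relationships in one step. The sketch below restricts the deletion to the seven labels used in this example, on the assumption that no other data in the database carries these labels.

```cypher
// Remove every node of the example graph along with its relationships.
MATCH (n)
WHERE n:AccountHolder OR n:Address OR n:PhoneNumber OR n:SSN
   OR n:CreditCard OR n:BankAccount OR n:UnsecureLoan
DETACH DELETE n
```

Unlike the two-step version, this also cleans up the Address, PhoneNumber, SSN, and account nodes.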

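This example does not really show off Cypher's strengths: a one-hop relationship like this can just as easily be obtained in a relational database with a plain GROUP BY. Once I find a suitable dataset, I will use Cypher for a similar fraud-ring analysis.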
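
To give a rough idea of where a graph query starts to pay off, the sketch below follows chains of shared information over more than one hop. It is an illustration written for this example graph, not a query from the book; the hop range 2..4 and the alias LinkedHolder are assumptions chosen here.

```cypher
// Starting from one account holder, find every other holder reachable
// through one or two shared pieces of information (2 or 4 hops).
MATCH (a:AccountHolder {UniqueId: "JohnDoe"})-[*2..4]-(b:AccountHolder)
WHERE a <> b
RETURN DISTINCT b.UniqueId AS LinkedHolder
```

Expressing the same traversal in SQL would need one self-join per hop, whereas here widening the search only means changing the upper bound of the range.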

Reposted from blog.csdn.net/wangzhanxidian/article/details/104823263