The JSONPath library: parsing JSON data with XPath-like syntax

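Introduction

JSONPath is a tool for extracting specific pieces of information from a JSON document. Implementations exist for several languages, including JavaScript, Python, PHP, and Java. JSONPath is to JSON what XPath is to XML.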

Official documentation: http://goessner.net/articles/JsonPath
Installation: pip install jsonpath

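JSONPath and XPath syntax compared

JSON has a clear structure, reads easily, and is low in complexity, so it is easy to match against. The table below maps each XPath construct to its JSONPath counterpart.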

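XPath   JSONPath   Description
/       $          Root node
.       @          Current node
/       . or []    Child node
..      n/a        Parent node; JSONPath does not support this
//      ..         Recursive descent: selects all matching nodes regardless of position
*       *          Wildcard: matches all element nodes
@       n/a        Attribute access; not needed in JSON, which is a recursive key-value structure
[]      []         Subscript operator (array indices, selection by content, and other simple iteration)
|       [,]        Union operator: selects several items in one subscript
[]      ?()        Filter expression
n/a     ()         Script expression
()      n/a        Grouping; JSONPath does not support this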
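
To make the recursive descent operator (..) concrete, here is a minimal pure-Python sketch of what an expression such as $..price asks for: every value stored under a given key, at any depth. The helper name find_key and the small data dictionary are made up for illustration, and this is not how the jsonpath package itself is implemented.

```python
def find_key(node, key):
    """Collect every value stored under `key`, at any depth, in document order."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending: the key may appear again further down.
            found.extend(find_key(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_key(item, key))
    return found


# Hypothetical sample data, cut down from the bookstore example.
data = {
    "store": {
        "book": [
            {"title": "Moby Dick", "price": 8.99},
            {"title": "Sword of Honour", "price": 12.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

print(find_key(data, "price"))  # [8.99, 12.99, 19.95]
```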

A JSONPath expression can be written in dot notation or bracket notation: $.store.book[0].title is equivalent to $['store']['book'][0]['title'].
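
The equivalence of the two notations can be shown mechanically. The sketch below rewrites a simple dot-notation path into bracket notation. The function to_bracket is a hypothetical helper, not part of the jsonpath package, and it deliberately handles only plain member names and numeric indexes (no wildcards, slices, or filters).

```python
import re

# One step of a simple path: either .name or [index]
_STEP = re.compile(r"\.(\w+)|\[(\d+)\]")


def to_bracket(path):
    """Rewrite a simple dot-notation path in bracket notation."""
    if not path.startswith("$"):
        raise ValueError("a JSONPath expression starts with '$'")
    parts = ["$"]
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at position {pos}")
        name, index = match.groups()
        # Member names are quoted, array indexes are left bare.
        parts.append(f"['{name}']" if name is not None else f"[{index}]")
        pos = match.end()
    return "".join(parts)


print(to_bracket("$.store.book[0].title"))
# $['store']['book'][0]['title']
```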

Usage

The JSONPath API is very simple; the function you will use most is jsonpath.

Its signature is: jsonpath(obj, expr, result_type='VALUE', debug=0, use_eval=True)

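Parameters:

  • obj: a dictionary holding the JSON data. Required.
  • expr: a string holding the JSONPath expression. Required.
  • result_type: a string selecting the kind of result returned. Optional.
    • 'VALUE' (the default) returns the matched values.
    • 'IPATH' returns, for each match, its JSONPath path as a list of indexes.
    • Any other value returns the JSONPath expression string of each match.

Return value: a list, or False when nothing matches.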
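
Because a failed lookup returns False rather than an empty list, iterating directly over the result raises a TypeError when nothing matches. One way to guard against that is a small wrapper such as the one sketched below. The name as_list is hypothetical and not part of the jsonpath package; the demonstration passes in literal values so that it runs without the package installed.

```python
def as_list(result):
    """Turn the 'no match' value (False) into an empty list."""
    return [] if result is False else result


# Intended use (assumes the jsonpath package is installed):
#   titles = as_list(jsonpath.jsonpath(data, '$..title'))

for price in as_list(False):     # a failed lookup: the loop body never runs
    print(price)

print(as_list([8.95, 12.99]))    # a successful lookup passes through unchanged
```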

Example: JSONPath result types

import jsonpath
from pprint import pprint

jsonobj = {
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

# Return the matched values (the default result type)
all_prices = jsonpath.jsonpath(jsonobj, '$..price')
pprint(all_prices)
# Return the JSONPath path of each match as a list of indexes
all_prices = jsonpath.jsonpath(jsonobj, '$..price', result_type='IPATH')
pprint(all_prices)
# Return the JSONPath expression string of each match
all_prices = jsonpath.jsonpath(jsonobj, '$..price', result_type='')
pprint(all_prices)

Output:

[8.95, 12.99, 8.99, 22.99, 19.95]
[['store', 'book', '0', 'price'],
 ['store', 'book', '1', 'price'],
 ['store', 'book', '2', 'price'],
 ['store', 'book', '3', 'price'],
 ['store', 'bicycle', 'price']]
["$['store']['book'][0]['price']",
 "$['store']['book'][1]['price']",
 "$['store']['book'][2]['price']",
 "$['store']['book'][3]['price']",
 "$['store']['bicycle']['price']"]

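An index list of the kind shown in the IPATH output can be followed by hand to reach the matched value again. The sketch below assumes the output format shown above, in which list positions are reported as strings such as '0'. The helper name follow and the cut-down data dictionary are made up for illustration.

```python
def follow(obj, index_path):
    """Walk `obj` along one IPATH entry and return the value found there."""
    node = obj
    for step in index_path:
        # List positions arrive as strings ('0'), so convert them back to int.
        node = node[int(step)] if isinstance(node, list) else node[step]
    return node


# Hypothetical sample data, cut down from the bookstore example.
data = {
    "store": {
        "book": [{"price": 8.95}, {"price": 12.99}],
        "bicycle": {"price": 19.95},
    }
}

print(follow(data, ['store', 'book', '1', 'price']))  # 12.99
print(follow(data, ['store', 'bicycle', 'price']))    # 19.95
```
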
Example: JSONPath in practice

Sample data:

{
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

XPath                  JSONPath                                 Meaning
/store/book/author     $.store.book[*].author                   Authors of all books in the store
//author               $..author                                All authors
/store/*               $.store.*                                Everything in the store
/store//price          $.store..price                           Prices of everything in the store
//book[3]              $..book[2]                               The third book
//book[last()]         $..book[(@.length-1)] or $..book[-1:]    The last book
//book[position()<3]   $..book[0,1] or $..book[:2]              The first two books
//book[isbn]           $..book[?(@.isbn)]                       Books that have an isbn attribute
//book[price<10]       $..book[?(@.price<10)]                   Books priced below 10
//*                    $..*                                     Every member of the JSON structure

import jsonpath
from pprint import pprint

jsonobj = {
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}

# Prices of all books
book_prices = jsonpath.jsonpath(jsonobj, '$.store.book[*].price')
pprint(book_prices)
# Prices of everything in the document
all_prices = jsonpath.jsonpath(jsonobj, '$..price')
pprint(all_prices)
# Prices of everything in the store
store_prices = jsonpath.jsonpath(jsonobj, '$.store..price')
pprint(store_prices)
# The third book
book3 = jsonpath.jsonpath(jsonobj, '$..book[2]')
pprint(book3)
# All attribute values of the third book
book_attr = jsonpath.jsonpath(jsonobj, '$..book[2][*]')
pprint(book_attr)
# Title and author of the third book
book_attr = jsonpath.jsonpath(jsonobj, '$..book[2][title,author]')
pprint(book_attr)
# The first two books
books12 = jsonpath.jsonpath(jsonobj, '$..book[:2]')
pprint(books12)
# Books that have an isbn attribute
isbns = jsonpath.jsonpath(jsonobj, '$..book[?(@.isbn)]')
pprint(isbns)
# Books priced below 10
price10 = jsonpath.jsonpath(jsonobj, '$..book[?(@.price<10)]')
pprint(price10)

Output:

[8.95, 12.99, 8.99, 22.99]
[8.95, 12.99, 8.99, 22.99, 19.95]
[8.95, 12.99, 8.99, 22.99, 19.95]
[{'author': 'Herman Melville',
  'category': 'fiction',
  'isbn': '0-553-21311-3',
  'price': 8.99,
  'title': 'Moby Dick'}]
['fiction', 'Herman Melville', 'Moby Dick', '0-553-21311-3', 8.99]
['Moby Dick', 'Herman Melville']
[{'author': 'Nigel Rees',
  'category': 'reference',
  'price': 8.95,
  'title': 'Sayings of the Century'},
 {'author': 'Evelyn Waugh',
  'category': 'fiction',
  'price': 12.99,
  'title': 'Sword of Honour'}]
[{'author': 'Herman Melville',
  'category': 'fiction',
  'isbn': '0-553-21311-3',
  'price': 8.99,
  'title': 'Moby Dick'},
 {'author': 'J. R. R. Tolkien',
  'category': 'fiction',
  'isbn': '0-395-19395-8',
  'price': 22.99,
  'title': 'The Lord of the Rings'}]
[{'author': 'Nigel Rees',
  'category': 'reference',
  'price': 8.95,
  'title': 'Sayings of the Century'},
 {'author': 'Herman Melville',
  'category': 'fiction',
  'isbn': '0-553-21311-3',
  'price': 8.99,
  'title': 'Moby Dick'}]
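
For readers more at home in plain Python, the slice and filter expressions used above correspond roughly to list slicing and list comprehensions. The sketch below runs without the jsonpath package; the books list is a trimmed, hypothetical copy of the sample data, and the "in" test only approximates what ?(@.isbn) checks.

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

first_two = books[:2]                            # like $..book[:2]
last_one = books[-1:]                            # like $..book[-1:]
with_isbn = [b for b in books if "isbn" in b]    # like $..book[?(@.isbn)]
cheap = [b for b in books if b["price"] < 10]    # like $..book[?(@.price<10)]

print([b["title"] for b in with_isbn])  # ['Moby Dick', 'The Lord of the Rings']
print([b["title"] for b in cheap])      # ['Sayings of the Century', 'Moby Dick']
```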

Reposted from blog.csdn.net/mighty13/article/details/119719178