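A Python library that recursively compares JSON, lists, and dicts and shows the differences. It can ignore list order, supports regular expressions, and lets you set a float precision (published on PyPI as jsoncomparedeep)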

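When doing automated API testing, you often need to compare the returned JSON and assert on it.
But when the returned JSON is large, writing the assertions by hand becomes very tedious.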

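There are plenty of ready-made wheels online, but none of them is particularly pleasant to use, and each has fairly serious flaws of one kind or another.
So I wrote my own (features last updated on March 22, version 1.16). Compatibility has been fully tested on Python 2.6/2.7 and 3.5/3.6/3.7/3.8, and it is already the main assertion library for API test automation at my company. Whichever version you use, further testing is welcome; please let me know if you find a problem. Thanks!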

Want to write your own library and upload it to PyPI for others to use? See my other blog post: https://blog.csdn.net/qq_27884799/article/details/96664812

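After many iterations the feature set is fairly complete. The library is now on PyPI (uploading to PyPI is full of pitfalls), so you can install it quickly from the command line (cmd) with the following command (compatible with Python 2.6+ and 3.5+)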

pip install jsoncomparedeep

The latest version on PyPI is currently 1.14. If you are still on an older version, upgrade with the following command

pip install -U jsoncomparedeep

Usage in Python; more detailed examples are at the end of the article

It recursively reports exactly where each difference occurs. By default, list order is ignored when comparing

from json_compare import Jcompare
cp = Jcompare()
a = {"k1":"v1","k2":["v1", "v3"]}
b = {"k1":"v1","k2":["v4", "v1"]}
print(cp.compare(a, b))

Output

a is {'k1': 'v1', 'k2': ['v1', 'v3']}
b is {'k1': 'v1', 'k2': ['v4', 'v1']}
ignore_list_seq = True, re_compare = True, ignore_path = None
list b at /k2/0
has element that another list hasn't :
'v4'
list a at /k2/1
has element that another list hasn't :
'v3'
False
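
To make the idea of order-insensitive comparison concrete, here is a minimal standalone sketch. It is not the library's own code: `unordered_equal` is a hypothetical helper that only answers equal or not equal, and it pairs elements off one by one instead of counting frequencies the way the library's source (listed later) does.

```python
def unordered_equal(a, b):
    """Return True if a and b are equal when list order is ignored at every level.

    Hypothetical helper for illustration only; it is not part of jsoncomparedeep.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        # Same key set, and every value equal under the same rule.
        return a.keys() == b.keys() and all(unordered_equal(a[k], b[k]) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return False
        remaining = list(b)  # elements of b not yet paired with an element of a
        for item in a:
            for idx, candidate in enumerate(remaining):
                if unordered_equal(item, candidate):
                    del remaining[idx]  # each element of b is used at most once
                    break
            else:
                return False  # no partner left in b for this element of a
        return True
    return a == b


a = {"k1": "v1", "k2": ["v1", "v3"]}
b = {"k1": "v1", "k2": ["v4", "v1"]}
print(unordered_equal(a, b))                                 # False
print(unordered_equal([[1, 2], [3, 4]], [[4, 3], [2, 1]]))   # True
```

Because each element of `b` is consumed once it has been paired, repeated elements are counted correctly: `[1, 2, 2]` and `[1, 1, 2]` are not equal under this rule.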

Comparing with list order taken into account

print(cp.compare(a, b, ignore_list_seq=False))

Output

a is {'k1': 'v1', 'k2': ['v1', 'v3']}
b is {'k1': 'v1', 'k2': ['v4', 'v1']}
ignore_list_seq = False, re_compare = True, ignore_path = None
different value at /k2/0
a: 'v1'
b: 'v4'
different value at /k2/1
a: 'v3'
b: 'v1'
False

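Building on the above, ignore the two positions /k2/0 and /k2/1
(that is, the elements at list indexes 0 and 1 under key k2 are excluded from the comparison, so the two JSON objects are treated as equal)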

print(cp.compare(a, b, ignore_list_seq=False, ignore_path=["/k2/0", "/k2/1"]))

Output

a is {'k1': 'v1', 'k2': ['v1', 'v3']}
b is {'k1': 'v1', 'k2': ['v4', 'v1']}
ignore_list_seq = False, re_compare = True, ignore_path = ['/k2/0', '/k2/1']
True

Matching with a regular expression: v. means v followed by any single character, so it matches v3

a = {"k1":"v1","k2":["v1", "v3"]}
b = {"k1":"v1","k2":["v1", "^(v.)"]}
print(cp.compare(a, b, ignore_list_seq=False, re_compare=True))

Output

a is {'k1': 'v1', 'k2': ['v1', 'v3']}
b is {'k1': 'v1', 'k2': ['v1', '^(v.)']}
ignore_list_seq = False, re_compare = True, ignore_path = None
True

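With regular-expression matching, the comparison fails if nothing is matched, or if what the parenthesized group captures differs from the actual string. Here there is an extra dot outside the parentheses, which forces the digit 3 to be matched outside the group, so the group captures only v and the comparison fails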

a = {"k1":"v1","k2":["v1", "v3"]}
b = {"k1":"v1","k2":["v1", "^(v.*)."]}
print(cp.compare(a, b, ignore_list_seq=False, re_compare=True))

Output

a is {'k1': 'v1', 'k2': ['v1', 'v3']}
b is {'k1': 'v1', 'k2': ['v1', '^(v.*).']}
ignore_list_seq = False, re_compare = True, ignore_path = None
re compare failed, found v, expect v3, see next line
different value at /k2/1
a: u'v3'
b: u'^(v.*).'
False
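
The two examples above follow from a simple rule, which can be sketched on its own with nothing but the standard `re` module. This is a simplified restatement of what the source listed later does, not a drop-in replacement; `looks_like_pattern` and `regex_value_match` are hypothetical names, and where the library raises an assertion on multiple matches this sketch simply returns False.

```python
import re


def looks_like_pattern(s):
    """A string is treated as a regular expression only if it starts with ^ or ends with $."""
    return len(s) > 0 and (s[0] == "^" or s[-1] == "$")


def regex_value_match(pattern, text):
    """Pass only when the single parenthesized group captures the whole of `text`.

    With one group, re.findall returns a list of the captured strings, so the
    capture is compared against the full text. With two or more groups it
    returns tuples, which can never equal a string, so the result is False.
    """
    found = re.findall(pattern, text)
    return len(found) == 1 and found[0] == text


print(regex_value_match("^(v.)", "v3"))     # True: the group captures "v3"
print(regex_value_match("^(v.*).", "v3"))   # False: the group captures only "v"
print(looks_like_pattern("(.*)"))           # False: no leading ^ or trailing $
```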

It tolerates different string types, different numeric types, and tuples versus lists, and it can even compare a JSON string with an object...

a = ("字节", u"字符", 1)
b = '[1.0, "字符", "字节"]'
print(cp.compare(a, b))

Output

a is ('字节', '字符', 1)
b is [1.0, '字符', '字节']
ignore_list_seq = True, re_compare = True, ignore_path = None
True
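
The tolerance shown above comes down to normalizing both sides before comparing them. A rough Python 3 only sketch of that idea follows; `normalize` and `comparable_type` are hypothetical names, and the library itself goes through `six` so that it also works on Python 2.

```python
import json


def normalize(value):
    """Decode UTF-8 bytes to text and turn tuples into lists; leave everything else alone."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    if isinstance(value, tuple):
        value = list(value)
    return value


def comparable_type(value):
    """Treat int as float for the type check, so 1 and 1.0 count as the same kind of value.

    bool is left alone: type(True) is bool, not int, so True and 1 still differ in type.
    """
    t = type(value)
    return float if t is int else t


a = normalize(("字节", u"字符", 1))          # tuple -> list
b = json.loads('[1.0, "字符", "字节"]')      # JSON string -> list
print(a, b)
print(comparable_type(a[2]) is comparable_type(b[0]))  # True: int vs float is accepted
```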

After setting the attribute print_before to False, later comparisons no longer print debugging information

cp.print_before = False
print(cp.compare(a, b))

Prints only

True

The following features are new in 1.14
By default, float comparison can run into accumulated rounding error

a = [0.1+0.1+0.1]
b = [0.3]
print(cp.compare(a, b, ignore_list_seq=False))

This makes the comparison fail, as shown below

a is [0.30000000000000004]
b is [0.3]
ignore_list_seq = False, re_compare = True, ignore_path = None, float_fuzzy_digits = 0
different value at /0
a: 0.30000000000000004
b: 0.3
False

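This can be solved by specifying a precision (the default is 0, meaning exact match). A precision of 6 allows an error smaller than 10 to the power of -6.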

cp.float_fuzzy_digits = 6
print(cp.compare(a, b, ignore_list_seq=False))

Now the comparison succeeds

a is [0.30000000000000004]
b is [0.3]
ignore_list_seq = False, re_compare = True, ignore_path = None, float_fuzzy_digits = 6
True
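
The precision setting boils down to an absolute-tolerance check. The standalone function below mirrors the `_fuzzy_float_equal` method in the source listed later; only the free-function form and the name `fuzzy_float_equal` are assumptions made for illustration.

```python
def fuzzy_float_equal(a, b, digits=0):
    """Compare two numbers, tolerating an absolute error below 10**(-digits).

    digits=0 disables the tolerance and falls back to exact equality.
    """
    if digits:
        return abs(a - b) < 10 ** (-digits)
    return a == b


print(fuzzy_float_equal(0.1 + 0.1 + 0.1, 0.3))             # False: exact comparison
print(fuzzy_float_equal(0.1 + 0.1 + 0.1, 0.3, digits=6))   # True: error is far below 1e-6
```

Note that the tolerance is absolute rather than relative, so a precision of 6 is very loose for values around 1e-9 and very strict for values around 1e9.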

Ignore paths now support regular expressions as well, so the following becomes possible

a = [{"a": 1, "b": 2}, {"a": 1, "b": 4}]  # also useful under list_seq ignored
b = [{"a": 2, "b": 4}, {"a": 2, "b": 2}]
print(cp.compare(a, b, ignore_path=[r"^(/\d+/a)"]))

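Because key a is ignored in every sub-dict nested in the list, only the values of key b take part in the comparison. And because the default order-insensitive comparison is used, the b values 2,4 and 4,2 are equal, so this comparison succeeds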

a is [{'a': 1, 'b': 2}, {'a': 1, 'b': 4}]
b is [{'a': 2, 'b': 4}, {'a': 2, 'b': 2}]
ignore_list_seq = True, re_compare = True, ignore_path = ['^(/\\d+/a)'], float_fuzzy_digits = 0
True
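
How a path such as /0/a is checked against the ignore list can be sketched separately. The function below restates the check at the top of `_common_comp` in the source listed later; `path_is_ignored` is a hypothetical name, and the library's assertion on multiple matches is left out.

```python
import re


def path_is_ignored(current_path, ignore_paths):
    """Return True if current_path is covered by any entry of ignore_paths.

    An entry that starts with ^ or ends with $ is treated as a regular expression
    whose single group must capture the whole path; any other entry must equal
    the path exactly.
    """
    for item in ignore_paths:
        if item[0] == "^" or item[-1] == "$":
            found = re.findall(item, current_path)
            if found and found[0] == current_path:
                return True
        elif item == current_path:
            return True
    return False


print(path_is_ignored("/0/a", [r"^(/\d+/a)"]))          # True
print(path_is_ignored("/1/b", [r"^(/\d+/a)"]))          # False
print(path_is_ignored("/k2/0", ["/k2/0", "/k2/1"]))     # True
```

Since the group has to capture the entire path, the pattern `^(/\d+/a)` does not cover a deeper path such as /0/a/x by itself; in the library that case never arises because the comparison already stops descending at /0/a.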

Full feature list:

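Determines whether two objects (or JSON strings, which are parsed automatically) are equal. If not, it prints the differing items and where the differences occur
Works with various types, including JSON strings and the nested objects made of lists and dicts that result from parsing JSON
Recognizes dicts, lists, and multi-level nesting, descends layer by layer to print the inner differences, and prints the path it descended along
Compatible with gson and the like: list order can be ignored, or optionally not. When ignored, [[1,2],[3,4]] and [[4,3],[2,1]] are equal; when not ignored they are unequal
Handles both unicode strings and str (utf-8 encoded) strings. For example, the objects [u"你好"] and ["你好"] are equal; the JSON strings u'{"name": "姓名"}' and '{"name": "姓名"}' are equal as well
If the object being parsed is not a valid JSON object, an assertion catches it
New: regular-expression assertions, providing fuzzy string matching
New: tuple support (so results fetched with pymysql's DictCursor can be asserted on directly)
New: skipping specific paths during comparison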

Changelog:

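2019.7.1 Added fuzzy matching of int, long, and float: comparing 1, 1L, and 1.0 no longer reports a type mismatch
2019.7.4 Fixed a bug where None was passed as an argument when comparing a JSON string with a dict/list object
2019.7.9 Multiple improvements
- Fixed a bug where some differences inside loops were not printed in further detail
- Added regular-expression assertions
- Friendlier display of differences
- Lists of unequal length now get their differences reported in more detail (for now the differences are shown only in unordered form; an ordered report would need dynamic programming, which is not worth the cost, so it is left out for the time being)
- Provided a Python 3 version
2019.7.12 Fixed some defects, provided a version compatible with both Python 2 and 3 via six and codecs, and improved the demo's comments and assertions
2019.7.13 Added support for skipping the comparison of one or more items at specified paths
2019.7.14 Made some cross-platform adaptations and uploaded to PyPI
2019.8.17 Regular expressions can now also be used to match paths when skipping specified paths; added fuzzy float comparison (with user-specified precision)
2019.8.19 Fixed a bug where the ignore paths were lost when comparing JSON strings
2020.3.22 Fixed an incompatibility with Python 3.8 and uploaded to GitHub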

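~~Limitation: for lists of unequal length, only the length mismatch is reported, not the specific differences. Implementing that needs a fair amount of change; further development and bug reports are welcome.~~
(This is now partially supported)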

This is the latest code currently on PyPI. It supports all the new features and has been tested on Windows, Linux, and macOS

#!/usr/bin/env python
# coding: utf-8
# author: Rainy Chan [email protected]
# platform: python 2.6+ or 3.5+
# demos are provided in test_json_compare.py
import json
import re
import traceback
import six
import codecs

NUMBER_TYPES = list(six.integer_types) + [float]


class Jcompare(object):
    def __init__(self, print_before=True, float_fuzzy_digits=0):
        """
        :param bool print_before:  set True to print the objects or strings to compare first, disable it if printed
        :param int float_fuzzy_digits:  the accuracy (number of digits) required when float compare. 0 disables fuzzy
        """
        self.print_before = print_before
        self.float_fuzzy_digits = float_fuzzy_digits
        self._res = None
        self._ignore_list_seq = None
        self._re_compare = True
        self._ignore_path = None

    @staticmethod
    def _tuple_append(t, i):
        return tuple(list(t) + [six.text_type(i)])

    @staticmethod
    def _to_unicode_if_string(strlike):
        if type(strlike) == six.binary_type:
            try:
                return strlike.decode('utf-8')
            except UnicodeDecodeError:
                raise ValueError("decoding string {} failed, may be local encoded".format(repr(strlike)))
        else:
            return strlike

    @staticmethod
    def _to_list_if_tuple(listlike):
        if type(listlike) == tuple:
            return list(listlike)
        else:
            return listlike

    def _common_warp(self, anylike):
        return self._to_list_if_tuple(self._to_unicode_if_string(anylike))

    def _fuzzy_float_equal(self, a, b):
        if self.float_fuzzy_digits:
            return abs(a - b) < 10 ** (-self.float_fuzzy_digits)
        else:
            return a == b

    @staticmethod
    def _modify_a_key(dic, from_key, to_key):
        assert not any([type(to_key) == type(exist_key) and to_key == exist_key for exist_key in
                        dic.keys()]), 'cannot change the key due to key conflicts'
        # cannot use IN here `to_key in dic.keys()`, because u"a" in ["a"] == True
        dic[to_key] = dic.pop(from_key)

    @staticmethod
    def _fuzzy_number_type(value):
        type_dict = {x: float for x in six.integer_types}
        res = type(value)
        return type_dict.get(res, res)

    def _turn_dict_keys_to_unicode(self, dic):
        keys = list(dic.keys())  # snapshot: on python 3 dic.keys() is a live view, and keys are replaced below
        for key in keys:  # iterating over the copy is safe even though the dict changes inside the loop
            if type(key) == six.binary_type:
                self._modify_a_key(dic, key, self._to_unicode_if_string(key))
            else:
                assert type(key) == six.text_type, 'key {} must be string or unicode in dict {}'.format(key, dic)

    def _set_false(self):
        self._res = False

    @staticmethod
    def _escape(s):
        """
        :param s: binary if py2 else unicode
        :return:
        """
        if r'\x' in s:
            s = s.decode('string-escape') if six.PY2 else codecs.escape_decode(s)[0].decode('utf-8')  # no string-escape
        if r'\u' in s:
            s = s.decode('unicode-escape') if six.PY2 else s.encode().decode('unicode-escape')
        if type(s) == six.binary_type:
            s = s.decode('utf-8')  # This often comes from unix servers
        return s

    # difference_print methods
    def _different_type(self, a, b, root):
        self._set_false()
        print("different type at /{}".format("/".join(root)))
        print("a {}: ".format(type(a)) + repr(a))
        print("b {}: ".format(type(b)) + repr(b))

    def _different_value(self, a, b, root):
        self._set_false()
        print("different value at /{}".format("/".join(root)))
        print("a: " + repr(a))
        print("b: " + repr(b))

    def _different_length(self, a, b, root):
        self._set_false()
        print("different length of list at /{}".format("/".join(root)))
        print("len(a)={} : ".format(len(a)) + repr(a))
        print("len(b)={} : ".format(len(b)) + repr(b))

    def _list_item_not_found(self, ele, which, root):
        self._set_false()
        print("list {} at /{}".format(which, "/".join(root)))
        print("has element that another list hasn't :")
        print(repr(ele))

    def _list_freq_not_match(self, root, aplace, bplace, ele, counta, countb):
        self._set_false()
        print(
            "list at /{}, index {}, has different frequency from b at index {}:".format("/".join(root), aplace, bplace))
        print("element is {}".format(ele))
        print("count of list a: {}".format(counta))
        print("count of list b: {}".format(countb))

    def _dict_key_not_found(self, keys, which, root):
        self._set_false()
        print("dict {} at /{}".format(which, "/".join(root)))
        print("has key(s) that another dict hasn't :")
        print(keys)

    # internal compare methods
    def _list_comp(self, a, b, root, printdiff):
        if len(a) != len(b):
            if not printdiff:
                return False
            self._different_length(a, b, root)
            found_b = [False] * len(b)

            for i, a_i in enumerate(a):
                found = False
                for j, b_j in enumerate(b):
                    if self._common_comp(a_i, b_j, printdiff=False):
                        found_b[j] = True
                        found = True
                        break
                if not found:
                    buff = self._tuple_append(root, i)
                    self._list_item_not_found(a_i, "a", buff)
            for j, b_j in enumerate(b):
                if not found_b[j]:
                    buff = self._tuple_append(root, j)
                    self._list_item_not_found(b_j, "b", buff)
            return

        if not self._ignore_list_seq:
            for i in range(min(len(a), len(b))):
                buff = self._tuple_append(root, i)
                if not self._common_comp(a[i], b[i], buff, printdiff):
                    if not printdiff:
                        return False
        else:
            counts_a = [[0, None] for _ in range(len(a))]
            counts_b = [[0, None] for _ in range(len(a))]
            need_to_compare_number = True

            for i in range(len(a)):
                for j in range(len(a)):
                    buff = self._tuple_append(root, len(a) * 10)
                    if self._common_comp(a[i], b[j], buff, printdiff=False):
                        counts_a[i][1] = j
                        counts_a[i][0] += 1
                    if self._common_comp(b[i], a[j], buff, printdiff=False):
                        counts_b[i][1] = j
                        counts_b[i][0] += 1

                if not counts_a[i][0]:
                    if not printdiff:
                        return False
                    need_to_compare_number = False
                    buff = self._tuple_append(root, i)
                    self._list_item_not_found(a[i], "a", buff)

                if not counts_b[i][0]:
                    if not printdiff:
                        return False
                    need_to_compare_number = False
                    buff = self._tuple_append(root, i)
                    self._list_item_not_found(b[i], "b", buff)

            if need_to_compare_number:
                for i in range(len(counts_a)):
                    counta, place = counts_a[i]
                    countb = counts_b[place][0]
                    if countb != counta and counts_b[place][1] == i:  # to prevent printing twice
                        if not printdiff:
                            return False
                        self._list_freq_not_match(root, i, place, a[i], counta, countb)

        if not printdiff:
            return True

    def _dict_comp(self, a, b, root, printdiff):
        self._turn_dict_keys_to_unicode(a)
        self._turn_dict_keys_to_unicode(b)

        ak = a.keys()  # refresh again to make sure it's unicode now
        bk = b.keys()
        diffak = [x for x in ak if x not in bk]
        diffbk = [x for x in bk if x not in ak]
        if diffak:
            if not printdiff:
                return False
            self._dict_key_not_found(diffak, "a", root)
        if diffbk:
            if not printdiff:
                return False
            self._dict_key_not_found(diffbk, "b", root)
        samekeys = [x for x in ak if x in bk]

        for key in samekeys:
            buff = self._tuple_append(root, key)
            if not self._common_comp(a[key], b[key], buff, printdiff):
                if not printdiff:
                    return False

        if not printdiff:
            return True

    def _common_comp(self, a, b, root=(), printdiff=True):
        if self._ignore_path:
            current_path = u"/{}".format("/".join(root))

            for ignore_item in self._ignore_path:
                if ignore_item[0] == "^" or ignore_item[-1] == "$":
                    find = re.findall(ignore_item, current_path)
                    assert len(find) < 2, "shouldn't be this"
                    if find and find[0] == current_path:
                        return True
                else:
                    if u"/{}".format("/".join(root)) == ignore_item:
                        return True

        a = self._common_warp(a)
        b = self._common_warp(b)

        if self._fuzzy_number_type(a) != self._fuzzy_number_type(b):
            if not printdiff:
                return False
            self._different_type(a, b, root)
            return

        if type(a) not in [dict, list]:
            if not self._value_comp(a, b, printdiff):
                if not printdiff:
                    return False
                self._different_value(a, b, root)
            elif not printdiff:
                return True
            return

        if type(a) == list:
            return self._list_comp(a, b, root, printdiff)

        if type(a) == dict:
            return self._dict_comp(a, b, root, printdiff)

        raise TypeError("shouldn't be here")

    def _value_comp(self, a, b, printdiff=True):  # the most base comparison
        if not self._re_compare or type(a) != six.text_type or type(b) != six.text_type:
            if (type(a) == float and type(b) in NUMBER_TYPES) or (type(b) == float and type(a) in NUMBER_TYPES):
                return self._fuzzy_float_equal(a, b)
            else:
                return a == b
        else:
            a_is_re = len(a) > 0 and (a[0] == "^" or a[-1] == "$")
            b_is_re = len(b) > 0 and (b[0] == "^" or b[-1] == "$")  # lazy eval prevents index out of range error
            if not a_is_re and not b_is_re:
                return a == b
            assert not (a_is_re and b_is_re), "can't compare two regular expressions"
            if b_is_re:  # let a be re
                a, b = b, a
            find = re.findall(a, b)
            assert len(find) < 2, "shouldn't be this"
            if not find:
                if printdiff:
                    print("re compare failed, empty match, see next line")
                return False
            if not find[0] == b:
                if printdiff:
                    print("re compare failed, found {}, expect {}, see next line".format(find[0], b))
                return False
            return True

    # user methods
    def compare(self, a, b, ignore_list_seq=True, re_compare=True, ignore_path=None):
        """
        real compare entrance
        :param str or unicode or list or tuple or dict a: the first json string/json-like object to compare
        :param str or unicode or list or tuple or dict b: the second one
        :param bool ignore_list_seq: set True to ignore the order when comparing arrays(lists), recursively
        :param bool re_compare: set True to enable regular expressions for assertion. The pattern MUST contain ONE
        bracket, start with ^ or end with $, otherwise it won't be considered as an re-pattern. You can use ^.*?(sth) or
        ().*$ or so on to extract something from middle of the string. ^(.*)$ can just match any string, make this item
        ignored. Comparing two re-patterns makes no sense so it isn't allowed
        :param list[str or unicode] or None ignore_path: a list of element-paths to be ignored when comparing. e.g.
        ["/key1/key2", "/key3/1"] means all "ignored" in {"key1":{"key2":"ignored"},"key3":["not ignored","ignored"]}
        :return bool: Whether two json string or json-like objects are equal. If not, print the differences
        """
        flag = False  # transferred str to object, need recursion

        if type(a) in [six.text_type, six.binary_type]:
            json_loaded_a = json.loads(a)  # json only, should use eval when using python dict/list-like strings instead
            flag = True
        else:
            json_loaded_a = a
        if type(b) in [six.text_type, six.binary_type]:
            json_loaded_b = json.loads(b)
            flag = True
        else:
            json_loaded_b = b
        if flag:
            return self.compare(json_loaded_a, json_loaded_b, ignore_list_seq, re_compare, ignore_path)

        try:
            json.dumps(six.text_type(a), ensure_ascii=False)
            json.dumps(six.text_type(b), ensure_ascii=False)
        except TypeError:
            print(traceback.format_exc())
            raise TypeError("unsupported types during json check")

        self._res = True
        self._ignore_list_seq = ignore_list_seq
        self._re_compare = re_compare
        self._ignore_path = None if ignore_path is None else [self._to_unicode_if_string(path) for path in ignore_path]
        if self._ignore_path:
            assert all([path[0] == u"/" or u"(/" in path for path in self._ignore_path]), "invalid ignore path"

        if self.print_before:
            print(self._escape("a is {}".format(a)))
            print(self._escape("b is {}".format(b)))
            print("ignore_list_seq = {}, re_compare = {}, ignore_path = {}, float_fuzzy_digits = {}".format(
                ignore_list_seq, re_compare, ignore_path, self.float_fuzzy_digits))

        self._common_comp(a, b)
        return self._res

These are the test cases. If you modify the source, please rerun them as regression tests

#!/usr/bin/env python
# coding: utf-8
from json_compare import Jcompare
import six


def long_line():
    print("-" * 120)


def run_tests():
    cp = Jcompare()

    a = {"姓名": "王大锤"}  # str and unicode (or bytes and str in python3) are compatible, useful in Chinese words...
    b = {u"姓名": u"王大锤"} if six.PY2 else {"姓名".encode("utf-8"): "王大锤".encode("utf-8")}
    res = cp.compare(a, b)
    print(res)
    assert res is True

    long_line()

    a = [[1, 2, 3], [4, 5, 6]]
    b = ([6, 5, 4], [3, 2, 1])  # tuples (useful in pymysql & DictCursor) and different order of arrays are supported
    res = cp.compare(a, b)
    print(res)
    assert res is True

    long_line()

    a = [[1, 2, 3], [4, 5, 6]]
    b = [[3, 2, 1], [6, 5, 4]]  # ignore_list_seq=False makes these two different, however
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = {"a": 1, "b": 3, "c": False, "d": "ok"}
    b = {"a": 1, "b": 2, "c": "False", "e": "ok"}  # False != "False"
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = {"a": [1, {"k": ["ok"]}]}
    b = {"a": [1, {"k": ["error"]}]}  # ignoring list order, we aren't sure to pair {"k": ["ok"]} with {"k": ["error"]}
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = {"a": [1, {"k": ["ok"]}]}
    b = {"a": [1, {"k": ["error"]}]}  # however, if we consider list order, we can locate differences deeper
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = {"a": [1, {"k": [0]}]}  # we ignore this path now, test will pass.
    b = {"a": [1, {"k": [1]}]}  # notice we can't specify path deeper in a list when ignore_list_seq is enabled
    res = cp.compare(a, b, ignore_list_seq=False, ignore_path=["/a/1/k"])
    print(res)
    assert res is True

    long_line()

    a = [{"a": 1, "b": 2}, {"a": 5, "b": 4}]  # now we finally support regular expressions in ignore_path list
    b = [{"a": 3, "b": 2}, {"a": 6, "b": 4}]  # in this case, only value of "b" concerned
    res = cp.compare(a, b, ignore_list_seq=False, ignore_path=[r"^(/\d+/a)"])
    print(res)
    assert res is True

    long_line()

    a = [{"a": 1, "b": 2}, {"a": 1, "b": 4}]  # also useful under list_seq ignored
    b = [{"a": 2, "b": 4}, {"a": 2, "b": 2}]
    res = cp.compare(a, b, ignore_path=[r"^(/\d+/a)"])
    print(res)
    assert res is True

    long_line()

    a = [{"a": 1, "b": 3}, {"a": 1, "b": 4}]  # this time, 3 and 2 cannot match
    b = [{"a": 2, "b": 4}, {"a": 2, "b": 2}]
    res = cp.compare(a, b, ignore_path=[r"^(/\d+/a)"])
    print(res)
    assert res is False

    long_line()

    a = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 4}]  # this time, only different frequency found
    b = [{"a": 6, "b": 4}, {"a": 7, "b": 2}, {"a": 8, "b": 2}]  # but it will choose a random value of "a" to display
    res = cp.compare(a, b, ignore_path=[r"^(/\d+/a)"])  # it's caused by logic restriction, don't get confused
    print(res)
    assert res is False

    long_line()

    a = {"a": [1, {"k": [0], "l": None}, 2]}  # ignore two paths this time, only difference at /a/2 will be shown
    b = {"a": [1, {"k": [1], "l": False}, 3]}
    res = cp.compare(a, b, ignore_list_seq=False, ignore_path=["/a/1/k", "/a/1/l"])
    print(res)
    assert res is False

    long_line()

    a = '{"rtn": 0, "msg": "ok"}'  # can compare json string with python dict/list objects
    b = {"rtn": 1, "msg": "username not exist"}
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = u'{"body":{"text":"你好"}}'  # both text and binary json strings are supported
    b = '{"body":{"text":"你好啊"}}'
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = [1, 2, 2]  # even we ignore the order, the frequency of elements are concerned
    b = [1, 1, 2]
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = [1, 2, 3]
    b = [1, 3, 4, 5]  # even if the length of lists are not equal, we can still know the difference
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = [1, 2, 3]
    b = [1, 3, 4, 5]  # but we CANNOT keep the order of elements under different length even if ignore_list_seq is False
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = [1.0]  # in fact cp.compare(1, 1.0) is allowed, however non-standard jsons are not recommended
    b = [1 if six.PY3 else eval("1L")]  # Integers and floats are compatible, including long of python 2
    res = cp.compare(a, b)
    print(res)
    assert res is True

    long_line()

    a = [r"^(.*)$"]  # re-comparing enabled as default. Be careful bare r"^(.*)$" without list is considered as json-str
    b = ["anything"]  # use this to skip any unconcerned fields
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is True

    long_line()

    a = [r"(.*)"]  # without ^-start or $-end, this won't be regarded as re-pattern
    b = ["anything"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = [r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$"]  # we can use re-comparing to confine formats but not values
    b = ["anything"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = [r"^(2019-07-01 \d{2}:\d{2}:\d{2})$"]  # e.g. this assertion will pass
    b = ["2019-07-01 12:13:14"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is True

    long_line()

    a = [r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$", r"^(.*)$"]
    b = ["anything", "otherthing"]  # with an order-ignored list, every pattern is cross-compared with every element
    # be careful, this can easily produce confusing results
    res = cp.compare(a, b)
    print(res)
    assert res is False

    long_line()

    a = [r"^(.*)$"]  # two re-pattern is not allowed
    b = [r"^(.+)$"]
    try:
        cp.compare(a, b, ignore_list_seq=False)
    except Exception as e:
        print(e)
    else:
        raise AssertionError()

    long_line()

    a = [r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$", "otherthing"]
    b = ["anything", r"^(.*)$"]  # this errors when comparing a[0] with b[1] due to the above rule
    try:
        cp.compare(a, b)
    except Exception as e:
        print(e)
    else:
        raise AssertionError()

    long_line()

    a = r'["^(2019-07-01 \\d{2}:\\d{2}:\\d{2})$"]'  # double slashes are needed because this is a json-string, not list
    # or use '["^(2019-07-01 \\\\\d{2}:\\\\\d{2}:\\\\\d{2})$"]' will also work
    b = ["2019-07-01 12:13:14"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is True

    long_line()

    a = r'[r"^(2019-07-01 \d{2}:\d{2}:\d{2})$"]'
    b = ["2019-07-01 12:13:14"]
    try:
        print("json cannot parse inner 'r' notation, so this won't work:\t" + a)
        cp.compare(a, b, ignore_list_seq=False)
    except Exception as e:
        print(e)
    else:
        raise AssertionError()

    long_line()

    a = [r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"]  # only fully match will pass re-comparing
    b = ["2019-07-01 12:13:14.567"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = [r"^.*?(\d)-(\d)"]  # two or more brackets will always result in False
    b = ["2019-07-01 12:13:14.567"]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    a = [0.1+0.1+0.1]  # default we use accurate compare, since float compute causes accumulative errors
    b = [0.3]
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is False

    long_line()

    cp.float_fuzzy_digits = 6  # so we can bear errors less than 1e-6 now in float comparing
    res = cp.compare(a, b, ignore_list_seq=False)
    print(res)
    assert res is True


if __name__ == "__main__":
    run_tests()

Rainy Chan
