#### Text comparison with Python's built-in difflib module ###

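difflib is part of the Python standard library, so nothing needs to be installed.
It compares texts and reports their differences, and it can also produce a readable HTML report, much like the diff command on Linux.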

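The classes and functions in difflib compare sequences. They can be used to compare files and produce either a plain-text diff or an HTML page showing the differences side by side.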
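Since the module is compared to the Linux diff command, here is a minimal sketch of `difflib.unified_diff`, which produces output in the style of `diff -u`. The sample lines and the labels `old.txt` and `new.txt` are made up for illustration; no files are read.

```python
import difflib

# Two small lists of lines (each line keeps its trailing newline)
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n"]

# unified_diff yields the diff lazily, line by line;
# fromfile/tofile are only labels used in the header
diff_lines = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))
print("".join(diff_lines))
```

Removed lines start with `-` and added lines start with `+`, as in the Differ output described below.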
1. Comparing text and generating a plain-text diff

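How to read the symbols in the generated diff

Symbol  Meaning
'-'     the line is in the first sequence but not in the second
'+'     the line is in the second sequence but not in the first
' '     the line is identical in both sequences
'?'     hint line that is in neither input; it marks the intraline differences of the line above it
'^'     used inside a '?' line to point at a character that changed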
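To make the prefixes concrete, the sketch below runs `Differ` on two short made-up lists and labels each output line by its two-character code. The sample lines and the `labels` mapping are illustrative assumptions, not part of difflib.

```python
import difflib

first = ["same line\n", "only in first\n", "colour of the sky\n"]
second = ["same line\n", "color of the sky\n", "only in second\n"]

result = list(difflib.Differ().compare(first, second))

# Every output line starts with a two-character code followed by the text
labels = {
    "- ": "only in first",
    "+ ": "only in second",
    "  ": "in both",
    "? ": "hint line",
}
for line in result:
    print(f"{labels[line[:2]]:>14} | {line[2:]}", end="")
```

The hint line shows up next to the near-identical `colour`/`color` pair, because `Differ` only emits `?` lines for lines it considers similar enough to compare character by character.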

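First split the text into lines:
lines = text.splitlines(keepends=False)   # split multi-line text into a list of lines, dropping the line endings
lines = text.splitlines(keepends=True)    # split multi-line text into a list of lines, keeping the line endings
Then generate the diff:
import difflib                        # import the module
diff = difflib.Differ()               # create a Differ object
result = diff.compare(text1, text2)   # compare two lists of lines; returns a generator of diff lines
The result is a generator, so printing it directly shows nothing useful. Join its lines into one string (or convert it to a list) to view it. Joining with '' only works when the line endings were kept.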

import difflib
text1 = '''  1. Beautiful is better than ugly.
       2. Explicit is better than implicit.
       3. Simple is better than complex.
       4. Complex is better than complicated.
		'''.splitlines(keepends=True)
print(text1)

text2 = '''  1. Beautiful is better than ugly.
       3.   Simple is better than complex.
       4. Complicated is better than complex.
       5. Flat is better than nested.
     '''.splitlines(keepends=True)
print(text2)

diff= difflib.Differ()
result =''.join(list(diff.compare(text1,text2)))
print(result)
Output:
['  1. Beautiful is better than ugly.\n', '       2. Explicit is better than implicit.\n', '       3. Simple is better than complex.\n', '       4. Complex is better than complicated.\n', '\t\t']   # text1 after splitting into lines
['  1. Beautiful is better than ugly.\n', '       3.   Simple is better than complex.\n', '       4. Complicated is better than complex.\n', '       5. Flat is better than nested.\n', '     ']    # text2 after splitting into lines
Diff text:
    1. Beautiful is better than ugly.
-        2. Explicit is better than implicit.
-        3. Simple is better than complex.
+        3.   Simple is better than complex.
?          ++
-        4. Complex is better than complicated.
?                 ^                     ---- ^
+        4. Complicated is better than complex.
?                ++++ ^                      ^
- 		+        5. Flat is better than nested.
+      
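In the output above, the `- ` entry for the tab-only last line of text1 runs straight into the following `+` line. That happens because the last element of text1 has no trailing newline, so nothing separates the two when they are joined with `''`. One way around this, sketched here with made-up sample text, is to split without line endings and join with `'\n'`. Hint lines (`?`) always end in a newline of their own, so the sketch strips it before joining.

```python
import difflib

text1 = "alpha\nbeta\ngamma"   # note: no trailing newline
text2 = "alpha\nbeta\ndelta"

# With keepends left at its default (False), no element carries a line ending
lines1 = text1.splitlines()
lines2 = text2.splitlines()

result = list(difflib.Differ().compare(lines1, lines2))

# Strip the newline that '?' hint lines carry, then join with '\n'
report = "\n".join(line.rstrip("\n") for line in result)
print(report)
```

Every diff line now lands on its own row regardless of whether the source text ended with a newline.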

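Comparing text and generating an HTML diff page
1. Split the text into lines
2. Generate the diff result
diff = difflib.HtmlDiff()               # create an HtmlDiff object
result = diff.make_file(text1, text2)   # compare and build a complete HTML page as a string
3. Write the result to an HTML file
with open('diff.html', 'w') as f:
    f.write(result)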

import  difflib
text1 = '''  1. Beautiful is better than ugly.
       2. Explicit is better than implicit.
       3. Simple is better than complex.
       4. Complex is better than complicated.
		'''


text2 = '''  1. Beautiful is better than ugly.
       3.   Simple is better than complex.
       4. Complicated is better than complex.
       5. Flat is better than nested.
     '''

# Split both texts into lists of lines, keeping the line endings
text1 = text1.splitlines(keepends=True)
text2 = text2.splitlines(keepends=True)
diff = difflib.HtmlDiff()
result = diff.make_file(text1,text2)

with open('diff.html','w') as f:
    f.write(result)

Open the generated HTML file in a browser; added, changed, and deleted parts are highlighted in different colors.
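`HtmlDiff` has a few options that help with longer inputs. The sketch below, using shortened sample text, sets column headers with `fromdesc`/`todesc`, limits the report to changed regions with `context=True` and `numlines`, and wraps long lines with `wrapcolumn`. The header labels and the chosen values are arbitrary assumptions for illustration.

```python
import difflib

text1 = """1. Beautiful is better than ugly.
2. Explicit is better than implicit.
3. Simple is better than complex.
""".splitlines(keepends=True)

text2 = """1. Beautiful is better than ugly.
3.   Simple is better than complex.
5. Flat is better than nested.
""".splitlines(keepends=True)

# wrapcolumn wraps long lines so the side-by-side table stays readable
differ = difflib.HtmlDiff(wrapcolumn=60)

# context=True keeps only changed lines plus `numlines` lines around them
html = differ.make_file(text1, text2, fromdesc="text1", todesc="text2",
                        context=True, numlines=1)

# make_table returns just the <table> element, for embedding in another page
table = differ.make_table(text1, text2, fromdesc="text1", todesc="text2")
print(len(html), len(table))
```

The `html` string can be written to a file in the same way as in the example above.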
Comparing files
Read the contents of each file first, then compare them using the text comparison method described above.

import difflib

filename1 = '/tmp/passwd'
filename2 = '/tmp/passwd1'
with open(filename1) as f1, open(filename2) as f2:   # open both files
    content1 = f1.read().splitlines(keepends=True)   # read file 1 and split it into lines
    content2 = f2.read().splitlines(keepends=True)   # read file 2 and split it into lines
d = difflib.HtmlDiff()
htmlContent = d.make_file(content1,content2)
with open('passwdDiff.html','w') as f:
    f.write(htmlContent)
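The same steps can be wrapped in a small reusable function. This is only a sketch: `write_html_diff` is a hypothetical helper name, the UTF-8 encoding is an assumption about the input files, and the demo runs on throwaway temporary files rather than on /tmp/passwd.

```python
import difflib
import os
import tempfile


def write_html_diff(path1, path2, out_path, encoding="utf-8"):
    """Hypothetical helper: compare two text files and write an HTML report.

    Returns True when the files differ.
    """
    with open(path1, encoding=encoding) as f1, open(path2, encoding=encoding) as f2:
        lines1 = f1.readlines()   # readlines() keeps the line endings
        lines2 = f2.readlines()
    html = difflib.HtmlDiff().make_file(
        lines1, lines2, fromdesc=path1, todesc=path2, charset=encoding)
    # Write with the same encoding that the page declares in its charset
    with open(out_path, "w", encoding=encoding) as out:
        out.write(html)
    return lines1 != lines2


# Demo on two temporary files
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    report = os.path.join(tmp, "report.html")
    with open(a, "w", encoding="utf-8") as f:
        f.write("root:x:0:0\nuser:x:1000:1000\n")
    with open(b, "w", encoding="utf-8") as f:
        f.write("root:x:0:0\nuser:x:1001:1001\n")
    changed = write_html_diff(a, b, report)
    report_size = os.path.getsize(report)

print(changed, report_size > 0)
```

Passing the encoding explicitly avoids relying on the platform default, which may not match the charset declared inside the generated page.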

Reposted from blog.csdn.net/weixin_44821839/article/details/91825261