Core Python Programming end-of-chapter exercises: regular expressions, part 2

1-16 Update the code in gendata.py so that the data is written to redata.txt instead of the screen.

from random import randrange, choice
from string import ascii_lowercase as lc
from time import ctime

tlds = ('com', 'edu', 'net', 'org', 'gov')

# Write the generated lines to redata.txt; the file is closed on exit.
with open('redata.txt', 'w') as f:
    for i in range(randrange(5, 11)):
        dtint = randrange(2**32)    # random timestamp, seconds since the epoch
        dtstr = ctime(dtint)        # formatted like 'Thu Jan  1 00:00:05 1970'
        llen = randrange(4, 8)      # login length
        login = ''.join(choice(lc) for j in range(llen))
        dlen = randrange(llen, 13)  # domain is at least as long as the login
        dom = ''.join(choice(lc) for j in range(dlen))
        f.write('%s::%s@%s.%s::%d-%d-%d\n' % (
            dtstr, login, dom, choice(tlds), dtint, llen, dlen))
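Each line of redata.txt has three fields separated by '::': the timestamp, the email address, and the three integers dtint-llen-dlen. As a rough sanity check, a line can be split back into those fields and compared against itself. The helper name parse_line and the sample line are illustrative assumptions, not part of gendata.py.

```python
from time import ctime

def parse_line(line):
    """Split one gendata-style line into (timestamp, email, (dtint, llen, dlen))."""
    timestamp, email, counts = line.rstrip("\n").split("::")
    dtint, llen, dlen = (int(n) for n in counts.split("-"))
    return timestamp, email, (dtint, llen, dlen)

# Hypothetical sample line, built with the same format string as gendata.py.
sample = "%s::%s@%s.%s::%d-%d-%d\n" % (ctime(0), "abcd", "efghij", "com", 0, 4, 6)

timestamp, email, (dtint, llen, dlen) = parse_line(sample)
login, _, domain = email.partition("@")

print(timestamp == ctime(dtint))              # timestamp agrees with the integer field
print(len(login) == llen)                     # login length agrees
print(len(domain.rsplit(".", 1)[0]) == dlen)  # main domain length agrees
```

The split on '::' is safe here because the timestamp only contains single colons.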

 

 

1-19 Extract the complete timestamp from each line.

import re

with open('redata.txt') as f:
    for line in f:
        # Non-greedy match up to the first '::', which ends the timestamp.
        print(re.findall(r"^(.+?)::", line))
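Greedy versus non-greedy matching matters for this exercise, because every line contains several colons. The comparison below is a small sketch on a hypothetical sample line (the login and domain are made up): a greedy `.+` followed by a colon runs to the last colon on the line, while a non-greedy `.+?` followed by `::` stops at the end of the timestamp.

```python
import re

# Hypothetical line in the redata.txt format.
sample = "Thu Jan  1 00:00:05 1970::abcd@efghij.com::5-4-6"

greedy = re.findall(r"(.+):", sample)    # backtracks only to the LAST colon
lazy = re.findall(r"^(.+?)::", sample)   # stops at the FIRST '::'

print(greedy)  # ['Thu Jan  1 00:00:05 1970::abcd@efghij.com:']
print(lazy)    # ['Thu Jan  1 00:00:05 1970']
```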

 

1-20 Extract the complete email address from each line.

import re

with open('redata.txt') as f:
    for line in f:
        print(re.findall(r":(\w+@\w+\.\w+):", line))
 
1-21 Extract only the month from each timestamp.

import re

with open('redata.txt') as f:
    for line in f:
        print(re.findall(r"\s(\w+)\s+\d", line))
 
1-22 Extract only the year from each timestamp.

import re

with open('redata.txt') as f:
    for line in f:
        print(re.findall(r"\s(\d+)::", line))

 

1-23 Extract only the time (HH:MM:SS) from each timestamp.

import re

with open('redata.txt') as f:
    for line in f:
        print(re.findall(r"(\d+:\d+:\d+)", line))

 

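1-24 Extract only the login name and the domain name (including both the main domain name and the high-level domain) from the email address.

import re

with open('redata.txt') as f:
    for line in f:
        # The dot is escaped so that it matches a literal '.' only.
        print(re.findall(r"(\w+)@(\w+\.\w+):", line))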

 

1-25 Same as above.
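If 1-25 is read as asking for the main domain and the high-level domain as separate pieces (an assumption about the exercise's intent, not something this post states), one possible sketch is to give each part its own group. The sample line and the name EMAIL_PARTS are hypothetical.

```python
import re

# Login, main domain, and top-level domain each get their own group.
EMAIL_PARTS = re.compile(r"::(\w+)@(\w+)\.(\w+)::")

# Hypothetical line in the redata.txt format.
sample = "Thu Jan  1 00:00:05 1970::abcd@efghij.com::5-4-6"

match = EMAIL_PARTS.search(sample)
if match:
    login, domain, tld = match.groups()
    print(login, domain, tld)  # abcd efghij com
```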

 

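1-26 Replace the email address on each line with your own email address.

import re

with open('redata.txt') as f:
    for line in f:
        # 'me@example.com' is a placeholder; substitute your own address.
        old = re.findall(r"(\w+@\w+\.\w+):", line)[0]
        print(line.replace(old, 'me@example.com'))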
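An alternative worth considering is `re.sub`, which finds and replaces in one step and simply returns the line unchanged when there is no match. This is only a sketch: the address `me@example.com`, the helper name replace_email, and the sample line are placeholders chosen for illustration.

```python
import re

# The lookarounds pin the address between the two '::' separators.
EMAIL = re.compile(r"(?<=::)\w+@\w+\.\w+(?=::)")
MY_ADDRESS = "me@example.com"  # placeholder; use your own address

def replace_email(line, new_address=MY_ADDRESS):
    """Return the line with its email address swapped for new_address."""
    return EMAIL.sub(new_address, line, count=1)

# Hypothetical line in the redata.txt format.
sample = "Thu Jan  1 00:00:05 1970::abcd@efghij.com::5-4-6\n"
print(replace_email(sample), end="")
```

Because nothing is indexed with `[0]`, a malformed line cannot raise an IndexError here.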

 

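1-27 Extract the month, day, and year from each timestamp and output them in "month, day, year" format, iterating over each line only once.

import re

with open('redata.txt') as f:
    for line in f:
        # \d+ accepts one- and two-digit days; [\d:]+ skips the HH:MM:SS field.
        fields = re.findall(r"\s(\w+)\s+(\d+)\s+[\d:]+\s+(\d+)::", line)
        print('%s %s %s' % fields[0])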
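Named groups can make a multi-field pattern like this easier to read. The sketch below assumes the timestamp has the usual `ctime` shape (weekday, month, day, HH:MM:SS, year) and assumes a "Mon Day, Year" output; the function name month_day_year and both sample lines are hypothetical.

```python
import re

TIMESTAMP = re.compile(
    r"^\w{3}\s+(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<year>\d{4})::"
)

def month_day_year(line):
    """Return 'Mon Day, Year' for a redata-style line, or None if it does not match."""
    m = TIMESTAMP.match(line)
    if m is None:
        return None
    return "%s %s, %s" % (m.group("month"), m.group("day"), m.group("year"))

# Hypothetical lines in the redata.txt format.
samples = [
    "Thu Jan  1 00:00:05 1970::abcd@efghij.com::5-4-6",
    "Sun Feb 15 12:34:56 2004::wxyz@example.org::1076848496-4-7",
]
for line in samples:
    print(month_day_year(line))
```

Each line is scanned once, and a single match object supplies all three fields.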

 

1-28 Area codes: the regular expression should match 800-555-1212 and also 555-1212.

import re

string = "800-555-1212  555-1213"
patt = r'((?:\d{3}-)?\d{3}-\d{4})'   # raw string; the area code is optional
print(re.findall(patt, string))

 

 

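1-29 Also support area codes in parentheses, e.g. (800)555-1212.

import re

string = "800-555-1212  555-1213 (800)555-1214 "
# Either '(800)' or '800-' may precede the number, but not both.
patt = r'(?:\(\d{3}\)|\d{3}-)?\d{3}-\d{4}'
print(re.findall(patt, string))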
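To check a phone pattern against whole strings rather than searching inside a longer one, `fullmatch` is a convenient tool. The sketch below is one possible variant, not the only valid answer: it additionally allows an optional space after a parenthesized area code, which is an assumption, and the helper name is_phone is made up.

```python
import re

# Optional area code: '(800)', '(800) ', or '800-'; then the 7-digit number.
PHONE = re.compile(r"(?:\(\d{3}\)\s?|\d{3}-)?\d{3}-\d{4}")

def is_phone(text):
    """True if the whole string is a phone number in one of the accepted forms."""
    return PHONE.fullmatch(text) is not None

for text in ["800-555-1212", "555-1212", "(800)555-1212",
             "(800) 555-1212", "(800-555-1212", "800)555-1212", "55-1212"]:
    print(text, is_phone(text))
```

Using an alternation for the area code rejects unbalanced forms such as "(800-555-1212".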

Reposted from blog.csdn.net/gaobobo138968/article/details/78419665