Scraping top agent information from Anjuke

# coding=utf-8
# Python 2 script: it relies on urllib2 and the print statement.

import urllib2

from bs4 import BeautifulSoup

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}

# Append results to the output file; "with" closes it once the loop finishes.
with open('D:/python1/renwu.txt', 'a') as f:
    # Fetch listing pages 1 through 4.
    for i in range(1, 5):
        url = 'https://beijing.anjuke.com/tycoon/p' + str(i) + '/'
        request = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(request)
        content = response.read().decode('utf-8')
        soup = BeautifulSoup(content, 'html.parser')
        # Each agent card is a <div class="jjr-itemmod">.
        items = soup.find_all('div', class_='jjr-itemmod')
        for item in items:
            # Take the text of the agent's info block, then strip spaces and newlines.
            part1 = item.find('div', class_='jjr-info').get_text('', strip=True).encode('utf-8')
            part2 = part1.replace(' ', '')
            part3 = part2.replace('\n', '')
            print part3
            f.write(part3 + '\n')
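
The script above targets Python 2. As a rough sketch only, the request and clean-up steps might look like this on Python 3, where urllib2 was merged into urllib.request. The helper names build_request and clean_text are hypothetical and are not part of the original script.

```python
# Python 3 sketch of the request and clean-up steps (urllib2 became urllib.request).
import urllib.request

USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'


def build_request(page):
    """Build the request for one listing page, with the same User-Agent header."""
    url = 'https://beijing.anjuke.com/tycoon/p' + str(page) + '/'
    return urllib.request.Request(url, headers={'User-Agent': USER_AGENT})


def clean_text(text):
    """Drop spaces and newlines, as the replace() calls in the script do."""
    return text.replace(' ', '').replace('\n', '')


if __name__ == '__main__':
    # Show the URLs for pages 1 through 4 without touching the network.
    for page in range(1, 5):
        print(build_request(page).full_url)
    # To actually download a page (network access required):
    # html = urllib.request.urlopen(build_request(1)).read().decode('utf-8')
```

On Python 3 the text returned by get_text() is already a str, so the .encode('utf-8') step is not needed before cleaning and writing it, provided the output file is opened with encoding='utf-8'.
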
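Takeaways:
1. In this task I learned a new module, bs4. For locating information in a page, it is more convenient and quicker to use than the re module.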
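
For comparison, here is a minimal sketch of what the same extraction might look like with re alone. The sample markup is hypothetical: it only borrows the class names used in the script, and the real Anjuke page may be structured differently. The function name extract_with_re is likewise made up for illustration.

```python
import re

# Hypothetical markup, loosely modelled on the class names used in the script.
SAMPLE_HTML = (
    '<div class="jjr-itemmod"><div class="jjr-info">Agent A</div></div>'
    '<div class="jjr-itemmod"><div class="jjr-info">Agent B</div></div>'
)

# Non-greedy match of everything between the opening jjr-info tag
# and the next closing </div>.
INFO_PATTERN = re.compile(r'<div class="jjr-info">(.*?)</div>', re.S)


def extract_with_re(html):
    """Return the raw inner content of each jjr-info block found by the regex."""
    return [m.strip() for m in INFO_PATTERN.findall(html)]


print(extract_with_re(SAMPLE_HTML))
```

The pattern breaks as soon as the markup changes slightly: a nested div inside jjr-info cuts the match short at the first closing tag, and an extra attribute or different quoting on the opening tag stops it from matching at all. With bs4, find('div', class_='jjr-info') followed by get_text() handles those cases without changes.
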

Reposted from blog.csdn.net/wyd117/article/details/83477128