<div class="share-person-data-top"> <a href="/share/home?uk=3924974212&suk=mOZidGjjyKS6Y6NecksgaQ" target="_blank" title="å»Taç å享主页" class="share-person-username global-ellipsis">ç¯å**å享</a> <a href="//yun.baidu.com/buy/center?tag=1&from=sicon" class="unvip-icon sicon"> <em></em> </a> </div>
Above: there is an <a href> tag inside the div. We need to extract the value of its href attribute.
First, use a regular expression to capture the contents of the div. Here response is the returned page content as text, i.e., the HTML shown above.
import re
tr_content = re.findall('<div class="share-person-data-top">(.*?)</div>', response, re.S)[0]
print(tr_content)
Next, use a second regular expression to pull the href out of that content
td_content = re.findall('<a href.*?="(.+?)".*?>(.*?)</a>', tr_content, re.S)  # captures the href value and the link text
print(td_content)
Take the first match to drop the outermost "[]". Because the pattern has two groups, this is a (href, link text) tuple
print(td_content[0])
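Extract "3924974212" from the href and print it. Note that re.findall needs a string, so pass the href (td_content[0][0]) rather than the whole list
td_content = re.findall(r"\d+", td_content[0][0], re.S)
print(td_content[0])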
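Putting the steps together, here is a minimal end-to-end sketch. It assumes response is simply a string holding the HTML above (the sample below is trimmed, and the link text "user" is a placeholder). The helper names first_href and uk_from_href are made up for this illustration. As an alternative to grabbing the first run of digits, the sketch also reads the uk parameter with urllib.parse from the standard library, which does not depend on uk being the first number in the URL.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: `response` is the page text. A trimmed sample stands in for it here.
response = '''
<div class="share-person-data-top"> <a href="/share/home?uk=3924974212&suk=mOZidGjjyKS6Y6NecksgaQ" target="_blank" class="share-person-username global-ellipsis">user</a> <a href="//yun.baidu.com/buy/center?tag=1&from=sicon" class="unvip-icon sicon"> <em></em> </a> </div>
'''


def first_href(html):
    """Return the href of the first <a> inside the target div, or None."""
    # Step 1: capture everything inside the div (re.S lets . match newlines).
    blocks = re.findall(
        r'<div class="share-person-data-top">(.*?)</div>', html, re.S
    )
    if not blocks:
        return None
    # Step 2: each match is a (href, link text) tuple because of the two groups.
    links = re.findall(r'<a href="(.+?)".*?>(.*?)</a>', blocks[0], re.S)
    return links[0][0] if links else None


def uk_from_href(href):
    """Read the uk query parameter from the URL instead of guessing by digits."""
    params = parse_qs(urlsplit(href).query)
    values = params.get("uk")
    return values[0] if values else None


href = first_href(response)
uk_by_regex = re.findall(r"\d+", href)[0]  # same approach as in the steps above
uk_by_query = uk_from_href(href)

print(href)
print(uk_by_regex)
print(uk_by_query)
```

Both approaches give the same id for this sample. The query-string version is a little safer if the URL ever contains another number before uk.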