A Python beginner's study notes: learning BeautifulSoup4

from bs4 import BeautifulSoup
text = """
<ul id="navList" class="w1">
<li><a id="blog_nav_sitehome" class="menu" href="https://www.cnblogs.com/">博客园</a>
</li>
<li>
<a id="blog_nav_myhome" class="menu" href="https://www.cnblogs.com/jswf/">Home</a>
</li>
<li>
<a id="blog_nav_newpost" class="menu" href="https://i.cnblogs.com/EditPosts.aspx?opt=1">新随笔</a>
</li>
<li>
<a id="blog_nav_contact" class="menu" href="https://msg.cnblogs.com/send/jswf">联系</a></li>
<li>
<a id="blog_nav_rss" class="menu" href="https://www.cnblogs.com/jswf/rss/">订阅</a>
<!--<partial name="./Shared/_XmlLink.cshtml" model="Model" /></li>--></li>
<li>
<a id="blog_nav_admin" class="menu" href="https://i.cnblogs.com/">管理</a>
</li>
</ul>
<ul>
<li>1213123</li>
</ul>
"""
soup = BeautifulSoup(text,"lxml")
ul = soup.find_all("ul", class_="w1", id="navList", limit=2)[0]
# Find the ul tags with the given class and id, keep at most two matches (limit=2), and take element 0 of the resulting list
# ul = soup.find_all("ul", attrs={"class": "w1", "id": "navList"})[0]
# Alternative: pass the class and id through attrs, then take element 0 of the resulting list
print(ul)
print(list(ul.strings))
# Get all the text under the ul tag, including newlines
print(list(ul.stripped_strings))
# Get all the non-empty text under the ul tag, with surrounding whitespace stripped
a_tags = ul.find_all("a")
for a in a_tags:
    href = a["href"]
    # Get the href attribute of the a tag
    # href = a.attrs["href"]
    # Same result through the attrs dictionary (attrs is a dict, so it is indexed, not called)
    print(href)
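
As a small check of the notes above, here is a sketch comparing the two ways of filtering with find_all and the two ways of reading an attribute. The markup is a shortened, made-up version of the navigation list, and it uses the built-in "html.parser" instead of lxml so it runs without extra packages. The a.get("href") call is an addition that is not in the code above: it returns None when the attribute is missing instead of raising KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened markup for illustration only
html = """
<ul id="navList" class="w1">
<li><a class="menu" href="https://www.cnblogs.com/">Home</a></li>
<li><a class="menu">No link</a></li>
</ul>
<ul><li>1213123</li></ul>
"""

# Assumption: html.parser (bundled with Python) instead of the lxml parser used above
soup = BeautifulSoup(html, "html.parser")

# Keyword filters and the attrs dictionary select the same tag
by_keyword = soup.find_all("ul", class_="w1", id="navList")
by_attrs = soup.find_all("ul", attrs={"class": "w1", "id": "navList"})

links = by_keyword[0].find_all("a")
first_href = links[0]["href"]              # raises KeyError if href is missing
first_href_attrs = links[0].attrs["href"]  # attrs is a plain dict
missing_href = links[1].get("href")        # None, because this a tag has no href

print(len(by_keyword), by_keyword[0] is by_attrs[0])
print(first_href, first_href_attrs, missing_href)
```

If some a tags may lack an href, a.get("href") is probably the safer choice inside the loop.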

There is also ul.get_text(), which serves the same purpose as ul.strings: both return all the text inside the ul tag, including spaces and newlines.

However, get_text() returns a single string, while strings returns a generator.
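
To make the difference concrete, here is a minimal sketch on a tiny made-up list (not the markup from this post), again parsed with "html.parser" as an assumption so it runs without lxml. It shows that get_text() gives one string, that strings yields the same pieces one at a time, and that the generator can only be consumed once.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only
html = "<ul>\n<li>first</li>\n<li> second </li>\n</ul>"
soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul")

text = ul.get_text()          # one str: "\nfirst\n second \n"
strings = ul.strings          # lazy iterator, nothing is collected yet
pieces = list(strings)        # ["\n", "first", "\n", " second ", "\n"]
second_pass = list(strings)   # [] because the generator is already exhausted

stripped = list(ul.stripped_strings)             # ["first", "second"]
compact = ul.get_text(separator=" ", strip=True)  # "first second"

print(repr(text))
print(pieces, second_pass)
print(stripped, compact)
```

Joining the pieces from strings with an empty string gives the same result as get_text(), so which one to use mostly depends on whether you want a single string or want to process the fragments one by one.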

Origin www.cnblogs.com/jswf/p/12294336.html