Python Challenge Level 4 Walkthrough: follow the chain


Challenge URL
http://www.pythonchallenge.com/pc/def/linkedlist.php


Challenge content


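Solution

  • The page title is "follow the chain"
  • The page URL ends in linkedlist, i.e. a linked list
  • The picture also shows a chain

First, view the page source, which contains this comment: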

<!-- urllib may help. DON'T TRY ALL NOTHINGS, since it will never 
end. 400 times is more than enough. -->

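The hint says to use the urllib library, and not to try all the nothings because the chain never ends; 400 requests are more than enough.

The a tag contains a link. Clicking the picture leads to this URL:
http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345
The page content is: and the next nothing is 44827
Changing the value of nothing should lead to the next page, so let's use urllib to follow the chain for up to 400 requests and see what comes back.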

from urllib.request import urlopen
import re

suffix = '12345'
contents = []  # log of every value visited
contents.append(suffix + '\n')
url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'

for i in range(400):
    response = urlopen(url)
    # str() of a bytes object yields "b'...'"; the regex below strips that wrapper
    html = str(response.read())
    content = re.search(r"'(.+)'", html).group(1)
    try:
        # take the first run of digits as the next nothing
        suffix = re.search(r'\d+', content).group()
    except AttributeError:
        # re.search returned None: no digits in the reply, so show it and stop
        print(content)
        break
    contents.append(suffix + '\n')
    url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'
    print(suffix)
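
A side note on the str(response.read()) idiom used in the loop: calling str() on a bytes object returns its repr, wrapped in b'...', which is why the pattern '(.+)' is needed to get the text back. The sketch below uses a hard-coded bytes value in place of a live response and compares that approach with decode(). It also shows a caveat: when the reply contains an apostrophe (such as the "You've been misleaded" message quoted further down), the repr is wrapped in double quotes and the pattern no longer matches.

```python
import re

# Stand-in for response.read(); the live page returns bytes like this.
body = b'and the next nothing is 44827'

as_repr = str(body)                                 # repr of the bytes object
content = re.search(r"'(.+)'", as_repr).group(1)    # strip the b'...' wrapper
decoded = body.decode()                             # direct bytes -> str

print(as_repr)             # b'and the next nothing is 44827'
print(content == decoded)  # True

# A reply with an apostrophe gets a double-quoted repr instead.
tricky = b"You've been misleaded to here. Go to previous one and check."
print(str(tricky))                          # b"You've been misleaded to here. Go to previous one and check."
print(re.search(r"'(.+)'", str(tricky)))    # None
```

Using decode() avoids the wrapper entirely; the walkthrough keeps the original idiom.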

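Partway through, the run hit an error, so I added exception handling and printed the reply that failed to parse:
Yes. Divide by two and keep going.

So the next value is the previous number divided by two, 16044 / 2 = 8022, and the loop continues from there.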
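
The halving step can be written as a small helper. This is a sketch only; the name halve is hypothetical, and it uses integer division so the result stays an int rather than passing through a float.

```python
def halve(suffix: str) -> str:
    """Return half of the numeric suffix, as a string ready for the URL."""
    return str(int(suffix) // 2)

print(halve('16044'))  # 8022
```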
The code changes accordingly:

from urllib.request import urlopen
import re

suffix = '12345'
contents = []  # log of every value and message seen
contents.append(suffix + '\n')
url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'

for i in range(400):
    response = urlopen(url)
    html = str(response.read())
    content = re.search(r"'(.+)'", html).group(1)
    try:
        suffix = re.search(r'\d+', content).group()
    except AttributeError:
        # no digits in the reply: log the message, then halve the previous value
        print(content)
        contents.append(content + '\n')
        suffix = str(int(suffix) // 2)
    contents.append(suffix + '\n')
    url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'
    print(suffix)

It failed again. The last suffix before the error was 82683, so I opened that page directly:
http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=82683
The message there: You've been misleaded to here. Go to previous one and check.
Following the hint, I went back to the previous page: http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=82682
It shows: There maybe misleading numbers in the text. One example is 82683. Look only for the next nothing and the next nothing is 63579
In other words, the text may contain decoy numbers, and 82683 is one of them.
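
To see why the decoy matters, the sketch below runs two patterns over the message quoted above. The bare \d+ pattern picks up the first number it meets, which is the decoy, while anchoring on the phrase "next nothing is" picks up the intended one.

```python
import re

# The reply from the page at nothing=82682, as quoted in the walkthrough.
text = ('There maybe misleading numbers in the text. One example is 82683. '
        'Look only for the next nothing and the next nothing is 63579')

first_number = re.search(r'\d+', text).group()                      # first digits found
next_nothing = re.search(r'next nothing is (\d+)', text).group(1)   # anchored on the phrase

print(first_number)  # 82683
print(next_nothing)  # 63579
```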

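So the regular expression has to change to pick out the right number. I also added code that writes the log to a file for later reference. The revised code: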

from urllib.request import urlopen
import re

suffix = '12345'
contents = []  # log of every value and message seen
contents.append(suffix + '\n')
url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'

for i in range(400):
    response = urlopen(url)
    html = str(response.read())
    content = re.search(r"'(.+)'", html).group(1)
    try:
        # only accept the number that follows "next nothing is"
        suffix = re.search(r'next nothing is (\d+)', content).group(1)
    except AttributeError:
        # any reply without a next nothing lands here: log it, then halve
        print(content)
        contents.append(content + '\n')
        suffix = str(int(suffix) // 2)
    contents.append(suffix + '\n')
    url = f'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing={suffix}'
    print(suffix)

# save the log for later inspection
with open('level4.txt', 'w', encoding='utf-8') as fp:
    fp.writelines(contents)

Checking the output reveals peak.html. Change the URL accordingly to reach the next level:
http://www.pythonchallenge.com/pc/def/peak.html
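
As a possible variation, the loop can be restructured so that it stops once a reply is neither a "next nothing" message nor the "Divide by two" hint, instead of halving and carrying on. The sketch below is one way to do that, not the original solution. The names follow_chain, fetch_live and toy_pages are hypothetical, and the page fetcher is passed in as a function so the logic can be tried offline. The toy_pages chain is a shortened stand-in built from messages quoted in this walkthrough, not the real sequence of pages.

```python
import re
from urllib.request import urlopen

NEXT_RE = re.compile(r'next nothing is (\d+)')
BASE = 'http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing='

def fetch_live(suffix):
    """Fetch one page of the chain from the live site (not called below)."""
    with urlopen(BASE + suffix) as response:
        return response.read().decode()

def follow_chain(start, fetch, limit=400):
    """Follow the chain from start; return the log of values and messages."""
    suffix = start
    log = [suffix]
    for _ in range(limit):
        text = fetch(suffix)
        match = NEXT_RE.search(text)
        if match:
            suffix = match.group(1)
        elif 'Divide by two' in text:
            log.append(text)
            suffix = str(int(suffix) // 2)
        else:
            # Assumption: any other reply is the final answer, so stop here.
            log.append(text)
            break
        log.append(suffix)
    return log

# Toy chain for an offline dry run (assumed data, much shorter than the real one).
toy_pages = {
    '12345': 'and the next nothing is 16044',
    '16044': 'Yes. Divide by two and keep going.',
    '8022': 'There maybe misleading numbers in the text. One example is 82683. '
            'Look only for the next nothing and the next nothing is 63579',
    '63579': 'peak.html',
}

log = follow_chain('12345', toy_pages.__getitem__)
print(log[-1])  # peak.html
```

To run it against the real site, pass fetch_live in place of toy_pages.__getitem__.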

Reposted from blog.csdn.net/jpch89/article/details/81366982