20190613_day03 A First Look at Web Crawlers

Today's class covered a lot of ground.

The main points were:

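1 Remaining function topics:

　　What empty functions are for, function objects, and how a function name points to a memory address

　　Nested function definitions and calls

　　Differences between namespaces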

2 A brief introduction to packages and modules

　　How to import packages and modules

　　The time module

　　The os and sys modules

　　The json module

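3 Basic principles of web crawlers:

　　Using Chrome's developer tools

　　How to find the link to the content you want

　　How to use requests

　　A first attempt at a crawler program

Class notes follow: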

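Morning

# Today's topics:

    # Remaining function topics
    # Built-in modules
    # Modules and packages
    # Basic principles of web crawlers
    # The requests module

# AM

# Empty functions
# A placeholder: pass means "do nothing"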

'''
When calling a function, the caller needs to receive the result produced inside the function body (the return value)
'''
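
As a small illustration of the two notes above (the names `todo_feature` and `add` are made up for this sketch): a `pass` body makes a valid placeholder function that returns `None`, while a function with `return` hands its result back to the caller.

```python
def todo_feature():
    # Placeholder: a body is required syntactically, and pass does nothing.
    pass


def add(x, y):
    # return hands the result computed inside the function to the caller.
    return x + y


placeholder_result = todo_feature()   # a function without return gives None
total = add(1, 2)                     # the caller receives the result

print(placeholder_result)  # None
print(total)               # 3
```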

# Function objects
# A function object is what the function name points to:
# the function's address in memory
# def funk1():
#     pass
# print(funk1)
# def funk2():
#     pass
# dict1 = {
#     '1' : funk1,
#     '2' : funk2
# }
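# '''
# When you need to choose between features,
# a dict that maps keys to function objects is far cleaner than a long if/else chain
#
# '''
# choice = input('Enter a function number: ').strip()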
# if choice in dict1:
#     dict1[choice]()
#
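
A runnable variant of the dictionary dispatch idea, sketched without `input()` so it can be run non-interactively. The function names `login`, `register` and `run_choice` are hypothetical and only serve the example.

```python
def login():
    return 'login...'


def register():
    return 'register...'


# Map menu numbers to function objects (no parentheses: we store, not call).
actions = {
    '1': login,
    '2': register,
}


def run_choice(choice):
    # Look up the function object and call it; unknown input is reported.
    func = actions.get(choice.strip())
    if func is None:
        return 'unknown choice'
    return func()


print(run_choice('1'))    # login...
print(run_choice(' 2 '))  # register...
print(run_choice('9'))    # unknown choice
```

Adding a new feature then only means adding one entry to `actions`, rather than another `elif` branch.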
# # Nested functions
# # Nested function definitions
# def func1():
#     print('func1...')
#
#     def func2():
#         print('func2...')
#
#         def func3():
#             print('func3...')

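# # Calling nested functions
def func1():
    print('func1...')

    def func2():
        print('func2...')

        def func3():
            print('func3...')
        return func3

    return func2
res1 = func1()  # prints 'func1...' and returns func2

res2 = res1()   # prints 'func2...' and returns func3
res2()          # prints 'func3...'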
# #
# # def func1():
# #     print('func1...')
# #
# #     def func2():
# #         print('func2...')
# #
# #         def func3():
# #             print('func3...')
# #         func3()
# #     func2()
# # func1()
#
# '''
# Function namespaces
# '''
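
The notes only record the heading for namespaces, so the following is a minimal sketch, assuming the class covered Python's usual name lookup order (local, then enclosing function, then global, then built-in). All function names here are made up for the example.

```python
x = 'global x'


def outer():
    x = 'enclosing x'

    def inner():
        x = 'local x'
        return x

    def inner_no_local():
        # No local x here, so the lookup falls back to outer()'s x.
        return x

    return inner(), inner_no_local()


def read_global():
    # No local or enclosing x, so the module-level x is found.
    return x


local_result, enclosing_result = outer()
print(local_result)      # local x
print(enclosing_result)  # enclosing x
print(read_global())     # global x
print(len('abc'))        # len is found in the built-in namespace: 3
```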
#
# '''
# Packages and modules
# '''
# from TEST import A
# A
#
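# # Test
#
# if __name__ == '__main__':
#     print("test")
# Put test code inside this block.
# When the file is run directly, its __name__ is '__main__'; when it is imported, its __name__ is the module name instead, so the test code is not executed automatically on import.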
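
To see the `__name__` guard in action without creating files by hand, here is a rough sketch that writes a hypothetical module to a temporary directory and executes it twice with the standard-library `runpy` module: once with `__name__` set to `'__main__'` (as when run directly) and once with a module-style name (which mimics what an import would see). The file name `demo_module.py` is an assumption for the demo.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Hypothetical module source, written to a temporary file for the demo.
MODULE_SOURCE = '''
def A():
    return 'A from the module'

if __name__ == '__main__':
    print('test')
'''


def run_and_capture(path, run_name):
    # Execute the file under the given __name__ and collect what it prints.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        runpy.run_path(path, run_name=run_name)
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = os.path.join(tmp_dir, 'demo_module.py')
    with open(module_path, 'w', encoding='utf-8') as f:
        f.write(MODULE_SOURCE)

    as_script = run_and_capture(module_path, '__main__')     # like running the file
    as_import = run_and_capture(module_path, 'demo_module')  # like importing it

print(repr(as_script))  # 'test\n'
print(repr(as_import))  # ''
```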

# # Commonly used built-in modules
# # time
#
# import time
# # Get the current timestamp
# print(time.time())
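
Beyond `time.time()`, a few other calls from the `time` module are commonly used alongside it. This is only a sketch of typical usage, not necessarily what was shown in class.

```python
import time

start = time.time()               # seconds since the epoch, as a float
time.sleep(0.1)                   # pause the program for about 0.1 seconds
elapsed = time.time() - start     # measure how long the pause took

# Format the current local time as a readable string.
now_text = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())

print(elapsed)
print(now_text)
```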

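# The os module
# The sys module

# import os   # interact with files and directories in the operating system
# import sys  # interact with the Python interpreter (e.g. sys.path, sys.argv)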
#
# print(os.path.exists('1.mp4'))
#
# print(os.path.dirname(__file__))
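
A self-contained sketch of the `os.path` calls above, using a temporary directory so that the result does not depend on whether `1.mp4` happens to exist. The `sys` lines at the end are an added illustration of what that module exposes.

```python
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    video_path = os.path.join(tmp_dir, '1.mp4')   # build a path portably

    before = os.path.exists(video_path)            # nothing there yet
    with open(video_path, 'wb') as f:
        f.write(b'')                               # create an empty file
    after = os.path.exists(video_path)             # the file now exists

    parent = os.path.dirname(video_path)           # directory part of the path
    name = os.path.basename(video_path)            # file-name part of the path
    listing = os.listdir(tmp_dir)                  # names inside the directory

print(before, after)      # False True
print(name, listing)      # 1.mp4 ['1.mp4']
print(sys.argv)           # command-line arguments of the running script
print(len(sys.path) > 0)  # sys.path is the module search path used by import
```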

Afternoon

# import json
#
# user_info = {
#     'name': 'tank',
#     'pwd': '123'
# }
# # Serialization: convert the dict to the JSON data format,
# # returned as a string
# res = json.dumps(user_info)
# print(res)
# with open('json1.json','wt',encoding='utf-8') as f:
#     f.write(res)
#
# # dump calls f.write for you
# user_info = {
#     'name': 'tank',
#     'pwd': '123'
# }
# with open('json1.json','w',encoding='utf-8') as f:
#     json.dump(user_info,f)
#
#
# # Deserialization: loads
# with open('json1.json','rt',encoding='utf-8') as w:
#     res = w.read()
#     user_dict = json.loads(res)
#     print(user_dict)
#     print(type(user_dict))
#
# # load calls the file object's read method for you
# with open('json1.json','rt',encoding='utf-8') as w:
#     user_dict = json.load(w)
#     print(user_dict)
#     print(type(user_dict))
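
The same round trip can be checked in memory, without touching a file. The extra keys (`tags`, `active`, `note`) are invented for this sketch to show how Python values map to JSON.

```python
import json

user_info = {
    'name': 'tank',
    'pwd': '123',
    'tags': ['a', 'b'],
    'active': True,   # becomes true in JSON
    'note': None,     # becomes null in JSON
}

text = json.dumps(user_info)   # dict -> JSON string
restored = json.loads(text)    # JSON string -> dict

print(type(text))              # <class 'str'>
print(text)
print(restored == user_info)   # True
```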

# Web crawlers
'''
The HTTP protocol
    Request URL
        https://www.baidu.com
    Request method
        GET
    Request headers:
    Cookie: may need attention
    User-Agent: identifies the client as a browser
        Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36
    Host:
        www.baidu.com
'''
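
The parts of a request listed above can be inspected without sending anything over the network. The sketch below uses the standard library's `urllib.request.Request` purely as a stand-in for that inspection (the class notes use `requests` for the real calls); it only builds the request object.

```python
from urllib.parse import urlsplit
from urllib.request import Request

USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36'
)

# Build the request object only; nothing is sent over the network here.
req = Request(
    url='https://www.baidu.com',
    headers={'User-Agent': USER_AGENT},
    method='GET',
)

host = urlsplit(req.full_url).hostname   # the value the Host header would carry

print(req.get_method())                  # GET
print(req.full_url)                      # https://www.baidu.com
print(host)                              # www.baidu.com
print(req.get_header('User-agent'))      # urllib stores header names capitalized
```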

# # Using the requests module
#
# import requests
# respn = requests.get(url='https://www.baidu.com')
# respn.encoding = 'utf-8'
# print(respn)
# # Response status code
# print(respn.status_code)
# # Response body as text
# print(respn.text)
#
# with open('baidu.html','w',encoding='utf-8') as f:
#     f.write(respn.text)

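Homework follows. The task was to scrape pictures of beautiful women from a website; I took the liberty of using a picture from EVA, which I like, instead.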

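import requests
# Download the image; res.content holds the raw bytes of the response
res = requests.get('http://pic1.win4000.com/wallpaper/e/58e5aa2ea9215.jpg')

# Write in binary mode ('wb') because an image is not text
with open('shipin.jpg','wb') as f:
    f.write(res.content)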
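
A slightly more defensive version of the same download might look like the sketch below. It is not part of the original homework: the helper names `filename_from_url` and `download_image`, the 10-second timeout, and sending the User-Agent header from the notes are all assumptions made for illustration.

```python
import os
from urllib.parse import urlsplit

IMAGE_URL = 'http://pic1.win4000.com/wallpaper/e/58e5aa2ea9215.jpg'
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36',
}


def filename_from_url(url):
    # Use the last path segment of the URL as the local file name.
    return os.path.basename(urlsplit(url).path)


def download_image(url, folder='.'):
    # Imported here so filename_from_url() works even without requests installed.
    import requests

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()        # raise on 4xx/5xx instead of saving an error page
    path = os.path.join(folder, filename_from_url(url))
    with open(path, 'wb') as f:
        f.write(response.content)      # .content is the raw bytes of the body
    return path


print(filename_from_url(IMAGE_URL))    # 58e5aa2ea9215.jpg
# download_image(IMAGE_URL)            # needs network access; the URL may no longer be reachable
```

Checking the status before writing avoids saving an HTML error page under a `.jpg` name, and the timeout keeps the script from hanging if the server does not answer.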
