Python Automation Scripts [1]: Extracting URLs and Opening Pages Automatically

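The task: read a 500-row CSV file, open the URL in each row one by one, and classify each URL by what the page shows, into one of five types: D, H, U, B, or N. Since I happen to be learning Python, I decided to write a semi-automated script (semi-automated because I still have to type the type myself and press Enter) to skip the three manual steps of copying the URL, pasting it into the browser's address bar, and pressing Enter. At the end, the script writes a CSV file named "501-1000".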

The code is pasted below.

# -*- coding: utf-8 -*-
"""
Created on Wed Apr 19 12:25:04 2017
@author: lewiskoo
"""
import pandas as pd
import webbrowser

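data_df = pd.read_csv('1.csv')
data_df['type'] = data_df['type'].astype(object)  # let the column hold string labels even if it was read as empty/numeric
n = 0
chromePath = r'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
webbrowser.register('chrome', None, webbrowser.BackgroundBrowser(chromePath))  # register Chrome so webbrowser can open pages with it
while n < len(data_df):  # process every row of the CSV
    print('~~~~~~~~~~No. %d is processing~~~~~~~~~~~~~' % n)
    print('D1 H2 U3 B4 N5')
    webbrowser.get('chrome').open_new_tab(data_df.loc[n, 'name'])  # open this row's URL in a new Chrome tab
    i = int(input('which type?'))
    if i == 0:  # 0 stops early; the rows labelled so far are still written out
        break
    elif i == 1:
        # .loc[row, column] writes into data_df itself; data_df.type[n] = ... is
        # chained assignment and can silently modify a temporary copy instead
        data_df.loc[n, 'type'] = 'D'
    elif i == 2:
        data_df.loc[n, 'type'] = 'H'
    elif i == 3:
        data_df.loc[n, 'type'] = 'U'
    elif i == 4:
        data_df.loc[n, 'type'] = 'B'
    elif i == 5:
        data_df.loc[n, 'type'] = 'N'
    else:
        data_df.loc[n, 'type'] = i  # keep any other number as typed
    print('type of No.%d is decided' % n)
    n += 1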
data_df.to_csv('501-1000.csv') #output the df into a csv file
print('All the stuff is done and the csv is written.')

It is not very Pythonic and has a few small bugs. But as a beginner, I feel that getting something to work at all is already big progress.
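As a rough sketch of what a more Pythonic classification step might look like, the if/elif chain could be replaced by a dictionary lookup, with non-numeric input reported instead of crashing the whole session inside `int()`. The names `TYPE_CODES` and `parse_choice` are made up for this illustration and are not part of the script above. Unlike the script, this sketch rejects numbers outside 0 to 5 rather than storing them.

```python
# Hypothetical helper names; only the D/H/U/B/N codes come from the script.
TYPE_CODES = {1: 'D', 2: 'H', 3: 'U', 4: 'B', 5: 'N'}


def parse_choice(raw):
    """Turn the text typed at the prompt into a label.

    Returns None for '0' (meaning "stop"), the letter for 1-5,
    and raises ValueError for anything else so the caller can ask again.
    """
    choice = int(raw.strip())  # raises ValueError on non-numeric input
    if choice == 0:
        return None
    if choice not in TYPE_CODES:
        raise ValueError('expected a number from 0 to 5, got %r' % raw)
    return TYPE_CODES[choice]
```

Inside the loop, the call could be wrapped in `try`/`except ValueError` so that a typo just repeats the prompt for the same row instead of ending the run.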

While I'm at it, here are the lessons I learned:

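Testing and keeping multiple local versions matter. The first version of the script was simpler, even crude, but it covered the basic functionality. Then I wanted to make it fancier, carelessly changed a few places without testing them, and after entering all 500 values found that none of them had been saved in the DataFrame. I nearly exploded; luckily the history had kept a fair amount of the data. So no matter how tight the deadline is, once you decide to automate something with a program, make sure the basic functionality is reliable.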
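I need to keep training my ability to search for information. Just like looking for references for course design projects in my major, different kinds of material call for different search methods: for code, search GitHub, Jianshu, CSDN and similar sites; for algorithms, search Google Scholar or CNKI. Using Google well is a skill, and quickly finding the information you want is not easy.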
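Code has to be written regularly. While I was slumped in my office chair, Ge You-style, mechanically typing in the types, I kept feeling that I had gotten something wrong somewhere (and afterwards it turned out that something had indeed gone wrong). Without writing and debugging code often, it is hard to trust your own coding ability, and that will cause trouble.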
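The first lesson (500 values typed, none of them kept) suggests one cheap safeguard: append every decision to a log file the moment it is made, so the work survives even if the DataFrame logic is wrong or the script crashes. Below is a minimal sketch using only the standard library; the function names and the log layout (row number, URL, label) are assumptions for illustration, not something the original script does.

```python
import csv


def append_label(log_path, row_number, url, label):
    """Append one (row, url, label) record; the file is closed after each write."""
    with open(log_path, 'a', newline='', encoding='utf-8') as f:
        csv.writer(f).writerow([row_number, url, label])


def load_labels(log_path):
    """Read the log back as {row_number: label}; later entries win."""
    labels = {}
    try:
        with open(log_path, newline='', encoding='utf-8') as f:
            for row_number, _url, label in csv.reader(f):
                labels[int(row_number)] = label
    except FileNotFoundError:
        pass  # no log yet: nothing has been labelled
    return labels
```

Calling `append_label` right after each prompt, and `load_labels` at startup to skip rows that are already done, would also make it possible to resume an interrupted session.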
