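Get_domains (collecting assets through ICP and reverse email lookup interfaces)

I wrote a small tool whose goal is to collect as many assets as possible. It calls the ICP filing lookup and the reverse email lookup provided by 站长工具 (chinaz.com).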

Environment: Python 3

get_domains.py:

import requests
import re


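host = ''  # root domain to query, e.g. jd.com
mail = ''  # registrant email for the reverse lookup; leave empty to skip it

headers = {
    'Cookie':'',  # paste the Cookie header of a logged-in chinaz session here
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'
}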

def getdomain_icp(host):
    # Query the ICP filing interface and append every MainPage domain to domains.txt
    url = 'https://icp.chinaz.com/ajaxsync.aspx?at=beiansl&callback=jQuery111308504784465392665_1555747182932&host={}&type=host&_=1555747182933'.format(host)
    req = requests.get(url, headers=headers, timeout=10)
    domains = re.findall('MainPage:"(.*?)",', req.text, re.S)
    # Open the output file once instead of once per domain
    with open('domains.txt', 'a+') as f:
        for domain in domains:
            f.write(domain + "\n")



def getdomain_mail(mail):
    # Reverse whois lookup by email; the login Cookie in headers is sent on every request
    url = 'http://whois.chinaz.com/reverse?host={}&ddlSearchMode=1&page=1'.format(mail)
    req = requests.get(url, headers=headers, timeout=10)
    pages = re.findall('<span class="col-gray02">共(.*?)页', req.text)  # total number of result pages
    # Fall back to a single page when no pager is found, instead of raising IndexError
    page = int(pages[0]) + 1 if pages else 2
    with open('domains.txt', 'a+') as f:
        for i in range(1, page):
            real_req = requests.get('http://whois.chinaz.com/reverse?host={}&ddlSearchMode=1&page={}'.format(mail, i), headers=headers, timeout=10)
            domains = re.findall('<div class="w13-0 domain"><div class="listOther"><a href="/(.*?)"', real_req.text)
            for domain in domains:
                f.write(domain + "\n")

def main(host,mail=''):
    getdomain_icp(host)
    if mail!='':
        getdomain_mail(mail)

if __name__ =='__main__':
    main(host=host,mail=mail)

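To use it, you only need to edit host and mail. host is a root domain such as jd.com or tencent.com. The Cookie field in headers has to be filled in by hand: log in to 站长工具 and copy the Cookie value from the request headers, otherwise the reverse email lookup will not return all results. mail is empty by default, so you can run only the ICP filing lookup by setting host alone.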
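
If editing the source for every run gets tedious, the same three values could be taken from the command line. This is only a sketch: the parse_args helper and the --mail and --cookie flags are hypothetical names that are not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse host, mail and cookie from the command line (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description='Collect domains via ICP filing and reverse email lookup')
    parser.add_argument('host', help='root domain, e.g. jd.com')
    parser.add_argument('--mail', default='',
                        help='registrant email; omit to run the ICP lookup only')
    parser.add_argument('--cookie', default='',
                        help='Cookie header copied from a logged-in session')
    return parser.parse_args(argv)
```

With this in place, the main block could read args = parse_args(), set headers['Cookie'] = args.cookie, and then call main(host=args.host, mail=args.mail).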

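The results are not deduplicated; you can use my previous tool to deduplicate them while grabbing banners. The results are appended to domains.txt in the current directory.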
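
If you would rather deduplicate domains.txt directly, something like the following should do. It is a standalone sketch and not the previous tool mentioned above; dedupe_lines and dedupe_file are names made up for this example, and treating domains as case-insensitive is my assumption.

```python
def dedupe_lines(lines):
    """Return unique, non-empty domains in first-seen order (case-insensitive)."""
    seen = set()
    unique = []
    for line in lines:
        domain = line.strip().lower()
        if domain and domain not in seen:
            seen.add(domain)
            unique.append(domain)
    return unique


def dedupe_file(path='domains.txt'):
    """Rewrite the file in place so that each domain appears only once."""
    with open(path, encoding='utf-8') as f:
        unique = dedupe_lines(f)
    with open(path, 'w', encoding='utf-8') as f:
        for domain in unique:
            f.write(domain + '\n')
    return unique
```

Calling dedupe_file() once after main() has finished is enough, since the script only ever appends to the file.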

The script is single-threaded, because I was worried about getting my IP banned.
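
For the same reason it may be worth spacing the requests out as well. Below is a minimal sketch of a throttle that enforces a minimum gap between calls. The Throttle class is hypothetical and the 2 second interval in the usage note is an arbitrary guess, since I do not know what rate the site actually tolerates.

```python
import time


class Throttle:
    """Make sure at least min_interval seconds pass between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Usage would be to create throttle = Throttle(2.0) once and call throttle.wait() right before each requests.get in the page loop.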

Reposted from www.cnblogs.com/P1g3/p/10741516.html