Using Python to Analyze Hot-Selling Items on Xianyu (Idle Fish)


Hack  1 week ago

The following article is from AirPython, by Ann Star fruit.


1

Target Scenario

 

I often see friends selling small items, or skills they are good at, on Xianyu (闲鱼, literally "idle fish"), and it brings them a decent passive income.

 

With so many items on Xianyu, it is hard to tell precisely which ones are popular and which ones sell well. Entering the data one item at a time for analysis wastes time and is extremely inefficient.

 

The goal of this article is to use Python automation to collect the best-selling items for a given kind of goods, for reference.

 

PS: This article is for technical exchange only and must not be used for other purposes.

 

2

Preparation

 

Before writing any code, complete the following preparations:

1. Set up the Android ADB development environment.

2. Install the pocoui library in a Python virtual environment.

3. Install the data visualization library pyecharts.

 

# pocoui
pip3 install pocoui

# data visualization charts
pip3 install pyecharts -U
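
Before running the crawler, it can help to confirm that the three prerequisites are actually in place. The sketch below is not from the original project; `check_environment` is a hypothetical helper, and it assumes that pocoui is importable as the module `poco` and that `adb` should be reachable on the PATH.

```python
import importlib.util
import shutil


def check_environment(modules=("poco", "pyecharts"), tools=("adb",)):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        status[tool] = shutil.which(tool) is not None
    for module in modules:
        # find_spec returns None when a top-level module cannot be imported
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Anything reported as MISSING points back to the corresponding item in the preparation list.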

 

 

3

Writing the Code

 

We split the work into 7 steps: open the target app, search by keyword to reach the product list, calculate the optimal swipe distance, filter the items, get each item's share link, write the items to a file and sort and tally them, and configure the parameters.

 

 

Step 1: use pocoui to open the target app automatically.

 

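def __pre(self):
    """
    Preparation: restart the app and wait for its main screen.
    :return:
    """
    # home() and stop_app() are presumably Airtest API calls;
    # start_my_app() is the author's own helper (not shown here)
    home()
    stop_app(package_name)
    start_my_app(package_name, activity)

    # Wait until the main screen is reached
    # (the text values are the app's own tab labels and must stay in Chinese)
    self.poco(text='闲鱼').wait_for_appearance()
    self.poco(text='鱼塘').wait_for_appearance()
    self.poco(text='消息').wait_for_appearance()
    self.poco(text='我的').wait_for_appearance()

    print('Entered the Xianyu main screen')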

 

After reaching the Xianyu home screen, the app reads the clipboard. If the clipboard holds a Taobao share code (淘口令) that matches a specific pattern, a dialog pops up immediately, so we need to simulate closing that dialog.

 

# If a Taobao share-code (淘口令) dialog appears within the given time, close it
for i in range(10, -1, -1):
    close_element = self.poco('com.taobao.idlefish:id/ivClose')
    if close_element.exists():
        close_element.click()
        break
    time.sleep(1)
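
The same "poll for a short while, then give up" idea can be factored into a small reusable function. This is only a sketch: `dismiss_if_present` and its parameters are hypothetical names, not part of pocoui or the original source, and the `sleep` argument exists so the function can be tested without real waiting.

```python
import time


def dismiss_if_present(exists, click, timeout=10.0, interval=1.0, sleep=time.sleep):
    """Poll `exists()` for up to `timeout` seconds; call `click()` once it is true.

    Returns True if the dialog was dismissed, False if it never showed up.
    """
    waited = 0.0
    while True:
        if exists():
            click()
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

With a Poco element this might be called as `close = self.poco('com.taobao.idlefish:id/ivClose')` followed by `dismiss_if_present(close.exists, close.click)`.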

 

Once the app is open, we can move on to step 2.

 

To search by keyword, simulate typing into the input box and tapping the Search button, then wait until the result list appears.

 

 

In addition, to make the data easier to process, switch the product list to list mode, i.e., each row shows only one item.

 

def __input_key_word(self):
    """
    Enter the search keyword.
    :return:
    """
    # Open the search screen
    perform_click(self.poco('com.taobao.idlefish:id/bar_tx'))

    # Type the keyword into the search box
    self.poco('com.taobao.idlefish:id/search_term').set_text(self.good_msg)

    # Tap the Search button until the result list appears
    while True:
        if not self.poco('com.taobao.idlefish:id/list_recyclerview').exists():
            perform_click(self.poco('com.taobao.idlefish:id/search_button', text='搜索'))
        else:
            break

    # Wait for the product list to appear completely
    self.poco('com.taobao.idlefish:id/list_recyclerview').wait_for_appearance()

    # Switch to list mode
    perform_click(self.poco('com.taobao.idlefish:id/switch_search'))
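
A `while True` loop like the one above never exits if the result list fails to load. One possible safeguard, shown here as a sketch rather than the author's approach, is to cap the number of attempts. `click_until` is a hypothetical helper that takes plain callables, so it does not depend on any particular automation library.

```python
import time


def click_until(action, done, max_attempts=5, pause=1.0, sleep=time.sleep):
    """Call `action()` until `done()` is true, at most `max_attempts` times.

    Returns the number of times `action` was called; raises TimeoutError if
    `done()` is still false after the last attempt.
    """
    attempts = 0
    while not done():
        if attempts >= max_attempts:
            raise TimeoutError(f"condition not met after {attempts} attempts")
        action()
        attempts += 1
        sleep(pause)
    return attempts
```

In the search step, `action` would tap the Search button and `done` would check whether `list_recyclerview` exists.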

 

Step 3: calculate the optimal swipe distance.

 

To crawl the data efficiently, work out the optimal distance for each swipe.

 

First, get the UI control tree of the current screen. Then locate an item by its control ID to get its coordinates, and from those compute the height of each item.

 

Finally, multiply that height by the number of items observed on one screen to get the optimal swipe distance.

 

def __get_good_swipe_distance(self):
    """
    Get the most suitable distance for each swipe.
    :return:
    """
    element = Element()
    # Save the current UI tree locally
    element.get_current_ui_tree()

    # Coordinates of the first product item
    position_item = element.find_elment_position_by_id_and_index("com.taobao.idlefish:id/card_root",
                                                                 "1")
    # Height of one item: bottom y minus top y
    item_height = position_item[1][1] - position_item[0][1]

    # By observation, the current screen shows 3 items
    return item_height * 3
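
The `Element` helper is not shown in the article. As a rough illustration of what such a lookup could involve, the sketch below assumes the UI tree is saved in the XML format produced by `uiautomator dump`, where each `node` carries a `resource-id` and a `bounds="[left,top][right,bottom]"` attribute. The function names, the 0-based index, and the sample XML are illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET

BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def find_bounds(xml_text, resource_id, index=0):
    """Return ((left, top), (right, bottom)) of the index-th node with this resource-id."""
    root = ET.fromstring(xml_text)
    nodes = [n for n in root.iter("node") if n.get("resource-id") == resource_id]
    match = BOUNDS_RE.fullmatch(nodes[index].get("bounds"))
    left, top, right, bottom = map(int, match.groups())
    return (left, top), (right, bottom)


def swipe_distance(xml_text, resource_id, items_per_screen=3):
    """Item height (bottom - top) multiplied by the number of items per screen."""
    (_, top), (_, bottom) = find_bounds(xml_text, resource_id)
    return (bottom - top) * items_per_screen


# Made-up sample tree: two product cards, each 480 px tall
SAMPLE = """<hierarchy>
  <node resource-id="com.taobao.idlefish:id/list_recyclerview" bounds="[0,200][1080,1920]">
    <node resource-id="com.taobao.idlefish:id/card_root" bounds="[0,200][1080,680]" />
    <node resource-id="com.taobao.idlefish:id/card_root" bounds="[0,680][1080,1160]" />
  </node>
</hierarchy>"""

print(swipe_distance(SAMPLE, "com.taobao.idlefish:id/card_root"))  # 480 * 3 = 1440
```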

 

 

 

Step 4: filter the items.

 

With the optimal swipe distance from the previous step, swipe the page, and once it stops, iterate over the list's child elements (the items).

 

Note that, to avoid errors caused by swipe inertia, each swipe should last more than 2 s.
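
Poco works with normalized screen coordinates (values between 0 and 1), so a pixel distance has to be converted before swiping. The helper below is a hypothetical sketch of that conversion; `swipe_points`, the default start position, and the error handling are assumptions, not code from the original project.

```python
def swipe_points(distance_px, screen_height_px, start_y=0.8, x=0.5):
    """Return normalized (start, end) points for an upward swipe of `distance_px`."""
    delta = distance_px / screen_height_px
    end_y = start_y - delta
    if end_y < 0:
        raise ValueError("swipe distance does not fit on the screen from start_y")
    return (x, start_y), (x, end_y)
```

The two points could then be passed to Poco's swipe call together with a duration above 2 seconds, for example `self.poco.swipe(start, end, duration=2.5)`, so that inertia does not carry the list past the intended position.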

 

An item is kept only if its "want" count (the number of people who want it) reaches the preset threshold.

 

 

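# How many people want this item
want_element_parent = item.offspring('com.taobao.idlefish:id/search_item_flowlayout')

if want_element_parent.exists():
    # "Want" count / "paid" count
    want_element = want_element_parent.children()[0]

    want_content = want_element.get_text()

    # Skip items labeled 已付款 (paid) and similar; keep only items posted by individuals
    if '人想要' not in want_content:
        continue

    # Extract the actual "want" count, which indicates how popular the item is
    want_num = get_num(want_content)

    if int(want_num) < self.num_assign:
        # Below the threshold: filter it out
        pass
    else:
        # The "want" count reaches the threshold: include the item in the statistics
        ...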
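
The `get_num` helper is not shown in the article. A minimal sketch of what such a helper might look like, assuming the label has the form "<digits>人想要", is given below. `parse_want_count` and `is_hot` are hypothetical names, and labels in any other form (including abbreviated counts, if the app uses them) simply yield `None`.

```python
import re

# Digits immediately before the "人想要" ("people want this") suffix
WANT_RE = re.compile(r"(\d+)\s*人想要")


def parse_want_count(text):
    """Return the integer before '人想要', or None if the label is something else."""
    match = WANT_RE.search(text)
    return int(match.group(1)) if match else None


def is_hot(text, threshold):
    """True when the label carries a 'want' count that reaches the threshold."""
    count = parse_want_count(text)
    return count is not None and count >= threshold
```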

 

Step 5: get the item's share link.

 

For each item that met the condition in the previous step, tap the item to open its detail page.

 

Then tap the Share button at the top right, and the share sheet pops up immediately.

 

 

Then tap the Taobao share-code (淘口令) control. A prompt confirms that the code has been copied to the system clipboard.

 

# Tap "More" (更多) until the share options appear
while True:
    if self.poco('com.taobao.idlefish:id/ftShareName').exists():
        break
    print('Tapping "More"...')
    perform_click(self.poco(text='更多'))

# Tap the option that copies the Taobao share code (淘口令)
perform_click(self.poco('com.taobao.idlefish:id/ftShareName', text='淘口令'))

# Read the share code from the prompt
taobao_code_element = self.poco('com.taobao.idlefish:id/tvWarnDetail')

taobao_code = taobao_code_element.get_text()

 

Step 6: write the items to a file, then sort and tally the data.

 

 

Write the item title, "want" count, and share link obtained above to a CSV file.
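
The article's `write_to_csv` helper is not shown. As a rough stand-in, the sketch below appends rows with the standard `csv` module and writes a header the first time. `append_rows` is a hypothetical name; the column names `title` and `num` match the ones read back in the charting code, while `link` is an assumed name for the third column.

```python
import csv
import os

HEADER = ("title", "num", "link")  # "link" is an assumed column name


def append_rows(path, rows, header=HEADER):
    """Append rows to a CSV file, writing the header first if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings; utf-8 keeps Chinese titles intact
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(header)
        writer.writerows(rows)
```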

 

Then read the file back and sort the rows by the second column in reverse order, so that the items are listed by "want" count in descending order.

 

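def __sort_result(self):
    """
    Sort the crawled results.
    :return: the data rows, sorted by "want" count in descending order
    """
    # Read everything first so the file is closed before it is written to again
    with open(self.file_path) as csvfile:
        reader = csv.reader(csvfile, delimiter=",")

        # Header row
        head_title = next(reader)

        # Sort by the second column in descending order
        sortedlist = sorted(reader, key=lambda x: int(x[1]), reverse=True)

    # Write the header row (write_to_csv is the author's helper, not shown here)
    write_to_csv(self.file_path, [(head_title[0], head_title[1], head_title[2])], False)

    for value in sortedlist:
        write_to_csv(self.file_path, [(value[0], value[1], value[2])], False)

    return sortedlist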
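
Because `write_to_csv` is not shown, it is unclear from the article whether the file is overwritten or appended to during this rewrite. A self-contained variant that makes the behavior explicit is sketched below: it reads all rows, sorts them, and rewrites the file in a single pass. `sort_csv_by_count` is a hypothetical name and assumes the count is in the second column.

```python
import csv


def sort_csv_by_count(path, column=1):
    """Sort data rows by an integer column, descending, and rewrite the file in place."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = sorted(reader, key=lambda row: int(row[column]), reverse=True)

    # Mode "w" truncates the file, so it ends up with one header plus the sorted rows
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return rows
```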

 

Finally, take the top 10 rows and use pyecharts to generate a chart.

 

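# Imports needed by this method (pyecharts 1.x style)
from pyecharts import options as opts
from pyecharts.charts import Bar

def draw_image(self, sortedlist):
    """
    Draw the chart.
    :param sortedlist:
    :return:
    """

    # Item titles
    titles = []

    # "Want" counts
    sales_num = []

    # Read the title and count columns from the crawl results
    with open(self.file_path, 'r') as csvfile:
        reader = csv.DictReader(csvfile)

        for row in reader:
            titles.append(row['title'])
            # Convert to int so the bar heights are numeric
            sales_num.append(int(row['num']))

    # Keep at most self.num items
    if len(titles) > self.num:
        titles = titles[:self.num]
        sales_num = sales_num[:self.num]

    # Build and render the bar chart
    bar = (
        Bar()
        .add_xaxis(titles)
        .add_yaxis("What sells well", sales_num)
        .set_global_opts(title_opts=opts.TitleOpts(title="I want to sell"))
    )
    bar.render('%s.html' % self.good_msg)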

 

 

Step 7: configure the parameters.

 

Write a YAML file that specifies the search keyword for the items to crawl, the crawl duration, the "want" count threshold, and the number of items to keep.

 

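goods:
  # Item 1 to search for, including the search keyword and crawl duration
  good1:
    key_word: '资料'   # search keyword (资料, "materials")
    key_num: 100  # threshold for the "want" count
    num: 10      # keep only the top sellers
    time: 600   # crawl duration (seconds)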
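
Once the file is loaded (for example with PyYAML's `yaml.safe_load`), the resulting mapping can be validated before the crawl starts. The sketch below works on an already-loaded dict so that it needs only the standard library. `GoodConfig` and `parse_goods` are hypothetical names, and the validation rules are assumptions about what sensible values look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoodConfig:
    key_word: str  # search keyword
    key_num: int   # minimum "want" count
    num: int       # how many top items to keep
    time: int      # crawl duration in seconds


def parse_goods(config):
    """Turn the loaded YAML mapping into {name: GoodConfig}, validating the numbers."""
    goods = {}
    for name, raw in config["goods"].items():
        good = GoodConfig(
            key_word=str(raw["key_word"]),
            key_num=int(raw["key_num"]),
            num=int(raw["num"]),
            time=int(raw["time"]),
        )
        if good.key_num < 0 or good.num <= 0 or good.time <= 0:
            raise ValueError(f"invalid settings for {name}: {good}")
        goods[name] = good
    return goods


# This dict mirrors what a YAML loader would return for the configuration above
loaded = {"goods": {"good1": {"key_word": "资料", "key_num": 100, "num": 10, "time": 600}}}
print(parse_goods(loaded)["good1"])
```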

 

4

Results and Conclusion

 

With the item keyword, crawl duration, and other parameters configured in advance, the script crawls the best-selling items that meet the requirements and finally displays them as a chart.

 

 

 

I have uploaded the full source code to the official account's backend. Follow the official account and reply "selling things" to get the download link.

 

If you found this article useful, please share it and give it a thumbs-up. Your approval is my greatest encouragement and support.


 


Origin blog.csdn.net/pangzhaowen/article/details/102912933