Automating the Migration of WeChat Official Account Articles to Other Platforms

1. Requirement

Publish the articles of a WeChat official account on another platform (hereinafter referred to as platform A).

2. Ideas

Idea 1: Open each article link, copy the article content, and paste it into platform A's editor.

Idea 2: Use platform A's reference-link function.

Idea 2 is adopted this time.

3. Tools

Python, UiBot

4. Steps

1. Open the official account and view its message history;

2. Right-click and choose View Source, then save the source as a separate file. (Note: the source only contains the content loaded so far; to include older articles, scroll down to load them first.)

3. Use Python's bs4 module to parse out the required content. Here, the link, title, and date of each article are extracted and saved to an Excel file;

4. Data cleaning, based on the extracted content and actual needs. Here there are two tasks: first, keep only the necessary part of each link (the link still opens normally without its tail, because part of the tail is source-tracking code or similar and can be removed); second, convert the date to YYYY-MM-DD format;

5. Automate the operations with UiBot: loop through the rows of the Excel file and fill in the page's title, the date, and the link required by the reference-link function.
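The cleaning in step 4 could look roughly like the sketch below. It uses only the Python standard library. The list of query parameters to keep (`KEEP_PARAMS`) and the example date format are assumptions, not taken from the original script; check which parameters your own links actually need before relying on it.

```python
import re
from datetime import date
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these query parameters are enough to open an article.
# Everything else (tracking codes, "#wechat_redirect", ...) is dropped.
KEEP_PARAMS = ("__biz", "mid", "idx", "sn")


def clean_link(url):
    """Keep only the query parameters listed in KEEP_PARAMS and drop the fragment."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key in KEEP_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def to_iso_date(text):
    """Convert a date such as '2020年11月8日' or '2020/11/8' to 'YYYY-MM-DD'."""
    match = re.search(r"(\d{4})\D+(\d{1,2})\D+(\d{1,2})", text)
    if match is None:
        raise ValueError(f"no date found in {text!r}")
    year, month, day = map(int, match.groups())
    return date(year, month, day).isoformat()
```

Each extracted row can be passed through `clean_link` and `to_iso_date` before it is written to the Excel file, so that UiBot only ever sees cleaned values.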

5. Difficulties encountered and solutions

1. The date field on platform A is a calendar control (as shown in the figure below), so it does not accept ordinary text-box input. The solution is to execute JavaScript. Some solutions found online first remove the readonly attribute and then assign a value to the date field, but the code that removes the readonly attribute raised an error (see Note 1). Executing the assignment code directly (see Note 2) also raised an error, yet the calendar control on platform A had already moved to the required date. The call is therefore wrapped in exception handling so the error is skipped, and the rest follows the manual procedure: click the date field, then click the calendar's OK button (see Note 3). Because the correct date is already selected at that point, clicking OK is enough.

Try
    // sRet = WebBrowser.RunJS(hWeb,'''document.getElementById("wzrq").removeAttribute("readonly");''',True,{"bContinueOnError":False,"iDelayAfter":300,"iDelayBefore":200}) // Note 1
    dateJs='document.getElementById("wzrq").value = ' & '"' & date & '"'
    sRet = WebBrowser.RunJS(hWeb,dateJs,True,{"bContinueOnError":False,"iDelayAfter":300,"iDelayBefore":200}) // Note 2
Catch e
    // Skip the error: the calendar has already moved to the required date
    TracePrint ""
End Try
// Click the date field -- Note 3
Mouse.Action({"html":[{"id":"wzrq","tag":"INPUT"}],"wnd":[{"app":"iexplore","cls":"IEFrame","title":"*"},{"cls":"Internet Explorer_Server"}]},"left","click",10000,{"bContinueOnError":False,"iDelayAfter":300,"iDelayBefore":200,"bSetForeground":True,"sCursorPosition":"Center","iCursorOffsetX":0,"iCursorOffsetY":0,"sKeyModifiers":[],"sSimulate":"simulate","bMoveSmoothly":False})
// Click the calendar control's OK (确定) button -- Note 3
Mouse.Action({"html":[{"aaname":"确定","parentid":"layui-laydate1","tag":"SPAN"}],"wnd":[{"app":"iexplore","cls":"IEFrame","title":"*"},{"cls":"Internet Explorer_Server"}]},"left","click",10000,{"bContinueOnError":False,"iDelayAfter":300,"iDelayBefore":200,"bSetForeground":True,"sCursorPosition":"Center","iCursorOffsetX":0,"iCursorOffsetY":0,"sKeyModifiers":[],"sSimulate":"simulate","bMoveSmoothly":False})

2. After the link is imported, the article may fail to load because of network problems. Going straight to the next step at that point leaves the article content empty. There are two ways to handle this: delay the operations that follow the import so the content has enough time to load (that is, set a delay), or check whether the page has changed after loading, for example whether certain elements have appeared. The change check used here relies on the editor's word-count prompt (see the picture below), using positioning + screenshot + image recognition. If the link has not loaded, the prompt is recognized as "0 words and 0 pictures have been entered, and it is expected to read for 0 minutes"; if it has loaded, the result is different. Based on the pattern of the sentence (X words, X pictures, X minutes), a regular expression extracts the numbers (the result is an array), which is compared with the array [0,0,0]. If they are equal, loading failed and the link is reloaded in a loop; if they are not equal, loading succeeded.

Note: two things matter in this specific scenario. First, clear the editor's existing content (using the editor's clear function) before loading the link, so that the prompt starts at "0 words and 0 pictures have been entered, and it is expected to read for 0 minutes". Second, no built-in UiBot function for checking whether two arrays are equal was found, so the comparison is implemented with a custom Python plug-in function.
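The check described above might be sketched in Python as follows. This is not the original plug-in: the function names, the Chinese wording of the prompt, and the `read_status` / `reload` callables (standing in for the screenshot-and-recognize step and the re-import step) are all hypothetical.

```python
import re
import time


def extract_counts(ocr_text):
    """Return every number found in the recognized prompt, in order."""
    return [int(number) for number in re.findall(r"\d+", ocr_text)]


def is_loaded(ocr_text):
    """True only if three numbers were recognized and they are not all zero."""
    counts = extract_counts(ocr_text)
    return len(counts) == 3 and counts != [0, 0, 0]


def wait_until_loaded(read_status, reload, max_tries=5, delay=2.0):
    """Re-import the link until the prompt changes or max_tries is reached.

    read_status: hypothetical callable returning the recognized prompt text.
    reload: hypothetical callable that clears the editor and imports the link again.
    """
    for _ in range(max_tries):
        time.sleep(delay)  # give the article time to load
        if is_loaded(read_status()):
            return True
        reload()
    return False
```

One design choice differs slightly from a plain comparison with [0,0,0]: a recognition result that does not contain exactly three numbers is treated as "not loaded", so a failed screenshot or garbled recognition triggers a retry instead of being mistaken for success.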

 


Origin blog.csdn.net/m0_49621298/article/details/109560949