Anti-scraping techniques and how to handle them:
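1. User-Agent checks: send a User-Agent header with each request, and build a pool of valid User-Agent strings to rotate through;
2. IP limits: the site limits how often the same IP may access it; throttle requests (download_delay = 8) and use IP proxies;
3. Data hidden in JS scripts: regular expressions (re) are generally used to extract the data;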
4. Ajax requests (dynamic data): selenium or pyppeteer can be used, but they are slow and strain machine performance;
requesting the data interface directly is recommended.
5. CAPTCHAs:
letter-and-digit CAPTCHAs: OpenCV image recognition, or an online CAPTCHA-solving platform;
slider CAPTCHAs: ...
6. JS reverse engineering: commonly seen schemes are MD5 (hashing), RSA (asymmetric encryption), DES (symmetric encryption), Base64 encoding, and JS obfuscation (sojson.v5);
7. Font encryption: find the font's character encoding mapping table;
8. Data encoding problems: gbk, gb2312, unicode, URL encoding, HTML special characters, and mixed encodings;
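
A few of the items above (User-Agent rotation, regex extraction from JS, MD5/Base64 handling, and decoding mixed encodings) can be sketched with the Python standard library alone. This is a minimal illustration, not a complete crawler: the pool contents, the helper names (random_headers, extract_js_data, md5_hex, b64_decode_text, decode_field), and the variable name pageData are all hypothetical, chosen only for this example.

```python
import base64
import hashlib
import html
import json
import random
import re
from urllib.parse import unquote

# Hypothetical pool; a real one would hold many current browser strings.
UA_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]


def random_headers():
    """User-Agent checks: pick a User-Agent at random for each request."""
    return {"User-Agent": random.choice(UA_POOL)}


def extract_js_data(page_source, var_name):
    """Data hidden in JS: pull `var <name> = {...};` out of the page as JSON.

    Assumes the object literal is valid JSON and that no string inside it
    contains "};" (a regex cannot parse JavaScript in general).
    """
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\});"
    match = re.search(pattern, page_source, re.S)
    return json.loads(match.group(1)) if match else None


def md5_hex(text):
    """JS reverse engineering: MD5 digest, often seen in request signatures."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def b64_decode_text(encoded):
    """JS reverse engineering: decode a Base64 string into UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


def decode_field(raw_bytes, encoding="gbk"):
    """Encoding problems: decode bytes, then undo URL encoding and HTML entities."""
    text = raw_bytes.decode(encoding, errors="replace")
    return html.unescape(unquote(text))


if __name__ == "__main__":
    page = '<script>var pageData = {"items": {"count": 2}};</script>'
    print(random_headers())
    print(extract_js_data(page, "pageData"))
    print(md5_hex("abc"))
    print(b64_decode_text("aGVsbG8="))
    print(decode_field("价格".encode("gbk") + b"%20&amp;%201"))
```

The order in decode_field (bytes to text, then URL decoding, then HTML entities) is one reasonable choice for mixed encodings; a real page may need a different order or a different source encoding.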