spider-flow 0.1.0 released: an open source Java crawler platform

After more than three months of development, this is the first official release.

spider-flow is a crawler platform that requires no coding: crawlers are developed by defining a process flow.

It currently has the following features:

  • Supports CSS selector and regular expression extraction
  • Supports JSON / XML formats
  • Supports XPath / JsonPath extraction
  • Supports multiple data sources, with SQL select / insert / update / delete / bulk insert
  • Supports crawling pages that are rendered dynamically with JS
  • Supports proxies
  • Supports binary and binary stream formats
  • Supports saving / reading files (csv, xls, jpg, etc.)
  • Common functions for strings, dates, files, encryption / decryption, and more
  • Supports nested flows
  • Supports plug-in extensions (custom executors, custom functions)
  • Supports HTTP interfaces
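
As a rough illustration of what regular expression and XPath extraction mean in practice, here is a minimal Java sketch that uses only the JDK standard library. It is not spider-flow's API or implementation; the class name ExtractionSketch, its two methods, and the sample inputs are all hypothetical.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only; not part of spider-flow.
class ExtractionSketch {

    // Returns the first capture group of the first match, or null when nothing matches.
    // The regex is assumed to contain at least one capture group.
    static String extractByRegex(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Evaluates an XPath expression against a well-formed XML string
    // and returns the string value of the result.
    static String extractByXPath(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item id=\"1\"><title>First post</title></item></items>";
        System.out.println(extractByRegex("price: 42 USD", "price: (\\d+)")); // prints 42
        System.out.println(extractByXPath(xml, "/items/item[@id='1']/title")); // prints First post
    }
}
```

In a flow-based platform, steps like these would presumably be configured on a node rather than written by hand; the sketch only shows the underlying extraction idea.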

Existing plugins:

  • Selenium plugin
  • Redis plugin
  • MongoDB plugin
  • IP proxy pool plugin
  • OCR recognition plugin
  • OSS plugin
  • E-mail plugin

Origin www.oschina.net/news/110954/spider-flow-0-1-0-released