Common techniques for software data collection and integration

Data silos are everywhere today, and integrating with business software or collecting data from it is difficult. Collecting data from C/S (client/server) software is especially hard.

The most common approach is to integrate through system interfaces. With luck the integration goes smoothly, but this approach often takes a great deal of time spent coordinating with the various software vendors.

Besides software interfaces, are there other options? The editor has summarized the common data collection techniques for your reference. They fall into the following categories:

First, C/S software data collection.

C/S software is built on an older architecture, and relatively few products can collect data from it.

One common tool is a software robot that, without any cooperation from the software vendor, collects data directly from the application's interface in a WYSIWYG ("what you see is what you get") fashion. The output is a structured database or an Excel table. If you only need the business data, or the vendor has gone out of business, or the database is hard to analyze, this tool can still collect the data. Its collection of detail-page data in particular is a distinguishing feature.

It is worth mentioning that the barrier to using this product is very low: colleagues on the business side with no IT background can use it, which greatly widens the pool of potential users.

Second, web data collection via API. Data is obtained from websites through web crawlers or through public APIs that some website platforms provide (such as the Twitter and Sina Weibo APIs). In this way, unstructured and semi-structured data can be extracted from web pages.

The whole process of collecting and processing web data at scale comprises four main modules: the web crawler (Spider), data processing (Data Process), the queue of URLs to crawl (URL Queue), and the data itself (Data).
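The four modules above can be sketched in a few lines of Python. This is only an illustrative sketch, not any particular product's implementation: the names `crawl`, `LinkAndTitleParser`, and the `fetch` callback are hypothetical, and the "data processing" step is reduced to extracting each page's title and links.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTitleParser(HTMLParser):
    """Data Process: extract the <title> text and all <a href> links."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(seed_url, fetch, max_pages=100):
    """Crawl breadth-first from seed_url.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the page HTML, or None on failure.
    """
    url_queue = deque([seed_url])  # URL Queue: pages waiting to be crawled
    seen = {seed_url}              # avoid queuing the same URL twice
    data = []                      # Data: the structured records collected

    while url_queue and len(data) < max_pages:
        url = url_queue.popleft()
        html = fetch(url)          # Spider: download the page
        if html is None:
            continue
        parser = LinkAndTitleParser()
        parser.feed(html)          # Data Process: parse the page
        data.append({"url": url, "title": parser.title.strip()})
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                url_queue.append(absolute)
    return data
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library. In practice it might wrap `urllib.request.urlopen` and add politeness delays, robots.txt checks, and error handling, none of which are shown here.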

Third, the database approach.

Here each of the two systems has its own database. When both databases are of the same type, integration is fairly convenient:

1) If the two databases are on the same server, then as long as the user permissions are set up correctly, each can access the other directly. You only need to qualify the table name with its database name and schema owner, for example: select * from DATABASE1.dbo.table1

2) If the two systems' databases are not on the same server, it is recommended to use a linked server, or to use OPENROWSET or OPENDATASOURCE. This requires enabling the corresponding option in the database server's surface area configuration.
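The idea in case 1, reaching a table in another database on the same server by qualifying its name, can be tried out in Python. Note that this sketch swaps in SQLite, where `ATTACH DATABASE` plays the role of the second database. It is a stand-in for illustration, not the SQL Server `DATABASE1.dbo.table1` syntax from the text, and the table names and sample rows are made up.

```python
import sqlite3


def query_across_databases():
    """Join a table in the main database with one in an attached database."""
    conn = sqlite3.connect(":memory:")
    # Attach a second, separate in-memory database under the name "database1".
    conn.execute("ATTACH DATABASE ':memory:' AS database1")

    # A table that lives in the other system's database (hypothetical data).
    conn.execute("CREATE TABLE database1.table1 (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO database1.table1 VALUES (1, 'order-001')")

    # A table that lives in our own database (hypothetical data).
    conn.execute("CREATE TABLE main.local_orders (id INTEGER, amount REAL)")
    conn.execute("INSERT INTO main.local_orders VALUES (1, 25.0)")

    # Qualifying each table with its database name lets one query span both.
    rows = conn.execute(
        "SELECT t.name, o.amount "
        "FROM database1.table1 AS t "
        "JOIN main.local_orders AS o ON o.id = t.id"
    ).fetchall()
    conn.close()
    return rows
```

On a real server the same pattern depends on the connecting user having read permission on both databases, which is the "user permissions" condition mentioned in case 1.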

Connecting databases of different types is more troublesome and requires a lot of configuration before it works, so it is not described in detail here.

The open-database approach requires coordinating with each software vendor to open up its database, which is very difficult. And if one platform has to connect to many vendors' databases at the same time and fetch data in real time, that poses a huge challenge to the performance of the platform itself.

Discussion is welcome.


Origin www.cnblogs.com/xiaobang101/p/11881203.html