Data import and export based on database and WEB

Part 1 Database-based data import and export  

Purpose:

        In Kettle, a "database connection" is really a description of a connection: the set of parameters needed to establish the actual connection. Although the supported databases are all relational, each one behaves slightly differently when connecting.

        Kettle provides some default connection parameters and values. For the detailed parameter list of each database, please refer to the database JDBC driver manual.

        Finally, you can also enable the "connection pooling" option (based on Apache Commons database connection pooling) and the "clustering" option. These settings are necessary if you need to run many transformations or jobs in a cluster.

        Because relational databases have much in common, the following sections use MySQL as the example and introduce parameterized queries with it.
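
To make the idea of a connection as "a description made of parameters" concrete, here is a minimal sketch that assembles a MySQL JDBC URL from such a description. The host, port, database name, and option values are hypothetical; the helper `build_mysql_jdbc_url` is an illustration only and is not part of Kettle.

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(host, port, database, options=None):
    """Assemble a MySQL JDBC URL from connection parameters."""
    url = f"jdbc:mysql://{host}:{port}/{database}"
    if options:
        # Extra driver properties are appended as a query string.
        url += "?" + urlencode(options)
    return url


# Hypothetical connection description (all values are assumptions).
connection = {
    "host": "localhost",
    "port": 3306,
    "database": "school",
    "options": {"useSSL": "false", "characterEncoding": "UTF-8"},
}

jdbc_url = build_mysql_jdbc_url(**connection)
print(jdbc_url)
```

Other databases use a different URL layout and different driver properties, which is why each database's JDBC driver manual remains the reference for its parameter list.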

Experimental steps:

  1. Transformation input and output requirements

(1) Read the student table and output the rows whose height is greater than or equal to 185 and whose grade is greater than or equal to 85. Store the output in the StuOut table.
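
The same requirement can be sketched outside Kettle as a parameterized query. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the column names (`name`, `height`, `grade`) and the sample rows are invented for illustration, since the original table layout is not shown.

```python
import sqlite3


def filter_students(conn, min_height, min_grade):
    """Copy qualifying student rows into StuOut using a parameterized query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS StuOut "
        "(name TEXT, height INTEGER, grade INTEGER)"
    )
    # The ? placeholders are bound at execution time, so the thresholds
    # are passed as parameters instead of being pasted into the SQL text.
    conn.execute(
        "INSERT INTO StuOut (name, height, grade) "
        "SELECT name, height, grade FROM student "
        "WHERE height >= ? AND grade >= ?",
        (min_height, min_grade),
    )
    conn.commit()
    return conn.execute(
        "SELECT name, height, grade FROM StuOut ORDER BY name"
    ).fetchall()


# In-memory database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, height INTEGER, grade INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ann", 190, 90), ("Bob", 185, 85), ("Cid", 184, 95), ("Dee", 192, 70)],
)

rows = filter_students(conn, 185, 85)
print(rows)
```

Only the rows that satisfy both conditions end up in StuOut; a row that passes one threshold but fails the other is left out.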

(2) Transformation design diagram

(3) Step configuration

 

 

(4) Transformation result

 

 

Part 2 WEB-based data import and export

Purpose:

When the "HTTP Client" step accesses an HTML page directly, the data it returns is HTML (Hypertext Markup Language). HTML and XML share a similar tag-based syntax but otherwise have little in common. HTML is meant to be rendered by a browser for people to read, so it defines a fixed set of elements and attributes that describe the structure or style of text.

Experimental steps:

HTML import and export case

(1) Transformation input and output requirements

Read the web page http://www.biqukan.com/1_1094/5403177.html and write its HTML source to the file E:\Textbook Cases\Chapter 3\webout.html
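
As a rough equivalent of this transformation outside Kettle, the sketch below downloads a page with Python's standard library and writes the HTML source to a file. The function names `fetch_html` and `save_page` are hypothetical, and falling back to UTF-8 when the server declares no charset is an assumption.

```python
from pathlib import Path
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Send an HTTP GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset from the Content-Type header, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_page(url, out_path, fetch=fetch_html):
    """Fetch a page and write its HTML source to out_path.

    The fetch argument can be swapped out, which makes it possible to
    try the function without network access.
    """
    html = fetch(url)
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return len(html)
```

Calling `save_page` with the URL and output path from the requirement above would mirror the HTTP Client step followed by a text file output step, provided the page is still reachable.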

(2) Transformation design diagram

(3) Step configuration

 

 

(4) Transformation result

 

 

 

Part 3 Import and export cases based on HTTP GET requests

(1) Transformation input and output requirements

https://api.douban.com/v2/movie/in_theaters is an API endpoint provided by Douban Movies that returns the movies currently in theaters in JSON format. For the meaning of each part of the JSON, refer to https://developers.douban.com/wiki/?title=movie_v2#top250.

Send an HTTP GET request to this address to get the currently popular movies, and store each movie's name, category, score, and starring cast.
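
The extraction part of this case can be sketched as follows. The field names (`subjects`, `title`, `genres`, `rating.average`, `casts[].name`) are assumptions about the response layout, and the sample payload is invented; check both against the API documentation linked above before relying on them.

```python
import json


def extract_movies(payload):
    """Pull name, category, score, and cast out of a JSON response string."""
    data = json.loads(payload)
    movies = []
    for subject in data.get("subjects", []):
        movies.append({
            "title": subject.get("title"),
            "genres": ", ".join(subject.get("genres", [])),
            "rating": subject.get("rating", {}).get("average"),
            "casts": ", ".join(
                cast.get("name", "") for cast in subject.get("casts", [])
            ),
        })
    return movies


# Invented sample payload that follows the assumed layout.
sample = json.dumps({
    "subjects": [
        {
            "title": "Example Movie",
            "genres": ["Drama", "Comedy"],
            "rating": {"average": 7.5},
            "casts": [{"name": "Actor One"}, {"name": "Actor Two"}],
        }
    ]
})

movies = extract_movies(sample)
print(movies)
```

Flattening the nested lists into comma-separated strings gives one row per movie, which is the shape needed to store the result in a table or a flat file.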

(2) Transformation design diagram

 

(3) Step configuration

 

 

 

(4) Transformation result

 

Experiment summary:

        This experiment practiced data import and export based on databases and the web. The database-based transformation was already covered in the first Kettle experiment: first import the relevant data into the database, then use Spoon to connect to the database and run the specified transformation. There were two web-based experiments. In the first, the HTTP Client step accesses an HTML page directly, calling the URL to obtain data from the web. In the second, an HTTP GET request returns structured data; such responses are mainly XML or JSON, and the API used in this case returns JSON. Further processing of the returned data yields the data we need.

 

 

 


Origin blog.csdn.net/weixin_56264090/article/details/128827526