Processing Excel data sources with complex headers in Kettle

[Experimental purpose]
1. Use Insight's "Excel Input" and other components to process an Excel data source that has a complex header.
2. Become familiar with how the "Excel Input" component is used.

[Experimental principle]
In each "Excel Input" step, select the fields that should finally be output and pass them to the next step, "Add Stream" (in the experiment, make sure that the output fields of every data source are consistent after field selection). "Add Stream" then sets the merge order of the 2 data sources and merges them into a single stream.
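The principle above can be sketched in a few lines of Python. This is only an illustration of "select the same fields from every source, then append one stream after the other", not how Kettle implements the step. The function names `select_fields` and `append_streams` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def select_fields(rows: Iterable[Row], fields: List[str]) -> Iterator[Row]:
    """Keep only the chosen fields, in a fixed order."""
    for row in rows:
        yield {name: row[name] for name in fields}


def append_streams(head: Iterable[Row], tail: Iterable[Row]) -> Iterator[Row]:
    """Emit every row of `head`, then every row of `tail`.

    Raises ValueError when a row does not have the same field layout
    as the first row that was seen.
    """
    layout = None
    for stream in (head, tail):
        for row in stream:
            keys = list(row.keys())
            if layout is None:
                layout = keys
            elif keys != layout:
                raise ValueError(f"field layout mismatch: {keys} != {layout}")
            yield row


# Hypothetical sample data, not taken from the experiment's files.
source_a = [{"course": "Math", "teacher": "Li", "room": "A101"}]
source_b = [{"course": "Physics", "teacher": "Wang", "credits": 3}]

fields = ["course", "teacher"]
merged = list(append_streams(select_fields(source_a, fields),
                             select_fields(source_b, fields)))
print(merged)
```

The order of the two arguments plays the role of the merge order: all rows of the first source come out before any row of the second.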

[Experimental environment]
Operating system: Windows 10
Kettle version: 7.1.0.0
JDK version: 1.8.0 or later

[Experimental steps]

One, create a transformation

Double-click spoon.bat to open Kettle.
(1) Click the New button and choose to create a transformation in the drop-down menu.
(2) Then rename the transformation file and save it to a specified path.
(3) Add the input step, the output step, and the hop that connects them.

Two, configure each component

1. Configuration of "Excel Input":
Step 1: Double-click the "Excel Input" component. On the "File" tab, configure the spreadsheet type and select the input file.

Step 2: On the "Worksheet" tab, select Sheet1 of the input file and define the start row and start column.

Step 3: On the "Field" tab, get the field names from the header row (press Del to delete redundant fields).
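What the start row and start column do with a complex header can be shown with a small Python sketch. It models the worksheet as a plain list of rows and assumes that the start row and start column are counted from 0; check this against the "Worksheet" tab of your Kettle version. The function `read_sheet` and the sample sheet are hypothetical and are not the contents of "Course Information Sheet.xlsx".

```python
from typing import Dict, List, Optional

Cell = Optional[str]


def read_sheet(grid: List[List[Cell]], start_row: int, start_col: int) -> List[Dict[str, Cell]]:
    """Use grid[start_row], from start_col on, as field names; rows below it are data."""
    header: List[str] = []
    for cell in grid[start_row][start_col:]:
        if cell is None:
            break  # the header ends at the first empty cell
        header.append(str(cell))

    rows = []
    for raw in grid[start_row + 1:]:
        values = raw[start_col:start_col + len(header)]
        if all(value is None for value in values):
            continue  # skip rows that are empty in the selected columns
        rows.append(dict(zip(header, values)))
    return rows


# Hypothetical sheet with a title row and a group row above the real header.
sheet = [
    ["Course Information Sheet", None, None, None],  # row 0: merged title
    [None, "Basic info", None, "Schedule"],          # row 1: group header
    [None, "Course", "Teacher", "Weekday"],          # row 2: real field names
    [None, "Math", "Li", "Mon"],
    [None, "Physics", "Wang", "Tue"],
]

rows = read_sheet(sheet, start_row=2, start_col=1)
print(rows)

# A start row that is one too small turns the group row into the header.
wrong = read_sheet(sheet, start_row=1, start_col=1)
print(wrong)
```

With the correct start row the field names are Course, Teacher and Weekday. With the wrong one, "Basic info" becomes the only field name and the real header row is read as data.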

2. Configuration of "Microsoft Excel Output":
Step 1: Double-click the "Microsoft Excel Output" component and configure the "File & Worksheet" tab.

Step 2: On the "Content" tab, get the field names and formats.
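The idea of an output step that holds a list of field names, each with its own format, can be sketched as follows. To stay within the Python standard library, the sketch writes CSV text to a buffer instead of a real .xls file, and it uses Python format strings rather than Kettle's format masks. The function `write_rows`, the field list, and the rows are all hypothetical.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

# Hypothetical field list: (field name, Python format string).
fields: List[Tuple[str, str]] = [("Course", "{}"), ("Credits", "{:.1f}")]


def write_rows(rows: List[Dict[str, object]], fields: List[Tuple[str, str]], out: TextIO) -> None:
    """Write a header line with the field names, then one formatted line per row."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _ in fields])
    for row in rows:
        writer.writerow([fmt.format(row[name]) for name, fmt in fields])


# Hypothetical rows arriving from the input step.
rows = [{"Course": "Math", "Credits": 3}, {"Course": "Physics", "Credits": 2.5}]

buffer = io.StringIO()
write_rows(rows, fields, buffer)
print(buffer.getvalue())
```

Only the fields named in the list are written, and in that order, so a field that is missing from the list never reaches the output file.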

Three, run the transformation

Click the run button to execute the transformation, then check the execution results.

Four, experimental results

Input file: "Course Information Sheet.xlsx"

Output file: "test2.xls"

Five, difficulties encountered during the experiment

(1) The "Excel Input" step was configured carelessly: the input file type was not selected, so the step had no usable input stream and an error was reported.

(2) In the "Excel Input" step, the wrong start row was selected when the "Worksheet" tab was configured, so the final result was not displayed as expected and the field names in the output were wrong. After the start row was corrected, the expected result was obtained.

Six, experiment summary

This experiment is mainly about mastering the "Excel Input" component and using it to process and output an Excel data source with a complex header.
The steps required are not difficult, but some details need attention. For example, when processing a complex data source, you must identify the input file type of the data source and work out the start row and start column of the data to be read, to avoid unnecessary errors.


Origin blog.csdn.net/weixin_44727274/article/details/113102049