[Experimental purpose]
1. Use Insight's "Excel Input" and other components to process an Excel data source with complex headers.
2. Become familiar with the "Excel Input" component and how to configure it.
[Experimental principle]
Use "Excel Input" to select the fields that will finally be output to the next step, "Add Stream" (in the experiment, the output fields of every data source must be consistent after field selection). Then use "Add Stream" to set the merging order of the 2 data sources and to merge them.
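The principle above can be sketched in plain Python, outside Kettle. This is only an illustration of the idea (select fields, check that the layouts match, then append in a fixed order), not how the "Add Stream" step is implemented. The function names and the sample rows are hypothetical.

```python
def select_fields(rows, fields):
    """Keep only the chosen fields, in the given order, for every row."""
    return [{name: row[name] for name in fields} for row in rows]


def append_streams(head, tail):
    """Return the rows of `head` followed by the rows of `tail`.

    Raises ValueError when the two streams do not share the same ordered
    field names, mirroring the consistency requirement in the text.
    """
    if head and tail and list(head[0].keys()) != list(tail[0].keys()):
        raise ValueError("output fields of the two data sources differ")
    return head + tail


# Hypothetical sample data: the two sources carry different extra columns.
source_a = [{"course": "Math", "teacher": "Li", "room": "A101"}]
source_b = [{"course": "Physics", "teacher": "Wang", "credits": 3}]

fields = ["course", "teacher"]
merged = append_streams(select_fields(source_a, fields),
                        select_fields(source_b, fields))
print(merged)
```

Swapping the two arguments of `append_streams` changes the merging order, which is the role the order setting plays in the experiment.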
[Experimental environment]
Operating system: Windows 10
Kettle version: 7.1.0.0
JDK version: 1.8.0 or later
[Experimental steps]
One, create a transformation
Double-click spoon.bat to open Kettle.
(1) Click the New button, then click to select from the drop-down menu.
(2) Then click the button, rename the transformation file, and save it in a specified path.
(3) Select the input step, the output step, and the hop.
Two, configure each component
1. Configuration of "Excel Input":
Step1: Double-click the "Excel Input" component, configure the spreadsheet type on the "File" tab and
Step2: Configure the "Worksheet" tab: select Sheet1 of the input file and define the start row and start column
Step3: Configure the "Field" tab to obtain the header field names (press Del to delete redundant fields)
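How the start row and start column decide which cells become field names can be shown with a small sketch that does not depend on Kettle. The grid below is hypothetical sample data standing in for a sheet with a title row above the real header, and the 0-based indexing is an assumption made for this sketch.

```python
def read_sheet(grid, start_row, start_col):
    """Read a grid (list of rows) beginning at a 0-based start cell.

    The first row read supplies the field names; every later row becomes
    one record. Returns (field_names, records).
    """
    body = [row[start_col:] for row in grid[start_row:]]
    if not body:
        return [], []
    names = [str(cell).strip() for cell in body[0]]
    records = [dict(zip(names, row)) for row in body[1:]]
    return names, records


# Hypothetical sheet: row 0 is a merged title cell, row 1 is the header.
sheet = [
    ["Course Information Sheet", None, None],
    ["Course", "Teacher", "Credits"],
    ["Math", "Li", 4],
    ["Physics", "Wang", 3],
]

names, records = read_sheet(sheet, start_row=1, start_col=0)
print(names)
print(records)
```

Starting at row 1 skips the title row, so the header cells rather than the title text are used as field names.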
2. Configuration of "Microsoft Excel Output":
Step1: Double-click the "Microsoft Excel Output" component and configure "File & Worksheet"
Step2: Configure the "Content" tab to obtain the field names and formats
Three, run the transformation
Click the button to run the transformation; the result is as follows:
Four, experimental results
Input file "Course Information Sheet.xlsx":
Output file "test2.xls":
Five, difficulties encountered during the experiment
(1) The "Excel Input" step was configured carelessly: the input file type was not selected, so the step had no available input stream and an error was reported.
(2) When the "Worksheet" tab of the "Excel Input" step was configured, an incorrect start row was selected, so the final result was not displayed as expected. The incorrect "Excel Input" configuration is as follows:
This made the field names of the output wrong, as shown in the figure below:
The corrected start row is as follows, and the expected result was finally obtained:
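The start-row mistake in (2) can be caught before running the transformation by checking which row actually holds the expected field names. The sketch below is a standalone illustration with hypothetical sample data and a hypothetical helper name; it is not part of Kettle.

```python
def find_header_row(grid, expected_fields):
    """Return the 0-based index of the first row whose non-empty cells
    equal `expected_fields`, or None when no row matches."""
    wanted = list(expected_fields)
    for index, row in enumerate(grid):
        cells = [str(cell).strip() for cell in row if cell not in (None, "")]
        if cells == wanted:
            return index
    return None


# Hypothetical sheet with a title row above the real header row.
sheet = [
    ["Course Information Sheet", None, None],
    ["Course", "Teacher", "Credits"],
    ["Math", "Li", 4],
]

start_row = find_header_row(sheet, ["Course", "Teacher", "Credits"])
print(start_row)
```

If the start row were left at 0 in this sample, the title text would be read as a field name, which is the kind of wrong output described above.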
Six, experiment summary
The main goal of this experiment was to master the "Excel Input" component and to process and output an Excel data source with complex headers.
The steps required are not difficult, but some details need attention. For example, when processing a complex data source, you must identify the input file type of the data source and work out the start row and start column of the data to be read, in order to avoid unnecessary errors.
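As a reminder of the first detail, the file type can be derived from the file extension before configuring the step. The sketch below is a hypothetical helper. The engine names "JXL", "POI" and "ODS" are shorthand assumptions for the spreadsheet type options, so check the exact labels in the step dialog.

```python
from pathlib import Path

# Assumed mapping from file extension to a spreadsheet engine name.
ENGINE_BY_SUFFIX = {".xls": "JXL", ".xlsx": "POI", ".ods": "ODS"}


def guess_engine(filename):
    """Return the assumed engine name for a spreadsheet file name."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"unsupported spreadsheet type: {suffix or 'no extension'}")
    return ENGINE_BY_SUFFIX[suffix]


print(guess_engine("Course Information Sheet.xlsx"))
print(guess_engine("test2.xls"))
```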