Data import and preprocessing - Experiment 1: Data import and export

1. Experimental content

Purpose: Master the method of data import and export using Kettle
Main equipment: computer, Kettle (PDI), MySQL database

To optimize its operations and management, a supermarket chain plans to build a business intelligence system that helps the management team understand the state of the business through data, more comprehensively and professionally. The company currently has an order database that records the details of each order, with the following fields:

[Field names: order number, order date, point of sale, payment method, delivery date, logistics time limit, customer number, customer name, customer type, customer city, customer province, customer region, product number, product name, product category, product category, amount, quantity, discount, profit, salesman, return or not, financial year]

However, each department cares about different data, and each needs a different data format for its analysis tools. The company has therefore asked you to develop a data transformation system that converts and outputs the data in the order database according to each department's needs. The specific requirements are as follows:

| Department | Output fields | Data format | Naming rule | Special requirements |
| --- | --- | --- | --- | --- |
| Sales department | Order number; point of sale; payment method; amount; quantity; discount; profit; salesman | .xlsx file | File name: name in pinyin | Worksheet name: Chinese name |
| Logistics department | Order number; order date; delivery date; logistics time limit | .json file | File name: name in pinyin | All data in one file |
| Customer relations department | Order number; customer number; customer name; customer type; customer city; customer province; customer region | .csv file | File name: name in pinyin | Comma as delimiter |
| Warehousing department | Order number; product number; product name; product category; product category | .xml file | File name: name in pinyin | 'Order number' as the node attribute, other fields as the node content |
| After-sales department | Order number; product number; customer number; return or not; amount | Database table | Database table name: return | Export only the rows where [return or not=1]; create the database table yourself |

Output path: the output files go in the directory that contains the transformation file.

Task 1: Build the Kettle project development environment
(1) Create the kettledb database in MySQL, with the orders table structure and data.
(2) Create the kettledb database connection in the KTR; the connection parameters must use variables (named parameters).
Task 2: Sales data import and export
(1) A full picture of the transformation step design is required.
(2) Screenshots of the key configuration items of the main transformation steps are required.
(3) The SQL statement for the newly created database table, or a screenshot of the operation interface, is required.
(4) A screenshot of the transformation execution result is required.
(5) A screenshot of the output file/database table content is required.

2. Task 1 Answer

1. Create the kettledb database and the orders table structure and data in MySQL

Use Navicat to connect to the database:

SQL statements for the orders table: see Baidu Netdisk. Link: [https://pan.baidu.com/s/1NbiWzWdm0EfCHBsLn3ucFA]
Extraction code: 12ws

[screenshot]

2. Create a transformation (KTR) in Kettle Spoon and name it after yourself

Use Spoon to create the database connection:
[screenshot]

3. Task 2 Answer

1. Full transformation design

[screenshot]

2. Sales department data table

Table input part, with the input SQL filter statement:
[screenshot]
Table output part:
[screenshot]
Output result:
[screenshot]

3. Logistics department data table

Table input part:
[screenshot]
Table output part:
[screenshot]
Output result:
[screenshot]
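To make the "all data in one file" requirement concrete, here is a minimal Python sketch of the target shape. It is not the Kettle step itself. The English field keys, the sample values, and the helper name `rows_to_single_json` are assumptions for illustration; the real keys are the column names of the orders table.

```python
import json

def rows_to_single_json(rows):
    """Serialize every row into one JSON document (a single array)."""
    # ensure_ascii=False keeps Chinese text readable instead of \u escapes.
    return json.dumps(rows, ensure_ascii=False, indent=2)

# Hypothetical sample rows; the values are made up for illustration.
sample_rows = [
    {"order_number": "A001", "order_date": "2022-01-03",
     "delivery_date": "2022-01-05", "logistics_time_limit": 2},
    {"order_number": "A002", "order_date": "2022-01-04",
     "delivery_date": "2022-01-08", "logistics_time_limit": 4},
]

json_text = rows_to_single_json(sample_rows)
print(json_text)
```

Comparing the file produced by the transformation against a shape like this is a quick way to confirm that the rows were not split across several files.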

4. Customer relations department data table

Table input part:
[screenshot]
Table output part:
[screenshot]
Content and field settings:
[screenshot]
[screenshot]
Output result:
[screenshot]
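The comma-delimiter requirement can be illustrated with a short Python sketch of the expected file content. This is a hedged illustration, not the Kettle configuration: the English column names, the sample rows, and the helper `rows_to_csv` are hypothetical, and only four of the seven customer fields are shown to keep it short.

```python
import csv
import io

def rows_to_csv(rows, fieldnames):
    """Write rows as comma-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=",", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical column names and values, for illustration only.
fields = ["order_number", "customer_number", "customer_name", "customer_city"]
sample_rows = [
    {"order_number": "A001", "customer_number": "C01",
     "customer_name": "Li, Ming", "customer_city": "Hangzhou"},
    {"order_number": "A002", "customer_number": "C02",
     "customer_name": "Wang Fang", "customer_city": "Nanjing"},
]

csv_text = rows_to_csv(sample_rows, fields)
print(csv_text)
```

A value that itself contains a comma, such as the first customer name here, has to be enclosed in quotes, so it is worth checking the enclosure setting as well as the separator.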

5. Warehousing department data table

Table input part:
[screenshot]
Table output part:
[screenshot]
Content and field settings:
[screenshot]
[screenshot]
Output result:
[screenshot]
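The requirement "order number as the node attribute, other fields as the node content" describes a particular XML shape. The Python sketch below builds that shape so the Kettle output can be compared against it. The element names `Rows` and `Row`, the English field names, the sample values, and the helper `rows_to_xml` are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(rows, key_field="order_number"):
    """Build one <Row> per record: key_field as attribute, the rest as children."""
    root = ET.Element("Rows")
    for row in rows:
        # The key field becomes an attribute of the row node.
        node = ET.SubElement(root, "Row", {key_field: str(row[key_field])})
        for name, value in row.items():
            if name == key_field:
                continue
            # Every other field becomes a child element holding the value.
            child = ET.SubElement(node, name)
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"order_number": "A001", "product_number": "P01", "product_name": "Desk"},
    {"order_number": "A002", "product_number": "P02", "product_name": "Lamp"},
]

xml_text = rows_to_xml(sample_rows)
print(xml_text)
```

In the result, the order number appears only inside the opening tag of each row node, never as a child element.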

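6. After-sales department data table

Create the return table in Navicat. The table is named returnb rather than return; RETURN is a reserved word in MySQL and would need backtick quoting as a table name: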

CREATE TABLE returnb (
  `订单编号` char(14),    -- order number
  `客户编号` varchar(20), -- customer number
  `产品编号` char(14),    -- product number
  `金额` float,           -- amount
  `是否退货` tinyint(1)   -- return or not (1 = returned)
);

[screenshot]
Loading data into the return table:
Input part:
[screenshot]
Output part:
[screenshot]
Output result:

[screenshot]
The output file is as follows:
[screenshot]
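The filter-and-load logic for the return table can be tried out without MySQL. The following is a minimal sketch using Python's built-in sqlite3 module, under stated assumptions: the orders table here is a reduced stand-in with only the five columns used by returnb, the column types are SQLite equivalents, and the sample rows are made up. The SELECT statement is the kind of query a Table input step could use, not necessarily the one in the original screenshots.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced stand-in for the orders table (the real table has more columns).
conn.execute(
    "CREATE TABLE orders ("
    "`订单编号` TEXT, `客户编号` TEXT, `产品编号` TEXT, "
    "`金额` REAL, `是否退货` INTEGER)"
)
# Target table with the same columns as returnb.
conn.execute(
    "CREATE TABLE returnb ("
    "`订单编号` TEXT, `客户编号` TEXT, `产品编号` TEXT, "
    "`金额` REAL, `是否退货` INTEGER)"
)

# Hypothetical sample orders; only two of them are returns.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("A001", "C01", "P01", 120.5, 1),
        ("A002", "C02", "P02", 80.0, 0),
        ("A003", "C01", "P03", 45.0, 1),
    ],
)

# Keep only returned orders, selecting just the after-sales fields.
RETURN_QUERY = """
SELECT `订单编号`, `客户编号`, `产品编号`, `金额`, `是否退货`
FROM orders
WHERE `是否退货` = 1
"""

conn.execute("INSERT INTO returnb " + RETURN_QUERY)
returned = conn.execute(
    "SELECT * FROM returnb ORDER BY `订单编号`"
).fetchall()
print(returned)
```

Every row that reaches returnb should have the return flag set to 1, which is an easy check to repeat on the real table after the transformation has run.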

Origin blog.csdn.net/hexiaosi_/article/details/127272483