Kettle study notes (10) - data inspection, statistics, partitioning and JS scripts

I. Overview

  Data analysis and data verification:

    Used to inspect and clean data.

  Statistical steps:

    Provide data sampling and statistics functions.

  Partitioning:

    Rows are split into multiple data blocks according to the value of a field, and each block is output to a different database table or file.

  Scripts:

    JavaScript basics.

II. Data analysis and data verification

  1. Data Analysis

    Analyzing the data type, length, value range, etc. of the source data is the first step of ETL.

    In Kettle, data analysis is done with DataCleaner.

    First install the plugin from Tools > Marketplace, then restart Kettle: https://wiki.pentaho.com/pages/viewpage.action?pageId=23533803
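
To make clear what this kind of analysis reports, here is a minimal sketch in plain Python. It does not use DataCleaner or any Kettle API; the helper names `infer_type` and `profile_column` and the sample column are assumptions made up for illustration. It summarises the data types, lengths and value range of one column of raw string values.

```python
from collections import Counter


def infer_type(text):
    """Guess the narrowest type a raw string value fits (hypothetical helper)."""
    for caster, name in ((int, "integer"), (float, "number")):
        try:
            caster(text)
            return name
        except ValueError:
            continue
    return "string"


def profile_column(values):
    """Summarise type, length and value range for one column of raw strings."""
    # Treat None and blank strings as empty values.
    present = [v for v in values if v is not None and v.strip() != ""]
    types = Counter(infer_type(v) for v in present)
    lengths = [len(v) for v in present]
    # The value range is computed only over values that parse as numbers.
    numbers = [float(v) for v in present if infer_type(v) != "string"]
    return {
        "rows": len(values),
        "empty": len(values) - len(present),
        "types": dict(types),
        "min_length": min(lengths, default=None),
        "max_length": max(lengths, default=None),
        "min_value": min(numbers, default=None),
        "max_value": max(numbers, default=None),
    }


# Made-up sample column: one empty value and one non-numeric value.
ages = ["34", "27", "", "41", "n/a"]
print(profile_column(ages))
```

A mixed result such as three integers plus one string in an age column is the sort of finding that suggests a cleaning or verification rule is needed later in the transformation.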

  2. Data verification

    Examples are available in the samples that ship with Kettle. The verification can be configured with error codes, checks against a dictionary table, and so on.

    

    A simple test is as follows:

    

 

    A Switch / Case step can follow the verification, so that later processing branches on the error code.
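
The idea of verifying rows, attaching an error code, and then branching on that code can be sketched in plain Python. This is only an illustration of the logic, not the Kettle step itself: the error codes `E001` to `E003`, the field names, and the functions `validate_row` and `route` are all hypothetical.

```python
def validate_row(row, valid_countries):
    """Return the error codes raised by one row; an empty list means it passed."""
    errors = []
    if not row.get("name"):
        errors.append("E001")  # hypothetical code: required field is missing
    age = row.get("age")
    if not isinstance(age, int) or not 0 <= age <= 150:
        errors.append("E002")  # hypothetical code: age is not an int in 0..150
    if row.get("country") not in valid_countries:
        errors.append("E003")  # hypothetical code: value not in the dictionary table
    return errors


def route(rows, valid_countries):
    """Split rows into clean rows and rejected rows grouped by first error code."""
    clean, rejected = [], {}
    for row in rows:
        errors = validate_row(row, valid_countries)
        if not errors:
            clean.append(row)
        else:
            # Branch on the error code, in the spirit of a Switch / Case step.
            rejected.setdefault(errors[0], []).append(row)
    return clean, rejected


# Made-up sample rows and dictionary values.
rows = [
    {"name": "Li", "age": 30, "country": "CN"},
    {"name": "", "age": 30, "country": "CN"},
    {"name": "Bo", "age": 200, "country": "US"},
    {"name": "Cy", "age": 25, "country": "ZZ"},
]
clean, rejected = route(rows, {"CN", "US"})
print(len(clean), sorted(rejected))
```

Grouping by the first error code keeps each rejected row in exactly one branch; a real transformation might instead keep every code so that all problems in a row are reported.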
