Using the "String operations" step in Kettle

  An ETL process can rarely avoid string handling. The "String operations" step in Kettle makes many common string tasks very convenient. Let's first go through the options it offers.

Trim type : removes blank characters (such as spaces and tabs) from the string. You can remove them from the beginning of the string (left), from the end of the string (right), or from both the beginning and the end (both).
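The three trim choices can be sketched in Python as follows. This is only an illustration of the idea, not Kettle's own code: the function name `trim` is hypothetical, the "none" branch is an assumed no-op option, and Python's notion of whitespace may not match exactly what Kettle treats as blank.

```python
def trim(value: str, trim_type: str = "both") -> str:
    """Illustrative stand-in for the Trim type option (not Kettle's code)."""
    if trim_type == "left":
        return value.lstrip()   # drop blanks at the beginning only
    if trim_type == "right":
        return value.rstrip()   # drop blanks at the end only
    if trim_type == "both":
        return value.strip()    # drop blanks at both ends
    if trim_type == "none":     # assumed no-op option
        return value
    raise ValueError(f"unknown trim type: {trim_type!r}")


sample = " \tMAN  "
print(repr(trim(sample, "left")))   # 'MAN  '
print(repr(trim(sample, "right")))  # ' \tMAN'
print(repr(trim(sample, "both")))   # 'MAN'
```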

Lower/Upper : simple case conversion. It only affects English letters; Chinese characters and digits are left unchanged.

Padding : pads the string with extra characters, either at the head (left) or at the end (right). It has to be used together with "Pad char" and "Pad length".

Pad char : enter the character or string you want to pad with in the input box.

Pad length : the target length of the padded string. Suppose the original string is abc (length 3), the pad string is wq, and Padding is set to right. If the pad length is 3, the result is unchanged; if it is 4, the result is abcw; if it is 5, the result is abcwq; if it is 6, the result is abcwqw. The same rules apply whether "Padding" is set to right or left.
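The padding rule described above can be reproduced with a short Python sketch. The function `pad` and its parameter names are hypothetical, and the code only mirrors the behaviour described in this article rather than Kettle's source. In particular, how a multi-character pad string is laid out when padding on the left is an assumption here and is worth checking in Spoon.

```python
from itertools import cycle, islice


def pad(value: str, pad_text: str, pad_length: int, side: str = "right") -> str:
    """Pad `value` up to `pad_length` by repeating `pad_text` (illustrative only)."""
    missing = pad_length - len(value)
    if missing <= 0 or not pad_text:
        return value  # already long enough, or nothing to pad with
    # Repeat the pad text and cut it off at exactly the number of missing characters.
    filler = "".join(islice(cycle(pad_text), missing))
    if side == "right":
        return value + filler
    if side == "left":
        return filler + value  # assumed layout for left padding
    raise ValueError(f"unknown padding side: {side!r}")


for length in (3, 4, 5, 6):
    print(length, pad("abc", "wq", length))
# 3 abc
# 4 abcw
# 5 abcwq
# 6 abcwqw
```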

InitCap : capitalizes the first letter of the string and makes the remaining letters lowercase. For example, if the string is aBC and InitCap is set to "Yes", the result is "Abc".
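As a rough illustration of the case options (Lower/Upper and InitCap), the Python sketch below uses the built-in string methods. The function names are hypothetical, and the sketch follows the single-word description given here; how Kettle treats strings with several words is not covered.

```python
def change_case(value: str, mode: str) -> str:
    """Illustrative Lower/Upper conversion; digits and Chinese characters pass through."""
    if mode == "lower":
        return value.lower()
    if mode == "upper":
        return value.upper()
    raise ValueError(f"unknown mode: {mode!r}")


def init_cap(value: str) -> str:
    """First letter upper case, remaining letters lower case (single-word case)."""
    return value.capitalize()


print(change_case("abc123中文", "upper"))  # ABC123中文
print(change_case("ABC123中文", "lower"))  # abc123中文
print(init_cap("aBC"))                     # Abc
```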

Escape : rarely used. For example, if you select "Use CDATA", the result is the string wrapped in CDATA format, "<![CDATA[ test_str ]]>", where test_str is the original string. The other options are not covered one by one here; if you are interested, try them out and they will become clear.

Digits : controls how digit characters are handled. There are three options: "none", "only" and "remove". none performs no operation; only keeps just the digits and discards all other characters; remove deletes the digits and keeps all other characters.
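The three Digits options can be sketched like this in Python. The function name `digits` is hypothetical, and treating only the ASCII characters 0-9 as digits is an assumption of this sketch.

```python
ASCII_DIGITS = "0123456789"  # assumption: only 0-9 count as digits


def digits(value: str, mode: str = "none") -> str:
    """Illustrative stand-in for the Digits option (not Kettle's code)."""
    if mode == "none":
        return value  # no operation
    if mode == "only":
        return "".join(ch for ch in value if ch in ASCII_DIGITS)      # keep digits
    if mode == "remove":
        return "".join(ch for ch in value if ch not in ASCII_DIGITS)  # drop digits
    raise ValueError(f"unknown mode: {mode!r}")


print(digits("a1b22c", "none"))    # a1b22c
print(digits("a1b22c", "only"))    # 122
print(digits("a1b22c", "remove"))  # abc
```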

Remove Special character : removes special characters. You can choose which characters to delete according to your needs, such as spaces or line feeds. The options are not listed one by one here; if any of them is unclear, look it up with a translator such as Baidu Translate.

  That covers the function of each option, which should be enough to start using the step. The following briefly shows how to use it.

1. Core Objects -> Transform -> String operations: drag the step onto the transformation canvas.
2. Double-click "String operations" and select or enter the relevant parameters. The purpose of each option was explained above; here only a few of them are given a simple configuration.
3. The original data in the database table is shown in Figure 1, where the dotted line marks a blank character (space); the result data is shown in Figure 2.

Figure 1: original data in the database table
Figure 2: result data
As the result data shows, the strings were processed according to the configured parameters. The blank characters at the head have been removed, and the pad character 'Q' has been added according to the pad length: MAN has a length of 3 and the pad length is 5, so two 'Q' characters were added, while WOMEN already has a length of 5, so it is unchanged. The result is therefore correct.
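The MAN / WOMEN result can be replayed with a few lines of Python. This is a sketch under several assumptions: the input rows are hypothetical stand-ins for the table data, the trim type is taken to be "both", the padding side is taken to be right, and trimming is applied before padding, which is inferred from MAN receiving exactly two 'Q' characters.

```python
def string_operation(value: str, pad_char: str = "Q", pad_length: int = 5) -> str:
    """Trim, then pad on the right with a single pad character (assumed settings)."""
    result = value.strip()               # Trim type: both (assumed)
    missing = pad_length - len(result)
    if missing > 0:
        result += pad_char * missing     # Padding: right (assumed)
    return result


rows = ["  MAN", "WOMEN"]  # hypothetical input rows
for row in rows:
    print(repr(row), "->", repr(string_operation(row)))
# '  MAN' -> 'MANQQ'
# 'WOMEN' -> 'WOMEN'
```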

Origin blog.csdn.net/AnameJL/article/details/115215302