Summary of Pig usage problems

1. If a map lookup such as a::tags#'pic' is passed as a parameter to another function (a Pig macro), the single quotes inside it must be escaped. If that function calls other functions in turn and passes the parameter down layer by layer, the quotes must be escaped once for each layer. This applies especially to map fields.

The function below passes the parameter through two layers, so the quotes are escaped twice:
C1 = two_use_cart_filter_by_clkloc_distinct_vid_and_ic(C,0,2,'vid','cvid','tags#\\\\'pic\\\\'','cpic') ;

A function that is called only once needs a single escape:
cx = get_distinct_data_by_field(cx,B::vid,'bvid','B::tags#\'pic\'','bpic') ;

2. Inside a Pig function, if the incoming parameter is a relation produced by a join, so that its fields carry alias prefixes, do not write A::tags#'et'. Refer to the field directly as tags#'et'.


3. In a function script, the path in a REGISTER statement must be enclosed in single quotes:
REGISTER '/home/lib/dhpig.jar';
REGISTER '/home/lib/event-log.jar';
In non-function scripts, the quotes are not needed.

4. If the script a.pig references functions from a function script function.pig, make sure that the names used in a.pig are not the same as any function name defined in function.pig.
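The naming rule in item 4 can be sketched as follows. This is only an illustration: the macro get_distinct, its parameters, the input path, and the aliases are hypothetical and do not come from the original scripts.

```pig
-- function.pig (hypothetical contents)
DEFINE get_distinct(A, field) RETURNS deduped {
    g = GROUP $A BY $field;
    $deduped = FOREACH g GENERATE group AS key_value;
};

-- a.pig
IMPORT 'function.pig';
raw = LOAD 'input' AS (vid:chararray);   -- placeholder path and schema

-- Avoid naming the receiving alias get_distinct, since that is a function name in function.pig.
distinct_vid = get_distinct(raw, 'vid');
```

Giving aliases in a.pig a prefix or suffix that the function names never use is one simple way to keep the two sets of names apart.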

5. If b.pig defines a function getx that returns the alias x, and a.pig receives the result with x2 = getx(), then x2 must not be the same as any field alias in the schema of a load statement inside that function in b.pig. For example, the names conflict if getx contains a statement like this:
bb = load 'xx' as (x2:chararray);
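A minimal sketch of this situation is shown below, assuming a getx that simply loads and returns one column. The body of getx and the alias x_result are assumptions made for illustration, and only the load line comes from the note above.

```pig
-- b.pig (hypothetical contents)
DEFINE getx() RETURNS x {
    bb = LOAD 'xx' AS (x2:chararray);
    $x = FOREACH bb GENERATE x2;
};

-- a.pig
IMPORT 'b.pig';

-- Avoid: x2 = getx();   (x2 is also a field alias in the LOAD schema inside getx)
x_result = getx();       -- a name that does not appear in the function's schema
```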
 

7. If a result set r obtained by joining a and b is passed to the next function, its fields must be referenced as r::a::xx.
For a field stored in a map (an extension field), use r::a::map#'field'.
 
 
8. On type conversion: if no schema is declared, a field defaults to the bytearray type. Before a join, union, or cross, make sure the data types of the fields involved (for example, the join keys) are consistent on both sides. Take this join:
ho = join $a by bvik left outer , $b by okey;
If the types of bvik and okey are inconsistent, the following exception occurs:
int errCode = 1075;
String msg = "Received a bytearray from the UDF. Cannot determine how to convert the bytearray to string.";
So make sure the types match before joining. In the following Pig statements, $11 and $3 are explicitly cast to chararray:
mz = foreach mf generate CONCAT((chararray)$11,(chararray)$3) as vidic , $4 as gno:chararray ;
mp = group mz by vidic;
$ord = foreach mp generate  group  as okey , BagToString($1.$1,'#') as rfxnos  ;
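The same idea applies to the other side of the join. The sketch below casts an untyped key before joining it with a chararray key. The input paths, the payload field, and the alias a_typed are placeholders for illustration, not part of the original scripts.

```pig
-- Placeholder inputs: the left side declares no types, so bvik is bytearray.
a = LOAD 'left_input' AS (bvik, payload);
b = LOAD 'right_input' AS (okey:chararray, rfxnos:chararray);

-- Cast the left key explicitly so both join keys are chararray.
a_typed = FOREACH a GENERATE (chararray)bvik AS bvik, payload;

ho = JOIN a_typed BY bvik LEFT OUTER, b BY okey;
```

Running DESCRIBE on both relations before the join is a quick way to confirm that the key types agree.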

