Sqoop custom string delimiter

Sqoop's --fields-terminated-by option lets you specify a custom delimiter, but it accepts only a single character. Our requirement is a multi-character delimiter, "<-T->", which is not supported out of the box.
Each time Sqoop runs an import or export job, it calls codegen to generate a Java file, then compiles it and packages it into a jar for MapReduce to use. This Java file wraps the accessors for the transferred records, so by analyzing and editing it we can specify a multi-character delimiter ourselves.
Generally, if --query is used to fetch data with a query statement, the generated files are QueryResult.java and QueryResult.jar. If --table is used, the files are named after the specified table. The Java files are generated in the directory from which the sqoop script is run.

1. Modify the QueryResult.java file


First, generate QueryResult.java in the /home/ncms/sp/sqoopjar2 directory:

sqoop codegen --connect jdbc:mysql://192.168.0.1:3306/admin --username admin --password admin --query "SELECT i.*,c.description FROM PUBLISH_INFO i  INNER JOIN PUBLISH_CONTENT  c
ON i.id=c.id  where \$CONDITIONS  " --bindir /home/ncms/sp/sqoopjar2 --class-name QueryResult --outdir /home/ncms/sp/sqoopjar2




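Then edit the toString method in the generated QueryResult.java so that the field delimiter is the custom string:

public String toString(DelimiterSet delimiters, boolean useRecordDelim) {
    StringBuilder __sb = new StringBuilder();
    // Hard-code the multi-character field delimiter here. The record (line) delimiter
    // is changed the same way: at the end of the method, use __sb.append("<-L->").
    String fieldDelim = "<-T->";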
    __sb.append(FieldFormatter.escapeAndEnclose(id==null?"null":"" + id, delimiters));
    __sb.append(fieldDelim);
    __sb.append(FieldFormatter.escapeAndEnclose(IS_STRUCTURED==null?"null":IS_STRUCTURED, delimiters));
    __sb.append(fieldDelim);
    __sb.append(FieldFormatter.escapeAndEnclose(is_deleted==null?"null":"" + is_deleted, delimiters));
    __sb.append(fieldDelim);
    __sb.append(FieldFormatter.escapeAndEnclose(last_modify==null?"null":"" + last_modify, delimiters));
    __sb.append(fieldDelim);
    __sb.append(FieldFormatter.escapeAndEnclose(table_name==null?"null":table_name, delimiters));
    __sb.append(fieldDelim);
    __sb.append(FieldFormatter.escapeAndEnclose(table_name2==null?"null":table_name2, delimiters));
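
The pattern behind this edit can be shown in isolation. The following is a minimal, self-contained sketch, not Sqoop's generated code: the class name MultiCharDelimiterSketch, the method formatRecord, and the sample values are hypothetical, and the call to FieldFormatter.escapeAndEnclose is left out so that the example runs without the Sqoop jars. It assumes the record delimiter is replaced with "<-L->" as the comment in the modified method suggests.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the modified QueryResult.toString(...):
// fields are joined with a String delimiter instead of a single char.
public class MultiCharDelimiterSketch {
    static final String FIELD_DELIM = "<-T->";   // custom field delimiter
    static final String RECORD_DELIM = "<-L->";  // custom record (line) delimiter

    static String formatRecord(List<Object> fields, boolean useRecordDelim) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(FIELD_DELIM);
            }
            Object field = fields.get(i);
            // Null columns are written as the literal string "null",
            // as in the generated method.
            sb.append(field == null ? "null" : String.valueOf(field));
        }
        if (useRecordDelim) {
            sb.append(RECORD_DELIM);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList(1, "Y", null, "PUBLISH_INFO");
        System.out.print(formatRecord(row, true));
    }
}
```

Running main prints 1<-T->Y<-T->null<-T->PUBLISH_INFO<-L->. Because StringBuilder.append accepts a String as readily as a char, changing the type of the delimiter variable is enough to make each append call emit the longer delimiter.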






2. Build the jar package
The information I found online was incomplete and cumbersome, so I used a simple method (for hadoop2.*).
a. In Eclipse, create a QueryResult project containing the QueryResult.java file, and add the jar packages from hadoop2.* and the sqoop1.4.1.jar in share to the build path. Otherwise QueryResult.java reports that the required classes cannot be found.
b. Export the project as a jar package from Eclipse.
   Take care with this step; once it finishes, the jar package is complete.







3. Run the command
Note: I put QueryResult.jar in the current directory and ran the command from there; other locations have not been tested.

sqoop import --connect jdbc:mysql://192.168.0.1:3306/admin --username admin --password admin --query "SELECT * FROM PUBLISH_INFO where \$CONDITIONS " -m 1 --target-dir /sqoop/test232 --class-name QueryResult --jar-file QueryResult.jar


4. Verify the result
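
One way to check a line of the imported output is to split it on the custom delimiters and count the fields. The following is a minimal sketch, not part of Sqoop: the class name DelimiterCheckSketch, the method splitFields, and the sample record are hypothetical, and it assumes the record delimiter was changed to "<-L->" as described in step 1.

```java
import java.util.regex.Pattern;

// Hypothetical helper for inspecting one record of the imported output.
public class DelimiterCheckSketch {
    static final String FIELD_DELIM = "<-T->";
    static final String RECORD_DELIM = "<-L->";

    static String[] splitFields(String record) {
        // Drop the trailing record delimiter if it is present.
        String body = record.endsWith(RECORD_DELIM)
                ? record.substring(0, record.length() - RECORD_DELIM.length())
                : record;
        // Pattern.quote treats the delimiter as a literal string;
        // the -1 limit keeps trailing empty fields.
        return body.split(Pattern.quote(FIELD_DELIM), -1);
    }

    public static void main(String[] args) {
        String sample = "1<-T->Y<-T->0<-T->PUBLISH_INFO<-L->"; // made-up record
        String[] fields = splitFields(sample);
        System.out.println(fields.length + " fields: " + String.join(" | ", fields));
    }
}
```

For the made-up record above this prints 4 fields: 1 | Y | 0 | PUBLISH_INFO. If the field count does not match the number of selected columns, the delimiter edit in QueryResult.java or the jar passed with --jar-file is the first thing to recheck.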






