Each time Sqoop runs an import or export job, it calls codegen to generate a Java file, then compiles and packages it into a jar for MapReduce to use. This Java file wraps the access methods for the transferred data. Since Sqoop's delimiter options accept only a single character, we can specify a multi-character delimiter by analyzing and modifying this Java file.
Generally, if --query is used to obtain data with a query statement, the generated files are QueryResult.java and QueryResult.jar. If --table is used, the files are named after the specified table. By default, the Java files are generated in the directory from which the sqoop script is run.
1. Modify the QueryResult.java file
First, generate QueryResult.java in the /home/ncms/sp/sqoopjar2 directory:
sqoop codegen --connect jdbc:mysql://192.168.0.1:3306/admin --username admin --password admin --query "SELECT i.*,c.description FROM PUBLISH_INFO i INNER JOIN PUBLISH_CONTENT c ON i.id=c.id where \$CONDITIONS " --bindir /home/ncms/sp/sqoopjar2 --class-name QueryResult --outdir /home/ncms/sp/sqoopjar2
Then edit the toString method in the generated QueryResult.java so that it appends the custom field delimiter:
    public String toString(DelimiterSet delimiters, boolean useRecordDelim) {
        StringBuilder __sb = new StringBuilder();
        // Custom multi-character field delimiter.
        // The record (line) delimiter is handled the same way: change the append
        // at the end of the method to __sb.append("<-L->");
        String fieldDelim = "<-T->";
        __sb.append(FieldFormatter.escapeAndEnclose(id==null?"null":"" + id, delimiters));
        __sb.append(fieldDelim);
        __sb.append(FieldFormatter.escapeAndEnclose(IS_STRUCTURED==null?"null":IS_STRUCTURED, delimiters));
        __sb.append(fieldDelim);
        __sb.append(FieldFormatter.escapeAndEnclose(is_deleted==null?"null":"" + is_deleted, delimiters));
        __sb.append(fieldDelim);
        __sb.append(FieldFormatter.escapeAndEnclose(last_modify==null?"null":"" + last_modify, delimiters));
        __sb.append(fieldDelim);
        __sb.append(FieldFormatter.escapeAndEnclose(table_name==null?"null":table_name, delimiters));
        __sb.append(fieldDelim);
        __sb.append(FieldFormatter.escapeAndEnclose(table_name2==null?"null":table_name2, delimiters));
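To make the effect of this change easier to see, here is a minimal standalone sketch of the same idea. It does not use Sqoop's classes: DelimitedRecord and its format method are hypothetical names made up for illustration, and the escaping done by FieldFormatter.escapeAndEnclose is deliberately left out.

```java
// Standalone illustration only; not part of the generated QueryResult class.
class DelimitedRecord {
    // The same multi-character delimiters used in the modified toString method.
    static final String FIELD_DELIM = "<-T->";
    static final String RECORD_DELIM = "<-L->";

    // Joins the field values with FIELD_DELIM, writing the string "null" for
    // null values, and optionally terminates the record with RECORD_DELIM.
    static String format(Object[] fields, boolean useRecordDelim) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(FIELD_DELIM);
            }
            sb.append(fields[i] == null ? "null" : String.valueOf(fields[i]));
        }
        if (useRecordDelim) {
            sb.append(RECORD_DELIM);
        }
        return sb.toString();
    }
}
```

Calling DelimitedRecord.format(new Object[]{1, null, "PUBLISH_INFO"}, true) produces 1<-T->null<-T->PUBLISH_INFO<-L->, which is the shape of record the modified QueryResult is expected to write.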
2. Build the jar package
The information found on the Internet is incomplete and cumbersome, so I use a simple method (for hadoop2.*).
a. In Eclipse, create a QueryResult project containing the QueryResult.java file, and add the jar packages under the hadoop2.* share directory and sqoop1.4.1.jar to the build path. Without them, QueryResult.java reports that the required classes cannot be found.
b. Export the jar package through Eclipse.
Once this step finishes, the jar package is complete.
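The packaging half of this step can also be done without Eclipse. The following is only a sketch under assumptions: it presumes QueryResult.java has already been compiled to .class files in some directory, and JarPacker and packClasses are hypothetical names, not part of Sqoop or Hadoop. It uses only the standard java.util.jar API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper: packs compiled classes into a jar.
class JarPacker {
    // Adds every *.class file directly under classDir to a new jar at jarPath
    // and returns the number of entries written.
    static int packClasses(Path classDir, Path jarPath) throws IOException {
        Manifest manifest = new Manifest();
        // A manifest is only written if it declares a version.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        int count = 0;
        try (OutputStream out = Files.newOutputStream(jarPath);
             JarOutputStream jar = new JarOutputStream(out, manifest);
             DirectoryStream<Path> classes = Files.newDirectoryStream(classDir, "*.class")) {
            for (Path classFile : classes) {
                // QueryResult is in the default package, so the entry name is the file name.
                jar.putNextEntry(new JarEntry(classFile.getFileName().toString()));
                Files.copy(classFile, jar);
                jar.closeEntry();
                count++;
            }
        }
        return count;
    }
}
```

For example, JarPacker.packClasses(Paths.get("/home/ncms/sp/sqoopjar2"), Paths.get("QueryResult.jar")) would collect the class files from the codegen output directory, assuming they were recompiled after the edit in step 1.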
3. Run the command
Note: I placed QueryResult.jar in the current directory and ran the command from there; other directories are untested.
sqoop import --connect jdbc:mysql://192.168.0.1:3306/admin --username admin --password admin --query "SELECT * FROM PUBLISH_INFO where \$CONDITIONS" -m 1 --target-dir /sqoop/test232 --class-name QueryResult --jar-file QueryResult.jar
4. Verify the result
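One way to check the output is to read a file from the target directory and split it on the custom delimiters. The sketch below assumes the file content has already been loaded into a string and that records are written with the <-T-> and <-L-> delimiters from step 1. DelimitedRecordParser is a hypothetical helper, not part of Sqoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting the imported data.
class DelimitedRecordParser {
    // Splits content into records on recordDelim, then each record into
    // fields on fieldDelim. Pattern.quote makes the delimiters literal,
    // since String.split treats its argument as a regular expression.
    static List<String[]> parse(String content, String fieldDelim, String recordDelim) {
        List<String[]> rows = new ArrayList<>();
        for (String record : content.split(Pattern.quote(recordDelim))) {
            if (record.isEmpty()) {
                continue; // skip empty chunks between consecutive record delimiters
            }
            // A limit of -1 keeps trailing empty fields.
            rows.add(record.split(Pattern.quote(fieldDelim), -1));
        }
        return rows;
    }
}
```

If every row returned by parse(content, "<-T->", "<-L->") has the same number of fields as the query has columns, the custom delimiters were applied as intended.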