Building Impala: Fixing a Maven Compilation Error

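While building Impala I ran into a Maven error; more precisely, it was reported while compiling the testdata module. The command I used was "./buildall.sh -skiptests -format -testdata", and the error was as follows: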

========================================================================
Running mvn  -U package
Directory /home/quanlong/workspace/Impala/testdata
========================================================================
19:54:15 [WARNING] Could not transfer metadata com.cloudera.cdh:cdh-root:6.x-SNAPSHOT/maven-metadata.xml from/to ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): Cannot access ${distMgmtSnapshotsUrl} with type default using the available connector factories: BasicRepositoryConnectorFactory
19:54:26 [WARNING] The POM for org.apache.parquet:parquet-avro:jar:1.10.99-cdh6.x-20200124.115524-1814051 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details
19:54:29 [ERROR] COMPILATION ERROR : 
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[39,42] package org.apache.parquet.hadoop.metadata does not exist
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[79,46] cannot access org.apache.parquet.hadoop.ParquetWriter
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[80,26] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[81,26] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[78,47] cannot access org.apache.parquet.hadoop.api.WriteSupport
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[87,15] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[90,13] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[66,9] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[67,26] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[68,26] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[75,15] cannot find symbol
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[78,13] cannot find symbol
19:54:29 [INFO] BUILD FAILURE
19:54:29 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project impala-testdata: Compilation failure: Compilation failure: 
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[39,42] package org.apache.parquet.hadoop.metadata does not exist
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[79,46] cannot access org.apache.parquet.hadoop.ParquetWriter
19:54:29 [ERROR]   class file for org.apache.parquet.hadoop.ParquetWriter not found
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[80,26] cannot find symbol
19:54:29 [ERROR]   symbol:   variable DEFAULT_BLOCK_SIZE
19:54:29 [ERROR]   location: class org.apache.parquet.avro.AvroParquetWriter
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[81,26] cannot find symbol
19:54:29 [ERROR]   symbol:   variable DEFAULT_PAGE_SIZE
19:54:29 [ERROR]   location: class org.apache.parquet.avro.AvroParquetWriter
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[78,47] cannot access org.apache.parquet.hadoop.api.WriteSupport
19:54:29 [ERROR]   class file for org.apache.parquet.hadoop.api.WriteSupport not found
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[87,15] cannot find symbol
19:54:29 [ERROR]   symbol:   method write(org.apache.avro.generic.GenericRecord)
19:54:29 [ERROR]   location: variable writer of type org.apache.parquet.avro.AvroParquetWriter<org.apache.avro.generic.GenericRecord>
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/JsonToParquetConverter.java:[90,13] cannot find symbol
19:54:29 [ERROR]   symbol:   method close()
19:54:29 [ERROR]   location: variable writer of type org.apache.parquet.avro.AvroParquetWriter<org.apache.avro.generic.GenericRecord>
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[66,9] cannot find symbol
19:54:29 [ERROR]   symbol:   variable CompressionCodecName
19:54:29 [ERROR]   location: class org.apache.impala.datagenerator.RandomNestedDataGenerator
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[67,26] cannot find symbol
19:54:29 [ERROR]   symbol:   variable DEFAULT_BLOCK_SIZE
19:54:29 [ERROR]   location: class org.apache.parquet.avro.AvroParquetWriter
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[68,26] cannot find symbol
19:54:29 [ERROR]   symbol:   variable DEFAULT_PAGE_SIZE
19:54:29 [ERROR]   location: class org.apache.parquet.avro.AvroParquetWriter
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[75,15] cannot find symbol
19:54:29 [ERROR]   symbol:   method write(org.apache.avro.generic.GenericData.Record)
19:54:29 [ERROR]   location: variable writer of type org.apache.parquet.avro.AvroParquetWriter<org.apache.avro.generic.GenericRecord>
19:54:29 [ERROR] /home/quanlong/workspace/Impala/testdata/src/main/java/org/apache/impala/datagenerator/RandomNestedDataGenerator.java:[78,13] cannot find symbol
19:54:29 [ERROR]   symbol:   method close()
19:54:29 [ERROR]   location: variable writer of type org.apache.parquet.avro.AvroParquetWriter<org.apache.avro.generic.GenericRecord>
19:54:29 [ERROR] -> [Help 1]
19:54:29 [ERROR] 
19:54:29 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
19:54:29 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
19:54:29 [ERROR] 
19:54:29 [ERROR] For more information about the errors and possible solutions, please read the following articles:
19:54:29 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
mvn  -U package exited with code 0
ERROR in /home/quanlong/workspace/Impala/bin/create_testdata.sh at line 34: ${IMPALA_HOME}/bin/mvn-quiet.sh package
Generated: /home/quanlong/workspace/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.create_testdata.20200410_02_54_29.xml

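This looks like packages and classes that cannot be found because of missing dependencies. I happened to have another environment where Impala builds fine, so I ran the following command in both to compare the dependencies: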

(pushd testdata && mvn dependency:tree)

The comparison showed that the dependencies of org.apache.parquet:parquet-avro:jar differ. In the working environment they look like this:

[INFO] +- org.apache.parquet:parquet-avro:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  +- org.apache.parquet:parquet-column:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  |  +- org.apache.parquet:parquet-common:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  |  \- org.apache.parquet:parquet-encoding:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  +- org.apache.parquet:parquet-hadoop:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  |  +- org.apache.parquet:parquet-jackson:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |  |  \- commons-pool:commons-pool:jar:1.6:compile
[INFO] |  \- org.apache.parquet:parquet-format-structures:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] |     \- javax.annotation:javax.annotation-api:jar:1.3.2:compile
[INFO] \- org.kitesdk:kite-data-core:jar:1.0.0-cdh6.x-SNAPSHOT:compile
[INFO]    +- org.kitesdk:kite-hadoop-compatibility:jar:1.0.0-cdh6.x-SNAPSHOT:compile
[INFO]    +- org.xerial.snappy:snappy-java:jar:1.1.4:compile
[INFO]    +- net.sf.opencsv:opencsv:jar:2.3:compile
[INFO]    +- org.apache.commons:commons-jexl:jar:2.1.1:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-annotations:jar:2.9.10:compile
[INFO]    \- com.fasterxml.jackson.core:jackson-core:jar:2.9.10:compile

In the failing environment they look like this:

[INFO] +- org.apache.parquet:parquet-avro:jar:1.10.99-cdh6.x-SNAPSHOT:compile
[INFO] \- org.kitesdk:kite-data-core:jar:1.0.0-cdh6.x-SNAPSHOT:compile
[INFO]    +- org.kitesdk:kite-hadoop-compatibility:jar:1.0.0-cdh6.x-SNAPSHOT:compile
[INFO]    +- org.xerial.snappy:snappy-java:jar:1.1.4:compile
[INFO]    +- net.sf.opencsv:opencsv:jar:2.3:compile
[INFO]    \- org.apache.commons:commons-jexl:jar:2.1.1:compile
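
A comparison like this can also be done mechanically rather than by eye. The following is a minimal sketch, assuming the `mvn dependency:tree` output of each environment has been captured as text; the function names `artifacts` and `missing_artifacts` are made up for illustration and are not part of Maven or Impala.

```python
import re

# Matches artifact coordinates at the end of a dependency:tree line, e.g.
# "[INFO] |  +- org.apache.parquet:parquet-column:jar:1.10.99:compile".
# Four or five colon-terminated fields, then the scope (assumed line shape).
_COORD = re.compile(r"((?:[\w.\-]+:){4,5}\w+)\s*$")


def artifacts(tree_output):
    """Return the set of artifact coordinates found in dependency:tree output."""
    found = set()
    for line in tree_output.splitlines():
        if not line.startswith("[INFO]"):
            continue  # skip warnings and other non-tree lines
        match = _COORD.search(line)
        if match:
            found.add(match.group(1))
    return found


def missing_artifacts(good_output, bad_output):
    """Artifacts resolved in the working environment but not in the failing one."""
    return sorted(artifacts(good_output) - artifacts(bad_output))
```

Called with the two outputs, `missing_artifacts` returns the coordinates that only the working environment resolved, which makes a whole missing subtree easy to spot.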

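The transitive dependencies pulled in by parquet-avro:jar are missing. I suspected that the POM files of the two SNAPSHOT packages differed, but a comparison showed they were identical. Only when I searched the output for "parquet" did I notice this warning: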

[WARNING] The POM for org.apache.parquet:parquet-avro:jar:1.10.99-cdh6.x-20200124.115524-1814051 is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details

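This explains why the transitive dependencies of parquet-avro were not pulled in in the failing environment. To find the exact cause, I added -X to mvn to turn on debug logging: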

(pushd testdata && mvn -X dependency:tree)

There is a lot of output; the relevant part is this:

[WARNING] The POM for org.apache.parquet:parquet-avro:jar:1.10.99-cdh6.x-20200124.115524-1814051 is invalid, transitive dependencies (if any) will not be available: 1 problem was encountered while building the effective model for org.apache.parquet:parquet-avro:1.10.99-cdh6.x-SNAPSHOT
[FATAL] Non-parseable POM /home/quanlong/.m2/repository/org/apache/parquet/parquet/1.10.99-cdh6.x-SNAPSHOT/parquet-1.10.99-cdh6.x-SNAPSHOT.pom: processing instruction started on line 307 and column 14 was not closed (position: START_TAG seen ...<?ignore\n           <execution>... @308:23)  @ /home/quanlong/.m2/repository/org/apache/parquet/parquet/1.10.99-cdh6.x-SNAPSHOT/parquet-1.10.99-cdh6.x-SNAPSHOT.pom, line 308, column 23

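This says that a processing instruction opened at line 307, column 14 of the POM file was never closed, with the parser giving up at line 308, column 23. I inspected the file by hand and the syntax looked fine; moreover, this POM file is identical to the one on my working machine.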
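
A manual inspection can be backed up by running the file through an XML parser that is independent of Maven. Below is a minimal sketch using Python's standard library. `SAMPLE_POM` is a hypothetical, heavily shortened stand-in modeled only on the `<?ignore` fragment visible in the error message, not the real parent POM, and `is_well_formed` is a name invented here.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the parent POM: a processing instruction that
# opens on one line and is closed several lines later.
SAMPLE_POM = """<project>
  <build>
    <?ignore
      <execution>
        <id>example</id>
      </execution>
    ?>
  </build>
</project>
"""


def is_well_formed(xml_text):
    """Return True if the standard-library XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A multi-line processing instruction like the one in `SAMPLE_POM` is well-formed XML, so `is_well_formed(SAMPLE_POM)` returns True. If the real POM also passes such a check, the complaint is more likely to come from the tool reading the file than from the file itself.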

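Could this be a Maven bug? I checked the Maven versions in use, and sure enough they differed: the failing environment used apache-maven-3.6.1, while the working one used apache-maven-3.6.2. When I switched the failing environment to Maven 3.6.2, the build succeeded; switching back to 3.6.1 reproduced the same error. So this is clearly a Maven bug. Looking through the Maven 3.6.2 release notes, I found the bug responsible: https://issues.apache.org/jira/browse/MNG-6707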

Summary

Do not use Maven 3.6.1 to build Impala; otherwise compiling the testdata module fails with the errors shown above.
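
To catch this before a long build starts, the Maven version can be checked up front. The sketch below assumes the output of `mvn -version` contains a line of the form "Apache Maven x.y.z"; the helper names are invented for illustration, and only 3.6.1 is flagged because that is the only release observed failing here.

```python
import re


def maven_version(version_output):
    """Extract (major, minor, patch) from the first 'Apache Maven x.y.z' found."""
    match = re.search(r"Apache Maven (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None  # not recognizable as Maven version output
    return tuple(int(part) for part in match.groups())


def is_affected(version_output):
    """True only for 3.6.1, the one release seen to fail in this write-up."""
    return maven_version(version_output) == (3, 6, 1)
```

The text passed in could come from something like `subprocess.run(["mvn", "-version"], capture_output=True, text=True).stdout`, with the build aborted early when `is_affected` returns True.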

Reposted from blog.csdn.net/huang_quanlong/article/details/105435921