Spark(5) Upgrade Spark to Version 1.0.2
1. Upgrade the Version to 1.0.2
If you plan to build from source:
>git clone https://github.com/apache/spark.git
Check out the tagged version:
>git tag -l
This lists the available tags.
>git checkout v1.0.2
>git pull origin v1.0.2
>sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
>sbt/sbt -Dhadoop.version=2.2.0 -Pyarn publish-local
Or run a normal build:
>sbt/sbt update
>sbt/sbt compile
>sbt/sbt assembly
But I downloaded the binary version from the official website and continued with my example.
Error Message
14/08/08 17:33:03 WARN scheduler.TaskSetManager: Loss was due to java.lang.NoClassDefFoundError java.lang.NoClassDefFoundError: Could not initialize class scala.Predef$
Bad type in putfield/putstatic
14/08/08 22:07:07 ERROR executor.ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,run-main-group-0] java.lang.VerifyError: Bad type on operand stack Exception Details: Location: scala/collection/IndexedSeq$.ReusableCBF$lzycompute()Lscala/collection/generic/GenTraversableFactory$GenericCanBuildFrom; @19: putfield Reason: Type 'scala/collection/IndexedSeq$$anon$1' (current frame, stack[1]) is not assignable to 'scala/collection/generic/GenTraversableFactory$GenericCanBuildFrom' Current Frame: bci: @19 flags: { } locals: { 'scala/collection/IndexedSeq$', 'scala/collection/IndexedSeq$' } stack: { 'scala/collection/IndexedSeq$', 'scala/collection/IndexedSeq$$anon$1' }
Solution:
Data serialization tuning guide: https://spark.apache.org/docs/latest/tuning.html#data-serialization
For the Joda-Time timezone problem, set the JVM default timezone:
-Duser.timezone=UTC
or construct dates with an explicit zone:
new DateTime(DateTimeZone.forID("UTC"))
For object serialization, Kryo is an alternative: https://github.com/EsotericSoftware/kryo
The VerifyError above usually indicates a Scala binary mismatch, so update the Scala version to 2.10.4 to match the one Spark 1.0.2 is built against.
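On the application side, the lasting fix is to compile against the same Scala line that Spark itself uses. A hedged build.sbt fragment as a sketch (the versions mirror this post's setup; treat the coordinates as assumptions for your own build):

```scala
// build.sbt -- match the Scala version Spark 1.0.2 is built with,
// otherwise the VerifyError shows up at task deserialization time
scalaVersion := "2.10.4"

// spark-core marked "provided": the cluster supplies it at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.2" % "provided"
```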
2. Deployment
Standalone
Start Master
>vi conf/spark-env.sh
SPARK_MASTER_IP=localhost
SPARK_LOCAL_IP=localhost
>./sbin/start-master.sh
The main class is org.apache.spark.deploy.master.Master.
Web UI
http://localhost:8080/
Start Worker
>./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077
Error Message
java.lang.NoClassDefFoundError: com/google/protobuf/ProtocolMessageEnum
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
Solution: download the missing protobuf jar and put it on the classpath.
>wget http://central.maven.org/maven2/com/google/protobuf/protobuf-java/2.4.1/protobuf-java-2.4.1.jar
>wget http://central.maven.org/maven2/org/spark-project/protobuf/protobuf-java/2.4.1-shaded/protobuf-java-2.4.1-shaded.jar
Check the log file for the error message:
>tail -f spark-root-org.apache.spark.deploy.master.Master-1-carl-macbook.local.out
Error Message:
14/08/09 03:48:45 ERROR EndpointWriter: AssociationError [akka.tcp://[email protected]:7077] -> [akka.tcp://[email protected]:62531]: Error [Association failed with [akka.tcp://[email protected]:62531]] [ akka.remote.EndpointAssociationException: Association failed with [akka.tcp://[email protected]:62531] Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: /192.168.11.11:62531 ]
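This association failure usually means the master is connecting back to the driver at an address or port it cannot actually reach, often because the driver advertised the wrong local IP. A hedged spark-env.sh fragment pinning the addresses (the values are assumptions for this two-machine setup):

```shell
# conf/spark-env.sh -- use names/addresses reachable from BOTH machines (values are assumptions)
SPARK_MASTER_IP=ubuntu-master
SPARK_LOCAL_IP=192.168.11.12
```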
Solution:
After a while I switched to trying to build it myself.
I went directly to System Preferences to add one user named spark.
Generate an SSH key pair, if needed:
>ssh-keygen -t rsa
Find the public key:
>cat /Users/carl/.ssh/id_rsa.pub
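For the master's scripts to reach a worker account over ssh without prompting, the key must be passwordless and its public half appended to the remote authorized_keys. A minimal sketch, assuming an example key path and a sparkWorker@ubuntu-master target (both are assumptions, not from the original setup):

```shell
# Generate a fresh, passwordless RSA key pair at an example path
rm -f /tmp/spark_demo_key /tmp/spark_demo_key.pub
ssh-keygen -t rsa -N "" -f /tmp/spark_demo_key -q

# Install the public half on the worker account (target host is an assumption)
# ssh-copy-id -i /tmp/spark_demo_key.pub sparkWorker@ubuntu-master
```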
That did not seem to work, so I planned to do it on an Ubuntu VM instead.
>sudo adduser sparkWorker --force-badname
Check my host:
>host ubuntu-master
Host ubuntu-master not found: 3(NXDOMAIN)
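The NXDOMAIN result means ubuntu-master does not resolve through DNS, so any machine that refers to the master by that name needs its own hosts entry. A hedged /etc/hosts fragment (the address is an assumption based on this setup):

```
# /etc/hosts -- map the master's hostname to a LAN-reachable address (address is an assumption)
192.168.11.12   ubuntu-master
```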
Check that Spark is running and listening:
>netstat -at | grep 7077
tcp6 0 0 ubuntu-master:7077 [::]:* LISTEN
>bin/spark-submit --class com.sillycat.spark.app.ClusterComplexJob --master spark://192.168.11.12:7077 --total-executor-cores 1 /Users/carl/work/sillycat/sillycat-spark/target/scala-2.10/sillycat-spark-assembly-1.0.jar
Turn off IPv6 on the Mac:
networksetup -listallnetworkservices | sed 1d | xargs -I {} networksetup -setv6off {}
None of these attempts worked.
I tried the latest version from GitHub, 1.1.0-SNAPSHOT.
The standalone cluster is still not working.
References:
http://spark.apache.org/docs/latest/spark-standalone.html
http://spark.apache.org/docs/latest/building-with-maven.html
https://github.com/mesos/spark.git
http://www.iteblog.com/archives/1038
http://www.iteblog.com/archives/1016
My Spark Blogs
http://sillycat.iteye.com/blog/1871204
http://sillycat.iteye.com/blog/1872478
http://sillycat.iteye.com/blog/2083193
http://sillycat.iteye.com/blog/2083194
ubuntu add/remove user
https://www.digitalocean.com/community/tutorials/how-to-add-and-delete-users-on-ubuntu-12-04-and-centos-6
https://spark.apache.org/docs/latest/submitting-applications.html
https://spark.apache.org/docs/latest/spark-standalone.html
disable ipv6
http://askubuntu.com/questions/309461/how-to-disable-ipv6-permanently
spark source code
http://www.cnblogs.com/hseagle/p/3673147.html
Reposted from sillycat.iteye.com/blog/2103288