Deadlocks with java.util.concurrent._ in the Scala REPL

clog14 :

I came across the following scenario when studying the book "Functional Programming in Scala" by Paul Chiusano and Rúnar Bjarnason (Ch. 7 - Purely functional parallelism).

    package fpinscala.parallelism

    import java.util.concurrent._
    import language.implicitConversions


    object Par {
      type Par[A] = ExecutorService => Future[A]

      def run[A](s: ExecutorService)(a: Par[A]): Future[A] = a(s)

      def unit[A](a: A): Par[A] = (es: ExecutorService) => UnitFuture(a) // `unit` is represented as a function that returns a `UnitFuture`, which is a simple implementation of `Future` that just wraps a constant value. It doesn't use the `ExecutorService` at all. It's always done and can't be cancelled. Its `get` method simply returns the value that we gave it.

      private case class UnitFuture[A](get: A) extends Future[A] {
        def isDone = true
        def get(timeout: Long, units: TimeUnit) = get
        def isCancelled = false
        def cancel(evenIfRunning: Boolean): Boolean = false
      }

      def map2[A,B,C](a: Par[A], b: Par[B])(f: (A,B) => C): Par[C] = // `map2` doesn't evaluate the call to `f` in a separate logical thread, in accord with our design choice of having `fork` be the sole function in the API for controlling parallelism. We can always do `fork(map2(a,b)(f))` if we want the evaluation of `f` to occur in a separate thread.
        (es: ExecutorService) => {
          val af = a(es)
          val bf = b(es)
          UnitFuture(f(af.get, bf.get)) // This implementation of `map2` does _not_ respect timeouts. It simply passes the `ExecutorService` on to both `Par` values, waits for the results of the Futures `af` and `bf`, applies `f` to them, and wraps them in a `UnitFuture`. In order to respect timeouts, we'd need a new `Future` implementation that records the amount of time spent evaluating `af`, then subtracts that time from the available time allocated for evaluating `bf`.
        }

      def fork[A](a: => Par[A]): Par[A] = // This is the simplest and most natural implementation of `fork`, but there are some problems with it--for one, the outer `Callable` will block waiting for the "inner" task to complete. Since this blocking occupies a thread in our thread pool, or whatever resource backs the `ExecutorService`, this implies that we're losing out on some potential parallelism. Essentially, we're using two threads when one should suffice. This is a symptom of a more serious problem with the implementation, and we will discuss this later in the chapter.
        es => es.submit(new Callable[A] {
          def call = a(es).get
        })

      def lazyUnit[A](a: => A): Par[A] = fork(unit(a))

      def equal[A](e: ExecutorService)(p: Par[A], p2: Par[A]): Boolean =
        p(e).get == p2(e).get

    }

You can find the original code on GitHub here. See here for the java.util.concurrent documentation.

I am concerned with the implementation of fork. In particular, fork can allegedly lead to deadlocks when the thread pool is too small.

I consider the following example:

val a = Par.lazyUnit(42 + 1)
val es: ExecutorService = Executors.newFixedThreadPool(2)
println(Par.fork(a)(es).get)  

I would not expect this example to end up in a deadlock as there are two threads. Yet, it does on my computer when I run it in the Scala REPL. Why is this so?

The output when initializing the ExecutorService is:

es: java.util.concurrent.ExecutorService = java.util.concurrent.ThreadPoolExecutor@73a86d72[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]

Is pool size = 0 correct here? In other words, is this a problem of not understanding java.util.concurrent._, or is the issue with not understanding the Scala part?

SergGr :

OK, after a long investigation I believe I have an answer. The full story is long but I'll try to shorten it by simplifying and avoiding many details.

Note: Scala can potentially be compiled for various target platforms, but this particular issue happened with Java/JVM as the target, so that is what this answer is about.

The deadlock you see has nothing to do with the size of the thread pool; it is actually the outer fork call that hangs. It comes from a combination of REPL implementation details and multi-threading, but it takes learning a few pieces to understand how it happens:

  • how the Scala REPL works
  • how Scala compiles objects to Java/JVM
  • how Scala emulates by-name parameters on the Java/JVM
  • how the Java/JVM runs the static initializers of classes

A short(er) version (see also the Summary at the end) is that this code hangs in the REPL because, when it is executed by the REPL, it is logically similar to the following code:

object DeadLock {

  import scala.concurrent._
  import scala.concurrent.duration.Duration
  import scala.concurrent.ExecutionContext.Implicits.global

  val foo: Int = Await.result(Future(calc()), Duration.Inf)

  def printFoo(): Unit = {
    println(s"Foo = $foo")
  }

  private def calc(): Int = {
    println("Before calc")
    42
  }
}


def test(): Unit = {
  println("Before printFoo")
  DeadLock.printFoo()
  println("After printFoo")
} 
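When test() is invoked, only "Before printFoo" is printed; the initialization of DeadLock then blocks forever inside Await.result and the call never returns, for the reasons explained below.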

or very similar in the Java world:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class Deadlock {
    static CompletableFuture<Integer> cf;
    static int foo;

    public static void printFoo() {
        System.out.println("Print foo " + foo);
    }

    static {
        cf = new CompletableFuture<Integer>();
        new Thread(new Runnable() {
            @Override
            public void run() {
                calcF();
            }
        }).start();
        try {
            foo = cf.get();
            System.out.println("Future result = " + cf.get());
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }


    private static void calcF() {
        cf.complete(42);
    }
}

class Main {
    public static void main(String[] args) {
        System.out.println("Before foo");
        Deadlock.printFoo();
        System.out.println("After foo");
    }
}

If it is clear to you why this code deadlocks, you already know most of the story and probably can deduce the rest yourself. You might just glance over the Summary section at the end.

How can a Java static initializer deadlock?

Let's start from the end of this story: why does the Java code hang? It happens because of two guarantees the Java/JVM makes for static initializers (for more details see section 12.4.2 Detailed Initialization Procedure of the JLS):

  • the static initializer will be run before any other "external" use of the class

  • the static initializer will be run exactly once, and the JVM enforces this with locking (an initialization lock per class)

The lock used for the static initializer is implicit and managed by the JVM, but it is there. It means that the code is logically similar to something like this:

class Deadlock {

    static boolean staticInitFinished = false;
    // unique value for each thread!
    static ThreadLocal<Boolean> currentThreadRunsStaticInit = ThreadLocal.withInitial(() -> Boolean.FALSE);


    static CompletableFuture<Integer> cf;
    static int foo;

    static void enforceStaticInit() {
        synchronized (Deadlock.class) {
            // is init finished?
            if (staticInitFinished)
                return;
            // are we the thread already running the init?
            if(currentThreadRunsStaticInit.get())
                return;
            currentThreadRunsStaticInit.set(true);

            cf = new CompletableFuture<Integer>();
            new Thread(new Runnable() {
                @Override
                public void run() {
                    calcF();
                }
            }).start();
            try {
                foo = cf.get();
                System.out.println("Future result = " + cf.get());
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (ExecutionException e) {
                e.printStackTrace();
            }
            currentThreadRunsStaticInit.set(false);
            staticInitFinished = true;
        }
    }

    private static void calcF() {
        enforceStaticInit();
        cf.complete(42);
    }

    public static void printFoo() {
        enforceStaticInit();
        System.out.println("Print foo " + foo);
    }
}

Now it is pretty clear why this code deadlocks: our static initializer starts a new thread and blocks waiting for a result from it. But that new thread tries to access the same class (the calcF method) and, being another thread, it has to wait for the already running static initializer to finish. Note that if the calcF method were in another class, everything would work just fine.
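For contrast, here is a Scala sketch of that last remark (the names Helper, CalcTask and NoDeadLock are mine, not from the original code): everything the worker thread touches lives in classes whose initializers can run independently, so the initializer of NoDeadLock completes. Note that I deliberately use an explicit Runnable class instead of a lambda, because (as the by-name section below explains) a lambda body would itself be compiled to a method on NoDeadLock and could re-introduce the very same problem.

object Helper {
  def calc(): Int = {
    println("Before calc")
    42
  }
}

// An explicit Runnable class: nothing on its run path touches NoDeadLock.
class CalcTask(p: scala.concurrent.Promise[Int]) extends Runnable {
  def run(): Unit = p.success(Helper.calc())
}

object NoDeadLock {
  import scala.concurrent.{Await, Promise}
  import scala.concurrent.duration.Duration

  // The worker thread only needs Helper and CalcTask, so it never waits for
  // NoDeadLock's (still in-progress) static initializer.
  val foo: Int = {
    val p = Promise[Int]()
    new Thread(new CalcTask(p)).start()
    Await.result(p.future, Duration.Inf)
  }

  def printFoo(): Unit = println(s"Foo = $foo")
}

Calling NoDeadLock.printFoo() prints "Before calc" and then "Foo = 42" instead of hanging.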

How the Scala REPL works

Now let's get back to the start of the story: how the Scala REPL works. What follows is a great simplification of the real deal, but it captures the details that matter for this situation. Luckily for the REPL implementors, the Scala compiler is written in Scala. This means the REPL doesn't have to somehow interpret the code; it can run it through the standard compiler and then run the compiled code via the Java Reflection API. This still takes some decoration of the code to make the compiler happy and to get the results back.

Simplifying it a bit (or well, a lot), when you type something like

val a = Par.lazyUnit(42 + 1)

into the REPL the code is analyzed and transformed into something like this:

package line3

object read {
    val a = Par.lazyUnit(42 + 1)
    val res3 = a
}

object eval {
    def print() = {
        println("a: Par.Par[Int] = " + read.res3)
    }
}

and then line3.eval.print() is called via reflection.

A similar story happens for:

val es: ExecutorService = Executors.newFixedThreadPool(2)

and finally when you do

Par.fork(a)(es).get

things get a bit more interesting, because you have a dependency on the previous lines that is cleverly implemented using imports:

package line5

object read {
    import line2.read.Par
    import line3.read.a
    import line4.read.es

    val res5 = Par.fork(a)(es).get
}

object eval {
    def print() = {
        println("res5: Int = " + read.res5)
    }
}

The important thing here is that everything you write into the REPL is wrapped into a brand-new object and then compiled and run like ordinary code.
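By the way, if you want to see this wrapping with your own eyes, you can start the REPL with the compiler's tree-printing option (for example scala -Xprint:parser) and it will dump the generated wrapper code for every line you type. Keep in mind that the real wrapper and object names differ between Scala versions and are messier than the simplified read/eval objects shown here.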

How Scala emulates by-name parameters on the Java/JVM

The definition of the fork method uses a by-name parameter:

def fork[A](a: => Par[A]): Par[A] =

Here it is used to evaluate a lazily, which is crucial to the whole logic of fork. The Java/JVM has no standard support for lazy evaluation, but it can be emulated, and this is what the Scala compiler does. Internally the signature is changed to use a Function0:

def fork[A](aWrapper: () => Par[A]): Par[A] = 

and every access to a is substituted with a call to aWrapper.apply(). Another part of the magic happens on the caller side of a method with a by-name parameter: there the argument also has to be wrapped into a Function0, so the code becomes something like

object read {
    import line2.read.Par
    import line3.read.a
    import line4.read.es

    val res5 = Par.fork(() => a)(es).get
}

But actually it is a bit different. Naively it would take another class just for this small function, and this feels wasteful for such simple logic. In practice, in Scala 2.12 the magic of the Java 8 LambdaMetafactory is used, so the code really becomes something like

object read {
    import line2.read.Par
    import line3.read.a
    import line4.read.es

    def aWrapper(): Par.Par[Int] = a

    val res5 = Par.fork(aWrapper _)(es).get
}

where aWrapper _ signifies converting a method to a Function0, which is done with the LambdaMetafactory. As you might suspect from the section on the Java static initializer deadlock, the introduction of the def aWrapper is a crucial difference. You can already see that this code is very similar to the first Scala snippet in this answer, the one that hangs.
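If you want a tiny self-contained illustration of this desugaring (my own example, not from the book or the REPL transcript): a by-name parameter behaves exactly like an explicit Function0 that is applied at every use site. Typed into the REPL:

// By-name version: the argument expression is not evaluated until `a` is used.
def byName[A](a: => A): A = {
  println("inside byName, before forcing the argument")
  a // under the hood this is aWrapper.apply()
}

// What the compiler effectively turns it into: an explicit Function0 parameter.
def byFunction0[A](a: () => A): A = {
  println("inside byFunction0, before forcing the argument")
  a()
}

byName { println("argument evaluated"); 42 }              // "inside byName ..." prints first
byFunction0(() => { println("argument evaluated"); 42 })  // same behaviour, spelled out explicitly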

How Scala compiles an object on the Java/JVM

The final piece of the puzzle is how a Scala object is compiled for the Java/JVM. It is essentially compiled to something similar to a "static class", but since you can pass an object around as a value (for example as a parameter), it has to be a bit more complicated. In reality, all the initialization logic is moved to the constructor of the object's class, and there is a simple static initializer that calls it. So our last read object (ignoring the imports) would look like this in Java:

class read$ {
    static read$ MODULE$;

    static {
        new read$();
    }

    private int res5;

    private read$() {
        MODULE$ = this;
        res5 = Par.fork(read$::aWrapper)(es).get();
    }

    private static Par[Int] aWrapper() {
        return line3.read$.MODULE$.a;
    }
}

Here again read$::aWrapper signifies building a Function0 from the aWrapper method using the LambdaMetafactory. In other words, the initialization of the Scala object is translated into code that is run as part of the Java static initializer.
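You can observe this object-as-static-initializer behaviour directly with a tiny example of my own: the object's body runs lazily, on first access, and exactly once, which is exactly the JVM class-initialization contract described above. In the REPL:

object InitDemo {
  // This line runs inside the static initializer of the compiled InitDemo$ class.
  println("initializing InitDemo")
  val x = 42
}

println("before first access")
println(InitDemo.x) // triggers initialization: prints "initializing InitDemo", then 42
println(InitDemo.x) // already initialized: prints only 42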

Summary

To sum up how things get screwed:

  • REPL turns your code into a new object for each line and compiles it

  • object initialization logic is translated into Java static initialization logic

  • a call to a method with a by-name parameter is, in simple cases, translated into a synthetic method that wraps the "return the value" logic, and that method is added to the same class or object

  • Par.fork, being executed as part of the object initialization (i.e. as part of the Java static initializer), tries to evaluate the by-name parameter (i.e. calls the method on the same class) on a different thread and blocks waiting for that thread's result

  • the Java static initializer is logically executed under a lock (one per class), so it blocks that other thread's call into the method; but it is itself blocked waiting for that method call to finish.
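A practical consequence, and a workaround sketch of my own (not from the original answer): if the same three lines run inside a method, nothing is evaluated during an object's static initialization, so there is no initialization lock for the pool thread to wait on and the call completes. This assumes the fpinscala.parallelism.Par object from the question is on the classpath; ForkDemo is a name I made up.

import java.util.concurrent.{ExecutorService, Executors}
import fpinscala.parallelism.Par

object ForkDemo {
  def main(args: Array[String]): Unit = {
    // a, es and the fork call are all locals of main; by the time main runs,
    // ForkDemo's static initializer has already finished, so the pool thread
    // evaluating the by-name wrapper does not block on it.
    val a = Par.lazyUnit(42 + 1)
    val es: ExecutorService = Executors.newFixedThreadPool(2)
    println(Par.fork(a)(es).get) // prints 43
    es.shutdown()
  }
}

The same trick helps in the REPL itself: defining such a method (or object with a main) and then calling it works, because nothing the pool thread executes belongs to a class whose initializer is still running.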
