Do you really understand thread switching in RxJava?

RxJava makes thread switching easy, so on Android it is often used in place of native utilities such as AsyncTask and Handler. It is simple to use, but without understanding the principles behind it you can easily introduce bugs through improper use. This article briefly walks through how RxJava implements thread switching and what to watch out for in development.

1. Basic Usage

  • Scheduler

If you want to introduce multithreading into your cascade of Observable operators, you can do so by instructing those operators to operate on particular Schedulers.
In other words, a Scheduler makes operators run on a specified thread, which is how multi-threaded scheduling is achieved.

  • observeOn

Specifies the Scheduler on which an observer will observe this Observable.

  • subscribeOn

Specifies the Scheduler on which an Observable will operate.
That is, the Observable itself runs on the specified Scheduler.

Each operator in an RxJava call chain creates a new Observable, and that new Observable registers a callback with the Observable upstream of it. subscribeOn and observeOn are both built on this mechanism:

  • subscribeOn subscribes upstream in the specified thread (call the upstream subscribe method in the specified thread)
  • observeOn calls the downstream callback methods (onNext/onError/onComplete, etc.) on the specified thread after receiving the data

RxJava establishes subscriptions from bottom to top and then transmits data from top to bottom. So even if subscribeOn appears after observeOn, it still determines the thread on which the data source runs, because subscription always happens first.
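The ordering argument above can be modelled with plain JDK types. The following is a minimal sketch, not RxJava's implementation: `Source`, `subscribeOn`, `observeOn` and `named` are hypothetical stand-ins written only for this illustration. It places the thread hop for subscribeOn around the upstream subscribe call and the hop for observeOn around the downstream callback, then records where each step ran.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class MiniRx {
    /** Hypothetical stand-in for Observable: pushes values into the given downstream. */
    interface Source<T> {
        void subscribe(Consumer<T> downstream);
    }

    /** Like subscribeOn: hop threads BEFORE subscribing to the upstream. */
    static <T> Source<T> subscribeOn(Source<T> upstream, Executor executor) {
        return downstream -> executor.execute(() -> upstream.subscribe(downstream));
    }

    /** Like observeOn: subscribe directly, hop threads when a value arrives. */
    static <T> Source<T> observeOn(Source<T> upstream, Executor executor) {
        return downstream -> upstream.subscribe(
                value -> executor.execute(() -> downstream.accept(value)));
    }

    /** A single daemon thread with a recognisable name. */
    static Executor named(String name) {
        return Executors.newSingleThreadExecutor(task -> {
            Thread t = new Thread(task, name);
            t.setDaemon(true);
            return t;
        });
    }

    /** Builds source -> observeOn -> subscribeOn and logs the thread of each step. */
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        Source<Integer> source = downstream -> {
            log.add("emit on " + Thread.currentThread().getName());
            downstream.accept(1);
        };
        // subscribeOn is applied AFTER observeOn, yet it still decides the source thread.
        Source<Integer> chain =
                subscribeOn(observeOn(source, named("observe")), named("subscribe"));
        chain.subscribe(value -> {
            log.add("receive on " + Thread.currentThread().getName());
            done.countDown();
        });
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the source emits on the "subscribe" thread and the final consumer receives on the "observe" thread, regardless of the order in which the two wrappers were applied.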

2. subscribeOn

2.1 Implementation principle

The source code shows the basic principle by which subscribeOn switches threads:

// ObservableSubscribeOn.java (abridged)
final class ObservableSubscribeOn<T> extends Observable<T> {
    
    @Override
    public void subscribeActual(final Observer<? super T> s) {
        final SubscribeOnObserver<T> parent = new SubscribeOnObserver<T>(s);
        s.onSubscribe(parent);
        // Does not call subscribe directly; it first switches threads via scheduler.scheduleDirect
        parent.setDisposable(
            scheduler.scheduleDirect(new SubscribeTask(parent)));
    }
    
    final class SubscribeTask implements Runnable {
        @Override
        public void run() {
            // run() is invoked on the specified scheduler, so the thread has already
            // changed by the time we subscribe upstream; this fixes the upstream's thread
            source.subscribe(parent);
        }
    }
    
    static final
    class SubscribeOnObserver<T> implements Observer<T>, Disposable {
       
        @Override
        public void onNext(T t) {
            // No thread switch is performed when data arrives
            actual.onNext(t);
        }
    }
}

2.2 subscribeOn only takes effect once

subscribeOn changes the thread on which Observable.create runs by switching the subscription thread, which in turn affects the thread on which data is emitted.

Since subscription proceeds bottom-up, Observable.create is affected only by the subscribeOn nearest to it. When there are several subscribeOn calls in the chain, only the first one (the one closest to the source) decides where the source runs. The other subscribeOn calls can still affect the thread on which upstream doOnSubscribe callbacks execute.

@Test
fun test() {
    Observable.create<Unit> { emitter ->
        log("onSubscribe")
        emitter.onNext(Unit)
        emitter.onComplete()
    }.subscribeOn(namedScheduler("1 - subscribeOn"))
        .doOnSubscribe { log("1 - doOnSubscribe") }
        .subscribeOn(namedScheduler("2 - subscribeOn"))
        .doOnSubscribe { log("2 - doOnSubscribe") }
        .doOnNext { log("onNext") }
        .test().awaitTerminalEvent() // Wait until observable completes
}

2.3 Correctly understand the meaning of subscribeOn

Even though we added .subscribeOn() that is not enough. SubscribeOn operator only switches the subscribing process to the desired thread, but that doesn't mean the items will be emitted on that thread.
In other words, subscribeOn determines the subscription thread, but this does not mean that upstream data must be emitted from that thread.

@Test
fun test() {
    val observable = Observable.create<Int> { emitter ->
        log("onSubscribe")
        thread(name = "Main thread", isDaemon = false) {
            log("1 - emitting"); emitter.onNext(1)
            log("2 - emitting"); emitter.onNext(2)
            log("3 - emitting"); emitter.onNext(3)
            emitter.onComplete()
        }
    }
    
    observable
        .subscribeOn(Schedulers.computation())
        .doOnNext { log("$it - after subscribeOn") }
        .test().awaitTerminalEvent() // Wait until observable completes
}

A correct understanding of the meaning of subscribeOn helps to avoid some misunderstandings in use:

No effect on PublishSubject

@Test
fun test() {
    val subject = PublishSubject.create<Int>()
    val observer1 = subject
        .subscribeOn(Schedulers.io())
        .doOnNext { log("$it - I want this happen on an IO thread") }
        .test()
    val observer2 = subject
        .subscribeOn(Schedulers.newThread())
        .doOnNext { log("$it - I want this happen on a new thread") }
        .test()
    
    sleep(10); 
    subject.onNext(1)
    subject.onNext(2)
    subject.onNext(3)
    subject.onComplete()
    
    observer1.awaitTerminalEvent()
    observer2.awaitTerminalEvent()
}

For a PublishSubject, the thread that upstream data arrives on is determined by whoever calls onNext, so applying subscribeOn to a PublishSubject has no effect on it.
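A minimal JDK-only sketch of why this is so. `MiniSubject` is a hypothetical stand-in and not RxJava's PublishSubject; it only assumes that a subject forwards each value synchronously to its observers. Under that assumption, the observer necessarily runs on whichever thread called onNext.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class MiniSubject {
    private final List<Consumer<Integer>> observers = new CopyOnWriteArrayList<>();

    /** Subscribing only registers the callback; no thread is chosen here. */
    void subscribe(Consumer<Integer> observer) {
        observers.add(observer);
    }

    /** Forwards the value synchronously, so observers run on the caller's thread. */
    void onNext(int value) {
        for (Consumer<Integer> observer : observers) {
            observer.accept(value);
        }
    }

    /** Returns the name of the thread on which the observer was invoked. */
    static String run() {
        MiniSubject subject = new MiniSubject();
        String[] seenOn = new String[1];
        subject.subscribe(value -> seenOn[0] = Thread.currentThread().getName());

        Thread producer = new Thread(() -> subject.onNext(1), "producer");
        producer.start();
        try {
            producer.join(); // join also makes the write to seenOn visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenOn[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```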

No effect on Observable.just()

subscribeOn normally determines the thread on which Observable.create { ... } executes, so a common beginner mistake is to do time-consuming work inside Observable.just(...) and assume it will run on the subscribeOn thread.

Putting a call such as readFromDb() inside just() is clearly inappropriate: the argument to just() is evaluated immediately on the current thread, so it is not affected by subscribeOn. It should be rewritten as follows:

//Observable.defer
Observable.defer { Observable.just(readFromDb()) }
    .subscribeOn(Schedulers.io())
    .subscribe { ... }

//Observable.fromCallable
Observable.fromCallable { readFromDb() }
    .subscribeOn(Schedulers.io())
    .subscribe { ... }
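The difference comes down to when the argument is evaluated, which can be shown without RxJava. In the sketch below, `just`, `fromCallable` and `subscribeOn` are hypothetical JDK-only stand-ins, and `readFromDb()` is a placeholder that simply reports the thread it ran on.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EagerVsDeferred {
    /** Placeholder for a slow query: reports the thread it ran on. */
    static String readFromDb() {
        return Thread.currentThread().getName();
    }

    /** Like just(value): the value was computed before this method was called. */
    static <T> Callable<T> just(T value) {
        return () -> value;
    }

    /** Like fromCallable(body): nothing runs until the task is executed. */
    static <T> Callable<T> fromCallable(Callable<T> body) {
        return body;
    }

    /** Like subscribeOn: run the source on the executor and wait for its value. */
    static <T> T subscribeOn(Callable<T> source, ExecutorService executor) {
        try {
            return executor.submit(source).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns {caller thread, thread used by just, thread used by fromCallable}. */
    static String[] run() {
        ExecutorService io = Executors.newSingleThreadExecutor(task -> new Thread(task, "io"));
        try {
            // readFromDb() is evaluated right here, on the caller, before any hop.
            String eager = subscribeOn(just(readFromDb()), io);
            // Here only a reference is passed; the call happens on the "io" thread.
            Callable<String> lazy = fromCallable(EagerVsDeferred::readFromDb);
            String deferred = subscribeOn(lazy, io);
            return new String[] {Thread.currentThread().getName(), eager, deferred};
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run()));
    }
}
```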

Use flatMap to handle concurrency

subscribeOn determines only the subscription thread of the Observable it is applied to, so take special care when using flatMap.

Observable.fromIterable(listOf("id1", "id2", "id3"))
    .flatMap { id -> loadData(id) }
    .subscribeOn(Schedulers.io())
    .observeOn(mainThread())
    .toList()
    .subscribe { result -> log(result) }

If we want the loadData(id) calls to execute concurrently, the code above is wrong.

subscribeOn determines the thread upstream of flatMap, and the Observables returned inside flatMap are all subscribed on that same thread. The loadData subscriptions therefore run on a single thread and cannot proceed in parallel.

To get parallel execution, change the code as follows:

Observable.fromIterable(listOf("id1", "id2", "id3"))
    .flatMap { id ->
        loadData(id)
            .subscribeOn(Schedulers.io())
    }
    .observeOn(mainThread())
    .toList()
    .subscribe { result -> log(result) }
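As a rough JDK-only analogy (not RxJava code), the two variants differ in where the executor hop is placed: once around the whole chain, or once per item. In the sketch below `loadData` is a placeholder that reports its thread, and the latch in `hopPerItem` exists only to make the overlap observable; the class and method names are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FlatMapThreads {
    /** Placeholder for loadData(id): reports which thread did the work. */
    static String loadData(String id) {
        return Thread.currentThread().getName();
    }

    /** One hop for the whole chain: every loadData call shares that single task. */
    static Set<String> hopOnce(List<String> ids, ExecutorService io) throws Exception {
        Callable<Set<String>> wholeChain = () -> {
            Set<String> threads = new HashSet<>();
            for (String id : ids) {
                threads.add(loadData(id));
            }
            return threads;
        };
        return io.submit(wholeChain).get(5, TimeUnit.SECONDS);
    }

    /** One hop per item: each loadData call is submitted as its own task. */
    static Set<String> hopPerItem(List<String> ids, ExecutorService io) throws Exception {
        // Each task waits until all tasks have started, proving that they overlap.
        CountDownLatch allStarted = new CountDownLatch(ids.size());
        List<Future<String>> pending = new ArrayList<>();
        for (String id : ids) {
            Callable<String> one = () -> {
                allStarted.countDown();
                allStarted.await(5, TimeUnit.SECONDS);
                return loadData(id);
            };
            pending.add(io.submit(one));
        }
        Set<String> threads = new HashSet<>();
        for (Future<String> result : pending) {
            threads.add(result.get(5, TimeUnit.SECONDS));
        }
        return threads;
    }

    /** Returns {distinct threads with one hop, distinct threads with per-item hops}. */
    static int[] run() {
        ExecutorService io = Executors.newFixedThreadPool(3);
        try {
            List<String> ids = Arrays.asList("id1", "id2", "id3");
            return new int[] {hopOnce(ids, io).size(), hopPerItem(ids, io).size()};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            io.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("one hop: " + r[0] + " thread(s), per-item hops: " + r[1] + " thread(s)");
    }
}
```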

3. observeOn

3.1 Implementation principle

The source code shows the basic principle by which observeOn switches threads:

// ObservableObserveOn.java (abridged)
final class ObservableObserveOn<T> extends Observable<T> {

    @Override
    protected void subscribeActual(Observer<? super T> observer) {
        if (scheduler instanceof TrampolineScheduler) {
            source.subscribe(observer);
        } else { 
            Scheduler.Worker w = scheduler.createWorker();
            // Subscribe upstream directly without switching threads; the switch happens in the Observer
            source.subscribe(
                new ObserveOnObserver<T>(observer, w, delayError, bufferSize));
        }
    }
    
    static final class ObserveOnObserver<T> implements Observer<T>, Runnable {
        
        @Override
        public void onNext(T t) {
            if (done) {
                return;
            }
            // Put the item into a queue first, which increases throughput and improves performance
            if (sourceMode != QueueDisposable.ASYNC) {
                queue.offer(t);
            }
            // schedule() switches threads and drains the queue in a loop, calling back
            // downstream, so the downstream receives the data on the specified thread
            schedule();
        }
    
        void schedule() {
            if (this.getAndIncrement() == 0) {
                // Switch threads
                this.worker.schedule(this);
            }
    
        }
    }
}
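The excerpt above omits the run() method that drains the queue. The following JDK-only sketch shows the general queue-drain pattern that a `getAndIncrement() == 0` check implies; it is an illustration under stated assumptions, not the RxJava source. The class name `DrainLoop` and the use of ConcurrentLinkedQueue are choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class DrainLoop<T> implements Runnable {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger wip = new AtomicInteger(); // "work in progress" counter
    private final Executor worker;
    private final Consumer<T> downstream;

    DrainLoop(Executor worker, Consumer<T> downstream) {
        this.worker = worker;
        this.downstream = downstream;
    }

    /** Called on the producer's thread: enqueue, then make sure a drain is scheduled. */
    void onNext(T value) {
        queue.offer(value);
        schedule();
    }

    /** Only the caller that moves the counter from 0 to 1 schedules a drain. */
    void schedule() {
        if (wip.getAndIncrement() == 0) {
            worker.execute(this);
        }
    }

    /** Runs on the worker: drain the queue, then check whether more work arrived. */
    @Override
    public void run() {
        int missed = 1;
        for (;;) {
            T value;
            while ((value = queue.poll()) != null) {
                downstream.accept(value);
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                return;
            }
        }
    }

    /** Emits 0..count-1 from one producer through a 4-thread pool and collects them. */
    static List<Integer> demo(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Integer> received = new ArrayList<>(); // plain list: drains never overlap
        CountDownLatch done = new CountDownLatch(count);
        DrainLoop<Integer> loop = new DrainLoop<>(pool, value -> {
            received.add(value);
            done.countDown();
        });
        try {
            for (int i = 0; i < count; i++) {
                loop.onNext(i);
            }
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out");
            }
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(10));
    }
}
```

Because at most one drain is in flight at any time, the downstream sees the items one by one and in order even though the executor in `demo` has four threads.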

3.2 observeOn takes effect multiple times

Unlike subscribeOn, observeOn can be used several times in a chain, and every occurrence takes effect.

  • The thread selected by subscribeOn can be observed through doOnSubscribe
  • The thread selected by observeOn can be observed through doOnNext

3.3 Is serial delivery guaranteed when multiple items are emitted in succession?

After observeOn hands work to a Scheduler, does the downstream run on a single thread or on several? Is the order of the downstream data guaranteed?

@Test
fun test() {
    Observable.create<Int> { emitter ->
        repeat(10) {
            emitter.onNext(it)
        }
        emitter.onComplete()
    }.observeOn(Schedulers.io())
        .doOnNext { log(" - $it") }
        .test().awaitTerminalEvent() // Wait until observable completes
}

The results show that even after being scheduled by the Scheduler, the downstream still runs on a single thread, which preserves the order of the data through the whole call chain.

So why does everything run on a single thread after being scheduled by the Scheduler?

4. Scheduler

4.1 Implementation principle

A Scheduler does not schedule Runnables directly. Instead it creates a Worker, and the Worker schedules the actual tasks.

Both subscribeOn's SubscribeTask and observeOn's ObserveOnObserver implement Runnable, so both are ultimately executed by a Worker.

4.2 Tasks are scheduled by Worker

One Scheduler can create multiple Workers, and one Worker can manage multiple tasks (Runnables).

Workers exist to guarantee two things:

  • Tasks scheduled on the same Worker execute serially, and tasks scheduled for immediate execution run in first-in-first-out order.
  • A Worker is bound to the Runnables scheduled through it. When the Worker is cancelled, all tasks scheduled on it are cancelled too.

4.3 How does Worker guarantee serialization?

The answer is simple: each Worker has only one thread.
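The same guarantee can be seen with a plain single-thread executor from the JDK, which serves here as a rough analogue of a Worker (an assumption made for illustration; it is not RxJava's Worker class). Tasks submitted to it run one at a time, in submission order, on the same thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SingleThreadWorker {
    /** Submits the given number of tasks and returns the order and thread they ran on. */
    static List<String> run(int tasks) {
        ExecutorService worker =
                Executors.newSingleThreadExecutor(task -> new Thread(task, "worker"));
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        try {
            for (int i = 0; i < tasks; i++) {
                int id = i;
                worker.execute(() -> log.add(id + " on " + Thread.currentThread().getName()));
            }
            // shutdown() lets the already submitted tasks finish before terminating.
            worker.shutdown();
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out");
            }
            return new ArrayList<>(log);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run(3).forEach(System.out::println);
    }
}
```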

Now we can answer the question: Why does observeOn still run in a single thread after being scheduled by Scheduler?

Scheduler assigns a unique Worker to each observeOn, so the downstream of observeOn can guarantee serial execution in a single thread.

// ObservableObserveOn.java (abridged)
final class ObservableObserveOn<T> extends Observable<T> {

    @Override
    protected void subscribeActual(Observer<? super T> observer) {
        if (scheduler instanceof TrampolineScheduler) {
            source.subscribe(observer);
        } else { 
            Scheduler.Worker w = scheduler.createWorker();
            source.subscribe(
                new ObserveOnObserver<T>(observer, w, delayError, bufferSize)); // pass in the Worker
        }
    }
  
  ...
  
}

As above, Worker is held as a member variable of ObserveOnObserver

4.4 Preset Schedulers

Just like Executors provides a variety of ThreadPoolExecutors, Schedulers provides a variety of preset Schedulers

No.  Scheduler and description
1    Schedulers.single(): a globally unique thread. No matter how many Observables there are, they all share this one thread.
2    Schedulers.io(): one of the most common schedulers, used for IO-related operations such as network requests and file operations. The IO scheduler is backed by a thread pool. It first creates a worker thread, which can be reused for other operations. When that worker thread cannot be reused (for example because it is busy with a long-running task), a new thread is created to handle the other operations.
3    Schedulers.computation(): very similar to the IO scheduler and also backed by a thread pool, but the number of available threads is fixed and matches the number of CPU cores. When all threads are busy, new tasks have to wait, so it is not suitable for IO-related operations. It suits computational work in which a single task does not occupy a thread for long.
4    Schedulers.newThread(): creates a new thread each time it is used.
5    Schedulers.trampoline(): executes on the current thread without switching threads.
6    Schedulers.from(java.util.concurrent.Executor executor): more like a custom IO scheduler. You supply your own thread pool and can therefore specify its size, which suits scenarios where there are too many Observables for the IO scheduler.

//Sample of Schedulers.from
fun namedScheduler(name: String): Scheduler {
    return Schedulers.from(
        Executors.newCachedThreadPool { Thread(it, name) }
    )
}

5. Thread Safety

5.1 Are RxJava operators thread safe?

@Test
fun test() {
    val numberOfThreads = 1000
    val publishSubject = PublishSubject.create<Int>()
    val actuallyReceived = AtomicInteger()

    publishSubject
        .take(300).subscribe {
            actuallyReceived.incrementAndGet()
        }

    val latch = CountDownLatch(numberOfThreads)
    var threads = listOf<Thread>()

    (0 until numberOfThreads).forEach {
        threads += thread(start = false) {
            publishSubject.onNext(it)
            latch.countDown()
        }
    }

    threads.forEach { it.start() }
    latch.await()

    val sum = actuallyReceived.get()
    check(sum == 300) { "$sum != 300" }
}

The result is not as expected, because take is not thread-safe.

Take a look at the source code of take:

public final class ObservableTake<T> extends AbstractObservableWithUpstream<T, T> {
    final long limit;

    public ObservableTake(ObservableSource<T> source, long limit) {
        super(source);
        this.limit = limit;
    }
    protected void subscribeActual(Observer<? super T> observer) {
        this.source.subscribe(new ObservableTake.TakeObserver(observer, this.limit));
    }

    static final class TakeObserver<T> implements Observer<T>, Disposable {
        final Observer<? super T> downstream;
        boolean done;
        Disposable upstream;
        long remaining;

        TakeObserver(Observer<? super T> actual, long limit) {
            this.downstream = actual;
            this.remaining = limit;
        }

        public void onNext(T t) {
            if (!this.done && this.remaining-- > 0L) {
                boolean stop = this.remaining == 0L;
                this.downstream.onNext(t);
                if (stop) {
                    this.onComplete();
                }
            }

        }
    }
}

As expected, remaining-- is performed without any locking.
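The race can be reproduced without RxJava. The sketch below is an illustration only: `PlainTake` copies the shape of the unsynchronised check-and-decrement, while `AtomicTake` is a hypothetical variant that uses an AtomicLong purely for contrast. It is not how RxJava is written and is not offered as a fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class TakeRace {
    /** Same shape as TakeObserver.onNext: an unsynchronised check-and-decrement. */
    static final class PlainTake {
        long remaining;
        final AtomicInteger delivered = new AtomicInteger();

        PlainTake(long limit) {
            remaining = limit;
        }

        void onNext() {
            if (remaining-- > 0L) {
                delivered.incrementAndGet();
            }
        }
    }

    /** Contrast case: the check-and-decrement is one atomic step. */
    static final class AtomicTake {
        final AtomicLong remaining;
        final AtomicInteger delivered = new AtomicInteger();

        AtomicTake(long limit) {
            remaining = new AtomicLong(limit);
        }

        void onNext() {
            if (remaining.getAndDecrement() > 0L) {
                delivered.incrementAndGet();
            }
        }
    }

    /** Calls onNext concurrently from several threads that all start together. */
    static void hammer(Runnable onNext, int threads, int callsPerThread) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    return;
                }
                for (int n = 0; n < callsPerThread; n++) {
                    onNext.run();
                }
            });
            pool[i].start();
        }
        start.countDown();
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    /** Returns {items delivered by PlainTake, items delivered by AtomicTake}. */
    static int[] run() {
        PlainTake plain = new PlainTake(300);
        AtomicTake atomic = new AtomicTake(300);
        hammer(plain::onNext, 8, 1000);
        hammer(atomic::onNext, 8, 1000);
        return new int[] {plain.delivered.get(), atomic.delivered.get()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("plain delivered " + r[0] + ", atomic delivered " + r[1]);
    }
}
```

The atomic variant always delivers exactly 300 items. The plain variant may deliver more on some runs, because concurrent decrements can be lost; whether that happens on a given run depends on timing.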

5.2 Thread safety of observeOn

What if observeOn is added? It might seem that serialization is then guaranteed, because take runs on a single thread.

@Test
fun test() {
    repeat(10000) {
        val numberOfThreads = 1000
        val publishSubject = PublishSubject.create<Int>()
        val actuallyReceived = AtomicInteger()
    
        publishSubject
            .observeOn(Schedulers.io())
            .take(300).subscribe {
                actuallyReceived.incrementAndGet()
            }
    
        val latch = CountDownLatch(numberOfThreads)
        var threads = listOf<Thread>()
    
        (0 until numberOfThreads).forEach {
            threads += thread(start = false) {
                publishSubject.onNext(it)
                latch.countDown()
            }
        }
    
        threads.forEach { it.start() }
        latch.await()
    
        check(actuallyReceived.get() == 300)
    }
}

Unfortunately, problems still appear after running it many times, because observeOn itself is not thread-safe: the queue it uses is not a thread-safe queue.

5.3 The Observable Contract

Rx has clearly told us in the definition of Observable:

Observables must issue notifications to observers serially (not in parallel). They may issue these notifications from different threads, but there must be a formal happens-before relationship between the notifications.
reactivex.io/documentati…

In conclusion, RxJava's operators are not thread-safe by default.

But operators that receive multiple Observables, such as merge(), combineLatest(), zip(), etc., are thread-safe, so even if multiple Observables come from different threads, you don’t need to consider thread safety issues.
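What the contract asks of a multi-threaded producer can be sketched with a simple lock. This is a JDK-only illustration under stated assumptions: `PlainCounter` stands in for any operator that is not thread-safe, and `Serialized` is a hypothetical wrapper that admits one caller at a time. It does not reproduce how merge() or any other RxJava operator is implemented.

```java
import java.util.concurrent.CountDownLatch;

public class SerializedCalls {
    /** A deliberately non-thread-safe downstream: a plain int counter. */
    static final class PlainCounter {
        int count;

        void onNext() {
            count++;
        }
    }

    /** Lets only one thread at a time into the downstream. */
    static final class Serialized {
        private final PlainCounter downstream;

        Serialized(PlainCounter downstream) {
            this.downstream = downstream;
        }

        synchronized void onNext() {
            downstream.onNext();
        }
    }

    /** Calls onNext from several threads through the wrapper and returns the total. */
    static int run(int threads, int callsPerThread) {
        PlainCounter counter = new PlainCounter();
        Serialized serialized = new Serialized(counter);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    return;
                }
                for (int n = 0; n < callsPerThread; n++) {
                    serialized.onNext();
                }
            });
            pool[i].start();
        }
        start.countDown();
        for (Thread t : pool) {
            try {
                t.join(); // join makes the final count visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count;
    }

    public static void main(String[] args) {
        System.out.println(run(8, 1000));
    }
}
```

In RxJava itself, a Subject that is fed from several threads can be wrapped with toSerialized() so that its notifications reach the operators serially.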

 

Origin blog.csdn.net/qq_39477770/article/details/113096415