[Android] What Exactly Is Flow in Kotlin?

Introduction

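Kotlin Flow is the part of the Kotlin coroutines library (kotlinx.coroutines) that models asynchronous streams of data. It lets us emit multiple asynchronously computed values, much as a collection holds multiple values, and process them with chained operators similar to those in RxJava.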

Basic Concepts

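A Flow represents a sequence of asynchronously produced values. The values may be emitted at different points in time, and the receiver collects the Flow from a suspend function and consumes the values one at a time. Because values are produced on demand, Flow avoids the memory problems of loading a large batch of data all at once, and it is also more flexible than LiveData.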

Example

The simplest Flow example looks like this:

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking<Unit> {
    val flow = flow {
        for (i in 1..3) {
            delay(100)
            emit(i) // emit a value
        }
    }
    flow.collect { value -> println(value) }
}

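In the code above we create a Flow that emits the numbers 1, 2, and 3, waiting 100 milliseconds before each one. Calling collect subscribes to the Flow and consumes the numbers in order, printing each to the console.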

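This example uses two kinds of operator: the builder operator flow and the terminal operator collect.

In short, the call sequence of a Flow comes down to two operators, two lambdas, and the emit function:

The collect operator triggers the call and runs the flow lambda.

Inside the flow lambda, each call to emit runs the collect lambda.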
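
The call order described above can be made visible by logging from both lambdas. The following is a minimal sketch; the function name traceCallOrder and the log messages are illustrative and not part of the original example.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Records the order in which the flow lambda and the collect lambda run.
fun traceCallOrder(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    val numbers: Flow<Int> = flow {
        log.add("flow block started")
        for (i in 1..2) {
            log.add("emit($i)")
            emit(i) // runs the collect lambda for i before returning
        }
    }

    log.add("before collect") // the flow block has not run yet
    numbers.collect { value -> log.add("collect received $value") }
    log
}

fun main() {
    traceCallOrder().forEach(::println)
}
```

The log starts with "before collect", which shows that building the Flow runs nothing; after that, each emit line is followed directly by the matching collect line.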

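Operators fall into three categories:

`Builder operators`

Builder operators create Flow objects, much like Stream.of() in Java. Common builder operators include:

flowOf: creates a Flow from a fixed set of values.

asFlow: converts a collection, array, iterator, sequence, or range into a Flow.

emptyFlow: creates an empty Flow.

channelFlow: creates a Flow whose values are sent through a channel, which allows them to be produced from several coroutines.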

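When creating a Flow with a builder operator, keep its execution context in mind. A Flow built this way is cold: constructing it runs nothing, regardless of whether it is created inside or outside a coroutine scope. The builder's code runs only when a terminal operator such as collect is called, and by default it runs in the context of the coroutine that collects it.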
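
To make the builders listed above concrete, here is a small sketch that collects one Flow from each of them, and then checks that a flow block does not run until a terminal operator is called. The function names and the sample values are made up for illustration.

```kotlin
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.channelFlow
import kotlinx.coroutines.flow.emptyFlow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Collects one Flow from each builder and returns the values it produced.
fun builderResults(): Map<String, List<Int>> = runBlocking {
    mapOf(
        "flowOf" to flowOf(1, 2, 3).toList(),
        "asFlow (list)" to listOf(4, 5, 6).asFlow().toList(),
        "asFlow (range)" to (7..9).asFlow().toList(),
        "emptyFlow" to emptyFlow<Int>().toList(),
        "channelFlow" to channelFlow<Int> {
            send(10)
            send(11)
        }.toList()
    )
}

// Returns whether the builder block had run before and after collection.
fun ranBeforeAndAfterCollect(): Pair<Boolean, Boolean> = runBlocking {
    var ran = false
    val numbers = flow<Int> {
        ran = true
        emit(1)
    }
    val before = ran   // still false: building the Flow ran nothing
    numbers.toList()   // terminal operator: the block runs now
    before to ran
}

fun main() {
    builderResults().forEach { (name, values) -> println("$name -> $values") }
    println(ranBeforeAndAfterCollect()) // (false, true)
}
```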

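`Intermediate operators`

Intermediate operators transform, filter, or otherwise process a Flow, and they return another Flow. Common intermediate operators include:

map: transforms each element of the Flow and returns a new Flow.

filter: keeps only the elements that satisfy a condition and returns a new Flow.

transform: applies a custom transformation that may emit any number of values for each element.

take: takes only the first few elements of the Flow.

zip: combines two Flows into one, pairing each element of one Flow with the corresponding element of the other.

Intermediate operators can be chained into a pipeline. They only describe how the data is transformed and processed; nothing is actually produced or consumed until a terminal operator is called.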
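
The operators that get no dedicated example later (transform, take, and zip) can be tried in a single chain. This is a sketch under assumed inputs; the function name pipeline and the sample values are illustrative only.

```kotlin
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.filter
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.flow.transform
import kotlinx.coroutines.flow.zip
import kotlinx.coroutines.runBlocking

fun pipeline(): List<String> = runBlocking {
    val labels = flowOf("a", "b", "c")

    (1..10).asFlow()
        .filter { it % 2 == 0 }        // 2, 4, 6, 8, 10
        .transform<Int, Int> { n ->    // emit two values per input
            emit(n)
            emit(n * 10)
        }                              // 2, 20, 4, 40, ...
        .take(3)                       // 2, 20, 4
        .zip(labels) { n, label -> "$label=$n" }
        .toList()                      // terminal operator starts the chain
}

fun main() {
    println(pipeline()) // [a=2, b=20, c=4]
}
```

Nothing in the chain runs until toList is called; take(3) then stops the upstream after the third value instead of letting it run through all ten inputs.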

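`Terminal operators`

Terminal operators trigger the execution of a Flow and return a result. Common terminal operators include:

collect: iterates over the Flow and performs the given action on each element.

reduce: aggregates the elements of the Flow into a single value.

singleOrNull: returns the element if the Flow emits exactly one, and null otherwise.

toList/toSet: collects the elements of the Flow into a List or Set.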

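Note that terminal operators are suspending functions: they suspend the calling coroutine (without blocking its thread) until every element of the Flow has been consumed. If a coroutine calls several terminal operators on the same cold Flow, each call runs the whole Flow again from the start.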
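
The sketch below applies several terminal operators to the same cold Flow and counts how many times the flow block runs. The TerminalResults class and the function name are hypothetical helpers for this illustration, and the null result for a multi-element Flow assumes a reasonably recent kotlinx.coroutines version (1.4 or later).

```kotlin
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.reduce
import kotlinx.coroutines.flow.singleOrNull
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

data class TerminalResults(
    val sum: Int,
    val all: List<Int>,
    val single: Int?,
    val notSingle: Int?,
    val runs: Int
)

fun terminalResults(): TerminalResults = runBlocking {
    var runs = 0
    val numbers = flow<Int> {
        runs++                      // counts how often the Flow is executed
        for (i in 1..4) emit(i)
    }

    val sum = numbers.reduce { acc, value -> acc + value } // run 1
    val all = numbers.toList()                              // run 2
    val notSingle = numbers.singleOrNull()                  // run 3: more than one element
    val single = flowOf(42).singleOrNull()                  // exactly one element

    TerminalResults(sum, all, single, notSingle, runs)
}

fun main() {
    println(terminalResults())
}
```

Each of the three terminal operators applied to numbers starts the flow block again, so runs ends up at 3.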

`Operator usage examples:`

1. Flow transformation operator `map`

map transforms each value in a Flow and returns a new Flow.

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking<Unit> {
    val lengthFlow = flowOf(1.0, 2.0, 3.14, 10.0) // lengths in meters
        .map { it * 1000 } // convert to mm
        .map { it.toString() + "mm" } // format as a string with the unit
    lengthFlow.collect { println(it) }
}

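We first create a Flow, lengthFlow, holding lengths in meters. The first map converts each value to millimeters (mm), and the second map turns it into a string with the unit appended. Finally, collect iterates over the Flow and prints each result.

As this example shows, map turns each element of a Flow into another form and passes the resulting Flow downstream.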

2. Flow filtering operator `filter`

filter keeps only the values of a Flow that satisfy the given condition and returns a new Flow.

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking<Unit> {
    val ageFlow = flowOf(15, 27, 18, 20, 12, 25) // a set of people's ages
        .filter { it >= 18 } // keep ages of 18 or above
        .map { "Adult, $it years old" } // format as a string
    ageFlow.collect { println(it) }
}

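We first create a Flow, ageFlow, holding a set of ages. filter keeps every age of 18 or above, and map then turns each remaining age into a string describing an adult. Finally, collect iterates over the Flow and prints each result.

As this example shows, filter selects the elements of a Flow that satisfy a condition and passes the resulting Flow downstream.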

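3. Switching threads in Flow with `flowOn`

Switching threads in Flow is simpler than in RxJava: flowOn is all you need.
In the example below, both the flow builder and the map operator are affected by flowOn, because flowOn changes the context of everything upstream of it.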

flow {
    for (i in 1..5) {
        delay(100)
        emit(i)
    }
}.map { it * it }
    .flowOn(Dispatchers.IO)
    .collect { println(it) }

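The thread on which collect() runs depends on the CoroutineScope (more precisely, the coroutine context) in which the Flow is collected.

For example, in the code below collect() runs on the main thread: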

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking {
    flow {
        for (i in 1..5) {
            delay(100)
            emit(i)
        }
    }.map { it * it }
        .flowOn(Dispatchers.IO)
        .collect {
            println("${Thread.currentThread().name}: $it")
        }
}

Output:

main: 1
main: 4
main: 9
main: 16
main: 25
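
The output above only shows the collecting thread. To see that the upstream really runs elsewhere, the following sketch records the thread name on both sides of flowOn. The function name threadsSeen is illustrative, and the exact worker thread name depends on the runtime, so no output is shown.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.flowOn
import kotlinx.coroutines.runBlocking

// Returns (thread that ran the flow block, thread that ran the collect block).
fun threadsSeen(): Pair<String, String> = runBlocking {
    var upstreamThread = ""
    var downstreamThread = ""

    flow<Int> {
        upstreamThread = Thread.currentThread().name // affected by flowOn
        emit(1)
    }
        .flowOn(Dispatchers.IO)
        .collect {
            downstreamThread = Thread.currentThread().name // the runBlocking thread
        }

    upstreamThread to downstreamThread
}

fun main() {
    val (upstream, downstream) = threadsSeen()
    println("flow block ran on: $upstream")
    println("collect ran on:    $downstream")
}
```

When run from main, the first line names a worker thread of the IO dispatcher and the second names the thread that called runBlocking.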

4. Handling backpressure in Flow with `buffer`

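Suppose we have a producer that sends values to a consumer, and both sides are slow. We can simulate this by adding a delay to each of them:

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun producer(): Flow<Int> = flow {
    repeat(10) { // generate 10 numbers
        // Simulate a slow production process
        delay(1000)

        // Send the value to the consumer
        emit(it)
        println("Producer sent: $it")
    }
}

suspend fun consumer() {
    producer()
        .onEach {
            // Simulate a slow consumption process
            delay(2000)

            // Process the value
            println("Consumer received: $it")
        }
        .collect()
}

fun main() = runBlocking<Unit> {
    consumer()
}

In the code above, the flow builder defines a producer function that emits the integers 0 through 9, waiting one second before each. To make backpressure easy to observe, the consumer is slower still and takes 2 seconds per value. The example uses delay rather than Thread.sleep, because blocking the thread inside a Flow would keep the producer and the consumer from ever overlapping on the single thread used by runBlocking.

The code above specifies no buffer, and a Flow does not add one by default: emit suspends until the consumer has finished handling the value, so the producer and the consumer run strictly in turn. Nothing crashes and no data is lost, but the program is slow. Each value costs about 1 + 2 = 3 seconds, or roughly 30 seconds for all ten.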

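We can add a buffer operator to the consumer so that the producer and the consumer run concurrently, with an explicitly sized buffer between them:

suspend fun consumer() {
    producer()
        .buffer(10) // set the buffer capacity to 10
        .onEach {
            // Simulate a slow consumption process
            delay(2000)

            // Process the value
            println("Consumer received: $it")
        }
        .collect()
}

In this example the buffer capacity is 10. The producer now runs in its own coroutine and can emit up to 10 values ahead of the consumer without waiting; only when the buffer is full does emit suspend until the consumer frees a slot. Because the buffer is bounded, unprocessed data cannot pile up without limit, and the total running time drops because production and consumption overlap.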
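
The effect of buffer can be measured directly. The sketch below uses much shorter delays than the example above (100 ms to produce, 200 ms to consume, three values) so that it finishes quickly; slowNumbers and timeCollection are hypothetical names, and the timings in the comments are rough expectations rather than measured results.

```kotlin
import kotlin.system.measureTimeMillis
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

fun slowNumbers(): Flow<Int> = flow {
    for (i in 1..3) {
        delay(100) // simulate slow production
        emit(i)
    }
}

// Returns the time in milliseconds needed to collect all values.
fun timeCollection(buffered: Boolean): Long = runBlocking {
    val source = if (buffered) slowNumbers().buffer() else slowNumbers()
    measureTimeMillis {
        source.collect { delay(200) } // simulate slow consumption
    }
}

fun main() {
    // Sequential: roughly 3 * (100 + 200) ms
    println("without buffer: ${timeCollection(buffered = false)} ms")
    // Overlapping: roughly 100 + 3 * 200 ms
    println("with buffer:    ${timeCollection(buffered = true)} ms")
}
```

Calling buffer() without an argument uses the library's default capacity; passing a number, as in buffer(10) above, bounds the buffer explicitly.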

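5. Dropping stale upstream values with `conflate`

The conflate operator lets the producer keep emitting while the consumer is busy, but it keeps only the most recent value. When the downstream finishes processing a value, it receives the newest value emitted in the meantime, and the intermediate values are dropped. This mechanism is also known as a conflation backpressure strategy.

For example, we can simulate a producer that generates timestamped data: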

fun producer(): Flow<Pair<String, Long>> = flow {
    var counter = 0L
    while (true) {
        // Simulate a slow production process
        delay(1000)

        // Generate a new data point
        val data = Pair("Data point $counter", System.currentTimeMillis())

        // Emit the data point to the consumer
        emit(data)
        counter++
    }
}

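This producer generates a new data point every second, wraps it with a timestamp in a Pair, and sends it to the consumer by calling emit.

Next, we add the conflate operator to the Flow so that stale values are dropped in favor of the newest one: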

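suspend fun consumer() {
    producer()
        .conflate() // keep only the latest value while the consumer is busy
        .onEach { data ->
            // Simulate a slow processing time
            delay(2000)

            // Process the data
            println("Processing data: ${data.first}, timestamp: ${data.second}")
        }
        .collect()
}

In this consumer, conflate is placed before the slow processing step, because it only affects the boundary between the code upstream of it and the code downstream of it. The onEach block simulates slow processing, and while it is busy conflate drops every pending value except the newest. Finally, collect starts the Flow.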

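When the producer and the consumer run at different speeds, the upstream may produce far more data than the downstream can handle in time. In that case conflate ensures that the downstream always gets the latest data and keeps unprocessed data from piling up.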
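
Because the producer above never terminates, the dropped values are hard to observe. The sketch below uses a finite Flow of ten values with assumed delays (50 ms to produce, 220 ms to consume) and returns the values that actually reach the collector. The function name is illustrative, and which intermediate values survive depends on timing, so no exact output is given.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.conflate
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Returns the values that reach a slow collector behind conflate().
fun conflatedValues(): List<Int> = runBlocking {
    val received = mutableListOf<Int>()

    flow<Int> {
        for (i in 1..10) {
            delay(50) // fast producer
            emit(i)
        }
    }
        .conflate() // placed before the slow consumer
        .collect { value ->
            delay(220) // slow consumer
            received.add(value)
        }

    received
}

fun main() {
    println(conflatedValues())
}
```

The first value and the last value always arrive; the values emitted while the collector was busy are skipped, so the list is shorter than ten elements.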

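6. Transforming only the latest value with `flatMapLatest` or `transformLatest`

These operators create a new flow for each upstream element and keep only the newest of those flows for the downstream to consume.

For example, suppose we have a function updateData that returns a Flow emitting a new data object at irregular intervals. We can call flatMapLatest on it so that only the newest data is processed: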

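import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*
import kotlin.random.Random

// Assumed shape: the original example does not define Data
data class Data(val id: Int, val description: String)

fun updateData(): Flow<Data> = flow {
    while (true) {
        delay(Random.nextLong(1000, 5000))

        // Generate a new data point
        val newData = Data(Random.nextInt(), "Updated at ${System.currentTimeMillis()}")

        emit(newData)
    }
}

suspend fun processData() {
    updateData()
        .flatMapLatest { data ->
            // Process the data
            flow {
                delay(2000)
                println("Processing data: $data")
                emit(data)
            }
        }
        .collect()
}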

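In this example we first define updateData, which simulates data being updated and returns a Flow of Data objects. We then use flatMapLatest to turn each updated Data object into a new Flow. When a new Data object arrives, flatMapLatest cancels the Flow created for the previous one and switches to the new Flow, so an update that is superseded within the 2-second processing delay never reaches the downstream. Finally, collect starts the Flow.

When we run the program, processData keeps printing the data that was not superseded while it was being processed. flatMapLatest thus ensures that the downstream works with the latest data and that stale results do not get mixed in.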

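7. Collecting only the latest value with `collectLatest`

This function is similar to collect, but when a new value arrives while the previous one is still being processed, it cancels that processing and restarts the block with the new value.

For example, suppose we have a function updateData that returns a Flow emitting a new data object at irregular intervals. We can call collectLatest downstream to handle only the latest data: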

fun updateData(): Flow<Data> = flow {
    while (true) {
        delay(Random.nextLong(1000, 5000))

        // Generate a new data point
        val newData = Data(Random.nextInt(), "Updated at ${System.currentTimeMillis()}")

        emit(newData)
    }
}

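suspend fun processData() {
    updateData()
        .collectLatest { data ->
            // Process the data; cancelled if a newer value arrives first
            println("Processing data: $data")
            delay(2000)

            // Reached only if no newer value arrived during the delay
            println("Latest data: $data")
        }
}

In this example, updateData again simulates data being updated and returns a Flow of Data objects. The slow processing is placed inside the collectLatest block, because collectLatest can only cancel work that runs in its own block. Each new Data object prints a "Processing data" line; the "Latest data" line is printed only if no newer object arrives during the 2-second delay.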

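When we run the program, processData keeps printing the latest data. collectLatest thus ensures that only the most recent data is processed to completion and that stale data does not get mixed in.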
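
With random intervals the cancellation is hard to see, so here is a deterministic sketch with fixed, assumed delays: three values arrive 50 ms apart while each takes 200 ms to process. The function name latestOnly and the log messages are illustrative.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.runBlocking

// Logs when processing of each value starts and when it finishes.
fun latestOnly(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    flow<String> {
        emit("A")
        delay(50)
        emit("B")
        delay(50)
        emit("C")
    }.collectLatest { value ->
        log.add("start $value")
        delay(200) // cancelled here if a newer value arrives first
        log.add("done $value")
    }

    log
}

fun main() {
    latestOnly().forEach(::println)
}
```

Processing starts for all three values, but only the last one finishes: the blocks for A and B are cancelled during their delay when the next value arrives.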

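Operators for multiple Flows

Often we work with more than one Flow, and a business scenario requires combining several of them.

1. Flattening flows with `flatMapConcat`

The example below uses flatMapConcat to call one API and then use its result as the parameter for a second API call: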

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

data class User(val id: Int, val name: String)

fun getUserById(id: Int): Flow<User> = flow {
    // Simulate a network request that fetches the user
    delay(1000)
    emit(User(id, "User-$id"))
}

fun getOrdersByUserId(userId: Int): Flow<String> = flow {
    // Simulate a network request that fetches the orders
    delay(1000)
    emit("Orders for user $userId")
}

suspend fun main() {
    getUserById(1)
        .flatMapConcat { user ->
            getOrdersByUserId(user.id)
        }
        .collect { orders ->
            println(orders)
        }
}

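In this example we first define two functions, getUserById and getOrdersByUserId. getUserById takes a user ID and returns a Flow that emits a User object. getOrdersByUserId takes a user ID and returns a Flow of order data.

In main, we first call getUserById to fetch the user. We then use flatMapConcat to turn that user into another Flow representing the user's orders. The lambda passed to flatMapConcat receives the User emitted by the upstream Flow and uses its ID to call the second function, getOrdersByUserId.

flatMapConcat collects the inner Flows one at a time: for each element from the upstream Flow it builds the inner Flow and collects it to completion before moving on to the next upstream element. This guarantees that the user is fetched before the orders are requested.

Each element emitted by the second Flow is passed downstream for processing.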

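2. Flattening flows with `flatMapMerge`

Suppose we have two APIs:

/api/users/{id}/posts, which returns all posts published by a given user
/api/posts/{id}/comments, which returns all comments on a given post

Now we need to fetch the comments of all of a user's posts concurrently. We can do this with the flatMapMerge operator: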

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

data class Post(val id: Int, val title: String, val userId: Int)

data class Comment(val id: Int, val postId: Int, val content: String)

fun getPostsByUserId(userId: Int): Flow<List<Post>> = flow {
    // Simulate a network request that fetches all posts by the user
    delay(1000)
    emit(listOf(Post(1, "Post 1", userId), Post(2, "Post 2", userId)))
}

fun getCommentsByPostId(postId: Int): Flow<List<Comment>> = flow {
    // Simulate a network request that fetches all comments on the post
    delay(1000)
    emit(listOf(Comment(1, postId, "Comment 1"), Comment(2, postId, "Comment 2")))
}

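suspend fun main() {
    getPostsByUserId(1)
        .flatMapConcat { posts -> posts.asFlow() } // Flow<List<Post>> -> Flow<Post>
        .flatMapMerge { post ->
            // Inner Flows are collected concurrently
            getCommentsByPostId(post.id)
        }
        .collect { comments ->
            println(comments)
        }
}

In this example we define two functions, getPostsByUserId and getCommentsByPostId, which fetch all posts published by a user and all comments on a post respectively. Each returns a Flow carrying a list of objects.

In main, we first call getPostsByUserId to fetch the user's posts. Because that Flow emits the whole list as a single element, we first turn the list into a Flow of individual posts with asFlow. We then use flatMapMerge with a lambda that receives each post and calls getCommentsByPostId to fetch its comments.

Because flatMapMerge collects several inner Flows at the same time (up to 16 by default), the request for one post's comments does not have to wait for the request for another post's comments to finish.

The trade-off is that results arrive in completion order rather than in the order of the posts. If order matters, replace flatMapMerge with flatMapConcat, which collects the inner Flows one after another. Finally, collect prints the comments.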
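
The difference between sequential and concurrent flattening shows up in the order of the results. The sketch below runs the same three simulated requests through flatMapConcat and flatMapMerge; request, concatOrder, and mergeOrder are hypothetical names. Both operators are marked as experimental or preview API in kotlinx.coroutines, so the compiler may print an opt-in warning.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.flatMapConcat
import kotlinx.coroutines.flow.flatMapMerge
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// A simulated request that reports when it starts and when it ends.
fun request(id: Int): Flow<String> = flow {
    emit("start $id")
    delay(100)
    emit("end $id")
}

// Inner Flows run one after another.
fun concatOrder(): List<String> = runBlocking {
    (1..3).asFlow().flatMapConcat { request(it) }.toList()
}

// Inner Flows run concurrently, so their events interleave.
fun mergeOrder(): List<String> = runBlocking {
    (1..3).asFlow().flatMapMerge { request(it) }.toList()
}

fun main() {
    println("concat: ${concatOrder()}")
    println("merge:  ${mergeOrder()}")
}
```

With flatMapConcat each request ends before the next one starts. With flatMapMerge all three requests are in flight together, so the same six events appear in an interleaved order that is not guaranteed.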

3. Flattening flows with `flatMapLatest`

Suppose we need to fetch data from two APIs:

/api/posts/latest, which returns the list of latest posts
/api/posts/search?key={keyword}, which searches posts by keyword

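When the user types a keyword and taps the Search button, we want to cancel the in-flight request for the latest posts immediately, to save resources and respond faster. The flatMapLatest operator does exactly this: whenever the upstream emits a new value, it cancels the Flow created for the previous one.

In the example below we first define two functions, getLatestPosts and searchPosts, which fetch the list of latest posts and search posts by keyword respectively. Each returns a Flow carrying a list of objects, with a delay that simulates network latency.

Then, in main, we apply flatMapLatest to a Flow of keywords. The lambda receives the keyword emitted upstream: for an empty keyword it requests the latest posts, and otherwise it calls searchPosts to fetch the matching posts. If the previous request is still running when a new keyword arrives, flatMapLatest cancels it.

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

data class Post(val id: Int, val title: String)

fun getLatestPosts(): Flow<List<Post>> = flow {
    delay(1000)
    emit(listOf(Post(1, "Latest Post 1"), Post(2, "Latest Post 2")))
}

fun searchPosts(keyword: String): Flow<List<Post>> = flow {
    delay(1000)
    val matchedPosts = listOf(
        Post(3, "Matched Post 1"),
        Post(4, "Matched Post 2"),
        Post(5, "Matched Post 3")
    )
    emit(matchedPosts)
}

suspend fun main() = coroutineScope {
    val keywordFlow = MutableStateFlow("")

    val job = launch {
        keywordFlow
            .debounce(300) // debounce to avoid overly frequent requests
            .flatMapLatest { keyword ->
                // A new keyword cancels whichever request is still in flight
                if (keyword.isEmpty()) getLatestPosts() else searchPosts(keyword)
            }
            .collect { posts ->
                println(posts)
            }
    }

    // Simulate the user entering a keyword while the latest-posts request is running
    delay(500)
    keywordFlow.value = "Kotlin"

    delay(2000) // give the search time to finish
    job.cancel() // a StateFlow never completes, so stop collecting explicitly
}

In this example a MutableStateFlow simulates user input: every new keyword triggers a new request, and debounce keeps requests from firing too often. Because a StateFlow never completes, the collection runs in a separate coroutine that is cancelled at the end of the demonstration.

When the keyword changes, flatMapLatest cancels the getLatestPosts request if it is still in progress and starts the search request. We then wait for searchPosts to return and print the search results.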

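4. Combining flows with `combine`

Suppose we need to fetch data from two APIs:

/api/users/{id}/profile, which returns the basic information of a given user
/api/users/{id}/posts, which returns all posts published by a given user

When we need both pieces of data for a user, we can use combine to join the two Flows into a new Flow that carries both.

In the example below, the two pieces of data are merged into a UserProfile class.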

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

data class UserProfile(val userInfo: UserInfo, val posts: List<Post>)

data class UserInfo(val id: Int, val name: String, val age: Int)

data class Post(val id: Int, val title: String)

fun getUserInfo(userId: Int): Flow<UserInfo> = flow {
    // Simulate a network request that fetches the user's basic information
    delay(1000)
    emit(UserInfo(userId, "user_$userId", 20))
}

fun getPostsByUserId(userId: Int): Flow<List<Post>> = flow {
    // Simulate a network request that fetches all posts by the user
    delay(1000)
    emit(listOf(Post(1, "Post 1"), Post(2, "Post 2")))
}

suspend fun main() {
    getUserInfo(1)
        .combine(getPostsByUserId(1)) { userInfo, posts ->
            UserProfile(userInfo, posts)
        }
        .collect { userProfile ->
            println(userProfile.userInfo)
            println(userProfile.posts)
        }
}

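In this example we first define two functions, getUserInfo and getPostsByUserId, which fetch a user's basic information and the user's posts respectively. getUserInfo returns a Flow of a single UserInfo object, and getPostsByUserId returns a Flow of a list of posts.

In main, we first call getUserInfo to fetch the user's basic information. We then use combine to join that Flow with the Flow from getPostsByUserId, passing a lambda that receives the latest user information and the latest post list and merges them into a UserProfile object.

Finally, collect prints the UserProfile object.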

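5. Combining flows with `zip`

Suppose we need to fetch data from two APIs:

/api/users/{id}/profile, which returns the basic information of a given user
/api/users/{id}/posts, which returns all posts published by a given user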

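When we need both pieces of data for a user, we can also use zip to join the two Flows into a new Flow that carries both. What differs is how the values are paired: zip pairs the first value of one Flow with the first value of the other, the second with the second, and so on, and it waits until both Flows have produced a value before emitting. It completes as soon as either Flow completes and cancels the other, whereas combine emits again whenever either Flow produces a new value.

In the example below we again merge the two pieces of data into a UserProfile class, this time with zip. Since each Flow emits exactly one value, the result is a single UserProfile.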

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

data class UserProfile(val userInfo: UserInfo, val posts: List<Post>)

data class UserInfo(val id: Int, val name: String, val age: Int)

data class Post(val id: Int, val title: String)

fun getUserInfo(userId: Int): Flow<UserInfo> = flow {
    // Simulate a network request that fetches the user's basic information
    delay(1000)
    emit(UserInfo(userId, "user_$userId", 20))
}

fun getPostsByUserId(userId: Int): Flow<List<Post>> = flow {
    // Simulate a network request that fetches all posts by the user
    delay(1000)
    emit(listOf(Post(1, "Post 1"), Post(2, "Post 2")))
}

suspend fun main() {
    getUserInfo(1)
        .zip(getPostsByUserId(1)) { userInfo, posts ->
            UserProfile(userInfo, posts)
        }
        .catch { e ->
            println("Error: ${e.message}")
        }
        .collect { userProfile ->
            println(userProfile.userInfo)
            println(userProfile.posts)
        }
}

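In this example we again use getUserInfo and getPostsByUserId to fetch a user's basic information and posts; the former returns a Flow of a UserInfo object and the latter a Flow of a list of posts.

What differs is that main uses zip to join the two Flows, passing a lambda that receives the user information and the post list and merges them into a UserProfile object. Both requests must return before the lambda is called: zip pairs their values one to one and does not cancel a request just because the other has returned data.

We also use catch to handle exceptions that may be thrown upstream, and collect to print the UserProfile object.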
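
Since both examples above use Flows that emit a single value, zip and combine produce the same result there. The difference appears once the Flows emit several values at different rates, as in the following sketch. The input values and delays are assumptions chosen for illustration, and the exact output of combine depends on timing, so only the zip result is stated.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.combine
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.onEach
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.flow.zip
import kotlinx.coroutines.runBlocking

// Returns (values produced by zip, values produced by combine).
fun zipVsCombine(): Pair<List<String>, List<String>> = runBlocking {
    val numbers = flowOf(1, 2, 3).onEach { delay(100) }
    val letters = flowOf("a", "b").onEach { delay(130) }

    // zip pairs elements by position and stops when the shorter Flow ends
    val zipped = numbers.zip(letters) { n, l -> "$n$l" }.toList()

    // combine emits on every new value, using the latest value of each Flow
    val combined = numbers.combine(letters) { n, l -> "$n$l" }.toList()

    zipped to combined
}

fun main() {
    val (zipped, combined) = zipVsCombine()
    println("zip:     $zipped") // [1a, 2b]
    println("combine: $combined")
}
```

zip never emits the number 3, because letters has no third element to pair it with. combine produces more elements, since each new value on either side is matched with the latest value from the other, and it ends with the pair of the two final values.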