How to achieve the Depth Effect lock screen effect introduced in iOS 16

Author: phenol

0x01 Preface

iOS 16 brings an impressive lock screen effect: Depth Effect. It takes an ordinary picture as the wallpaper and, in the right places, lets the subject of the photo partially cover the lock screen widgets, creating a depth-of-field effect (as shown in the figure below).

So can we achieve a similar effect in our own app? At first I assumed iOS 16 had added a new UIKit control that, like UIVisualEffectView, could be dropped in with a few lines of API, but it turns out there is no such thing. If the picture were supplied as several pre-separated layers, the implementation would simply be to sandwich the clock control in the middle like a sandwich biscuit. In practice, however, the effect also works when a single picture downloaded from the Internet is set as the lock screen background. This is reminiscent of the iOS 16 Photos app, which lets you long-press a photo to lift out and drag its subject, so the system must be using some image segmentation algorithm to separate the foreground from the background and obtain a layered image.

0x02 Image Segmentation

A classic image segmentation algorithm is the watershed algorithm (Watershed). It segments images accurately and handles edges very well, but it requires manually marking the rough positions of the foreground and background (a single stroke on each is enough, and the algorithm then separates them automatically), which does not fit the fully automatic requirement of this article. In recent years machine learning has produced many results in this area, one of which is fully automatic image segmentation. Sure enough, a quick search showed that Apple already provides a pre-trained model.

Visit Apple's machine learning site developer.apple.com/machine-lea... and download the pre-trained model DeepLabV3. Drag the model file into the Xcode project and select it to view some information about it:

image

Here we mainly care about the input and output of the model. Click the Predictions tab: the model takes a 513x513 image as input and outputs a two-dimensional 513x513 array of Int32, where each value is the classification of the corresponding pixel. The values are Int32 rather than a simple Bool because the model can split an image into many different classes, not just foreground and background. In practice, a value of 0 can be treated as the background and any non-zero value as the foreground.

image

The following is the result obtained after running segmentation on a sample image:

image

The output contains two values, 0 and 15, which correspond to the background and the foreground respectively.
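To double-check which class labels the model actually produced for an image, one option is a small helper like the sketch below (distinctClasses is a hypothetical name, not something from the article's demo); it reads the MLMultiArray through its raw data pointer and collects the distinct Int32 values. In the PASCAL VOC label set that DeepLabV3 is trained on, class 15 corresponds to "person".

    import CoreML

    // Sketch: collect the distinct class indices present in the segmentation output
    func distinctClasses(in arrayValue: MLMultiArray) -> Set<Int32> {
        let count = arrayValue.count
        let pointer = arrayValue.dataPointer.bindMemory(to: Int32.self, capacity: count)
        return Set(UnsafeBufferPointer(start: pointer, count: count))
    }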

0x03 Practice

With the model in hand, the implementation plan is basically settled; the next step is to put it into practice.

After the model is dragged into the Xcode project, Xcode automatically generates a class for us: DeepLabV3. We can create an instance of it directly, without importing anything extra:

    // Lazily create an instance of the DeepLabV3 class that Xcode generated from the .mlmodel file
    lazy var model = try! DeepLabV3(configuration: {
        let config = MLModelConfiguration()
        // Allow lower-precision accumulation on the GPU for better performance
        config.allowLowPrecisionAccumulationOnGPU = true
        // Prefer the CPU and the Neural Engine for inference
        config.computeUnits = .cpuAndNeuralEngine
        return config
    }())

Then use this instance to create a VNCoreMLRequest that analyzes the image through the machine learning engine, and receive the result in its callback:

    lazy var request = VNCoreMLRequest(model: try! VNCoreMLModel(for: model.model)) { [unowned self] request, error in
        if let results = request.results as? [VNCoreMLFeatureValueObservation] {
            // The final segmentation result is in arrayValue
            if let feature = results.first?.featureValue, let arrayValue = feature.multiArrayValue {
                let width = arrayValue.shape[0].intValue
                let height = arrayValue.shape[1].intValue
                let stride = arrayValue.strides[0].intValue
                // ...
            }
        }
    }
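The "// ..." above is where the segmentation result gets turned into a mask. A minimal sketch of one way to do that (my own helper, not the article's exact code): walk the Int32 values and build an 8-bit grayscale image in which background pixels (value 0) become black and everything else becomes white.

    import CoreML
    import CoreGraphics

    // Sketch: convert the 513x513 Int32 segmentation output into a black-and-white mask
    func makeMaskImage(from arrayValue: MLMultiArray) -> CGImage? {
        let width = arrayValue.shape[0].intValue
        let height = arrayValue.shape[1].intValue
        let stride = arrayValue.strides[0].intValue
        let pointer = arrayValue.dataPointer.bindMemory(to: Int32.self, capacity: height * stride)

        // One byte per pixel: 255 (white) for foreground, 0 (black) for background
        var pixels = [UInt8](repeating: 0, count: width * height)
        for y in 0..<height {
            for x in 0..<width {
                pixels[y * width + x] = pointer[y * stride + x] == 0 ? 0 : 255
            }
        }

        guard let provider = CGDataProvider(data: Data(pixels) as CFData) else { return nil }
        return CGImage(width: width, height: height,
                       bitsPerComponent: 8, bitsPerPixel: 8, bytesPerRow: width,
                       space: CGColorSpaceCreateDeviceGray(),
                       bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue),
                       provider: provider, decode: nil,
                       shouldInterpolate: false, intent: .defaultIntent)
    }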

Finally, create a VNImageRequestHandler and perform the request:

    private func segment() {
        if let image = self.imageView.image {
            imageSize = image.size
            // Run on a background queue: perform(_:) waits synchronously for the result
            DispatchQueue.global().async { [unowned self] in
                self.request.imageCropAndScaleOption = .scaleFill
                // resize(to:) scales the image to the 513x513 input the model expects
                let handler = VNImageRequestHandler(cgImage: image.resize(to: .init(width: 513, height: 513)).cgImage!)
                try? handler.perform([self.request])
            }
        }
    }
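Note that resize(to:) above is not a UIKit API; the demo assumes a small extension on UIImage. A minimal sketch of such a helper:

    import UIKit

    extension UIImage {
        // Sketch: scale the image to an exact pixel size (the model wants 513x513)
        func resize(to size: CGSize) -> UIImage {
            let format = UIGraphicsImageRendererFormat()
            format.scale = 1   // render at exactly `size` in pixels, not at screen scale
            return UIGraphicsImageRenderer(size: size, format: format).image { _ in
                self.draw(in: CGRect(origin: .zero, size: size))
            }
        }
    }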

Notice:

  1. The request callback runs on the same thread as the code that performs the request with the handler, which waits synchronously for the result, so it is best to dispatch this work to a background thread
  2. The request needs imageCropAndScaleOption set to .scaleFill, otherwise Vision crops the center of the image by default and you will get unexpected results

Feed in the following sample image,

and process the returned arrayValue into a black-and-white image:

image

The result turns out to be quite accurate. Of course, to use it as a mask in code, it should really be treated as an image with a fully transparent background and an opaque foreground:

image

Finally, we put the original image on the bottom layer, other controls in the middle, and the original image + mask view on the top layer, forming the final effect:

image

The actual principle behind it is the sandwich biscuit layering described above.
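A minimal sketch of that sandwich, assuming the photo plus a mask image (transparent background, opaque foreground) produced from the segmentation result; the names here are my own, not the article's demo code:

    import UIKit

    func buildDepthEffect(in container: UIView, photo: UIImage, foregroundMask: UIImage) {
        // Bottom layer: the full original photo
        let background = UIImageView(image: photo)
        background.frame = container.bounds
        container.addSubview(background)

        // Middle layer: the widget that should be partly covered, e.g. a clock label
        let clockLabel = UILabel(frame: container.bounds)
        clockLabel.text = "09:41"
        clockLabel.font = .systemFont(ofSize: 96, weight: .semibold)
        clockLabel.textAlignment = .center
        container.addSubview(clockLabel)

        // Top layer: the same photo again, masked so only the foreground subject shows
        let foreground = UIImageView(image: photo)
        foreground.frame = container.bounds
        let mask = UIImageView(image: foregroundMask)
        mask.frame = foreground.bounds
        foreground.mask = mask   // UIView.mask uses the alpha channel of the mask view
        container.addSubview(foreground)
    }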

A few more renderings:

0x04 Postscript

Of course, this model is not a silver bullet, and it has limitations in practice. Photos with people in them segment well, but large-scene landscape photos may not segment at all. A demo for this article can be found on GitHub.


This article was published by the NetEase Cloud Music technology team. Reprinting in any form without authorization is prohibited. We recruit for technical positions all year round; if you are planning to change jobs and happen to like Cloud Music, join us at grp.music-fe(at)corp.netease.com!

Origin juejin.im/post/7197608023430283319