Android profiler: Slow rendering/stuttering

Android official documentation: Slow rendering

UI rendering is the act of generating a frame from your app and displaying it on the screen. To ensure that users can interact with your app smoothly, your app should render each frame in under 16 ms to reach 60 frames per second (why 60fps?). If your app renders its UI slowly, the system is forced to skip frames, and the user perceives stuttering in your app. We call this jank; the rest of this page also refers to it as lag or stuttering.
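
The 16 ms figure follows directly from the refresh rate: one second divided by 60 frames. As a minimal sketch (the class and method names here are illustrative and not part of any Android API), the per-frame budget and the number of refreshes a slow frame misses can be computed like this:

```java
/** Illustrative helper; not an Android API. */
class FrameBudget {
    /** Time available to produce one frame, in milliseconds. */
    static double budgetMillis(double framesPerSecond) {
        return 1000.0 / framesPerSecond;
    }

    /** Number of display refreshes skipped by a frame that took frameMillis to render. */
    static int missedFrames(double frameMillis, double framesPerSecond) {
        double budget = budgetMillis(framesPerSecond);
        // A frame within budget skips nothing; each further budget-length overrun skips one refresh.
        return (int) Math.max(0, Math.ceil(frameMillis / budget) - 1);
    }

    public static void main(String[] args) {
        System.out.println("Budget at 60 fps (ms): " + budgetMillis(60));
        System.out.println("Refreshes missed by a 40 ms frame: " + missedFrames(40, 60));
    }
}
```

At 60 fps the budget is about 16.67 ms, so a frame that takes 40 ms misses two refreshes before it can be shown.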

To help you improve app quality, Android automatically monitors your app for lag and displays the information in the Android vitals dashboard. To learn how the data is collected, see the Play Console documentation.

If your app suffers from lag, use the guidance on this page to diagnose and fix the problem.

Note: The Android vitals dashboard and the Android system record render time statistics for apps that use the UI Toolkit (where the user-visible portion of the app is drawn from a Canvas or View hierarchy). If your app doesn't use the UI Toolkit (which is the case for apps built with Vulkan, Unity, Unreal, or OpenGL), the Android vitals dashboard doesn't provide render time statistics. To determine whether your device records render time metrics for your app, run adb shell dumpsys gfxinfo <package-name>.

1. Identify lags

Finding the code in your app that's causing the lag can be tricky. This section introduces three methods to identify lags:

  • Visual inspection
  • Systrace
  • Custom performance monitoring

Visual inspection lets you run quickly through all the use cases in your app in a few minutes, but it doesn't give as much detail as Systrace. Systrace provides more detail, but if you ran Systrace for every use case in your app, you'd be flooded with so much data that it would be hard to analyze. Both visual inspection and Systrace detect lag on your local device. If you can't reproduce the lag on a local device, you can build custom performance monitoring to measure specific parts of your app on devices running in the field.

1. Visual inspection method

Visual inspection can help you identify use cases that are causing lags. For a visual inspection, open your app and manually look at different parts of the app to see if there are any lags. Here are some tips for conducting a visual inspection:

  • Run the release version of your app (or at least the non-debuggable version). To support debugging, the ART runtime disables several important optimizations, so it's important to ensure that what you see is similar to what users will see.
  • Enable Profile GPU Rendering. This developer option displays bars on the screen that give a quick visual representation of how long it takes to render the frames of a UI window, relative to the 16 ms per frame benchmark. Each bar has colored components that map to a stage in the rendering pipeline, so you can see which portion takes the longest. For example, if a frame spends a lot of time handling input, look at the app code that handles user input.
  • Certain components, such as RecyclerView, are a common source of lag. If your app uses these components, it's a good idea to review these parts of your app.
  • Sometimes, lag can be reproduced only when the app is launched from a cold start.
  • Try running your app on a slower device to make the problem more pronounced.

Once you've found the use cases that produce lag, you may already have a good idea of what is causing it. If you need more information, you can use Systrace to drill down further.

2. Systrace method

The Systrace tool shows what the entire device is doing, but it can also be useful for identifying lag in your app. Systrace has minimal system overhead, so you experience realistic lag during instrumentation.

Record a trace with Systrace while performing the laggy use case on your device. For instructions on how to use Systrace, see Systrace Demo. The trace is broken up by process and thread. Look for your app's process in Systrace, which should look something like Figure 1.

Figure 1: System trace information

The system trace information in Figure 1 contains the following information to identify lags:

  • Systrace shows when each frame is drawn and color-codes each frame to highlight slow render times. This helps you find individual laggy frames more accurately than visual inspection. For more information, see Checking UI Frames and Alerts.
  • Systrace detects problems in your app and displays alerts both in individual frames and in the alerts panel. It's best to follow the directions in the alert.
  • Parts of the Android framework and libraries, such as RecyclerView, contain trace markers. The Systrace timeline therefore shows when those methods run on the UI thread and how long they take.

After looking at the Systrace output, you may suspect that some method in your app is causing the lag. For example, if the timeline shows that a frame is rendering slowly because the RecyclerView is taking a long time, you can add trace markers to the relevant code and rerun Systrace to get more information. In the new trace, the timeline shows when methods in your app were called and how long they took to execute.

If Systrace doesn't show you details about why UI thread work is taking a long time, use the Android CPU Profiler to record either a sampled or an instrumented method trace. Generally, method traces aren't good for identifying lag: their heavy overhead produces false-positive lag, and they can't show when threads are running versus blocked. Method traces can, however, help you identify the methods in your app that take the most time. Once you've identified those methods, add trace markers and rerun Systrace to see whether they cause lag.

Note: When recording a system trace, each trace marker (a begin and end pair) adds approximately 10 µs of overhead. To avoid false-positive lag, don't add trace markers to methods that are called dozens of times in one frame or that run for less than 200 µs.

To learn more, see Understanding Systrace.
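
The rule of thumb in the note can be checked with simple arithmetic. The following is a rough sketch that assumes the approximate 10 µs per-marker cost quoted above; the class and method names are hypothetical.

```java
/** Hypothetical helper for estimating what trace markers cost; not an Android API. */
class TraceOverhead {
    // Approximate cost of one begin/end marker pair, taken from the note above.
    static final double OVERHEAD_PER_MARKER_MICROS = 10.0;

    /** Total time that markers add to a single frame, in microseconds. */
    static double addedPerFrameMicros(int callsPerFrame) {
        return callsPerFrame * OVERHEAD_PER_MARKER_MICROS;
    }

    /** Fraction by which one marker inflates the measured duration of a method. */
    static double relativeOverhead(double methodMicros) {
        return OVERHEAD_PER_MARKER_MICROS / methodMicros;
    }
}
```

Under these assumptions, a marker on a 200 µs method inflates its measured time by about 5%, and a method traced 50 times per frame adds about 0.5 ms, roughly 3% of a 16.7 ms frame.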

3. Custom performance monitoring

If you can't reproduce the lag on a local device, you can build custom performance monitoring into your app to help identify the source of lag on devices in the field.

To do this, collect frame render times from specific parts of your app with FrameMetricsAggregator, and record and analyze the data with Firebase Performance Monitoring.

To learn more, see Using Firebase Performance Monitoring features with Android vitals.
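
FrameMetricsAggregator reports frame durations as a histogram of duration in milliseconds against number of frames. The sketch below shows one way such a histogram could be reduced to a single slow-frame percentage before logging it. It uses a plain int[] as a stand-in for the histogram so that it runs without Android, and the class and method names are hypothetical.

```java
/** Hypothetical summary helper; a plain int[] stands in for the collected histogram. */
class SlowFrameStats {
    /**
     * Returns the percentage of frames that took longer than thresholdMillis.
     * histogram[d] is the number of frames that took d milliseconds to render.
     */
    static double slowFramePercent(int[] histogram, int thresholdMillis) {
        long total = 0;
        long slow = 0;
        for (int ms = 0; ms < histogram.length; ms++) {
            total += histogram[ms];
            if (ms > thresholdMillis) {
                slow += histogram[ms];
            }
        }
        // No frames recorded yet: report 0% rather than dividing by zero.
        return total == 0 ? 0.0 : 100.0 * slow / total;
    }
}
```

For example, 90 frames at 8 ms, 8 frames at 20 ms, and 2 frames at 40 ms give 10% slow frames against a 16 ms threshold.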

2. Fix lag

To resolve stuttering, check which frames are taking longer than 16.7ms and see where the problem is. Does Record View#draw take too long in some frames, or could it be a layout issue? For these and other issues, see the common sources of lag below.

To avoid lag, long-running tasks should run asynchronously, off the UI thread. Always be aware of which thread your code is running on, and be cautious about posting non-trivial tasks to the main thread.
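
As a minimal sketch of this pattern in plain Java: the slow work runs on a background executor, and only the cheap final step is handed to the UI executor. Here a single-thread executor stands in for the main thread (in a real app, an executor that posts to the main thread would take its place), and loadHeadline() is a hypothetical placeholder for slow work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffThreadWork {
    /** Hypothetical stand-in for slow work such as disk or network access. */
    static String loadHeadline() {
        return "headline";
    }

    /**
     * Runs the slow load on the background executor, then passes the result to
     * the UI executor, which only performs the cheap display step.
     */
    static CompletableFuture<String> loadThenShow(ExecutorService background, ExecutorService ui) {
        return CompletableFuture
                .supplyAsync(OffThreadWork::loadHeadline, background)
                .thenApplyAsync(headline -> "shown: " + headline, ui);
    }

    public static void main(String[] args) {
        ExecutorService background = Executors.newSingleThreadExecutor();
        ExecutorService ui = Executors.newSingleThreadExecutor(); // stands in for the main thread
        try {
            System.out.println(loadThenShow(background, ui).join());
        } finally {
            background.shutdown();
            ui.shutdown();
        }
    }
}
```

The UI executor never waits on the background work: the result is pushed to it when ready, which keeps it free to produce frames in the meantime.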

If your app has a complex and important primary UI (perhaps a central scrolling list), consider writing instrumentation tests that automatically detect slow render times, and run them frequently to prevent regressions. For more information, see the Automated Performance Testing Codelab.

3. Common sources of lag

The following sections describe common sources of lag in your app and best practices for resolving them.

1. Scrollable list

ListView and especially RecyclerView are commonly used for complex scrolling lists, which are the most susceptible to lag. Both contain Systrace markers, so you can use Systrace to work out whether they're contributing to lag in your app. Be sure to pass the -a command-line argument (with your app's package name) so that the trace sections in RecyclerView, and any trace markers you added, show up. If available, follow the guidance of the alerts generated in the Systrace output. Inside Systrace, you can click a RecyclerView trace section to see an explanation of the work RecyclerView is doing.

RecyclerView: notifyDataSetChanged

If you see every item in your RecyclerView being rebound (and thus laid out and drawn again) in one frame, make sure you're not calling notifyDataSetChanged(), setAdapter(Adapter), or swapAdapter(Adapter, boolean) for small updates. Those methods signal that the entire list content has changed, and they show up in Systrace as RV FullInvalidate. Instead, use SortedList or DiffUtil to generate minimal updates when content changes or is added.

Let's take the example of an application that receives a new version of a list of news content from a server. When you publish that information to the adapter, you can call notifyDataSetChanged() as follows:

void onNewDataArrived(List<News> news) {
    myAdapter.setNews(news);
    myAdapter.notifyDataSetChanged();
}

But this has a big drawback - if it's a trivial change (perhaps a single item of content added to the top), the RecyclerView won't be able to detect this - it's told to discard all cached content state, and therefore needs to rebind every item.

It's much better to use DiffUtil, which will calculate and dispatch minimal updates for you.

void onNewDataArrived(List<News> news) {
    List<News> oldNews = myAdapter.getItems();
    // compute the minimal set of changes between the old and new lists
    DiffUtil.DiffResult result = DiffUtil.calculateDiff(new MyCallback(oldNews, news));
    myAdapter.setNews(news);
    result.dispatchUpdatesTo(myAdapter);
}

Just define your MyCallback as a DiffUtil.Callback implementation to tell DiffUtil how to check your list.
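
A minimal sketch of what MyCallback could look like follows. So that it compiles without Android, it extends a small stand-in class that mirrors the four abstract methods of DiffUtil.Callback; in an app you would extend DiffUtil.Callback itself and declare the methods public. The News fields (id and title) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Objects;

/** Stand-in mirroring the abstract methods of DiffUtil.Callback, for use outside Android. */
abstract class DiffCallback {
    abstract int getOldListSize();
    abstract int getNewListSize();
    abstract boolean areItemsTheSame(int oldPos, int newPos);
    abstract boolean areContentsTheSame(int oldPos, int newPos);
}

/** Hypothetical item type: a stable id plus the text that is displayed. */
class News {
    final long id;
    final String title;

    News(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

class MyCallback extends DiffCallback {
    private final List<News> oldNews;
    private final List<News> newNews;

    MyCallback(List<News> oldNews, List<News> newNews) {
        this.oldNews = oldNews;
        this.newNews = newNews;
    }

    @Override int getOldListSize() { return oldNews.size(); }

    @Override int getNewListSize() { return newNews.size(); }

    // Do both positions hold the same logical item? Compare stable identity, not position.
    @Override boolean areItemsTheSame(int oldPos, int newPos) {
        return oldNews.get(oldPos).id == newNews.get(newPos).id;
    }

    // For the same item: has anything that is displayed changed? If not, no rebind is needed.
    @Override boolean areContentsTheSame(int oldPos, int newPos) {
        return Objects.equals(oldNews.get(oldPos).title, newNews.get(newPos).title);
    }
}
```

Separating identity from content is what lets the diff report a moved or inserted item as such, rather than as an unrelated removal and addition.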

RecyclerView: Nested RecyclerView

Nesting RecyclerViews is common, especially with a vertical list of horizontally scrolling lists (like the app grids on the Play Store main page). This can work well, but it also means a lot of views moving around. If you see many inner items being inflated when you first scroll down the page, check that you're sharing the RecyclerView.RecycledViewPool between the inner (horizontal) RecyclerViews. By default, each RecyclerView has its own pool of items. With a dozen itemViews on screen at once, that becomes a problem: if all the rows show similar types of views, the different horizontal lists can't share itemViews with one another.

class OuterAdapter extends RecyclerView.Adapter<OuterAdapter.ViewHolder> {
    // one pool shared by every inner (horizontal) RecyclerView
    RecyclerView.RecycledViewPool sharedPool = new RecyclerView.RecycledViewPool();

    ...

    @Override
    public OuterAdapter.ViewHolder onCreateViewHolder(ViewGroup parent, int viewType) {
        // inflate inner item, find innerRecyclerView by ID…
        LinearLayoutManager innerLLM = new LinearLayoutManager(parent.getContext(),
                LinearLayoutManager.HORIZONTAL, false);
        innerRv.setLayoutManager(innerLLM);
        innerRv.setRecycledViewPool(sharedPool);
        return new OuterAdapter.ViewHolder(innerRv);
    }
    ...
}

If you want to optimize further, you can also call setInitialPrefetchItemCount(int) on the LinearLayoutManager of the inner RecyclerView. For example, if you always have 3.5 items visible in a row, call innerLLM.setInitialPrefetchItemCount(4);. This signals to RecyclerView that when a horizontal row is about to come on screen, it should attempt to prefetch the items inside if there's spare time on the UI thread.

RecyclerView: Too much inflation/Create taking too long

The prefetch feature in RecyclerView does the work ahead of time, while the UI thread is idle, so in most cases it should help work around the cost of inflation. If you see inflation during a frame (and not in a section labelled RV Prefetch), make sure you're testing on a recent device (prefetch is currently only supported on Android 5.0, API level 21, and higher) and using a recent version of the support library.

If you frequently see inflation causing lag as new items come on screen, verify that you don't have more view types than you need. The fewer view types in a RecyclerView's content, the less inflation is needed when new item types come on screen. If possible, merge view types where it's reasonable. If only an icon, color, or piece of text differs between types, you can make that change at bind time and avoid inflation (reducing your app's memory footprint at the same time).

If your view types look good, look at reducing the cost of inflation. Reducing unnecessary container and structural views can help; consider building itemViews with ConstraintLayout, which makes it easy to reduce structural views. If you really want to optimize for performance, your item hierarchies are simple, and you don't need complex theming and style features, consider calling the view constructors yourself. Note, though, that it's often not worth the tradeoff of losing the simplicity and features of XML.

RecyclerView: Binding takes too long

Bind (that is, onBindViewHolder(VH, int)) should be very simple, and should take much less than one millisecond for all but the most complex items. It should simply take POJO items from your adapter's internal item data and call setters on views in the ViewHolder. If RV OnBindView is taking a long time, verify that you're doing minimal work in your bind code.

If you're using simple POJO objects to hold data in your adapter, you can avoid writing the binding code in onBindViewHolder entirely by using the Data Binding library.

RecyclerView or ListView: Layout/draw takes too long

For drawing and layout issues, see the section on layout and rendering performance.

ListView: Inflation

It's easy to accidentally disable recycling in ListView if you're not careful. If you see inflation every time an item comes on screen, check that your implementation of Adapter.getView() is using, rebinding, and returning the convertView parameter. If your getView() implementation always inflates, your app doesn't get the benefit of recycling in ListView. The structure of getView() should almost always look like the following implementation:

View getView(int position, View convertView, ViewGroup parent) {
    if (convertView == null) {
        // only inflate if no convertView passed
        convertView = layoutInflater.inflate(R.layout.my_layout, parent, false);
    }
    // … bind content from position to convertView …
    return convertView;
}

2. Layout performance

If Systrace indicates that the layout portion of Choreographer#doFrame is doing too much work or doing work too frequently, it means you are experiencing layout performance issues. Your app's layout performance depends on which part of the view hierarchy contains layout parameters or inputs that change.

Layout performance: overhead

If the segments take more than a few milliseconds, you may be hitting worst-case nesting performance for RelativeLayouts or weighted LinearLayouts. Each of these layouts can trigger multiple measure/layout passes of its children, so nesting them can lead to O(n^2) behavior in the depth of nesting. Try to avoid RelativeLayout or the weight feature of LinearLayout in all but the lowest leaf nodes of the hierarchy. There are several ways to do this:

  • You can reorganize your structural views.
  • You can define custom layout logic. For a specific example, see Optimizing Layout Hierarchies.
  • You can try converting to ConstraintLayout, which provides similar features without the performance drawbacks.

Layout Performance: Frequency

Layout is expected to happen when new content comes on screen, for example when a new item scrolls into view in a RecyclerView. If significant layout is happening on every frame, you are probably animating layout, which is likely to cause dropped frames. Generally, animations should run on the drawing properties of a View (such as setTranslationX/Y/Z(), setRotation(), setAlpha(), and so on). These are all much cheaper to change than layout properties such as padding or margins. Changing a drawing property typically means calling a setter that triggers invalidate(), followed by draw(Canvas) in the next frame. This re-records the drawing operations for the invalidated view and is generally much cheaper than layout.

3. Rendering performance

Android UI work happens in two phases: Record View#draw on the UI thread, and DrawFrame on the RenderThread. The first phase runs draw(Canvas) on every invalidated View and may call into custom views or into your code. The second phase runs on the native RenderThread, but operates on the work generated by the Record View#draw phase.

Rendering performance: UI thread

If Record View#draw takes a long time, a common cause is that a bitmap is being painted on the UI thread. Painting to a bitmap uses CPU rendering, so avoid it when possible. Use method tracing with the Android CPU Profiler to see whether this is the problem.

Drawing to a bitmap is typically performed when an application wishes to decorate the bitmap before displaying it. Decoration sometimes refers to operations like adding rounded corners:

Canvas bitmapCanvas = new Canvas(roundedOutputBitmap);
Paint paint = new Paint();
paint.setAntiAlias(true);
// draw a round rect to define shape:
bitmapCanvas.drawRoundRect(0, 0,
        roundedOutputBitmap.getWidth(), roundedOutputBitmap.getHeight(), 20, 20, paint);
paint.setXfermode(new PorterDuffXfermode(PorterDuff.Mode.MULTIPLY));
// multiply content on top, to make it rounded
bitmapCanvas.drawBitmap(sourceBitmap, 0, 0, paint);
bitmapCanvas.setBitmap(null);
// now roundedOutputBitmap has sourceBitmap inside, but as a circle

If this is the kind of work you're doing on the UI thread, you can instead do it on the decoding thread in the background. In some cases like this one, you can even do the work at draw time, so if your Drawable or View code looks like this:

void setBitmap(Bitmap bitmap) {
    mBitmap = bitmap;
    invalidate();
}

void onDraw(Canvas canvas) {
    canvas.drawBitmap(mBitmap, null, paint);
}

You can replace it with the following code:

void setBitmap(Bitmap bitmap) {
    shaderPaint.setShader(
            new BitmapShader(bitmap, TileMode.CLAMP, TileMode.CLAMP));
    invalidate();
}

void onDraw(Canvas canvas) {
    canvas.drawRoundRect(0, 0, width, height, 20, 20, shaderPaint);
}

Note that this operation can also often be used for background protection (drawing a gradient in front of the bitmap) and image filtering (using a ColorMatrixColorFilter), two other common operations for modifying bitmaps.

If you are drawing to a bitmap for another reason (perhaps to use it as a cache), try drawing directly to the hardware-accelerated canvas passed to your View or Drawable; if necessary, consider calling setLayerType() with LAYER_TYPE_HARDWARE to cache complex rendering output while still taking advantage of GPU rendering.

Rendering performance: RenderThread

Some canvas operations, although very cheap to record, trigger very expensive calculations on the RenderThread. Systrace usually indicates this type of operation with an alert.

Canvas.saveLayer()

Avoid Canvas.saveLayer(): it can trigger expensive, uncached, off-screen rendering on every frame. Although performance improved in Android 6.0 (optimizations were made to avoid render target switching on the GPU), it is still best to avoid this expensive API if possible, or at least make sure you pass Canvas.CLIP_TO_LAYER_SAVE_FLAG (or call a variant that doesn't take flags).

Animating large paths

When Canvas.drawPath() is called on the hardware-accelerated canvas passed to a view, Android first draws these paths on the CPU and then uploads them to the GPU. If you have large paths, avoid editing them from frame to frame so that they can be cached and drawn efficiently. drawPoints(), drawLines(), and drawRect/Circle/Oval/RoundRect() are more efficient; it's better to use them even if you end up using more draw calls.

Canvas.clipPath

clipPath(Path) triggers very expensive clipping behavior, so it should generally be avoided. If possible, choose to use a drawn shape rather than clipping to a non-rectangular shape. It works better and supports anti-aliasing. For example, the following clipPath call:

canvas.save();
canvas.clipPath(circlePath);
canvas.drawBitmap(bitmap, 0f, 0f, paint);
canvas.restore();

It can be expressed as:

// one time init:
paint.setShader(new BitmapShader(bitmap, TileMode.CLAMP, TileMode.CLAMP));
// at draw time:
canvas.drawPath(circlePath, paint);

Bitmap upload

Android displays the bitmap as an OpenGL texture, and the bitmap is uploaded to the GPU the first time it is displayed in a frame. You can see this in Systrace as texture upload(id) "width x height". This may take a few milliseconds (see Figure 2), but the GPU must be used to display the image.

If these uploads take a long time, first check the width and height numbers in the trace. Make sure that the bitmap being displayed isn't significantly larger than the area of the screen it's shown in; otherwise, upload time and memory are wasted. Bitmap loading libraries generally provide an easy way to request an appropriately sized bitmap.
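
If you decode bitmaps yourself, one common way to request an appropriately sized bitmap is to pick a power-of-two subsampling factor (the kind of value passed as BitmapFactory.Options.inSampleSize). The following is a sketch of that calculation in plain Java; the class and method names are hypothetical.

```java
/** Hypothetical helper that picks a power-of-two subsampling factor for decoding. */
class BitmapSizing {
    /**
     * Returns the largest power-of-two factor that keeps the decoded bitmap
     * at least as large as the requested display size in both dimensions.
     */
    static int sampleSizeFor(int srcWidth, int srcHeight, int reqWidth, int reqHeight) {
        if (reqWidth <= 0 || reqHeight <= 0) {
            throw new IllegalArgumentException("requested size must be positive");
        }
        int sampleSize = 1;
        if (srcHeight > reqHeight || srcWidth > reqWidth) {
            int halfHeight = srcHeight / 2;
            int halfWidth = srcWidth / 2;
            // Keep doubling while the next halving would still cover the display area.
            while (halfHeight / sampleSize >= reqHeight && halfWidth / sampleSize >= reqWidth) {
                sampleSize *= 2;
            }
        }
        return sampleSize;
    }
}
```

For a 2048 x 1536 source shown in a 100 x 100 area, this yields a factor of 8, so the decoded (and uploaded) bitmap is 256 x 192 instead of the full size.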

In Android 7.0, bitmap loading code (generally done by libraries) can call prepareToDraw() to trigger an early upload before it's needed. This way, the upload happens early, while the RenderThread is idle. You can do this after decoding or when binding a bitmap to a view, as long as you know the bitmap. Ideally, your bitmap loading library does this for you, but if you manage your own bitmaps, or want to make sure uploads don't hit you on newer devices, you can call prepareToDraw() in your own code.

Figure 2: The application spends a lot of time uploading a large bitmap in a certain frame. Its size can be reduced, or the upload can be triggered early while decoding using prepareToDraw().

4. Thread scheduling delay

The thread scheduler in the Android operating system is responsible for deciding which threads in the system should run, when, and for how long. Sometimes, lag occurs because your app's UI thread is blocked or not running. Systrace uses different colors (see Figure 3) to indicate when a thread is sleeping (gray), runnable (blue: it could run, but the scheduler hasn't picked it to run yet), actively running (green), or in uninterruptible sleep (red or orange). This is useful for debugging lag caused by thread scheduling delays.

Note: Older versions of Android more frequently ran into scheduling issues that weren't the application's fault. Continuous improvements have been made in this area, so consider debugging thread scheduling problems on recent OS versions, where a thread that isn't scheduled is more likely to be the app's fault.

Figure 3: Highlighting of a period when the UI thread is sleeping.

Note: There are parts of a frame in which the UI thread or RenderThread is not expected to run. For example, the UI thread is blocked while RenderThread's syncFrameState is running and bitmaps are being uploaded, so that RenderThread can safely copy the data used by the UI thread. As another example, RenderThread can be blocked when it uses IPC to get a buffer at the beginning of the frame, to query information from it, or to pass the buffer back to the compositor with eglSwapBuffers.

Long pauses in your app's execution are often caused by binder calls, the inter-process communication (IPC) mechanism on Android. On recent versions of Android, this is one of the most common reasons for the UI thread to stop running. Generally, the fix is to avoid calling functions that make binder calls; if that's unavoidable, cache the value or move the work to a background thread. As codebases grow, it's easy to add a binder call by accident when you invoke some low-level method, but it's just as easy to find and fix such calls with tracing.

If you have a binder transaction, you can capture its call stack using the following adb command:

$ adb shell am trace-ipc start
… use the app - scroll/animate ...
$ adb shell am trace-ipc stop --dump-file /data/local/tmp/ipc-trace.txt
$ adb pull /data/local/tmp/ipc-trace.txt

Sometimes seemingly harmless calls (such as getRefreshRate()) can trigger binder transactions and cause big problems when they're called frequently. Tracing periodically can help you find and fix these issues as they show up.
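
One way to follow the advice to cache the value is a small wrapper that invokes the expensive call once and serves the stored result afterwards. This is a sketch in plain Java: CachedValue is a hypothetical helper, and the loader stands in for whatever binder-backed call your code makes.

```java
import java.util.function.Supplier;

/** Hypothetical helper: caches the result of an expensive call so hot paths don't repeat it. */
class CachedValue<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    CachedValue(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only on first use or after invalidate(). */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    /** Call when the underlying value may have changed, so the next get() reloads it. */
    synchronized void invalidate() {
        loaded = false;
        value = null;
    }
}
```

The tradeoff is staleness: the cache is only safe if invalidate() is called whenever the underlying value can change.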

Figure 4: The UI thread sleeping due to binder transactions in an RV fling. Keep your bind logic simple, and use trace-ipc to track down and remove binder calls.

If you don't see binder activity but the UI thread still isn't running, make sure you're not waiting on a lock or some other operation from another thread. Typically, the UI thread shouldn't have to wait on results from other threads; other threads should post information to it.

5. Object allocation and garbage collection

Object allocation and garbage collection (GC) have become much less of a problem since ART was introduced as the default runtime in Android 5.0, but it's still possible to weigh down your threads with this extra work. It's fine to allocate in response to a rare event that doesn't happen many times a second (such as a user tapping a button), but remember that each allocation comes with a cost. If it happens in a tight loop that is called frequently, consider avoiding the allocation to lighten the load on the GC.

Systrace will show if the GC is running frequently, while Android Memory Profiler can show the source of allocations. If you avoid allocations as much as possible (especially in tight loops), you shouldn't run into problems.
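
As an illustration of avoiding allocation in a frequently called path, the sketch below reuses one scratch object instead of creating a new result object on every call. The class and its Bounds type are hypothetical and written in plain Java.

```java
/** Hypothetical example of reusing a scratch object in a hot path. */
class BoundsCalculator {
    /** Simple mutable rectangle used as the reusable result object. */
    static final class Bounds {
        float left, top, right, bottom;
    }

    // Allocated once per calculator and reused, so boundsOf() itself allocates nothing.
    private final Bounds scratch = new Bounds();

    /**
     * Computes the bounding box of points laid out as [x0, y0, x1, y1, ...].
     * The returned object is overwritten by the next call, so callers must not keep it.
     */
    Bounds boundsOf(float[] points) {
        scratch.left = scratch.top = Float.POSITIVE_INFINITY;
        scratch.right = scratch.bottom = Float.NEGATIVE_INFINITY;
        for (int i = 0; i + 1 < points.length; i += 2) {
            scratch.left = Math.min(scratch.left, points[i]);
            scratch.right = Math.max(scratch.right, points[i]);
            scratch.top = Math.min(scratch.top, points[i + 1]);
            scratch.bottom = Math.max(scratch.bottom, points[i + 1]);
        }
        return scratch;
    }
}
```

The cost of this pattern is that the result is shared mutable state: it isn't thread-safe, and a caller that needs to keep the values must copy them.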

Figure 5: Showing 94ms GC on HeapTaskDaemon thread

In newer versions of Android, the GC usually runs on a background thread called HeapTaskDaemon. Note that a large number of allocations may mean more CPU resources are spent on the GC, as shown in Figure 5.

Origin blog.csdn.net/sinat_31057219/article/details/132451522