Five questions and three strategies: how to customize an alarm plan for App performance monitoring

Author: Umeng+ U-APM team

Why? Why monitor application performance?

First, we need to know what application performance monitoring refers to and what it is for:

Monitoring is a complete "monitoring + alarm" system. For App developers like us, application performance monitoring is the first checkpoint in measuring an App's quality. If the quality is poor, users feel the damage directly. Once the App is online, developers cannot observe users' usage and experience in real time around the clock (7*24), so a good set of monitoring tools is needed.

So, what indicators do we need to monitor?

The client-side monitoring indicators differ in many ways between Android and iOS. For example, Android needs to capture Java errors, Native errors, ANRs, and so on, while iOS needs to capture errors in the Objective-C, Swift, and C++ layers, and so on.

When defining error indicators, the most basic one is the number of errors of each type. To compare the number of errors with overall application usage, you can use a ratio. For example, you can define the error rate:

1.png

If you care not only about how many times errors occur but also about how many users they affect, you can derive the number of affected users from the error records by deduplicating them by user.

How do we define a unique user? We can identify users by device ID, such as IMEI, IDFA, or Android ID. If this information is hard to obtain, you can use a business-level user ID instead, such as a login account or member name. A device identifier provided by a third-party SDK is also a good choice. After deduplicating by this type of ID, you get the number of users affected by errors.

If we know the number of users affected by errors but not how large a share of all users that is, we can look at the following indicator:

2.png


In summary, for a given time range we can count, for each error type, the number of errors, the error rate, the number of affected users, and the percentage of affected users. These indicators can also be broken down by other dimensions, such as version number.
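The four indicators above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact formulas are given in the figures, so the denominators used here (app launches for the error rate, active users for the percentage of affected users) are assumptions, and the function name and event layout are hypothetical.

```python
from collections import defaultdict


def summarize_errors(events, launches, active_users):
    """Compute per-version error indicators.

    events: list of (version, error_type, user_id) tuples, one per error.
    launches: dict version -> app launches (ASSUMED error-rate denominator).
    active_users: dict version -> active users (ASSUMED denominator).
    """
    error_count = defaultdict(int)
    affected = defaultdict(set)
    for version, _error_type, user_id in events:
        error_count[version] += 1
        affected[version].add(user_id)  # a set deduplicates users by ID

    summary = {}
    for version, count in error_count.items():
        users = len(affected[version])
        summary[version] = {
            "errors": count,
            "error_rate": count / launches[version],
            "affected_users": users,
            "affected_user_pct": users / active_users[version],
        }
    return summary


# Hypothetical sample data: user_id could be a device ID or a login account.
events = [
    ("2.0", "ANR", "device-a"),
    ("2.0", "Java", "device-a"),
    ("2.0", "Java", "device-b"),
    ("1.9", "Native", "device-c"),
]
summary = summarize_errors(
    events,
    launches={"2.0": 100, "1.9": 50},
    active_users={"2.0": 20, "1.9": 10},
)
print(summary["2.0"])
```

Note that version 2.0 has three errors but only two affected users, because device-a is counted once after deduplication.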

How? How do you flexibly draw up your alarm plan?

First, please take a short quiz to determine your monitoring alarm type (5 questions in total, about 1.5 minutes).

The rules are as follows: option A scores 5 points, option B 10 points, option C 15 points, and option D 20 points.

Q1: What stage is your product currently in?

A: Already online and relatively stable, with a low demand for monitoring alarms

B: Still in development; some errors found in testing need to be captured, and the demand for monitoring alarms is moderate

C: Just launched and relatively stable overall, with a high demand for monitoring alarms

D: Just launched with unknown results; real-time attention around the clock (7*24) is needed, and the demand for monitoring alarms is very high

Q2: What is your position in your company/department?

A: Leader, concerned with the overall quality of the application

B: Operations and maintenance staff, in charge of online issues and responsible for monitoring overall application performance

C: Tester, responsible for the quality control of the application before release

D: Android/iOS client developer

Q3: How many people on your team care about application performance quality and take part in it?

A: 1 person, a commander without troops working alone

B: 2~5 people, a small development team

C: 6~25 people, working together to optimize application quality

D: 25+ people, a very large development team; immodestly speaking, an industry leader

Q4: Which application performance monitoring indicators do you pay attention to day to day?

A: The most basic number of errors is enough

B: To account for how many users are affected, the number and percentage of affected users also need to be monitored in addition to the above

C: In addition to the number of errors and the user impact above, the distribution across versions should also be considered

D: Combined alarm rules are needed. For example, trigger an alarm when the number of errors > 100 and the error rate > 1% or the number of affected users has risen by 1% compared with 1 day ago; the version distribution should also be considered

Q5: Do you have detailed requirements for the notification method of alarms?

A: No requirement, as long as I can receive it

B: Some requirements on timing; I don't want to be disturbed in the middle of the night

C: Some requirements on the channel; email or a specific office chat tool is needed

D: Requirements on both timing and channel

What? So how do you set up an alarm plan?

First add up your points from the quiz above (5 points for option A, 10 for option B, 15 for option C, and 20 for option D) to get your total score, then see which alarm demand level below your App monitoring falls into:

Blood Bronze (25-50 points): You are an entry-level user of monitoring alarms and do not need to check the status of each kind of error closely in your daily work. Perhaps your application is still in its early stage, or you are senior enough that you do not fix alarmed issues yourself and only need an overall view. Please see Option 1 below.

Heroic Gold (50-75 points): You are an intermediate user of monitoring alarms. You or your team are aware of monitoring alarms and pay attention to real-time application quality in your daily work. You can already set alarms with reasonably refined rules. Please skip to Option 2.

King of Glory (75-100 points): You are already an expert player in monitoring and alarming. With a little guidance, you can become the "super trump card" of the monitoring and alarm world.

Based on your quiz score, you can judge how sophisticated an alarm setup you need. The setups are divided into the following options, ordered from easy to hard to implement. If you want to learn the most comprehensive alarm settings, skip directly to Option 3.
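The scoring can be sketched as a small Python helper. The point values and level names come from the quiz above; the function name is hypothetical, and because the published ranges share their boundaries (50 and 75 each appear in two levels), assigning a boundary score to the lower level is an assumption.

```python
POINTS = {"A": 5, "B": 10, "C": 15, "D": 20}


def quiz_level(answers):
    """Return (total score, level name) for five answers such as "BCBCB"."""
    if len(answers) != 5:
        raise ValueError("the quiz has exactly 5 questions")
    total = sum(POINTS[answer] for answer in answers)
    # ASSUMPTION: scores of exactly 50 or 75 fall into the lower level.
    if total <= 50:
        level = "Blood Bronze"
    elif total <= 75:
        level = "Heroic Gold"
    else:
        level = "King of Glory"
    return total, level


print(quiz_level("BCBCB"))  # (60, 'Heroic Gold')
```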

Option 1: Simple type (overall application quality monitoring)

For an initial alarm setup, you only need to consider two questions:

a. Under what circumstances should I receive an alarm?

b. How do I receive the application's alarm messages?

For the first question, consider the simplest case: I want an alarm whenever any error occurs, so the rule "number of errors > 0" is enough. If this disturbs you too often, set a rule such as "number of errors > xx" with a value suited to your own application.

3.png

For the second question, you need a channel that can receive messages; the simplest is email:

4.png

A simple monitoring alarm plan is now set up.

Option 2: Advanced type (refined application quality monitoring)

You can already set different alarms for a single application, distinguished by the monitored indicator or by version. For example, for a newly launched version we require an alarm when the number of affected users > 10, and for the old version we require that the overall error rate not rise by more than 5% compared with last week. We can then configure the rules as follows:

a. Alarm rule for the new version:

5.png

b. Alarm rule for the old version:

6.png

In this option, we use the threshold and comparison trigger conditions respectively. The two rule types are defined as follows:

Threshold rule

You can choose an indicator (number of errors, error rate, number of affected users, or percentage of affected users) and trigger the alarm when it is "greater than" a certain value or percentage.
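A threshold rule of this kind can be sketched as follows. This is an illustrative model only, not U-APM's implementation; the class name and indicator keys are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical keys for the four indicators named in the text.
INDICATORS = {"errors", "error_rate", "affected_users", "affected_user_pct"}


@dataclass(frozen=True)
class ThresholdRule:
    indicator: str
    limit: float

    def __post_init__(self):
        if self.indicator not in INDICATORS:
            raise ValueError(f"unknown indicator: {self.indicator}")

    def triggered(self, metrics):
        """True when the chosen indicator is strictly greater than the limit."""
        return metrics[self.indicator] > self.limit


any_error = ThresholdRule("errors", 0)             # the Option 1 rule
new_version = ThresholdRule("affected_users", 10)  # the new-version rule

# Hypothetical indicator values for one time window.
metrics = {
    "errors": 3,
    "error_rate": 0.03,
    "affected_users": 2,
    "affected_user_pct": 0.1,
}
print(any_error.triggered(metrics))    # True
print(new_version.triggered(metrics))  # False
```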

Comparison rule

You can select an indicator (number of errors, error rate, number of affected users, or percentage of affected users), then select the historical time period to compare against and the percentage increase. The calculation is: (value for the past 1 hour - value for the historical 1 hour) / value for the historical 1 hour. If the result is greater than or equal to the selected value, an alarm is sent.

Option 3: King type (combined indicator monitoring)

You can already set up monitoring alarms skillfully. With the tips below, you should be able to tailor your alarm plan flexibly to your daily work.
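The comparison calculation can be written out as a short sketch. The function names are hypothetical, and the handling of a zero historical value is an assumption, since the text does not say what happens in that case.

```python
def relative_increase(current, historical):
    """(current - historical) / historical, as in the rule described above."""
    if historical == 0:
        # ASSUMPTION: any growth from zero counts as an unbounded increase.
        return float("inf") if current > 0 else 0.0
    return (current - historical) / historical


def comparison_triggered(current, historical, min_increase):
    """True when the increase is greater than or equal to the selected value."""
    return relative_increase(current, historical) >= min_increase


# Error count in the past hour vs. the same hour in the comparison period.
print(relative_increase(150, 100))           # 0.5
print(comparison_triggered(150, 100, 0.05))  # True
print(comparison_triggered(104, 100, 0.05))  # False
```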

a. Flexible setting of alarm effective time:

7.png

You can add time periods during which the alarm takes effect, such as Monday to Friday from 9:00 to 19:00 and weekends from 12:00 to 20:00. Set them to match your working hours so that you are not disturbed by messages outside those hours.
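An effective-time check like this can be sketched with the standard library. The windows are the ones from the example above; treating the end time as exclusive and using the device's local time are assumptions, and the function name is hypothetical.

```python
from datetime import datetime, time

WEEKDAY_WINDOW = (time(9, 0), time(19, 0))   # Monday to Friday
WEEKEND_WINDOW = (time(12, 0), time(20, 0))  # Saturday and Sunday


def alarm_active(moment):
    """True when an alarm raised at `moment` should be delivered."""
    # datetime.weekday(): Monday is 0 and Sunday is 6.
    start, end = WEEKDAY_WINDOW if moment.weekday() < 5 else WEEKEND_WINDOW
    # ASSUMPTION: the start is inclusive and the end is exclusive.
    return start <= moment.time() < end


print(alarm_active(datetime(2024, 1, 1, 10, 0)))  # Monday 10:00 -> True
print(alarm_active(datetime(2024, 1, 6, 10, 0)))  # Saturday 10:00 -> False
```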

b. Key error type/single error alarm

You can choose the error types that need your focus

8.png

Or follow a single error that is being fixed and keep receiving alarms for it

9.png

c. Combined alarm trigger conditions

10.png

You can flexibly define the trigger conditions you want by combining multiple indicators, threshold or comparison rules, and intersection/union (AND/OR) logic.

d. Multiple alarm channels

11.png

If you also have requirements for the channels through which alarms are received, consider routing them to a group in your company's office software, so that you and the other colleagues in the group can follow and fix application problems together.
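Combining conditions by intersection and union can be sketched with small composable predicates. The example reuses the combined rule from quiz question Q4; the grouping "(errors and error rate) or user growth" is one possible reading and is an assumption, as are the helper names and the precomputed "affected_users_increase" key.

```python
def greater_than(indicator, limit):
    """Threshold condition: indicator value strictly above the limit."""
    return lambda metrics: metrics[indicator] > limit


def at_least(indicator, limit):
    """Comparison condition: precomputed increase at or above the limit."""
    return lambda metrics: metrics[indicator] >= limit


def all_of(*conditions):
    """Intersection: every condition must hold."""
    return lambda metrics: all(cond(metrics) for cond in conditions)


def any_of(*conditions):
    """Union: at least one condition must hold."""
    return lambda metrics: any(cond(metrics) for cond in conditions)


# ASSUMED grouping: (errors > 100 AND error rate > 1%) OR users up >= 1%.
rule = any_of(
    all_of(greater_than("errors", 100), greater_than("error_rate", 0.01)),
    at_least("affected_users_increase", 0.01),
)

print(rule({"errors": 150, "error_rate": 0.02, "affected_users_increase": 0.0}))
print(rule({"errors": 150, "error_rate": 0.005, "affected_users_increase": 0.0}))
```

The first call prints True because both threshold conditions hold; the second prints False because the error rate is below 1% and the number of affected users has not grown.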

All the alarm settings mentioned in this option can be tried in U-APM, where an alarm plan can be set up in 2 minutes.



Origin blog.51cto.com/10636575/2663489