Mistakes to Avoid in Database Benchmarking

Benchmarking is arguably the only convenient and effective way to learn how a system behaves under a given workload, which is why it matters so much. Before designing a benchmark, we need to understand the common benchmarking mistakes so that we can avoid them in our own tests.

Common mistakes in database benchmarking:

  • Using a subset of the real data instead of the full set. For example, the application has to handle 100 GB of data in production, but the test uses only 1 GB; or the test uses only the current data set even though it is meant to model the system after the business has grown significantly.
  • Using the wrong data distribution. For example, testing with uniformly distributed data when the system's real data has many hot spots. (Randomly generated test data usually cannot reproduce the real data distribution.)
  • Using unrealistic distribution parameters, e.g. assuming that all users' profiles are read equally often.
  • Running only single-user tests for a multi-user scenario.
  • Testing a distributed application on a single server.
  • Not matching real user behavior, for example the "think time" on a web page. Real users request a page and then read it for a while; they do not click one link after another without pausing.
  • Running the same query repeatedly. Real queries vary, which lowers cache hit rates. A query that is executed over and over will be served, in whole or in part, from cache at some level.
  • Not checking for errors. If a benchmark result cannot be reasonably explained, such as a query that should be slow suddenly running fast, check for errors. Otherwise the benchmark may only be measuring how quickly MySQL can detect a syntax error. Checking the error log after every benchmark run should be a basic requirement.
  • Ignoring the warm-up phase, for example by testing immediately after the system restarts. Sometimes you need to know how long the system takes after a restart to reach its normal performance capacity, and then the warm-up period itself is what you should measure. If you want to analyze normal performance, however, keep in mind that right after a restart the caches are cold and hold no data yet. Even under the same test load, the results will differ from those obtained once the caches are full.
  • Using the default server configuration, which is unlikely to match a tuned production server.
  • Running the test for too short a time. A benchmark needs to run for a while before its results are meaningful.
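
Three of the points above (data distribution, think time, and warm-up) can be illustrated with a small workload-generator sketch. This is only a minimal illustration in Python and is not tied to any particular benchmarking tool: the function names, the Zipf-like weighting, the 5-second mean think time, and the 20% warm-up fraction are all assumptions chosen for the example, not values from a real system.

```python
import random


def zipf_weights(n_keys, skew=1.0):
    """Weights proportional to 1 / rank**skew, so low-ranked keys are 'hot'.

    skew=0 gives every key the same weight (a uniform distribution).
    """
    return [1.0 / (rank ** skew) for rank in range(1, n_keys + 1)]


def sample_keys(n_keys, n_requests, skew, rng):
    """Draw 0-based key ids for n_requests simulated reads."""
    weights = zipf_weights(n_keys, skew)
    return rng.choices(range(n_keys), weights=weights, k=n_requests)


def hot_share(keys, n_keys, top_fraction=0.01):
    """Fraction of requests that hit the hottest top_fraction of the keys."""
    cutoff = max(1, int(n_keys * top_fraction))
    return sum(1 for key in keys if key < cutoff) / len(keys)


def think_time(rng, mean_seconds=5.0):
    """Pause between two requests of one simulated user.

    An exponential distribution is an assumption; real think times
    should be derived from production logs where possible.
    """
    return rng.expovariate(1.0 / mean_seconds)


def steady_state(latencies, warmup_fraction=0.2):
    """Drop the first part of a run so cold-cache samples are excluded."""
    start = int(len(latencies) * warmup_fraction)
    return latencies[start:]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    n_keys, n_requests = 1000, 10000

    uniform = sample_keys(n_keys, n_requests, skew=0.0, rng=rng)
    skewed = sample_keys(n_keys, n_requests, skew=1.0, rng=rng)

    print("share of reads on hottest 1% of keys")
    print("  uniform:", hot_share(uniform, n_keys))
    print("  skewed: ", hot_share(skewed, n_keys))
    print("example think time (s):", round(think_time(rng), 2))
```

With a uniform distribution the hottest 1% of keys receive roughly 1% of the reads, while the skewed distribution concentrates a much larger share on them, which changes cache behavior considerably. The `steady_state` helper is meant for the case where normal performance is being analyzed; when the warm-up period itself is the subject, the discarded samples are exactly the ones to keep.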
