Using a C program to count how many machine instructions a piece of C code executes (only some Linux systems are supported)


1. Source

While studying time complexity recently, I had an idea: besides running time, could the performance of two algorithms be compared by the number of instructions they execute?
I did not know how to write this myself, so I searched online. The closest result to what I had in mind is an answer on Stack Overflow: "Quick way to count number of instructions executed in a C program".


2. C language code

#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <inttypes.h>
#include <sys/types.h>

/* glibc provides no wrapper for perf_event_open, so invoke the raw syscall. */
static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags) {
  return syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
}

int main(int argc, char **argv) {
  struct perf_event_attr pe;
  long long count;
  int fd;
  /* Optional value from argv[1]; the sample loop below does not use it and hard-codes 10000. */
  uint64_t n;
  if (argc > 1) {
    n = strtoll(argv[1], NULL, 0);
  } else {
    n = 10000;
  }
  (void)n;

  /* Describe the event: retired instructions, user space only. */
  memset(&pe, 0, sizeof(struct perf_event_attr));
  pe.type = PERF_TYPE_HARDWARE;
  pe.size = sizeof(struct perf_event_attr);
  pe.config = PERF_COUNT_HW_INSTRUCTIONS;
  pe.disabled = 1;       // Start disabled; enabled explicitly below.
  pe.exclude_kernel = 1; // Don't count kernel events.
  pe.exclude_hv = 1;     // Don't count hypervisor events.

  /* pid = 0, cpu = -1: measure the calling process on any CPU. */
  fd = perf_event_open(&pe, 0, -1, -1, 0);
  if (fd == -1) {
    fprintf(stderr, "Error opening leader %llx\n", pe.config);
    exit(EXIT_FAILURE);
  }
  ioctl(fd, PERF_EVENT_IOC_RESET, 0);
  ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

  /* ---------------------- Code under test starts below ---------------------- */
  for (int i = 0; i < 10000; ++i);
  /* ---------------------- Code under test ends above ------------------------ */

  ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
  if (read(fd, &count, sizeof(long long)) != (ssize_t)sizeof(long long)) {
    fprintf(stderr, "Error reading counter\n");
    exit(EXIT_FAILURE);
  }
  printf("Used %lld instructions\n", count);
  close(fd);
  return 0;
}

Notes:

  1. The code under test sits between the two marker comments. In this example it measures how many instructions a for loop of 10000 iterations executes. I don't really understand the rest of the code, so I suggest you leave it unchanged as well.
  2. The program relies on the Linux environment and cannot be executed under Windows.

3. Compile and run

Compiling and running it is no different from any ordinary C program. Assuming the file name is perf_event_open.c, the command to compile and run is:

gcc perf_event_open.c -o perf_event_open.out && ./perf_event_open.out

On success, something like the following is printed:

Used 30018 instructions

On failure, the following may be printed:

Error opening leader 1

4. Error opening leader problem

If the program runs successfully right away, congratulations, you can skip this section.
If it fails with the message "Error opening leader 1", there are two possible reasons:

  1. You are running in a VMware virtual machine, and the virtual machine is not configured correctly.
  2. Your CPU does not support running this program.

4.1 VMware virtual machine settings

If you are running in a VMware virtual machine, you need to enable the "Virtualize CPU performance counters" option.
Menu: Edit virtual machine settings (shut down the virtual machine first) -> Hardware tab -> Processors -> check "Virtualize CPU performance counters".
Then start the virtual machine.


4.2 The CPU does not support this program

This program uses the PMU (Performance Monitoring Unit) provided by the CPU. If your CPU does not support a PMU, there is no workaround.
How can you tell whether it is supported? I did not find any authoritative articles on this, so I have no definite answer, only two educated guesses.


4.2.1 The first method: use the dmesg command

Execute the following command:

 dmesg | grep PMU

If it is supported, the output contains lines that mention the PMU; if not, nothing is printed.


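4.2.2 The second method: use the perf command

If the perf command is not present, it needs to be installed first.
Install perf on CentOS 8:

yum install -y perf

Install perf on Ubuntu 20.04 (the version-specific package names must match your running kernel, as reported by uname -r):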

apt install -y linux-tools-common linux-tools-generic linux-tools-5.11.0-44-generic linux-cloud-tools-5.11.0-44-generic

After installation, execute the following command:

perf list pmu

If PMU is supported, the result is similar to this (the instructions event is the one this program relies on):

List of pre-defined events (to be used in -e):

  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  instructions OR cpu/instructions/                  [Kernel PMU event]
  ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
  topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
  topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
  topdown-slots-issued OR cpu/topdown-slots-issued/  [Kernel PMU event]
  topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
  topdown-total-slots OR cpu/topdown-total-slots/    [Kernel PMU event]
  msr/pperf/                                         [Kernel PMU event]
  msr/smi/                                           [Kernel PMU event]
  msr/tsc/                                           [Kernel PMU event]

If it is not supported, the result is similar to this (note that the instructions event is missing):

List of pre-defined events (to be used in -e):

  ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
  topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
  topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
  topdown-slots-issued OR cpu/topdown-slots-issued/  [Kernel PMU event]
  topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
  topdown-total-slots OR cpu/topdown-total-slots/    [Kernel PMU event]
  msr/pperf/                                         [Kernel PMU event]
  msr/smi/                                           [Kernel PMU event]
  msr/tsc/                                           [Kernel PMU event]

5. Measurement error

In the C code above, comment out the code under test (the for loop), then save, compile, and run. You will find that the instruction count is not 0: on my Ubuntu 20.04 it is 14, and on my CentOS 8 it is 13. Your own environment may give a different value. This value is the measurement overhead, so subtract it from your actual results.



Origin blog.csdn.net/h837087787/article/details/122384335