Why do array subscripts start from 0?

In many programming languages, array subscripts start from zero. Why not start from one?

To answer that, we first need two definitions related to arrays.

  • An array is a linear data structure. It uses one contiguous block of memory to store a set of values of the same type.
  • In C, a subscript means the offset of the current element from the first element. The subscript of the first element is therefore 0, the subscript of the second element is 1, and the subscript of the nth element is n-1.
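Both definitions can be checked directly in C. The following is only a minimal sketch: the helper names `element_offset` and `byte_gap` are made up for illustration and do not come from any library.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements between the first element
 * and a given element of the same array. Pointer subtraction in C
 * counts elements, not bytes, so the result is the subscript. */
ptrdiff_t element_offset(const int *first, const int *element) {
    return element - first;
}

/* Hypothetical helper: distance in bytes between a[i] and a[i + 1].
 * The caller must ensure that a[i + 1] is still inside the array. */
size_t byte_gap(const int *a, size_t i) {
    return (size_t)((const char *)&a[i + 1] - (const char *)&a[i]);
}
```

For an `int a[4]`, `element_offset(a, &a[3])` gives 3, and `byte_gap(a, 1)` gives `sizeof(int)`, which is what "contiguous" and "offset from the first element" mean in practice.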

Subscripts start from 0 because it makes addressing convenient.

The computer assigns an address to each memory unit and accesses the data in the memory through the address. When a computer wants to randomly access an element in an array, it uses the following addressing formula:

a[i]_address = base_address + i * data_type_size

If the subscript started from 1 instead, the formula would become:

a[i]_address = base_address + ( i - 1 ) * data_type_size

Comparing the two formulas, you can see that when subscripts start from 1, every random access by subscript costs the CPU one extra subtraction. Arrays are a very basic data structure, and random access by subscript is a very basic operation, so it should be as cheap as possible. To save that one subtraction, arrays are numbered from 0 instead of 1.
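The two formulas can be written out as C functions. This is a sketch under assumptions, not how any particular compiler does it: the function names are hypothetical, and converting a pointer to `uintptr_t` and treating it as a byte address assumes an ordinary flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* 0-based addressing: base_address + i * data_type_size */
uintptr_t address_zero_based(uintptr_t base, size_t i, size_t elem_size) {
    return base + i * elem_size;
}

/* Hypothetical 1-based addressing: base_address + (i - 1) * data_type_size.
 * The caller must pass i >= 1. Note the extra subtraction. */
uintptr_t address_one_based(uintptr_t base, size_t i, size_t elem_size) {
    return base + (i - 1) * elem_size;
}
```

With `double a[5]` and `base = (uintptr_t)&a[0]`, both `address_zero_based(base, 3, sizeof a[0])` and `address_one_based(base, 4, sizeof a[0])` should equal `(uintptr_t)&a[3]`: the same element, reached with one more operation in the 1-based version.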

There is another way to understand this, in terms of offsets. A block of memory is allocated (its addresses are contiguous, and the content at each address is the value you store there), and the variable a points to the first address of that block. To reach any other address in the block, you calculate it from an offset.

The variable a already points to the first address. The address of a[0] is a + 0, where 0 is the offset: a offset by 0 units is a itself, so a[0] is the element at the first address. The address of a[1] is a + 1: a offset by one unit, which is the second element. So the variable name plus [offset] (such as a[0]) is all you need to access the corresponding memory location. That is how subscripts starting from 0 work.
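In C this offset view is literal: `a[i]` is defined as `*(a + i)`. A minimal sketch, with `read_by_offset` as a hypothetical name chosen for illustration:

```c
#include <stddef.h>

/* Read element i by moving i units from the first address and
 * dereferencing. In C this is exactly what a[i] means. */
int read_by_offset(const int *a, size_t i) {
    return *(a + i);
}
```

For `int a[3] = {7, 8, 9}`, `read_by_offset(a, 0)` returns the same value as `a[0]`, and `a + 1` compares equal to `&a[1]`.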

Origin blog.csdn.net/2301_78131481/article/details/134101826