Variable-length arrays in C, and nesting them: over the mountain, across the river, and into the pit

Variable-length arrays


Overview

The elements of an array are stored contiguously, which makes storing and accessing data easy.
The same property has a cost: when writing the code we often do not know how many elements the array will hold, so we pre-allocate for the maximum.
That wastes space and is unfriendly to use: we only want to run a small data set, yet the program demands a large amount of memory.

This is the motivation for variable-length arrays, whose number of elements is not fixed in the code but determined at runtime.

Implementation approaches

Variable-length arrays come up often in our programs, but how are they implemented?
Depending on the memory area the array is stored in, there are two approaches:

  • Stack memory implementation
  • Heap memory implementation

Let's look at how each one is implemented and how they differ.

Stack memory implementation

The VLA (variable-length array) feature added in C99 lets us define an array at the point of use with a length that is no longer a compile-time constant but the value of a variable.
In other words, the length is unknown at compile time and fixed at runtime, so we no longer have to define a maximum-size array and waste a lot of space.

  • example
void test(int n)
{
    /* check */
    if(n <= 0)
    {
        return;
    }

    // int arr[n] = {0};
    int arr[n];
    /* todo  */
    for(int i=0; i < n; i++)
    {
        arr[i] = i;
    }
    return;
}

The length of the array arr is determined at runtime by the variable n.

  • Precautions
  1. This feature was introduced by C99, and not all compilers fully support it (C11 later made VLAs an optional feature). The gcc version I use supports it:
[senllang@hatch toadbtest]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --disable-libmpx --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 8.5.0 20210514 (Red Hat 8.5.0-19) (GCC)
  2. A VLA cannot be initialized in its definition; gcc rejects the initializer with the error below. The elements must be assigned explicitly after the definition;
[senllang@hatch toadbtest]$ gcc test.c
test.c: In function ‘test’:
test.c:9:5: error: variable-sized object may not be initialized
     int arr[n] = {0};
     ^~~
test.c:9:19: warning: excess elements in array initializer
     int arr[n] = {0};
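
Since the initializer is rejected, one common way to get a zeroed VLA is to clear it with memset right after the definition. The following is only a minimal sketch; the function name vla_zeroed_sum is made up for illustration and is not part of the original test.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: defines a VLA of n ints, zeroes it with memset,
 * stores i in every even slot, and returns the sum of all elements. */
int vla_zeroed_sum(int n)
{
    assert(n > 0);                /* a VLA length must be positive */

    int arr[n];                   /* no initializer is allowed here */
    memset(arr, 0, sizeof(arr));  /* sizeof(arr) is evaluated at runtime: n * sizeof(int) */

    for (int i = 0; i < n; i += 2)
        arr[i] = i;

    int sum = 0;
    for (int i = 0; i < n; i++)   /* odd slots still hold the zero from memset */
        sum += arr[i];
    return sum;
}
```

Calling vla_zeroed_sum(5) adds up 0 + 0 + 2 + 0 + 4, so the untouched slots contribute nothing.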

Heap memory implementation

The array space is requested dynamically with malloc when it is needed. The size is n times the size of the element type, where n is the number of elements required; n can be an input or any value computed while the program runs.

This is the method we use most often, and it works with any standard C compiler.

  • example
#include <stdlib.h>

void test(int n)
{
    int *arr = NULL;

    /* check */
    if(n <= 0)
    {
        return;
    }

    arr = (int *)malloc(sizeof(int)*n);
    if(NULL == arr)
    {
        return ;
    }

    /* todo  */
    for(int i=0; i < n; i++)
    {
        arr[i] = i;
    }

    /* release the heap memory once it is no longer needed */
    free(arr);
    return;
}
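
One practical difference from the stack version is that a heap array can outlive the function that created it. The sketch below illustrates this under that assumption; make_sequence and sequence_last are hypothetical names, and the caller is responsible for calling free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints on the heap, fills them with 0..n-1,
 * and hands ownership to the caller. Returns NULL if n <= 0 or malloc fails. */
int *make_sequence(int n)
{
    if (n <= 0)
        return NULL;

    int *arr = malloc((size_t)n * sizeof *arr);
    if (arr == NULL)
        return NULL;

    for (int i = 0; i < n; i++)
        arr[i] = i;
    return arr;   /* still valid after this function returns, unlike a VLA */
}

/* Hypothetical helper: reads the last element of an n-element array. */
int sequence_last(const int *arr, int n)
{
    assert(arr != NULL && n > 0);
    return arr[n - 1];
}
```

A caller would write something like int *a = make_sequence(4); use a[0] through a[3], and then call free(a).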

Access methods

Array elements are generally accessed either by pointer or by subscript, and a variable-length array is no different from an ordinary array in this respect. Why discuss access at all? Because a nasty pit is hidden here; read on.

In C, an array name used in an expression generally converts to a pointer to its first element, and a pointer can in turn be subscripted like an array.

Array subscript access

This is straightforward: the elements are laid out in order, so each one can be accessed by its position number.

Whether the array is defined as a VLA or allocated dynamically, its elements occupy contiguous memory, so both kinds can be accessed by subscript.

  • For an array, nothing changes: incrementing the subscript yields each element in turn;
  • A dynamically allocated array is held as a pointer to the start of the memory block, that is, a pointer to element 0; subscripting that pointer yields the corresponding element directly.
/* As in the example above, an array held through a pointer can also be accessed by subscript */
int *arr = (int *)malloc(sizeof(int) * n);
arr[i] = i;

Pointer access

When accessing through a pointer, each step moves the pointer by the size in bytes of the pointed-to type,
and the value is read or written by dereferencing the pointer.

Whether the array is defined as a VLA or allocated dynamically, its elements occupy contiguous memory, so both kinds can be accessed through pointers.

  • For an array, the array name converts to the address of the first element, and each +1 during traversal moves to the address of the next element;
  • A dynamically allocated array is a pointer to the start of the memory block, which is also the address of element 0;
int testarr[n];
int *arr = testarr;

for(int i = 0; i < n; i++,arr++)
{
    *arr = i;
}

Here an array is defined, its first address is assigned to a pointer, and the pointer is then used to access the array elements.

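Nested use of variable-length arrays

If a structure contains a variable-length array and such structures are nested, things get more complicated. How is the space allocated, and how are the elements accessed?

Definition

Inside a structure, the variable-length array takes the form of a C99 flexible array member: an array declared with empty brackets as the last member. Note that ISO C does not allow a structure with a flexible array member to be used as an array element; gcc accepts the declaration of memberData below as an extension. Suppose we define the following structures; the one we ultimately use is stGroupData.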

typedef struct Postion
{
    int x;
    int y;
}stPosition, *pstPostion;

typedef struct MemberData
{
    int posCnt;
    stPosition posArr[];
}stMemberData, *pstMemberData;

typedef struct GroupData
{
    int group_id;
    int memberCnt;
    stMemberData memberData[];
}stGroupData, *pstGroupData;

Curious about the sizes of the structures above? That is left as an exercise; readers who know the answer can post it in the comment area.

Allocating space

Because the arrays are structure members and are nested, the VLA feature cannot be used; the space can only be allocated dynamically.
When allocating, the elements of the outer structure and of the inner structure must be counted separately, and it is easy to miss some here.

Suppose we have a set of data that needs 2 memberData entries:

memberData 0: has 3 positions
memberData 1: has 1 position

Pit 1: Space occupied

How much space must be allocated?

  • At first glance, sizeof(stGroupData) may seem enough;
  • On second look, it seems to need sizeof(stGroupData) + 2*sizeof(stMemberData);

Both answers fall into the pit, because neither counts the positions. The correct size calculation follows.

Calculating the size

int size = 0;
pstGroupData pgData = NULL;

/* Compute the size to allocate, assuming 2 memberData entries:
 * memberData 0: has 3 positions
 * memberData 1: has 1 position
 */
size = sizeof(stGroupData) + 2*sizeof(stMemberData) + 4 * sizeof(stPosition);
pgData = (pstGroupData)malloc(size);

Start with the size of the structure header: a flexible array member has no declared length, so sizeof does not include it and the array part must be added separately. The outer stGroupData contains two stMemberData elements, and the inner stMemberData elements hold 3 and 1 positions respectively, 4 position elements in total. Adding these to the size of the outer structure gives the whole memory footprint.
The memory layout, assuming the block starts at address 0, is shown below.

[Figure: memory layout of the allocated stGroupData block]
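
The same calculation can be generalized to any number of members. The sketch below assumes the per-member position counts are available in a plain int array; group_alloc_size and its parameters are hypothetical names, and the nested memberData declaration relies on the gcc extension that the original definitions already use.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Postion    { int x; int y; } stPosition;
typedef struct MemberData { int posCnt; stPosition posArr[]; } stMemberData;
typedef struct GroupData  { int group_id; int memberCnt; stMemberData memberData[]; } stGroupData;

/* Hypothetical helper: total bytes needed for one group whose i-th member
 * holds posCnts[i] positions. */
size_t group_alloc_size(int memberCnt, const int *posCnts)
{
    assert(memberCnt >= 0);

    /* outer header plus one inner header per member */
    size_t size = sizeof(stGroupData) + (size_t)memberCnt * sizeof(stMemberData);

    /* plus every position of every member */
    for (int i = 0; i < memberCnt; i++) {
        assert(posCnts[i] >= 0);
        size += (size_t)posCnts[i] * sizeof(stPosition);
    }
    return size;
}
```

For the example in the text, passing the counts {3, 1} gives the same value as the hand-written expression above.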

Accessing the arrays

With the structures defined as in the example above, how do we access each array element?
Some readers may immediately think of subscripts, so let's look at that first.

Pit 2: Subscript access

Is it correct to use subscripts here?

pgData->memberData[0] 
pgData->memberData[1] 

The addresses of memberData[0] and memberData[1] differ by sizeof(stMemberData) = 4, the size of one element, which is just the int posCnt. So memberData[1] lands inside the positions of memberData[0], and on the memory layout diagram it looks like this:

[Figure: where memberData[1] lands in memory with subscript access]
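
The overlap can be checked by comparing addresses. This is a rough sketch with made-up function names; it depends on the gcc extension for the nested declaration and assumes a platform where sizeof(stMemberData) equals the offset of posArr, which is the usual case for these definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Postion    { int x; int y; } stPosition;
typedef struct MemberData { int posCnt; stPosition posArr[]; } stMemberData;
typedef struct GroupData  { int group_id; int memberCnt; stMemberData memberData[]; } stGroupData;

/* Hypothetical check: returns 1 if the subscript memberData[1] points at the
 * same byte as memberData[0].posArr[0], 0 if not, -1 if allocation fails. */
int subscript_lands_on_first_position(void)
{
    size_t size = sizeof(stGroupData) + 2 * sizeof(stMemberData) + 4 * sizeof(stPosition);
    stGroupData *g = calloc(1, size);
    if (g == NULL)
        return -1;

    char *by_subscript = (char *)&g->memberData[1];           /* header size only */
    char *first_pos    = (char *)&g->memberData[0].posArr[0]; /* data of member 0 */
    int same = (by_subscript == first_pos);

    free(g);
    return same;
}

/* Hypothetical helper: byte offset from memberData[0] to the real start of the
 * next member when the first one holds posCnt positions. */
size_t real_member_offset(int posCnt)
{
    assert(posCnt >= 0);
    return sizeof(stMemberData) + (size_t)posCnt * sizeof(stPosition);
}
```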

Accessing nested variable-length arrays

Subscript access is wrong here, because the pointer cannot be advanced by the default type size.
The elements can only be reached through pointers, and the offset of the next element has to be computed by hand.

pstMemberData pmData = NULL;

/* memberData[0] */
pmData = pgData->memberData;

/* memberData[1] */
pmData = (pstMemberData)((char*)(pgData->memberData) + sizeof(stMemberData) + 3 * sizeof(stPosition));
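
The hard-coded offset for memberData[1] can be turned into a loop that reads each posCnt and steps over the record it describes. What follows is one possible sketch, not the author's code: member_size, member_at and build_example_group are hypothetical names, and the nested declaration again relies on the gcc extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Postion    { int x; int y; } stPosition;
typedef struct MemberData { int posCnt; stPosition posArr[]; } stMemberData;
typedef struct GroupData  { int group_id; int memberCnt; stMemberData memberData[]; } stGroupData;

/* Hypothetical helper: bytes occupied by one member holding posCnt positions. */
size_t member_size(int posCnt)
{
    return sizeof(stMemberData) + (size_t)posCnt * sizeof(stPosition);
}

/* Hypothetical helper: find member idx by walking over the records before it.
 * The posCnt of every earlier member must already be filled in. */
stMemberData *member_at(stGroupData *g, int idx)
{
    assert(g != NULL && idx >= 0 && idx < g->memberCnt);
    char *p = (char *)g->memberData;
    for (int i = 0; i < idx; i++)
        p += member_size(((stMemberData *)p)->posCnt);
    return (stMemberData *)p;
}

/* Builds the example from the text: member 0 has 3 positions, member 1 has 1. */
stGroupData *build_example_group(void)
{
    stGroupData *g = calloc(1, sizeof(stGroupData) + member_size(3) + member_size(1));
    if (g == NULL)
        return NULL;

    g->group_id = 1;
    g->memberCnt = 2;
    member_at(g, 0)->posCnt = 3;      /* set before stepping past member 0 */

    stMemberData *m1 = member_at(g, 1);
    m1->posCnt = 1;
    m1->posArr[0].x = 7;              /* arbitrary sample values */
    m1->posArr[0].y = 9;
    return g;
}
```

Within a single member, posArr can still be indexed normally, because stPosition has a fixed size; only the step from one member to the next needs the manual offset.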

end

Thank you very much for your support. Don’t forget to leave your valuable comments while browsing. If you think it is worthy of encouragement, please like and bookmark, I will work harder!

Author email: [email protected]
If there are any mistakes or omissions, please point them out and learn from each other.

Note: Do not reprint without consent!

Origin blog.csdn.net/senllang/article/details/132223309