Data Structures and Algorithms: A Better Understanding of Arrays

Array

Arrays need little introduction; after all, they appear in every programming language.

The array is the most basic data structure. It looks simple, but fully grasping how it works under the hood is not so simple.


Straight to the point

An array is a linear list data structure that uses a contiguous block of memory to store a set of data of the same type.

This definition contains several key terms, and they capture the essence of the array. Let's look at each of them in turn.

The first is the linear list. As the name suggests, a linear list arranges data in a single line, like one row. Each element has at most two directions: forward and backward. Besides arrays, linked lists, queues, and stacks are also linear list structures.

For example, a skewer of candied fruit is very much like a linear list. The pieces of candied fruit (data) are strung on a straight bamboo stick, and each piece has at most two directions: forward and backward.

The second is contiguous memory space and data of the same type. Thanks to these two constraints, arrays have a very important property: random access. Accessing an element by its subscript has O(1) time complexity. But every advantage has its cost: the same two constraints mean that inserting or deleting data requires moving elements in order to keep the data contiguous.


Random access

How does an array achieve random access to an element by its subscript?

Take an int array of length 5, int a[5], as an example. When we define the array, the computer allocates a contiguous block of memory for int a[5].

Suppose the first address of the memory block for int a[5] is base_address = 100. Then:

  • The address of a[0] is 100 (the first address)
  • The address of a[1] is 104
  • The address of a[2] is 108
  • The address of a[3] is 112
  • The address of a[4] is 116

The computer accesses data stored in memory through its memory address. So when the computer needs random access to an element of the array, it first calculates the element's memory address with the following addressing formula, and then uses that address to access the data.

a[i]_address = base_address + i * data_type_size

Here a[i]_address is the memory address of the element at subscript i, and data_type_size is the storage size of the array's element type. The array int a[5] stores five values of type int, so in this example data_type_size is 4 bytes.
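
The addressing formula can be checked against a real array. The following is a minimal sketch, not part of the original article: elementAddress and formulaMatchesCompiler are hypothetical helper names, and the check assumes a platform with a flat address space where converting a pointer to std::uintptr_t yields its numeric address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: applies base_address + i * data_type_size.
std::uintptr_t elementAddress(std::uintptr_t base_address, std::size_t i, std::size_t data_type_size)
{
    assert(data_type_size > 0);
    return base_address + i * data_type_size;
}

// Compares the formula with the real address of a[i] for every subscript of an int[5].
// Assumption: pointer-to-integer conversion is linear on this platform.
bool formulaMatchesCompiler()
{
    int a[5] = {10, 20, 30, 40, 50};
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&a[0]);
    for (std::size_t i = 0; i < 5; i++)
    {
        std::uintptr_t actual = reinterpret_cast<std::uintptr_t>(&a[i]);
        if (elementAddress(base, i, sizeof(int)) != actual)
        {
            return false;
        }
    }
    return true;
}
```

With the numbers from the example, elementAddress(100, 3, 4) evaluates to 100 + 3 * 4 = 112.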

The addressing formula for a two-dimensional array, assuming its dimensions are m * n (m rows of n elements each), is:

a[i][j]_address = base_address + ( i * n + j ) * data_type_size
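
The two-dimensional formula can be sketched the same way. This is an illustrative example rather than the article's own code: elementAddress2D and formulaMatchesCompiler2D are hypothetical names, the 3 * 4 array size is arbitrary, and the check again assumes a flat address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: applies base_address + (i * n + j) * data_type_size,
// where n is the number of elements in each row.
std::uintptr_t elementAddress2D(std::uintptr_t base_address, std::size_t i, std::size_t j,
                                std::size_t n, std::size_t data_type_size)
{
    assert(j < n);
    return base_address + (i * n + j) * data_type_size;
}

// Compares the formula with the real address of a[i][j] in an int[3][4].
// Assumption: pointer-to-integer conversion is linear on this platform.
bool formulaMatchesCompiler2D()
{
    constexpr std::size_t m = 3;
    constexpr std::size_t n = 4;
    int a[m][n] = {};
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&a[0][0]);
    for (std::size_t i = 0; i < m; i++)
    {
        for (std::size_t j = 0; j < n; j++)
        {
            std::uintptr_t actual = reinterpret_cast<std::uintptr_t>(&a[i][j]);
            if (elementAddress2D(base, i, j, n, sizeof(int)) != actual)
            {
                return false;
            }
        }
    }
    return true;
}
```

For instance, with base_address = 100, n = 4, and 4-byte elements, a[1][2] lands at 100 + (1 * 4 + 2) * 4 = 124.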


Why do array subscripts start at 0?

To answer this question, suppose array subscripts started at 1, so that a[1] is stored at the first address of the array. The addressing formula would then become:

a[i]_address = base_address + (i - 1) * data_type_size

Comparing the addressing formulas for subscripts starting at 0 and starting at 1, it is easy to see that with subscripts starting at 1, every random access to an array element needs one extra subtraction, which for the CPU means one extra subtraction instruction.

Moreover, the array is a very basic data structure and is used very frequently, so its efficiency should be optimized as far as possible. Therefore, to save the CPU one subtraction instruction, array subscripts start at 0 rather than 1.
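
The two addressing formulas can be written side by side to make the extra subtraction visible. This is only an illustrative sketch with hypothetical function names; whether a real compiler actually emits an extra instruction depends on the target and on optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Subscripts start at 0: one multiplication and one addition.
std::uintptr_t addressZeroBased(std::uintptr_t base_address, std::size_t i, std::size_t data_type_size)
{
    return base_address + i * data_type_size;
}

// Subscripts start at 1: the same work plus the (i - 1) subtraction.
std::uintptr_t addressOneBased(std::uintptr_t base_address, std::size_t i, std::size_t data_type_size)
{
    assert(i >= 1); // subscript 0 does not exist in a 1-based scheme
    return base_address + (i - 1) * data_type_size;
}
```

Both schemes reach the same memory: the first element is addressZeroBased(100, 0, 4) in one and addressOneBased(100, 1, 4) in the other, and both evaluate to 100.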

The explanation above is from the perspective of the addressing formula. Of course, there are also historical reasons.


Array insertion and deletion

As noted in the definition above, an array must keep its data contiguous in memory, which makes insertion and deletion relatively inefficient. Next we look at why these two operations are inefficient and what can be done to improve them.

Insert operation

The time complexity of insertion varies slightly with the scenario and the insertion position. Below, the insert operation is analyzed for two scenarios: the array data is ordered, and the array data has no particular order.

In either scenario, inserting an element at the end of the array is simple: no data needs to be moved, and the element is placed directly at the end, so the time complexity is O(1).

What if the element is inserted at the beginning or in the middle of the array? Here the approach depends on the scenario.

If the array data is ordered (ascending or descending), inserting a new element at the k-th position requires moving the data from position k onward back by one position, so the worst-case time complexity is O(n).

If the array data has no particular order, then to insert a new element at the k-th position, first move the old data at the k-th position to the end of the array, and then place the new element directly in the k-th position. In this particular scenario, inserting an element at the k-th position has O(1) time complexity.
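
The unordered case can be sketched in a few lines. This is a minimal illustration, not the article's own code: insertUnordered is a hypothetical name, k is a 0-based position, and the caller is assumed to have reserved room for at least used + 1 elements.

```cpp
#include <cassert>
#include <cstddef>

// Inserts elem at position k of an unordered array in O(1):
// the old element at k is moved to the end, then elem takes its place.
// Assumes arr has capacity for at least used + 1 elements.
// Returns the new number of occupied slots.
std::size_t insertUnordered(int *arr, std::size_t used, std::size_t k, int elem)
{
    assert(k <= used);
    if (k < used)
    {
        arr[used] = arr[k]; // move the old k-th element to the end
    }
    arr[k] = elem;          // place the new element at position k
    return used + 1;
}
```

Only one element is moved regardless of the array size, which is why the cost stays constant. The trade-off is that the relative order of the existing elements changes, so the trick is not suitable for an ordered array.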

A picture is worth a thousand words. [Figure: inserting an element into an ordered array and into an unordered array]

Delete operation

Deletion is similar to insertion. If we delete the data at the k-th position, data must also be moved to keep memory contiguous; otherwise a hole appears in the middle and the data is no longer contiguous.

If the data at the end of the array is deleted, the time complexity is O(1). If the data at the beginning is deleted, all the data after it must move forward by one position, so the time complexity is O(n).
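
The difference between deleting at the end and deleting at the beginning can be made concrete by counting the moves. The sketch below is illustrative only: deleteAt is a hypothetical name, k is a 0-based position, and the explicit loop is used so the number of moved elements can be returned.

```cpp
#include <cassert>
#include <cstddef>

// Deletes the element at position k by shifting every later element
// forward by one slot, decrements *used, and returns how many elements moved.
std::size_t deleteAt(int *arr, std::size_t *used, std::size_t k)
{
    assert(k < *used);
    std::size_t moved = 0;
    for (std::size_t i = k; i + 1 < *used; i++)
    {
        arr[i] = arr[i + 1];
        moved++;
    }
    (*used)--;
    return moved;
}
```

Deleting the last element moves nothing, while deleting the first element of an array holding n elements moves n - 1 of them.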

A picture is worth a thousand words. [Figure: deleting an element from an array]


Code in practice: array insert, delete, and search

This example implements the insert, delete, and search operations for the scenario where the array data is ordered (ascending).

First, define a struct for the array with three members: the array length, the number of occupied slots, and a pointer to the array.

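#include <stdlib.h> // malloc, free
#include <string.h> // memmove
#include <iostream> // std::cout

struct Array_t
{
    int length; // array length (capacity)
    int used;   // number of occupied slots
    int *arr;   // pointer to the array storage
};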

Create the array:

Using the length set in the struct, allocate a contiguous block of memory for elements of the same type.

void alloc(struct Array_t *array)
{
    array->arr = (int *)malloc(array->length * sizeof(int));
}
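
Since malloc can fail and the memory should eventually be returned, a checked allocation paired with a release helper may be useful. The following is a hedged sketch rather than part of the original code: allocChecked and release are hypothetical names, and Array_t is repeated only to keep the snippet self-contained.

```cpp
#include <cassert>
#include <cstdlib>

struct Array_t
{
    int length; // array length (capacity)
    int used;   // number of occupied slots
    int *arr;   // pointer to the array storage
};

// Hypothetical variant of alloc that reports whether the allocation succeeded.
bool allocChecked(Array_t *array)
{
    assert(array != nullptr && array->length > 0);
    array->arr = static_cast<int *>(std::malloc(array->length * sizeof(int)));
    return array->arr != nullptr;
}

// Hypothetical counterpart that frees the storage and resets the struct.
void release(Array_t *array)
{
    std::free(array->arr); // free(nullptr) is a no-op, so this is safe to call twice
    array->arr = nullptr;
    array->used = 0;
}
```

A caller would check the return value of allocChecked before inserting anything and call release once the array is no longer needed.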

Insertion process:

  1. Check whether the number of occupied slots has reached the array length
  2. Traverse the array to find the index idx where the new element should be inserted
  3. If idx is not at the end, move the data from idx onward back by one position
  4. Insert the new element at index idx and increase the occupied count by 1

/*
 *  Insert a new element
 *  Parameter 1: pointer to the Array_t struct
 *  Parameter 2: value of the new element
 *  Returns: the index where the element was inserted on success, -1 on failure
 */
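int insertElem(struct Array_t *array, int elem)
{
    // When the occupied count reaches the array length, every slot already
    // holds data and nothing more can be inserted
    if (array->used >= array->length)
    {
        std::cout << "ERROR: array size is full, can't insert " << elem << " elem." << std::endl;
        return -1;
    }

    int idx = 0;

    // Traverse the array to find the index idx of the first element greater than elem
    for (idx = 0; idx < array->used; idx++)
    {
        // Stop at the first array element whose value is greater than elem
        if (array->arr[idx] > elem)
        {
            break;
        }
    }

    // If the insertion position is not at the end, the data from idx onward
    // must be shifted back by one position to free the slot at idx
    if (idx < array->used)
    {
        // Shift the data from idx onward back by one position
        memmove(&array->arr[idx + 1], &array->arr[idx], (array->used - idx) * sizeof(int));
    }

    // Insert the element
    array->arr[idx] = elem;
    // Increment the occupied count
    array->used++;

    // On success, return the index where the element was inserted
    return idx;
}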

Deletion process:

  1. Check whether the index to be deleted is valid
  2. Move the data after index idx forward by one position

/*
 *  Delete an element
 *  Parameter 1: pointer to the Array_t struct
 *  Parameter 2: array index of the element to delete
 *  Returns: 0 on success, -1 on failure
 */
int deleteElem(struct Array_t *array, int idx)
{
    // Check whether the index is valid
    if (idx < 0 || idx >= array->used)
    {
        std::cout << "ERROR:idx[" << idx << "] not in the range of arrays." << std::endl;
        return -1;
    }

    // Move the data after index idx forward by one position
    memmove(&array->arr[idx], &array->arr[idx + 1], (array->used - idx - 1) * sizeof(int));

    // Decrement the occupied count
    array->used--;

    return 0;
}

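Search for an element's index:

Traverse the array looking for the element value. If it is found, return its index; otherwise, print an error message.

/*
 *  Search for the index of an element
 *  Parameter 1: pointer to the Array_t struct
 *  Parameter 2: element value
 *  Returns: the element's index on success, -1 on failure
 */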
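int search(struct Array_t *array, int elem)
{
    int idx = 0;

    // Traverse the array
    for (idx = 0; idx < array->used; idx++)
    {
        // Found an array element equal to the searched value: return its index
        if (array->arr[idx] == elem)
        {
            return idx;
        }

        // If the array element is greater than elem, the value cannot appear later,
        // because the array in this example is sorted in ascending order; stop early
        if (array->arr[idx] > elem)
        {
            break;
        }
    }

    // Reaching this point means the element was not found; report the error
    std::cout << "ERROR: elem " << elem << " not found." << std::endl;

    return -1;
}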
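
Because the array in this example is kept in ascending order, the linear scan could be swapped for a binary search. This is a different technique from the one in the article, shown here only as a sketch: binarySearch is a hypothetical helper that works on a raw pointer and an element count instead of Array_t.

```cpp
#include <cassert>

// Hypothetical helper, not part of the original code: binary search over the
// first `used` elements of an ascending int array.
// Returns the index of elem, or -1 if it is not present.
int binarySearch(const int *arr, int used, int elem)
{
    assert(used >= 0);
    int low = 0;
    int high = used - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2; // avoids overflow of low + high
        if (arr[mid] == elem)
        {
            return mid;
        }
        if (arr[mid] < elem)
        {
            low = mid + 1;  // elem can only be in the upper half
        }
        else
        {
            high = mid - 1; // elem can only be in the lower half
        }
    }
    return -1;
}
```

Each step halves the remaining range, so the search takes O(log n) comparisons instead of O(n). It relies on random access, which is exactly the property that arrays provide.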

Print the array:

Output each element of the array.

void dump(struct Array_t *array)
{
    int idx = 0;

    for (idx = 0; idx < array->used; idx++)
    {
        std::cout << "INFO: array[" << idx << "] : " << array->arr[idx] << std::endl;
    }
}

main function:

Create an int array of length 3, then insert elements, search for elements, delete an element, and print the elements.

int main()
{
    struct Array_t array = {3, 0, NULL};

    int idx = 0;

    std::cout << "alloc array length: " << array.length << " size: " << array.length * sizeof(int) << std::endl;
    alloc(&array);
    if (!array.arr)
        return -1;

    std::cout << "insert 1 elem" << std::endl;
    insertElem(&array, 1);

    std::cout << "insert 0 elem" << std::endl;
    insertElem(&array, 0);

    std::cout << "insert 2 elem" << std::endl;
    insertElem(&array, 2);

    dump(&array);

    idx = search(&array, 1);
    std::cout << "1 elem  is at position " << idx << std::endl;

    idx = search(&array, 2);
    std::cout << "2 elem  is at position " << idx << std::endl;

    std::cout << "delect position [2] elem " << std::endl;
    deleteElem(&array, 2);

    dump(&array);

    // Release the memory obtained with malloc in alloc()
    free(array.arr);

    return 0;
}

Output:

[root@lincoding array]# ./array
alloc array length: 3 size: 12
insert 1 elem
insert 0 elem
insert 2 elem
INFO: array[0] : 0
INFO: array[1] : 1
INFO: array[2] : 2
1 elem  is at position 1
2 elem  is at position 2
delect position [2] elem
INFO: array[0] : 0
INFO: array[1] : 1

Summary

The array is the most basic and simplest data structure. It uses a contiguous block of memory to store a set of data of the same type. Its biggest feature is random access to elements with O(1) time complexity. The cost is that insertion and deletion are less efficient, with O(n) time complexity in the worst case.

Disclaimer: This article draws in part on the Geek Time course on data structures and algorithms.

Origin www.cnblogs.com/xiaolincoding/p/11564454.html