[Linux] Basic literacy learning for programmers

Preface

This article is my study log: I learn a little, write a little down, and stay ready to pack my bags at any time.

1. Memory management

1. Segmentation

A block of virtual address space as large as the memory the program requires is mapped onto a region of physical address space.
Problem: memory as a whole cannot be used efficiently and is easily wasted (physical memory is allocated to the whole program, even though the program does not use all of it).

2. Paging

Paging serves two purposes: 1. it solves the problem caused by segmentation, i.e. it lets memory be used efficiently; 2. it provides protection, since the attributes and permissions of each page can be set individually.
Memory is divided into pages of a fixed size, such as 1K or 4K (determined by the hardware), so the memory of a program can be managed at a finer granularity. Only the part of memory the program is actually using is backed by physical memory; the part that is not used yet gets no physical memory until it is first used.
For example, VP0 and VP1, which the process is currently using, are mapped to physical pages PP2 and PP0; VP2 and VP3 are held on disk; and the unused VP4, VP6 and VP7 are not allocated at all.

Nowadays the translation of virtual addresses to physical addresses is done by dedicated hardware, the MMU (memory management unit).
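The lookup that the MMU performs in hardware can be sketched in software. The following is only a rough model under stated assumptions: a 4K page size, a single-level table with 8 entries, and the mapping from the example (VP0 to PP2, VP1 to PP0, VP2 and VP3 on disk). The names PageTableEntry and translate are made up for this illustration and do not correspond to any real kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u   /* assumed page size (4K) */
#define NUM_VPAGES 8u      /* VP0..VP7, as in the example */

typedef enum { PAGE_UNALLOCATED, PAGE_ON_DISK, PAGE_IN_MEMORY } PageState;

typedef struct {
    PageState state;
    uint32_t  frame;   /* physical page number, meaningful only if PAGE_IN_MEMORY */
} PageTableEntry;

/* Hypothetical page table: VP0 -> PP2, VP1 -> PP0, VP2/VP3 on disk.
   The remaining entries are zero-initialized, i.e. PAGE_UNALLOCATED. */
static const PageTableEntry page_table[NUM_VPAGES] = {
    [0] = { PAGE_IN_MEMORY, 2 },
    [1] = { PAGE_IN_MEMORY, 0 },
    [2] = { PAGE_ON_DISK,   0 },
    [3] = { PAGE_ON_DISK,   0 },
};

/* Returns 0 and stores the physical address on success.
   Returns -1 when the page is not in memory (a real system would raise a page fault). */
int translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr / PAGE_SIZE;   /* virtual page number */
    uint32_t offset = vaddr % PAGE_SIZE;   /* offset inside the page */

    assert(paddr != NULL);
    if (vpn >= NUM_VPAGES || page_table[vpn].state != PAGE_IN_MEMORY)
        return -1;
    *paddr = page_table[vpn].frame * PAGE_SIZE + offset;
    return 0;
}
```

With this table, virtual address 0x0010 (VP0) translates to 0x2010 (PP2), while any address in VP2 or VP4 reports a fault, because the page is on disk or was never allocated.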

2. Thread management

3. Static library

1. Compile

The file a.c below is used to demonstrate each of the following steps.
It has been kept as short as possible to simplify the output.

#define PI (111)
int main()
{
    if(PI);
    return 0;
}

1.1. Preprocessing

gcc -E a.c -o a.i
pi@NanoPi-NEO2:~/project/test$ cat b.i
# 1 "b.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "b.c"

int main()
{
    
    
    if((111));
    return 0;
}

Preprocessing expands macros, inserts the contents of the included header files, removes comments, adds line markers, and keeps compiler directives such as #pragma –> xx.i
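Because macro expansion is pure text substitution, the parentheses in #define PI (111) matter. A small sketch of why: SUM_BAD and SUM_GOOD are hypothetical macros invented for this illustration, not part of the original a.c.

```c
#include <assert.h>

#define SUM_BAD  1 + 2      /* no parentheses: the text is pasted as-is */
#define SUM_GOOD (1 + 2)    /* parenthesized, like PI in a.c */

/* After preprocessing this is "return 1 + 2 * 3;" */
int triple_bad(void)
{
    return SUM_BAD * 3;
}

/* After preprocessing this is "return (1 + 2) * 3;" */
int triple_good(void)
{
    return SUM_GOOD * 3;
}

/* Records the difference between the two expansions. */
void check_expansion(void)
{
    assert(triple_bad() == 7);
    assert(triple_good() == 9);
}
```

Running gcc -E on this file shows the two return statements exactly as written in the comments, which is a quick way to confirm what the preprocessor produced.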

1.2. Compilation

gcc -S a.i -o a.s
pi@NanoPi-NEO2:~/project/test$ cat b.s 
        .arch armv8-a
        .file   "b.c"
        .text
        .align  2
        .global main
        .type   main, %function
main:
.LFB0:
        .cfi_startproc
        mov     w0, 0
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0"
        .section        .note.GNU-stack,"",@progbits

Compilation translates the preprocessed file into an assembly file –> xx.s

1.3. Assembly

gcc -c a.s -o a.o

The assembler generates machine code from the assembly file –> xx.o

1.4. Linking

gcc a.o -o a.out

Linking resolves the references (functions and variables) between the files.
Each file gets its own symbol table, which lists all the symbols (functions and variables) in that file so that other files can reference them.
–> xx.out

2. Compiler

3. Object file

.text

Code segment (the machine instructions of the program)

.data

Data segment for data initialized to a non-zero value (global variables and local static variables; it occupies actual space in the file)

.bss

Data segment for data that is uninitialized or initialized to 0 (uninitialized data defaults to 0, so no space needs to be allocated in the file at this stage; only the size and the symbol table entries are kept). For example:

#include"stdio.h"
#include"stdint.h"

uint16_t temp_1 = 222;
uint16_t temp_2;

int mian()  /* sic: the misspelling is kept because the objdump output below shows the symbol mian */
{
    static uint16_t temp_3 = 111;
    static uint16_t temp_4;

    temp_1 = temp_3++;
    temp_2 = temp_1++;

    printf("this is test:%d\r\n",temp_1);
    return 0;
}

Compile the above .c:

gcc -c main.c

Use objdump to view the information about the .o file generated by compiling the above code.

objdump -x -s -d main.o 

The output file is as follows:

pi@NanoPi-NEO2:~/project/test$ objdump -x -s -d main.o 

main.o:     file format elf64-littleaarch64
main.o
architecture: aarch64, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x0000000000000000
private flags = 0:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000088  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000004  0000000000000000  0000000000000000  000000c8  2**1
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000002  0000000000000000  0000000000000000  000000cc  2**1
                  ALLOC
  3 .rodata       00000012  0000000000000000  0000000000000000  000000d0  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      0000002b  0000000000000000  0000000000000000  000000e2  2**0
                  CONTENTS, READONLY
  5 .note.GNU-stack 00000000  0000000000000000  0000000000000000  0000010d  2**0
                  CONTENTS, READONLY
  6 .eh_frame     00000038  0000000000000000  0000000000000000  00000110  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 main.c
0000000000000000 l    d  .text  0000000000000000 .text
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 l    d  .bss   0000000000000000 .bss
0000000000000000 l    d  .rodata        0000000000000000 .rodata
0000000000000002 l     O .data  0000000000000002 temp_3.3838
0000000000000000 l     O .bss   0000000000000002 temp_4.3839
0000000000000000 l    d  .note.GNU-stack        0000000000000000 .note.GNU-stack
0000000000000000 l    d  .eh_frame      0000000000000000 .eh_frame
0000000000000000 l    d  .comment       0000000000000000 .comment
0000000000000000 g     O .data  0000000000000002 temp_1
0000000000000002       O *COM*  0000000000000002 temp_2
0000000000000000 g     F .text  0000000000000088 mian
0000000000000000         *UND*  0000000000000000 printf


Contents of section .text:
 0000 fd7bbfa9 fd030091 00000090 00000091  .{..............
 0010 00004079 01040011 223c0012 01000090  ..@y...."<......
 0020 21000091 22000079 01000090 21000091  !..."..y....!...
 0030 20000079 00000090 00000091 00004079   ..y..........@y
 0040 01040011 223c0012 01000090 21000091  ...."<......!...
 0050 22000079 01000090 210040f9 20000079  "..y....!.@. ..y
 0060 00000090 00000091 00004079 e103002a  ..........@y...*
 0070 00000090 00000091 00000094 00008052  ...............R
 0080 fd7bc1a8 c0035fd6                    .{...._.        
Contents of section .data:
 0000 de006f00                             ..o.            
Contents of section .rodata:
 0000 74686973 20697320 74657374 3a25640d  this is test:%d.
 0010 0a00                                 ..              
Contents of section .comment:
 0000 00474343 3a202855 62756e74 7520392e  .GCC: (Ubuntu 9.
 0010 332e302d 31377562 756e7475 317e3230  3.0-17ubuntu1~20
 0020 2e303429 20392e33 2e3000             .04) 9.3.0.     
Contents of section .eh_frame:
 0000 10000000 00000000 017a5200 04781e01  .........zR..x..
 0010 1b0c1f00 20000000 18000000 00000000  .... ...........
 0020 88000000 00410e10 9d029e01 60dedd0e  .....A......`...
 0030 00000000 00000000                    ........        

Disassembly of section .text:

0000000000000000 <mian>:
   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
   4:   910003fd        mov     x29, sp
   8:   90000000        adrp    x0, 0 <mian>
                        8: R_AARCH64_ADR_PREL_PG_HI21   .data+0x2
   c:   91000000        add     x0, x0, #0x0
                        c: R_AARCH64_ADD_ABS_LO12_NC    .data+0x2
  10:   79400000        ldrh    w0, [x0]
  14:   11000401        add     w1, w0, #0x1
  18:   12003c22        and     w2, w1, #0xffff
  1c:   90000001        adrp    x1, 0 <mian>
                        1c: R_AARCH64_ADR_PREL_PG_HI21  .data+0x2
  20:   91000021        add     x1, x1, #0x0
                        20: R_AARCH64_ADD_ABS_LO12_NC   .data+0x2
  24:   79000022        strh    w2, [x1]
  28:   90000001        adrp    x1, 0 <mian>
                        28: R_AARCH64_ADR_PREL_PG_HI21  temp_1
  2c:   91000021        add     x1, x1, #0x0
                        2c: R_AARCH64_ADD_ABS_LO12_NC   temp_1
  30:   79000020        strh    w0, [x1]
  34:   90000000        adrp    x0, 0 <mian>
                        34: R_AARCH64_ADR_PREL_PG_HI21  temp_1
  38:   91000000        add     x0, x0, #0x0
                        38: R_AARCH64_ADD_ABS_LO12_NC   temp_1
  3c:   79400000        ldrh    w0, [x0]
  40:   11000401        add     w1, w0, #0x1
  44:   12003c22        and     w2, w1, #0xffff
  48:   90000001        adrp    x1, 0 <mian>
                        48: R_AARCH64_ADR_PREL_PG_HI21  temp_1
  4c:   91000021        add     x1, x1, #0x0
                        4c: R_AARCH64_ADD_ABS_LO12_NC   temp_1
  50:   79000022        strh    w2, [x1]
  54:   90000001        adrp    x1, 2 <mian+0x2>
                        54: R_AARCH64_ADR_GOT_PAGE      temp_2
  58:   f9400021        ldr     x1, [x1]
                        58: R_AARCH64_LD64_GOT_LO12_NC  temp_2
  5c:   79000020        strh    w0, [x1]
  60:   90000000        adrp    x0, 0 <mian>
                        60: R_AARCH64_ADR_PREL_PG_HI21  temp_1
  64:   91000000        add     x0, x0, #0x0
                        64: R_AARCH64_ADD_ABS_LO12_NC   temp_1
  68:   79400000        ldrh    w0, [x0]
  6c:   2a0003e1        mov     w1, w0
  70:   90000000        adrp    x0, 0 <mian>
                        70: R_AARCH64_ADR_PREL_PG_HI21  .rodata
  74:   91000000        add     x0, x0, #0x0
                        74: R_AARCH64_ADD_ABS_LO12_NC   .rodata
  78:   94000000        bl      0 <printf>
                        78: R_AARCH64_CALL26    printf
  7c:   52800000        mov     w0, #0x0                        // #0
  80:   a8c17bfd        ldp     x29, x30, [sp], #16
  84:   d65f03c0        ret

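The .data section holds the data initialized to non-zero values: temp_1 (de 00, i.e. 222) and temp_3 (6f 00, i.e. 111). temp_2 and temp_4 occupy no space in the file: temp_4 is placed in .bss, and the uninitialized global temp_2 is recorded as a common symbol (*COM*), which the linker normally places in .bss as well.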
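To see where the bytes de006f00 in the .data dump come from, here is a minimal sketch that lays out the two 16-bit values by hand in little-endian order, with temp_1 at offset 0 and temp_3 at offset 2 as the symbol table shows. The helpers put_le16 and data_image_hex are names made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stores v as two bytes, least significant byte first (little-endian). */
void put_le16(uint16_t v, uint8_t out[2])
{
    out[0] = (uint8_t)(v & 0xffu);
    out[1] = (uint8_t)(v >> 8);
}

/* Builds the 4-byte .data image (temp_1 at offset 0, temp_3 at offset 2)
   and writes it as 8 hex digits plus a terminating NUL into text. */
void data_image_hex(uint16_t temp_1, uint16_t temp_3, char text[9])
{
    uint8_t image[4];

    assert(text != NULL);
    put_le16(temp_1, &image[0]);
    put_le16(temp_3, &image[2]);
    snprintf(text, 9, "%02x%02x%02x%02x",
             image[0], image[1], image[2], image[3]);
    assert(strlen(text) == 8);
}
```

Calling data_image_hex(222, 111, text) yields the string de006f00, the same four bytes objdump prints for the .data section.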

__attribute__

Writing __attribute__((section("name"))) before a function or variable places that function or variable in the section called name. The following example adds a .demo section:

#include"stdio.h"
#include"stdint.h"

__attribute__((section(".demo"))) uint8_t tttt;

int mian()
{
    printf("this is test\r\n");
    return 0;
}

pi@NanoPi-NEO2:~/project/test$ objdump -x -s -d main_1.o 

main_1.o:     file format elf64-littleaarch64
main_1.o
architecture: aarch64, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x0000000000000000
private flags = 0:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000020  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  0000000000000000  0000000000000000  00000060  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  0000000000000000  0000000000000000  00000060  2**0
                  ALLOC
  3 .demo         00000001  0000000000000000  0000000000000000  00000060  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  4 .rodata       0000000e  0000000000000000  0000000000000000  00000068  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .comment      0000002b  0000000000000000  0000000000000000  00000076  2**0
                  CONTENTS, READONLY
  6 .note.GNU-stack 00000000  0000000000000000  0000000000000000  000000a1  2**0
                  CONTENTS, READONLY
  7 .eh_frame     00000038  0000000000000000  0000000000000000  000000a8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
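One common use of custom sections is collecting scattered variables into a table. The sketch below goes beyond the original example and rests on several assumptions: GCC or Clang, an ELF target, and a GNU ld compatible linker, which provides __start_NAME and __stop_NAME symbols for sections whose name is a valid C identifier (so the name must not start with a dot). The section name demo_tab and all function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Places a variable in the hypothetical section "demo_tab" and
   marks it as used so that the compiler does not discard it. */
#define DEMO_ENTRY __attribute__((section("demo_tab"), used))

static int entry_a DEMO_ENTRY = 10;
static int entry_b DEMO_ENTRY = 20;
static int entry_c DEMO_ENTRY = 30;

/* Provided by the linker (assumption: GNU ld compatible behaviour). */
extern int __start_demo_tab[];
extern int __stop_demo_tab[];

/* Number of int entries that ended up in the section. */
int demo_count(void)
{
    assert(__stop_demo_tab >= __start_demo_tab);
    return (int)(__stop_demo_tab - __start_demo_tab);
}

/* Sum of all entries; the order inside the section does not matter here. */
int demo_sum(void)
{
    int sum = 0;
    for (int *p = __start_demo_tab; p < __stop_demo_tab; ++p)
        sum += *p;
    return sum;
}

/* Prints the table, e.g. for comparing against objdump -s -j demo_tab. */
void demo_print(void)
{
    for (int *p = __start_demo_tab; p < __stop_demo_tab; ++p)
        printf("entry: %d\n", *p);
}
```

After linking, objdump -h on the executable should list demo_tab as its own section, just like .demo above.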

3.1. Symbols

To view the symbol table of a file, first create a new file main_1.c:

#include"stdio.h"
#include"stdint.h"

__attribute__((section(".demo"))) uint8_t tttt;

uint8_t temp_1 = 0;
uint8_t temp_2 = 0;

void fun(uint8_t test)
{
    printf("this is test:%d\r\n",test);
}

int mian()
{
    temp_1 = temp_2++;
    fun(temp_1);
    return 0;
}

Compile it and view the symbol table:

pi@NanoPi-NEO2:~/project/test$ readelf -s  main_1.o

Symbol table '.symtab' contains 21 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main_1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    5 $d
     7: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    4 $d
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     9: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    6 $d
    10: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $x
    11: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
    12: 0000000000000014     0 NOTYPE  LOCAL  DEFAULT    9 $d
    13: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 
    14: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
    15: 0000000000000000     1 OBJECT  GLOBAL DEFAULT    5 tttt
    16: 0000000000000000     1 OBJECT  GLOBAL DEFAULT    4 temp_1
    17: 0000000000000001     1 OBJECT  GLOBAL DEFAULT    4 temp_2
    18: 0000000000000000    44 FUNC    GLOBAL DEFAULT    1 fun
    19: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    20: 000000000000002c    80 FUNC    GLOBAL DEFAULT    1 mian

The file above references two functions, fun and printf. fun is defined in this file, so its Ndx column holds a section index (1 here). printf is not defined in this file, so its Ndx is UND (undefined) and it has to be resolved at link time.
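The Bind and Ndx columns follow directly from how a name is declared. A small sketch, assuming GCC without optimization; all names are hypothetical, and the comments describe what readelf -s would typically show for the resulting object file.

```c
#include <assert.h>
#include <stdlib.h>

static int local_counter = 1;   /* static: LOCAL OBJECT symbol */
int shared_counter = 2;         /* non-static global: GLOBAL OBJECT symbol */

/* static function: LOCAL FUNC symbol, defined in this file */
static int bump_local(void)
{
    return ++local_counter;
}

/* non-static function: GLOBAL FUNC symbol, defined in this file */
int use_symbols(const char *text)
{
    assert(text != NULL);
    /* atoi is only declared here (by stdlib.h), so the object file
       lists it with Ndx UND until the linker resolves it */
    return atoi(text) + bump_local() + shared_counter;
}
```

Only the GLOBAL symbols can be referenced from other object files; the LOCAL ones are visible inside this file alone.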

3.2. Compatibility with C – extern "C"

extern "C" {
int func(int);
int var;
}

Why do this?
When code is compiled as C++, function and variable names are decorated (mangled); for example, fun::bar produces the symbol _ZN3fun3barE.
In C, a function fun produces the symbol _fun or fun, depending on the compiler.
With extern "C", the C++ compiler generates the symbols for everything inside the braces according to the C naming convention.
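In practice, a header shared between C and C++ usually wraps its declarations in a guard on the __cplusplus macro, because a C compiler does not understand extern "C". A minimal sketch, with demo_div as a hypothetical function; the header part and the implementation are shown in one block for brevity.

```c
#include <assert.h>

/* A C++ compiler defines __cplusplus and therefore sees the extern "C" braces;
   a C compiler skips them and sees only the plain declaration. */
#ifdef __cplusplus
extern "C" {
#endif

int demo_div(int dividend, int divisor);

#ifdef __cplusplus
}
#endif

/* Implementation; in both languages the symbol is named demo_div. */
int demo_div(int dividend, int divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

The same file compiles with gcc and with g++, and readelf -s shows the unmangled name demo_div in both cases.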

/*d.cpp*/
#include<stdio.h>

#define PI (111)

extern "C" {
    void func() { ; }
}

static void func(float) { ; }
void ffunc(int) { ; }

int main()
{
    if(PI);
    return 0;
}


Compile it and view the symbol table:

pi@NanoPi-NEO2:~/project/test$ g++ -c d.cpp
pi@NanoPi-NEO2:~/project/test$ readelf -s d.o

Symbol table '.symtab' contains 14 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS d.cpp
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     5: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 $x
     6: 0000000000000008    20 FUNC    LOCAL  DEFAULT    1 _ZL4funcf
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     8: 0000000000000014     0 NOTYPE  LOCAL  DEFAULT    6 $d
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
    11: 0000000000000000     8 FUNC    GLOBAL DEFAULT    1 func
    12: 000000000000001c    20 FUNC    GLOBAL DEFAULT    1 _Z5ffunci
    13: 0000000000000030     8 FUNC    GLOBAL DEFAULT    1 main

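func, which is defined inside extern "C", does not use the C++ naming scheme, whereas ffunc is named using C++ symbol mangling (and the static overload func(float) becomes _ZL4funcf).
_Z5ffunci breaks down as follows:
_Z: fixed prefix
5: the function name has 5 characters
ffunc: the function name itself
i: one parameter of type int (f: float, v: void...)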
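The breakdown above can be turned into a tiny sketch that rebuilds such a symbol name. It is deliberately incomplete: it assumes a free function outside any namespace whose parameters are all builtin types with one-letter codes, so it is not a real mangler. The function mangle_simple is a name made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "_Z" + <length of name> + <name> + <parameter codes> into out.
   Returns the length of the result, or -1 if out is too small. */
int mangle_simple(const char *name, const char *param_codes,
                  char *out, size_t out_size)
{
    int n;

    assert(name != NULL && param_codes != NULL && out != NULL);
    n = snprintf(out, out_size, "_Z%zu%s%s", strlen(name), name, param_codes);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

For the name ffunc with parameter code i, this produces _Z5ffunci, matching the readelf output. The c++filt tool performs the reverse direction for real symbols.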

4. Linking – ld

As described above, each file gets its own symbol table. Given two files such as a.o and b.o, how are they linked together to generate an executable file?
There are two ways to combine them:
1. Concatenate the files one after another in order, so that the output contains a repeated .text, .data, etc. for every input file.
2. Merge all sections with the same attributes, i.e. all .text together and all .data together (the mainstream solution).
Example:

/* main_1.c */
#include <stdio.h>
#include <stdint.h>

void fun(uint8_t test)
{
    ;
}

int test()
{
    fun(1);
    return 0;
}

/* main.c */
#include <stdio.h>
#include <stdint.h>

int test();  /* defined in main_1.c */

int main()
{
    test();

    return 0;
}

gcc -c main.c main_1.c
ld main.o main_1.o -e main -o ab

-e main indicates that the main function is used as the program entry point; the default entry point of the ld linker is _start.
-o ab indicates that the linked output file is named ab; the default is a.out.

In the ab file generated by this command, the sections with the same attributes from the input files are merged together.

pi@NanoPi-NEO2:~/project/test$ objdump -h main.o 

main.o:     file format elf64-littleaarch64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000018  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  0000000000000000  0000000000000000  00000058  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  0000000000000000  0000000000000000  00000058  2**0
                  ALLOC
pi@NanoPi-NEO2:~/project/test$ objdump -h main_1.o 

main_1.o:     file format elf64-littleaarch64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000030  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  0000000000000000  0000000000000000  00000070  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000000  0000000000000000  0000000000000000  00000070  2**0
                  ALLOC
pi@NanoPi-NEO2:~/project/test$ objdump -h ab

ab:     file format elf64-littleaarch64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         000000c4  0000000000400120  0000000000400120  00000120  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .eh_frame     00000060  00000000004001e8  00000000004001e8  000001e8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .got          00000010  0000000000410fd8  0000000000410fd8  00000fd8  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  3 .got.plt      00000018  0000000000410fe8  0000000000410fe8  00000fe8  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  4 .data         00000004  0000000000411000  0000000000411000  00001000  2**1
                  CONTENTS, ALLOC, LOAD, DATA
  5 .bss          0000000c  0000000000411004  0000000000411004  00001004  2**1
                  ALLOC
  6 .comment      0000002a  0000000000000000  0000000000000000  00001004  2**0
                  CONTENTS, READONLY

View the generated file symbol table:

pi@NanoPi-NEO2:~/project/test$ readelf -s ab 

Symbol table '.symtab' contains 20 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000004000b0     0 SECTION LOCAL  DEFAULT    1 
     2: 00000000004000f8     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main.c
     5: 00000000004000b0     0 NOTYPE  LOCAL  DEFAULT    1 $x
     6: 000000000040010c     0 NOTYPE  LOCAL  DEFAULT    2 $d
     7: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS main_1.c
     8: 00000000004000c8     0 NOTYPE  LOCAL  DEFAULT    1 $x
     9: 0000000000400130     0 NOTYPE  LOCAL  DEFAULT    2 $d
    10: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 _bss_end__
    11: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 __bss_start__
    12: 00000000004000c8    20 FUNC    GLOBAL DEFAULT    1 fun
    13: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 __bss_end__
    14: 00000000004000dc    28 FUNC    GLOBAL DEFAULT    1 test
    15: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 __bss_start
    16: 00000000004000b0    24 FUNC    GLOBAL DEFAULT    1 main
    17: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 __end__
    18: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 _edata
    19: 0000000000410fe8     0 NOTYPE  GLOBAL DEFAULT    2 _end
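The addresses in this symbol table are consistent with simply placing the two .text sections back to back: main.o contributes 0x18 bytes and main_1.o 0x30 bytes, both aligned to 4 bytes. A minimal sketch of that layout step, assuming the base address 0x4000b0 read off the table; InputSection and layout_sections are hypothetical names, and a real linker does considerably more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *file;   /* input object file name */
    uint64_t    size;   /* size of its .text section in bytes */
    uint64_t    align;  /* required alignment, a power of two */
} InputSection;

/* Rounds addr up to the next multiple of align. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Places the input sections one after another starting at base.
   Stores the start address of each in out_addr and returns the end address. */
uint64_t layout_sections(const InputSection *in, size_t count,
                         uint64_t base, uint64_t *out_addr)
{
    uint64_t cursor = base;

    assert(in != NULL && out_addr != NULL);
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, in[i].align);
        out_addr[i] = cursor;
        cursor += in[i].size;
    }
    return cursor;
}
```

With main.o (0x18 bytes) followed by main_1.o (0x30 bytes), the sketch puts main.o's code at 0x4000b0 and main_1.o's code at 0x4000c8, the addresses of main and fun in the table, and ends at 0x4000f8, where the next section starts.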

Each function corresponds to a unique address. A symbol that cannot be found while compiling a single file is temporarily given a placeholder address, and the real address is only filled in at link time.
For example: before linking, main.c cannot find the test function, so the address of test in the call instruction is set to 0; after linking, the file contains the real address.
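Filling in the address means rewriting the instruction word itself. On AArch64 a call is a bl instruction whose low 26 bits hold the distance to the target divided by 4, which is why the unlinked object file shows 94000000 with an R_AARCH64_CALL26 relocation next to it. A minimal sketch of that fix-up follows; patch_call26 is a hypothetical helper, and the address of the bl instruction used in the note below (0x4000b8, i.e. main plus 8) is an assumption, not a value taken from the output.

```c
#include <assert.h>
#include <stdint.h>

/* Applies an R_AARCH64_CALL26 style fix-up to a BL instruction word.
   sym_addr: final address of the called symbol
   addend:   relocation addend (0 for a plain call)
   place:    final address of the BL instruction itself */
uint32_t patch_call26(uint32_t insn, uint64_t sym_addr,
                      int64_t addend, uint64_t place)
{
    int64_t delta = (int64_t)(sym_addr - place) + addend;

    assert(delta % 4 == 0);                      /* instructions are 4 bytes */
    assert(delta >= -(INT64_C(1) << 27) &&
           delta <   (INT64_C(1) << 27));        /* BL reaches +/-128 MB */

    /* keep the opcode bits, replace the 26-bit immediate */
    return (insn & 0xfc000000u) | ((uint32_t)(delta / 4) & 0x03ffffffu);
}
```

With test at 0x4000dc and the call assumed to sit at 0x4000b8, the placeholder 0x94000000 becomes 0x94000009, a branch 9 instructions forward.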

Origin blog.csdn.net/qq_37280428/article/details/131270495