Underlying principles: understanding computer systems in depth #1: it all starts with "hello world" (part 1)

 

A computer system consists of hardware and system software that work together to run application programs. Although the specific implementations of systems change over time, the underlying concepts do not.

All computer systems have similar hardware and software components that perform similar functions. This series summarizes how those components work and how they affect our programs.

Along the way we will pick up some tricks for optimizing C code so that it takes full advantage of modern processors and memory systems. You will learn how the compiler implements procedure calls, and how to use that knowledge to avoid the security vulnerabilities caused by buffer overflows.

So what does a system actually do? We begin with a simple beginner's program, hello world. What is going on in there?

 

 


    #include <stdio.h>

    int main()
    {
        printf("hello,world\n");
        return 0;
    }


 

From here on we will refer to the helloworld program as hw. (Huawei?)

The life cycle of hw begins as a source program (source file): a text file that the programmer creates with an editor such as Dev-C++, VC6, or Visual Studio. It is that pile of code you type so fluently.

The file is named hw.c (we are using C as our example language, after all).

A source program is in fact a sequence of bits, each with a value of 0 or 1, organized in groups of eight called bytes. Each byte represents one text character in the program.

Most modern computers represent text characters using the ASCII standard (American Standard Code for Information Interchange), which represents each character with a unique byte-sized integer value.

For example, here is the ASCII representation of the first line of hw.c (SP stands for the space character):


    #    i    n    c    l    u    d    e   SP    <    s    t    d    i    o    .    h    >
   35  105  110   99  108  117  100  101   32   60  115  116  100  105  111   46  104   62

The remaining lines follow the same pattern, so we will not go through them.

Have you spotted the pattern yet? hw.c is stored in a file as a sequence of bytes. Each byte has an integer value. We said earlier that 8 bits make one byte, so each of these values can be stored as an eight-digit binary number.

Note: every line of the source ends with an invisible newline character \n, whose ASCII value is 10. Files such as hw.c that consist exclusively of ASCII characters are called text files; all other files are called binary files.

The representation of hw.c illustrates a fundamental idea: all information in a system (files on disk, video, data sent over a network) is represented as a string of bits. The only thing that distinguishes one data object from another is the context in which we read it. In different contexts, the same sequence of bytes might represent an integer, a floating-point number, a character string, or a machine instruction. It is like a pork chop: put it between two buns and it is a hamburger; tuck it into a flatbread and it is a roujiamo. It all depends on the context.

As programmers, we need to understand machine representations of numbers, because they are not the same as actual integers and real numbers. They are finite approximations of the true values, a point we will return to later.

Programs are translated by other programs into different forms

The life cycle of hw begins as a high-level C program because that form can be read by humans. However, in order to run hw.c on the system, each C statement must be translated by other programs into a sequence of low-level machine-language instructions. As an analogy, imagine that people from every era lived at the same time, so that we shared the world with people of the imperial dynasties, tribal warriors, and primitive ape-men. We cannot talk to the ape-men directly. Instead we tell the dynasty people what we want to say, they translate it for the tribal people, and the tribal people finally explain our meaning to the ape-men.

Once the C statements have been translated into machine language, the instructions are packaged in a form called an executable object program and stored as a binary disk file of the kind described earlier. Object programs are also called executable object files.

On a Unix system, the translation from source file to object file is performed by a compiler driver:

linux> gcc -o hw hw.c

Here, the GCC compiler driver reads the source file hw.c and translates it into an executable object file hw. The translation is carried out in four phases: preprocessing, compilation, assembly, and linking.

We will look at each phase in turn.

Preprocessing phase: the preprocessor (cpp) modifies the original C program according to the directives that begin with the # character. For example, the #include <stdio.h> directive in hw.c tells the preprocessor to read the contents of the system header file stdio.h and insert it directly into the program text. The result is another, much longer C program, typically stored in a file with the .i suffix.

 

Compilation phase: the compiler (cc1) translates the text file hw.i into the text file hw.s, which contains an assembly-language program. This program includes the following definition of the function main:

1  main:
2    subq    $8, %rsp
3    movl    $.LC0, %edi
4    call    puts
5    movl    $0, %eax
6    addq    $8, %rsp
7    ret

Each of lines 2 to 7 in this definition describes one low-level machine-language instruction in textual form. We will go into the details later when we study assembly language.

 

Assembly phase: next, the assembler (as) translates hw.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hw.o. hw.o is a binary file; it contains the 17 bytes that encode the instructions of the function main. If you opened it in a text editor, all you would see is gibberish.

 

Linking phase: notice that our source code calls the printf function, which is part of the standard C library provided with every C compiler. The printf function lives in a separate precompiled object file called printf.o, which must be merged with our hw.o program. The linker (ld) is responsible for this merging. The result is the hw file (no extension), an executable object file that is ready to be loaded into memory and run.

 

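How does understanding how the compilation system works help us?

For simple programs like helloworld, we can rely on the compilation system to produce correct and efficient machine code. But many slightly more complex programs raise questions during compilation that take some thought.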

 

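Optimizing program performance: to make our code more efficient, we need some understanding of machine code and of how the compiler translates C statements into it. Consider a few small questions you may never have thought about. Is a switch statement as efficient as an if-else chain? Which is faster, and which uses fewer system resources? Do while and for loops execute the same instructions internally? Why does simply rearranging the parentheses in an arithmetic expression sometimes make a program run faster?

Understanding errors: everyone is used to compiler errors by now, and we all copy the error message into Baidu or Google to look for a fix. But with a deeper understanding of the compiler and the underlying system, error messages make far more sense and a few blind spots disappear. For example, what does it mean when the linker reports that it cannot resolve a reference? And why do some programs compile without errors, yet when you give your girlfriend the 520 love-confession program you wrote, the first thing she sees on opening it is a classic Windows error dialog?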

 

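Avoiding security vulnerabilities: those of you studying penetration testing will understand this best. Buffer overflows, along with many vulnerabilities such as MS17 EternalBlue and the MS14 ones, are all problems studied at the low level. We could of course just install a few patches through 360. But understanding them in depth will benefit you greatly.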

 

Our helloworld program is currently stuck at the point where it has just turned into a jumble of binary. The next installment goes a level deeper.

 


Origin www.cnblogs.com/tanee/p/12380822.html