LINUX CGI

  • Why do you need CGI programming? 

In HTML, when the user fills in a form and presses the submit button, the content of the form is sent to the server. A server-side program is generally needed to process that content: to save it, to run a query based on it, or to do something else with it. Without CGI, the Web loses its interactivity entirely: all information flows one way, with no feedback.
Some people think that JavaScript can be used instead of CGI programs. This is a conceptual error. JavaScript, as used in web pages, runs in the client's browser, while CGI works on the server. Their jobs overlap in places, such as form data validation, but JavaScript can never replace CGI. That said, if a job can be done with either JavaScript or CGI, JavaScript should be preferred: it runs on the client and avoids a round trip to the server, which gives it an inherent speed advantage. CGI should be reserved for problems that cannot be solved on the client side, such as interacting with a remote database.
Simply put, CGI (Common Gateway Interface) is an interface for communication between HTML forms and server-side programs. Calling it an interface means that CGI is not a language but a set of conventions that programs in any language can follow. In theory, you can write CGI programs in any programming language, as long as the program conforms to the CGI specification. C is highly portable (almost every platform has a C compiler) and is familiar to most programmers (more so than Perl), so it is one of the preferred languages for CGI programming. This article shows how to write CGI programs in C.
The simplest example of CGI programming is form processing, so this article concentrates on writing CGI programs in C that process forms. To do so you need to understand the CGI environment variables, the similarities and differences between the POST and GET methods, and HTML and URL encoding. With the GET method, the CGI program receives the encoded form data in the environment variable QUERY_STRING. With the POST method, the CGI program receives the encoded form data on stdin. The server does not send an EOF at the end of the data; instead, use the environment variable CONTENT_LENGTH to determine how much data to read from stdin.

  • GET form processing 

For forms that use the attribute METHOD="GET" (or that have no METHOD attribute, since GET is the default), CGI specifies that when the form is submitted, the server stores the form data in an environment variable called QUERY_STRING. Processing such a form is relatively simple: just read that environment variable. How this is done varies by language. In C, the library function getenv (declared in the standard header stdlib.h) returns the value of an environment variable as a string. Once you have the string, you can convert its fields to the types you need, for example with sscanf. The standard output of a CGI program (the stdout stream in C) is also redirected: instead of appearing on the server, whatever is written to it is sent to the client's browser. So if a C CGI program writes an HTML document to stdout, that document is displayed in the client's browser. This is a basic principle of CGI programs.
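The type conversion mentioned above can be sketched with strtol instead of sscanf. The helper name get_long_param is hypothetical and not part of any CGI library; the sketch assumes the value is a plain decimal integer that needs no URL decoding.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: look up "name=value" in a query string and convert
   the value to long. Returns 1 on success, 0 if the field is missing or
   its value is not a plain decimal integer. */
int get_long_param(const char *query, const char *name, long *out)
{
    size_t namelen = strlen(name);
    const char *p = query;

    while (p != NULL && *p != '\0') {
        /* p always points at the start of a field here */
        if (strncmp(p, name, namelen) == 0 && p[namelen] == '=') {
            const char *value = p + namelen + 1;
            char *end;
            long v;

            errno = 0;
            v = strtol(value, &end, 10);
            /* Reject empty values, overflow, and trailing junk such as "12abc" */
            if (end == value || errno == ERANGE || (*end != '\0' && *end != '&'))
                return 0;
            *out = v;
            return 1;
        }
        p = strchr(p, '&');   /* move to the next field, if any */
        if (p != NULL)
            p++;
    }
    return 0;
}
```

For a form with a numeric field named m (an assumed field name), a call could look like get_long_param(getenv("QUERY_STRING"), "m", &m). Because the helper searches by name, the order of the fields in the query string does not matter.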
  Let's look at a concrete implementation. The following is an HTML form:

<FORM ACTION="/cgi-bin/mult.cgi">
<P>Please enter the multiplier and the multiplicand below, then press OK to see the result.
<INPUT NAME="m" SIZE="5">
<INPUT NAME="n" SIZE="5"><BR>
<INPUT TYPE="SUBMIT" VALUE="OK">
</FORM>


  The task is very simple: multiply the two values entered in the form and output the result. This could be done with JavaScript alone, but a small multiplication keeps the example as simple and easy to follow as possible.
  The following CGI program processes this form; it is the program named by the ACTION attribute of the FORM tag.

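#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *data;
    long m, n;

    /* Header line, CR LF (13, 10), then "\n" for the blank line that ends the headers */
    printf("%s%c%c\n", "Content-Type:text/html;charset=gb2312", 13, 10);
    printf("<TITLE>Multiplication result</TITLE>\n");
    printf("<H3>Multiplication result</H3>\n");
    data = getenv("QUERY_STRING");
    if (data == NULL)
        printf("<P>Error! No data was entered, or there was a problem passing the data.");
    else if (sscanf(data, "m=%ld&n=%ld", &m, &n) != 2)
        printf("<P>Error! Invalid input. The form fields must contain numbers.");
    else
        printf("<P>The product of %ld and %ld is %ld.", m, n, m * n);
    return 0;
}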


  We will not dwell on the C syntax; let's look at what makes this a CGI program.
  As mentioned earlier, whatever is written to standard output is what the browser displays. The first output line is required and is specific to CGI programs: printf("%s%c%c\n","Content-Type:text/html",13,10). It serves as the HTTP header of the response. A CGI program can output not only HTML text but also images, sound, and other kinds of data, so this line tells the browser how to handle the content it receives. The Content-Type line must be followed by a blank line, which marks the end of the headers and is equally indispensable. Because every CGI program starts its output in much the same way, you can define a function for it to save programming time. This is a common technique in CGI programming.
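Such a header function could look like the following minimal sketch. The name print_content_type is hypothetical; the function writes the header line and the blank line with explicit CR LF pairs, and takes the output stream as a parameter so it can be tried out on something other than stdout.

```c
#include <stdio.h>

/* Hypothetical helper: write the Content-Type header for the given media
   type, followed by the blank line that separates headers from the body. */
void print_content_type(FILE *out, const char *media_type)
{
    fprintf(out, "Content-Type: %s\r\n\r\n", media_type);
}
```

A CGI program would call it once, before any other output, for example print_content_type(stdout, "text/html;charset=gb2312").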
  The program then calls the library function getenv to get the content of QUERY_STRING and uses sscanf to extract the value of each parameter. Note how sscanf is used: its return value is the number of fields successfully converted, so anything other than 2 means the input was invalid. Beyond that, the program is no different from an ordinary C program.
  After compiling the program, name the executable mult.cgi and place it in the /cgi-bin/ directory, where the form can invoke it. That completes a CGI program that handles a GET form.

  • POST form processing

Now consider the other way of submitting a form: POST. Suppose the task is to append a piece of text that the user enters in a form to the end of a text file on the server. This can be seen as the prototype of a message board program. Clearly this cannot be done with client-side scripts such as JavaScript, so it is a genuine job for a CGI program.
  This problem looks very similar to the previous one, just with a different form and a different program, but there are real differences. In the example above, the GET request is a "pure query": it does not affect any state, so the same data can be submitted any number of times without causing problems (apart from a small load on the server). The new task is different: at the very least it changes the content of a file, so it is state-changing. This is one of the differences between POST and GET. In addition, GET limits the length of the form data (it travels as part of the URL), while POST does not. This is the main reason for choosing POST for this task. On the other hand, GET is generally somewhat faster to process than POST.
CGI specifies that the content of a POST form is sent to the standard input of the CGI program (stdin in C), and that its length is placed in the environment variable CONTENT_LENGTH. So what we have to do is read CONTENT_LENGTH bytes from standard input. Reading data from standard input sounds easier than reading it from an environment variable, but it is not; there are some details to watch, as the program below shows. One point needs special attention. An ordinary program gets an EOF indication after reading everything in a file stream, but a CGI program processing a form cannot count on EOF ever appearing. So never read more than CONTENT_LENGTH bytes; what happens if you do is not defined by the CGI specification and generally varies from server to server.
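The reading step on its own can be sketched as follows. The helper name read_post_body is hypothetical; it takes the CONTENT_LENGTH string and the input stream as parameters, and it uses fread rather than fgets so that it asks for exactly the announced number of bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read exactly the number of bytes announced in
   lenstr (the CONTENT_LENGTH value) from in and NUL-terminate them.
   Returns the byte count, or -1 if the length is missing, malformed,
   too large for buf, or the stream delivers fewer bytes than announced. */
long read_post_body(FILE *in, const char *lenstr, char *buf, size_t size)
{
    char *end;
    long len;

    if (lenstr == NULL || size == 0)
        return -1;
    len = strtol(lenstr, &end, 10);
    if (end == lenstr || *end != '\0' || len < 0 || (size_t)len >= size)
        return -1;
    /* Ask for len bytes and no more: the server does not signal EOF */
    if (fread(buf, 1, (size_t)len, in) != (size_t)len)
        return -1;
    buf[len] = '\0';
    return len;
}
```

In a CGI program the call would be along the lines of read_post_body(stdin, getenv("CONTENT_LENGTH"), input, sizeof input), where input is a caller-supplied buffer.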
  Let's look at how a CGI program collects the data from a POST form. Here is a relatively simple C source file:

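#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 80
/* 4 bytes for the field name "data", 1 byte for "=" */
#define EXTRA 5
/* 1 byte for the added line break, 1 byte for the terminating NUL */
#define MAXINPUT (MAXLEN + EXTRA + 2)
/* the file the data is appended to */
#define DATAFILE "../data/data.txt"

/* Decode the URL-encoded bytes in [src, last) into dest, then append
   a line break and the terminating NUL. */
void unencode(char *src, char *last, char *dest)
{
    /* "<" rather than "!=": a "%" near the end moves src past last */
    for (; src < last; src++, dest++)
        if (*src == '+')
            *dest = ' ';
        else if (*src == '%') {
            unsigned int code;
            if (sscanf(src + 1, "%2x", &code) != 1) code = '?';
            *dest = (char)code;
            src += 2;
        }
        else
            *dest = *src;
    *dest = '\n';
    *++dest = '\0';
}

int main(void)
{
    char *lenstr;
    char input[MAXINPUT], data[MAXINPUT];
    long len;

    printf("%s%c%c\n",
           "Content-Type:text/html;charset=gb2312", 13, 10);
    printf("<TITLE>Response</TITLE>\n");
    lenstr = getenv("CONTENT_LENGTH");
    /* The body is "data=" plus the encoded message, so allow MAXLEN + EXTRA bytes */
    if (lenstr == NULL || sscanf(lenstr, "%ld", &len) != 1
        || len < EXTRA || len > MAXLEN + EXTRA
        || fgets(input, (int)len + 1, stdin) == NULL)
        printf("<P>Form submission error.");
    else {
        FILE *f;
        /* Decode only what was actually read */
        unencode(input + EXTRA, input + strlen(input), data);
        f = fopen(DATAFILE, "a");
        if (f == NULL)
            printf("<P>Sorry, an unexpected error occurred; your data could not be saved.");
        else {
            fputs(data, f);
            fclose(f);
            printf("<P>Thank you, your data has been saved:<BR>%s", data);
        }
    }
    return 0;
}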


  In essence, the program first obtains the length of the data in bytes from the CONTENT_LENGTH environment variable and then reads a string of that length. Because the data is encoded for transmission, it must be decoded accordingly. The encoding rules are simple; the main ones are:
  1. Each form field is represented by its name, an equals sign, and its value, and the fields are joined by &;
  2. Every space is replaced by a plus sign, so a literal space never appears in the encoded data;
  3. Special characters, such as punctuation marks and characters with a reserved meaning like "+", are represented by a percent sign followed by the character's two-digit hexadecimal ASCII code.
 
  For example, if the user enters:
  Hello there!
  the data is encoded for transmission to the server as data=Hello+there%21. The unencode() function above decodes it. After decoding, the data is appended to the end of the data.txt file and echoed back to the browser.
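To see the encoding rules from the other side, here is a minimal sketch of the encoding that the browser performs on a field value. The function name url_encode is hypothetical. It keeps letters, digits, and the characters - _ . unchanged and percent-encodes everything else except the space; that set is a conservative assumption, and real browsers leave a slightly different set of characters unencoded.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: URL-encode src into dest.
   Returns 0 on success, -1 if dest (destsize bytes) is too small. */
int url_encode(const char *src, char *dest, size_t destsize)
{
    size_t used = 0;

    if (destsize == 0)
        return -1;
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (isalnum(c) || c == '-' || c == '_' || c == '.') {
            if (used + 1 >= destsize) return -1;
            dest[used++] = (char)c;          /* kept as-is */
        } else if (c == ' ') {
            if (used + 1 >= destsize) return -1;
            dest[used++] = '+';              /* space becomes a plus sign */
        } else {
            if (used + 3 >= destsize) return -1;
            /* percent sign followed by two hexadecimal digits */
            sprintf(dest + used, "%%%02X", (unsigned int)c);
            used += 3;
        }
    }
    dest[used] = '\0';
    return 0;
}
```

Encoding the value "Hello there!" this way gives "Hello+there%21"; the browser then prefixes the field name and the equals sign to form data=Hello+there%21.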
  After compiling the program, name the executable collect.cgi and place it in the CGI directory so the form can invoke it. The corresponding form is given below:
<FORM ACTION="/cgi-bin/collect.cgi" METHOD="POST">
<P>Please enter your message (maximum 80 characters):<BR><INPUT NAME="data" SIZE="60" MAXLENGTH="80"><BR>
<INPUT TYPE="SUBMIT" VALUE="OK">
</FORM>

In fact, this program is only an example and is not fit for real use. It ignores a critical problem: when several users write to the file at the same time, their data can get mixed up or lost, and for a program like this, simultaneous writes are quite likely. A production message board therefore needs extra care, such as guarding the file with a semaphore or a lock file. Because that is purely a matter of programming technique, it is not covered here.
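As one possible illustration, the sketch below serializes writers with an advisory lock taken through flock(), which is available on Linux and the BSDs. This is a different mechanism from the semaphore or lock file mentioned above, and the helper name append_locked is hypothetical. The lock only helps if every program that writes the file takes it as well.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>   /* flock() */
#include <unistd.h>

/* Hypothetical helper: append text to path while holding an exclusive
   advisory lock, so that concurrent CGI processes take turns.
   Returns 0 on success, -1 on failure. */
int append_locked(const char *path, const char *text)
{
    size_t len = strlen(text);
    int ok;
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {    /* blocks until the lock is free */
        close(fd);
        return -1;
    }
    /* A short write is treated as a failure in this sketch */
    ok = (write(fd, text, len) == (ssize_t)len);
    flock(fd, LOCK_UN);
    close(fd);
    return ok ? 0 : -1;
}
```

In the message board program, a call such as append_locked(DATAFILE, data) could take the place of the fopen, fputs, and fclose sequence.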
  Finally, let's write a CGI program for viewing the data.txt file. It only needs to copy the file's content to stdout:
 

#include <stdio.h>
#include <stdlib.h>
#define DATAFILE "../data/data.txt"

int main(void)
{
    FILE *f = fopen(DATAFILE, "r");
    int ch;

    if (f == NULL) {
        printf("%s%c%c\n",
               "Content-Type:text/html;charset=gb2312", 13, 10);
        printf("<TITLE>Error</TITLE>\n");
        printf("<P><EM>Unexpected error: cannot open the file</EM>");
    }
    else {
        /* Send the file as plain text, not HTML */
        printf("%s%c%c\n",
               "Content-Type:text/plain", 13, 10);
        while ((ch = getc(f)) != EOF)
            putchar(ch);
        fclose(f);
    }
    return 0;
}


  The only thing to note about this program is that it does not wrap data.txt in HTML before sending it but outputs it directly as plain text. All that takes is using the type text/plain instead of text/html in the output header; the browser chooses how to handle the content according to the Content-Type.
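If the file were wrapped in HTML instead, the characters that HTML treats as markup would first have to be replaced by character references, which is part of why plain text is the simpler choice here. A minimal sketch of that extra step, with the hypothetical name copy_html_escaped:

```c
#include <stdio.h>

/* Hypothetical helper: copy in to out, replacing the characters that
   HTML treats as markup with character references. */
void copy_html_escaped(FILE *in, FILE *out)
{
    int ch;

    while ((ch = getc(in)) != EOF) {
        switch (ch) {
        case '<': fputs("&lt;", out);  break;
        case '>': fputs("&gt;", out);  break;
        case '&': fputs("&amp;", out); break;
        default:  putc(ch, out);       break;
        }
    }
}
```

With the text/plain approach used in the program above, none of this is needed: the browser shows the bytes as they are.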
  Invoking this program is also simple. Since there is no data to enter, a single button is enough:
  <FORM ACTION="/cgi-bin/viewdata.cgi"> 
  <P><INPUT TYPE="SUBMIT" VALUE="Check">
  </FORM> 
  That concludes the basic principles of writing CGI programs in C. Of course, these basics alone are not enough to write a good CGI program; that requires further study of the CGI specification and of other techniques specific to CGI programming.
The purpose of this article is to convey the concept of CGI programming. In practice, today's mainstream server-side scripting languages, such as ASP, PHP, and JSP, offer most of what CGI programming does and are much easier to use than CGI programming in any language. So for server-side programming these scripting languages are generally considered first; CGI is used only when they cannot solve the problem, for example when lower-level programming is required.

Origin blog.csdn.net/weixin_38293850/article/details/106845779