C language files, streams, keyboard input, redirection

A file is an area of memory in which information is stored. Normally, a file is kept in some sort of permanent memory, such as a hard disk, USB flash drive, or optical disc, such as a DVD.

C has many library functions for opening, reading, writing, and closing files. On one level, it can deal with files by using the basic file tools of the host operating system. This is called low-level I/O. Because of the many differences among computer systems, it is impossible to create a standard library of universal low-level I/O functions, and ANSI C does not attempt to do so; however, C also deals with files on a second level called the standard I/O package. This involves creating a standard model and a standard set of I/O functions for dealing with files. At this higher level, differences between systems are handled by specific C implementations so that you deal with a uniform interface.

Different systems, for example, store files differently. Some store the file contents in one place and information about the file elsewhere. Some build a description of the file into the file itself. In dealing with text, some systems use a single newline character to mark the end of a line. Others might use the combination of the carriage return and linefeed characters to represent the end of a line. Some systems measure file sizes to the nearest byte; some measure in blocks of bytes.

When you use the standard I/O package, you are shielded from these differences. Therefore, to check for a newline, you can use if (ch == '\n'). If the system actually uses the carriage-return/linefeed combination, the I/O functions automatically translate back and forth between the two representations.

Conceptually, the C program deals with a stream instead of directly with a file. A stream is an idealized flow of data to which the actual input or output is mapped. That means various kinds of input with differing properties are represented by streams with more uniform properties. The process of opening a file then becomes one of associating a stream with the file, and reading and writing take place via the stream.

C treats input and output devices the same as it treats regular files on storage devices. In particular, the keyboard and the display device are treated as files opened automatically by every C program. Keyboard input is represented by a stream called stdin, and output to the screen (or teletype or other output device) is represented by a stream called stdout. The getchar(), putchar(), printf(), and scanf() functions are all members of the standard I/O package, and they deal with these two streams.

You can use the same techniques with keyboard input as you do with files. For example, a program reading a file needs a way to detect the end of the file so that it knows where to stop reading. Therefore, C input functions come equipped with a built-in end-of-file detector. Because keyboard input is treated like a file, you should be able to use that end-of-file detector to terminate keyboard input, too. Let’s see how this is done, beginning with files.

The End of File

One method to detect the end of a file is to place a special character in the file to mark the end. This is the method once used, for example, in CP/M, IBM-DOS, and MS-DOS text files. Today, these operating systems may use an embedded Ctrl+Z character to mark the ends of files. At one time, this was the sole means these operating systems used, but there are other options now, such as keeping track of the file size. So a modern text file may or may not have an embedded Ctrl+Z, but if it does, the operating system will treat it as an end-of-file marker.

A second approach is for the operating system to store information on the size of the file. If a file has 3000 bytes and a program has read 3000 bytes, the program has reached the end. MS-DOS and its relatives use this approach for binary files because this method allows the files to hold all characters, including Ctrl+Z. Newer versions of DOS also use this approach for text files. Unix uses this approach for all files.

C handles this variety of methods by having the getchar() function return a special value when the end of a file is reached, regardless of how the operating system actually detects the end of file. The name given to this value is EOF (end of file). Therefore, the return value for getchar() when it detects an end of file is EOF. The scanf() function also returns EOF on detecting the end of a file. Typically, EOF is defined in the stdio.h file as follows:

#define EOF (-1)

getchar() returns a value in the range 0 through 127, because those are values corresponding to the standard character set, but it might return values from 0 through 255 if the system recognizes an extended character set. In either case, the value -1 does not correspond to any character, so it can be used to signal the end of a file.

Some systems may define EOF to be a value other than -1, but the definition is always different from a return value produced by a legitimate input character. If you include the stdio.h file and use the EOF symbol, you don’t have to worry about the numeric definition. The important point is that EOF represents a value that signals the end of a file was detected; it is not a symbol actually found in the file.

Okay, how can you use EOF in a program? Compare the return value of getchar() with EOF. If they are different, you have not yet reached the end of a file. In other words, you can use an expression like this:

while ((ch = getchar()) != EOF)

What if you are reading keyboard input and not a file? Most systems (but not all) have a way to simulate an end-of-file condition from the keyboard.

Program example:

The following program reads from the keyboard and prints what it reads.

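#include <stdio.h>

int main(void)
{
    int ch;    /* int, not char, so that EOF can be stored */

    /* echo each character until end-of-file is detected */
    while ((ch = getchar()) != EOF)
    {
        putchar(ch);
    }

    return 0;
}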

result:

ok
ok
hello
hello
^Z

Here Ctrl+Z, typed at the beginning of a line, simulates end-of-file and ends keyboard input. (On Unix and Linux, the equivalent keystroke is Ctrl+D.)

The concept of simulated EOF arose in a command-line environment using a text interface. In such an environment, the user interacts with a program through keystrokes, and the operating system generates the EOF signal. Some practices don’t translate particularly well to graphical interfaces, such as Windows and the Macintosh, with more complex user interfaces that incorporate mouse movement and button clicks. The program behavior on encountering a simulated EOF depends on the compiler and project type.

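Note that ch must be declared as type int, not char. The getchar() function returns an int so that it can report EOF (typically -1) in addition to every valid character value. If ch were an unsigned char, it could never compare equal to EOF; if it were a signed char, a legitimate character could be mistaken for EOF.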

The fact that getchar() is type int is why some compilers warn of possible data loss if you assign the getchar() return value to a type char variable.

Redirection

Input and output involve functions, data, and devices.

By default, a C program using the standard I/O package looks to standard input as its source of input. This is the stdin stream; by default, the input device is the keyboard, and the input data stream consists of characters.

The program does not care where the data comes from. Devices are represented by a more abstract and unified concept, the stream. I/O functions interact with streams rather than with a particular hardware device or a particular file.

Programs can thus take input from magnetic tape, punched cards or teletypes, or from files.

Programs can use files in two ways. The first is to explicitly use dedicated functions that open files, close files, read files, write files, and so on. The second is to redirect the program's input or output so that it comes from a file or goes to a file.

One major problem with redirection is that it is associated with the operating system, not C. However, many C environments, including Unix, Linux, and the Windows Command Prompt mode, feature redirection, and some C implementations simulate it on systems lacking the feature.

program:

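#include <stdio.h>

int main(void)
{
    int ch;    /* int, not char, so that EOF can be stored */

    /* echo each character until end-of-file is detected */
    while ((ch = getchar()) != EOF)
    {
        putchar(ch);
    }

    return 0;
}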

Program name: echo_eof.c

Compile the program to get an executable file: echo_eof.exe

To run the program, enter echo_eof in command-line mode:

echo_eof

input redirection

Now suppose you want to redirect the program's input to a file named words. First create a text file named words with this content:

hello world!

Note that the words file has no file extension.

To redirect input from the words file, enter this command:

echo_eof < words

The < symbol is a Unix and Linux and DOS/Windows redirection operator. It causes the words file to be associated with the stdin stream, channeling the file contents into the echo_eof program. The echo_eof program itself doesn’t know (or care) that the input is coming from a file instead of the keyboard. All it knows is that a stream of characters is being fed to it, so it reads them and prints them one character at a time until the end of file shows up. Because C puts files and I/O devices on the same footing, the file is now the I/O device.

With Unix, Linux, and Windows Command Prompt, the spaces on either side of the < are optional.

output redirection

You can also redirect the output of echo_eof to a file, so that the program reads input from the keyboard and saves it in the file:

echo_eof > mywords

Ctrl+Z ends the input only when it is typed at the beginning of a line.

The > is a second redirection operator. It causes a new file called mywords to be created for your use, and then it redirects the output of echo_eof (that is, a copy of the characters you type) to that file. The redirection reassigns stdout from the display device (your screen) to the mywords file instead. If you already have a file with the name mywords, normally it would be erased and then replaced by the new one. (Many operating systems, however, give you the option of protecting existing files by making them read-only.) All that appears on your screen are the letters as you type them, and the copies go to the file instead. To end the program, press Ctrl+D (Unix) or Ctrl+Z (DOS) at the beginning of a line.

combined redirection

Now suppose you want to make a copy of the file mywords and call it savewords. Just issue this next command:

echo_eof < mywords > savewords

and the deed is done. The following command would have worked as well, because the order of redirection operations doesn’t matter:

echo_eof > savewords < mywords

Beware: Don’t use the same file for both input and output to the same command.

echo_eof < mywords > mywords  // wrong

The reason is that > mywords causes the original mywords to be truncated to zero length before it is ever used as input.

In brief, here are the rules governing the use of the two redirection operators ( < and > ) with Unix, Linux, or Windows/DOS:

■ A redirection operator connects an executable program (including standard operating system commands) with a data file. It cannot be used to connect one data file to another, nor can it be used to connect one program to another program.

■ Input cannot be taken from more than one file, nor can output be directed to more than one file by using these operators.

■ Normally, spaces between the names and operators are optional, except occasionally when some characters with special meaning to the Unix shell or Linux shell or the Windows Command Prompt mode are used.

Unix, Linux, and Windows/DOS also feature the >> operator, which enables you to add data to the end of an existing file, and the pipe operator ( | ), which enables you to connect the output of one program to the input of a second program.

Origin blog.csdn.net/chengkai730/article/details/132214848