Chapter 0 looked closely at a tiny program, which we used to introduce surprisingly many fundamental C++ ideas: comments, standard headers, scopes, namespaces, expressions, statements, string literals, and output.

This chapter continues our overview of the fundamentals by writing similarly simple programs that use character strings. In the process, we'll learn about declarations, variables, and initialization, as well as something about input and the C++ string library. The programs in this chapter are so simple that they do not even require any control structures, which we will cover in Chapter 2.

1.1 Input

Once we can write text, the next step is to read it. For example, we can modify the Hello, world! program to say hello to a specific person:

// ask for a person's name, and greet the person
#include <iostream>
#include <string>
int main()
{
    // ask for the person's name    
    std::cout << "Please enter your first name: ";
    // read the name
    std::string name; // define name
    std::cin >> name; // read into name
    // write a greeting
    std::cout << "Hello, " << name << "!" << std::endl;
    return 0;
}

When we execute this program, it will write

Please enter your first name:

on the standard output. If we respond, for example,

Vladimir

then the program will write

Hello, Vladimir!

Let's look at what's going on. In order to read input, we must have a place to put it. Such a place is called a variable. A variable is an object that has a name. An object, in turn, is a part of the computer's memory that has a type. The distinction between objects and variables is important because, as we'll see in §3.2.2, §4.2.3, and §10.6.1, it is possible to have objects that do not have names.

If we wish to use a variable, we must tell the implementation what name to give it and what type we want it to have. The requirement to supply both a name and a type makes it easier for the implementation to generate efficient machine code for our programs. The requirement also lets the compiler detect misspelled variable names, unless the misspelling happens to match one of the names that our program said it intended to use.

In this example, our variable is named name, and its type is std::string. As we saw in §0.5 and §0.7, the use of std:: implies that the name, string, that follows it is part of the standard library, not part of the core language or of a nonstandard library. As with every part of the standard library, std::string has an associated header, namely <string>, so we've added an appropriate #include directive to our program.

The first statement,

std::cout << "Please enter your first name: ";

should be familiar by now: It writes a message that asks for the user's name. An important part of this statement is what isn't there, namely the std::endl manipulator. Because we did not use std::endl, the output does not begin a new line after the program has written its message. Instead, as soon as it has written the prompt, the computer waits, on the same line, for input.

The next statement,

std::string name;   // define name

is a definition, which defines our variable named name that has type std::string. Because this definition appears within a function body, name is a local variable, which exists only while the part of the program within the braces is executing. As soon as the computer reaches the }, it destroys the variable name, and returns any memory that the variable occupied to the system for other uses. The limited lifetime of local variables is one reason that it is important to distinguish between variables and other objects.

Implicit in the type of an object is its interface: the collection of operations that are possible on an object of that type. By defining name as a variable (a named object) of type string, we are implicitly saying that we want to be able to do with name whatever the library says that we can do with strings.

One of those operations is to initialize the string. Defining a string variable implicitly initializes it, because the standard library says that every string object starts out with a value. We shall see shortly that we can supply a value of our own when we create a string. If we do not do so, then the string starts out containing no characters at all. We call such a string an empty or null string.

Once we have defined name, we execute

std::cin >> name;        // read into name

which is a statement that reads from std::cin into name. Analogous with its use of the << operator and std::cout for output, the library uses the >> operator and std::cin for input. In this example, >> reads a string from the standard input and stores what it read in the object named name. When we ask the library to read a string, it begins by discarding whitespace characters (such as space, tab, or the end of the line) from the input, then reads characters into name until it encounters another whitespace character or end-of-file. Therefore, the result of executing std::cin >> name is to read a word from the standard input, storing in name the characters that constitute the word.

The input operation has another side effect: It causes our prompt, which asks for the user's name, to appear on the computer's output device. In general, the input-output library saves its output in an internal data structure called a buffer, which is used to optimize output operations. Most systems take a significant amount of time to write characters to an output device, regardless of how many characters there are to write. To avoid the overhead of writing in response to each output request, the library uses the buffer to accumulate the characters to be written, and flushes the buffer, by writing its contents to the output device, only when necessary. By doing so, it can combine several output operations into a single write.

There are three events that cause the system to flush the buffer. First, the buffer might be full, in which case the library will flush it automatically. Second, the library might be asked to read from the standard input stream. In that case, the library immediately flushes the output buffer without waiting for the buffer to become full. The third occasion for flushing the buffer is when we explicitly say to do so.

When our program writes its prompt to cout, that output goes into the buffer associated with the standard output stream. Next, we attempt to read from cin. This read flushes the cout buffer, so we are assured that our user will see the prompt.

Our next statement, which generates the output, explicitly instructs the library to flush the buffer. That statement is only slightly more complicated than the one that wrote the prompt. Here we write the string literal "Hello, " followed by the value of the string variable name, and finally by std::endl. Writing the value of std::endl ends the line of output, and then flushes the buffer, which forces the system to write to the output stream immediately.

Flushing output buffers at opportune moments is an important habit when you are writing programs that might take a long time to run. Otherwise, some of the program's output might languish in the system's buffers for a long time between when your program writes it and when you see it.

1.2 Framing a name

So far, our program has been restrained in its greetings. We'd like to change that by writing a more elaborate greeting, so that the input and output look like this:

Please enter your first name: Estragon
********************
*                  *
* Hello, Estragon! *
*                  *
********************

Our program will produce five lines of output. The first line begins the frame. It is a sequence of * characters as long as the greeting, plus a space and an * at each end. The second line will be an appropriate number of spaces with an * at each end. The third line is an *, a space, the greeting, a space, and an *. The last two lines will be the same as the second and first lines, respectively.

A sensible strategy is to build up the output a piece at a time. First we'll read the name, then we'll use it to construct the greeting, and then we'll use the greeting to build each line of the output. Here is a program that uses that strategy to solve our problem:

// ask for a person's name, and generate a framed greeting
#include <iostream>
#include <string>
int main()
{
    std::cout << "Please enter your first name: ";
    std::string name;
    std::cin >> name;

    // build the message that we intend to write
    const std::string greeting = "Hello, " + name + "!";
 
    // build the second and fourth lines of the output
    const std::string spaces(greeting.size(), ' ');
    const std::string second = "* " + spaces + " *";

    // build the first and fifth lines of the output
    const std::string first(second.size(), '*');

    // write it all
    std::cout << std::endl;
    std::cout << first << std::endl;
    std::cout << second << std::endl;
    std::cout << "* " << greeting << " *" << std::endl;
    std::cout << second << std::endl;
    std::cout << first << std::endl;
    return 0;
}

First, our program asks for the user's name, and reads that name into a variable named name. Then, it defines a variable named greeting that contains the message that it intends to write. Next, it defines a variable named spaces, which contains as many spaces as the number of characters in greeting. It uses the spaces variable to define a variable named second, which will contain the second line of the output, and then the program constructs first as a variable that contains as many * characters as the number of characters in second. Finally, it writes the output, a line at a time.

The #include directives and the first three statements in this program should be familiar. The definition of greeting, on the other hand, introduces three new ideas.

One idea is that we can give a variable a value as we define it. We do so by placing, between the variable's name and the semicolon that follows it, an = symbol followed by the value that we wish the variable to have. If the variable and value have different types (as §10.2 shows that strings and string literals do), the implementation will convert the initial value to the type of the variable.

The second new idea is that we can use + to concatenate a string and a string literal or, for that matter, two strings (but not two string literals). We noted in passing in Chapter 0 that 3 + 4 is 7. Here we have an example in which + means something completely different. In each case, we can determine what the + operator does by examining the types of its operands. When an operator has different meanings for operands of different types, we say that the operator is overloaded.

The third idea is that of saying const as part of a variable's definition. Doing so promises that we are not going to change the value of the variable for the rest of its lifetime. Strictly speaking, this program gains nothing by using const. However, pointing out which variables will not change can make a program much easier to understand.

Note that if we say that a variable is const, we must initialize it then and there, because we won't have the opportunity later. Note also that the value that we use to initialize the const variable need not itself be a constant. In this example, we won't know the value of greeting until after we have read a value into name, which obviously can't happen until we run the program. For this reason, we cannot say that name is const, because we change its value by reading into it.

One property of an operator that never changes is its associativity. We learned in Chapter 0 that << is left-associative, so that std::cout << s << t means the same as (std::cout << s) << t. Similarly, the + operator (and, for that matter, the >> operator) is also left-associative. Accordingly, the value of "Hello, " + name + "!" is the result of concatenating "Hello, " with name, and concatenating the result of that concatenation with "!". So, for example, if the variable name contains Estragon, then the value of "Hello, " + name + "!" is Hello, Estragon!

At this point, we have figured out what we are going to say, and saved that information in the variable named greeting. Our next job is to build the frame that will enclose our greeting. In order to do so, we introduce three more ideas in a single statement:

std::string spaces(greeting.size(), ' ');

When we defined greeting, we used an = symbol to initialize it. Here, we are following spaces by two expressions, which are separated by a comma and enclosed in parentheses. When we use the = symbol, we are saying explicitly what value we would like the variable to have. By using parentheses in a definition, as we do here, we tell the implementation to construct the variable (in this case, spaces) from the expressions, in a way that depends on the type of the variable. In other words, in order to understand this definition, we must understand what it means to construct a string from two expressions.

How a variable is constructed depends entirely on its type. In this particular case, we are constructing a string from, well, from what? Both expressions are of forms that we haven't seen before. What do they mean?

The first expression, greeting.size(), is an example of calling a member function. In effect, the object named greeting has a component named size, which turns out to be a function, and which we can therefore call to obtain a value. The variable greeting has type std::string, which is defined so that evaluating greeting.size() yields an integer that represents the number of characters in greeting.

The second expression, ' ', is a character literal. Character literals are completely distinct from string literals. A character literal is always enclosed in single quotes; a string literal is always enclosed in double quotes. The type of a character literal is the built-in type char; the type of a string literal is much more complicated, and we shall not explain it until §10.2. A character literal represents a single character. The characters that have special meaning inside a string literal have the same special meaning in a character literal. Thus, if we want ' or \, we must precede it by \. For that matter, '\n', '\t', '\"', and related forms work analogously to the way we saw in Chapter 0 that they work for string literals.

To complete our understanding of spaces, we need to know that when we construct a string from an integer value and a char value, the result has as many copies of the char value as the value of the integer. So, for example, if we were to define

std::string stars(10, '*');

then stars.size() would be 10, and stars itself would contain **********.

Thus, spaces contains the same number of characters as greeting, but all of those characters are blanks.

Understanding the definition of second requires no new knowledge: We concatenate "* ", our string of spaces, and " *" to obtain the second line of our framed message. The definition of first requires no new knowledge either; it gives first a value that contains as many * characters as the number of characters in second.

The rest of the program should be familiar; all it does is write strings in the same way we did in §1.1.

1.3 Details

Types:

char:

Built-in type that holds ordinary characters as defined by the implementation.

wchar_t:

Built-in type intended to hold "wide characters", which are big enough to hold characters for languages such as Chinese.

The string type:

The string type is defined in the standard header <string>. An object of type string contains a sequence of zero or more characters. If n is an integer, c is a char, is is an input stream, and os is an output stream, then the string operations include:

std::string s;

Defines s as a variable of type std::string that is initially empty.

std::string t = s;

Defines t as a variable of type std::string that initially contains a copy of the characters in s, where s can be either a string or a string literal.

std::string z(n, c);

Defines z as a variable of type std::string that initially contains n copies of the character c. Here, c must be a char, not a string or a string literal.

os << s

Writes the characters contained in s, without any formatting changes, on the output stream denoted by os. The result of the expression is os.

is >> s

Reads and discards characters from the stream denoted by is until encountering a character that is not whitespace. Then reads successive characters from is into s, overwriting whatever value s might have had, until the next character read would be whitespace. The result is is.

s + t

The result of this expression is a std::string that contains a copy of the characters in s followed by a copy of the characters in t. Either s or t, but not both, may be a string literal or a value of type char.

s.size()

The number of characters in s.

Variables:

Variables can be defined in one of three ways:

std::string hello = "Hello";
// define the variable with an explicit initial value


std::string stars(100, '*');
// construct the variable according to its type and the given expressions


std::string name;
// define the variable with an implicit initialization, which depends on its type

Variables defined inside a pair of curly braces are local variables, which exist only while executing the part of the program within the braces. When the implementation reaches the }, it destroys the variables, and returns any memory that they occupied to the system. Defining a variable as const promises that the variable's value will not change during its lifetime. Such a variable must be initialized as part of its definition, because there is no way to do so later.

Input:

Executing std::cin >> v discards any whitespace characters in the standard input stream, then reads from the standard input into variable v. It returns std::cin, which has type istream, in order to allow chained input operations.

Accelerated C++, Chapter 1: Working with Strings