From the following passage on the Oracle website, I understand that the int variable returned by inputStream.read() holds a character value in its last 16 bits.
So does it always waste 2 bytes?
CopyCharacters is very similar to CopyBytes. The most important difference is that CopyCharacters uses FileReader and FileWriter for input and output in place of FileInputStream and FileOutputStream. Notice that both CopyBytes and CopyCharacters use an int variable to read to and write from. However, in CopyCharacters, the int variable holds a character value in its last 16 bits; in CopyBytes, the int variable holds a byte value in its last 8 bits.
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public class CopyCharacters {
        public static void main(String[] args) throws IOException {
            FileReader inputStream = null;
            FileWriter outputStream = null;

            try {
                inputStream = new FileReader("xanadu.txt");
                outputStream = new FileWriter("characteroutput.txt");

                int c;
                while ((c = inputStream.read()) != -1) {
                    outputStream.write(c);
                }
            } finally {
                if (inputStream != null) {
                    inputStream.close();
                }
                if (outputStream != null) {
                    outputStream.close();
                }
            }
        }
    }
So does it always waste 2 bytes?
Ermm ... yes. Either 2 bytes in the Reader case or 3 bytes in the InputStream case.
This wastage is necessary for the following reasons:
- Both InputStream.read() and Reader.read() need to return a value to represent the "end of stream". As the javadocs say:
    InputStream.read(): Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned.
    Reader.read(): Returns the character read, as an integer in the range 0 to 65535 (0x00-0xffff), or -1 if the end of the stream has been reached.
  The extra end-of-stream value means that the return type of read() cannot be (respectively) byte or char. (See also the last reason ...)
- It turns out that the "wasted" 2 or 3 bytes are of no consequence. Even a trivial Java program is going to use megabytes of memory. (Indeed, even a trivial C program is going to use tens or hundreds of kilobytes of memory ... if you account for the library code that it uses.)
- Returning a byte or char probably wouldn't save memory anyway. On typical modern systems, local variables (even byte and char) are stored word-aligned on the stack. This is done because accessing memory at a word-aligned address is typically faster.
- Replacing the -1 with an exception would be inefficient in another way. Throwing and catching exceptions in Java is significantly more expensive than a simple test for -1.
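To make the first reason concrete, here is a minimal sketch of what goes wrong if the result of read() is narrowed to byte or char before the end-of-stream test. The class name SentinelDemo and its helper methods are made up for this illustration; only ByteArrayInputStream, StringReader and the read() methods are real JDK APIs. The "broken" variants are deliberately wrong: once narrowed, the data byte 0xFF equals (byte) -1 and the char value U+FFFF equals (char) -1, so the loop cannot tell data from end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, not part of the Oracle tutorial.
public class SentinelDemo {

    // Correct: the int result keeps data values 0..255 distinct from -1.
    static int countWithInt(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Deliberately broken: after narrowing, the data byte 0xFF becomes -1,
    // so the loop stops at the first 0xFF instead of at end of stream.
    static int countWithByte(InputStream in) throws IOException {
        int count = 0;
        byte b;
        while ((b = (byte) in.read()) != -1) {
            count++;
        }
        return count;
    }

    // Correct: the int result keeps char values 0..65535 distinct from -1.
    static int countCharsWithInt(Reader in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Deliberately broken: (char) -1 is '\uFFFF', so a real U+FFFF in the
    // input is mistaken for end of stream.
    static int countCharsWithChar(Reader in) throws IOException {
        int count = 0;
        char c;
        while ((c = (char) in.read()) != (char) -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x41, (byte) 0xFF, 0x42};   // three bytes, one of them 0xFF
        String text = "A\uFFFFB";                  // three chars, one of them U+FFFF

        System.out.println(countWithInt(new ByteArrayInputStream(data)));   // prints 3
        System.out.println(countWithByte(new ByteArrayInputStream(data)));  // prints 1
        System.out.println(countCharsWithInt(new StringReader(text)));      // prints 3
        System.out.println(countCharsWithChar(new StringReader(text)));     // prints 1
    }
}
```

The int-based loops see all three elements, while the narrowed loops give up after the first one. That is the practical cost of trying to "save" the extra 2 or 3 bytes.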