Java programming logic (29) - String analysis

The previous section covered Character, the wrapper class for a single character; this section covers the string class. String manipulation is probably the most common operation in computer programs. In Java, strings are represented by the String class, which this section describes in detail.

Basic string usage is fairly straightforward; let's take a look.

Basic Usage

A String variable can be initialized from a string constant:

String name = "Maradona said programming";

You can also create a String with new:

String name = new String("Maradona said programming");

String also supports the + and += operators directly, for example:

String name = "Maradona";
name += " said programming";
String description = ", exploring the nature of programming";
System.out.println(name + description);

The output is: Maradona said programming, exploring the nature of programming

The String class provides many methods that make string operations convenient.

Determining whether the string is empty

public boolean isEmpty()

Get string length

 public int length()

Get a substring

public String substring(int beginIndex)
public String substring(int beginIndex, int endIndex) 

Find a character or substring; returns the index of the first occurrence, or -1 if not found

public int indexOf(int ch)
public int indexOf(String str)

Find a character or substring, searching from the end; returns the index of the last occurrence, or -1 if not found

public int lastIndexOf(int ch)
public int lastIndexOf(String str) 

Check whether the string contains the specified character sequence. Recall that CharSequence is an interface, which String also implements

public boolean contains(CharSequence s)  

Check whether the string starts with the given string

public boolean startsWith(String prefix)

Check whether the string ends with the given string

public boolean endsWith(String suffix)

Compare with another string to see whether the contents are the same

public boolean equals(Object anObject)

Compare with another string, ignoring case, to see whether the contents are the same

public boolean equalsIgnoreCase(String anotherString)

String also implements the Comparable interface, so strings can be compared for ordering

public int compareTo(String anotherString)

The ordering comparison can also ignore case

public int compareToIgnoreCase(String str)

Convert all characters to uppercase and return a new string; the original string is unchanged

public String toUpperCase()

Convert all characters to lowercase and return a new string; the original string is unchanged

public String toLowerCase()

Concatenate strings: returns a new string made of the current string followed by the argument; the original string is unchanged

public String concat(String str)

Replace a single character: returns a new string; the original string is unchanged

public String replace(char oldChar, char newChar)

Replace a character sequence: returns a new string; the original string is unchanged

public String replace(CharSequence target, CharSequence replacement) 

Remove leading and trailing whitespace: returns a new string; the original string is unchanged

public String trim() 

Split the string on a delimiter: returns an array of the resulting substrings; the original string is unchanged

public String[] split(String regex)

For example, splitting "hello,world" on a comma:

String str = "hello,world";
String[] arr = str.split(",");

arr[0] is "hello" and arr[1] is "world".
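The methods listed above can be tried together in a short program. This is only a minimal sketch: the class name StringBasicsDemo, the helper extensionOf, and the sample strings are made up for illustration.

```java
public class StringBasicsDemo {
    // Hypothetical helper: returns the text after the last '.', or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot == -1 ? "" : fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        String s = "  Hello, World  ";
        String t = s.trim();                                   // "Hello, World"

        System.out.println(t.isEmpty());                       // false
        System.out.println(t.length());                        // 12
        System.out.println(t.substring(7));                    // World
        System.out.println(t.substring(0, 5));                 // Hello
        System.out.println(t.indexOf("o"));                    // 4
        System.out.println(t.lastIndexOf("o"));                // 8
        System.out.println(t.indexOf('z'));                    // -1
        System.out.println(t.contains("World"));               // true
        System.out.println(t.startsWith("Hello"));             // true
        System.out.println(t.endsWith("World"));               // true
        System.out.println("Hello".equalsIgnoreCase("hELLO")); // true
        System.out.println("apple".compareTo("banana") < 0);   // true
        System.out.println(t.toUpperCase());                   // HELLO, WORLD
        System.out.println(t.replace('l', 'L'));               // HeLLo, WorLd
        System.out.println(t.concat("!"));                     // Hello, World!
        System.out.println(extensionOf("report.final.pdf"));   // pdf
    }
}
```

Every call returns a new value and leaves s and t as they were.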

Having seen the basic usage of String from the caller's perspective, we now look at how String works internally.

Inside String

Wrapping a character array

Internally, the String class represents a string with a character array (this is the layout in JDK 8 and earlier); the instance variable is defined as:

private final char value[];

String has two constructors that create a String from a char array:

public String(char value[])
public String(char value[], int offset, int count)

Note that String creates a new array and copies the argument's contents into it, rather than using the argument array directly.
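The effect of this copy can be observed from the outside. The following is a small sketch; the class name CharArrayCopyDemo and the helper buildThenMutate are illustrative names only.

```java
public class CharArrayCopyDemo {
    // Builds a String from an array, then changes the array afterwards.
    static String buildThenMutate() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);
        chars[0] = 'l';   // modifies only the caller's array
        return s;         // the String kept its own copy
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate());          // java

        char[] chars = {'j', 'a', 'v', 'a'};
        // offset 1, count 2: takes chars[1] and chars[2]
        System.out.println(new String(chars, 1, 2));    // av
    }
}
```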

Most String methods also operate on this internal character array. For example:

  • length() returns the length of this array
  • substring() calls the constructor String(char value[], int offset, int count) with its arguments to build a new string
  • indexOf searches this array when looking for a character or substring

These methods are fairly straightforward to implement, so we will not go into detail.

Some String methods relate directly to this char array:

Return the char at the specified index

public char charAt(int index)

Return the char array corresponding to the string

public char[] toCharArray()

Note that the returned array is a copy, not the original array.

Copy the chars in the specified range into the target array, starting at the specified position

public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) 
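These three methods can be sketched in use as follows. The class name CharAccessDemo and the reverse helper are made up for this example.

```java
import java.util.Arrays;

public class CharAccessDemo {
    // Reverses a string by editing the copy returned from toCharArray().
    static String reverse(String s) {
        char[] chars = s.toCharArray();   // a copy; s itself is unaffected
        for (int i = 0, j = chars.length - 1; i < j; i++, j--) {
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = "hello";
        System.out.println(s.charAt(1));      // e
        System.out.println(reverse(s));       // olleh
        System.out.println(s);                // hello

        char[] dst = new char[5];
        Arrays.fill(dst, '-');
        // Copies s[1..3] ("ell"; srcEnd is exclusive) into dst starting at index 1.
        s.getChars(1, 4, dst, 1);
        System.out.println(new String(dst));  // -ell-
    }
}
```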

Processing characters by code point

As with Character, the String class also provides methods that process a string by code point.

public int codePointAt(int index)
public int codePointBefore(int index)
public int codePointCount(int beginIndex, int endIndex)
public int offsetByCodePoints(int index, int codePointOffset)

These methods are very similar to the Character methods described earlier, so we will not repeat the details here.
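As a quick reminder of how char indexes and code points differ, here is a short sketch. The class name CodePointDemo, the helper countCodePoints, and the sample string (which contains the supplementary character U+1F600 written as a surrogate pair) are chosen only for illustration.

```java
public class CodePointDemo {
    // Counts Unicode code points rather than char units.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 as the surrogate pair D83D DE00, then 'b'
        String s = "a\uD83D\uDE00b";

        System.out.println(s.length());                                 // 4
        System.out.println(countCodePoints(s));                         // 3
        System.out.println(Integer.toHexString(s.codePointAt(1)));      // 1f600
        System.out.println(Integer.toHexString(s.codePointBefore(3)));  // 1f600
        // Index reached after skipping two code points from index 0
        System.out.println(s.offsetByCodePoints(0, 2));                 // 3
    }
}
```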

Encoding conversion

Internally, String handles characters as UTF-16BE: a BMP character uses one char (two bytes), and a supplementary character uses two chars (four bytes). Section 6 introduced various encodings; different encodings may cover different character sets, use different numbers of bytes, and have different binary representations. How are these different encodings handled? How does Java convert between them and its internal representation?

Java uses the Charset class to represent encodings. It has two commonly used static methods:

public static Charset defaultCharset()
public static Charset forName(String charsetName) 

The first method returns the system default encoding. For example, on my computer, executing the following statement:

System.out.println(Charset.defaultCharset().name());

The output is UTF-8.

The second method returns the Charset object for the given encoding name. Corresponding to the encodings introduced in Section 6, the charset name can be: US-ASCII, ISO-8859-1, windows-1252, GB2312, GBK, GB18030, Big5, UTF-8. For example:

Charset charset = Charset.forName("GB18030");

The String class provides methods that return the byte representation of the string in a given encoding:

public byte[] getBytes()  
public byte[] getBytes(String charsetName)
public byte[] getBytes(Charset charset) 

The first method takes no encoding argument and uses the system default encoding; the second takes an encoding name; the third takes a Charset.

The String class also has constructors that create a string from bytes and an encoding, that is, they convert a byte representation in a given encoding into Java's internal representation.

public String(byte bytes[])
public String(byte bytes[], int offset, int length)
public String(byte bytes[], int offset, int length, String charsetName)
public String(byte bytes[], int offset, int length, Charset charset)
public String(byte bytes[], String charsetName)
public String(byte bytes[], Charset charset)

Besides the String methods above, the Charset class also has encoding and decoding methods, which this section does not cover. The important point is that Java's internal representation differs from the various encodings, but they can be converted to one another.
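A round trip through getBytes and the byte-array constructor might look like the sketch below. The class name EncodingDemo and the helpers byteLength and roundTrip are hypothetical; the sample character U+9A6C is written as a Unicode escape so that the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;

public class EncodingDemo {
    // Number of bytes the string occupies in the named encoding.
    static int byteLength(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName)).length;
    }

    // Encodes to bytes and decodes back using the same encoding.
    static String roundTrip(String s, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        byte[] bytes = s.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String s = "\u9a6c";   // one BMP character, so one char

        System.out.println(byteLength(s, "UTF-8"));              // 3
        System.out.println(byteLength(s, "UTF-16BE"));           // 2
        System.out.println(roundTrip(s, "UTF-8").equals(s));     // true
        // ISO-8859-1 cannot represent this character, so it is lost on encoding.
        System.out.println(roundTrip(s, "ISO-8859-1").equals(s)); // false
    }
}
```

Passing the encoding explicitly, rather than relying on the no-argument getBytes(), keeps the result independent of the machine's default charset.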

Immutability

Like the wrapper classes, String is an immutable class: once an object is created, there is no way to change it. The String class is also declared final, so it cannot be subclassed, and the internal char array value is final, so it cannot be reassigned after initialization.

The String class provides many methods that appear to modify the string, but they actually work by creating a new String object; the original String object is not modified. For example, here is the code of the concat() method:

public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);
    str.getChars(buf, len);
    return new String(buf, true);
}

Arrays.copyOf creates a new, longer character array and copies the original content into it; str.getChars then copies the argument's characters in after it, and a new String is created from that array. We will cover the Arrays class in detail in a later section.

As with the wrapper classes, being immutable makes programs simpler, safer, and easier to understand. However, if a string is modified frequently and every modification creates a new string, performance suffers; in that case consider Java's other two classes, StringBuilder and StringBuffer, which we introduce in the next section.
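Immutability is easy to confirm from calling code. The sketch below uses an illustrative class name, ImmutabilityDemo, and a made-up helper, shout.

```java
public class ImmutabilityDemo {
    // Returns a new string; the argument is never modified.
    static String shout(String s) {
        return s.toUpperCase().concat("!");
    }

    public static void main(String[] args) {
        String s = "hello";
        String loud = shout(s);

        System.out.println(loud);               // HELLO!
        System.out.println(s);                  // hello

        // concat returns this same object when the argument is empty.
        System.out.println(s.concat("") == s);  // true

        // += does not change the old object; it rebinds the variable to a new one.
        String before = s;
        s += " world";
        System.out.println(before);             // hello
        System.out.println(s);                  // hello world
    }
}
```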

String constants

String constants are special in Java: besides being directly assignable to a String variable, a constant is itself an object of type String, so you can call String methods on it directly. Consider the code:

System.out.println("Maradona said programming".length());
System.out.println("Maradona said programming".contains("Maradona"));
System.out.println("Maradona said programming".indexOf("programming"));

In fact, these constants are String objects. In memory they are kept in a shared area called the string constant pool, which holds all string constants; only one copy of each constant is kept, and it is shared by all users. Whenever a string is used in constant form, the corresponding String object in the constant pool is what is used.

For example, we look at the code:

String name1 = "Maradona said programming";
String name2 = "Maradona said programming";
System.out.println(name1==name2);

The output is true. Why? You can think of it this way: there is one String object in the constant pool corresponding to "Maradona said programming". Suppose it is named laoma; the code above is then effectively:

String laoma = new String(new char[] {
    'M', 'a', 'r', 'a', 'd', 'o', 'n', 'a', ' ', 's', 'a', 'i', 'd', ' ',
    'p', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g'});
String name1 = laoma;
String name2 = laoma;
System.out.println(name1==name2);

There is actually only one String object, and all three variables point to it, so name1 == name2 is obviously true.

Note that if the strings are created with new rather than by directly assigning a constant, == does not return true. Consider the code below:

String name1 = new String("Maradona said programming");
String name2 = new String("Maradona said programming");
System.out.println(name1==name2);

The output is false. Why? The code above is effectively:

String laoma = new String(new char[] {
    'M', 'a', 'r', 'a', 'd', 'o', 'n', 'a', ' ', 's', 'a', 'i', 'd', ' ',
    'p', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g'});
String name1 = new String(laoma);
String name2 = new String(laoma);
System.out.println(name1==name2);

The code of the String constructor that takes a String parameter is as follows:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

hash is another instance variable of the String class; it caches the value of hashCode, and we will introduce it shortly.

As can be seen, name1 and name2 point to two different String objects, but the value fields inside these two objects point to the same char array. In memory there are three String objects (laoma and the two created with new) sharing a single char array.

So name1 == name2 is false, but name1.equals(name2) is true.
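The difference between comparing references and comparing content can be checked with a few lines. The class name StringIdentityDemo and the helper sameObject are illustrative only.

```java
public class StringIdentityDemo {
    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "laoma";              // taken from the constant pool
        String b = "laoma";              // the same pooled object
        String c = new String("laoma");  // a distinct object with equal content

        System.out.println(sameObject(a, b));  // true
        System.out.println(sameObject(a, c));  // false
        System.out.println(a.equals(c));       // true
    }
}
```

For this reason, string content should be compared with equals rather than ==.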

hashCode

We just mentioned the hash instance variable, which is defined as follows:

private int hash; // Default to 0

It caches the value of the hashCode() method: the first time hashCode() is called, the result is stored in hash, and later calls return the saved value directly.

Here is the hashCode method of the String class:

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

If the cached hash is not 0, it is returned directly; otherwise the hash is computed from the contents of the character array. The formula is:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

Here s is the string, s[0] is its first character, n is the length of the string, and s[0]*31^(n-1) means the value of the first character multiplied by 31 to the power n-1.

Why compute it this way? With this formula, the hash value depends on the value of every character, and because each position is multiplied by a different value, it also depends on the position of every character. There are probably two reasons for using 31. First, it produces a more dispersed hash, that is, different strings generally have different hash values. Second, it is efficient to compute: 31*h equals 32*h - h, that is, (h << 5) - h, so the multiplication can be replaced by a more efficient shift and subtraction.

This approach to implementing hashCode is widely used in Java.
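The formula and the shift trick can both be checked against String.hashCode() with a small sketch. The class name StringHashDemo and the two helper methods are written for this example and are not JDK code.

```java
public class StringHashDemo {
    // Direct form of the loop: h = 31 * h + c for each character.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same computation with 31 * h rewritten as (h << 5) - h.
    static int shiftHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // "ab": 'a' * 31 + 'b' = 97 * 31 + 98 = 3105
        System.out.println(polynomialHash("ab"));                        // 3105
        System.out.println(polynomialHash("hello") == "hello".hashCode()); // true
        System.out.println(shiftHash("hello") == "hello".hashCode());      // true
    }
}
```

Overflow is harmless here: int arithmetic wraps around identically in all three computations.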

Regular Expressions

Some String methods take not an ordinary string argument but a regular expression. What is a regular expression? It can be thought of as a string that expresses a rule, generally used to match, find, or replace text. Regular expressions are rich and powerful and form a fairly large topic, which we will introduce separately in a later section.

Java has dedicated classes for regular expressions, such as Pattern and Matcher, but for simple cases the String class offers more convenient operations. The String methods that accept regular expressions are:

Split a string

public String[] split(String regex) 

Check whether the whole string matches

public boolean matches(String regex)

String replacement

public String replaceFirst(String regex, String replacement)
public String replaceAll(String regex, String replacement) 
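Because these methods interpret their first argument as a regular expression, characters such as '.' have a special meaning. The sketch below illustrates this; the class name StringRegexDemo, the helper collapseWhitespace, and the sample strings are made up for illustration.

```java
import java.util.Arrays;

public class StringRegexDemo {
    // Trims the ends and replaces every run of whitespace with one space.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Split on runs of digits
        System.out.println(Arrays.toString("a1b22c333".split("\\d+")));   // [a, b, c]

        // matches() tests the entire string against the pattern
        System.out.println("2020-02-27".matches("\\d{4}-\\d{2}-\\d{2}")); // true
        System.out.println("on 2020-02-27".matches("\\d{4}-\\d{2}-\\d{2}")); // false

        // An unescaped '.' matches any character
        System.out.println("a.b.c".split(".").length);                     // 0
        System.out.println(Arrays.toString("a.b.c".split("\\.")));         // [a, b, c]
        System.out.println("a.b.c".replaceAll(".", "-"));                  // -----
        System.out.println("a.b.c".replaceFirst("\\.", "-"));              // a-b.c

        // replace() takes literal text, not a regular expression
        System.out.println("a.b.c".replace(".", "-"));                     // a-b-c

        System.out.println(collapseWhitespace("  a   b \t c "));           // a b c
    }
}
```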

Summary

In this section we introduced the String class: its basic usage, internal implementation, and encoding conversion, and we analyzed its immutability, string constants, and the implementation of hashCode.

We mentioned that the String class is relatively inefficient for frequent string modification, and we mentioned the StringBuilder and StringBuffer classes. We also saw that String supports the + and += operators directly; behind them is also the StringBuilder class.

Let's look at these two classes in the next section.

 

Origin www.cnblogs.com/ivy-xu/p/12370128.html