The Java String Object Explained in Detail: Never Be Caught Off Guard by a String Question Again

1. Introduction to the String class

Definition and characteristics of the String class

The String class is a core Java class that represents an immutable sequence of characters. It is part of the Java standard library, defined in the java.lang package, and is declared final, so it cannot be subclassed. Its definition and characteristics are described below:

  1. Definition:
    The String class is a reference type that represents a string composed of characters. In Java, a string is an object rather than a primitive data type. Each String instance holds a sequence of characters whose length and content are fixed once the object is created.

  2. Immutability:
    String objects are immutable: once created, their contents cannot be changed. Any operation that appears to modify a String returns a new String object and leaves the original unchanged. This immutability lets String objects be shared between threads without synchronization.

  3. String constant pool:
    The string constant pool (String Pool) is a special memory area that stores string constants. When a String is created from a string literal, the JVM first looks in the pool. If a string with the same value already exists, the reference to it is returned directly; otherwise a new String is added to the pool. This mechanism saves memory and lets equal strings be reused.

  4. Methods and operations:
    The String class provides a wealth of methods and operations for processing strings. Commonly used operations include string concatenation, substring extraction, character search, replacement, and comparison. The String class also provides support for common operations such as string length, case conversion, and character encoding conversion.

  5. Other features:

    • String objects are immutable and thus can be safely shared in a multi-threaded environment.
    • The String class implements the Comparable interface, so string comparison and sorting operations can be performed.
    • The String class is widely used in various scenarios in Java, such as file processing, network communication, database operations, etc.

It should be noted that due to the immutability of the String object, a new String object will be generated every time the string is modified, which may cause performance problems when a large number of strings are frequently manipulated. In order to avoid this situation, you can use the StringBuilder or StringBuffer class to perform string operations, and convert to a String object at the end.

Ways to create a String object

  1. Created using String literals:
    This is the most common and simple way of creating String objects. By enclosing a string literal in quotes, the Java compiler automatically converts it to a String object. For example:

    String str1 = "Hello, World!";
    

    When a String is created from a literal, the JVM first checks whether a string with the same value already exists in the string constant pool. If it does, the reference to the pooled string is returned; if it does not, a new String object is created and added to the pool.

  2. Create with the new keyword:
    By using the new keyword and the constructor of the String class, you can explicitly create a new String object. For example:

    String str2 = new String("Hello");
    

    Creating a String this way always allocates a new object on the heap, even if a string with the same value already exists in the string constant pool. The literal "Hello" passed to the constructor is still stored in the pool, but str2 refers to the separate new object, not to the pooled one.

  3. Created using a character array:
    You can also use a character array to create a String object. You can create a String object containing the contents of a character array by passing the character array as a parameter to the constructor of the String class. For example:

    char[] chars = {'H', 'e', 'l', 'l', 'o'};
    String str3 = new String(chars);
    

    You can pass the entire character array or a subset of the array to create a String object.

  4. Using String Concatenation:
    Use the string concatenation operator (+) to concatenate multiple strings together and create a new String object. For example:

    String str4 = "Hello" + ", " + "World!";
    

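    In this case every operand is a compile-time constant, so the compiler folds the whole expression into the single literal "Hello, World!" at compile time. When an operand is a variable, the compiler generates the concatenation code itself: a chain of StringBuilder calls in older compilers, and an invokedynamic call since Java 9. Either way, the result is a String object.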

It should be noted that String objects created from string literals are automatically added to the string constant pool, while String objects created with the new keyword are separate heap objects that are not pooled unless intern() is called on them. In both cases the resulting String is immutable: the way an object was created has no effect on whether it can be modified.
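
The difference between these creation styles can be observed by comparing references with ==. The following is a minimal sketch rather than a definitive test; the class name CreationDemo and the helper fromChars are made up for illustration.

```java
class CreationDemo {
    // Hypothetical helper: builds a String from count characters of chars, starting at offset.
    static String fromChars(char[] chars, int offset, int count) {
        return new String(chars, offset, count);
    }

    public static void main(String[] args) {
        String literal1 = "Hello";
        String literal2 = "Hello";            // refers to the same pooled object as literal1
        String viaNew = new String("Hello");  // always a distinct object on the heap

        System.out.println(literal1 == literal2);        // true: same pooled object
        System.out.println(literal1 == viaNew);          // false: different objects
        System.out.println(literal1.equals(viaNew));     // true: same content
        System.out.println(viaNew.intern() == literal1); // true: intern() returns the pooled object

        char[] chars = {'H', 'e', 'l', 'l', 'o'};
        System.out.println(fromChars(chars, 1, 3));      // ell
    }
}
```

Comparing with == is used here only to make object identity visible; content comparisons should use equals().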

Immutability and the concept of the string constant pool

  1. Immutability:
    String objects are immutable, which means that once a String object is created, its value cannot be changed. This immutability is embodied in the following aspects:

    • Modifying Strings: The content of a String object is fixed and its modification is not allowed. Any modification operation on a String object will actually return a new String object, leaving the original object unchanged. For example:

      String str = "Hello";
      str = str + ", World!";
      

      In the above example, str + ", World!" actually creates a new String object, and the original "Hello" String object is not changed. The variable str is simply reassigned to refer to the new object.

    • Concatenating strings: For string concatenating operations, a new String object is also returned. For example:

      String str1 = "Hello";
      String str2 = " World!";
      String result = str1 + str2;
      

      In the above code, str1 + str2 creates a new String object.

    • Replacing characters: The characters of a String object cannot be replaced in place. Methods such as replace() return a new String, so the result must be assigned to a variable. For example:

      String str = "Hello";
      str = str.replace('H', 'W');  // replace() cannot change the original; reassign the result
      

    The advantage of immutability shows in a multi-threaded environment: multiple threads can safely share String objects without worrying that one of them will modify the value. Immutability is also what makes the string constant pool possible.

  2. String constant pool (String Pool):
    The string constant pool is a special memory area in Java for storing string constants. It has the following characteristics:

    • Since Java 7, the string constant pool is located in the regular heap memory. In earlier HotSpot versions it was stored in the permanent generation (PermGen), separate from the ordinary heap.
    • The String object created by the string literal first checks whether there is a string with the same value in the string constant pool. If it exists, it will directly return the reference in the constant pool without creating a new object; if it does not exist, it will create a new String object in the constant pool.
    • The purpose of the string constant pool is to save memory space and improve the reusability of strings. Due to the immutability of String objects, the same value can be safely shared among multiple String objects, reducing memory overhead.

    Using the string constant pool can avoid repeatedly creating String objects with the same value in memory, which improves the performance and efficiency of the program. This is why using string literals to create String objects is one of the most common ways.

To sum up, the immutability of a String object means that once created, its value cannot be changed. The string constant pool is a special memory area used to store string constants, saving memory space and improving performance by reusing String objects of the same value.
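
Both points of this summary can be checked with a few lines of code. The sketch below is illustrative only; the class name ImmutabilityDemo and the helper sameObject are assumptions, not part of any library.

```java
class ImmutabilityDemo {
    // Hypothetical helper: true if both references point to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String original = "Hello";
        String upper = original.toUpperCase(); // returns a new String
        System.out.println(original);          // Hello (the original is unchanged)
        System.out.println(upper);             // HELLO

        String part = "Hel";
        String built = part + "lo";            // concatenated at run time, so not taken from the pool
        System.out.println(sameObject(built, original));          // false
        System.out.println(sameObject(built.intern(), original)); // true: intern() returns the pooled "Hello"
    }
}
```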

2. Basic string operations

String concatenation

  1. Use the "+" operator:
    Use the "+" operator to concatenate multiple strings together to generate a new string. For example:

    String str1 = "Hello";
    String str2 = " World!";
    String result = str1 + str2;
    

    In this example, the "+" operator concatenates str1 and str2 to produce a new string, result, whose value is "Hello World!". This approach is concise and easy to read, and it suits joining a small number of strings.

  2. Use the concat() method:
    The concat() method appends the specified string to the end of the current string. For example:

    String str1 = "Hello";
    String str2 = " World!";
    String result = str1.concat(str2);
    

    In the above example, str1.concat(str2) appends str2 to the end of str1, and the result is assigned to result. This method is likewise suited to joining a small number of strings.

  3. Use the StringBuilder or StringBuffer class:
    If you need to perform a large number of concatenations, use the StringBuilder class, or the StringBuffer class when the same buffer is shared between threads. Both are mutable and append to an internal buffer instead of creating a new String at every step. For example:

    StringBuilder sb = new StringBuilder();
    sb.append("Hello");
    sb.append(" World!");
    String result = sb.toString();
    

    In this example, the append() method of StringBuilder adds the strings to the StringBuilder object one by one, and the toString() method finally converts the StringBuilder object into a String object.

    It should be noted that the StringBuilder class is not thread-safe, while the StringBuffer class is thread-safe. Therefore, in a single-threaded environment, it is recommended to use the StringBuilder class to perform string concatenation operations.

With the "+" operator and the concat() method, each concatenation produces a new String object, because String objects are immutable and cannot be modified in place. StringBuilder and StringBuffer avoid these intermediate objects: they modify their own buffer, and a new String is created only when toString() is called.
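
The three approaches give the same result for ordinary strings but treat null differently, which is a common source of surprises. A small sketch, with the class and method names invented for illustration:

```java
class ConcatDemo {
    static String withPlus(String a, String b) {
        return a + b;
    }

    static String withConcat(String a, String b) {
        return a.concat(b);
    }

    static String withBuilder(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Hello", " World!"));    // Hello World!
        System.out.println(withConcat("Hello", " World!"));  // Hello World!
        System.out.println(withBuilder("Hello", " World!")); // Hello World!

        // "+" and append() turn a null reference into the text "null" ...
        System.out.println(withPlus("Hello", null));         // Hellonull
        System.out.println(withBuilder("Hello", null));      // Hellonull

        // ... while concat() rejects it.
        try {
            withConcat("Hello", null);
        } catch (NullPointerException e) {
            System.out.println("concat(null) throws NullPointerException");
        }
    }
}
```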

String comparison

In Java, string comparison is a common and important operation to determine whether two strings are equal or to determine their order. Java provides several methods to compare strings:

  1. Use the equals() method:
    The equals() method determines whether the contents of two strings are the same. For example:

    String str1 = "Hello";
    String str2 = "hello";
    boolean isEqual = str1.equals(str2);
    

    In the above example, str1.equals(str2) returns false, because the two strings differ in case. Note that the equals() method is case sensitive.

  2. Use the equalsIgnoreCase() method:
    The equalsIgnoreCase() method determines whether the contents of two strings are the same while ignoring case. For example:

    String str1 = "Hello";
    String str2 = "hello";
    boolean isEqual = str1.equalsIgnoreCase(str2);
    

    In this example, str1.equalsIgnoreCase(str2) returns true because case is ignored.

  3. Use the compareTo() method:
    The compareTo() method compares strings in lexicographical order. It returns an integer describing the relationship between the two strings, according to the following rules:

    • Returns 0 if the strings are equal.
    • Returns a negative number if the current string is less than the argument string.
    • Returns a positive number if the current string is greater than the argument string.

    For example:

    String str1 = "apple";
    String str2 = "banana";
    int result = str1.compareTo(str2);
    

    In the above example, str1.compareTo(str2) returns a negative number, indicating that str1 comes before str2 lexicographically.

  4. Use the compareToIgnoreCase() method:
    The compareToIgnoreCase() method is similar to the compareTo() method, but ignores case. For example:

    String str1 = "Apple";
    String str2 = "banana";
    int result = str1.compareToIgnoreCase(str2);
    

    In this example, str1.compareToIgnoreCase(str2) returns a negative number, meaning that str1 comes before str2 in lexicographical order when case is ignored.

It should be noted that these comparison methods compare the Unicode (UTF-16) values of the characters, so the resulting order is not a locale-specific dictionary order. In addition, the == operator compares whether two string references are equal. It does not compare the contents of the strings; it only checks whether the two references point to the same object.
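
The sketch below puts the comparison methods side by side. The class CompareDemo and its helpers are hypothetical names; Objects.equals and String.CASE_INSENSITIVE_ORDER are standard library members used here as one possible way to get null-safe and case-insensitive behaviour.

```java
import java.util.Arrays;
import java.util.Objects;

class CompareDemo {
    // Null-safe content comparison: true when both are null or both have equal content.
    static boolean sameContent(String a, String b) {
        return Objects.equals(a, b);
    }

    // Returns a sorted copy of words, ignoring case.
    static String[] sortIgnoringCase(String[] words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String a = "Hello";
        String b = new String("Hello");
        System.out.println(a == b);                 // false: different objects
        System.out.println(a.equals(b));            // true: same content
        System.out.println(sameContent(null, "x")); // false, and no NullPointerException

        // Uppercase letters have smaller Unicode values than lowercase ones.
        System.out.println("Apple".compareTo("banana") < 0);           // true
        System.out.println("apple".compareTo("Banana") > 0);           // true: 'a' (97) > 'B' (66)
        System.out.println("apple".compareToIgnoreCase("Banana") < 0); // true

        String[] sorted = sortIgnoringCase(new String[] {"banana", "Apple", "cherry"});
        System.out.println(Arrays.toString(sorted)); // [Apple, banana, cherry]
    }
}
```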

String extraction and interception

In Java, strings can be extracted and truncated using different methods. These operations allow you to select specific parts from an original string and create a new string.

  1. Use substring() method:
    The substring() method extracts a substring from the original string. It takes one or two parameters: the first specifies the starting index of the substring (inclusive), and the optional second specifies the ending index (exclusive). For example:

    String str = "Hello World";
    String substring1 = str.substring(6); // from index 6 to the end of the string
    String substring2 = str.substring(0, 5); // from index 0 up to, but not including, index 5

    In the above example, the value of substring1 is "World", and the value of substring2 is "Hello".

  2. Use the split() method:
    The split() method splits the string into multiple substrings around the specified delimiter and returns a string array. For example:

    String str = "apple,banana,orange";
    String[] substrings = str.split(",");
    

    In this example, split(",") divides str into three substrings at the commas and stores them in the string array substrings: substrings[0] is "apple", substrings[1] is "banana", and substrings[2] is "orange".

  3. Use the charAt() method:
    The charAt() method returns the character at the specified index in the string. Indexes start from 0. For example:

    String str = "Hello";
    char ch = str.charAt(1); // the character at index 1, which is 'e'

    In the above example, the value of ch is 'e'.

  4. Using substring() method and indexOf() method:
    By combining the substring() method with the indexOf() method, you can extract a specific part of a string. For example:

    String str = "The quick brown fox";
    int startIndex = str.indexOf("quick"); // starting index of the substring "quick"
    int endIndex = str.indexOf("fox"); // starting index of the substring "fox"
    String result = str.substring(startIndex, endIndex);

    In this example, the value of startIndex is 4 and the value of endIndex is 16, so the value of result is "quick brown " (including the space that precedes "fox"). Call trim() on the result if that trailing space is not wanted.

It should be noted that the string extraction and interception operations will generate a new string object.
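
When indexOf() is combined with substring(), the -1 return value has to be handled, otherwise substring() throws a StringIndexOutOfBoundsException. A possible helper is sketched below; the name between and its return-null convention are assumptions made for this example.

```java
class ExtractDemo {
    // Hypothetical helper: returns the text between the first occurrence of start
    // and the next occurrence of end, or null if either marker is missing.
    static String between(String text, String start, String end) {
        int from = text.indexOf(start);
        if (from < 0) {
            return null;
        }
        from += start.length();              // skip past the start marker
        int to = text.indexOf(end, from);    // search for end only after the start marker
        if (to < 0) {
            return null;
        }
        return text.substring(from, to);
    }

    public static void main(String[] args) {
        System.out.println(between("name=[Alice];age=[25]", "[", "]"));    // Alice
        System.out.println(between("The quick brown fox", "The ", " fox")); // quick brown
        System.out.println(between("no markers here", "[", "]"));           // null
    }
}
```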

String Find and Replace

In Java, you can use different methods to perform string search and replace operations. These operations enable you to find a specific character or substring in a string and replace it with a new character or string.

  1. Use the indexOf() method:
    The indexOf() method finds the index of the first occurrence of the specified character or substring in the string. If a match is found, the index of the first match is returned; if no match is found, -1 is returned. For example:

    String str = "Hello World";
    int index = str.indexOf("o"); // position of the first "o" in the string

    In the above example, the value of index is 4 because the first "o" is at index 4.

  2. Use the lastIndexOf() method:
    The lastIndexOf() method is similar to the indexOf() method, but it searches backwards from the end of the string and returns the index of the last occurrence of the specified character or substring. For example:

    String str = "Hello World";
    int lastIndex = str.lastIndexOf("o"); // position of the last "o" in the string

    In this example, the value of lastIndex is 7 because the last "o" is at index 7.

  3. Use the contains() method:
    The contains() method checks whether the string contains the specified sequence of characters. It returns a boolean. For example:

    String str = "Hello World";
    boolean contains = str.contains("World"); // check whether the string contains the substring "World"

    In the above example, the value of contains is true, because the string contains the substring "World".

  4. Use the replace() method:
    The replace() method replaces every occurrence of the specified character or substring with a new character or string. For example:

    String str = "Hello World";
    String newStr = str.replace("World", "Universe"); // replace "World" with "Universe"

    In this example, the value of newStr is "Hello Universe".

  5. Use replaceAll() method:
    The replaceAll() method is similar to the replace() method, but it treats its first argument as a regular expression and replaces every match. For example:

    String str = "Hello123World456";
    String newStr = str.replaceAll("\\d+", ""); // replace every run of digits with an empty string

    In this example, the value of newStr is "HelloWorld", because every run of digits is replaced with an empty string.

It should be noted that the search and replace operations of strings will generate a new string object, and the original string itself will not be changed.
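
Because replaceAll() interprets its first argument as a regular expression, it can behave very differently from replace() when the search text contains metacharacters such as ".". The sketch below shows this, together with a simple occurrence counter built on indexOf(); the class ReplaceDemo and the method count are hypothetical.

```java
class ReplaceDemo {
    // Hypothetical helper: counts non-overlapping occurrences of target in text.
    static int count(String text, String target) {
        if (target.isEmpty()) {
            throw new IllegalArgumentException("target must not be empty");
        }
        int count = 0;
        int index = text.indexOf(target);
        while (index >= 0) {
            count++;
            index = text.indexOf(target, index + target.length()); // continue after this match
        }
        return count;
    }

    public static void main(String[] args) {
        String version = "1.2.3";
        System.out.println(version.replace(".", "-"));      // 1-2-3 (literal match)
        System.out.println(version.replaceAll(".", "-"));   // ----- ("." matches any character)
        System.out.println(version.replaceAll("\\.", "-")); // 1-2-3 (escaped dot)
        System.out.println(version);                        // 1.2.3 (original unchanged)

        System.out.println(count("Hello World", "o"));      // 2
    }
}
```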

3. Common string methods

Get the length of the string

In Java, you can use the length() method to get the length of a string. This method returns the number of char values (UTF-16 code units) in the string, counting spaces and special characters.

Here is a simple example:

String str = "Hello World!";
int length = str.length();

In this example, the length() method is called and its return value is stored in the integer variable length. The value of length is 12, because the string "Hello World!" consists of 12 characters.

It should be noted that length() is an instance method, so it is called with the dot operator on a string variable. The variable must refer to an actual String object: calling length() on a null reference throws a NullPointerException.

The length() method also works with empty strings, for which it returns 0:

String emptyStr = "";
int emptyLength = emptyStr.length(); // the result is 0

For strings containing Unicode characters, the length() method returns the number of code units, not the number of Unicode characters (code points). Java represents strings as UTF-16, in which characters outside the Basic Multilingual Plane need two code units. Therefore, if the string contains such characters, the returned length is greater than the actual number of Unicode characters.
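
The difference between code units and code points can be made visible with codePointCount(), a standard String method. The class name LengthDemo and the helper codePoints are illustrative assumptions.

```java
class LengthDemo {
    // Number of Unicode code points, as opposed to UTF-16 code units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "Hello";
        String smiley = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        System.out.println(ascii.length());       // 5
        System.out.println(codePoints(ascii));    // 5
        System.out.println(smiley.length());      // 2 code units
        System.out.println(codePoints(smiley));   // 1 code point
    }
}
```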

Case conversion

In Java, you can use the toUpperCase() and toLowerCase() methods to convert a string to uppercase or lowercase.

  1. toUpperCase() method: This method converts all characters in the string to uppercase and returns a new string. For example:
String str = "Hello World!";
String upperCaseStr = str.toUpperCase();

In this example, the toUpperCase() method is called and the returned uppercase string is stored in the variable upperCaseStr. The value of upperCaseStr is "HELLO WORLD!".

  2. toLowerCase() method: This method converts all characters in the string to lowercase and returns a new string. For example:
String str = "Hello World!";
String lowerCaseStr = str.toLowerCase();

In this example, the toLowerCase() method is called and the returned lowercase string is stored in the variable lowerCaseStr. The value of lowerCaseStr is "hello world!".

It should be noted that these two methods will generate a new string object, and the original string itself will not be changed.

Also, these two methods only affect characters that have an uppercase or lowercase form. Other characters (such as digits and punctuation) remain unchanged.

Here's an example that reverses the case of each letter in a string and leaves all other characters as they are:

String str = "Hello 123 World!";
String convertedStr = "";
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);
    if (Character.isLetter(c)) {
        if (Character.isUpperCase(c)) {
            convertedStr += Character.toLowerCase(c); // uppercase letter becomes lowercase
        } else {
            convertedStr += Character.toUpperCase(c); // lowercase letter becomes uppercase
        }
    } else {
        convertedStr += c; // digits, spaces and punctuation are copied as-is
    }
}

In this example, we iterate over each character in the string and use the Character.isLetter() method to check whether the character is a letter. If it is a letter, the Character.isUpperCase() method determines its case and the character is converted accordingly. Non-alphabetic characters are left unchanged.

The output of the above code will be "hELLO 123 wORLD!" with the case of the letters reversed.
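
Since every += in a loop creates a new String, the same case reversal can also be written with a StringBuilder, as recommended earlier for repeated modifications. This is only one possible variant; SwapCaseDemo and swapCase are names chosen for the example.

```java
class SwapCaseDemo {
    // Reverses the case of each letter; other characters are copied unchanged.
    static String swapCase(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append(Character.toLowerCase(c));
            } else if (Character.isLowerCase(c)) {
                sb.append(Character.toUpperCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapCase("Hello 123 World!")); // hELLO 123 wORLD!
    }
}
```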

Remove leading and trailing whitespace characters

In Java, you can use the trim() method to remove leading and trailing whitespace from a string. This method returns a new string with the leading and trailing whitespace of the original string removed.

Here is a simple example:

String str = "   Hello World!   ";
String trimmedStr = str.trim();

In this example, the trim() method is called and the returned string, which has no leading or trailing whitespace, is stored in the variable trimmedStr. The value of trimmedStr is "Hello World!".

It should be noted that the trim() method only removes whitespace at the beginning and end of the string, not whitespace in the middle. For example, for the string "   Hello   World!   ", the trim() method removes only the leading and trailing spaces and returns "Hello   World!".

In addition to spaces, the trim() method strips every leading and trailing character whose code is less than or equal to U+0020, which includes tabs and newlines.

If you want to remove all whitespace from a string, including whitespace in the middle, you can use the replaceAll() method:

String str = "   Hello    World!   ";
String noSpaceStr = str.replaceAll("\\s+", "");

In this example, the replaceAll() method uses the regular expression \s+ (written "\\s+" in a Java string literal) to match each run of one or more whitespace characters and replaces it with an empty string. The value of noSpaceStr is "HelloWorld!".

It should be noted that the replaceAll() method also returns a new string, and the original string itself is not changed.
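
A common middle ground between trimming only the ends and deleting all whitespace is to collapse each internal run of whitespace into a single space. A minimal sketch, assuming a helper named normalize that is not part of the standard library:

```java
class WhitespaceDemo {
    // Trims both ends and collapses each internal run of whitespace into one space.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String raw = "\t  Hello    World! \n";
        System.out.println("[" + raw.trim() + "]");      // [Hello    World!]
        System.out.println("[" + normalize(raw) + "]");  // [Hello World!]
    }
}
```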

Splitting strings

In Java, you can use the split() method to split a string into substrings. The split() method splits a string into an array of strings around matches of the given delimiter.

Here is a simple example:

String str = "Hello,World,Java";
String[] parts = str.split(",");

In this example, the split() method is called on str and splits the string into multiple substrings using the comma as the delimiter. The results are stored in the array parts, whose value is ["Hello", "World", "Java"].

As you can see, the split() method returns a string array, each element of which is a substring of the original string.

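It should be noted that the delimiter is always interpreted as a regular expression. A plain string such as "," works as expected, but if the delimiter contains regex metacharacters such as "." or "|", they must be escaped. In a Java string literal the escape is written with a double backslash, for example "\\.".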

Also, if you want to limit the number of substrings, you can use the overload of split() that takes a second parameter. For example, the following code splits str into at most two substrings:

String str = "Hello,World,Java";
String[] parts = str.split(",", 2);

In this example, the second parameter of the split() method is 2, which means splitting into at most 2 substrings. The results are stored in the array parts, whose value is ["Hello", "World,Java"].

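If there are consecutive delimiters in the string, the split() method returns an empty string for the gap between adjacent delimiters. The trim() method alone does not remove these empty strings; to drop them, trim each substring and keep only the ones that are not empty:

String str = "Hello,,World,Java";
String[] parts = str.split(",", -1); // a negative limit also keeps trailing empty strings
List<String> nonEmpty = new ArrayList<>(); // requires java.util.ArrayList and java.util.List
for (String part : parts) {
    String trimmed = part.trim();
    if (!trimmed.isEmpty()) {
        nonEmpty.add(trimmed);
    }
}

In this example, the split() method is called with -1 as the second parameter. A negative limit keeps trailing empty strings, which split() would otherwise discard. The loop then trims each substring and keeps only the non-empty ones, so nonEmpty contains "Hello", "World" and "Java".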
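
Two details of split() are easy to get wrong: the delimiter is a regular expression, and trailing empty strings are dropped unless a negative limit is given. The sketch below illustrates both. SplitDemo and splitLiteral are hypothetical names; Pattern.quote is the standard way to treat a delimiter as literal text.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Splits on a literal delimiter by quoting it, so regex metacharacters lose their meaning.
    // The negative limit keeps trailing empty strings.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("a.b.c".split(".")));         // [] ("." matches every character)
        System.out.println(Arrays.toString("a.b.c".split("\\.")));       // [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]

        System.out.println("a,b,,".split(",").length);      // 2: trailing empty strings are removed
        System.out.println("a,b,,".split(",", -1).length);  // 4: negative limit keeps them
    }
}
```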

String formatting

In Java, formatting of strings can be used to create strings with a specific format. Formatting of strings is done using formatting patterns and parameter substitution.

String formatting in Java relies mostly on the String.format() and System.out.printf() methods. Both support placeholders and conversion characters in the format string.

Here is a simple example:

String name = "Alice";
int age = 25;
double weight = 55.5;

String message = String.format("My name is %s, I'm %d years old, and my weight is %.2f kg.", name, age, weight);
System.out.println(message);

In this example, the String.format() method creates the formatted string message. Here %s, %d and %.2f are placeholders for a string, an integer, and a floating-point number with two decimal places, respectively. The placeholders are replaced in order by the arguments that follow the format string.

Execute the above code, and the output is: "My name is Alice, I'm 25 years old, and my weight is 55.50 kg.".

In addition to the String.format() method, the System.out.printf() method can be used to output a formatted string directly to the console:

String name = "Alice";
int age = 25;
double weight = 55.5;

System.out.printf("My name is %s, I'm %d years old, and my weight is %.2f kg.", name, age, weight);

Executing the above code will also output: "My name is Alice, I'm 25 years old, and my weight is 55.50 kg.".

In addition to regular placeholders, conversion characters can also be used to specify special formats, such as dates, times, etc. Here is an example:

import java.time.LocalDateTime;

LocalDateTime now = LocalDateTime.now();

System.out.printf("Current date and time: %tF %tT", now, now);

In this example, the %tF and %tT conversions format the date and the time, respectively. The variable now is a LocalDateTime object that holds the current date and time. Executing the above code outputs the current date and time, for example: "Current date and time: 2023-06-29 14:30:00".

It should be noted that in the format string, other modifiers can also be used to adjust the format of the output, such as width, precision, alignment, etc.
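
The width, precision and alignment modifiers can be sketched as follows. The class FormatDemo and the method row are illustrative names; Locale.ROOT is passed explicitly as an assumption of this example, so that the decimal separator does not depend on the machine's default locale.

```java
import java.util.Locale;

class FormatDemo {
    // Formats one table row: name left-aligned in 8 columns, age right-aligned in 5,
    // weight right-aligned in 8 columns with 2 decimal places.
    static String row(String name, int age, double weight) {
        return String.format(Locale.ROOT, "|%-8s|%5d|%8.2f|", name, age, weight);
    }

    public static void main(String[] args) {
        System.out.println(row("Alice", 25, 55.5));                   // |Alice   |   25|   55.50|
        System.out.println(row("Bob", 7, 102.125));
        System.out.println(String.format(Locale.ROOT, "%05d", 42));   // 00042 (zero padding)
    }
}
```

In the format specifiers, a leading minus sign requests left alignment, the number before the dot is the minimum width, and the number after the dot is the precision.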

4. String immutability and performance issues

Effects of String Immutability on Programs

  1. Thread safety: Since strings are immutable, they are thread safe in a multi-threaded environment. Multiple threads can access and share the same string object at the same time without worrying about concurrency issues caused by data modification. This makes strings more reliable and easier to use in concurrent programming.

  2. Cache hashes: Since strings are immutable, their hashes can be cached. In Java, the hash value of a string is cached after the first calculation to improve the search efficiency of strings in data structures such as hash sets and hash maps. If the string is mutable, then every calculation of the hash value needs to traverse the character array of the string, which affects the performance of hash-related operations.

  3. Method passing safety: Since strings are immutable, they can be safely passed as method parameters. Inside the method, if the string is modified, a new string object is actually created without affecting the original string object. This avoids accidental modification when data is shared between methods, increasing the reliability of the program.

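  4. Safety and reliability: The immutability of strings ensures that string data cannot be modified after it has been created or validated. This matters wherever strings carry security-relevant values such as file paths, class names, or connection URLs, because a value cannot change between the moment it is checked and the moment it is used. (For passwords, a char[] is often preferred precisely because, unlike a String, it can be cleared after use.)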

  5. Performance optimization: Although the immutability of strings may cause some performance problems (such as when concatenating strings in large numbers), it also brings some performance optimization opportunities. Since strings are immutable, multiple strings can share the same memory space, saving memory. In addition, the immutability of strings also enables string objects to be cached and reused, reducing the overhead of frequently creating objects and improving program performance.

It should be noted that although strings are immutable, the String class stores its characters internally in an array: a char[] up to Java 8 and a byte[] since Java 9. In old JDK versions (before Java 7 update 6), an operation such as substring() created a new String object that referred to the same char[] array with a different offset and length; modern versions copy the required characters into a new array instead. Either way, the operation never modifies the original String object, which further improves the reliability and security of the program.
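
The effect of immutability on method parameters can be illustrated by contrasting a String argument with a mutable StringBuilder argument. This is a sketch; the class PassingDemo and its two methods are invented for the example.

```java
class PassingDemo {
    // Reassigning the parameter only changes the local reference;
    // the caller's String object is untouched.
    static void tryToChange(String s) {
        s = s + " (changed)";
    }

    // A StringBuilder is mutable, so the caller sees the modification.
    static void change(StringBuilder sb) {
        sb.append(" (changed)");
    }

    public static void main(String[] args) {
        String text = "original";
        tryToChange(text);
        System.out.println(text);   // original

        StringBuilder builder = new StringBuilder("original");
        change(builder);
        System.out.println(builder); // original (changed)
    }
}
```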

Efficiency of string splicing

In Java, string concatenation can be implemented in a variety of ways, such as the "+" operator, the concat() method, the StringBuilder class, or the StringBuffer class.

However, due to the immutability of strings, each string concatenation operation will create a new string object. This can lead to the following efficiency issues:

  1. Memory overhead: Each string concatenation operation involves creating a new string object, which means that additional memory space needs to be allocated to store the new string. If a large number of string concatenation operations are performed, a large number of temporary objects will be generated, which increases memory overhead.

  2. Performance loss: The immutability of strings causes each concatenation operation to copy the previous string content and create a new string object. This operation will cause performance loss when splicing a large number of strings frequently, especially when using the "+" operator for splicing, because each "+" operator will trigger a copy and creation of a string object.

In order to solve the efficiency problem of string concatenation, the StringBuilder class or the StringBuffer class can be used. These are mutable string classes that provide efficient concatenation. Here are their features:

  • StringBuilder: A non-thread-safe mutable string class, used in a single-threaded environment.
  • StringBuffer: A thread-safe mutable string class, used in a multi-threaded environment.

These classes provide methods such as append() and insert() that modify the internal buffer in place without creating a new string object. This avoids frequent object creation and string copying, improving the efficiency of concatenation.

The following sample code demonstrates how to use StringBuilder for string concatenation:

StringBuilder sb = new StringBuilder();
sb.append("Hello");
sb.append(" ");
sb.append("World!");

String result = sb.toString();

In the above example, the StringBuilder object sb appends the string content by calling the append() method multiple times. Finally, the StringBuilder object is converted to an immutable String object by calling the toString() method.
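
The difference matters most inside loops. The sketch below builds the same comma-separated text in two ways; the class JoinDemo and both method names are assumptions for illustration, and no timing figures are claimed here.

```java
class JoinDemo {
    // Builds "0,1,2,...,n-1" with a single StringBuilder buffer.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // Same result with "+=", which creates a new String object on every concatenation.
    static String joinWithPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                s += ",";
            }
            s += i;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(5)); // 0,1,2,3,4
        System.out.println(joinWithPlus(5));    // 0,1,2,3,4
    }
}
```

Both methods return equal strings; the StringBuilder version simply avoids the intermediate String objects that the "+=" version copies and discards.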

Use of StringBuilder and StringBuffer

StringBuilder and StringBuffer are the Java classes for efficient handling of mutable strings. They provide methods for appending, inserting, deleting, and replacing characters in their own buffer without creating a new string object.

  1. Creating objects:

    • Use the no-argument constructor: StringBuilder sb = new StringBuilder();
    • Use the constructor that specifies an initial capacity: StringBuilder sb = new StringBuilder(64); // the int argument is the initial capacity

    StringBuffer is used in the same way as StringBuilder, except that it is thread-safe and suitable for multi-threaded environments.

  2. Basic operation:

    • Append string: append(String str), used to append the specified string after the current string.
    • Insert string: insert(int offset, String str), insert the specified string at the specified position.
    • Delete characters or strings: delete(int start, int end), to delete the characters within the specified range; deleteCharAt(int index), to delete the characters at the specified index position.
    • Replace character or string: replace(int start, int end, String str), to replace the characters within the specified range with the specified string.
  3. Chained calls:
    The modifying methods of StringBuilder and StringBuffer return the object itself, so multiple operations can be performed through chained calls. For example:

    StringBuilder sb = new StringBuilder();
    sb.append("Hello").append(" ").append("World!");
    
  4. Convert to String:

    • Use the toString() method to convert a StringBuilder or StringBuffer object to an immutable String object.
  5. Thread safety:

    • StringBuilder is not thread-safe and is suitable for single-threaded environments.
    • StringBuffer is thread-safe and is suitable for multi-threaded environments.

Since StringBuilder and StringBuffer are both mutable string classes, they are usually more efficient than repeated string concatenation with the "+" operator. Either class is recommended when strings are modified frequently; choose StringBuffer when the same instance is shared between threads.

Note that although both classes provide efficient string manipulation, StringBuilder is recommended in a single-threaded environment because it is slightly more lightweight than StringBuffer (its methods are not synchronized). The thread safety of StringBuffer only needs to be considered in a multi-threaded environment.
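
The basic operations listed above (insert, replace, delete, and chained append) can be sketched as follows. The class name BuilderOpsDemo and the method edit() are hypothetical names chosen for illustration; the comments show the buffer content after each step.

```java
public class BuilderOpsDemo {
    // Applies several in-place edits to one StringBuilder and returns the result.
    static String edit() {
        StringBuilder sb = new StringBuilder("Hello World!");
        sb.insert(5, ",");                 // Hello, World!
        sb.replace(7, 12, "Java");         // Hello, Java!
        sb.deleteCharAt(sb.length() - 1);  // Hello, Java
        sb.delete(0, 7);                   // Java
        sb.append(" ").append(17);         // Java 17 (chained calls, append(int) overload)
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(edit()); // Java 17
    }
}
```

As with String.substring(), the start index of delete() and replace() is inclusive and the end index is exclusive.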

5. Strings and regular expressions

Regular Expression Overview

Regular expressions are a powerful pattern-matching tool for searching, matching, and replacing strings of characters that match specific patterns in text. It provides a flexible and efficient way to process strings, and has a wide range of application scenarios, including text processing, data validation, data extraction, etc.

Regular expressions consist of characters and special characters and are used to describe patterns of strings.

  1. Ordinary characters: Any non-special character represents itself. For example, the regular expression abc will match "abc" in the string.

  2. Metacharacters: Characters with special meaning, used to build more complex patterns.

    • .: Matches any single character, except newline.
    • *: Matches the preceding element zero or more times.
    • +: Matches the preceding element one or more times.
    • ?: Matches the preceding element zero or one time.
    • ^: Matches the start position of the string.
    • $: Matches the end position of the string.
    • \: Escape character, used to treat the following special character as an ordinary character. For example, \. matches the period character ".".
    • []: Character class, matches any single character in the square brackets. For example, [aeiou] matches any single vowel.
    • [^]: Negated character class, matches any character except those in the square brackets. For example, [^0-9] matches any non-numeric character.
    • (): Grouping, treat the patterns in it as a whole.
    • |: Logical OR, matches either of the two patterns.
  3. Quantifier: Used to specify the number of occurrences of an element.

    • {n}: Exactly match n occurrences of the previous element.
    • {n,}: Matches at least n occurrences of the preceding element.
    • {n,m}: Match the preceding element at least n times and at most m times.
    • Adding ? after ?, *, +, {n,}, or {n,m} makes the quantifier non-greedy, so it matches as little as possible.
  4. Predefined character classes: predefined shorthand forms for some commonly used character classes.

    • \d: Numeric characters, equivalent to [0-9].
    • \D: Non-numeric characters, equivalent to [^0-9].
    • \w: word character, equivalent to [a-zA-Z0-9_].
    • \W: Non-word characters, equivalent to [^a-zA-Z0-9_].
    • \s: Whitespace characters, including spaces, tabs, newlines, etc.
    • \S: A non-blank character.
  5. Greedy and non-greedy matching: By default, a quantifier matches as much as possible, which is called greedy matching. Adding ? after the quantifier makes it non-greedy, so it matches as little as possible.
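
The difference between greedy and non-greedy quantifiers is easiest to see on a concrete input. The following is a small sketch; the class GreedyDemo and the helper firstMatch() are illustrative assumptions, built only on Pattern and Matcher from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b> and <i>italic</i>";
        // Greedy: .+ runs to the last ">" in the input
        System.out.println(firstMatch("<.+>", html));   // <b>bold</b> and <i>italic</i>
        // Non-greedy: .+? stops at the first ">"
        System.out.println(firstMatch("<.+?>", html));  // <b>

        // The same applies to {n,m}
        System.out.println(firstMatch("\\d{2,4}", "12345"));   // 1234
        System.out.println(firstMatch("\\d{2,4}?", "12345"));  // 12
    }
}
```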

The general steps for using regular expressions are as follows:

  1. Define a regular expression pattern.
  2. Use the corresponding method to match, search, or replace. The common methods are:
    • Pattern.matches(String regex, CharSequence input): Determines whether the entire string matches the pattern.
    • Matcher.find(): Looks for the next matching substring and returns whether one was found.
    • String.replaceAll(String regex, String replacement): Replaces all matched substrings with the specified string.
    • String.split(String regex): Splits a string based on a regular expression.

Regular expressions in Java are supported by the classes and methods of the java.util.regex package, mainly the Pattern and Matcher classes. Pattern is used to compile regular expressions, and Matcher is used for matching operations.

Below is a simple Java sample code that demonstrates the process of matching and replacing strings using regular expressions:

import java.util.regex.*;

public class RegexExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog.";

        // Match the whole word "fox"
        Pattern pattern = Pattern.compile("\\bfox\\b");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println("Found match at index " + matcher.start());
        }

        // Replace all lowercase vowels with "*"
        String replacedText = text.replaceAll("[aeiou]", "*");
        System.out.println(replacedText);
    }
}

The above code finds the word "fox" with a regular expression and prints the index of each match, then replaces the lowercase vowels in the text with "*".

String and regular expression matching and replacement

Matching and replacing strings with regular expressions is a common and very useful operation. By using regular expressions, we can find, match and replace strings in text that match certain patterns.

The matching operation refers to finding substrings that satisfy a specific pattern in a given string. Java provides the java.util.regex package, whose Pattern and Matcher classes can be used for matching operations.

  1. Pattern class: Used to compile regular expressions into Pattern objects. It provides static methods to compile regular expressions, and flags can be passed to adjust the matching behavior. For example, the Pattern.compile(String regex) method compiles a regular expression into a Pattern object.

  2. Matcher class: Used to perform matching operations on a given string. Calling the Pattern.matcher(CharSequence input) method returns a Matcher object, whose methods can then be used to perform matching operations:

    • matches(): Determines whether the entire string matches the regular expression.
    • find(): Attempts to find the next matching substring in the string.
    • start(): Returns the starting index of the currently matched substring.
    • end(): Returns the index just after the last character of the currently matched substring.
    • group(): Returns the currently matched substring.

The following is a simple sample code that demonstrates how to use regular expressions for matching operations:

import java.util.regex.*;

public class RegexMatchingExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog.";

        // Match the whole word "fox"
        Pattern pattern = Pattern.compile("\\bfox\\b");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            System.out.println("Found match at index " + matcher.start());
        }
    }
}

The above code uses the regular expression "\bfox\b" (written as "\\bfox\\b" in a Java string literal) to match the word "fox" in the text. By calling the Matcher.find() method, we find the position of each match and output its starting index.

The replacement operation refers to replacing the matched substring with a new string. There are two common ways to use regular expressions for replacement operations in Java:

  1. String.replaceAll(String regex, String replacement): This method uses a regular expression to find and replace all matching substrings in the given string. The regex parameter specifies the regular expression, and the replacement parameter specifies the replacement string.

  2. Matcher.replaceAll(String replacement): This method finds and replaces all matching substrings in the input string of the Matcher object. Likewise, the replacement parameter specifies the replacement string.

Here is a sample code that demonstrates how to use regular expressions for replacement operations:

import java.util.regex.*;

public class RegexReplacementExample {
    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog.";

        // Replace the lowercase vowels in the text with "*"
        String replacedText = text.replaceAll("[aeiou]", "*");
        System.out.println(replacedText);
    }
}

The above code uses the regular expression "[aeiou]" to match the lowercase vowels in the text and replace them with "*". Calling the String.replaceAll() method returns the replaced string, which is then output.
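
The Matcher methods group(), start(), and end(), as well as Matcher.replaceAll(), can be combined as in the sketch below. The class MatcherDemo, the date pattern, and the sample text are assumptions made for illustration; the replacement string uses $1 to $3 to refer to the capturing groups of the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherDemo {
    // Three capturing groups: year, month, day
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Collects "match@start-end" for every date found in the text.
    static List<String> findDates(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            found.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return found;
    }

    // Rewrites yyyy-MM-dd as dd/MM/yyyy with Matcher.replaceAll().
    static String reorder(String text) {
        return DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    public static void main(String[] args) {
        String text = "Released 2023-06-29, patched 2023-07-15.";
        System.out.println(findDates(text)); // [2023-06-29@9-19, 2023-07-15@29-39]
        System.out.println(reorder(text));   // Released 29/06/2023, patched 15/07/2023.
    }
}
```

Compiling the Pattern once and reusing it, as done here with a constant, avoids recompiling the expression on every call, which is what String.replaceAll() does internally.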

6. Encoding and decoding of strings

The concept of character encoding and character set

When we deal with text, character encoding and character set are two important concepts. They are used to represent and manipulate characters and text.

A character set (Character Set) is a collection of characters in which each character has a unique number. Common character sets include ASCII and Unicode. A character set defines the mapping between characters and numbers.

ASCII (American Standard Code for Information Interchange) is one of the earliest and most commonly used character sets. It uses 7-bit binary numbers (0-127) to represent 128 characters, including English letters, digits, punctuation marks, and control characters. ASCII is sufficient for English text but cannot represent the accented letters used in many other Western languages.

As computer technology developed, the need arose to represent the characters of languages around the world, and Unicode came into being. Unicode is an international standard character set that assigns a unique number (code point) to almost every character in the world's writing systems. Unicode defines several encoding schemes, the most commonly used of which are UTF-8 and UTF-16.

UTF-8 (Unicode Transformation Format-8) is a variable-length encoding scheme that represents each Unicode character with one to four 8-bit bytes. ASCII characters are encoded as a single byte with the same value as in ASCII, so UTF-8 is backward compatible with ASCII.

UTF-16 (Unicode Transformation Format-16) is also a variable-length encoding scheme. It represents each character in the Basic Multilingual Plane (code points up to U+FFFF) with one 16-bit code unit, and every other character with a pair of 16-bit code units (a surrogate pair). For text that is mostly ASCII, UTF-16 takes up more storage space than UTF-8.

Character Encoding (Character Encoding) is a rule that converts characters in a character set into a computer-recognizable binary form. It uses an encoding table to represent the mapping between characters and numbers.

In Java, a String is exposed as a sequence of UTF-16 code units: each char is one 16-bit code unit. When we deal with text in Java programs, we usually use the String class, whose methods such as length() and charAt() operate on these UTF-16 code units.

In actual development, we need to pay attention to the correct use and conversion of character encodings. When exchanging text data with a different environment, you can use the getBytes() method to convert a string to a byte array in a specified encoding, or use new String(byte[], charset) to convert a byte array in a specified encoding to a string.
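
To illustrate how the same character occupies a different number of bytes depending on the encoding, here is a minimal sketch. The class EncodingSizeDemo and its helpers are hypothetical names; the sample characters are written as Unicode escapes so that the source file encoding does not matter, and UTF_16BE is used so that no byte order mark is added to the output.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSizeDemo {
    // Number of bytes needed to encode s in UTF-8
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to encode s in UTF-16 (big-endian, no byte order mark)
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encodes to UTF-8 bytes and decodes back; returns true if the text survives unchanged
    static boolean roundTrips(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8).equals(s);
    }

    public static void main(String[] args) {
        // "A", e with acute accent, a CJK character, and an emoji outside the BMP
        String[] samples = {"A", "\u00e9", "\u4e2d", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println("chars=" + s.length()
                    + " codePoints=" + s.codePointCount(0, s.length())
                    + " utf8=" + utf8Length(s)
                    + " utf16=" + utf16Length(s));
        }
        // UTF-8 byte counts are 1, 2, 3, 4; UTF-16 byte counts are 2, 2, 2, 4.
        // The emoji has length() == 2 because it is stored as a surrogate pair.
    }
}
```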

Common character encoding methods

Common character encoding methods include ASCII, UTF-8, UTF-16, and ISO-8859-1.

  1. ASCII (American Standard Code for Information Interchange):

    • Encoding range: represented by 7-bit binary numbers, covering the code values 0-127.
    • Features: ASCII is one of the earliest and most commonly used character encodings, mainly used for English text. It can only represent the basic Latin letters, digits, and some punctuation marks; accented letters and non-Latin characters are not supported.
    • When to use: Suitable for representations of English, numbers, and basic punctuation.
  2. UTF-8 (Unicode Transformation Format-8):

    • Encoding range: Variable-length encoding that uses one to four 8-bit bytes per Unicode character.
    • Features: UTF-8 is an encoding form defined by the Unicode standard and can represent every Unicode character. ASCII characters keep their single-byte ASCII encoding. Because it allocates a different number of bytes to different characters, it saves storage space compared with UTF-16 for text that is mostly ASCII; many East Asian characters, however, take three bytes in UTF-8 versus two in UTF-16.
    • Usage scenario: A universal character encoding method, suitable for almost all text data, especially suitable for Internet transmission and storage.
  3. UTF-16 (Unicode Transformation Format-16):

    • Encoding range: Variable-length encoding that uses one or two 16-bit code units per Unicode character.
    • Features: UTF-16 is also an encoding form defined by the Unicode standard and can represent every Unicode character. Compared with UTF-8, UTF-16 takes up more storage space for ASCII text, because every character occupies at least 16 bits. Characters outside the Basic Multilingual Plane are encoded as surrogate pairs, so one character does not always correspond to one code unit.
    • Usage scenarios: Commonly used as the in-memory string representation of platforms and APIs, such as Java's char and String.
  4. ISO-8859-1 (Latin-1):

    • Encoding range: represented by 8-bit binary numbers, covering the code values 0-255.
    • Features: ISO-8859-1 is a character encoding defined by the International Organization for Standardization. It is an extension of ASCII and can represent more Western European characters. However, ISO-8859-1 covers only some European languages and does not support the characters of most other languages in the world.
    • Usage scenario: It is mainly used for Western European language text and is not suitable for multilingual environments or globalized applications.

In addition to the above-mentioned common character encoding methods, there are some other encoding methods, such as GBK, GB2312, Big5, etc., which are mainly used for character representation of Chinese and East Asian languages.

In the actual development process, it is necessary to select an appropriate character encoding method to process and store text data according to specific needs and environments. Make sure that the encoding method used can cover the required character range, and pay attention to the correct conversion and processing of character encoding to avoid problems of garbled characters or missing characters.

Introduction to string encoding and decoding methods

  1. A string is encoded as a sequence of bytes:

    • Use the getBytes() method: This method encodes a string into a byte array according to the specified character set. For example, to encode a string as a byte array using UTF-8: byte[] bytes = str.getBytes("UTF-8");
    • Use the character set conversion classes: Java provides the Charset and CharsetEncoder classes for string encoding conversion. The sample code is as follows:
      Charset charset = Charset.forName("UTF-8");
      CharsetEncoder encoder = charset.newEncoder();
      ByteBuffer buffer = encoder.encode(CharBuffer.wrap(str));
      // Copy only the encoded bytes; the backing array may be larger than the content
      byte[] bytes = new byte[buffer.remaining()];
      buffer.get(bytes);
      
  2. Byte sequence decoded to string:

    • Using the constructor: A String object can be created by decoding a byte array with the specified character set. For example, to decode a byte array to a string using the UTF-8 charset: String str = new String(bytes, "UTF-8");
    • Use the string encoding conversion class: Java provides the Charset and CharsetDecoder classes for string decoding conversion. The sample code is as follows:
      Charset charset = Charset.forName("UTF-8");
      CharsetDecoder decoder = charset.newDecoder();
      CharBuffer buffer = decoder.decode(ByteBuffer.wrap(bytes));
      String str = buffer.toString();
      
  3. URL encoding and decoding:

    • URL Encoding Using the URLEncoder Class: You can use the URLEncoder class to URL encode strings and convert special characters into a URL-safe form. The sample code is as follows:
      String encodedStr = URLEncoder.encode(str, "UTF-8");
      
    • Use the URLDecoder class for URL decoding: You can use the URLDecoder class to decode the URL-encoded string and restore it to the original string. The sample code is as follows:
      String decodedStr = URLDecoder.decode(encodedStr, "UTF-8");
      
  4. Base64 encoding and decoding:

    • Base64 encoding using the Base64 class: Java provides the Base64 class, which can perform Base64 encoding on a byte array to obtain a Base64 string representation. The sample code is as follows:
      byte[] encodedBytes = Base64.getEncoder().encode(bytes);
      String base64Str = new String(encodedBytes, StandardCharsets.UTF_8);
      
    • Base64 decoding using the Base64 class: You can use the Base64 class to decode the Base64-encoded string and restore it to the original byte array. The sample code is as follows:
      byte[] decodedBytes = Base64.getDecoder().decode(base64Str);
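
The URL and Base64 fragments above can be combined into a small round-trip sketch. The class RoundTripDemo and its methods are illustrative assumptions; the checked UnsupportedEncodingException declared by the two-argument URLEncoder.encode() and URLDecoder.decode() overloads is wrapped here so that callers do not have to handle it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class RoundTripDemo {
    // URL-encodes a string using UTF-8
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decodes a URL-encoded string using UTF-8
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Encodes the UTF-8 bytes of a string as Base64 text
    static String base64Encode(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes Base64 text back to a string, interpreting the bytes as UTF-8
    static String base64Decode(String s) {
        return new String(Base64.getDecoder().decode(s), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("a b&c=d"));            // a+b%26c%3Dd
        System.out.println(urlDecode(urlEncode("a b&c=d"))); // a b&c=d
        System.out.println(base64Encode("Hello"));           // SGVsbG8=
        System.out.println(base64Decode("SGVsbG8="));        // Hello
    }
}
```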
      

7. Conversion of strings and other data types

Conversion of strings to basic data types

  1. String to basic data type:

    • Use the static methods of the wrapper classes: each basic data type has a corresponding wrapper class, such as Integer and Double. These wrapper classes provide static methods for converting strings to the corresponding types, such as Integer.parseInt() and Double.parseDouble(). For example:
      String str = "123";
      int num = Integer.parseInt(str);
      double decimal = Double.parseDouble(str);
      
    • Use the valueOf method: The numeric wrapper classes and Boolean provide valueOf methods that convert a string into a wrapper class object of the corresponding type. The wrapper object can then be converted to a primitive data type by automatic unboxing or by an explicit call such as intValue(). For example:
      String str = "123";
      Integer integer = Integer.valueOf(str);
      int num = integer.intValue();
      
  2. Convert basic data types to strings:

    • Using string concatenation: You can use string concatenation (+) to convert primitive data types to strings. For example:
      int num = 123;
      String str = "" + num;
      
    • Use the valueOf method of the String class: String.valueOf() is overloaded for the basic data types, and the wrapper classes provide static toString() methods such as Integer.toString(int). Either can be used to convert a basic data type to a string. For example:
      int num = 123;
      String str = String.valueOf(num);
      

Note that when a string cannot be converted to the corresponding numeric type, a NumberFormatException is thrown. Therefore, when converting a string to a primitive data type, make sure that the string is in the correct format, or catch the exception.

In addition, the Integer class provides overloaded versions of the parseInt() and valueOf() methods that accept a radix (base) for the conversion, and Java 8 added further methods such as Integer.parseUnsignedInt(). These methods provide a more flexible way to convert.
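
The points about NumberFormatException and radix-based parsing can be sketched as follows. The class ParseDemo and the helper tryParseInt() are hypothetical; returning an OptionalInt is just one possible way to report a failed conversion without propagating the exception.

```java
import java.util.OptionalInt;

public class ParseDemo {
    // Parses a decimal integer, returning an empty OptionalInt instead of
    // throwing when the input is null or not a valid int.
    static OptionalInt tryParseInt(String s) {
        if (s == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt(" 42 ").isPresent()); // true
        System.out.println(tryParseInt("12a").isPresent());  // false

        // Overloads that take a radix (base)
        System.out.println(Integer.parseInt("ff", 16));   // 255
        System.out.println(Integer.parseInt("1010", 2));  // 10
        System.out.println(Integer.valueOf("777", 8));    // 511
    }
}
```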

Conversion of string to datetime type

  1. String to datetime type:

    • Using the SimpleDateFormat class: SimpleDateFormat is a class for formatting and parsing dates and times. You can use its parse() method, which throws the checked ParseException on invalid input, to parse a string into a Date. For example:
      String str = "2023-06-29 12:34:56";
      SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
      Date date = format.parse(str);
      
    • Using the DateTimeFormatter class (Java 8 and later): DateTimeFormatter is the formatter of the date and time API introduced in Java 8. Pass it to a parse() method such as LocalDateTime.parse() to parse a string into a date-time type. For example:
      String str = "2023-06-29 12:34:56";
      DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
      LocalDateTime dateTime = LocalDateTime.parse(str, formatter);
      
  2. Convert datetime type to string:

    • Using the SimpleDateFormat class: You can use the format() method of the SimpleDateFormat class to format a Date as a string. For example:
      Date date = new Date();
      SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
      String str = format.format(date);
      
    • Using the DateTimeFormatter class (Java 8 and later): You can pass a DateTimeFormatter to the format() method of a date-time object to format it as a string. For example:
      LocalDateTime dateTime = LocalDateTime.now();
      DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
      String str = dateTime.format(formatter);
      

Note that when using the SimpleDateFormat or DateTimeFormatter class, you need a correct date and time format pattern to specify the format of the string. For example, yyyy represents the year, MM the month, dd the day of the month, HH the hour (24-hour clock), mm the minute, and ss the second.

In addition, Java 8 introduced the java.time package, which provides more powerful and flexible date and time classes. Besides LocalDateTime, there are LocalDate, LocalTime, ZonedDateTime, and other classes that handle dates and times at different granularities. These classes also provide conversions to and from strings, used in a manner similar to the examples above.
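
As a sketch of how parsing and formatting with java.time fit together, the example below parses a string with one pattern, formats it with another, and validates a date string by catching DateTimeParseException. The class DateParseDemo, the output pattern, and the helper names are assumptions for illustration.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class DateParseDemo {
    // DateTimeFormatter is immutable and thread-safe, so instances can be shared as constants
    private static final DateTimeFormatter IN = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter OUT = DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm");

    // Parses text in the IN pattern and renders it in the OUT pattern.
    static String reformat(String text) {
        LocalDateTime dateTime = LocalDateTime.parse(text, IN);
        return dateTime.format(OUT);
    }

    // Returns true if text is a valid ISO-8601 date (yyyy-MM-dd), the default of LocalDate.parse().
    static boolean isValidDate(String text) {
        try {
            LocalDate.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("2023-06-29 12:34:56")); // 29/06/2023 12:34
        System.out.println(isValidDate("2023-06-29"));       // true
        System.out.println(isValidDate("2023-02-30"));       // false
    }
}
```

Unlike SimpleDateFormat, which is not thread-safe, a DateTimeFormatter can be stored in a static field and reused from multiple threads.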

8. Internationalization and localization of strings

Internationalization Support for the Java Platform

The Java platform provides powerful internationalization (abbreviated as i18n) support, enabling developers to write applications that adapt to multiple languages and regions.

  1. Localization:

    • Locale class: A Locale represents a specific language and region. It can be used to identify different languages, countries, and cultures. The Locale class lets you specify the language or region to use, ensuring that your application displays the correct localized content in different environments.
    • Resource Bundle: A resource bundle is a collection that contains localization information. Each language or region can have a corresponding resource package, which contains information such as text, images, and formats related to the language or region. Resource bundles in Java are usually stored in .properties files, and each file corresponds to the resource content of a language or region.
  2. Character encoding and text processing:

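    • Charset class: A Charset represents a character encoding. Java uses Unicode for its internal text representation, but external data and files may use a different character encoding. The Charset class supports conversion between encodings to ensure that text is converted and processed correctly.
    • String and MessageFormat classes: Java's String class provides many methods for processing localized text, such as string concatenation and formatting. The MessageFormat class provides more advanced string formatting and can perform complex message formatting and substitution according to the specified language and region.
  3. Date and time processing:

    • java.time package: Java 8 introduced a new date and time API, located in the java.time package. It provides a powerful set of classes and methods for handling dates and times at different granularities, as well as time-zone-related operations. These classes can format date and time information according to the specified language and region.
    • DateFormat class: DateFormat is the legacy date formatting class, located in the java.text package. It provides basic date and time formatting and can format according to the specified language and region.
  4. Number and currency formatting:

    • NumberFormat class: NumberFormat is the class for formatting numbers, located in the java.text package. It can format numbers, percentages, and currency amounts according to the specified language and region.
    • Currency class: Currency represents a currency and provides information about it, such as the currency code, symbol, and number of decimal places. Combined with NumberFormat, it allows currency values to be formatted correctly for the specified language and region.
  5. Formatting conventions for time and currency:

    • Locale class and the java.util.spi.LocaleServiceProvider class: By using the Locale class, you can specify the formatting conventions for time and currency. Different languages and regions may have different conventions for time, date, and currency formats, and the Java platform provides a set of Locale-based service provider classes to handle these differences.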
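
Locale-dependent number and currency formatting can be sketched with NumberFormat and Currency as follows. The class LocaleFormatDemo and its helpers are illustrative assumptions. The exact currency output (symbol position, kind of space) depends on the locale data shipped with the JDK, so no expected output is shown for those lines.

```java
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

public class LocaleFormatDemo {
    // Formats a number with the grouping and decimal separators of the given locale
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats a currency amount using the conventions of the given locale
    static String formatCurrency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    public static void main(String[] args) {
        double value = 1234567.891;
        System.out.println(formatNumber(value, Locale.US));      // 1,234,567.891
        System.out.println(formatNumber(value, Locale.GERMANY)); // 1.234.567,891

        // Currency output varies with the JDK's locale data
        System.out.println(formatCurrency(1234.5, Locale.US));
        System.out.println(formatCurrency(1234.5, Locale.JAPAN));

        // Currency metadata for a locale
        Currency yen = Currency.getInstance(Locale.JAPAN);
        System.out.println(yen.getCurrencyCode());           // JPY
        System.out.println(yen.getDefaultFractionDigits());  // 0
    }
}
```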

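Localization of strings

  1. Use a resource bundle (Resource Bundle):
    A resource bundle is a common localization approach in Java. It lets you store localization-related strings in separate properties files, each corresponding to a language or region. Resource bundles usually use .properties files, whose content is expressed as key-value pairs.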

    • Create a resource bundle: Create a .properties file, for example messages.properties, to store the default-language strings. Then create corresponding properties files for other languages and regions, such as messages_zh_CN.properties for Simplified Chinese.
    • Load the resource bundle: In Java code, use the ResourceBundle class to load the corresponding resource bundle file. The language and region can be specified as required; if not specified, the system default locale is used.
    • Get a localized string: Use the ResourceBundle object to get the localized string for the specified key.

    The sample code is as follows:

    // Load the resource bundle
    ResourceBundle bundle = ResourceBundle.getBundle("messages", new Locale("zh", "CN"));
    // Get the localized string
    String localizedString = bundle.getString("hello.world");
    
  2. Use the MessageFormat class:
    MessageFormat is a utility class provided by Java for formatting strings. It can perform complex message formatting and substitution according to the specified language and region. Dynamic substitution is achieved by using placeholders in the pattern string and passing arguments.

    The sample code is as follows:

    String pattern = "Hello, {0}! Today is {1}.";
    String name = "John";
    String date = DateFormat.getDateInstance(DateFormat.FULL).format(new Date());
    String formattedString = MessageFormat.format(pattern, name, date);
    

    In the above example, pattern is a string with placeholders, and the MessageFormat.format() method replaces the placeholders {0} and {1} with the corresponding arguments, formatted according to the default locale.

  3. Use the String.format() method:
    String.format() is a commonly used string formatting method in Java, and it also supports localization. It produces formatted output from a format string and arguments.

    The sample code is as follows:

    String name = "Alice";
    int age = 30;
    String localizedString = String.format("My name is %s and I'm %d years old.", name, age);
    

    In the above example, %s and %d are format specifiers. The String.format() method replaces them with the corresponding arguments, formatted according to the default locale unless a Locale is passed as the first argument.
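
To show where localization actually affects String.format(), here is a minimal sketch that passes an explicit Locale as the first argument. The class LocaleStringFormatDemo and the method price() are hypothetical names; the "," flag in the format specifier requests locale-specific grouping separators.

```java
import java.util.Locale;

public class LocaleStringFormatDemo {
    // Formats an amount with two decimals and grouping separators for the given locale
    static String price(Locale locale, double amount) {
        return String.format(locale, "%,.2f", amount);
    }

    public static void main(String[] args) {
        double amount = 1234567.891;
        System.out.println(price(Locale.US, amount));      // 1,234,567.89
        System.out.println(price(Locale.GERMANY, amount)); // 1.234.567,89
    }
}
```

Passing the Locale explicitly keeps the output independent of the default locale of the machine on which the program runs.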

Origin blog.csdn.net/u012581020/article/details/131455658