Java replaceAll: escaping the regex and the replacement
Escaping the replacement
Consider the following code:
public static void main(String[] args){
System.out.println("abcdef".replaceAll("abc","$"));//1
System.out.println("abcdef".replaceAll("abc", "\\"));//2
}
At first glance nothing looks wrong, but each of the two calls throws when executed:
1: Exception in thread "main" java.lang.IllegalArgumentException: Illegal group reference: group index is missing
2: Exception in thread "main" java.lang.IllegalArgumentException: character to be escaped is missing
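This is because \ and $ are special characters in replacement. The backslash escapes the character that follows it, and the dollar sign references a capturing group: $n refers to the n-th group. They are used correctly as follows: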
System.out.println("abcdef".replaceAll("abc", "\\\\"));// \def
System.out.println("abcdef".replaceAll("abc", "\\$"));// $def
System.out.println("www.baidu.com".replaceAll("(.*)\\.(baidu)\\.(.*)", "https://$1.google.$3"));// https://www.google.com
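These examples show what the two special characters do. If replacement contains either of them but does not follow these rules, replaceAll throws an exception. Besides escaping with \ by hand, there is another approach that is especially useful when you do not know whether replacement contains special characters: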
import java.util.regex.Matcher;

public static void main(String[] args){
System.out.println(replaceAll("abcdef","abc", "\\"));// \def
System.out.println(replaceAll("abcdef","abc", "$"));// $def
}
// Replaces every match of regex with replacement taken literally
public static String replaceAll(String s,String regex,String replacement){
return s.replaceAll(regex,Matcher.quoteReplacement(replacement));
}
Matcher.quoteReplacement(replacement) escapes every special character in replacement, which makes it a very safe approach. Matcher.quoteReplacement() itself is just an escaping routine; here is its JDK source:
/**
* Returns a literal replacement <code>String</code> for the specified
* <code>String</code>.
*
* This method produces a <code>String</code> that will work
* as a literal replacement <code>s</code> in the
* <code>appendReplacement</code> method of the {@link Matcher} class.
* The <code>String</code> produced will match the sequence of characters
* in <code>s</code> treated as a literal sequence. Slashes ('\') and
* dollar signs ('$') will be given no special meaning.
*
* @param s The string to be literalized
* @return A literal string replacement
* @since 1.5
*/
public static String quoteReplacement(String s) {
if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
return s;
StringBuilder sb = new StringBuilder();
for (int i=0; i<s.length(); i++) {
char c = s.charAt(i);
if (c == '\\' || c == '$') {
sb.append('\\');
}
sb.append(c);
}
return sb.toString();
}
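To make the effect concrete, the sketch below prints what Matcher.quoteReplacement returns and compares a quoted replacement with an unquoted one. The class name QuoteReplacementDemo, the helper replaceLiteral, and the sample strings are made up for illustration.

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // Hypothetical helper: replaces every regex match with the replacement taken literally.
    public static String replaceLiteral(String s, String regex, String replacement) {
        return s.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // quoteReplacement puts a backslash in front of each '\' and '$'.
        System.out.println(Matcher.quoteReplacement("$1\\")); // \$1\\

        // Unquoted: $1 is a reference to the first captured group.
        System.out.println("abcdef".replaceAll("(abc)", "[$1]")); // [abc]def

        // Quoted: $1 is inserted as the two characters '$' and '1'.
        System.out.println(replaceLiteral("abcdef", "(abc)", "[$1]")); // [$1]def
    }
}
```

Quoting is the right choice only when the replacement is meant as plain text; if group references such as $1 are intended, the replacement has to stay unquoted.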
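Escaping the regex
As with replacement, if regex contains characters that need escaping, you can add \ yourself. But when you are not sure whether regex contains special characters, things can go wrong: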
public static void main(String[] args){
System.out.println(replaceAll("My name is ${name}","${name}", "Hanmeimei"));
}
public static String replaceAll(String s,String regex,String replacement){
return s.replaceAll(regex,Matcher.quoteReplacement(replacement));
}
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${name}
^
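The pattern fails because { starts a repetition quantifier in a regular expression (as in a{2,3}), and $ is a metacharacter as well (an end anchor). When the pattern is known in advance, escaping by hand works. A minimal sketch, where the class name ManualRegexEscapeDemo and the helper fillName are hypothetical:

```java
import java.util.regex.Matcher;

public class ManualRegexEscapeDemo {
    // Hypothetical helper: substitutes the literal text ${name} in a template.
    public static String fillName(String template, String name) {
        // '$' and '{' are escaped by hand so the regex engine reads them literally;
        // '}' is escaped too for symmetry. The replacement is quoted as well.
        return template.replaceAll("\\$\\{name\\}", Matcher.quoteReplacement(name));
    }

    public static void main(String[] args) {
        System.out.println(fillName("My name is ${name}", "Hanmeimei")); // My name is Hanmeimei
    }
}
```

Hand escaping only helps when the text is fixed at compile time; it does not scale to patterns that arrive at run time.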
When you know that regex is meant as literal text rather than as a regular expression, you can escape it with Pattern.quote():
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args){
System.out.println(replaceAll("My name is ${name}","${name}", "Hanmeimei"));//My name is Hanmeimei
}
// Treats both regex and replacement as literal text
public static String replaceAll(String s,String regex,String replacement){
return s.replaceAll(Pattern.quote(regex),Matcher.quoteReplacement(replacement));
}
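As a rough illustration of the two quoting calls working together, the sketch below prints what Pattern.quote produces and compares the result with String.replace(CharSequence, CharSequence), which replaces all occurrences without involving regular expressions at all. The class name LiteralReplaceDemo, the helper replaceAllLiteral, and the sample strings are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplaceDemo {
    // Hypothetical helper: treats both target and replacement as plain text.
    public static String replaceAllLiteral(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Pattern.quote wraps the text in \Q...\E so the regex engine reads it literally.
        System.out.println(Pattern.quote("${name}")); // \Q${name}\E

        // Both the target and the replacement contain '$' here.
        String viaRegex = replaceAllLiteral("Price: ${price}", "${price}", "$9.99");

        // String.replace is not regex-based, so no quoting is needed.
        String viaReplace = "Price: ${price}".replace("${price}", "$9.99");

        System.out.println(viaRegex);   // Price: $9.99
        System.out.println(viaReplace); // Price: $9.99
    }
}
```

When neither argument needs regex semantics, String.replace is probably the simpler option; the quoting pair is useful when the call has to go through replaceAll or a Matcher.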