This article has one goal: more efficient regular expression matching, specifically by cutting the time spent compiling regular expressions. It is aimed at systems that run a lot of regex matches.
The essence of the speed-up is a local cache plus fewer compilations (think of the advice in Effective Java, and of the TCP cost of opening database connections). If conditions permit, the regular expressions that the system uses frequently can be collected and loaded into memory during system initialization. Optionally, the expressions can be collected from data and loaded into the regex utility ahead of time by a job or a background thread.
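The preloading idea can be sketched with nothing but the JDK. This is a minimal illustration, not the article's implementation: the class name `PatternWarmup`, its methods, and the sample expressions are hypothetical, and a plain `ConcurrentHashMap` stands in for the Caffeine cache so the example has no dependencies (unlike Caffeine, it is unbounded).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical warm-up helper; the names are illustrative only. */
final class PatternWarmup {

    // JDK stand-in for the local cache: expression text -> compiled pattern.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private PatternWarmup() {
    }

    /** Returns the cached pattern, compiling it only on the first request. */
    static Pattern get(String expression) {
        return CACHE.computeIfAbsent(expression, Pattern::compile);
    }

    /**
     * Compiles every expression up front, for example during system initialization.
     *
     * @return how many expressions were newly added to the cache
     */
    static int preload(Collection<String> expressions) {
        int before = CACHE.size();
        for (String expression : expressions) {
            get(expression);
        }
        return CACHE.size() - before;
    }

    public static void main(String[] args) {
        // Assumed list of frequently used expressions, collected ahead of time.
        int loaded = preload(Arrays.asList("[0-9]{3}[-]{1}[0-9]{4}", "^\\s*$"));
        System.out.println("preloaded patterns: " + loaded);
        System.out.println(get("[0-9]{3}[-]{1}[0-9]{4}").matcher("333-8889").matches());
    }
}
```

In a real system the call to `preload` would sit in whatever startup hook, scheduled job, or background thread is available, so that request-handling code only ever hits the cache.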
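Maven dependency (the version is omitted here; look up the one you need in the Maven repository yourself). The class below also imports StringUtils from Apache Commons Lang 3 and StopWatch from Spring, so those libraries must be on the classpath as well.
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>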
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import org.apache.commons.lang3.StringUtils;
import org.springframework.util.StopWatch;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/**
 * Regular expression utility class that caches compiled patterns.
 */
public class RegUtils {

    /** Local cache of compiled patterns, keyed by the expression text (maximum size 512). */
    public static final Cache<String, Pattern> REG_PATTERNS = Caffeine.newBuilder().maximumSize(512).build();

    private RegUtils() {
    }

    /**
     * Gets the compiled pattern for an expression, compiling and caching it on first use.
     *
     * @param expression the regular expression text
     * @return the compiled pattern
     * @throws IllegalArgumentException if the expression is empty or cannot be compiled
     */
    public static Pattern getRegPattern(String expression) {
        if (StringUtils.isEmpty(expression)) {
            throw new IllegalArgumentException("expression is empty");
        }
        // Fast path: the expression has already been compiled.
        Pattern cached = REG_PATTERNS.getIfPresent(expression);
        if (cached != null) {
            return cached;
        }
        Pattern compiled;
        try {
            compiled = Pattern.compile(expression);
        } catch (PatternSyntaxException e) {
            // Keep the original exception as the cause so its stack trace is not lost.
            throw new IllegalArgumentException("expression error:" + expression + " " + e.getMessage(), e);
        }
        REG_PATTERNS.put(expression, compiled);
        return compiled;
    }

    /**
     * Regex replacement; a substitute for the replaceAll method of the String class.
     *
     * @param value       the input string
     * @param reg         the regular expression to match
     * @param replacement the replacement string
     * @return the input with every match replaced
     */
    public static String replaceReg(String value, String reg, String replacement) {
        return getRegPattern(reg).matcher(value).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String expression = "[0-9]{3}[-]{1}[0-9]{4}";
        String data = "333-8889";

        // First call: the pattern is compiled and put into the cache.
        StopWatch stopWatch = new StopWatch(expression);
        stopWatch.start();
        boolean matches = getRegPattern(expression).matcher(data).matches();
        stopWatch.stop();
        System.out.println(matches);
        System.out.println(stopWatch.prettyPrint());

        List<String> strings = new ArrayList<>();
        strings.add("333-8889"); // true
        for (String string : strings) {
            // Second call: the pattern is fetched from the cache.
            StopWatch stopWatch1 = new StopWatch(expression);
            stopWatch1.start();
            boolean matches1 = getRegPattern(expression).matcher(string).matches();
            stopWatch1.stop();
            System.out.println(matches1);
            System.out.println(stopWatch1.prettyPrint());
        }
    }
}
Comparing the results
The pattern is put into the cache after the first compilation and fetched directly from the cache the second time. Skipping the compilation on the second call makes a qualitative difference in speed.
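The comparison can also be reproduced without Caffeine or Spring. The following is a rough sketch under stated assumptions, not a rigorous benchmark: the class name `CompileVsCacheTiming` and the iteration count are made up for illustration, a `ConcurrentHashMap` stands in for the cache, and timings taken with `System.nanoTime` are affected by JIT warm-up and will vary between machines and runs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical timing harness; the names and iteration count are illustrative only. */
final class CompileVsCacheTiming {

    // JDK stand-in for the local cache of compiled patterns.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private CompileVsCacheTiming() {
    }

    /** Recompiles the expression on every iteration, as String.matches does. */
    static int countMatchesUncached(String expression, String input, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (Pattern.compile(expression).matcher(input).matches()) {
                hits++;
            }
        }
        return hits;
    }

    /** Compiles the expression once and reuses the cached pattern afterwards. */
    static int countMatchesCached(String expression, String input, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            Pattern pattern = CACHE.computeIfAbsent(expression, Pattern::compile);
            if (pattern.matcher(input).matches()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String expression = "[0-9]{3}[-]{1}[0-9]{4}";
        String input = "333-8889";
        int iterations = 100_000;

        long t0 = System.nanoTime();
        int uncached = countMatchesUncached(expression, input, iterations);
        long t1 = System.nanoTime();
        int cached = countMatchesCached(expression, input, iterations);
        long t2 = System.nanoTime();

        System.out.printf("uncached: %d matches in %d ms%n", uncached, (t1 - t0) / 1_000_000);
        System.out.printf("cached:   %d matches in %d ms%n", cached, (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same match count; only the elapsed time differs. For numbers that can be trusted, a benchmarking harness such as JMH is the better tool.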