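I added a counter so that the timeout is checked only once every n charAt reads, which reduces the overhead.
Notes:
Some people pointed out that charAt might not be called often enough. I added the foo variable only to demonstrate how often charAt is called, and that it is often enough. If you are going to use this in production, remove that counter: it decreases performance and will eventually overflow the long if it runs on a server for a long time. In this example, charAt is called about 30 million times every 0.8 seconds or so (not tested under proper microbenchmarking conditions, this is just a proof of concept). If you want better precision you can set a lower checkInterval, at the cost of performance (in the long run, System.currentTimeMillis() > timeoutTime is more expensive than the if clause).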
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.goikosoft.test.RegexpTimeoutException;

public class RegularExpressionUtils {

    // Demo only: incremented each time the check interval elapses. Remove in production.
    public static long foo = 0;

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        Matcher matcher = createMatcherWithTimeout(
                "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "(x+x+)+y", 10000, 30000000);
        try {
            System.out.println(matcher.matches());
        } catch (RegexpTimeoutException e) {
            // Catch only the timeout so other Pattern / Matcher exceptions are not swallowed.
            System.out.println("Operation timed out after " + (System.currentTimeMillis() - millis) + " milliseconds");
        }
        System.out.print(foo);
    }

    public static Matcher createMatcherWithTimeout(String stringToMatch, String regularExpression, long timeoutMillis,
            int checkInterval) {
        Pattern pattern = Pattern.compile(regularExpression);
        return createMatcherWithTimeout(stringToMatch, pattern, timeoutMillis, checkInterval);
    }

    public static Matcher createMatcherWithTimeout(String stringToMatch, Pattern regularExpressionPattern,
            long timeoutMillis, int checkInterval) {
        // A negative timeout disables the check and returns a plain matcher.
        if (timeoutMillis < 0) {
            return regularExpressionPattern.matcher(stringToMatch);
        }
        CharSequence charSequence = new TimeoutRegexCharSequence(stringToMatch, timeoutMillis, stringToMatch,
                regularExpressionPattern.pattern(), checkInterval);
        return regularExpressionPattern.matcher(charSequence);
    }

    // Wraps the input so that every charAt call made by the regex engine can enforce the deadline.
    private static class TimeoutRegexCharSequence implements CharSequence {

        private final CharSequence inner;
        private final long timeoutMillis;
        private final long timeoutTime;
        private final String stringToMatch;
        private final String regularExpression;
        private final int checkInterval;
        private int attempts;

        TimeoutRegexCharSequence(CharSequence inner, long timeoutMillis, String stringToMatch,
                String regularExpression, int checkInterval) {
            super();
            this.inner = inner;
            this.timeoutMillis = timeoutMillis;
            this.stringToMatch = stringToMatch;
            this.regularExpression = regularExpression;
            timeoutTime = System.currentTimeMillis() + timeoutMillis;
            this.checkInterval = checkInterval;
            this.attempts = 0;
        }

        @Override
        public char charAt(int index) {
            // Read the clock only once the interval has elapsed; otherwise just count the call.
            if (this.attempts == this.checkInterval) {
                foo++;
                if (System.currentTimeMillis() > timeoutTime) {
                    throw new RegexpTimeoutException(regularExpression, stringToMatch, timeoutMillis);
                }
                this.attempts = 0;
            } else {
                this.attempts++;
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            // Note: the new wrapper computes a fresh deadline from timeoutMillis.
            return new TimeoutRegexCharSequence(inner.subSequence(start, end), timeoutMillis, stringToMatch,
                    regularExpression, checkInterval);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }
}
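To get a feel for the checkInterval trade-off, the counting logic in charAt can be isolated from the timeout itself. The following is only a minimal sketch: CheckIntervalDemo, CountingCharSequence, charAtCalls and clockChecks are hypothetical names of mine, not part of the code above. It counts how many charAt calls the regex engine makes and how many of them would have read the clock.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not part of the answer's code.
class CheckIntervalDemo {

    /** Wraps a CharSequence and counts charAt calls and the calls that would read the clock. */
    static final class CountingCharSequence implements CharSequence {
        private final CharSequence inner;
        private final int checkInterval;
        private int attempts = 0;
        long charAtCalls = 0;
        long clockChecks = 0;

        CountingCharSequence(CharSequence inner, int checkInterval) {
            this.inner = inner;
            this.checkInterval = checkInterval;
        }

        @Override
        public char charAt(int index) {
            charAtCalls++;
            if (attempts == checkInterval) {
                // This is the point where System.currentTimeMillis() would be compared to the deadline.
                clockChecks++;
                attempts = 0;
            } else {
                attempts++;
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            // Simplification for the demo: the returned sequence is not counted.
            return inner.subSequence(start, end);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    public static void main(String[] args) {
        CountingCharSequence seq = new CountingCharSequence("xxxxxxxxxxxxxxxxxxxx", 9);
        boolean matched = Pattern.compile("(x+x+)+y").matcher(seq).matches();
        System.out.println("matched=" + matched
                + ", charAt calls=" + seq.charAtCalls
                + ", clock checks=" + seq.clockChecks);
    }
}
```

One thing this makes visible: because the counter is reset to 0 after a check and compared with == before being incremented, the clock is read once every checkInterval + 1 calls, not every checkInterval calls. The exact number of charAt calls for a given pattern depends on the JDK version, so run it on your own setup before picking an interval.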
And here is the custom exception, so you can catch only that exception and avoid swallowing other exceptions that Pattern / Matcher may throw.
package com.goikosoft.test;

public class RegexpTimeoutException extends RuntimeException {

    private static final long serialVersionUID = 6437153127902393756L;

    // Context of the timed-out match; null / 0 when the generic constructors are used.
    private final String regularExpression;
    private final String stringToMatch;
    private final long timeoutMillis;

    public RegexpTimeoutException() {
        super();
        regularExpression = null;
        stringToMatch = null;
        timeoutMillis = 0;
    }

    public RegexpTimeoutException(String message, Throwable cause) {
        super(message, cause);
        regularExpression = null;
        stringToMatch = null;
        timeoutMillis = 0;
    }

    public RegexpTimeoutException(String message) {
        super(message);
        regularExpression = null;
        stringToMatch = null;
        timeoutMillis = 0;
    }

    public RegexpTimeoutException(Throwable cause) {
        super(cause);
        regularExpression = null;
        stringToMatch = null;
        timeoutMillis = 0;
    }

    public RegexpTimeoutException(String regularExpression, String stringToMatch, long timeoutMillis) {
        super("Timeout occurred after " + timeoutMillis + "ms while processing regular expression '"
                + regularExpression + "' on input '" + stringToMatch + "'!");
        this.regularExpression = regularExpression;
        this.stringToMatch = stringToMatch;
        this.timeoutMillis = timeoutMillis;
    }

    public String getRegularExpression() {
        return regularExpression;
    }

    public String getStringToMatch() {
        return stringToMatch;
    }

    public long getTimeoutMillis() {
        return timeoutMillis;
    }
}
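To show why a dedicated exception type is useful, here is a rough, self-contained sketch of a caller that treats a timeout as "no match" while still letting other failures surface. TimeoutCatchDemo, DeadlineCharSequence and matchesOrFallback are hypothetical names, and the nested RegexpTimeoutException is a condensed stand-in for the class above so that the snippet compiles on its own.

```java
import java.util.regex.Pattern;

// Hypothetical, condensed stand-ins for the classes above so this sketch compiles on its own.
class TimeoutCatchDemo {

    static class RegexpTimeoutException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        RegexpTimeoutException(String message) {
            super(message);
        }
    }

    /** Throws RegexpTimeoutException from charAt once the absolute deadline has passed. */
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineMillis;
        private final int checkInterval;
        private int attempts = 0;

        DeadlineCharSequence(CharSequence inner, long deadlineMillis, int checkInterval) {
            this.inner = inner;
            this.deadlineMillis = deadlineMillis;
            this.checkInterval = checkInterval;
        }

        @Override
        public char charAt(int index) {
            if (attempts == checkInterval) {
                if (System.currentTimeMillis() > deadlineMillis) {
                    throw new RegexpTimeoutException("Regex deadline exceeded");
                }
                attempts = 0;
            } else {
                attempts++;
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            // Unlike the code above, the sub-sequence keeps the same absolute deadline.
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineMillis, checkInterval);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    /** Returns the match result, or the fallback value if the deadline passes first. */
    static boolean matchesOrFallback(String regex, String input, long timeoutMillis,
            int checkInterval, boolean fallback) {
        // A PatternSyntaxException from compile() is deliberately not caught here.
        Pattern pattern = Pattern.compile(regex);
        CharSequence guarded = new DeadlineCharSequence(
                input, System.currentTimeMillis() + timeoutMillis, checkInterval);
        try {
            return pattern.matcher(guarded).matches();
        } catch (RegexpTimeoutException e) {
            // Only the timeout is handled; any other runtime exception propagates to the caller.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesOrFallback("x+y", "xxxy", 1000, 100, false));
    }
}
```

Catching RuntimeException at that spot would also hide bugs such as an IndexOutOfBoundsException from a later group() call, which is the kind of swallowing the custom exception is meant to avoid.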
Based on Andreas' answer. The main credit should go to him and his source.