Arrays.stream(elements).reduce(str, (r, w) -> r.replace(w, ""))
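which gives the expected output.
If you want to keep reducing the input string until no further removal is possible, it is best to iterate until nothing changes: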
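As a self-contained illustration of the one-pass approach above, here is a minimal sketch. The class name `RemoveOnce`, the method name `removeOnce`, and the sample elements are assumptions made for the example, not part of the original answer. Note that the fold relies on the stream being sequential, since `str` is used as a starting value rather than a true identity.

```java
import java.util.Arrays;

class RemoveOnce {
    // Folds over the elements left to right, removing every occurrence
    // of each element from the running result (single pass only).
    static String removeOnce(String[] elements, String str) {
        return Arrays.stream(elements).reduce(str, (r, w) -> r.replace(w, ""));
    }

    public static void main(String[] args) {
        // Sample data is hypothetical.
        System.out.println(removeOnce(new String[] {"cat", "dog"}, "a cat and a dog"));
    }
}
```

With these sample values the call prints `a  and a ` (the spaces around the removed words are kept).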
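// stream(...) is Arrays.stream, statically imported
String n = str, o = null;
do {
    // o keeps the value from before this pass; stop once a pass changes nothing
    n = stream(elements).reduce(o = n, (r, w) -> r.replace(w, ""));
} while (!n.equals(o));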
System.out.println(n);
Then, with the input string
This is a caterpillar and that is a docatg.
you get
This is a erpillar and that is a .
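The iteration above can be packaged as a small runnable sketch. The element list `{"dog", "cat"}` is an assumption (the answer does not show the actual `elements` array), chosen because it reproduces the output quoted above; the class and method names are likewise hypothetical.

```java
import java.util.Arrays;

class RemoveUntilStable {
    // Repeats the one-pass removal until a pass leaves the string unchanged.
    static String removeUntilStable(String[] elements, String str) {
        String current = str;
        String previous;
        do {
            previous = current;
            current = Arrays.stream(elements)
                    .reduce(previous, (r, w) -> r.replace(w, ""));
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        String[] elements = {"dog", "cat"}; // assumed element list
        System.out.println(removeUntilStable(elements,
                "This is a caterpillar and that is a docatg."));
    }
}
```

Here the first pass removes "cat", which exposes a new "dog" in "docatg"; the second pass removes it, and the third pass confirms nothing is left to remove.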
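If you really need a fast algorithm, use Aho-Corasick, which finds all keywords in a single scan of the text, at a cost of O(n) once the trie is built: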
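// Trie and Emit come from the org.ahocorasick library
StringBuilder sb = new StringBuilder();
int begining = -1; // index of the last removed character
for (Emit e : Trie.builder().addKeywords(elements).build().parseText(str)) {
    // copy the text between the previous match and this one
    sb.append(str, begining + 1, e.getStart());
    begining = e.getEnd();
}
// copy whatever follows the last match
sb.append(str, begining + 1, str.length());
System.out.println(sb.toString());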
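The splicing loop above assumes that each match starts after the previous one ends. If the keyword set can produce overlapping matches (for example "cat" and "catcher"), a match may start before `begining + 1`, and `append` would then be called with a start index greater than its end index. The following library-free sketch shows one way to make the splicing step tolerate overlaps. It is only an illustration: `SpanRemoval`, `removeSpans`, and the `int[]{start, end}` representation (inclusive indices, sorted by start) are hypothetical stand-ins for the matches, not part of the Aho-Corasick library.

```java
class SpanRemoval {
    // Removes the given spans from str. Each span is {start, end} with
    // inclusive indices, and spans are assumed to be sorted by start.
    static String removeSpans(String str, int[][] spans) {
        StringBuilder sb = new StringBuilder();
        int last = -1; // index of the last character removed so far
        for (int[] span : spans) {
            if (span[0] > last) {
                // keep the text between the previous removal and this span
                sb.append(str, last + 1, span[0]);
            }
            // overlapping or nested spans only extend the removed region
            last = Math.max(last, span[1]);
        }
        sb.append(str, last + 1, str.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        // "cat" at 11..13 and "catcher" at 11..17 overlap
        int[][] spans = {{11, 13}, {11, 17}};
        System.out.println("[" + removeSpans("pass it to catcher", spans) + "]");
    }
}
```

With the sample spans, the whole of "catcher" is removed and the program prints `[pass it to ]`.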
Aside: performance comparison with Oussama ZAGHDOUD's solution (times in seconds):
Equals = true // all three outputs are equal
Time1 = 18,548822 // Oussama ZAGHDOUD's solution, O(n^2)
Time2 = 0,134459 // Aho-Corasick, O(n), Trie built inside the timed call
Time3 = 0,065056 // Aho-Corasick, O(n), Trie precomputed
Full working code:
import static java.util.stream.Collectors.joining;
import static java.util.stream.IntStream.range;

import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Stream;

import org.ahocorasick.trie.Emit;
import org.ahocorasick.trie.Trie;

// The class name is arbitrary; the original snippet showed only the methods.
public class Benchmark {

    // Oussama ZAGHDOUD's solution: delete each element in place, one occurrence at a time.
    static String alg1(String[] elements, String str) {
        StringBuilder bf = new StringBuilder(str);
        str = null;
        Stream.of(elements).forEach(e -> {
            int index = bf.indexOf(e);
            while (index != -1) {
                index = bf.indexOf(e);
                if (index != -1) {
                    bf.delete(index, index + e.length());
                }
            }
        });
        return bf.toString();
    }

    // Aho-Corasick, building the Trie inside the call.
    static String alg2(String[] elements, String str) {
        StringBuilder sb = new StringBuilder();
        int begining = -1;
        for (Emit e : Trie.builder().addKeywords(elements).build().parseText(str)) {
            sb.append(str, begining + 1, e.getStart());
            begining = e.getEnd();
        }
        sb.append(str, begining + 1, str.length());
        return sb.toString();
    }

    // Aho-Corasick with a precomputed Trie.
    static String alg3(Trie trie, String str) {
        StringBuilder sb = new StringBuilder();
        int begining = -1;
        for (Emit e : trie.parseText(str)) {
            sb.append(str, begining + 1, e.getStart());
            begining = e.getEnd();
        }
        sb.append(str, begining + 1, str.length());
        return sb.toString();
    }

    public static void main(String... args) {
        final ThreadLocalRandom rnd = ThreadLocalRandom.current();
        // 1,000 random keywords and a text containing 100,000 occurrences of them
        String[] elements = range(0, 1_000).mapToObj(i -> "w" + rnd.nextInt()).toArray(String[]::new);
        String str = range(0, 100_000)
                .mapToObj(i -> "z" + rnd.nextInt() + " " + elements[rnd.nextInt(elements.length)])
                .collect(joining(", "));
        Trie trie = Trie.builder().addKeywords(elements).build();
        long t0 = System.nanoTime();
        String s1 = alg1(elements, str);
        long t1 = System.nanoTime();
        String s2 = alg2(elements, str);
        long t2 = System.nanoTime();
        String s3 = alg3(trie, str);
        long t3 = System.nanoTime();
        // elapsed times are converted from nanoseconds to seconds
        System.out.printf("Equals = %s%nTime1 = %f%nTime2 = %f%nTime3 = %f%n",
                s1.equals(s2) && s2.equals(s3), (t1 - t0) * 1e-9, (t2 - t1) * 1e-9, (t3 - t2) * 1e-9);
    }
}
elements = {"cat", "catcher",...}: if "cat" is removed first, the phrase becomes "pass it to her". - Charlie G
["app","ply"]: for the phrase "apply", we can remove "app", leaving "ly", or remove "ply", leaving "ap". - Charlie G
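The two comments above point out that the result depends on the order in which elements are removed. A minimal sketch that reproduces both cases follows. The class and method names are hypothetical, and `removeLongestFirst` is just one possible policy (it is an assumption, not something the answer prescribes): it resolves the "cat"/"catcher" case but cannot decide between equally long overlapping keywords such as "app" and "ply".

```java
import java.util.Arrays;
import java.util.Comparator;

class OrderMatters {
    // Removes each element in the given order (single pass).
    static String removeInOrder(String[] elements, String str) {
        String result = str;
        for (String w : elements) {
            result = result.replace(w, "");
        }
        return result;
    }

    // One possible policy: try longer elements before shorter ones.
    static String removeLongestFirst(String[] elements, String str) {
        String[] sorted = elements.clone();
        Arrays.sort(sorted, Comparator.comparingInt(String::length).reversed());
        return removeInOrder(sorted, str);
    }

    public static void main(String[] args) {
        System.out.println(removeInOrder(new String[] {"app", "ply"}, "apply")); // ly
        System.out.println(removeInOrder(new String[] {"ply", "app"}, "apply")); // ap
        String[] cats = {"cat", "catcher"};
        System.out.println("[" + removeInOrder(cats, "pass it to catcher") + "]");      // [pass it to her]
        System.out.println("[" + removeLongestFirst(cats, "pass it to catcher") + "]"); // [pass it to ]
    }
}
```

Which behavior is correct depends on the requirements, so the removal order (or a tie-breaking rule) should be stated explicitly.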