How do I remove a dynamic run of characters from a string?

4

I'm new to Android development and am trying to remove a dynamic run of characters from a string. My string:

"Beginning of String....&lt;img src="http://webaddress.com" height="1" width="1"/&gt;"

I want to remove "&lt", "&gt", and everything between them, so that only "Beginning of String..." remains. So far I have tried the following, without success.

description = description.replaceFirst("(?s)(&lt)(.*?)(&gt)","$1$3");

I have also run this code on a similar string, where it works fine, so I don't understand what I'm doing wrong.

description = description.replaceFirst("(?s)(<sub>)(.*?)(</sub>)","$1$3");

My class:

public class RssReader {

private final static String BOLD_OPEN = "<B>";
private final static String BOLD_CLOSE = "</B>";
private final static String BREAK = "<BR>";
private final static String ITALIC_OPEN = "<I>";
private final static String ITALIC_CLOSE = "</I>";
private final static String SMALL_OPEN = "<SMALL>";
private final static String SMALL_CLOSE = "</SMALL>";


public static List<JSONObject> getLatestRssFeed(){
    String feed = "http://feeds.feedburner.com/MetalMarketCommentary";
    //http://globoesporte.globo.com/dynamo/futebol/times/vasco/rss2.xml
    //http://feeds.feedburner.com/GoldMoneyGoldResearch +
    //http://feeds.feedburner.com/GoldsilvercomNews +
    //http://feed43.com/7466558277232702.xml
    //http://feeds.feedburner.com/SilverGoldDaily
    //http://feeds.feedburner.com/MetalMarketCommentary
    //http://link.brightcove.com/services/player/bcpid1683318714001?bckey=AQ~~,AAAAC59qSJk~,vyxcsD3OtBPHZ2UIrFX2-wdCLTYNyMNn&bclid=1644543007001&bctid=1854182861001

    RSSHandler rh = new RSSHandler();
    List<Article> articles =  rh.getLatestArticles(feed);
    Log.e("RSS ERROR", "Number of articles " + articles.size());
    return fillData(articles);
}



private static List<JSONObject> fillData(List<Article> articles) {

    List<JSONObject> items = new ArrayList<JSONObject>();
    for (Article article : articles) {
        JSONObject current = new JSONObject();
        try {
            buildJsonObject(article, current);
        } catch (JSONException e) {
            Log.e("RSS ERROR", "Error creating JSON Object from RSS feed");
        }
        items.add(current);
    }

    return items;
}



private static void buildJsonObject(Article article, JSONObject current) throws JSONException {
    String title = article.getTitle();
    String description = article.getDescription();
    description = description.replaceFirst("(?s)(<sub>)(.*?)(</sub>)","$1$3");
    int start = description.indexOf(".&");
    description= description.substring(0, start);
    String date = article.getPubDate();
    String imgLink = article.getImgLink();

    StringBuffer sb = new StringBuffer();
    sb.append(BOLD_OPEN).append(title).append(BOLD_CLOSE);
    sb.append(BREAK);
    sb.append(description);
    sb.append(BREAK);
    sb.append(SMALL_OPEN).append(ITALIC_OPEN).append(date).append(ITALIC_CLOSE).append(SMALL_CLOSE);

    current.put("text", Html.fromHtml(sb.toString()));
    current.put("imageLink", imgLink);
}
}

The XML I am parsing:

<item>
           <title>Gold Market Recap Report</title>
           <link>http://feedproxy.google.com/~r/MetalMarketCommentary/~3/jGYtkXdSKWs/mid-session-gold_703.html</link>
           <description>&lt;img src="http://www.cmegroup.com/images/1x1trans.gif?destination=http://www.cmegroup.com/education/market-commentary/metals/2012/09/mid-session-gold_703.html" alt=""/&gt;For the week December gold forged a trading range of roughly $37 an ounce. With gold prices attimes seemingly on the rocks and poised for a downside washout it was a change of pace to see afresh upside breakout in the Friday morning trade....&lt;img src="http://feeds.feedburner.com/~r/MetalMarketCommentary/~4/jGYtkXdSKWs" height="1" width="1"/&gt;</description>
           <pubDate>Fri, 21 Sep 2012 19:50:37 GMT</pubDate>
           <guid isPermaLink="false">http://www.cmegroup.com/education/market-commentary/metals/2012/09/mid-session-gold_703.html?source=rss</guid>
           <dc:date>2012-09-21T19:50:37Z</dc:date>
           <feedburner:origLink>http://www.cmegroup.com/education/market-commentary/metals/2012/09/mid-session-gold_703.html?source=rss</feedburner:origLink>
      </item>

2
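(1) What do you mean by "without success"? (2) Why do you write &lt&gt rather than &lt;&gt;? (3) You say you also want to remove the &lt and &gt, but your code replaces (&lt)(...)(&gt) with $1$3, which is &lt&gt. Why? - ruakh
@ruakh I'm new to regular expressions. From the tutorial I read, I understood that this code would delete everything after "Beginning of String". - B. Money
2 Answers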
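
As ruakh's third point implies, the replacement `$1$3` writes the two captured delimiters back, so only the text between them is dropped. Below is a minimal sketch of a pattern that removes the delimiters too. It assumes the description may contain either the literal entities `&lt;` and `&gt;` or, if the XML parser has already decoded them, plain `<` and `>`. The class and method names are hypothetical and not part of the question's code.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original RssReader class.
final class DescriptionCleaner {
    // Escaped form: &lt; ... &gt; (reluctant match; (?s) lets '.' cross line breaks)
    private static final Pattern ESCAPED_TAG = Pattern.compile("(?s)&lt;.*?&gt;");
    // Decoded form: < ... >
    private static final Pattern RAW_TAG = Pattern.compile("(?s)<.*?>");

    // Removes every tag-like span, including its delimiters.
    static String stripTags(String description) {
        String withoutEscaped = ESCAPED_TAG.matcher(description).replaceAll("");
        return RAW_TAG.matcher(withoutEscaped).replaceAll("");
    }

    public static void main(String[] args) {
        String description =
            "Beginning of String....&lt;img src=\"http://webaddress.com\" height=\"1\" width=\"1\"/&gt;";
        System.out.println(stripTags(description)); // prints "Beginning of String...."
    }
}
```

Using `replaceAll` rather than `replaceFirst` matters for the feed shown above, because each description carries one image tag at the start and another at the end. The raw-tag pattern would also delete text between an unrelated `<` and `>` in ordinary prose, so it may be worth dropping if the feed can contain those characters.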

2
    String string = "Beginning of String....&lt;img src=\"http://webaddress.com\" height=\"1\" width=\"1\"/&gt;"; //Escape whatever has to be escaped
    System.out.println(string);
    int start = string.indexOf("&");
    int end = string.lastIndexOf("&");
    String temp = string.substring(start, (end+4)); // "&gt;" is 4 characters long; end+3 would leave the ';' behind
    string = string.replace(temp, "");
    System.out.println(string);

This removes everything between &lt and &gt, including the entities themselves.

1
major &amp; fail &lt;fail&gt; fail - Qtax
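Thanks for the reply. It needs to be more adaptable, because I'm parsing XML into a string and then appending HTML markup, so I can't just delete everything after that point. I'll post my class. Please check my edit. - B. Money
@B.Money Give me a moment and I'll put together something more adaptable for you. - Raghav Sood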

0

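You don't have to use a regular expression for this.

Create a method that finds and removes a substring starting with &lt; and ending with &gt;, and run it in a loop.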

String mstring = "Beginning of String....&lt;img src=\"http://webaddress.com\" height=\"1\" width=\"1\"/&gt;";
//  The double quotes must be escaped when the string is written as a literal

private static final String LT = "&lt;";
private static final String GT = "&gt;";

// True only if an opening &lt; is followed somewhere by a closing &gt;
public boolean containsTags()
{
    int indexof_lt = mstring.indexOf(LT);

    return indexof_lt >= 0 && mstring.indexOf(GT, indexof_lt) >= 0;
}

// Removes the first &lt;...&gt; span; call only when containsTags() is true
public void removeTags()
{
    int indexof_lt, indexof_gt;

    indexof_lt = mstring.indexOf(LT);
    indexof_gt = mstring.indexOf(GT, indexof_lt);

    String str_first, str_last;

    str_first = mstring.substring(0, indexof_lt);
    str_last = mstring.substring(indexof_gt + GT.length());

    mstring = str_first + str_last;
}

public void removeAllTags()
{
    while (containsTags())
    {
        removeTags();
    }
}
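
The loop described in this answer can also be written as a single method that takes the string as a parameter and returns the cleaned copy, which may be easier to call from something like `buildJsonObject`. This is only a sketch under that assumption: the class name `EntityTagStripper` is hypothetical, and an opener with no matching closer is left in place rather than treated as an error.

```java
// Hypothetical helper illustrating the find-and-remove loop without regex.
final class EntityTagStripper {
    private static final String OPEN = "&lt;";
    private static final String CLOSE = "&gt;";

    // Returns input with every &lt;...&gt; span (delimiters included) removed.
    static String removeAllTags(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int pos = 0; // start of the text not yet copied
        while (true) {
            int open = input.indexOf(OPEN, pos);
            if (open < 0) {
                break; // no more openers
            }
            int close = input.indexOf(CLOSE, open + OPEN.length());
            if (close < 0) {
                break; // unmatched opener: keep the remainder verbatim
            }
            out.append(input, pos, open);   // copy the text before the tag
            pos = close + CLOSE.length();   // skip past the closing entity
        }
        out.append(input, pos, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "Beginning of String....&lt;img src=\"http://webaddress.com\" height=\"1\" width=\"1\"/&gt;";
        System.out.println(removeAllTags(s)); // prints "Beginning of String...."
    }
}
```

Because it searches for the full entities rather than a bare `&`, other entities such as `&amp;` in the text are left untouched, which is the case Qtax's comment on the first answer points at.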
