一些人可能已经知道,我正在为StackOverflow聊天系统工作,开发XMPP(Jabber)集成,使用xmpp4r包编写的XMPP组件。
我正在处理一个问题(好吧,有很多问题,但是目前只有一个问题 :-) 我正在提取聊天记录的JSON源并提取消息的HTML。我使用Ruby TidyHTML绑定将JSON中的HTML转换为XHTML,以便我可以将其作为XMPP消息发送--因为XMPP消息只是XML,将HTML转换为XHTML应该使其成为有效的XML,我只需将其放入<message>
标签中即可。
<message to='jeswah@smart-safe-secure.com/Token' type='groupchat' xmlns='jabber:client'><body><div class="onebox ob-message"><a class="roomname" href="/transcript/message/263372#263372"><span title="2010-11-04 19:20:23Z">1 hour ago</span></a>, by <span class="user-name">Fosco</span> <br/><div class="quote"><div class="room-mini"><div class="room-mini-header"><h3><img class="small-site-logo" title="Gaming" alt="Gaming" width="16" height="16" src="http://sstatic.net/gaming/img/favicon.ico" />&nbsp;<span class="room-name"><a href="http://chat.stackexchange.com/rooms/28/minecraft-talk">Minecraft Talk</a></span></h3><div class="room-mini-description">Everything Minecraft, including classic and survival mode</div></div><div class="room-current-user-count" title="current users">9</div><div class="mspark" style="height:25px;width:205px">
<div class="mspbar" style="width:8px;height:13px;left:0px;"></div><div class="mspbar" style="width:8px;height:9px;left:8px;"></div><div class="mspbar" style="width:8px;height:2px;left:16px;"></div><div class="mspbar" style="width:8px;height:8px;left:24px;"></div><div class="mspbar" style="width:8px;height:1px;left:32px;"></div><div class="mspbar" style="width:8px;height:1px;left:56px;"></div><div class="mspbar" style="width:8px;height:0px;left:64px;"></div><div class="mspbar" style="width:8px;height:0px;left:88px;"></div><div class="mspbar" style="width:8px;height:0px;left:96px;"></div><div class="mspbar" style="width:8px;height:11px;left:104px;"></div><div class="mspbar" style="width:8px;height:7px;left:112px;"></div><div class="mspbar" style="width:8px;height:7px;left:120px;"></div><div class="mspbar" style="width:8px;height:25px;left:128px;"></div><div class="mspbar" style="width:8px;height:14px;left:136px;"></div><div class="mspbar" style="width:8px;height:4px;left:144px;"></div><div class="mspbar" style="width:8px;height:7px;left:152px;"></div><div class="mspbar" style="width:8px;height:19px;left:160px;"></div><div class="mspbar" style="width:8px;height:19px;left:168px;"></div><div class="mspbar" style="width:8px;height:12px;left:176px;"></div><div class="mspbar" style="width:8px;height:11px;left:184px;"></div><div class="mspbar now" style="height:25px;left:154px;"></div></div>
<div class="clear-both"></div></div></div></div></body><html xmlns='http://jabber.org/protocol/xhtml-im'><body xmlns='http://www.w3.org/1999/xhtml'><div class="onebox ob-message"><a class="roomname" href="/transcript/message/263372#263372"><span title="2010-11-04 19:20:23Z">1 hour ago</span></a>, by <span class="user-name">Fosco</span><br />
<div class="quote">
<div class="room-mini"><div class="room-mini-header">
<h3><img class="small-site-logo" title="Gaming" alt="Gaming" width="16" height="16" src="http://sstatic.net/gaming/img/favicon.ico" /> <span class="room-name"><a href="http://chat.stackexchange.com/rooms/28/minecraft-talk">Minecraft Talk</a></span></h3>
<div class="room-mini-description">Everything Minecraft, including classic and survival mode</div>
</div>
<div class="room-current-user-count" title="current users">9</div>
<div class="mspark" style="height:25px;width:205px">
<div class="mspbar" style="width:8px;height:13px;left:0px;"></div>
<div class="mspbar" style="width:8px;height:9px;left:8px;"></div>
<div class="mspbar" style="width:8px;height:2px;left:16px;"></div>
<div class="mspbar" style="width:8px;height:8px;left:24px;"></div>
<div class="mspbar" style="width:8px;height:1px;left:32px;"></div>
<div class="mspbar" style="width:8px;height:1px;left:56px;"></div>
<div class="mspbar" style="width:8px;height:0px;left:64px;"></div>
<div class="mspbar" style="width:8px;height:0px;left:88px;"></div>
<div class="mspbar" style="width:8px;height:0px;left:96px;"></div>
<div class="mspbar" style="width:8px;height:11px;left:104px;"></div><div class="mspbar" style="width:8px;height:7px;left:112px;"></div><div class="mspbar" style="width:8px;height:7px;left:120px;"></div><div class="mspbar" style="width:8px;height:25px;left:128px;"></div><div class="mspbar" style="width:8px;height:14px;left:136px;"></div>
<div class="mspbar" style="width:8px;height:4px;left:144px;"></div>
<div class="mspbar" style="width:8px;height:7px;left:152px;"></div>
<div class="mspbar" style="width:8px;height:19px;left:160px;"></div>
<div class="mspbar" style="width:8px;height:19px;left:168px;"></div><div class="mspbar" style="width:8px;height:12px;left:176px;"></div>
<div class="mspbar" style="width:8px;height:11px;left:184px;"></div>
<div class="mspbar now" style="height:25px;left:154px;"></div>
</div>
<div class="clear-both"></div>
</div>
</div>
</div>
</body></html></message>
我收到的信息是一个引用聊天室链接的消息。
Ruby 给出的错误如下:
IOError: stream closed
/usr/lib/ruby/1.8/xmpp4r/stream.rb:594:in `empty?'
/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:153:in `empty?'
/usr/lib/ruby/1.8/rexml/parsers/baseparser.rb:193:in `pull'
/usr/lib/ruby/1.8/rexml/parsers/sax2parser.rb:92:in `parse'
/usr/lib/ruby/1.8/xmpp4r/streamparser.rb:79:in `parse'
/usr/lib/ruby/1.8/xmpp4r/stream.rb:75:in `start'
/usr/lib/ruby/1.8/xmpp4r/stream.rb:72:in `initialize'
/usr/lib/ruby/1.8/xmpp4r/stream.rb:72:in `new'
/usr/lib/ruby/1.8/xmpp4r/stream.rb:72:in `start'
/usr/lib/ruby/1.8/xmpp4r/connection.rb:119:in `start'
/usr/lib/ruby/1.8/xmpp4r/component.rb:70:in `start'
/usr/lib/ruby/1.8/xmpp4r/connection.rb:77:in `connect'
/usr/lib/ruby/1.8/xmpp4r/component.rb:47:in `connect'
./classes/SOXMPP_Bridge.rb:20:in `initialize'
./soxmpp.rb:81:in `new'
./soxmpp.rb:81
最后,我的问题!
考虑到发送无效的XML到XMPP服务器会将我踢出,有没有办法使用Ruby在将其发送到XMPP服务器之前验证(并且最好是 更正)XML?很可能,更正它将是我为每种情况编写额外代码的问题,其中Tidy未生成有效XML,但我至少想阻止脚本崩溃。那么,在将XML发送到XMPP服务器之前,如何验证XML?