如何在Delphi中将简单的RichText转换为HTML标签?

8

在StackOverflow上有很多讨论,但大多数都比我需要的更复杂,而且大多是针对其他语言的。

我有一个MySQL远程数据库,在其中有一个“帮助”表格,用于填充使用该数据库的动态网站的帮助页面代码。

我决定制作一个Delphi应用程序来管理该网站,而不是通过网站本身进行操作,以提高速度和安全性。

我想放置一个TRichEdit以使帮助文本并使用诸如对齐、粗体、斜体和下划线样式等简单的事物。我不想使用图片和字体。

如何选择那些富文本样式的文本并将其转换为HTML以将其放入我的远程数据库中的BLOB字段然后再次转换为富文本,以便以后进行编辑


1
有一个商业化的、易于使用的 RTF 到 HTML / RTF 到 XHTML 转换器库,适用于 Delphi,名为 ScroogeXHTML。它可以配置为仅处理基本字体属性。(免责声明:我是这个库的作者) - mjn
1
这个库要多少钱? - NaN
你可以在这个页面找到它;-) - TLama
@EASI - 大约120欧元 :) - Gabriel
3个回答

7
如果您确实想使用 TRichEdit 生成 RTF 内容,那么最好将其生成的原生 RTF 格式与转换后的 HTML 一起存储。如果您使用 TRichEdit 的唯一原因是为了具有简单的格式化功能,那么您可能最好使用生成本机 HTML 内容的 HTML 编辑控件。

无论选择哪种方式,最好将原始格式存储供用户编辑内容,并根据需要将其转换为其他格式(而不是在两个方向上进行转换)。

如果您使用 TRichEdit,则可以轻松地在控件中流式传输 RTF 内容,但我建议使用 TJvRichEdit 而不是 TRichEdit

procedure GetRTFData(MS: TMemoryStream; RTF: TRichEdit);
begin
  MS.Clear;
  RTF.Lines.SaveToStream(MS);
  MS.Position := 0;
end;

procedure SetRTFData(MS: TMemoryStream; RTF: TRichEdit);
begin
  MS.Position := 0;
  RTF.StreamFormat := sfRichText;
  RTF.Lines.LoadFromStream(MS);
end;

手动将RTF转换为HTML并不是一项容易的任务。需要考虑Unicode字符、字体样式、字体代码、段落格式、编号列表、特殊HTML字符等等。即使您只需要支持简单的格式,用户通常会使用其他功能,这会导致转换头痛--例如从MSWord复制内容并将其粘贴到应用程序中,包括各种格式和字体样式。
JvRichEditToHtml在将RTF转换为HTML方面做得还不错,但我们最终编写了自己的转换单元,因为我们在RTF方面做了更多的工作。只要用户不通过复制/粘贴引入复杂的内容或使用键盘快捷键格式化内容(例如,项目符号=ctrl+shft+L,缩进=ctrl+M等),JvRichEditToHtml应该可以轻松处理您所描述的内容。
此外,如果您想绕过在RTF中编写并将其转换为HTML的复杂性,则还有几个很好的HTML编辑控件可供Delphi使用: Google结果::Delphi,HTML,编辑器,组件

Stack Overflow :: Delphi, HTML, Editor, Component

我们使用TRichView,因为它具有广泛的功能。它可以加载/创建RTF,并导出HTML。然而它不是免费的。如果你正在寻找免费的东西,TJvRichViewJvRichEditToHtml是很好的选择。


我曾经苦于那些HTML编辑器。它们甚至都不算是半个好的/完整的编辑器。即使商业解决方案也很粗糙。 - Gabriel

6

在尝试了许多不准确的解决方案后,我受到了这个解决方案的启发:将RTF转换为HTML和HTML转换为RTF

这个想法是,TWebBrowser控件(在设计/编辑模式下)可以正确处理并转换从剪贴板粘贴的Rich文本格式。

uses SHDocVw, MSHTML;

function ClipboardToHTML(AParent: TWinControl): WideString;
var
  wb: TWebBrowser;

  function WaitDocumentReady: Boolean;
  var
    StartTime: DWORD;
  begin
    StartTime := GetTickCount;
    while wb.ReadyState <> READYSTATE_COMPLETE do
    begin
      Application.HandleMessage;
      if GetTickCount >= StartTime + 2000 then // time-out of max 2 sec
      begin
        Result := False; // time-out
        Exit;
      end;
    end;
    Result := True;
  end;
begin
  Result := '';
  wb := TWebBrowser.Create(nil);
  try
    wb.Silent := True;
    wb.Width := 0;
    wb.Height := 0;
    wb.Visible := False;
    TWinControl(wb).Parent := AParent;
    wb.HandleNeeded;
    if wb.HandleAllocated then
    begin
      wb.Navigate('about:blank');
      (wb.Document as IHTMLDocument2).designMode := 'on';
      if WaitDocumentReady then
      begin
        (wb.Document as IHTMLDocument2).execCommand('Paste', False, 0);
        Result := (wb.Document as IHTMLDocument2).body.innerHTML;
      end;
    end;
  finally
    wb.Free;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  RichEdit1.SelectAll;
  RichEdit1.CopyToClipboard;

  ShowMessage(ClipboardToHTML(Self));
end;

很遗憾,它的表现不太好。字体大小不相等。 - Xel Naga

1

这对我非常有用,没有使用TWebBrowser。

但是从HTML到RichEdit。

希望有人会发现它很有用。

        ////////////////////////////////////////////////////////
    //                                                    //
    //          Formatting Richedit with HTML tags        //
    //                    by Carbonize                    //
    //                                                    //
    //    This is my second Delphi project and another    //
    //    conversion of one of my Visual Basic codes.     //
    //                                                    //
    //    This code goes through a string looking for     //
    //    <xxx> style tags then formats the richedit      //
    //    according to the text in the tag. It does       //
    //    colours, italics, bold, underline, line breaks, //
    //    font face, and font size.                       //
    //                                                    //
    //    I made the original VB version as a way of      //
    //    formatting the help files in one of my programs //
    //    to make them look better and be easier to read. //
    //                                                    //
    //    Please remember I am new to Delphi so some      //
    //    of the code may be sloppy. When handling        //
    //    <Font tags I did it the long way so it could    //
    //    handle tags with spaces between the 'face' or   //
    //    'size' and the actual face/size such as         //
    //    <font name = "Comic Sans MS"> as some people    //
    //    do their HTML this way.                         //
    //                                                    //
    ////////////////////////////////////////////////////////
    //                                                    //
    //             Monday, 27th January 2003              //
    //                                                    //
    ////////////////////////////////////////////////////////



    unit Unit1;

    interface

    uses
    Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
    Dialogs, ComCtrls, StdCtrls;

    type
    TForm1 = class(TForm)
    txtHTML: TMemo;
    Button1: TButton;
    rchHTML: TRichEdit;
    procedure Button1Click(Sender: TObject);
    procedure DisplayText(Tag: string; Buf:string);
    procedure FontTags(Tag: string; Buf:string);
    private
    { Private declarations }
    public
    { Public declarations }
    end;

    var
    Form1: TForm1;

    implementation

    {$R *.dfm}

    procedure TForm1.Button1Click(Sender: TObject);
    var
            Buf : string;
            Bumf : string;
            tag : string;
    begin
    //Clear the Richedit and set default formatting
    rchHTML.text := '';
    rchHTML.SelAttributes.style := [];
    rchHTML.selattributes.color := clBlack;
    rchHTML.SelAttributes.Name := 'MS Sans Serif';
    rchHTML.SelAttributes.Size := 8;

    //strip all new line commands as it screws the code up.
    Bumf := stringreplace(txtHTML.Text, #13#10, '', [rfReplaceAll]);

    //if there's no '<' then there's no tags so display whole string.
    if pos('<', Bumf) = 0 then
    begin   //but first convert any <> replacement strings.
            Bumf := stringreplace(Bumf, '&lt;', '<', [rfReplaceAll]);
            Bumf := stringreplace(Bumf, '&gt;', '>', [rfReplaceAll]);
            Bumf := stringreplace(Bumf, '&amp;', '&', [rfReplaceAll]);
            rchHTML.SelText := Bumf;
            exit;
    end;

    //else thats display all text before the '<'.
    //But first convert any replacements that are in there.
    Buf := copy(Bumf, 0, pos('<', Bumf) - 1);
    Buf := stringreplace(Buf, '&lt;', '<', [rfReplaceAll]);
    Buf := stringreplace(Buf, '&gt;', '>', [rfReplaceAll]);
    Buf := stringreplace(Buf, '&amp;', '&', [rfReplaceAll]);
    rchHTML.SelText := Buf;

    //then strip all text before the '<'
    delete(Bumf, 1 ,pos('<', Bumf) - 1);

    //If there's no '>' then it's not a tag so just post it all
    If pos('>', Bumf) = 0 then
    begin
            rchHTML.SelText := Bumf;
            exit;
    end;

    //else we need to parse any and all tags and the strings to post
    While length(Bumf) > 0 do
    begin
    //the tag := all text between '<' and '>'
    Tag := copy(Bumf, 2, pos('>', Bumf) - 2);
    //the text we will post is everything after the '>'
    Buf := copy(Bumf, pos('>', Bumf) + 1, length(Bumf) - pos('>', Bumf));
    //Empty Bumf
    Bumf := '';
    //Are there any '<'s in the tag?
    while pos('<', tag) > 0 do
    begin  //if so then post all text before the'<' as it's not part of a tag
            rchhtml.SelText := '<' + copy(Tag, 1, pos('<', Tag) - 1);
            //tag then := all text from the '<'
            Tag := copy(Tag, pos('<', Tag) + 1, length(Tag) - pos('<', Tag));
    End;
    //if there's a '<' in Buf then there may be another Tag
    If pos('<', Buf) > 0 then
    begin   //So we make Bumf := everything after the '<'
            Bumf := copy(Buf, pos('<', Buf), (length(Buf) - pos('<', Buf)) + 1);
            //And buf := everything before it
            Buf := copy(Buf, 1, pos('<', Buf) - 1);
    end;
    //now we pass the tag and the buf text to our text formatting procedure
    DisplayText(Tag, Buf);
    end;

    end;

    procedure TForm1.DisplayText(Tag: string; Buf:string);
    begin
    //There is a problem where if buf = '' the richedit attributes didn't get set
    //so I included this shoddy fix.
    //If you know why this bug happens please let me know.
    If Buf = '' then Buf := #12;

    //in case we want to actually show a tag or the markers used for < and >
    Buf := stringreplace(Buf, '&lt;', '<', [rfReplaceAll]);
    Buf := stringreplace(Buf, '&gt;', '>', [rfReplaceAll]);
    Buf := stringreplace(Buf, '&amp;', '&', [rfReplaceAll]);
    Tag := stringreplace(Tag, '&lt;', '<', [rfReplaceAll]);
    Tag := stringreplace(Tag, '&gt;', '>', [rfReplaceAll]);
    Tag := stringreplace(Tag, '&amp;', '&', [rfReplaceAll]);

    //if it's a font tag then send it to font handling
    If copy(lowercase(Tag), 0, 5) = 'font ' then
    begin
    FontTags(Tag, Buf);
    exit;
    end;

    //go through all known tags, formatting richedit as appropriate
    if lowercase(Tag) = 'red' then
            rchHTML.SelAttributes.Color := clRed
    else if lowercase(Tag) = 'black' then
            rchHTML.SelAttributes.Color := clBlack
    else if lowercase(Tag) = 'blue' then
            rchHTML.SelAttributes.Color := clBlue
    else if lowercase(Tag) = 'cyan' then
            rchHTML.SelAttributes.Color := clAqua
    else if ((lowercase(Tag) = 'gray') or (lowercase(Tag) = 'grey')) then
            rchHTML.SelAttributes.Color := clGray
    else if lowercase(Tag) = 'green' then
            rchHTML.SelAttributes.Color := clGreen
    else if lowercase(Tag) = 'pink' then
            rchHTML.SelAttributes.Color := clFuchsia
    else if lowercase(Tag) = 'purple' then
            rchHTML.SelAttributes.Color := clPurple
    else if lowercase(Tag) = 'yellow' then
            rchHTML.SelAttributes.Color := clYellow
    else if lowercase(Tag) = 'b' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style + [fsBold]
    else if lowercase(Tag) = '/b' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style - [fsBold]
    else if lowercase(Tag) = 'i' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style + [fsItalic]
    else if lowercase(Tag) = '/i' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style - [fsItalic]
    else if lowercase(Tag) = 'u' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style + [fsUnderline]
    else if lowercase(Tag) = '/u' then
            rchHTML.SelAttributes.Style := rchHTML.SelAttributes.Style - [fsUnderline]
    else if lowercase(Tag) = 'br' then
            Buf := #13#10 + Buf        
    //If it's an unknown tag then display the tag
    else Buf := '<' + Tag + '>' + Buf;

    //Now we've set the attributes we can display the text.
    rchHTML.SelText := Buf;
    end;

    procedure TForm1.FontTags(Tag: string; Buf:string);
    var
            a: integer;
            tag2: String;
    begin
    //we know it's a font tag so strip the 'font '
    Delete(Tag, 1, 5);

    //lets see if the want to set the font face
    If pos('face', lowercase(Tag)) > 0 then
    begin   //get the position of 'face'
            a := pos('face', lowercase(Tag));
            //set our temporary string to := all text from 'face' to the end.
            tag2 := copy(Tag, a, length(Tag) - (a - 1));
            //Then get position of the opening".
            a := pos('"', Tag2) + 1;
            //set our temporary string to := all text from " + 1 to the end.
            Tag2 := copy(Tag2, a, length(Tag2) - (a - 1));
            //Then locate the closing "
            a := pos('"', Tag2) - 1;
            //then make tag2 = the text between " and "
            Tag2 := copy(Tag2, 1, a);
            //Now set the font name to the chosen one.
            rchHTML.SelAttributes.Name := Tag2;
    end;

    //Now check if they want to set the fonts size.
    If pos('size', lowercase(Tag)) > 0 then
    begin   //get the position of 'size'
            a := pos('size', lowercase(Tag));
            //Make temporary string := all text from 'size' to end of the tag
            tag2 := copy(Tag, a, length(Tag) - (a - 1));
            //get position of opening "
            a := pos('"', Tag2) + 1;
            //make tag2 := all text from " + 1 to the end.
            Tag2 := copy(Tag2, a, length(Tag2) - (a - 1));
            //get position of the closing "
            a := pos('"', Tag2) - 1;
            //make tag2 := text between " and "
            Tag2 := copy(Tag2, 1, a);
            //set the fonts size
            rchHTML.SelAttributes.Size := strtoint(Tag2);
    end;

    //Now we've formatted we can display the text.
    rchHTML.seltext := Buf;

    end;

    end.

网站宕机了。你还有代码吗? - Xel Naga
1
更新了。添加了代码。只需添加一个RichEdit、Memo和一个按钮。 - Zen Of Kursat
谢谢,但这会将HTML转换为RichEdit。我不会使用它,因为现在HTML包含了大量的CSS。 - Xel Naga

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接