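I am currently working on an application where I need to take the innerHTML of the body and extract the text inside it as JSON. That JSON will be sent for translation, and the translated JSON will then be used as input to rebuild the same HTML markup with the translated text. Please see the snippets below.
HTML input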
<section>Hello, <div>This is some text which I need to extract.<a class="link">It can be <strong> complicated.</strong></a></div><span>The extracted text should contain the html tag if it has any html tag in the span,p or a tag</span><p>Please see the <span>desired output below.</span></p>Thanks!</section>
Translation JSON output
{
"text1":"Hello, ",
"text2":"This is some text which I need to extract.",
"text3":"It can be <strong> complicated.</strong>",
"text4":"The extracted text should contain the html tag if it has any html tag in the span,p or a tag",
"text5":"Please see the <span>desired output below.</span>",
"text6":"Thanks!"
}
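To make the rule behind this output concrete, here is a minimal sketch of how the extraction might work. It is not a definitive implementation. The names `extractTexts`, `UNIT_TAGS` and `TOKEN_RE` are made up for illustration, and the `~n~` placeholder format is borrowed from my own attempt further down. It assumes well-formed markup, no void elements such as `<br>`, no comments, no `>` inside attribute values, and that `a`, `p` and `span` are the only tags whose inner HTML should stay together. In a browser, walking the parsed DOM would likely be more robust than this string-based walk.

```javascript
// Tags whose whole inner HTML counts as one translatable unit
// (assumption taken from "span, p or a tag" in the question).
const UNIT_TAGS = new Set(['a', 'p', 'span']);

// Naive tokenizer: a token is either one tag or one run of text.
const TOKEN_RE = /<\/?[a-zA-Z][^>]*>|[^<]+/g;

function extractTexts(html) {
  const texts = {};
  let template = '';
  let count = 0;
  let unit = null; // { html, depth } while inside an a/p/span unit

  // Stores one text under "textN" and returns its ~N~ placeholder.
  const addText = (text) => {
    count += 1;
    texts['text' + count] = text;
    return '~' + count + '~';
  };

  for (const token of html.match(TOKEN_RE) || []) {
    const tag = /^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/.exec(token);
    if (unit) {
      if (tag && tag[1] === '/') {
        if (unit.depth === 0) {
          // This closing tag ends the unit itself.
          template += addText(unit.html) + token;
          unit = null;
          continue;
        }
        unit.depth -= 1; // closing tag of a nested inline element
      } else if (tag) {
        unit.depth += 1; // opening tag of a nested inline element
      }
      unit.html += token; // nested tags and text stay inside the unit
    } else if (tag) {
      template += token;
      if (tag[1] !== '/' && UNIT_TAGS.has(tag[2].toLowerCase())) {
        unit = { html: '', depth: 0 };
      }
    } else if (token.trim() !== '') {
      template += addText(token); // loose text directly inside a block tag
    } else {
      template += token; // whitespace only, nothing to translate
    }
  }
  return { texts, template };
}

const result = extractTexts('<div>Hi <p>Click <a>here</a></p></div>');
console.log(JSON.stringify(result.texts));
// {"text1":"Hi ","text2":"Click <a>here</a>"}
console.log(result.template);
// <div>~1~<p>~2~</p></div>
```

The template with placeholders is kept on the side so the translated strings can be dropped back into the untouched markup later.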
Translated JSON input
{
"text1":"Hello,-in spanish ",
"text2":"This is some text which I need to extract.-in spanish",
"text3":"It can be <strong> complicated.-in spanish</strong>",
"text4":"The extracted text should contain the html tag if it has any html tag in the span,p or a tag-in spanish",
"text5":"Please see the <span>desired output below.-in spanish</span>",
"text6":"Thanks!-in spanish"
}
Translated HTML output
<section>Hello,-in spanish <div>This is some text which I need to extract.-in spanish<a class="link">It can be <strong> complicated.-in spanish</strong></a></div><span>The extracted text should contain the html tag if it has any html tag in the span,p or a tag-in spanish</span><p>Please see the <span>desired output below.-in spanish</span></p>Thanks!-in spanish</section>
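The rebuild step could then be a plain placeholder substitution. The following is only a sketch under assumptions: `fillTemplate` is a hypothetical helper, and it presumes the original markup has already been turned into a template with `~n~` placeholders whose numbers match the `textN` keys of the translated JSON.

```javascript
// Replaces every ~n~ placeholder with translations["text" + n].
function fillTemplate(template, translations) {
  return template.replace(/~(\d+)~/g, (placeholder, n) => {
    const key = 'text' + n;
    // Keep the placeholder when a translation is missing, so the gap
    // stays visible instead of turning into the string "undefined".
    return Object.prototype.hasOwnProperty.call(translations, key)
      ? translations[key]
      : placeholder;
  });
}

const sampleTemplate = '<p>~1~</p>~2~';
const sampleTranslations = JSON.parse(
  '{"text1":"Click <a>here</a>-in spanish","text2":"Thanks!-in spanish"}'
);
console.log(fillTemplate(sampleTemplate, sampleTranslations));
// <p>Click <a>here</a>-in spanish</p>Thanks!-in spanish
```

Because the replacement is supplied as a function, any `$` characters in the translated text are inserted literally rather than being read as replacement patterns.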
I have tried various regular expressions. The one below is what I ended up with, but it does not produce the desired output.
// encode
const bodyHTML = '<a class="test">hello world<strong> this is gonna be hard</strong></a>';
// replace the quotes with escaped quotes
const htmlContent = bodyHTML.replace(/"/g, '\\"');
let count = 0;
let translationObj = {};
// swap every run of text between '>' and '<' for a numbered ~n~ placeholder
let newHtml = htmlContent.replace(/\>(.*?)\</g, function(match) {
  // strip the angle brackets
  match = match.replace(/\>|\</g, '');
  count = count + 1;
  translationObj[count] = match;
  return '>~' + count + '~<';
});
// stand-in for the JSON that comes back from translation
const translationJSON = '{"1":"hello world in spanish","2":" this is gonna be hard in spanish","3":""}';
// decode
let translatedHtml = '';
const translatedObj = JSON.parse(translationJSON);
// put the translated text back in place of each ~n~ placeholder
translatedHtml = newHtml.replace(/\~(.*?)\~/g, function(match) {
  // strip the tildes to get the key
  match = match.replace(/\~|\~/g, '');
  return translatedObj[match];
});
// replace the escaped quotes with plain quotes
translatedHtml = translatedHtml.replace(/\\"/g, '"');
// log the intermediate values
console.log("bodyHTML", bodyHTML);
console.log('translationObj', translationObj);
console.log("translationJSON", translationJSON);
console.log('newHtml', newHtml);
console.log("translatedHtml", translatedHtml);
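As far as I can tell, the snippet above falls short because the pattern matches between every pair of adjacent tags, so text and its inline `<strong>` end up in separate entries. A small check with the same pattern shows the split (`sample` and `pieces` are just illustrative names):

```javascript
const sample = '<a class="test">hello world<strong> this is gonna be hard</strong></a>';

// Same pattern as in the attempt, collected with matchAll instead of replace.
const pieces = [...sample.matchAll(/>(.*?)</g)].map((m) => m[1]);

console.log(JSON.stringify(pieces));
// ["hello world"," this is gonna be hard",""]
```

What I want instead is a single entry, `hello world<strong> this is gonna be hard</strong>`, and no empty entry for the gap between `</strong>` and `</a>`.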
<p>Click <a>here</a></p>
should be treated as a single text, "Click <a>here</a>".
I hope this clears up any doubts.
Thanks in advance!
It can be <strong> complicated.</strong>
kept together. - dk111989