Detecting support for individual Unicode characters with JavaScript

30

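Is it possible to detect whether the client supports a particular Unicode character, or whether it will be rendered as a missing-glyph box?

Important: support in as many browsers as possible.

Not important: efficiency, speed, or elegance.

The only method I can think of is using a canvas, so I figured I'd ask before going down that road.

This is not for use on a public website; I'm just trying to compile a list of the characters supported by each browser.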


2
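Why is this question community wiki? - Pekka
I didn't realize there was a downside to marking a question as community wiki. My mistake. - i-g
1
The set of characters a browser displays depends more on the fonts the user has installed than on the browser itself. Nearly all browsers support Unicode, and most characters don't need any special handling. - Brian Campbell
Related: "Unicode symbols and OS/browser font support" https://dev59.com/MlUK5IYBdhLWcg3w9zyJ?noredirect=1#comment89336916_51042771 - brillout
5 Answers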

9

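This is just a wild idea, not a real answer:

If you can find a character that you know will always render as a missing-glyph box, you could use the same technique as this JavaScript font detector: render the character and the missing-glyph box off-screen and compare their widths. If the widths differ, you know the character is not rendering as a missing-glyph box. Of course, this won't work at all with fixed-width fonts, and it could produce a lot of false negatives with other fonts in which many characters share the same width.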
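
The width comparison described above could be sketched roughly as follows. This is only an illustration: the function names `looksLikeMissingGlyph` and `measureWidthInDom` are made up for this example, and using U+FFFF as the always-missing reference character is an assumption (it is a Unicode noncharacter, but nothing guarantees how a given browser draws it). The comparison is kept separate from the DOM measurement so the logic can be checked without a browser.

```javascript
// Pure comparison step. `measureWidth` is any function that maps a string
// to its rendered width. The default reference U+FFFF is an assumption.
function looksLikeMissingGlyph(character, measureWidth, reference = '\uffff') {
  return measureWidth(character) === measureWidth(reference);
}

// Browser-only helper (hypothetical name): measures text in an off-screen span.
// A large font size makes width differences between glyphs easier to detect.
function measureWidthInDom(text, font = 'sans-serif', sizePx = 300) {
  const span = document.createElement('span');
  span.style.cssText = 'position:absolute;left:-9999px;top:-9999px;white-space:nowrap;';
  span.style.font = sizePx + 'px ' + font;
  span.textContent = text;
  document.body.appendChild(span);
  const width = span.getBoundingClientRect().width;
  document.body.removeChild(span);
  return width;
}

// In a browser:
// const missing = looksLikeMissingGlyph('\u2b61', (t) => measureWidthInDom(t, 'Arial'));
```

As the answer notes, equal widths do not prove the glyph is missing, so a result of `true` here should be read as "possibly unsupported".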


1
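Thanks! This is very helpful. Of course, it won't work for any character that has the same width and height as the missing-glyph box, but it's a step in the right direction. - i-g
3
This approach won't work for every character, but if you increase the font size you should get good results. I still like this answer... a bit weird, but it might work :-) - TheHippo
@Hippo - That's a good point: since the fonts are rendered off-screen, you can make them really, really big. - Annie
1
As for finding a character that you know will always render as a missing-glyph box: U+FFFE and U+FFFF do exactly that, because they are guaranteed not to be valid Unicode characters. - Donald Duck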

7
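You can use a canvas to check whether a character renders identically to a character that you know is unsupported. U+FFFF is a good choice for the comparison character, because it is guaranteed not to be a valid Unicode character.
So you create one canvas on which you render the U+FFFF character and another canvas on which you render the character you want to test. You then compare the two canvases by comparing their data URLs, obtained with the toDataURL method. If the two canvases are identical, the test character renders the same as the unsupported U+FFFF character, which means it is not supported; if the two canvases differ, the test character does not render like an unsupported character, so it is supported.
The following code implements this: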

//The first argument is the character you want to test, and the second argument is the font you want to test it in.
//If the second argument is left out, it defaults to the font of the <body> element.
//The third argument isn't used under normal circumstances, it's just used internally to avoid infinite recursion.
function characterIsSupported(character, font = getComputedStyle(document.body).fontFamily, recursion = false){
    //Create the canvases
    let testCanvas = document.createElement("canvas");
    let referenceCanvas = document.createElement("canvas");
    testCanvas.width = referenceCanvas.width = testCanvas.height = referenceCanvas.height = 150;

    //Render the characters
    let testContext = testCanvas.getContext("2d");
    let referenceContext = referenceCanvas.getContext("2d");
    testContext.font = referenceContext.font = "100px " + font;
    testContext.fillStyle = referenceContext.fillStyle = "black";
    testContext.fillText(character, 0, 100);
    referenceContext.fillText('\uffff', 0, 100);
    
    //Firefox renders unsupported characters by placing their character code inside the rectangle, so each unsupported character looks different.
    //As a workaround, in Firefox, we hide the inside of the character by drawing a black rectangle on top of it.
    //The rectangle is inset by 10px so that part of the character stays visible, reducing the risk of false positives.
    //We detect Firefox and browsers that behave similarly by checking whether U+FFFE appears supported, since U+FFFE is, just like U+FFFF, guaranteed not to be supported.
    if(!recursion && characterIsSupported('\ufffe', font, true)){
        testContext.fillStyle = referenceContext.fillStyle = "black";
        testContext.fillRect(10, 10, 80, 80);
        referenceContext.fillRect(10, 10, 80, 80);
    }

    //Check if the canvases are identical
    return testCanvas.toDataURL() != referenceCanvas.toDataURL();
}

//Examples
console.log("a is supported: " + characterIsSupported('a'));    //Returns true, 'a' should be supported in all browsers
console.log("\ufffe is supported: " + characterIsSupported('\ufffe'));    //Returns false, U+FFFE is guaranteed to be unsupported just like U+FFFF
console.log("\u2b61 is supported: " + characterIsSupported('\u2b61'));    //Results vary depending on the browser. At the time of writing this, this returns true in Chrome on Windows and false in Safari on iOS.
console.log("\uf8ff is supported: " + characterIsSupported('\uf8ff'));    //The unicode Apple logo is only supported on Apple devices, so this should return true on Apple devices and false on non-Apple devices.
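
Since the question is about compiling a list of supported characters, a check like the one above could be run over a range of code points. The following is a minimal sketch under that assumption; `listSupportedCharacters` is a hypothetical helper, and the predicate is passed in so that any detection function (such as `characterIsSupported`) can be plugged in. It will be slow for large ranges, which the question says is acceptable.

```javascript
// Collects every character in [start, end] (inclusive code points)
// for which the supplied predicate returns true.
function listSupportedCharacters(start, end, isSupported) {
  const supported = [];
  for (let codePoint = start; codePoint <= end; codePoint++) {
    // Skip surrogate code points; they are not characters on their own.
    if (codePoint >= 0xd800 && codePoint <= 0xdfff) continue;
    const character = String.fromCodePoint(codePoint);
    if (isSupported(character)) supported.push(character);
  }
  return supported;
}

// In a browser, using the function from this answer:
// const arrows = listSupportedCharacters(0x2b00, 0x2bff, (c) => characterIsSupported(c, 'sans-serif'));
```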


4

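I'm not sure how reliable this will be going forward (browsers may change what they display for unsupported characters), nor whether it is optimized (I don't have a good grasp of the ideal bounds to measure), but the following approach (drawing text into a canvas and inspecting the result as an image) may, if reviewed, provide a more reliable and accurate check than comparing widths. All the code at the beginning is browser detection, which we have to use because feature detection is not possible here.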

(function () {

// http://www.quirksmode.org/js/detect.html
var BrowserDetect = {
    init: function () {
        this.browser = this.searchString(this.dataBrowser) || "An unknown browser";
        this.version = this.searchVersion(navigator.userAgent)
            || this.searchVersion(navigator.appVersion)
            || "an unknown version";
        this.OS = this.searchString(this.dataOS) || "an unknown OS";
    },
    searchString: function (data) {
        for (var i=0;i<data.length;i++) {
            var dataString = data[i].string;
            var dataProp = data[i].prop;
            this.versionSearchString = data[i].versionSearch || data[i].identity;
            if (dataString) {
                if (dataString.indexOf(data[i].subString) != -1)
                    return data[i].identity;
            }
            else if (dataProp)
                return data[i].identity;
        }
    },
    searchVersion: function (dataString) {
        var index = dataString.indexOf(this.versionSearchString);
        if (index == -1) return;
        return parseFloat(dataString.substring(index+this.versionSearchString.length+1));
    },
    dataBrowser: [
        {
            string: navigator.userAgent,
            subString: "Chrome",
            identity: "Chrome"
        },
        {   string: navigator.userAgent,
            subString: "OmniWeb",
            versionSearch: "OmniWeb/",
            identity: "OmniWeb"
        },
        {
            string: navigator.vendor,
            subString: "Apple",
            identity: "Safari",
            versionSearch: "Version"
        },
        {
            prop: window.opera,
            identity: "Opera",
            versionSearch: "Version"
        },
        {
            string: navigator.vendor,
            subString: "iCab",
            identity: "iCab"
        },
        {
            string: navigator.vendor,
            subString: "KDE",
            identity: "Konqueror"
        },
        {
            string: navigator.userAgent,
            subString: "Firefox",
            identity: "Firefox"
        },
        {
            string: navigator.vendor,
            subString: "Camino",
            identity: "Camino"
        },
        {       // for newer Netscapes (6+)
            string: navigator.userAgent,
            subString: "Netscape",
            identity: "Netscape"
        },
        {
            string: navigator.userAgent,
            subString: "MSIE",
            identity: "Explorer",
            versionSearch: "MSIE"
        },
        {
            string: navigator.userAgent,
            subString: "Gecko",
            identity: "Mozilla",
            versionSearch: "rv"
        },
        {       // for older Netscapes (4-)
            string: navigator.userAgent,
            subString: "Mozilla",
            identity: "Netscape",
            versionSearch: "Mozilla"
        }
    ],
    dataOS : [
        {
            string: navigator.platform,
            subString: "Win",
            identity: "Windows"
        },
        {
            string: navigator.platform,
            subString: "Mac",
            identity: "Mac"
        },
        {
               string: navigator.userAgent,
               subString: "iPhone",
               identity: "iPhone/iPod"
        },
        {
            string: navigator.platform,
            subString: "Linux",
            identity: "Linux"
        }
    ]

};
BrowserDetect.init();


/**
* Checks whether a given character is supported in the specified font. If the
*   font argument is not provided, it will default to sans-serif, the default
*   of the canvas element
* @param {String} chr Character to check for support
* @param {String} [font] Font Defaults to sans-serif
* @returns {Boolean} Whether or not the character is visually distinct from characters that are not supported
*/
function characterInFont (chr, font) {
    var data,
        size = 10, // We use 10 to confine results (could do further?) and minimum required for 10px
        x = 0, 
        y = size,
        canvas = document.createElement('canvas'),
        ctx = canvas.getContext('2d');
    // Necessary?
    canvas.width = size;
    canvas.height = size;

    if (font) { // Default of canvas is 10px sans-serif
        font = size + 'px ' + font; // Fix size so we can test consistently
        ctx.font = font; // Apply the font; otherwise the canvas keeps its default (10px sans-serif)
        /**
        // Is there use to confining by this height?
        var d = document.createElement("span");
        d.font = font;
        d.textContent = chr;
        document.body.appendChild(d);
        var emHeight = d.offsetHeight;
        document.body.removeChild(d);
        alert(emHeight); // 19 after page load on Firefox and Chrome regardless of canvas height
        //*/
    }

    ctx.fillText(chr, x, y);
    data = ctx.getImageData(0, 0, ctx.measureText(chr).width, canvas.height).data; // canvas.width
    data = Array.prototype.slice.apply(data);

    function compareDataToBox (data, box, filter) {
        if (filter) { // We can stop making this conditional if we confirm the exact arrays will continue to work, or otherwise remove and rely on safer full arrays
            data = data.filter(function (item) {
                return item != 0;
            });
        }
        return data.toString() !== box;
    }

    var missingCharBox;
    switch (BrowserDetect.browser) {
        case 'Firefox': // Draws nothing
            missingCharBox = '';
            break;
        case 'Opera':
            //missingCharBox = '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,197,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,73,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,197,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,73,0,0,0,0';
            missingCharBox = '197,255,255,255,255,73,36,36,36,36,36,36,36,36,197,255,255,255,255,73';
            break;
        case 'Chrome':
            missingCharBox = '2,151,255,255,255,255,67,2,26,2,26,2,26,2,26,2,26,2,26,2,26,2,26,2,151,255,255,255,255,67';
            break;
        case 'Safari':
            missingCharBox = '17,23,23,23,23,5,52,21,21,21,21,41,39,39,39,39,39,39,39,39,63,40,40,40,40,43';
            break;
        default:
            throw 'characterInFont() not tested successfully for this browser';
    }
    return compareDataToBox(data, missingCharBox, true);
}

// EXPORTS
((typeof exports !== 'undefined') ? exports : this).characterInFont = characterInFont;

}());

var r1 = characterInFont('a', 'Arial'); // true
var r2 = characterInFont('\uFAAA', 'Arial'); // false
alert(r1);
alert(r2);
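
An alternative to the hard-coded per-browser pixel strings above would be to render a reference character at run time and compare the two pixel arrays directly. The sketch below assumes U+FFFF as the reference; `pixelsEqual` and `renderPixels` are hypothetical names, not part of the code above. It shares the limitation that a browser which draws the code point inside the missing-glyph box will make every unsupported character look different from the reference.

```javascript
// Returns true when two RGBA pixel arrays have the same length and contents.
function pixelsEqual(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Browser-only helper: draws text on a fresh canvas and returns its RGBA data.
function renderPixels(text, font = 'sans-serif', size = 50) {
  const canvas = document.createElement('canvas');
  canvas.width = canvas.height = size * 2;
  const ctx = canvas.getContext('2d');
  ctx.font = size + 'px ' + font;
  ctx.textBaseline = 'top';
  ctx.fillText(text, 0, 0);
  return ctx.getImageData(0, 0, canvas.width, canvas.height).data;
}

// In a browser:
// const supported = !pixelsEqual(renderPixels('\u2b61'), renderPixels('\uffff'));
```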

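Update 1

I tried to update this for modern Firefox (to try checking for the expected hex digits in the canvas), and to make sure that, unlike in the code above, the canvas (and the pattern matched against it) is just large enough to hold the widest character according to context.measureText() (U+0BCC in my testing, though this presumably depends on the font, "Arial Unicode MS" in my case). However, per https://bugzilla.mozilla.org/show_bug.cgi?id=442133#c9, measureText currently responds incorrectly to zoom for unknown characters. Now, if only there were a way to simulate zoom in the JavaScript canvas so as to affect those measurements (and only those measurements)...

For reference, the code is at https://gist.github.com/brettz9/1f061bb2ce06368db3e5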


Brett's solution no longer works in Firefox, because Firefox now displays the hexadecimal Unicode code point inside a box when it cannot find a character. - user3761609

-2

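You can evaluate each character with the charCodeAt() method. It returns the character's UTF-16 code unit value, which equals the Unicode code point for characters in the Basic Multilingual Plane. Depending on your needs, you can restrict the range of characters you accept as "valid"... If you copy a character that shows up as a "box", you can use a character converter on the web to look up the corresponding Unicode value.

Here is one I found: [link]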
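
As a rough illustration of looking up a character's value in code rather than with an online converter, the sketch below uses `codePointAt`, which also handles characters outside the Basic Multilingual Plane. The helper names `toCodePointLabel` and `isInRange` are invented for this example. Note that this only identifies a character; it does not tell you whether any installed font can render it.

```javascript
// Formats a character's code point in the usual U+XXXX notation.
function toCodePointLabel(character) {
  const codePoint = character.codePointAt(0);
  return 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0');
}

// Checks whether a character's code point lies in [start, end].
function isInRange(character, start, end) {
  const codePoint = character.codePointAt(0);
  return codePoint >= start && codePoint <= end;
}

console.log('a'.charCodeAt(0));          // 97
console.log('\u{1F600}'.charCodeAt(0));  // 55357 (0xD83D, the high surrogate only)
console.log('\u{1F600}'.codePointAt(0)); // 128512 (0x1F600, the full code point)
console.log(toCodePointLabel('\u2b61')); // U+2B61
```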


-4

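If you want to maximize browser support, you probably don't want to rely on JavaScript for anything. Many mobile browsers don't even support it.

If the browser doesn't support a character set, what is the fallback? Displaying the content in another language? Providing links on the site to switch languages on demand might be more robust.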


1
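I'm trying to compile a list of the characters supported by each browser, not to maximize support on a public page. - i-g
All major browsers, including all major mobile browsers, support JavaScript. I know this answer is somewhat old, but according to the page I linked, that was true even when this answer was written. - Donald Duck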
