User agent parsing in JavaScript

7

I need to extract the operating system name and the browser name from a user agent string.

Sample user agent string:

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.9) Gecko/20100825 Ubuntu/9.10 (karmic) Firefox/3.6.9

How can I get just the operating system and the browser (for example, "Linux i686" and "Firefox 3.6.9")?

Here is my code from the fiddle link:

function getBrowserAndOS(userAgent, elements) {
  var browserList = {
      'Chrome': [/Chrome\/(\S+)/],
      'Firefox': [/Firefox\/(\S+)/],
      'MSIE': [/MSIE (\S+);/],
      'Opera': [
        /Opera\/.*?Version\/(\S+)/,
        /Opera\/(\S+)/
      ],
      'Safari': [/Version\/(\S+).*?Safari\//]
    },
    re, m, browser, version;


  var osList = {
      'Windows': [/Windows\/(\S+)/],
      'Linux': [/Linux\/(\S+)/]
    },
    re2, m2, os;

  if (userAgent === undefined)
    userAgent = navigator.userAgent;

  if (elements === undefined)
    elements = 2;
  else if (elements === 0)
    elements = 1337;

  for (browser in browserList) {
    while (re = browserList[browser].shift()) {
      if (m = userAgent.match(re)) {
        version = (m[1].match(new RegExp('[^.]+(?:\.[^.]+){0,' + --elements + '}')))[0];
        //version = (m[1].match(new RegExp('[^.]+(?:\.[^.]+){0,}')))[0];
        //return browser + ' ' + version;
        console.log(browser + ' ' + version);
      }
    }
  }


  for (os in osList) {
    while (re2 = osList[os].shift()) {
      if (m2 = userAgent.match(re2)) {
        //version = (m[1].match(new RegExp('[^.]+(?:\.[^.]+){0,' + --elements + '}')))[0];
        //version = (m[1].match(new RegExp('[^.]+(?:\.[^.]+){0,}')))[0];
        //return browser + ' ' + version;
        console.log(os);
      }

    }
  }

  return null;
}

console.log(getBrowserAndOS(navigator.userAgent, 2));

I only need to extract the operating system name and the browser name with its version. How can I parse the string to get these values?


1
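What's the point of the single-element arrays? - Ryan
@RPM I guess the intent is that you can put several regular expressions in the array, and it will try all of them to determine whether you are running that OS. - Barmar
4 Answers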

6

I wouldn't recommend doing this yourself. I'd suggest using a parser such as Platform.js, which works like this:

<script src="platform.js"></script>
<script>
var os = platform.os;
var browser = platform.name + ' ' + platform.version;
</script>

1
Encouraging web programmers to write logic based on the user agent is a sin, and leads to horrible, buggy websites. - Sophit
1
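Pulling in an entire library just to get the OS name and the browser name doesn't sit well with me. I have already parsed the browser name, but not the OS name. Do you have any idea how to parse out the OS name? - jeewan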

3

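Here is a solution in vanilla JavaScript for identifying the operating system; however, the list has to be updated by hand whenever a new operating system appears: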

function getOs (userAgent) {

     //Converts the user-agent to a lower case string
     var ua = String(userAgent).toLowerCase();

     //Fallback in case the operating system can't be identified
     var os = "Unknown OS Platform";

     //Corresponding arrays of user-agent substrings and operating systems.
     //Order matters because the first hit wins: iPhone agents also contain "mac os x",
     //and Android and Ubuntu agents also contain "linux", so the specific entries come first.
     var match = ["windows nt 10","windows nt 6.3","windows nt 6.2","windows nt 6.1","windows nt 6.0","windows nt 5.2","windows nt 5.1","windows xp","windows nt 5.0","windows me","win98","win95","win16","ipod","iphone","ipad","macintosh","mac os x","mac_powerpc","android","ubuntu","linux","blackberry","webos"];
     var result = ["Windows 10","Windows 8.1","Windows 8","Windows 7","Windows Vista","Windows Server 2003/XP x64","Windows XP","Windows XP","Windows 2000","Windows ME","Windows 98","Windows 95","Windows 3.11","iPod","iPhone","iPad","Mac OS X","Mac OS X","Mac OS 9","Android","Ubuntu","Linux","BlackBerry","Mobile"];

     //For each item in match array
     for (var i = 0; i < match.length; i++) {

              //If the string is contained within the user-agent then set the os
              if (ua.indexOf(match[i]) !== -1) {
                   os = result[i];
                   break;
              }

     }

     //Return the determined os
     return os;
}
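
The two parallel arrays above have to stay index-aligned, which is easy to break when adding an entry. As a minimal sketch (not the answer's original code), the same lookup can be written with a single list of pairs. The names `OS_RULES` and `detectOs` are hypothetical, and the rule list is deliberately abbreviated:

```javascript
// Hypothetical, abbreviated rule list: each entry pairs a lowercase
// substring with the label to report. Order matters: the first hit wins,
// so more specific needles must come before more general ones.
const OS_RULES = [
  ['windows nt 10', 'Windows 10'],
  ['windows nt 6.1', 'Windows 7'],
  ['iphone', 'iPhone'],
  ['ipad', 'iPad'],
  ['mac os x', 'Mac OS X'],
  ['android', 'Android'],
  ['ubuntu', 'Ubuntu'],
  ['linux', 'Linux'],
];

function detectOs(userAgent) {
  const ua = String(userAgent).toLowerCase();
  // Find the first rule whose substring occurs in the user agent.
  const rule = OS_RULES.find(([needle]) => ua.includes(needle));
  return rule ? rule[1] : 'Unknown OS Platform';
}

// The sample string from the question:
console.log(detectOs('Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.9) Gecko/20100825 Ubuntu/9.10 (karmic) Firefox/3.6.9'));
// prints "Ubuntu", because the "ubuntu" rule is listed before "linux"
```

Keeping each substring next to its label means a new operating system is added in one place, and the ordering constraint is visible in the list itself.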

3
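User agents are not a collection of metadata for asking qualitative questions such as "what are you?"; they are only really useful for narrow questions such as "are you Linux?" or "which version of Firefox are you running?".
Let me illustrate. Here is a script that converts a user agent into a lovely JSON-serializable object: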
const parseUA = (() => {
    //user-agent strings are just a set of phrases, each optionally followed by a set of properties enclosed in parentheses
    const part = /\s*([^\s/]+)(\/(\S+)|)(\s+\(([^)]+)\)|)/g;
    //these properties are delimited by semicolons
    const delim = /;\s*/;
    //the properties may be simple key-value pairs if;
    const single = [
        //it is a single comma separation,
        /^([^,]+),\s*([^,]+)$/,
        //it is a single space separation,
        /^(\S+)\s+(\S+)$/,
        //it is a single colon separation,
        /^([^:]+):([^:]+)$/,
        //it is a single slash separation
        /^([^/]+)\/([^/]+)$/,
        //or is a special string
        /^(\.NET CLR|Windows)\s+(.+)$/
    ];
    //otherwise it is unparsable because everyone does it differently, looking at you iPhone
    const many = / +/;
    //oh yeah, bots like to use links
    const link = /^\+(.+)$/;

    const inner = (properties, property) => {
        let tmp;

        if (tmp = property.match(link)) {
            properties.link = tmp[1];
        }
        else if (tmp = single.reduce((match, regex) => (match || property.match(regex)), null)) {
            properties[tmp[1]] = tmp[2];
        }
        else if (many.test(property)) {
            if (!properties.properties)
                properties.properties = [];
            properties.properties.push(property);
        }
        else {
            properties[property] = true;
        }

        return properties;
    };

    return (input) => {
        const output = {};
        for (let match; match = part.exec(input); '') {
            output[match[1]] = {
                ...(match[5] && match[5].split(delim).reduce(inner, {})),
                ...(match[3] && {version:match[3]})
            };
        }
        return output;
    };
})();
//parseUA('user agent string here');

Using this, we can convert the following user agents:

`Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)`

{
    "Mozilla": {
        "compatible": true,
        "MSIE": "7.0",
        "Windows": "NT 6.0",
        "WOW64": true,
        "Trident": "4.0",
        "SLCC1": true,
        ".NET CLR": "3.0.30729",
        ".NET4.0C": true,
        ".NET4.0E": true,
        "version": "4.0"
    }
}

`Mozilla/5.0 (SAMSUNG; SAMSUNG-GT-S8500-BOUYGUES/S8500AGJF1; U; Bada/1.0; fr-fr) AppleWebKit/533.1 (KHTML, like Gecko) Dolfin/2.0 Mobile WVGA SMM-MMS/1.2.0 NexPlayer/3.0 profile/MIDP-2.1 configuration/CLDC-1.1 OPN-B`

{
    "Mozilla": {
        "SAMSUNG": true,
        "SAMSUNG-GT-S8500-BOUYGUES": "S8500AGJF1",
        "U": true,
        "Bada": "1.0",
        "fr-fr": true,
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "533.1"
    },
    "Dolfin": {
        "version": "2.0"
    },
    "Mobile": {},
    "WVGA": {},
    "SMM-MMS": {
        "version": "1.2.0"
    },
    "NexPlayer": {
        "version": "3.0"
    },
    "profile": {
        "version": "MIDP-2.1"
    },
    "configuration": {
        "version": "CLDC-1.1"
    },
    "OPN-B": {}
}

`Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Comodo_Dragon/4.1.1.11 Chrome/4.1.249.1042 Safari/532.5`

{
    "Mozilla": {
        "Windows": "NT 5.1",
        "U": true,
        "en-US": true,
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "532.5"
    },
    "Comodo_Dragon": {
        "version": "4.1.1.11"
    },
    "Chrome": {
        "version": "4.1.249.1042"
    },
    "Safari": {
        "version": "532.5"
    }
}

`Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36`

{
    "Mozilla": {
        "X11": true,
        "Fedora": true,
        "Linux": "x86_64",
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "537.36"
    },
    "Chrome": {
        "version": "73.0.3683.86"
    },
    "Safari": {
        "version": "537.36"
    }
}

`Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0`

{
    "Mozilla": {
        "X11": true,
        "Fedora": true,
        "Linux": "x86_64",
        "rv": "66.0",
        "version": "5.0"
    },
    "Gecko": {
        "version": "20100101"
    },
    "Firefox": {
        "version": "66.0"
    }
}

`Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36`

{
    "Mozilla": {
        "X11": true,
        "Linux": "x86_64",
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "537.36"
    },
    "Chrome": {
        "version": "73.0.3683.103"
    },
    "Safari": {
        "version": "537.36"
    }
}

`Mozilla/5.0 (Linux; Android 6.0.1; SM-G920V Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36`

{
    "Mozilla": {
        "Linux": true,
        "Android": "6.0.1",
        "SM-G920V": "Build/MMB29K",
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "537.36"
    },
    "Chrome": {
        "version": "52.0.2743.98"
    },
    "Mobile": {},
    "Safari": {
        "version": "537.36"
    }
}

`Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)`

{
    "Mozilla": {
        "iPhone": true,
        "properties": [
            "CPU iPhone OS 9_1 like Mac OS X"
        ],
        "version": "5.0"
    },
    "AppleWebKit": {
        "KHTML": "like Gecko",
        "version": "601.1.46"
    },
    "Version": {
        "version": "9.0"
    },
    "Mobile": {
        "version": "13B143"
    },
    "Safari": {
        "compatible": true,
        "AdsBot-Google-Mobile": true,
        "link": "http://www.google.com/mobile/adsbot.html",
        "version": "601.1"
    }
}

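If you expand these, you can see that, as a human, you can easily read off the operating system versions: Mozilla.Windows = NT 6.0, Mozilla.Bada = 1.0, Mozilla.Fedora && Mozilla.Linux = x86_64.
But do you see the problem? None of them say OS = "Windows", OS = "Samsung Bada", and so on.
To ask the kind of question you want to ask, you either need to know all the possible values, as @Peter Wetherall attempted, or say "I only care about these few browsers/OSes", as you do in your question.
If that is acceptable, and you are not using this information to change how your code works (which, per @Sophit, you shouldn't), but only want to display something about the browser, then I'd suggest using parseUA() above combined with manual checks such as Mozilla.Windows || Mozilla.Linux || //et cetera. That will be less error-prone than running regular expressions against the raw user agent string, which leads to false positives: look at the Comodo_Dragon browser, whose string also says "Chrome".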
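
As a rough illustration of that manual check, the sketch below takes an object shaped like the parseUA() output shown above and derives a display label from it. The helper names `label` and `describeOs`, and the small set of platforms handled, are assumptions made for this example and are not part of the original script:

```javascript
// Hypothetical helper: a property is either `true` or a string detail.
const label = (name, value) =>
  typeof value === 'string' ? name + ' ' + value : name;

// Hypothetical helper: turns the object produced by parseUA() into a
// display string. Only a handful of platforms are handled on purpose.
function describeOs(parsed) {
  const m = (parsed && parsed.Mozilla) || {};
  // Check Android before Linux, because Android agents also carry "Linux".
  if (m.Android) return label('Android', m.Android);
  if (m.Windows) return label('Windows', m.Windows);
  if (m.Bada) return label('Bada', m.Bada);
  if (m.Linux) return label('Linux', m.Linux);
  return 'Unknown';
}

// Objects copied (and trimmed) from the parseUA() examples above.
const fedora = { Mozilla: { X11: true, Fedora: true, Linux: 'x86_64', version: '5.0' } };
const android = { Mozilla: { Linux: true, Android: '6.0.1', version: '5.0' } };
const comodo = { Mozilla: { Windows: 'NT 5.1', U: true, 'en-US': true, version: '5.0' } };

console.log(describeOs(fedora));  // Linux x86_64
console.log(describeOs(android)); // Android 6.0.1
console.log(describeOs(comodo));  // Windows NT 5.1
```

The list of checks is still something you maintain by hand; the parsed object only makes each check a property lookup instead of a regular expression over the raw string.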

Really nice. Do you have any updates to this script, or does it need updating? Or is it future-proof? - Ciprian
1
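It will keep working fine; its parsing grammar is very flexible. The "future-proofing" problem lies in how you decide to query the resulting object. parseUA(navigator.userAgent) - Hashbrown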

2

Are you planning to control your site's behaviour based on the browser you "sniff" from the user agent (UA) string?

Please don't; use feature detection instead.
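
As a minimal sketch of what feature detection looks like, the code below checks whether a capability actually works instead of guessing it from the browser name. The function `pickStorage` and its `env` parameter (standing in for `window`) are hypothetical names used only for this illustration:

```javascript
// Hypothetical example: decide where to keep data by probing for the
// feature itself. `env` stands in for `window`, so the function can be
// exercised outside a browser too.
function pickStorage(env) {
  try {
    const storage = env && env.localStorage;
    if (storage && typeof storage.setItem === 'function') {
      // Storage can be present and still throw on access or on write,
      // so do a real write and clean it up again.
      storage.setItem('__probe__', '1');
      storage.removeItem('__probe__');
      return 'localStorage';
    }
  } catch (err) {
    // Fall through to the in-memory fallback below.
  }
  return 'memory';
}

// Usage in a page would be: pickStorage(window)
const fakeWindow = {
  localStorage: { setItem() {}, removeItem() {} },
};
console.log(pickStorage(fakeWindow)); // localStorage
console.log(pickStorage({}));         // memory
```

Nothing here depends on which browser is running, so the check keeps working when a new browser or version appears.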

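Poorly implemented (non-future-proof) user agent sniffing has proven to be the biggest compatibility problem encountered each time a new version of Internet Explorer ships. As a result, the logic around the user agent string has grown increasingly complicated over the years; the introduction of Compatibility Modes means the browser now has more than one UA string, and the legacy extensibility of the string was deprecated after years of abuse.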

By default, Internet Explorer 11 on Windows 8.1 sends the following user agent string:

Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko

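This string is deliberately designed to make most UA-string sniffing logic interpret it as either Gecko or WebKit. The design choice was a careful one: the IE team tested many UA string variants to find out which would cause the majority of sites to "just work" for IE11 users.
Here are two links that may help you. You can also check the original source of my comment.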
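
To see the effect, here is a small sketch that runs a few naive sniffing checks against the IE11 string quoted above. The property names in `checks` are made up for this illustration:

```javascript
const ie11 = 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko';

// Naive checks of the kind commonly found in sniffing code.
const checks = {
  looksLikeOldIE: /MSIE \d/.test(ie11),   // false: there is no "MSIE" token
  looksLikeGecko: /Gecko/.test(ie11),     // true: matches "like Gecko"
  hasTrident: /Trident\/7\./.test(ie11),  // true: the engine token remains
};

console.log(checks);
```

Code that looks for "MSIE" no longer recognises the browser, while code that looks for "Gecko" does match, which is the behaviour described above.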

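Don't treat feature detection as a universal truth. Sometimes there is no choice but to sniff the browser. For example, I am currently dealing with a bug in the current version of Safari. A background colour should be transparent. Safari tells JavaScript that it is transparent, but it is clearly white. (Safari is not applying the CSS correctly.) This has been fixed in the Technology Preview, but I still need to work around the faulty behaviour in this particular browser, and sniffing is my best option. - Arlen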
