T = 'men shirt team brienne funny sarcasm shirt features graphic tees mugs babywear much real passion brilliant design detailed illustration strong appreciation things creative br shop thousands designs found across different shirt babywear mugs funny pop culture abstract witty many designs brighten day well day almost anyone else meet ul li quality short sleeve crew neck shirts 100 cotton soft durable comfortable feel fit standard size doubt l xl available li li sustainability label company conceived belief textiles industry start acting lot responsibly made cotton li li clothing printed using state art direct garment equipment crack peel washed li li graphic tee designs professionally printed unique design look great make someone smile funny cute vintage expressive artwork li ul'
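I have highlighted part of the string above, because it is a preprocessed version of the original text and may therefore be hard to read.
I get the following values:
fuzz.partial_ratio('short sleeve', T) gives 50
fuzz.partial_ratio('long sleeve', T) gives 73
fuzz.partial_ratio('dsfsdf sleeve', T) gives 62
fuzz.partial_ratio('sleeve', T) gives 50
I am very confused by this. Shouldn't the first and fourth values be 100, since both 'short sleeve' and 'sleeve' occur verbatim in T? I must be missing something, but I can't figure out what.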
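To make the expectation concrete, here is a minimal sketch of what I assumed a partial ratio would compute: the best similarity between the short string and any same-length window of the long one. It uses only the standard library's `difflib`. The function name `brute_force_partial_ratio` is my own, and this is not how fuzzywuzzy implements `partial_ratio`; it is only a reference to compare the numbers above against.

```python
from difflib import SequenceMatcher


def brute_force_partial_ratio(needle: str, haystack: str) -> int:
    """Best 0-100 similarity of needle against every same-length window of haystack.

    Hypothetical helper for illustration; not fuzzywuzzy's algorithm.
    """
    # Always slide the shorter string over the longer one.
    if len(needle) > len(haystack):
        needle, haystack = haystack, needle
    if not needle:
        return 0
    # An exact substring is a perfect partial match by definition.
    if needle in haystack:
        return 100
    n = len(needle)
    best = 0.0
    for start in range(len(haystack) - n + 1):
        window = haystack[start:start + n]
        # autojunk=False turns off difflib's "popular element" heuristic.
        score = SequenceMatcher(None, needle, window, autojunk=False).ratio()
        best = max(best, score)
    return round(100 * best)


# A short excerpt of T, enough to show the behaviour I expected.
excerpt = 'ul li quality short sleeve crew neck shirts 100 cotton soft durable'
for query in ('short sleeve', 'long sleeve', 'dsfsdf sleeve', 'sleeve'):
    print(query, brute_force_partial_ratio(query, excerpt))
```

Under this definition, any query that appears verbatim in the text scores 100, and the others score below 100. That is the behaviour I expected from `fuzz.partial_ratio`.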
Edit: here is another example, run after uninstalling the python-Levenshtein library:
'first succeed way wife told v 2 long sleeve shirt id 1084 first succeed way wife told v 2 long sleeve shirt design printed quality 100 long sleeve cotton shirt sports gray 90 cotton 10 polyester standard long sleeve shirts fashion fit tight fitting style please check size chart listed additional image feel free contact us first sizing questions satisfaction 100 guaranteed shirts usually ship business day ordered noon est next business day ordered noon est long sleeve shirts 100 cotton standard shirt fashion fit combined shipping multiple items'
fuzz.partial_ratio('long sleeve', T) gives 27
fuzz.partial_ratio('short sleeve', T) gives 33
fuzz.partial_ratio('sleeveless', T) gives 40
fuzz.partial_ratio('dsfasd sleeve', T) gives 23
Unfortunately, the problem does not seem to be limited to the python-Levenshtein library.
... compare any strings in fuzzy; this is not a search operation, but a measure of "how close" the strings are to each other. - Iluvatar