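The collator for the "mk_MK" locale is based on the sun.text.resources.mk.CollationData_mk
resource (see the CollationData_mk.java source in the jdk8u repository at tag jdk8u92-b14).
The collation rules in CollationData_mk explicitly place 'ѓ' after 'г' and 'ќ' after 'к'.
Since a RuleBasedCollator can be created with custom rules, the simplest way to get the
desired sort order is to slightly modify the rules from CollationData_mk. In the version
below, 'ѓ' comes after 'д' and 'ќ' comes after 'т', each as a letter in its own right: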
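import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Builds a collator from the CollationData_mk rules, with U+0453 (gje) placed
// after U+0434 (de) and U+045C (kje) placed after U+0442 (te).
public static Collator createMacedonianCollator() throws ParseException {
// Left empty on purpose, so only the Cyrillic rules listed below apply.
String DEFAULTRULES = "";
return new RuleBasedCollator( DEFAULTRULES +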
"< \u0430 , \u0410" +
"< \u0431 , \u0411" +
"< \u0432 , \u0412" +
"< \u0433 , \u0413" +
"; \u0491 , \u0490" +
"; \u0495 , \u0494" +
"; \u0493 , \u0492" +
"< \u0434 , \u0414" +
"< \u0453 , \u0403" +
"< \u0452 , \u0402" +
"< \u0435 , \u0415" +
"; \u04bd , \u04bc" +
"; \u0451 , \u0401" +
"; \u04bf , \u04be" +
"< \u0454 , \u0404" +
"< \u0436 , \u0416" +
"; \u0497 , \u0496" +
"; \u04c2 , \u04c1" +
"< \u0437 , \u0417" +
"; \u0499 , \u0498" +
"< \u0455 , \u0405" +
"< \u0438 , \u0418" +
"< \u0456 , \u0406" +
"; \u04c0 " +
"< \u0457 , \u0407" +
"< \u0439 , \u0419" +
"< \u0458 , \u0408" +
"< \u043a , \u041a" +
"; \u049f , \u049e" +
"; \u04c4 , \u04c3" +
"; \u049d , \u049c" +
"; \u04a1 , \u04a0" +
"; \u049b , \u049a" +
"< \u043b , \u041b" +
"< \u0459 , \u0409" +
"< \u043c , \u041c" +
"< \u043d , \u041d" +
"; \u0463 " +
"; \u04a3 , \u04a2" +
"; \u04a5 , \u04a4" +
"; \u04bb , \u04ba" +
"; \u04c8 , \u04c7" +
"< \u045a , \u040a" +
"< \u043e , \u041e" +
"; \u04a9 , \u04a8" +
"< \u043f , \u041f" +
"; \u04a7 , \u04a6" +
"< \u0440 , \u0420" +
"< \u0441 , \u0421" +
"; \u04ab , \u04aa" +
"< \u0442 , \u0422" +
"; \u04ad , \u04ac" +
"< \u045b , \u040b" +
"< \u045c , \u040c" +
"< \u0443 , \u0423" +
"; \u04af , \u04ae" +
"< \u045e , \u040e" +
"< \u04b1 , \u04b0" +
"< \u0444 , \u0424" +
"< \u0445 , \u0425" +
"; \u04b3 , \u04b2" +
"< \u0446 , \u0426" +
"; \u04b5 , \u04b4" +
"< \u0447 , \u0427" +
"; \u04b7 ; \u04b6" +
"; \u04b9 , \u04b8" +
"; \u04cc , \u04cb" +
"< \u045f , \u040f" +
"< \u0448 , \u0428" +
"< \u0449 , \u0429" +
"< \u044a , \u042a" +
"< \u044b , \u042b" +
"< \u044c , \u042c" +
"< \u044d , \u042d" +
"< \u044e , \u042e" +
"< \u044f , \u042f" +
"< \u0461 , \u0460" +
"< \u0462 " +
"< \u0465 , \u0464" +
"< \u0467 , \u0466" +
"< \u0469 , \u0468" +
"< \u046b , \u046a" +
"< \u046d , \u046c" +
"< \u046f , \u046e" +
"< \u0471 , \u0470" +
"< \u0473 , \u0472" +
"< \u0475 , \u0474" +
"; \u0477 , \u0476" +
"< \u0479 , \u0478" +
"< \u047b , \u047a" +
"< \u047d , \u047c" +
"< \u047f , \u047e" +
"< \u0481 , \u0480"
);
}
The rules can be simplified further to include only the 31 basic letters, without the accented variants.
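As a rough sketch of that simplification, the rules can be generated from the 31 letters of the Macedonian alphabet instead of being written out by hand. The class name SimpleMacedonianCollator and its create() method are made up for this illustration, and the sketch assumes each uppercase letter should differ from its lowercase form only at the tertiary level (the "," relation), as in the rules above.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the JDK or of the original answer.
class SimpleMacedonianCollator {
    // The 31 lowercase letters of the Macedonian alphabet, in alphabetical order.
    private static final String ALPHABET =
        "\u0430\u0431\u0432\u0433\u0434\u0453\u0435\u0436\u0437\u0455" +
        "\u0438\u0458\u043a\u043b\u0459\u043c\u043d\u045a\u043e\u043f" +
        "\u0440\u0441\u0442\u045c\u0443\u0444\u0445\u0446\u0447\u045f" +
        "\u0448";

    static Collator create() {
        StringBuilder rules = new StringBuilder();
        for (char lower : ALPHABET.toCharArray()) {
            // "<" starts a new primary letter, "," adds the uppercase form
            // as a tertiary (case) variant of it.
            rules.append("< ").append(lower)
                 .append(" , ").append(Character.toUpperCase(lower))
                 .append(' ');
        }
        try {
            return new RuleBasedCollator(rules.toString());
        } catch (ParseException e) {
            // The rules are generated from a fixed string, so this is unexpected.
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        // Sample words starting with U+0435, U+0453, U+0434 and U+0433.
        List<String> words = new ArrayList<>(Arrays.asList(
            "\u0435\u0436",
            "\u0453\u0443\u0431\u0440\u0435",
            "\u0434\u043e\u043c",
            "\u0433\u0440\u0430\u0434"));
        words.sort(create());
        System.out.println(words);
    }
}
```

Characters that do not appear in the rules are left unmapped, so this variant only makes sense when the input is plain Macedonian text.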
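The rules returned by collator.getRules() place 'ѓ' before 'д', so the problem is in Java's Macedonian collator. The RuleBasedCollator
Javadoc explains how to create your own collator with your own rules, if you really need one. - Oleg Estekhin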
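To check this on a given runtime, the built-in rules can be read back as a string. The following is a minimal sketch: the class name InspectMacedonianRules is made up for illustration, and it assumes the runtime returns a RuleBasedCollator for this locale, which is typical for the JDK but not guaranteed by the Collator API.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper for inspecting the JDK's own Macedonian rules.
class InspectMacedonianRules {
    // Returns the rule string of the built-in mk-MK collator, or null if the
    // runtime does not hand back a RuleBasedCollator.
    static String builtInRules() {
        Collator collator = Collator.getInstance(Locale.forLanguageTag("mk-MK"));
        if (collator instanceof RuleBasedCollator) {
            return ((RuleBasedCollator) collator).getRules();
        }
        return null;
    }

    public static void main(String[] args) {
        String rules = builtInRules();
        if (rules == null) {
            System.out.println("The built-in collator is not rule-based");
            return;
        }
        // The full rule string is long, so only summarize it here.
        System.out.println("Rule string length: " + rules.length());
        System.out.println("Mentions U+0453: " + (rules.indexOf('\u0453') >= 0));
    }
}
```

The returned string may differ between JDK versions and locale data providers, so it is worth inspecting on the runtime that actually does the sorting.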