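I have a Java program that reads a file line by line and tries to match each line against one of four regular expressions. Depending on which expression matched, the program performs a different action. Here is my code: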
private void processFile(ArrayList<String> lines) {
    ArrayList<Component> components = new ArrayList<>();
    // One combined pattern; each alternative has its own capture group (1-4)
    Pattern pattern = Pattern.compile(
            "Object name\\.{7}: (.++)|"
            + "\\{CAT=([^\\}]++)\\}|"
            + "\\{CODE=([^\\}]++)\\}|"
            + "\\{DESC=([^\\}]++)\\}");
    Matcher matcher;
    // Go through each line and see if it matches any of the regexes
    // defined above
    Component currentComponent = null;
    for (String line : lines) {
        matcher = pattern.matcher(line);
        if (matcher.find()) {
            // We found a tag. Find out which one
            String match = matcher.group();
            if (match.startsWith("Obj")) {
                // We've got the object name; store the previous component first
                if (currentComponent != null) {
                    components.add(currentComponent);
                }
                currentComponent = new Component();
                currentComponent.setName(matcher.group(1));
            } else if (currentComponent != null) {
                if (match.startsWith("{CAT")) {
                    currentComponent.setCategory(matcher.group(2));
                } else if (match.startsWith("{CODE")) {
                    currentComponent.setOrderCode(matcher.group(3));
                } else if (match.startsWith("{DESC")) {
                    currentComponent.setDescription(matcher.group(4));
                }
            }
        }
    }
    // Store the last component, which has no following "Object name" line
    if (currentComponent != null) {
        components.add(currentComponent);
    }
}
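As you can see, I have combined the four regular expressions into one and apply the combined expression to each line. If a match is found, I check the start of the matched string to determine which expression matched, and then extract the data from the corresponding group. In case anyone wants to run the code, here is some sample data: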
Object name.......: PMF3800SN
Last modified.....: Wednesday 9 November 2011 11:55:04 AM
File offset (hex).: 00140598 (Hex).
Checksum (hex)....: C1C0 (Hex).
Size (bytes)......: 1,736
Properties........: {*DEVICE}
{PREFIX=Q}
{*PROPDEFS}
{PACKAGE="PCB Package",PACKAGE,1,SOT-323 MOSFET}
{*INDEX}
{CAT=Transistors}
{SUBCAT=MOSFET}
{MFR=NXP}
{DESC=N-channel TrenchMOS standard level FET with ESD protection}
{CODE=1894711}
{*COMPONENT}
{PACKAGE=SOT-323 MOSFET}
*PINOUT SOT-323 MOSFET
{ELEMENTS=1}
{PIN "D" = D}
{PIN "G" = G}
{PIN "S" = S}
Although my code works, I don't like that parts of the strings are repeated in the calls to startsWith.
I'm curious how others would write this.
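One possible way to avoid the repeated prefixes is to check which capture group took part in the match: in java.util.regex, Matcher.group(n) returns null when group n did not participate. The following is only a minimal sketch of that idea, not a definitive rewrite. It assumes a hypothetical TagParser wrapper class and a simplified stand-in Component with plain fields (the real Component uses setters), and it returns the list so the result can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, introduced only for this sketch.
class TagParser {

    // Simplified stand-in for the real Component class (assumption: plain fields).
    static class Component {
        String name;
        String category;
        String orderCode;
        String description;
    }

    // Same four alternatives as in the original pattern, one capture group each.
    private static final Pattern TAG = Pattern.compile(
            "Object name\\.{7}: (.++)|"
            + "\\{CAT=([^\\}]++)\\}|"
            + "\\{CODE=([^\\}]++)\\}|"
            + "\\{DESC=([^\\}]++)\\}");

    static List<Component> parse(List<String> lines) {
        List<Component> components = new ArrayList<>();
        Component current = null;
        for (String line : lines) {
            Matcher m = TAG.matcher(line);
            if (!m.find()) {
                continue; // not one of the four tags
            }
            // Only the group of the alternative that matched is non-null.
            if (m.group(1) != null) {
                // New object: add it right away, later tags fill in its fields.
                current = new Component();
                current.name = m.group(1);
                components.add(current);
            } else if (current != null) {
                if (m.group(2) != null) {
                    current.category = m.group(2);
                } else if (m.group(3) != null) {
                    current.orderCode = m.group(3);
                } else if (m.group(4) != null) {
                    current.description = m.group(4);
                }
            }
        }
        return components;
    }
}
```

Adding each component to the list as soon as it is created also removes the need for the final "add the last component" step after the loop. Named groups such as (?<cat>...) would be another option if the numeric indices feel fragile.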
Amr