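As others have said, unless you can get the raw data with line breaks in the right places, the next best option is to obtain a list of the key names.
I assume the other 60K rows have the same key names as the sample row you provided? If so, and if nobody can give you that list, identifying the key names manually (rather than programmatically) seems to be the only way.
I tried it myself. It was not too hard (a few minutes at most), but someone who knows the data should probably still confirm that the key list is correct.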
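If nobody is available to confirm the list, a rough sanity check is to run it against every row and see which keys cannot be found. The sketch below is only an illustration: `KeyListCheck` and `FindMissingKeys` are hypothetical names, and it assumes each key is immediately followed by `=` and that keys appear in the same order in every row.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class KeyListCheck
{
    // Returns the keys whose "key=" marker is not found at or after the
    // position of the previously found key. An empty list means every key
    // was found in the expected order.
    public static List<string> FindMissingKeys(string rawData, IEnumerable<string> keys)
    {
        var missing = new List<string>();
        int position = 0;
        foreach (string key in keys)
        {
            int index = rawData.IndexOf(key + "=", position, StringComparison.Ordinal);
            if (index < 0)
            {
                missing.Add(key);
            }
            else
            {
                // Continue searching after this key's "=" sign.
                position = index + key.Length + 1;
            }
        }
        return missing;
    }
}
```

A row that reports missing keys is a hint that the list is incomplete or that the row has a different layout. It cannot prove the list is right, since a short key could still match the tail of a longer one.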
Once you have the list, you can split on the keys and then recombine them into a new list:
using System;
using System.Collections.Generic;
using System.Linq;

string rawData =
"mc_gross=22.99invoice=ff1ca57d9fa80cf93e6b300dd7f063e1protection_eligibility=Ineligibleaddress_status=confirmedpayer_id=SGA8X3TX9HCVYtax=0.00address_street=155 5th ave sepayment_date=16:08:28 Nov 15, 2010 PSTpayment_status=Completedcharset=windows-1252address_zip=98045first_name=jackobmc_fee=1.08address_country_code=USaddress_name=john martinnotify_version=3.0custom=ff1ca5asdf7d9fa80cf93e6b300dd7f063e1payer_status=unverifiedbusiness=gold-me@hotmail.comaddress_country=United Statesaddress_city=north bendquantity=1verify_sign=AZussRXZRkuk7frhfirfxxTkj0BDJGA2dJF3eF263eEsjLixS.xRxCzfaYLpayer_email=me@gmail.comtxn_id=4DU53818WJ271531Mpayment_type=instantlast_name=Martinaddress_state=WAreceiver_email=cravbill@hotmail.compayment_fee=1.08receiver_id=QG8JPB4RZJGG4txn_type=web_acceptitem_name=Some item of consequenceSpecifiemc_currency=USDitem_number=G10W151residence_country=UShandling_amount=0.00transaction_subject=ff1ca57d9fad80cf93e6b300dd7f063e1payment_gross=22.99shipping=0.00";
// Keys are listed in the order they appear in the data. When two keys match at
// the same position, Split uses the one listed first, so "address_country_code"
// is listed before its prefix "address_country".
string[] keys = {
    "mc_gross", "invoice", "protection_eligibility", "address_status", "payer_id", "tax",
    "address_street", "payment_date", "payment_status", "charset", "address_zip",
    "first_name", "mc_fee", "address_country_code", "address_name", "notify_version",
    "custom", "payer_status", "business", "address_country", "address_city", "quantity",
    "verify_sign", "payer_email", "txn_id", "payment_type", "last_name", "address_state",
    "receiver_email", "payment_fee", "receiver_id", "txn_type", "item_name",
    "mc_currency", "item_number", "residence_country", "handling_amount",
    "transaction_subject", "payment_gross", "shipping"
};

// Splitting on the key names leaves one "=value" piece per key.
string[] values = rawData.Split(keys, StringSplitOptions.RemoveEmptyEntries);

// Pair each key with its piece to rebuild "key=value".
IEnumerable<string> parsedList = keys.Zip(values, (key, value) => key + value);

foreach (string item in parsedList)
{
    Console.WriteLine(item);
}
This outputs the data in the following format:
mc_gross=22.99
invoice=ff1ca57d9fa80cf93e6b300dd7f063e1
protection_eligibility=Ineligible
address_status=confirmed
payer_id=SGA8X3TX9HCVY
tax=0.00
address_street=155 5th ave se
payment_date=16:08:28 Nov 15, 2010 PST
payment_status=Completed
charset=windows-1252
address_zip=98045
first_name=jackob
mc_fee=1.08
address_country_code=US
address_name=john martin
notify_version=3.0
custom=ff1ca5asdf7d9fa80cf93e6b300dd7f063e1
payer_status=unverified
business=gold-me@hotmail.com
address_country=United States
address_city=north bend
quantity=1
verify_sign=AZussRXZRkuk7frhfirfxxTkj0BDJGA2dJF3eF263eEsjLixS.xRxCzfaYL
payer_email=me@gmail.com
txn_id=4DU53818WJ271531M
payment_type=instant
last_name=Martin
address_state=WA
receiver_email=cravbill@hotmail.com
payment_fee=1.08
receiver_id=QG8JPB4RZJGG4
txn_type=web_accept
item_name=Some item of consequenceSpecifie
mc_currency=USD
item_number=G10W151
residence_country=US
handling_amount=0.00
transaction_subject=ff1ca57d9fad80cf93e6b300dd7f063e1
payment_gross=22.99
shipping=0.00
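One caveat with splitting on bare key names: if a key name also occurs inside a value, or if a row omits a key, `Split` returns a different number of pieces and `Zip` silently pairs keys with the wrong values. A rough alternative, assuming the keys always appear in the listed order and each is immediately followed by `=`, is to locate each `key=` marker in turn. `OrderedKeyParser` and `Parse` are hypothetical names used only for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class OrderedKeyParser
{
    // Returns "key=value" strings by locating each "key=" marker in order.
    // Keys that are not found are skipped.
    public static List<string> Parse(string rawData, IList<string> keys)
    {
        // First pass: record where each key's marker starts.
        var found = new List<KeyValuePair<string, int>>();
        int position = 0;
        foreach (string key in keys)
        {
            int index = rawData.IndexOf(key + "=", position, StringComparison.Ordinal);
            if (index < 0)
            {
                continue;
            }
            found.Add(new KeyValuePair<string, int>(key, index));
            position = index + key.Length + 1;
        }

        // Second pass: each entry runs from its marker to the start of the next one.
        var result = new List<string>();
        for (int i = 0; i < found.Count; i++)
        {
            int start = found[i].Value;
            int end = i + 1 < found.Count ? found[i + 1].Value : rawData.Length;
            result.Add(rawData.Substring(start, end - start));
        }
        return result;
    }
}
```

Because the search looks for the key together with its `=` sign and only moves forward, a value that happens to contain the text of a later key name does not shift the remaining pairs.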
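You can parse the list further by splitting each item on the equals sign ("="), or replace the original data string with a new one that contains the missing line breaks: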
string newData = parsedList.Aggregate((data, next) => data + Environment.NewLine + next);
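For the first option, splitting each item on the equals sign, a minimal sketch might look like the following. `PairParser` and `ToDictionary` are hypothetical names. The split is limited to two parts so that a value containing its own `=` stays intact.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
public static class PairParser
{
    // Turns "key=value" strings into a dictionary. Items without '=' are
    // ignored, and a repeated key keeps its last value.
    public static Dictionary<string, string> ToDictionary(IEnumerable<string> items)
    {
        var result = new Dictionary<string, string>();
        foreach (string item in items)
        {
            // Split on the first '=' only.
            string[] parts = item.Split(new[] { '=' }, 2);
            if (parts.Length == 2)
            {
                result[parts[0]] = parts[1];
            }
        }
        return result;
    }
}
```

Passing `parsedList` from above to `PairParser.ToDictionary` should let you look up fields by name, for example `payment_status`.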