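I'm new to COBOL and have been trying to read record information from a text file that was output from a table.
I'm fine with most of the non-COMP data types, but I'm stuck on the "COMP" ones.
I've spent all day trying to figure this out, reading as much as I can.
The date fields below are the ones I can't convert to a date string: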
05 VALDATE PIC 9(6) COMP
05 PAYDATE PIC 9(6) COMP
05 SYSDATE PIC 9(6) COMP
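As I understand it, each of the fields above takes up 4 bytes in the file. They should represent
a date as YYMMDD, but the values don't seem to be small enough for that. I have tried looking at EBCDIC encoding, reversing the byte[]
data, using BitConverter.ToUInt32(),
and changing the encoding used to read the file, all without success.
I read that dates computed as integers are stored as the number of days since January 1, 1601, so the code below tries to add the value to the year 1601. (http://www.techtricky.com/cobol-date-functions-list-add-find-duration/)
My problem is that either the data in the text file is incorrect, or I'm missing a step needed to get a date in the YYMMDD
form.
The data for the 3 fields above is as follows: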
[ 32] [237] [ 44] [ 4] | 00100000 11101101 00101100 00000100
[ 33] [ 14] [ 32] [237] | 00100001 00001110 00100000 11101101
[131] [ 48] [ 48] [ 48] | 10000011 00110000 00110000 00110000
I tried opening the file with the encoding changed to ASCII, but that did not work either:
using (BinaryReader reader = new BinaryReader(File.Open(nFilePath, FileMode.Open), Encoding.Default))
The code used to try to read the COMP fields:
public class DateFromUIntExtractor : LineExtractor
{
    public DateFromUIntExtractor() : base(4)
    {
    }

    public override string ExtractText(BinaryReader nReader)
    {
        // e.g. 32,237,44,4; things I've tried are included but commented out
        byte[] data = nReader.ReadBytes(Length); // Length = 4
        //Array.Reverse(data); - Makes num = 552414212
        //data = ConvertAsciiToEbcdic(data);
        uint num = BitConverter.ToUInt32(data, 0); // uint, because ToUInt32 does not convert implicitly to int
        // in this example num = 70053152
        DateTime date = new DateTime(1601, 1, 1);
        date = date.AddDays(num); // Error : num is too big
        Extract = date.ToString("yyyyMMdd");
        return Extract;
    }
}
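Is the data wrong, or am I missing something?
Update
What I'm trying to do is replicate a COBOL program that converts data from one definition to another, but with CSV output, because that program outputs a .dat file.
Source
My naive interpretation of the source definition is that the data in the text file can be either PUA-ICGROUP or PUA-PUGROUP. Looking at the COBOL program, when PUA-HEADER > PUA-KEY > PUA-RTYPE = "03" it selects PUA-ICGROUP; otherwise it is PUA-PUGROUP.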
C-WRITE-START.
IF PUA-RTYPE = 3 THEN
PERFORM C-WRITE-A
ELSE
PERFORM C-WRITE-B
END-IF.
C-WRITE-EXIT.
EXIT.
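The same selection could be sketched in C# as follows. This is only an illustration under stated assumptions: the record is ASCII text at this point, and PUA-RTYPE starts at byte offset 9 because PUA-CDELIM (2 bytes) and PUA-SUPNO (7 bytes) precede it in PUA-KEY. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class RecordTypeSketch
{
    // Assumed offsets: PUA-CDELIM (2) + PUA-SUPNO (7) come before PUA-RTYPE (2).
    private const int RTypeOffset = 9;
    private const int RTypeLength = 2;

    // Returns true when the record should be read with the PUA-ICGROUP layout,
    // mirroring "IF PUA-RTYPE = 3" in the COBOL paragraph.
    public static bool IsIcGroup(byte[] record)
    {
        if (record == null || record.Length < RTypeOffset + RTypeLength)
            throw new ArgumentException("Record is too short to contain PUA-RTYPE.", nameof(record));

        // Assumption: the key fields are stored as ASCII digits.
        string text = Encoding.ASCII.GetString(record, RTypeOffset, RTypeLength);
        return int.TryParse(text, out int rtype) && rtype == 3;
    }
}
```

For a record beginning with 2200010010300005463400, the two characters at offset 9 are "03", so this would pick the PUA-ICGROUP layout.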
Definition
01 DLRPUARC.
03 PUA-HEADER.
05 PUA-KEY.
07 PUA-CDELIM PIC 99.
07 PUA-SUPNO PIC 9(7).
07 PUA-RTYPE PIC 99.
07 PUA-REF PIC 9(9).
07 PUA-SEQ PIC 999.
05 PUA-ALTKEY.
07 PUA-ACDELIM PIC 99.
07 PUA-ASUPNO PIC 9(7).
07 PUA-ATRNDATE PIC 9(6).
07 PUA-ARTYPE PIC 99.
07 PUA-AREF PIC 9(9).
07 PUA-ASEQ PIC 999.
05 FILLER PIC X(82).
03 PUA-ICGROUP REDEFINES PUA-HEADER.
05 FILLER PIC X(52).
05 PUA-ICEXTREF PIC X(10).
05 PUA-ICORDNO PIC 9(11).
05 PUA-ICVALDATE PIC 9(6) COMP.
05 PUA-ICPAYDATE PIC 9(6) COMP.
05 PUA-ICSYSDATE PIC 9(6) COMP.
05 PUA-ICTRNVAL PIC S9(9).
05 PUA-ICCLRREF PIC 9(6).
05 PUA-ICDELDATE PIC 9(6) COMP.
05 PUA-ICOTHQRY PIC X.
05 PUA-ICPRCQRY PIC X.
05 PUA-ICMRSQRY PIC X.
05 PUA-ICDSCTYPE PIC 9.
05 PUA-ICDSCVAL PIC S9(9) COMP.
05 PUA-ICVATCODE PIC 9.
05 PUA-ICVATAMT PIC S9(8) COMP.
05 PUA-ICTAXAMT PIC S9(8) COMP.
05 PUA-ICMRSREF PIC 9(6).
05 PUA-ICSUBDIV PIC 9.
05 PUA-ICCOSTCTR PIC X(5).
05 PUA-ICSEQIND PIC X.
05 FILLER PIC X(4).
03 PUA-PUGROUP REDEFINES PUA-HEADER.
05 FILLER PIC X(52).
05 PUA-PUEXTREF PIC X(10).
05 PUA-PUORDNO PIC 9(11).
05 PUA-PUVALDATE PIC 9(6) COMP.
05 FILLER PIC XXX.
05 PUA-PUSYSDATE PIC 9(6) COMP.
05 PUA-PUTRNVAL PIC S9(9).
05 PUA-PUCLRREF PIC 9(6).
05 PUA-PUDELDATE PIC 9(6) COMP.
05 PUA-PUOTHQRY PIC X.
05 PUA-PUSUBDIV PIC 9.
05 FILLER PIC X(32).
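One way to sanity-check how wide the PIC 9(6) COMP fields are is to add up the field widths. The sketch below assumes one byte per character for display fields, and treats the COMP width as a parameter, since it depends on the compiler (some pack a binary item into the fewest bytes that hold it, others only use 2, 4, or 8). PUA-HEADER adds up to 134 bytes, and PUA-PUGROUP reaches the same total only when each PIC 9(6) COMP field is taken as 3 bytes. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class LayoutLengthSketch
{
    // PUA-HEADER: PUA-KEY + PUA-ALTKEY + FILLER PIC X(82).
    public static int HeaderLength()
    {
        int key = 2 + 7 + 2 + 9 + 3;        // CDELIM, SUPNO, RTYPE, REF, SEQ
        int altKey = 2 + 7 + 6 + 2 + 9 + 3; // ACDELIM, ASUPNO, ATRNDATE, ARTYPE, AREF, ASEQ
        return key + altKey + 82;
    }

    // PUA-PUGROUP field widths in declaration order.
    // compDateBytes is the assumed width of one PIC 9(6) COMP field.
    public static int PuGroupLength(int compDateBytes)
    {
        int[] widths =
        {
            52,            // FILLER PIC X(52)
            10,            // PUA-PUEXTREF
            11,            // PUA-PUORDNO
            compDateBytes, // PUA-PUVALDATE
            3,             // FILLER PIC XXX
            compDateBytes, // PUA-PUSYSDATE
            9,             // PUA-PUTRNVAL
            6,             // PUA-PUCLRREF
            compDateBytes, // PUA-PUDELDATE
            1,             // PUA-PUOTHQRY
            1,             // PUA-PUSUBDIV
            32             // FILLER PIC X(32)
        };
        return widths.Sum();
    }
}
```

With these widths, HeaderLength() is 134, PuGroupLength(3) is 134, and PuGroupLength(4) is 137. The 3-byte FILLER PIC XXX in PUA-PUGROUP also sits where PUA-ICPAYDATE sits in PUA-ICGROUP, which is consistent with a 3-byte field.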
Output definition
01 OUT-A-REC.
03 OUT-A-PUA-CDELIM PIC 99.
03 OUT-A-PUA-SUPNO PIC 9(7).
03 OUT-A-PUA-RTYPE PIC 99.
03 OUT-A-PUA-REF PIC 9(9).
03 OUT-A-PUA-SEQ PIC 999.
03 OUT-A-PUA-ATRNDATE PIC 9(8).
03 OUT-A-PUA-ICEXTREF PIC X(10).
03 OUT-A-PUA-ICORDNO PIC 9(11).
03 OUT-A-PUA-ICVALDATE PIC 9(8).
03 OUT-A-PUA-ICPAYDATE PIC 9(8).
03 OUT-A-PUA-ICSYSDATE PIC 9(8).
03 OUT-A-PUA-ICTRNVAL PIC S9(9) SIGN LEADING SEPARATE.
03 OUT-A-PUA-ICCLRREF PIC 9(6).
03 OUT-A-PUA-ICDELDATE PIC 9(8).
03 OUT-A-PUA-ICOTHQRY PIC X.
03 OUT-A-PUA-ICPRCQRY PIC X.
03 OUT-A-PUA-ICMRSQRY PIC X.
03 OUT-A-PUA-ICDSCTYPE PIC 9.
03 OUT-A-PUA-ICDSCVAL PIC S9(9) SIGN LEADING SEPARATE.
03 OUT-A-PUA-ICVATCODE PIC 9.
03 OUT-A-PUA-ICVATAMT PIC S9(8) SIGN LEADING SEPARATE.
03 OUT-A-PUA-ICTAXAMT PIC S9(8) SIGN LEADING SEPARATE.
03 OUT-A-PUA-ICMRSREF PIC 9(6).
03 OUT-A-PUA-ICSUBDIV PIC 9.
03 OUT-A-PUA-ICCOSTCTR PIC X(5).
03 OUT-A-PUA-ICSEQIND PIC X.
03 OUT-A-CTRL-M PIC X.
03 OUT-A-NL PIC X.
FD F-OUTPUTB
LABEL RECORDS OMITTED.
01 OUT-B-REC.
03 OUT-B-PUA-CDELIM PIC 99.
03 OUT-B-PUA-SUPNO PIC 9(7).
03 OUT-B-PUA-RTYPE PIC 99.
03 OUT-B-PUA-REF PIC 9(9).
03 OUT-B-PUA-SEQ PIC 999.
03 OUT-B-PUA-ATRNDATE PIC 9(8).
03 OUT-B-PUA-PUEXTREF PIC X(10).
03 OUT-B-PUA-PUORDNO PIC 9(11).
03 OUT-B-PUA-PUVALDATE PIC 9(8).
03 OUT-B-PUA-PUSYSDATE PIC 9(8).
03 OUT-B-PUA-PUTRNVAL PIC S9(9) SIGN LEADING SEPARATE.
03 OUT-B-PUA-PUCLRREF PIC 9(6).
03 OUT-B-PUA-PUDELDATE PIC 9(8).
03 OUT-B-PUA-PUOTHQRY PIC X.
03 OUT-B-PUA-PUSUBDIV PIC 9.
03 OUT-B-CTRL-M PIC X.
03 OUT-B-NL PIC X.
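Program
Below is a sample of how the COBOL program handles dates, whether or not the source field is COMP.
(I did not write this code.) It looks like it is trying to work around the Y2K problem.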
IF PUA-ATRNDATE IS ZERO THEN
MOVE ZERO TO OUT-A-PUA-ATRNDATE
ELSE
MOVE PUA-ATRNDATE TO W-DATE-6DIGIT
MOVE W-DATE-SEG1 TO W-DATE-YY
MOVE W-DATE-SEG2 TO W-DATE-MM
MOVE W-DATE-SEG3 TO W-DATE-DD
IF W-DATE-YY > 50 THEN
MOVE "19" TO W-DATE-CC
ELSE
MOVE "20" TO W-DATE-CC
END-IF
MOVE W-DATE-CCYYMMDD TO OUT-A-PUA-ATRNDATE
END-IF.
MOVE PUA-ICEXTREF TO OUT-A-PUA-ICEXTREF.
MOVE PUA-ICORDNO TO OUT-A-PUA-ICORDNO.
IF PUA-ICVALDATE IS ZERO THEN
MOVE ZERO TO OUT-A-PUA-ICVALDATE
ELSE
MOVE PUA-ICVALDATE TO W-DATE-6DIGIT
MOVE W-DATE-SEG1 TO W-DATE-YY
MOVE W-DATE-SEG2 TO W-DATE-MM
MOVE W-DATE-SEG3 TO W-DATE-DD
IF W-DATE-YY > 50 THEN
MOVE "19" TO W-DATE-CC
ELSE
MOVE "20" TO W-DATE-CC
END-IF
MOVE W-DATE-CCYYMMDD TO OUT-A-PUA-ICVALDATE
END-IF.
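The century logic in the COBOL above can be sketched in C# like this. It is a minimal illustration, assuming the field has already been decoded to an integer in YYMMDD form; the class and method names are hypothetical.

```csharp
using System;

public static class CenturyWindowSketch
{
    // Expands a YYMMDD value to CCYYMMDD using the same rule as the COBOL:
    // zero stays zero, YY > 50 becomes 19YY, anything else becomes 20YY.
    public static int ToCcyymmdd(int yymmdd)
    {
        if (yymmdd < 0 || yymmdd > 999999)
            throw new ArgumentOutOfRangeException(nameof(yymmdd), "Expected at most 6 digits.");

        if (yymmdd == 0)
            return 0;

        int yy = yymmdd / 10000;           // W-DATE-SEG1
        int century = yy > 50 ? 19 : 20;   // W-DATE-CC
        return century * 1000000 + yymmdd; // W-DATE-CCYYMMDD
    }
}
```

For example, ToCcyymmdd(60620) gives 20060620 and ToCcyymmdd(970202) gives 19970202. Formatting the result with ToString("D8") would keep the 8-digit width of the PIC 9(8) output fields, including the all-zero case.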
Program working-storage section
01 W-DATE-6DIGIT PIC 9(6).
01 W-DATE-6DIGIT-REDEF REDEFINES W-DATE-6DIGIT.
03 W-DATE-SEG1 PIC 99.
03 W-DATE-SEG2 PIC 99.
03 W-DATE-SEG3 PIC 99.
01 W-DATE-CCYYMMDD PIC 9(8).
01 W-DATE-CCYYMMDD-REDEF REDEFINES W-DATE-CCYYMMDD.
03 W-DATE-CC PIC 99.
03 W-DATE-YY PIC 99.
03 W-DATE-MM PIC 99.
03 W-DATE-DD PIC 99.
Data
Copied from Notepad++. Each line starts with "220…", and the end column just before the next line is 135, so the length is 134(?)
2200010010300005463400022000100106062003000054634000062703 09720200000 í,! íƒ00056319D001144ÕšNNN0 1 G¨ 000000197202G
2200010010300005463500022000100106062903000054635000062858 09720200000 í, í" íƒ00082838{050906±RNNN0 1 áð 000000197202G
2200010010300005465500022000100106073003000054655000063378 09720200000 í í† í00179637A050906±RNNN0 1 000000197202G
I noticed that some symbols are missing from the lines above:
2200010010300005463400022000100106062003000054634000062703 09720200000 í,[EOT]![SO] íƒ00056319D001144[SOH]ÕšNNN0 1 [SOH]G¨ 000000197202G
2200010010300005463500022000100106062903000054635000062858 09720200000 í, í" íƒ00082838{050906[SOH]±RNNN0 1 [SOH]áð 000000197202G
2200010010300005465500022000100106073003000054655000063378 09720200000 í í† í00179637A050906[SOH]±RNNN0 1 [EOT][NAK][EM] 000000197202G
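Update 2
I have accepted the answer given by Rick Smith below because, when I use his data, I get the correct datetime values. So either my data is corrupted or something else is wrong, because my data either throws errors or puts the datetime values thousands of years in the future.
I have managed to get hold of a CSV output showing what these datetimes should be:
[Using: int n = ((b[0] << 16) + (b[1] << 8) + b[2]);]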
HEX: 0x20 0xED 0x2C
DEC: 32 237 44
INT: 2157868 (longer than 6 digits)
Actual DATE: 2006-07-16
HEX: 0x04 0x21 0x0e
DEC: 4 33 14
INT: 270606 (correct, but the segments are in reverse order)
Actual DATE: 2006-06-27
HEX: 0x20 0xED 0x83
DEC: 32 237 131
INT: 2157955 (longer than 6 digits)
Actual DATE: 2006-08-03
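The expression quoted above reads the field as a 3-byte, big-endian, unsigned integer. A small self-contained version might look like the sketch below. It assumes the field really is 3 bytes wide with the most significant byte first; the class and method names are hypothetical.

```csharp
using System;

public static class BinaryCompSketch
{
    // Reads 3 bytes starting at offset as an unsigned big-endian integer,
    // equivalent to (b[0] << 16) + (b[1] << 8) + b[2].
    public static int ReadBigEndian24(byte[] buffer, int offset)
    {
        if (buffer == null)
            throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || buffer.Length - offset < 3)
            throw new ArgumentOutOfRangeException(nameof(offset), "Need 3 bytes from offset.");

        return (buffer[offset] << 16) | (buffer[offset + 1] << 8) | buffer[offset + 2];
    }
}
```

Called on the bytes 0x04 0x21 0x0E this returns 270606, and on 0x20 0xED 0x2C it returns 2157868, matching the INT values listed above.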
Update 3
It turns out the data was wrong...
Where 48 appears, it is the ASCII code for 000. This means no conversion is needed. - Rick Smith