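I am reading a 17-column CSV file into a database.
Occasionally the file contains a row with fewer than 17 columns.
I tried to ignore such rows, but even with every column set to ignore, the row is not skipped and the program fails.
How can I ignore these rows?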
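You can do this by adding a Flat File Connection Manager with only one column, of data type DT_WSTR and length 4000 (assume it is named Column0), so that each whole line is read as one large column.

- In the Dataflow task, add a Script Component after the Flat File Source.
- Mark Column0 as an input column and add 17 output columns.
- In the Input0_ProcessInputRow method, split Column0 by the delimiter, then check whether the length of the array equals 17. If it does, assign the values to the output columns; otherwise ignore the row.

In more detail:

- In the Flat File Connection Manager, set the data type of the single column to DT_WSTR with a length of 4000.
- In the Script Component, select Column0 as the input column.
- Change the SynchronousInput property of the OutputBuffer to None, so that the script decides which rows are added to the output.
- Set the script language to Visual Basic.
- In the Script Editor, write the following script: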
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        ' Skip null or blank lines
        If Not Row.Column0_IsNull AndAlso
                Not String.IsNullOrEmpty(Row.Column0.Trim) Then

            ' Split the raw line on the delimiter (";" here)
            Dim strColumns As String() = Row.Column0.Split(CChar(";"))

            ' Ignore any row that does not have exactly 17 columns
            If strColumns.Length <> 17 Then Exit Sub

            ' Valid row: add it to the output and fill the 17 output columns
            Output0Buffer.AddRow()
            Output0Buffer.Column = strColumns(0)
            Output0Buffer.Column1 = strColumns(1)
            Output0Buffer.Column2 = strColumns(2)
            Output0Buffer.Column3 = strColumns(3)
            Output0Buffer.Column4 = strColumns(4)
            Output0Buffer.Column5 = strColumns(5)
            Output0Buffer.Column6 = strColumns(6)
            Output0Buffer.Column7 = strColumns(7)
            Output0Buffer.Column8 = strColumns(8)
            Output0Buffer.Column9 = strColumns(9)
            Output0Buffer.Column10 = strColumns(10)
            Output0Buffer.Column11 = strColumns(11)
            Output0Buffer.Column12 = strColumns(12)
            Output0Buffer.Column13 = strColumns(13)
            Output0Buffer.Column14 = strColumns(14)
            Output0Buffer.Column15 = strColumns(15)
            Output0Buffer.Column16 = strColumns(16)

        End If

    End Sub
Finally, map the output columns to the destination columns.
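The split-and-count check at the heart of the script above can also be tried outside SSIS. The following is a minimal C# sketch, not part of the original answer: `RowSplitter` and `TrySplitRow` are hypothetical names, and the sketch assumes that fields never contain the delimiter inside quotes, since a plain `Split` does not understand quoted fields.

```csharp
using System;

public static class RowSplitter
{
    // Hypothetical helper (not an SSIS API): returns true and the split
    // fields only when the line has exactly expectedColumns parts.
    public static bool TrySplitRow(string line, char delimiter,
                                   int expectedColumns, out string[] fields)
    {
        fields = new string[0];

        // Treat null or blank lines as invalid rows
        if (string.IsNullOrWhiteSpace(line))
        {
            return false;
        }

        // Assumption: no quoted fields containing the delimiter
        string[] parts = line.Split(delimiter);
        if (parts.Length != expectedColumns)
        {
            return false;
        }

        fields = parts;
        return true;
    }
}
```

Inside a Script Component, a call such as `RowSplitter.TrySplitRow(Row.Column0, ';', 17, out cols)` could gate the call to `Output0Buffer.AddRow()`, so that short rows are skipped without raising an error.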
... DT_STR. - Hadi

    string fName = @"C:\test.csv"; // Full file path; in practice this should come from a variable
    string[] lines = System.IO.File.ReadAllLines(fName);

    // Add a line counter
    int ctr = 1;

    foreach (string line in lines)
    {
        string[] cols = line.Split(',');

        if (ctr != 1) // Assumes a header row; remove this check if the first row holds data
        {
            if (cols.Length == 17)
            {
                // Write the row to the output
                Output0Buffer.AddRow();
                Output0Buffer.Col1 = cols[0].ToString();       // You need to cast to the column's data type
                Output0Buffer.Col2 = int.Parse(cols[1]);       // Example of a cast to int
                Output0Buffer.Col3 = DateTime.Parse(cols[2]);  // Example of a cast to datetime
                // ... rest of the columns
            }
            // Optional else to handle skipped lines
            // else
            //     write the line out somewhere
        }

        ctr++; // Increment the counter
    }
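Note that `int.Parse` and `DateTime.Parse` throw on malformed values, so a row that does have 17 columns can still stop the load. A hedged sketch of a non-throwing variant follows. It assumes, as in the example above, that the first three columns are a string, an int and a date, and it additionally assumes invariant-culture formatting. `RowParser` and `TryParseRow` are hypothetical names, not part of SSIS.

```csharp
using System;
using System.Globalization;

public static class RowParser
{
    // Hypothetical helper: converts the first three fields of a 17-column
    // row and returns false instead of throwing when anything is invalid.
    public static bool TryParseRow(string[] cols, out string col1,
                                   out int col2, out DateTime col3)
    {
        col1 = string.Empty;
        col2 = 0;
        col3 = default(DateTime);

        // Reject rows that do not have exactly 17 columns
        if (cols == null || cols.Length != 17)
        {
            return false;
        }

        col1 = cols[0];

        // Assumption: values are written in invariant-culture format
        return int.TryParse(cols[1], NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out col2)
            && DateTime.TryParse(cols[2], CultureInfo.InvariantCulture,
                                 DateTimeStyles.None, out col3);
    }
}
```

In the loop above, `Output0Buffer.AddRow()` would then be called only when `TryParseRow` returns true, and the optional else branch could log the rejected line.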
This is in response to the comment from @SidC on my other answer.
It lets you process multiple files:
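    // Set up variables
    string line;
    int ctr = 0; // Line counter, available for header handling as in the other answer
    string[] files = System.IO.Directory.GetFiles(@"c:/path", "filenames*.txt");

    foreach (string file in files)
    {
        // The using block closes the reader even if processing fails
        using (var str = new System.IO.StreamReader(file))
        {
            while ((line = str.ReadLine()) != null)
            {
                // Work with line here, similar to the other answer
            }
        }
    }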
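When several files are read this way, header handling has to restart with each file. The sketch below is one possible way to combine the multi-file loop with the 17-column check. It assumes that every file starts with a header row, and `MultiFileReader` and `ReadValidRows` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MultiFileReader
{
    // Hypothetical helper: lazily yields the split rows that have exactly
    // expectedColumns fields, from every file matching the pattern.
    public static IEnumerable<string[]> ReadValidRows(string folder, string pattern,
                                                      char delimiter, int expectedColumns)
    {
        foreach (string file in Directory.EnumerateFiles(folder, pattern))
        {
            // Assumption: each file begins with a header row, so the
            // flag is reset for every file.
            bool isHeader = true;

            // File.ReadLines streams the file line by line
            foreach (string line in File.ReadLines(file))
            {
                if (isHeader)
                {
                    isHeader = false;
                    continue;
                }

                string[] cols = line.Split(delimiter);
                if (cols.Length == expectedColumns)
                {
                    yield return cols;
                }
            }
        }
    }
}
```

A call such as `MultiFileReader.ReadValidRows(@"c:/path", "filenames*.txt", ',', 17)` would return only the well-formed rows; each one could then be written to `Output0Buffer` as in the other answer.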