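I have two files:
- The first contains the data (no column headers).
- The second contains the column headers.
I want to merge these two files into a single file. My approach is to load each file into a DataFrame and use concat on them to produce the merged result.
My code so far is as follows: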
import pandas as pd
from xlrd import open_workbook  # note: xlrd 2.0+ reads only .xls, not .xlsx

# The mapping workbook has a header row, then one row per file pair:
# DataFileName  FolderLocation  ColumnFileName
# Data1         F:\Desktop      ColFile1
# Data2         F:\Desktop      ColFile2
filelocation = r'F:\Desktop\Mapping.xlsx'  # raw string keeps the backslashes literal
wb = open_workbook(filelocation)
Separator = ','
FileExtension = '.csv'  # assumption: not defined in the original snippet

for sheet in wb.sheets():
    number_of_rows = sheet.nrows
    # Start at row 1 to skip the header row of the mapping sheet
    for row in range(1, number_of_rows):
        # Column order follows the mapping layout described above
        DataFileName = sheet.cell(row, 0).value
        Path = sheet.cell(row, 1).value
        ColumnFileName = sheet.cell(row, 2).value
        DataFileCompName = Path + "\\" + DataFileName + FileExtension
        ColumnFileCompName = Path + "\\" + ColumnFileName + FileExtension
        HeaderDataFrame = pd.read_csv(ColumnFileCompName, sep=Separator)  # ,index_col=0)  # ,header=0)
        DataDataFrame = pd.read_csv(DataFileCompName, sep=Separator)  # ,header=None)
        CompleteDataFrame = pd.concat([HeaderDataFrame, DataDataFrame], ignore_index=True, axis=1)
Using concat, I expected the following result, with the header stacked on top of the data:
HeaderDataFrame
DataDataFrame
What I actually get is the two placed side by side:
HeaderDataFrame|DataDataFrame
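
The difference between the two layouts can be reproduced with a minimal sketch. It assumes the header file holds a single comma-separated row of column names and the data file has no header row. The sample contents and the names header_text, data_text, stacked and labelled are hypothetical, not taken from the files above. In pandas, concat with axis=1 joins frames column-wise (side by side), while axis=0 stacks them row-wise. Passing the header row as names= to read_csv is one possible alternative that avoids concat altogether.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the header file and the data file.
header_text = "id,name,score\n"
data_text = "1,alice,90\n2,bob,85\n"

# header=None stops pandas from consuming the first row as column labels.
header_df = pd.read_csv(io.StringIO(header_text), sep=",", header=None)
data_df = pd.read_csv(io.StringIO(data_text), sep=",", header=None)

# axis=1 puts the frames side by side: HeaderDataFrame|DataDataFrame.
side_by_side = pd.concat([header_df, data_df], axis=1, ignore_index=True)

# axis=0 stacks them, so the header row sits above the data rows.
stacked = pd.concat([header_df, data_df], axis=0, ignore_index=True)

# Alternative: use the header row as the real column names of the data.
column_names = header_df.iloc[0].tolist()
labelled = pd.read_csv(io.StringIO(data_text), sep=",", header=None,
                       names=column_names)

# to_csv without a path returns the CSV text; pass a file name to write it.
merged_text = labelled.to_csv(index=False)

print(side_by_side.shape)  # columns of both frames, next to each other
print(stacked.shape)       # rows of both frames, one above the other
print(merged_text)
```

With the stacked variant the header is just another row of values, so every column ends up with mixed types. The names= variant keeps the numeric columns numeric, which may be preferable if the merged file is processed further.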