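Pandas currently has no direct way to read an Excel table, but the function below does it with the openpyxl library (which is also the library pandas uses to read current-format Excel files).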
Note that I learned this technique from a blog post I did not write (found here), although my code is slightly different.
import pandas as pd
import openpyxl

def read_table(file_name: str, table_name: str) -> pd.DataFrame:
    # openpyxl has no table info if read_only is True;
    # data_only means formula cells return their last saved value instead of the formula
    wb = openpyxl.load_workbook(file_name, read_only=False, data_only=True)
    for sheetname in wb.sheetnames:  # sheet names come back as strings
        sheet = wb[sheetname]  # get the sheet object instead of the string
        if table_name in sheet.tables:  # tables are stored within sheets, not within the workbook, although table names are unique in a workbook
            tbl = sheet.tables[table_name]  # get the table object instead of the string
            tbl_range = tbl.ref  # something like 'C4:F9'
            break  # we've got our table, bail from the for-loop
    else:
        # no sheet holds the table; without this check tbl_range would be undefined below
        wb.close()
        raise KeyError(f"table {table_name!r} not found in {file_name!r}")
    data = sheet[tbl_range]  # a tuple of rows, where each row is a tuple of cells
    content = [[cell.value for cell in row] for row in data]  # loop through those row/cell tuples
    header = content[0]  # first row is the column headers
    rest = content[1:]  # every row that isn't the first is data
    df = pd.DataFrame(rest, columns=header)
    wb.close()
    return df
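If you are not sure what the table is called or which sheet it lives on, it can help to list every table first. The following is a minimal sketch, not part of the original answer: `list_tables` and `make_demo_workbook` are hypothetical helper names, and the sheet title "Data", the table name "Sales", and the range C4:E6 are made-up demo values. It relies on the same `sheet.tables` mapping used in the function above.

```python
import os
import tempfile

import openpyxl
from openpyxl.worksheet.table import Table


def list_tables(file_name: str) -> dict:
    """Map each table name in the workbook to (sheet title, cell range)."""
    # read_only must be False, otherwise openpyxl exposes no table info
    wb = openpyxl.load_workbook(file_name, read_only=False)
    found = {}
    for sheet in wb.worksheets:
        for name in sheet.tables:  # keys are table names
            found[name] = (sheet.title, sheet.tables[name].ref)
    wb.close()
    return found


def make_demo_workbook(path: str) -> None:
    """Write a small workbook with one table (demo values only)."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws.title = "Data"
    rows = [("ColA", "ColB", "ColC"), (1, 2, 3), (4, 5, 6)]
    for r, row in enumerate(rows, start=4):  # table starts at row 4
        for c, value in enumerate(row, start=3):  # and at column C
            ws.cell(row=r, column=c, value=value)
    ws.add_table(Table(displayName="Sales", ref="C4:E6"))
    wb.save(path)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
    make_demo_workbook(demo_path)
    print(list_tables(demo_path))
```

The returned names can then be passed as `table_name` to a reader such as the one above.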
Versions:
In [50]: pd.__version__
Out[50]: '1.3.5'
In [51]: openpyxl.__version__
Out[51]: '3.0.9'
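As an alternative, once you know the sheet and the table's range (a string like 'C4:F9', as returned by `tbl.ref`), you can hand the actual parsing to `pd.read_excel`. This is only a sketch under that assumption: `read_range_as_table` is a hypothetical name, and it assumes the first row of the range is the header row. It converts the range into the `usecols`, `skiprows`, and `nrows` arguments of `read_excel`.

```python
import pandas as pd
from openpyxl.utils import get_column_letter, range_boundaries


def read_range_as_table(file_name: str, sheet_name: str, table_ref: str) -> pd.DataFrame:
    """Read a rectangular range such as 'C4:F9' with pandas, first row as header."""
    # range_boundaries returns 1-based (min_col, min_row, max_col, max_row)
    min_col, min_row, max_col, max_row = range_boundaries(table_ref)
    # Excel-style column span, e.g. "C:F"
    usecols = f"{get_column_letter(min_col)}:{get_column_letter(max_col)}"
    return pd.read_excel(
        file_name,
        sheet_name=sheet_name,
        usecols=usecols,
        skiprows=min_row - 1,      # skip everything above the header row
        nrows=max_row - min_row,   # data rows only, header excluded
        engine="openpyxl",
    )
```

Compared with building the DataFrame from cell values by hand, this lets pandas do its usual dtype inference, at the cost of opening the file a second time if you also used openpyxl to look up the range.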
ColA,ColB,ColC. Also, do you know how many rows the table has? - meW
0.22 and above, this issue has been fixed. Try upgrading your pandas version; it should be able to read the table easily. - Mayank Porwal