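The new version of Pandas uses the following interface to load Excel files:
read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
But what if I don't know which sheets are available?
For example, I am working with Excel files that contain the following sheets:
Data 1, Data 2 ..., Data N, foo, bar
but I don't know N in advance.
Is there any way to get the list of sheets from an Excel document in Pandas?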
import pandas as pd
path = "\\DB\\Expense\\reconcile\\"
file_name = "202209-v01.xlsx"
# sheet_name=None loads every sheet and returns a dict of {sheet name: DataFrame}
df = pd.read_excel(path + file_name, sheet_name=None)
print(df)
sheet_names = list(df.keys())
# print the last sheet name
print(sheet_names[-1])
# fetch the DataFrame for the last sheet
last_month = df[sheet_names[-1]]
print(last_month)
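Passing None as the sheet name parses every sheet into a DataFrame, which is more work than needed when only the names are wanted. A lighter option is probably pd.ExcelFile and its sheet_names attribute. The sketch below is a minimal illustration, not a definitive implementation: the helper names list_sheet_names and data_sheets are hypothetical, and the "Data N" pattern is an assumption based on the sheet names given in the question.

```python
import re

import pandas as pd


def list_sheet_names(path):
    """Return the sheet names of an Excel workbook without parsing any sheet."""
    # pd.ExcelFile opens the workbook; sheet_names lists its sheets in order.
    with pd.ExcelFile(path) as xls:
        return list(xls.sheet_names)


def data_sheets(names):
    """Keep only names of the form 'Data <number>' (assumed naming scheme)."""
    return [name for name in names if re.fullmatch(r"Data \d+", name)]
```

With the names in hand, each "Data N" sheet can then be read individually, for example with pd.read_excel(path, sheet_name=name), so N never has to be known in advance.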
With openpyxl's load_workbook in read-only mode, a step that previously stalled visibly for many seconds now completes in milliseconds. The solution could still be improved, however.
import pandas as pd
from openpyxl import load_workbook
class ExcelFile:
    def __init__(self, **kwargs):
        ........
        .....
        self._SheetNames = list(load_workbook(self._name, read_only=True, keep_links=False).sheetnames)
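Since the class above is elided, here is a standalone sketch of the same idea: opening the workbook with openpyxl in read-only mode and reading only its sheet names. The function name sheet_names_read_only is hypothetical, and this only applies to formats openpyxl can open (such as .xlsx), not legacy .xls files.

```python
from openpyxl import load_workbook


def sheet_names_read_only(path):
    """List sheet names using openpyxl's read-only mode."""
    # read_only=True avoids loading cell data eagerly;
    # keep_links=False skips links to external workbooks.
    wb = load_workbook(path, read_only=True, keep_links=False)
    try:
        return list(wb.sheetnames)
    finally:
        # Read-only workbooks keep the file handle open until closed.
        wb.close()
```

Closing the workbook explicitly matters on Windows in particular, where an open handle can prevent the file from being moved or overwritten afterwards.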
ExcelFile.parse takes about as long as reading the complete xls file, on the order of tens of seconds. These results were obtained on Windows 10 with the package versions below:
C:\>python -V
Python 3.9.1
C:\>pip list
Package         Version
--------------- -------
et-xmlfile      1.0.1
numpy           1.20.2
openpyxl        3.0.7
pandas          1.2.3
pip             21.0.1
python-dateutil 2.8.1
pytz            2021.1
pyxlsb          1.0.8
setuptools      49.2.1
six             1.15.0
xlrd            2.0.1