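I faced the same problem: I needed to read cell values whether the cells hold scalars, formulas with precomputed values, or formulas without them, and I preferred fault tolerance over correctness.
The strategy is simple:
1. If the cell doesn't contain a formula, return the cell's value.
2. If it is a formula, try to get its precomputed value.
3. If no precomputed value is available, try to evaluate the formula with pycel.
4. If that fails (because pycel supports only a limited set of formulas, or because of some other error), log a warning and return None.
I wrote a class that hides all of this machinery and provides a simple interface for reading cell values.
If you value correctness over fault tolerance, the class is easy to modify so that it raises an exception at step 4.
Hope it helps someone.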
    import logging
    from pathlib import Path
    from traceback import format_exc
    from typing import Optional

    from openpyxl import load_workbook
    from pycel.excelcompiler import ExcelCompiler


    class MESSAGES:
        CANT_EVALUATE_CELL = ("Couldn't evaluate cell {address}."
                              " Try to load and save xlsx file.")


    class XLSXReader:
        """
        Provides (almost) universal interface to read xlsx file cell values.

        For formulae, tries to get their precomputed values or, if none,
        to evaluate them.
        """

        def __init__(self, path: Path):
            self.__path = path
            self.__book = load_workbook(self.__path, data_only=False)
            # Both helpers are created lazily on first use. They start as None
            # because hasattr(self, '__name') is always False inside a class:
            # the attribute is stored under its mangled name.
            self.__book_with_precomputed_values = None
            self.__formulae_calculator = None

        def get_cell_value(self, address: str, sheet: Optional[str] = None):
            if sheet is None:
                sheet = self.__book.active.title
            # Step 1: plain value, no formula involved.
            if not self.__cell_contains_formula(address, sheet):
                return self.__get_as_is(address, sheet)
            # Step 2: value cached in the file by the application that saved it.
            precomputed_value = self.__get_precomputed(address, sheet)
            if precomputed_value is not None:
                return precomputed_value
            # Steps 3 and 4: evaluate with pycel, tolerate any failure.
            try:
                return self.__compute(address, sheet)
            except Exception:
                logging.warning(MESSAGES.CANT_EVALUATE_CELL
                                .format(address=address))
                logging.debug(format_exc())
                return None

        def __cell_contains_formula(self, address, sheet):
            cell = self.__book[sheet][address]
            # openpyxl uses the data type 'f' for formula cells.
            return cell.data_type == "f"

        def __get_as_is(self, address, sheet):
            return self.__book[sheet][address].value

        def __get_precomputed(self, address, sheet):
            if self.__book_with_precomputed_values is None:
                self.__book_with_precomputed_values = load_workbook(
                    self.__path, data_only=True)
            return self.__book_with_precomputed_values[sheet][address].value

        def __compute(self, address, sheet):
            if self.__formulae_calculator is None:
                self.__formulae_calculator = ExcelCompiler(
                    filename=str(self.__path))
            # gen_graph belongs to the pycel version this was written against;
            # later releases may only need ExcelCompiler.evaluate("Sheet!A1").
            computation_graph = self.__formulae_calculator.gen_graph(
                address, sheet=sheet)
            return computation_graph.evaluate(f"{sheet}!{address}")
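
The step 4 variation mentioned above (raising instead of returning None) can be sketched without openpyxl or pycel. This is only an illustration of the fallback chain, not part of the class: `read_value`, `CellEvaluationError`, the `strict` flag and `always_fails` are hypothetical names, and `evaluate` stands in for whatever actually computes the formula.

```python
import logging
from typing import Any, Callable, Optional


class CellEvaluationError(Exception):
    """Hypothetical error raised when a formula cannot be evaluated."""


def read_value(raw: Any,
               is_formula: bool,
               precomputed: Optional[Any],
               evaluate: Callable[[], Any],
               strict: bool = False) -> Any:
    """Apply the four-step fallback; strict=True makes step 4 raise."""
    if not is_formula:                 # step 1: plain value
        return raw
    if precomputed is not None:        # step 2: value cached in the file
        return precomputed
    try:
        return evaluate()              # step 3: evaluate the formula
    except Exception as exc:           # step 4: evaluation failed
        if strict:
            raise CellEvaluationError(f"Couldn't evaluate {raw!r}") from exc
        logging.warning("Couldn't evaluate %r", raw)
        return None


def always_fails() -> Any:
    """Stand-in evaluator that mimics an unsupported formula."""
    raise NotImplementedError("unsupported function")
```

With `strict=False`, `read_value("=A1+1", True, None, always_fails)` logs a warning and returns None; with `strict=True` the same call raises `CellEvaluationError`, chained to the original exception so the cause stays visible in the traceback.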
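openpyxl cannot replicate Excel's vast formula-computation subsystem, so I don't think there is any way to have it automatically recompute all formulas. At best, you could implement your own eval-based interpreter for a subset of that large formula language. - Alex Martelli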
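
To give a rough idea of what an interpreter for a small subset might look like, here is a minimal sketch. It assumes formulas limited to numbers, single-cell references, parentheses and `+ - * /`, and it walks Python's `ast` tree rather than calling a bare `eval`, so unknown constructs are rejected instead of executed. `evaluate_formula` and the `cells` dictionary are hypothetical names for illustration; ranges such as `A1:B3`, functions such as `SUM`, Excel's `^` operator and circular-reference detection are all left out.

```python
import ast
import operator
from typing import Dict, Union

Number = Union[int, float]
CellContent = Union[int, float, str]

# The only binary operators this sketch understands.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula: str, cells: Dict[str, CellContent]) -> Number:
    """Evaluate an '=...' formula against a mapping of cell name to content."""
    expression = formula.lstrip("=").strip()
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body, cells)


def _eval_node(node: ast.AST, cells: Dict[str, CellContent]) -> Number:
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        # A bare name such as A1 is treated as a cell reference.
        content = cells[node.id]
        if isinstance(content, str) and content.startswith("="):
            return evaluate_formula(content, cells)  # formula in that cell
        return content
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left, cells)
        right = _eval_node(node.right, cells)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand, cells)
    raise ValueError(f"Unsupported formula element: {type(node).__name__}")
```

For example, with `cells = {"A1": 2, "B1": 3, "C1": "=A1+B1*2"}`, calling `evaluate_formula("=C1", cells)` follows the reference into `C1` and returns 8. Anything outside the subset, such as a function call, raises `ValueError`, which is the point at which a real tool would fall back to a library like pycel or to the precomputed value.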