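I'm working on a script that will eventually allow data exploration with ipywidgets. I've got some parts working so that the filtering adapts to the number of columns the user is interested in, but implementing the interact function dynamically is proving difficult. Here is the sample code I'm running in Jupyter: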
import ipywidgets as widgets
from ipywidgets import interact
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/yankev/testing/master/datasets/nycflights.csv')
df = df.drop(df.columns[[0]], axis=1)
filter_cols = ['origin', 'dest', 'carrier']  # list the N columns we want to filter on
filter_df = df[filter_cols].drop_duplicates()  # pull the selected N columns and remove duplicates
#loop through columns and create variables/widgets
for idx, val in enumerate(filter_cols):
    #creates N variables (filter0, filter1, filter2) with unique values for each column with an All option
    globals()['filter{}'.format(idx)] = ['All']+sorted(filter_df[val].unique().tolist())
    #creates N widgets (widget0, widget1, widget2) for interact function below
    globals()['widget{}'.format(idx)] = widgets.SelectMultiple(
        options=globals()['filter{}'.format(idx)],
        value=['All'],
        description=val,
        disabled=False
    )
#looking to make this function dynamic based on the number of columns we want to filter by
#filters down source dataframe based on widget value selections
def viewer(a, b, c = list()):
    #if widget selection is 'All', pass the full filter list, else filter only to what is selected in the widget
    return df[df['origin'].isin(filter0 if a==('All',) else a)
              & df['dest'].isin(filter1 if b==('All',) else b)
              & df['carrier'].isin(filter2 if c==('All',) else c)].shape[0]
#displays N filters
#returns record count for filter combination
interact(viewer, a=widget0, b=widget1, c=widget2)
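The code after the loop is the part I'd like to make dynamic. As it stands, adding or removing a filter means changing the column-name references and adding or removing code. It would be nice to limit the changes to just a few places in the script.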
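To make the goal concrete, here is a rough sketch of the shape I have in mind, not a working solution I've verified in Jupyter. It keeps the widgets in a dict keyed by column name instead of numbered globals, and builds the row mask in a loop. The names `build_mask` and `make_viewer` are hypothetical, and I'm assuming `interact` will accept a function that takes `**kwargs` and create one control per keyword argument.

```python
import pandas as pd


def build_mask(df, selections):
    """Return a boolean mask of rows that match every column's selection.

    `selections` maps a column name to a tuple of chosen values. A selection
    of exactly ('All',) leaves that column unfiltered, as in the code above.
    """
    mask = pd.Series(True, index=df.index)
    for col, chosen in selections.items():
        if tuple(chosen) != ('All',):
            mask &= df[col].isin(chosen)
    return mask


def make_viewer(df, filter_cols):
    """Build one SelectMultiple per column and pass them all to interact."""
    # Imported here so build_mask can be used outside a notebook.
    import ipywidgets as widgets
    from ipywidgets import interact

    controls = {
        col: widgets.SelectMultiple(
            options=['All'] + sorted(df[col].dropna().unique().tolist()),
            value=['All'],
            description=col,
        )
        for col in filter_cols
    }

    def viewer(**selections):
        # Assumption: interact passes each widget's value under its column name.
        return int(build_mask(df, selections).sum())

    return interact(viewer, **controls)
```

With something like this, the only line to change when adding or removing a filter would be the call itself, e.g. `make_viewer(df, ['origin', 'dest', 'carrier'])`.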
Any suggestions are greatly appreciated. Thanks!