Drawing a step line over a histogram

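I am trying to replicate this chart as closely as possible with Altair: https://fivethirtyeight.com/wp-content/uploads/2014/04/hickey-bechdel-11.png?w=575
I am stuck on getting the black line that separates pass from fail. It resembles this Altair example: https://altair-viz.github.io/gallery/step_chart.html
However, in the 538 visualization the value for the last date has to extend across the full width of the last element. In the step chart example and in my solution, the line stops as soon as it reaches the last date element.
I have looked through Altair's GitHub and the Google group but found no similar question.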
import altair as alt
import pandas as pd

movies=pd.read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/bechdel/movies.csv')
domain = ['ok', 'dubious','men', 'notalk', 'nowomen']

base=alt.Chart(movies).encode(
    alt.X(
        "year:N",
        bin=alt.BinParams(step=5, extent=[1970, 2015]),
        axis=alt.Axis(labelAngle=0, labelLimit=50, labelFontSize=8),
        title=None,
    ),
    alt.Y(
        "count()",
        stack='normalize',
        title=None,
        axis=alt.Axis(format='%', values=[0, 0.25, 0.50, 0.75, 1]),
    ),
).properties(width=400)
main=base.transform_calculate(cleanrank='datum.clean_test == "ok" ? 1 : datum.clean_test == "dubious" ? 2 : datum.clean_test == "men" ? 3 : datum.clean_test == "notalk" ? 4 : 5'
                ).mark_bar(stroke='white' #add horizontal lines
                ).encode(  
  alt.Color("clean_test:N",scale=alt.Scale(
      domain=domain,
      range=['dodgerblue', 'skyblue', 'pink', 'coral','red']))
    ,order=alt.Order('cleanrank:O', sort='ascending')
)

extra=base.transform_calculate(cleanpass='datum.clean_test == "ok" ? "PASS" : datum.clean_test == "dubious" ? "PASS" : "FAIL"'
                      ).mark_line(interpolate='step-after'
                      ).encode(alt.Color("cleanpass:N",scale=alt.Scale(domain=['PASS','FAIL'],range=['black','white']))
                      )



alt.layer(main,extra).configure_scale(
    bandPaddingInner=0.01 #smaller vertical lines
).resolve_scale(color='independent')
1 Answer

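A rather crude way to make the step line run from the start of the first bin to the end of the last bin is to control the bin positions manually, using the rank of the ordered bins as the x value.

With that in place we can add two lines: one drawn with 'step-after', and a second drawn with 'step-before' on data shifted forward by one bin. The tick labels still need to be replaced with the proper bin labels and centered, for example using the levels returned by pd.cut.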
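To see why the two lines together cover every bin, here is a minimal pure-Python sketch of the horizontal segments each interpolation draws. The helper `step_segments` and the share values are made up for illustration and are not part of Altair.

```python
def step_segments(points, mode):
    """Return the horizontal segments (x_start, x_end, y) of a step line.

    points: list of (x, y) pairs sorted by x.
    mode: 'step-after' holds each y until the next x;
          'step-before' uses the y of the point at the right end.
    """
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        y = y0 if mode == "step-after" else y1
        segments.append((x0, x1, y))
    return segments


# Hypothetical pass share per bin, keyed by bin rank.
share = [(0, 0.40), (1, 0.50), (2, 0.55)]

# step-after has no point after the last one, so the last bin gets no segment.
after = step_segments(share, "step-after")

# Shift every point one bin to the right, then draw with step-before.
shifted = [(x + 1, y) for x, y in share]
before = step_segments(shifted, "step-before")

# Together the two lines cover all three bins, including the last one.
covered = sorted(set(after) | set(before))
```

The 'step-after' line alone misses the segment from 2 to 3; the shifted 'step-before' line supplies it and overlaps the first line everywhere else.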

Dataframe preparation

import altair as alt
import pandas as pd

movies=pd.read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/bechdel/movies.csv')
domain = ['ok', 'dubious','men', 'notalk', 'nowomen']

movies['year_bin'] = pd.cut(movies['year'], range(1970, 2016, 5))
movies['year_rank'] = movies['year_bin'].cat.codes
movies = movies[movies['year_rank']>=0]
df_plot = movies[['year_rank', 'clean_test']].copy()
df_plot['year_rank_end'] = df_plot['year_rank'] + 1
df_plot['clean_pass'] = df_plot['clean_test'].apply(lambda x: 'PASS' if x in ['ok', 'dubious'] else 'FAIL')
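As a small check of what the binning above does, the following sketch runs the same `pd.cut` and `cat.codes` steps on a handful of made-up years. The sample values are an assumption for illustration only.

```python
import pandas as pd

# Small made-up sample; 1970 and 2016 fall outside the bins defined below.
years = pd.Series([1970, 1971, 1975, 1976, 2013, 2015, 2016])

# pd.cut builds right-closed intervals by default: (1970, 1975], (1975, 1980], ...
bins = list(range(1970, 2016, 5))
year_bin = pd.cut(years, bins)

# Ordered category codes serve as bin ranks; values outside every bin get -1.
year_rank = year_bin.cat.codes

ranks = year_rank.tolist()
kept = years[year_rank >= 0].tolist()
```

Note that the year 1970 itself receives rank -1 and is filtered out, because the default intervals exclude their left edge. Passing `right=False` to `pd.cut` would give left-closed bins instead, at the cost of excluding 2015.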

Chart declaration

base=alt.Chart(df_plot).encode(
    x=alt.X('year_rank',
        axis=alt.Axis(labelAngle=0, labelLimit=50, labelFontSize=8),
        title=None
        ),
    x2='year_rank_end',
    y=alt.Y('count()', title=None, stack='normalize',
        axis=alt.Axis(format='%', values=[0, 0.25, 0.50, 0.75, 1])
        )
).properties(width=400)

main=base.transform_calculate(
    cleanrank='datum.clean_test == "ok" ? 1 : datum.clean_test == "dubious" ? 2 : datum.clean_test == "men" ? 3 : datum.clean_test == "notalk" ? 4 : 5'
    ).mark_bar(
        stroke='white' #add horizontal lines
    ).encode( 
  alt.Color("clean_test:N",scale=alt.Scale(
      domain=domain,
      range=['dodgerblue', 'skyblue', 'pink', 'coral','red']))
    ,order=alt.Order('cleanrank:O', sort='ascending')
)

# step-after line on the unshifted ranks; no calculated field is needed here
extra=base.mark_line(
        interpolate='step-after'
    ).encode(
        alt.Color("clean_pass:N",scale=alt.Scale(domain=['PASS','FAIL'],range=['black','white']))
    )

extra2=base.transform_calculate(
    # shift data by one bin, so that step-before matches the unshifted step-after
    year_rank='datum.year_rank +1' 
    ).mark_line(
        interpolate='step-before'
    ).encode(
        alt.Color("clean_pass:N",scale=alt.Scale(domain=['PASS','FAIL'],range=['black','white']), legend=None)
    )

alt.layer(main, extra, extra2).configure_scale(
    bandPaddingInner=0.01 #smaller vertical lines
).resolve_scale(color='independent')
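For the remaining tick-label step, one possible approach is to place a tick at the center of each bin and map the tick value back to a label with a Vega expression. This is only a sketch: the label format and the names `tick_values` and `label_expr` are assumptions, and it presumes an Altair version whose `alt.Axis` accepts `labelExpr`.

```python
import json

# Same bin edges as in the dataframe preparation.
edges = list(range(1970, 2016, 5))

# The pd.cut bins are right-closed, so for integer years (1970, 1975]
# holds 1971 through 1975. The label format is a hypothetical choice.
labels = [f"{lo + 1}-{hi}" for lo, hi in zip(edges, edges[1:])]

# Bin i spans year_rank i to i + 1, so its center sits at i + 0.5.
tick_values = [i + 0.5 for i in range(len(labels))]

# Vega expression: index into the label array by the bin the tick falls in.
label_expr = f"{json.dumps(labels)}[floor(datum.value)]"
```

These could then be passed to the x axis as `alt.Axis(values=tick_values, labelExpr=label_expr, labelAngle=0, labelLimit=50, labelFontSize=8)` in place of the axis used in `base`.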
