Problem description
I need help adding the percent distribution of the total (no decimals) to each section of a stacked bar plot in pandas, created from a crosstab of a dataframe.
Here is the sample data:
import pandas as pd

data = {
    'Name': ['Alisa', 'Bobby', 'Bobby', 'Alisa', 'Bobby', 'Alisa',
             'Alisa', 'Bobby', 'Bobby', 'Alisa', 'Bobby', 'Alisa'],
    'Exam': ['Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1',
             'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2'],
    'Subject': ['Mathematics', 'Mathematics', 'English', 'English', 'Science', 'Science',
                'Mathematics', 'Mathematics', 'English', 'English', 'Science', 'Science'],
    'Result': ['Pass', 'Pass', 'Fail', 'Pass', 'Fail', 'Pass',
               'Pass', 'Fail', 'Fail', 'Pass', 'Pass', 'Fail'],
}
df = pd.DataFrame(data)

# display(df)
     Name        Exam      Subject Result
0   Alisa  Semester 1  Mathematics   Pass
1   Bobby  Semester 1  Mathematics   Pass
2   Bobby  Semester 1      English   Fail
3   Alisa  Semester 1      English   Pass
4   Bobby  Semester 1      Science   Fail
5   Alisa  Semester 1      Science   Pass
6   Alisa  Semester 2  Mathematics   Pass
7   Bobby  Semester 2  Mathematics   Fail
8   Bobby  Semester 2      English   Fail
9   Alisa  Semester 2      English   Pass
10  Bobby  Semester 2      Science   Pass
11  Alisa  Semester 2      Science   Fail
Here is my code:
import matplotlib.pyplot as plt

# crosstab
pal = ["royalblue", "dodgerblue", "lightskyblue", "lightblue"]

# Row-wise percentages: each Name's subject counts divided by that row's total
ax = pd.crosstab(df['Name'], df['Subject']).apply(lambda r: r/r.sum()*100, axis=1)
ax.plot.bar(figsize=(10,10), stacked=True, rot=0, color=pal)
display(ax)  # notebook helper; shows the percentage table

plt.legend(loc='best', bbox_to_anchor=(0.1, 1.0), title="Subject")
plt.xlabel('Name')
plt.ylabel('Percent Distribution')
plt.show()
I know I need to add plt.text somehow, but I can't figure it out. I would like the percent of the total to be embedded within each section of the stacked bars.
Recommended answer
Let's try this:
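import matplotlib.pyplot as plt

# crosstab
pal = ["royalblue", "dodgerblue", "lightskyblue", "lightblue"]

# Row-wise percentages (this is a DataFrame, despite the name `ax`)
ax = pd.crosstab(df['Name'], df['Subject']).apply(lambda r: r/r.sum()*100, axis=1)

# Keep the Axes returned by the plot call so the bar patches can be annotated
ax_1 = ax.plot.bar(figsize=(10,10), stacked=True, rot=0, color=pal)
display(ax)  # notebook helper; shows the percentage table

plt.legend(loc='upper center', bbox_to_anchor=(0.1, 1.0), title="Subject")
plt.xlabel('Name')
plt.ylabel('Percent Distribution')

# Each patch is one bar segment; put its height (the percentage) at the segment's middle
for rec in ax_1.patches:
    height = rec.get_height()
    ax_1.text(rec.get_x() + rec.get_width() / 2,
              rec.get_y() + height / 2,
              "{:.0f}%".format(height),
              ha='center',
              va='bottom')

plt.show()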
Output:
Subject    English  Mathematics    Science
Name
Alisa    33.333333    33.333333  33.333333
Bobby    33.333333    33.333333  33.333333
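As an alternative to looping over the patches by hand, Matplotlib 3.4 and newer provide Axes.bar_label, which can place a label at the center of every bar segment. The sketch below is not part of the original answer. It assumes Matplotlib 3.4 or newer, uses crosstab's normalize="index" option in place of the apply/lambda, and introduces the names pct, labels and label_texts purely for illustration. The Agg backend line is only there so the script runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as in the question
data = {
    "Name": ["Alisa", "Bobby", "Bobby", "Alisa", "Bobby", "Alisa",
             "Alisa", "Bobby", "Bobby", "Alisa", "Bobby", "Alisa"],
    "Exam": ["Semester 1"] * 6 + ["Semester 2"] * 6,
    "Subject": ["Mathematics", "Mathematics", "English", "English", "Science", "Science",
                "Mathematics", "Mathematics", "English", "English", "Science", "Science"],
    "Result": ["Pass", "Pass", "Fail", "Pass", "Fail", "Pass",
               "Pass", "Fail", "Fail", "Pass", "Pass", "Fail"],
}
df = pd.DataFrame(data)

# normalize="index" divides each row by its own total; * 100 turns that into percent
pct = pd.crosstab(df["Name"], df["Subject"], normalize="index") * 100

pal = ["royalblue", "dodgerblue", "lightskyblue"]
ax = pct.plot.bar(figsize=(10, 10), stacked=True, rot=0, color=pal)

# One container per Subject; label_type="center" puts the text inside each segment
labels = []
for container in ax.containers:
    labels.extend(ax.bar_label(container, fmt="%.0f%%", label_type="center"))

ax.set_xlabel("Name")
ax.set_ylabel("Percent Distribution")
ax.legend(title="Subject")

label_texts = [t.get_text() for t in labels]
```

With this data every segment is one third of its bar, so each label should read 33%. Call plt.show() or ax.figure.savefig(...) afterwards to view the result.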