Problem description
I need some help here. I have something like this:
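    import pandas as pd

    # Load the first sheet of the workbook into a DataFrame
    path = '/Users/arronteb/Desktop/excel/ejemplo.xlsx'
    xlsx = pd.ExcelFile(path)
    df = pd.read_excel(xlsx, 'Sheet1')

    # Flag rows whose '#CSR' value already appeared in an earlier row
    df['is_duplicated'] = df.duplicated('#CSR')

    # Keep only the rows that are not flagged, then save the result
    df_nodup = df.loc[df['is_duplicated'] == False]
    df_nodup.to_excel('ejemplo.xlsx', encoding='utf-8')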
So basically this program loads ejemplo.xlsx ("ejemplo" is Spanish for "example"; it is just the name of the file) into df (a DataFrame), then checks for duplicate values in a specific column. It deletes the duplicates and saves the file again. That part works correctly. The problem is that, instead of removing the duplicates, I need to highlight the cells containing them in a different color, such as yellow.
Recommended answer
You can create a function to do the highlighting...
    def highlight_cells(column):
        # provide your criteria for highlighting the cells here;
        # Styler.apply passes in one column and expects one CSS string per cell
        return ['background-color: yellow' for _ in column]
And then apply your highlighting function to your DataFrame...
df.style.apply(highlight_cells)
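To make the criteria concrete, here is a minimal sketch of how the highlighting function might single out the duplicated values in the '#CSR' column. The names highlight_duplicates and style_duplicates, and the sample data, are made up for illustration and are not part of the original answer. It also assumes that every occurrence of a repeated value should be colored (keep=False); use keep='first' to match the behavior of df.duplicated('#CSR') in the question, which leaves the first occurrence unmarked.

```python
import pandas as pd

YELLOW = "background-color: yellow"


def highlight_duplicates(column: pd.Series) -> list:
    """Return one CSS string per cell: yellow for repeated values, '' otherwise."""
    # Assumption: keep=False flags every occurrence of a repeated value,
    # including the first one.
    is_dup = column.duplicated(keep=False)
    return [YELLOW if dup else "" for dup in is_dup]


def style_duplicates(df: pd.DataFrame, column_name: str):
    """Build a Styler that colors duplicates in a single column.

    Styler.apply calls highlight_duplicates once per column in `subset`.
    (Creating a Styler requires jinja2 to be installed.)
    """
    return df.style.apply(highlight_duplicates, subset=[column_name])


# Hypothetical sample data standing in for the contents of ejemplo.xlsx.
sample = pd.DataFrame(
    {"#CSR": [101, 102, 101, 103], "name": ["a", "b", "c", "d"]}
)

# The per-cell styles that would be applied to the '#CSR' column.
styles = highlight_duplicates(sample["#CSR"])
```

To write the colored cells back to Excel, the Styler itself can be saved, for example style_duplicates(df, '#CSR').to_excel('ejemplo_highlighted.xlsx', engine='openpyxl', index=False). The output file name here is only a placeholder, and openpyxl must be installed. Calling df.style.apply(...) on its own does not modify df, so saving df rather than the returned Styler would lose the colors.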