Problem description
Say I have the following dataframe:
What is the most efficient way to update the values of the columns feat and another_feat where the stream is number 2?
Is it this?

    for index, row in df1.iterrows():
        if df1.loc[index, 'stream'] == 2:
            pass  # do something
UPDATE: What should I do if I have more than 100 columns? I don't want to explicitly name the columns that I want to update. I want to divide the value of each column by 2 (except for the stream column).
So, to be clear about my goal:
Divide all values by 2 in every row that has stream 2, without changing the stream column.
Recommended answer
I think you can use loc if you need to update two columns to the same value:
    # set both columns to 'aaaa' in the rows where stream == 2
    df1.loc[df1['stream'] == 2, ['feat', 'another_feat']] = 'aaaa'
    print(df1)
       stream        feat another_feat
    a       1  some_value   some_value
    b       2        aaaa         aaaa
    c       2        aaaa         aaaa
    d       3  some_value   some_value
If you need to update the columns separately, one option is:
    # set only 'feat' to 10 in the rows where stream == 2
    df1.loc[df1['stream'] == 2, 'feat'] = 10
    print(df1)
       stream        feat another_feat
    a       1  some_value   some_value
    b       2          10   some_value
    c       2          10   some_value
    d       3  some_value   some_value
Another common option is numpy.where:
    import numpy as np

    # 10 where stream == 2, otherwise 20 (this overwrites the whole column)
    df1['feat'] = np.where(df1['stream'] == 2, 10, 20)
    print(df1)
       stream  feat another_feat
    a       1    20   some_value
    b       2    10   some_value
    c       2    10   some_value
    d       3    20   some_value
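Note that numpy.where assigns a value to every row, so the rows that do not match get the "else" value. If the intent is to leave the non-matching rows untouched, one possible sketch is to pass the existing column as the third argument. The sample data below is hypothetical and only mirrors the shape of the frame used in this answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not the asker's real frame).
df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": [4, 4, 2, 1],
        "another_feat": [5, 5, 9, 7],
    },
    index=list("abcd"),
)

# Use 10 where stream == 2; everywhere else keep the current 'feat' value.
df1["feat"] = np.where(df1["stream"] == 2, 10, df1["feat"])

print(df1)
```

Only rows b and c change here; rows a and d keep their original feat values.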
If you need to divide all columns except stream in the rows where the condition is True, use:
    print(df1)
       stream  feat  another_feat
    a       1     4             5
    b       2     4             5
    c       2     2             9
    d       3     1             7

    # select every column except 'stream'
    cols = [col for col in df1.columns if col != 'stream']
    print(cols)
    ['feat', 'another_feat']

    # halve the selected columns only in the rows where stream == 2
    df1.loc[df1['stream'] == 2, cols] = df1 / 2
    print(df1)
       stream  feat  another_feat
    a       1   4.0           5.0
    b       2   2.0           2.5
    c       2   1.0           4.5
    d       3   1.0           7.0
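For reference, here is a self-contained variant of the same idea that can be run as is. It is a sketch under two assumptions: the sample values are the ones printed above, and the integer columns are cast to float up front, as a precaution, so that the halved values fit the column dtype:

```python
import pandas as pd

# Sample data matching the frame printed above.
df1 = pd.DataFrame(
    {
        "stream": [1, 2, 2, 3],
        "feat": [4, 4, 2, 1],
        "another_feat": [5, 5, 9, 7],
    },
    index=list("abcd"),
)

# Every column except 'stream', without naming them explicitly.
cols = [col for col in df1.columns if col != "stream"]

# Precaution (an assumption, not part of the original answer):
# make the target columns float before writing halves into them.
df1[cols] = df1[cols].astype(float)

# Boolean mask for the rows to update.
mask = df1["stream"] == 2

# Halve only the masked rows in the selected columns.
df1.loc[mask, cols] = df1.loc[mask, cols] / 2

print(df1)
```

Because cols is computed from df1.columns, the same code applies unchanged to a frame with a hundred or more columns.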
If you need to handle multiple conditions, use nested numpy.where calls or numpy.select:
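    import numpy as np
    import pandas as pd

    df0 = pd.DataFrame({'Col': [5, 0, -6]})

    # nested numpy.where: the inner call handles the rows the outer condition rejects
    df0['New Col1'] = np.where(df0['Col'] > 0, 'Increasing',
                      np.where(df0['Col'] < 0, 'Decreasing', 'No Change'))

    # numpy.select: one list of conditions, one list of choices, plus a default
    df0['New Col2'] = np.select([df0['Col'] > 0, df0['Col'] < 0],
                                ['Increasing', 'Decreasing'],
                                default='No Change')

    print(df0)
       Col    New Col1    New Col2
    0    5  Increasing  Increasing
    1    0   No Change   No Change
    2   -6  Decreasing  Decreasing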