Problem description
I have two columns in a Pandas data frame that are dates.
I want to subtract one column from the other and get the difference as an integer number of days.
A look at the data:
df_test.head(10)
Out[20]:
  First_Date Second Date
0 2016-02-09  2015-11-19
1 2016-01-06  2015-11-30
2        NaT  2015-12-04
3 2016-01-06  2015-12-08
4        NaT  2015-12-09
5 2016-01-07  2015-12-11
6        NaT  2015-12-12
7        NaT  2015-12-14
8 2016-01-06  2015-12-14
9        NaT  2015-12-15
I have successfully created a new column with the difference:
df_test['Difference'] = df_test['First_Date'].sub(df_test['Second Date'], axis=0)
df_test.head()
Out[22]:
  First_Date Second Date Difference
0 2016-02-09  2015-11-19    82 days
1 2016-01-06  2015-11-30    37 days
2        NaT  2015-12-04        NaT
3 2016-01-06  2015-12-08    29 days
4        NaT  2015-12-09        NaT
However, I am unable to get a numeric version of the result:
df_test['Difference'] = df_test[['Difference']].apply(pd.to_numeric)
df_test.head()
Out[25]:
  First_Date Second Date    Difference
0 2016-02-09  2015-11-19  7.084800e+15
1 2016-01-06  2015-11-30  3.196800e+15
2        NaT  2015-12-04           NaN
3 2016-01-06  2015-12-08  2.505600e+15
4        NaT  2015-12-09           NaN
Recommended answer
How about:
df_test['Difference'] = (df_test['First_Date'] - df_test['Second Date']).dt.days
This returns the difference as int if there are no missing values (NaT), and as float if there are any.
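If integers are needed even when some rows are NaT, one option (not part of the original answer) is to cast the result to pandas' nullable `Int64` dtype. The sketch below is a minimal illustration under assumptions: the small frame is a hypothetical sample that mirrors the first rows of the question's data, and it assumes a pandas version that supports nullable integers. It also suggests why `pd.to_numeric` produced values like 7.084800e+15: those appear to be the timedeltas expressed in nanoseconds.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows of the question's data.
df_test = pd.DataFrame({
    "First_Date": pd.to_datetime(["2016-02-09", "2016-01-06", None]),
    "Second Date": pd.to_datetime(["2015-11-19", "2015-11-30", "2015-12-04"]),
})

# Subtracting two datetime columns gives a timedelta column (NaT where a date is missing).
delta = df_test["First_Date"] - df_test["Second Date"]

# .dt.days extracts whole days; with NaT present the result typically comes back as float.
days_float = delta.dt.days

# Casting to the nullable integer dtype keeps whole numbers and marks missing rows as <NA>.
days_int = days_float.astype("Int64")
df_test["Difference"] = days_int

# For comparison: a timedelta's underlying value is a count of nanoseconds,
# which matches the large numbers seen in the question's pd.to_numeric output.
nanoseconds_in_82_days = pd.Timedelta(days=82).value

print(df_test)
print(nanoseconds_in_82_days)
```

Whether `Int64` or plain float is preferable depends on what happens downstream; some older code does not handle `<NA>` well, in which case filling or dropping the missing rows first and then using `.astype(int)` may be simpler.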
Pandas has rich documentation on Time series / date functionality and Time deltas.