How to find the difference between rows of a cumulative count while preserving the other columns

Question


I have the following data:

machine_id  time_to_failure
430494        1000
430494        700
430494        500
430494        100
430495        1000
430495        200

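The time-to-failure values are measured from reference day 0. I would like to convert them, for each machine, to the time elapsed since the previous failure: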

machine_id  time_to_failure
430494        300
430494        200
430494        400
430494        100
430495        800
430495        200

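I tried using groupby and pivoting to turn the repeated rows into new columns and then subtract them. However, I would like to do this in place so that the other columns are preserved.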

- Sample_friend

1 Answer


- Quang Hoang · Accepted Answer

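Let's try groupby().diff(). With diff(-1), each row has the next row of the same machine_id subtracted from it; the last row of each group has no next row and becomes NaN, so fillna() restores its original value: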

df['time_to_failure'] = (df.groupby('machine_id')
                            ['time_to_failure'].diff(-1)
                           .fillna(df['time_to_failure'])
                        )

Output:

   machine_id  time_to_failure
0      430494            300.0
1      430494            200.0
2      430494            400.0
3      430494            100.0
4      430495            800.0
5      430495            200.0
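
For anyone who wants to run this end to end, here is a minimal self-contained sketch built on the sample data from the question. It is a variant of the approach above, not the exact code from the answer: it uses groupby().shift(-1, fill_value=0) instead of diff(-1) plus fillna(), which should keep the column as integers rather than floats. The site column and the time_since_last_failure column name are hypothetical, added only to show that unrelated columns are left untouched.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical extra column ("site")
# to illustrate that other columns are preserved.
df = pd.DataFrame({
    "machine_id": [430494, 430494, 430494, 430494, 430495, 430495],
    "time_to_failure": [1000, 700, 500, 100, 1000, 200],
    "site": ["A", "A", "A", "A", "B", "B"],
})

# For every row, look up the next value within the same machine_id.
# The last row of each group has no next value, so 0 is used instead,
# which leaves that row's original value unchanged after subtraction.
next_value = df.groupby("machine_id")["time_to_failure"].shift(-1, fill_value=0)

# Hypothetical column name; assign back to "time_to_failure" instead
# to overwrite the original column as the answer does.
df["time_since_last_failure"] = df["time_to_failure"] - next_value

print(df)
```

The new column holds 300, 200, 400, 100, 800, 200, which matches the desired result in the question. Both versions assume the rows are already ordered within each machine as in the sample (largest time_to_failure first); if they are not, sort them first.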