Pandas - combine two columns

Question

3

I have two columns of data, called x and y. I want to create a new column called xy:

x    y    xy
1         1
2         2
     4    4
     8    8

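There shouldn't be any conflicting values, but if there are, y takes priority. If it makes the solution easier, you can assume that x is always NaN wherever y has a value.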

- JesusMonroe

3 Answers

3

Note that your columns currently hold strings rather than numeric values.

df = df.apply(lambda x : pd.to_numeric(x, errors='coerce'))

df['xy'] = df.sum(1)

More:

df['xy'] =df[['x','y']].astype(str).apply(''.join,1)

#df[['x','y']].astype(str).apply(''.join,1)
Out[655]: 
0    1.0
1    2.0
2       
3    4.0
4    8.0
dtype: object

- BENY

No need for the lambda here: you can write df.apply(pd.to_numeric, errors='coerce'). - Jon Clements
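
A minimal sketch of the lambda-free form from the comment above. The sample frame is made up for illustration and assumes the blanks are empty strings:

```python
import pandas as pd

# Hypothetical sample: values stored as strings, blanks as empty strings.
df = pd.DataFrame({'x': ['1', '2', '', ''],
                   'y': ['', '', '4', '8']})

# Pass the function and its keyword argument straight to apply; no lambda needed.
# errors='coerce' makes sure anything unparseable ends up as NaN.
num = df.apply(pd.to_numeric, errors='coerce')

# The row-wise sum skips NaN by default, leaving the one value present in each row.
df['xy'] = num.sum(axis=1)
print(df)
```

Note that a sum adds the two values when both columns are filled, so this only fits the case where there are no conflicts.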

0

You can also use NumPy:

import pandas as pd, numpy as np

df = pd.DataFrame({'x': [1, 2, np.nan, np.nan],
                   'y': [np.nan, np.nan, 4, 8]})

arr = df.values
df['xy'] = arr[~np.isnan(arr)].astype(int)

print(df)

     x    y  xy
0  1.0  NaN   1
1  2.0  NaN   2
2  NaN  4.0   4
3  NaN  8.0   8

- jpp
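
The boolean-mask indexing above relies on every row having exactly one non-NaN value. As a sketch that also copes with rows where both columns are filled (giving y priority, as the question asks), np.where could be used instead. The conflicting third row is an added assumption for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: row 2 has a value in both columns.
df = pd.DataFrame({'x': [1, 2, 3, np.nan],
                   'y': [np.nan, np.nan, 4, 8]})

# Take y where it is present, otherwise fall back to x.
df['xy'] = np.where(df['y'].notna(), df['y'], df['x'])
print(df)
```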

- SuperStew · Accepted Answer

4

If the sample you provided is accurate, this is simple.

df = df.fillna(0)      # needed first if the blanks are NaN; fillna returns a new DataFrame
df['xy'] = df['x'] + df['y']

- SuperStew
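
As a variant of the addition above that leaves x and y untouched, Series.add accepts a fill_value. This is only a sketch on made-up sample data, assuming the blanks are NaN and no row has a value in both columns:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, 2, np.nan, np.nan],
                   'y': [np.nan, np.nan, 4, 8]})

# A value missing on one side is treated as fill_value during the addition,
# so the x and y columns themselves keep their NaN entries.
df['xy'] = df['x'].add(df['y'], fill_value=0)
print(df)
```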

2

Or df.x.combine_first(df.y) - Jon Clements

1

Or that works too. Pandas is like skinning a cat. - SuperStew

Awesome, that works. Looking at the combine_first examples, though: if you want y to take priority (when both have a value), it should be df.y.combine_first(df.x), right? - JesusMonroe

If the blanks are empty strings, your code gives me "TypeError: unsupported operand type(s) for +: 'float' and 'str'". - BENY

@JesusMonroe Yes... put them in order of priority... that was just an example :) - Jon Clements
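
A short sketch of the ordering discussed in these comments, with y taking priority. The last row, where both columns hold a value, is an added assumption to make the priority visible:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: row 3 has conflicting values in x and y.
df = pd.DataFrame({'x': [1, 2, np.nan, 5],
                   'y': [np.nan, np.nan, 4, 8]})

# Values come from y; x only fills the positions where y is NaN.
df['xy'] = df['y'].combine_first(df['x'])
print(df)
```

Swapping the two Series, as in df['x'].combine_first(df['y']), would give x priority instead.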