I want to reorder the columns of a DataFrame so that a few chosen columns come first and all the other columns follow.
With R's dplyr, the code looks like this:
library(dplyr)
df = tibble(col1 = c("a", "b", "c"),
            id = c(1, 2, 3),
            col2 = c(2, 4, 6),
            date = c("1 Feb", "2 Feb", "3 Feb"))
df2 = select(df,
             id, date, everything())
Simple enough. With Python's pandas, I tried the following:
import pandas as pd
df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"]
})
# using sets
cols = df.columns.tolist()
cols_1st = {"id", "date"}
cols = set(cols) - cols_1st
cols = list(cols_1st) + list(cols)
# wrong column order
df2 = df[cols]
# using lists
cols = df.columns.tolist()
cols_1st = ["id", "date"]
cols = [c for c in cols if c not in cols_1st]
cols = cols_1st + cols
# right column order, but is there a better way?
df3 = df[cols]
The pandas way is more tedious, but I'm still fairly new to it. Is there a better way?
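Change cols_1st = ["id", "date"] to cols_1st = [value for value in ["id", "date", "non existing column without exception"] if value in df.columns.tolist()]. Names that are not in the DataFrame are then skipped instead of raising an exception. Useful for data frames with a lot of columns. - phili_b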
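To make the suggestion in this comment concrete, here is a minimal sketch that wraps the list-based approach from the question together with the existence check. The helper name move_to_front and the column name "no_such_column" are made up for illustration and are not part of pandas.

```python
import pandas as pd


def move_to_front(df, first):
    """Return df with the columns named in `first` moved to the front.

    Hypothetical helper (not a pandas API): names in `first` that are
    not columns of df are silently skipped, as the comment suggests.
    """
    # Keep only the requested names that actually exist, in the requested order.
    front = [c for c in first if c in df.columns]
    # Every other column follows, in its original order.
    rest = [c for c in df.columns if c not in front]
    return df[front + rest]


df = pd.DataFrame({
    "col1": ["a", "b", "c"],
    "id": [1, 2, 3],
    "col2": [2, 4, 6],
    "date": ["1 Feb", "2 Feb", "3 Feb"],
})

# "no_such_column" is deliberately absent from df and is ignored.
df4 = move_to_front(df, ["id", "date", "no_such_column"])
print(df4.columns.tolist())  # ['id', 'date', 'col1', 'col2']
```

Lists are used rather than sets because sets do not preserve order, which is why the set-based attempt in the question produced the wrong column order.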