How do I scale all columns except the last one?

Question

How do I scale all columns except the last one?

python pandas scikit-learn

4

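I am using Python 3.7.6.

I am working on a classification problem.

I want to scale the feature columns of my dataframe (df). The dataframe has 56 columns: 55 feature columns, plus the target as the last column.

Only the feature columns should be scaled; the target must stay as it is.

This is what I do: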

y = df.iloc[:,-1]
target_name = df.columns[-1]
from FeatureScaling import feature_scaling
df = feature_scaling.scale(df.iloc[:,0:-1], standardize=False)
df[target_name] = y

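But this seems inefficient, because I have to rebuild the dataframe (adding the target column back to the scaled result).

Is there an efficient way to scale only some columns and leave the others unchanged? (That is, the result of scale would contain the scaled columns plus the one unscaled column.)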

- user3668129

2 Answers

0

You can slice out the columns you need and assign the scaled result back to the same slice:

df.iloc[:, :-1] = feature_scaling.scale(df.iloc[:, :-1], standardize=False)

- Bruno Mello

It doesn't work. It rescaled all the columns (including the last one). - user3668129
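For comparison, here is a minimal sketch of the same slice assignment with a scaler that returns only the columns it was given. `feature_scaling` is the asker's own module, so scikit-learn's `MinMaxScaler` is swapped in as an assumption about what `standardize=False` does; the toy dataframe and its column names are made up for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: two float feature columns, target in the last column.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0],
        "b": [10.0, 20.0, 40.0],
        "target": [0, 1, 0],
    }
)

scaler = MinMaxScaler()  # assumption: stands in for feature_scaling.scale(..., standardize=False)

# fit_transform receives every column except the last and returns an array
# of the same shape, which is written back into the same positions.
df.iloc[:, :-1] = scaler.fit_transform(df.iloc[:, :-1])

print(df)
```

With this setup the target column is never passed to the scaler, so it keeps its original values.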

- Bishwarup Bhattacharjee · Accepted Answer

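Using column indices for scaling or any other preprocessing step is not a good idea, because the code breaks every time you add a new feature. Use column names instead. For example:

Using scikit-learn: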

import pandas as pd
from sklearn.preprocessing import StandardScaler
features = [<features to standardize>]  # placeholder: the names of the columns to scale
scaler = StandardScaler()
# fit_transform returns a 2D numpy.array; wrap it in a pd.DataFrame and
# reuse df's index so that the concat below lines the rows up correctly
standardized_features = pd.DataFrame(scaler.fit_transform(df[features]), columns=features, index=df.index)
old_shape = df.shape
# drop the unscaled features from the dataframe
df.drop(features, axis=1, inplace=True)
# join the standardized features back
df = pd.concat([df, standardized_features], axis=1)
assert old_shape == df.shape, "something went wrong!"
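The drop-and-concat steps can also be collapsed into one assignment by column name. The sketch below assumes a made-up dataframe and a hypothetical target column called `label`; the feature list is derived from the column names, so adding a feature later does not require touching the scaling code.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data with a non-default index.
df = pd.DataFrame(
    {
        "height": [150.0, 160.0, 170.0, 180.0],
        "weight": [50.0, 60.0, 70.0, 80.0],
        "label": [0, 1, 0, 1],
    },
    index=[10, 11, 12, 13],
)

target_name = "label"  # assumed name of the target column
features = [col for col in df.columns if col != target_name]

scaler = StandardScaler()
# Assigning the 2D array to df[features] overwrites only those columns;
# the index, the column order and the target column are left as they were.
df[features] = scaler.fit_transform(df[features])

print(df)
```

Because nothing is dropped or re-joined, the dataframe's shape and column order cannot change in this variant.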

If you would rather not split and rejoin the data, you can use a function like the following to process the columns in place.

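import numpy as np
def normalize(x):
    # x is a whole column (a pandas Series), not a single value
    std = np.std(x)
    if std == 0:
        raise ValueError('Constant column')
    return (x - np.mean(x)) / std

for col in features:
    # pass the column itself; Series.map would call normalize on each
    # element separately, where the std of a single value is always 0
    df[col] = normalize(df[col])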
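The per-column loop can also be written as a single vectorized pandas expression. This is only a sketch: `standardize_columns` is a hypothetical helper name, the toy data is made up, and it uses the population standard deviation (`ddof=0`), which is what `np.std` and `StandardScaler` use.

```python
import pandas as pd


def standardize_columns(df, cols):
    """Return a copy of df with the columns in `cols` standardized."""
    means = df[cols].mean()
    stds = df[cols].std(ddof=0)  # population std, one value per column

    # Refuse constant columns, mirroring the check in normalize().
    constant = stds[stds == 0].index.tolist()
    if constant:
        raise ValueError(f"Constant column(s): {constant}")

    out = df.copy()
    # DataFrame-minus-Series and DataFrame-over-Series align on column names.
    out[cols] = (df[cols] - means) / stds
    return out


# Hypothetical toy data: two features and a target.
toy = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0], "target": [1, 0, 1]})
scaled = standardize_columns(toy, ["x", "y"])
print(scaled)
```

Returning a copy keeps the original dataframe available for comparison; assign the result back to `df` if the unscaled values are no longer needed.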