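I have a data frame with two numeric variables, fatcontent and saltcontent, and two factor variables, cond and spice, which describe the different treatments. Each measurement of the numeric variables was taken in duplicate.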
a <- data.frame(cond = rep(c("uncooked", "fried", "steamed", "baked", "grilled"),
                           each = 2, times = 3),
                spice = rep(c("none", "chilli", "basil"), each = 10),
                fatcontent = c(4, 5, 6828, 7530, 6910, 7132, 5885, 613, 2845, 2867,
                               25, 18, 2385, 33227, 4233, 4023, 953, 1025, 4465, 5016,
                               5, 5, 10235, 12545, 5511, 5111, 596, 585, 4012, 3633),
                saltcontent = c(2, 5, 4733, 5500, 5724, 15885, 14885, 217, 193, 148,
                                6, 4, 26738, 24738, 22738, 23738, 267, 256, 1121, 1558,
                                1, 1, 21738, 20738, 26738, 27738, 195, 202, 129, 131)
)
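Now I would like to normalize the numeric variables within each spice group by the mean of the uncooked condition (normalizing here means dividing by that mean).
For example, for a$spice == "none":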
       cond spice fatcontent saltcontent
1  uncooked  none          4           2
2  uncooked  none          5           5
3     fried  none       6828        4733
4     fried  none       7530        5500
5   steamed  none       6910        5724
6   steamed  none       7132       15885
7     baked  none       5885       14885
8     baked  none        613         217
9   grilled  none       2845         193
10  grilled  none       2867         148
After normalization:
       cond spice   fatcontent  saltcontent
1  uncooked  none    0.8888889    0.5714286
2  uncooked  none    1.1111111    1.4285714
3     fried  none 1517.3333333 1352.2857143
4     fried  none 1673.3333333 1571.4285714
5   steamed  none 1535.5555556 1635.4285714
6   steamed  none 1584.8888889 4538.5714286
7     baked  none 1307.7777778 4252.8571429
8     baked  none  136.2222222   62.0000000
9   grilled  none  632.2222222   55.1428571
10  grilled  none  637.1111111   42.2857143
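My question is: how can I do this for all groups and variables in the data frame? I assume the dplyr package could be used, but I am not sure what the best approach is. Any help is appreciated!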
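For reference, here is a minimal sketch of the kind of thing I have in mind, not necessarily the best approach. It assumes dplyr 1.0.0 or later (for across()), groups by spice, and divides each numeric column by the mean of that group's "uncooked" rows. The name a_norm is just a placeholder I made up for the result; the data frame is repeated so the snippet runs on its own.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, which provides across()

# Same data frame as in the question, repeated so this block is self-contained
a <- data.frame(cond = rep(c("uncooked", "fried", "steamed", "baked", "grilled"),
                           each = 2, times = 3),
                spice = rep(c("none", "chilli", "basil"), each = 10),
                fatcontent = c(4, 5, 6828, 7530, 6910, 7132, 5885, 613, 2845, 2867,
                               25, 18, 2385, 33227, 4233, 4023, 953, 1025, 4465, 5016,
                               5, 5, 10235, 12545, 5511, 5111, 596, 585, 4012, 3633),
                saltcontent = c(2, 5, 4733, 5500, 5724, 15885, 14885, 217, 193, 148,
                                6, 4, 26738, 24738, 22738, 23738, 267, 256, 1121, 1558,
                                1, 1, 21738, 20738, 26738, 27738, 195, 202, 129, 131))

# a_norm is a hypothetical name for the normalized result
a_norm <- a %>%
  group_by(spice) %>%                      # one group per spice
  mutate(across(c(fatcontent, saltcontent),
                # divide every value by the mean of this group's uncooked rows
                ~ .x / mean(.x[cond == "uncooked"]))) %>%
  ungroup()

# Rows for spice == "none", to compare against the table above
print(as.data.frame(filter(a_norm, spice == "none")))
```

If there were more numeric columns, replacing c(fatcontent, saltcontent) with where(is.numeric) should select them all, though I have only tried to reason through the two-column case here.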