I want to get the most frequently occurring value in a particular column of a data frame. Here are my sample data and code.
data("Forbes2000", package = "HSAUR")
head(Forbes2000)
  rank                name        country             category  sales profits  assets marketvalue
1    1           Citigroup  United States              Banking  94.71   17.85 1264.03      255.30
2    2    General Electric  United States        Conglomerates 134.19   15.59  626.93      328.54
3    3 American Intl Group  United States            Insurance  76.66    6.46  647.66      194.87
4    4          ExxonMobil  United States Oil & gas operations 222.88   20.96  166.99      277.02
5    5                  BP United Kingdom Oil & gas operations 232.57   10.27  177.57      173.54
6    6     Bank of America  United States              Banking  49.01   10.81  736.45      117.55
Based on my sample data, I need to return the most frequently repeated category, i.e. Insurance.
subset(subset(Forbes2000,country=="Bermuda")
sort(table(yourdata$category), decreasing=TRUE)[1]. Of course, there are plenty of other ways too! - Justin
names(sort(table(yourdata$category), decreasing=TRUE)[1]). However, Josh makes a good point below: what if you have ties? - Justin
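To illustrate the tie concern, here is a minimal sketch that returns every category sharing the highest count rather than only the first one. The `category` vector is made-up toy data standing in for `yourdata$category`, and `most_frequent` is just an illustrative name; neither comes from the question.

```r
# Hypothetical toy data standing in for yourdata$category;
# "Banking" and "Insurance" are deliberately tied at two occurrences each.
category <- factor(c("Banking", "Insurance", "Banking",
                     "Insurance", "Oil & gas operations"))

# Count how often each category occurs
counts <- table(category)

# Keep every category whose count equals the maximum, so ties are all returned
most_frequent <- names(counts)[counts == max(counts)]
most_frequent
# [1] "Banking"   "Insurance"
```

With `sort(...)[1]` only one of the tied categories would come back, so comparing against `max(counts)` is one way to see all of them. On the real data, the same two lines could be applied to `table(yourdata$category)`.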