In base R, you can use ave to compute a binary vector:
Democrat$winner <- ave(Democrat$fraction_votes, Democrat$fips, FUN=function(i) i == max(i))
which returns
Democrat
state state_abbreviation county fips party candidate votes fraction_votes winner
1 Alabama AL Autauga 1001 Democrat Bernie 544 0.182 0
2 Alabama AL Autauga 1001 Democrat Hillary 2387 0.800 1
3 Alabama AL Baldwin 1003 Democrat Bernie 2694 0.329 0
4 Alabama AL Baldwin 1003 Democrat Hillary 5290 0.647 1
5 Alabama AL Barbour 1005 Democrat Bernie 222 0.078 0
6 Alabama AL Barbour 1005 Democrat Hillary 2567 0.906 1
If needed, this can be converted to a logical vector by wrapping ave in as.logical.
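A minimal sketch of that conversion follows. It assumes a small data frame named demo (a hypothetical name, used here so the full Democrat object is left untouched) that holds only the fips, candidate and fraction_votes values from the example data. Because ave writes the result back into a numeric vector, the comparison is stored as 0/1, and as.logical turns it into TRUE/FALSE:

```r
# Hypothetical reduced copy of the example data; values are taken from the
# data section of this answer, the name `demo` is only for illustration.
demo <- data.frame(
  fips = c(1001L, 1001L, 1003L, 1003L, 1005L, 1005L),
  candidate = c("Bernie", "Hillary", "Bernie", "Hillary", "Bernie", "Hillary"),
  fraction_votes = c(0.182, 0.8, 0.329, 0.647, 0.078, 0.906)
)

# Within each fips group, flag the row(s) holding the maximum vote share,
# then convert the numeric 0/1 result of ave() to logical.
demo$winner <- as.logical(
  ave(demo$fraction_votes, demo$fips, FUN = function(i) i == max(i))
)
```

Note that if two candidates tie for the maximum within a county, both rows are flagged.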
In data.table, this is also very straightforward. Assuming fips is a unique state-county ID:
library(data.table)
setDT(Democrat)
Democrat[, winner := fraction_votes == max(fraction_votes), by=fips]
which returns
Democrat
state state_abbreviation county fips party candidate votes fraction_votes winner
1: Alabama AL Autauga 1001 Democrat Bernie 544 0.182 FALSE
2: Alabama AL Autauga 1001 Democrat Hillary 2387 0.800 TRUE
3: Alabama AL Baldwin 1003 Democrat Bernie 2694 0.329 FALSE
4: Alabama AL Baldwin 1003 Democrat Hillary 5290 0.647 TRUE
5: Alabama AL Barbour 1005 Democrat Bernie 222 0.078 FALSE
6: Alabama AL Barbour 1005 Democrat Hillary 2567 0.906 TRUE
data
Democrat <-
structure(list(state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Alabama", class = "factor"),
state_abbreviation = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "AL", class = "factor"),
county = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("Autauga",
"Baldwin", "Barbour"), class = "factor"), fips = c(1001L,
1001L, 1003L, 1003L, 1005L, 1005L), party = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "Democrat", class = "factor"),
candidate = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Bernie",
"Hillary"), class = "factor"), votes = c(544L, 2387L, 2694L,
5290L, 222L, 2567L), fraction_votes = c(0.182, 0.8, 0.329,
0.647, 0.078, 0.906)), .Names = c("state", "state_abbreviation",
"county", "fips", "party", "candidate", "votes", "fraction_votes"
), row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")