I am working on an exercise that asks me to write a function which finds the digits that occur most often in an array of numbers.
Example input:
x = c(25, 2, 3, 57, 38, 41)
"Returns 2, 3, 5, because the digits 2, 3 and 5 each occur 2 times, which is the most."
my_vector <- c(25, 2, 3, 57, 38, 41)
# function to count how many times each digit occurs
digit_occurrence <- function(vector) {
  # collapse the vector into a single string without separators
  x <- paste(vector, collapse = '')
  # create an empty vector
  digit <- c()
  # loop over each digit 0-9 and store its number of occurrences
  for (i in as.character(0:9)) {
    digit[i] <- lengths(regmatches(x, gregexpr(i, x)))
  }
  digit
}
> digit_occurrence(my_vector)
0 1 2 3 4 5 6 7 8 9
0 1 2 2 1 2 0 1 1 0
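Here is a solution that uses the table() function to build a data frame of digit frequencies (instead of counting in a for loop), then sorts that data frame by frequency and extracts the top three digits directly:
input_vector <- c(25, 2, 3, 57, 38, 41)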
top_digits <- function(my_array, n = 3) {
  # `as.character` converts the numbers to strings,
  # `strsplit` splits each one into individual characters (e.g. "23" into "2" and "3")
  # and `unlist` "flattens" the result into a single character vector
  my_array_splitted <- unlist(strsplit(as.character(my_array), ""))
  # `table` creates a vector of frequencies
  # `as.data.frame` converts the vector into a DF with 2 columns: digits and frequencies
  df_digits <- as.data.frame(table(my_array_splitted))
  # Sorting the DF by frequency
  df_digits <- df_digits[order(df_digits$Freq, decreasing = TRUE), ]
  # Extracting the first `n` elements of the digits column (which is now sorted) and converting back to integer
  # (we need the intermediate step as character because the column is originally a factor,
  # and converting a factor directly to integer is unsafe)
  as.integer(as.character(df_digits$my_array_splitted[1:n]))
}
top_digits(input_vector)
This returns the top three digits of the input. - Francisco Yirá
This approach is similar and also uses table:
count = function(x) {
  # make a table of counts of all the digits
  tab = table(strsplit(paste(x, collapse = ""), ""))
  # return the names of the digits with the highest count
  names(tab)[tab == max(tab)]
}
Since it is Christmas, let's do a fun benchmark:
x = sample(1:1000, 100000, replace = TRUE)
Unit: milliseconds
expr min lq mean median uq max
me(x) 46.63262 52.34020 57.33796 53.87266 58.91561 123.5481
anou(x) 319.14199 351.43877 381.35371 374.78037 408.67354 490.3464
digit_occurrence(x) 149.83663 151.61908 160.47220 156.88108 161.57646 245.5067
top_digits(x) 42.40598 49.92426 55.87991 51.90813 56.61563 109.5608
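The call that produced these timings is not shown. The "Unit: milliseconds" header resembles the output of the microbenchmark package, so the following is only a rough sketch of how a comparable measurement could be set up, assuming that package is installed (it falls back to base system.time() otherwise). count_loop and count_table are hypothetical stand-ins for a loop-based and a table-based approach, not the functions timed above, and the seed is added only for repeatability.

```r
# Hypothetical stand-in: count each digit with one regex pass per digit
count_loop <- function(v) {
  s <- paste(v, collapse = "")
  vapply(as.character(0:9), function(d) {
    lengths(regmatches(s, gregexpr(d, s, fixed = TRUE)))
  }, integer(1))
}

# Hypothetical stand-in: split into single characters and tabulate
count_table <- function(v) {
  table(factor(unlist(strsplit(as.character(v), "")), levels = 0:9))
}

set.seed(1)  # assumption: not part of the original benchmark
x <- sample(1:1000, 100000, replace = TRUE)

# both approaches should agree before they are timed
stopifnot(all(count_loop(x) == as.vector(count_table(x))))

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(count_loop(x), count_table(x), times = 10))
} else {
  print(system.time(count_loop(x)))
  print(system.time(count_table(x)))
}
```

Timings depend on the machine and R version, so the numbers will not match the table above.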
fn <- function(x) {
  # First we separate every single digit in each element, but we need to turn
  # each element into a character string beforehand. We then use do.call
  # to apply the c function to every element of the resulting list, which
  # flattens the list into a vector
  digits <- do.call(c, sapply(x, function(y) strsplit(as.character(y), "")))
  # In the end we calculate the frequencies and sort them in decreasing order
  most_freq <- sort(table(digits), decreasing = TRUE)
  most_freq
}
fn(x)
digits_num
2 3 5 1 4 7 8
2 2 2 1 1 1 1
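Another approach is to convert your numeric vector to a character vector, split each string into single characters, and then build a frequency table: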
t <- table(unlist(strsplit(as.character(x), "")))
# 1 2 3 4 5 7 8
# 1 2 2 1 2 1 1
# keep the digits whose count equals the maximum count
as.integer(names(t[t == max(t)]))
# 2 3 5
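One detail worth checking in this kind of filter: which.max() returns the position of the first maximum, whereas max() returns the maximum count itself. The two happen to coincide for the example vector (position 2, count 2), but not in general. A minimal sketch with a made-up input, where counts is an illustrative name:

```r
x <- c(11, 1, 2)  # digits: 1, 1, 1, 2
counts <- table(unlist(strsplit(as.character(x), "")))

which.max(counts)  # position of the first maximum (1), not a count
max(counts)        # the highest count itself (3)

# comparing against the position selects the wrong digit here
as.integer(names(counts[counts == which.max(counts)]))  # 2

# comparing against the maximum count selects the expected digit
as.integer(names(counts[counts == max(counts)]))        # 1
```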
If you want the digits that occur most often in my_vector, simply replace the digit object after the for loop with names(digit[digit == max(digit)]). - Dion Groothof
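Applied to the loop-based function from the question, that change could look like the sketch below. The name most_frequent_digits is hypothetical, and the conversion to integer and the fixed = TRUE argument are small additions that are not part of the comment.

```r
# Sketch: the question's loop, returning only the most frequent digits
most_frequent_digits <- function(vector) {
  # collapse the vector into a single string of digits
  x <- paste(vector, collapse = "")
  digit <- integer(0)
  # count the occurrences of each digit 0-9
  for (i in as.character(0:9)) {
    digit[i] <- lengths(regmatches(x, gregexpr(i, x, fixed = TRUE)))
  }
  # keep every digit whose count equals the maximum, so ties are all returned
  as.integer(names(digit[digit == max(digit)]))
}

most_frequent_digits(c(25, 2, 3, 57, 38, 41))
# 2 3 5
```

Because the comparison is against max(digit), the result has as many elements as there are tied digits rather than a fixed top three.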