Question
I need to compute the Pearson and Spearman correlations and use them as metrics in TensorFlow.
For Pearson, it's trivial:
tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)
But for Spearman, I am clueless!
What I tried:
From this answer:
samples = 1
predictions_rank = tf.nn.top_k(y_pred, k=samples, sorted=True, name='prediction_rank').indices
real_rank = tf.nn.top_k(y_true, k=samples, sorted=True, name='real_rank').indices
rank_diffs = predictions_rank - real_rank
rank_diffs_squared_sum = tf.reduce_sum(rank_diffs * rank_diffs)
six = tf.constant(6)
one = tf.constant(1.0)
numerator = tf.cast(six * rank_diffs_squared_sum, dtype=tf.float32)
divider = tf.cast(samples * samples * samples - samples, dtype=tf.float32)
spearman_batch = one - numerator / divider
But this returns NaN.
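One likely contributor to the NaN, sketched below in plain Python rather than TensorFlow: with samples = 1 the denominator n^3 - n is 1 - 1 = 0. If y_pred and y_true each have a single column (an assumption, though it matches the "Had 1" in the error further down), top_k with k=1 returns index 0 for every row, so the rank differences are all 0 and the ratio becomes 0/0. The function name `spearman_from_rank_diffs` is hypothetical, and the zero-division branch only imitates float tensor arithmetic, because plain Python would raise an exception instead.

```python
import math


def spearman_from_rank_diffs(rank_diffs, n):
    """Evaluate 1 - 6 * sum(d^2) / (n^3 - n) for rank differences d.

    Hypothetical helper for illustration only. It mimics IEEE 754 float
    division (0/0 -> NaN, x/0 -> inf), which is an assumption about how
    the float32 tensors in the snippet above behave.
    """
    numerator = 6.0 * sum(d * d for d in rank_diffs)
    denominator = float(n ** 3 - n)
    if denominator == 0.0:
        # Plain Python would raise ZeroDivisionError here.
        ratio = math.nan if numerator == 0.0 else math.inf
    else:
        ratio = numerator / denominator
    return 1.0 - ratio


# samples = 1: the denominator is 1 - 1 = 0 and all rank differences are 0.
print(spearman_from_rank_diffs([0], 1))
# Three samples with identical rankings: the formula is well defined.
print(spearman_from_rank_diffs([0, 0, 0], 3))
```

If this is indeed the cause, n would have to be the number of samples being ranked rather than the constant 1.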
...
Following the definition from Wikipedia:
![enter image description here](https://istack.dev59.com/M8RQM.webp)
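As a reference for the target values, here is a minimal plain-Python sketch, assuming the definition above is the usual one: Spearman's coefficient is the Pearson correlation computed on the ranks of the two variables. The helper names `ranks`, `pearson` and `spearman` are hypothetical, and ties are given average ranks, which is one common convention and not necessarily what a TensorFlow implementation would do.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


print(spearman([0.1, 0.4, 0.35, 0.8], [1, 2, 3, 4]))
```

For the four-element example, the ranks are [1, 3, 2, 4] and [1, 2, 3, 4], so the result should be about 0.8, which agrees with 1 - 6 * 2 / (4^3 - 4).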
I tried:
size = tf.size(y_pred)
indice_of_ranks_pred = tf.nn.top_k(y_pred, k=size)[1]
indice_of_ranks_label = tf.nn.top_k(y_true, k=size)[1]
rank_pred = tf.nn.top_k(-indice_of_ranks_pred, k=size)[1]
rank_label = tf.nn.top_k(-indice_of_ranks_label, k=size)[1]
rank_pred = tf.to_float(rank_pred)
rank_label = tf.to_float(rank_label)
spearman = tf.contrib.metrics.streaming_pearson_correlation(rank_pred, rank_label)
But at runtime I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: input must have at least k columns. Had 1, needed 32
[[{{node metrics/spearman/TopKV2}} = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lambda_1/add, metrics/pearson/pearson_r/variance_predictions/Size)]]