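I have some R code that I am now converting to C++. It reads a file, parses the characters, computes a large number of means and standard deviations, and returns them along with the number of times each character occurs.
Right now, R and C++ produce results that differ slightly in the decimal places. The count matrices are integers, so those numbers are identical. In the mean matrices, however, the values agree only to two decimal places and differ beyond that. For the standard deviation matrices the discrepancy is larger: the values agree only to one decimal place.
What causes this? I suspect that R and C++ handle numbers with fractional parts at different precision. I know computers are inherently bad at representing floating-point numbers, but how do I decide which output is better?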
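One way to judge which output is closer to the truth is to recompute a few cells with a method that is known to be numerically stable and compare both programs against it. The sketch below uses Welford's one-pass algorithm with double accumulators. It is not the code from either program, and `RunningStats` is a hypothetical name chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical reference accumulator (Welford's algorithm), not the original code.
struct RunningStats {
    std::size_t n = 0;   // number of values seen so far
    double mean = 0.0;   // running mean
    double m2 = 0.0;     // running sum of squared deviations from the mean

    void add(double x) {
        ++n;
        const double delta = x - mean;
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);  // uses the updated mean
    }

    // Sample standard deviation (n - 1 denominator), which is what R's sd() computes.
    double sample_sd() const {
        return n > 1 ? std::sqrt(m2 / static_cast<double>(n - 1)) : 0.0;
    }
};
```

Feeding the same values into `add` that go into one cell of the matrix gives a reference mean and standard deviation. If the C++ code divides by n where R's `sd` divides by n - 1, that alone would make the standard deviations disagree, so the denominator is worth checking too.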
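... I tried computing sqrt(41111.5/4522) in R, in C++, and in the Windows 7 calculator. All three gave the same result. So why do they differ when they hit exactly the same calculation at run time? In the run-time output, C++ agrees with the calculator and R does not. I also noticed that, over this large batch of calculations, the later outputs drift a little further apart. Is R getting tired from doing so many calculations and starting to make mistakes? What is going on?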
Here is the output for the means:
C++:
38.6068 39.0122 38.633 38.5914 0
38.6159 38.7874 38.5053 38.7195 0
38.5205 38.7352 38.3694 38.5388 0
38.6331 38.7408 38.4588 38.5283 0
38.7503 38.6933 38.4173 38.6808 0
38.7637 38.7978 38.4967 38.603 0
38.7616 38.7384 38.4728 38.6946 0
38.6227 38.7689 38.4016 38.5352 0
38.5993 38.7334 38.3206 38.5514 0
38.6395 38.6598 38.43 38.4887 0
38.6414 38.746 38.4353 38.4908 0
38.4353 38.6767 38.3158 38.4694 0
38.35 38.5801 38.1486 38.3528 0
38.4122 38.6267 38.1731 38.3447 0
38.3751 38.5353 38.1782 38.2229 0
38.3373 38.6117 37.8952 38.2017 4.12443
38.332 38.4991 38.027 38.1984 0
38.2005 38.4417 38.0192 38.0446 4.12443
38.1719 38.4435 37.9727 38.0385 0
38.1346 38.3878 37.8634 37.9746 0
37.8505 38.2289 37.6202 37.6986 0
38.0932 38.142 37.7865 37.815 4.12443
37.9176 38.1381 37.5577 37.7273 0
37.7346 38.0934 37.4874 37.6546 0
37.6961 37.897 37.3342 37.4844 0
37.5534 37.9234 37.3341 37.3369 0
37.4914 37.7409 37.094 37.3211 0
37.2179 37.6653 36.9031 37.2592 0
37.0682 37.5625 36.6972 37.0218 4.12443
36.9713 37.4819 36.5387 36.8767 4.12443
36.8284 37.2411 36.223 36.6869 4.12443
36.7396 36.9682 36.0171 36.4556 4.12443
36.7874 36.9482 36.1641 36.5667 4.12443
36.695 36.9307 36.1856 36.3638 0
36.7224 36.9455 36.2212 36.695 4.12443
36.8983 37.1286 36.2652 36.8055 0
36.7835 36.8905 35.9562 36.4745 0
36.5364 36.9037 36.0927 36.4888 0
36.3959 36.6637 35.7378 36.323 0
35.9372 36.2034 35.452 35.6974 0
R:
A C G T N
[1,] 38.60573 39.01141 38.63195 38.59036 0
[2,] 38.61464 38.78523 38.50391 38.71826 0
[3,] 38.51908 38.73228 38.36774 38.53731 0
[4,] 38.63182 38.73834 38.45730 38.52657 0
[5,] 38.74903 38.69083 38.41585 38.67933 0
[6,] 38.76250 38.79534 38.49556 38.60156 0
[7,] 38.76039 38.73632 38.47145 38.69319 0
[8,] 38.62123 38.76703 38.40030 38.53354 0
[9,] 38.59810 38.73163 38.31917 38.55015 0
[10,] 38.63819 38.65792 38.42873 38.48740 0
[11,] 38.64002 38.74333 38.43387 38.48920 0
[12,] 38.43359 38.67401 38.31414 38.46783 0
[13,] 38.34827 38.57804 38.14686 38.35125 0
[14,] 38.41038 38.62463 38.17138 38.34302 0
[15,] 38.37329 38.53267 38.17653 38.22097 0
[16,] 38.33555 38.60949 37.89278 38.19956 4
[17,] 38.33024 38.49720 38.02496 38.19627 0
[18,] 38.19842 38.43880 38.01730 38.04205 4
[19,] 38.16998 38.44113 37.97058 38.03598 0
[20,] 38.13242 38.38488 37.86108 37.97245 0
[21,] 37.84771 38.22579 37.61745 37.69546 0
[22,] 38.09113 38.13806 37.78409 37.81250 4
[23,] 37.91487 38.13428 37.55473 37.72422 0
[24,] 37.73137 38.09007 37.48473 37.65181 0
[25,] 37.69295 37.89276 37.33098 37.48131 0
[26,] 37.54974 37.91984 37.33063 37.33263 0
[27,] 37.48773 37.73676 37.09027 37.31701 0
[28,] 37.21365 37.66051 36.89896 37.25519 0
[29,] 37.06418 37.55768 36.69254 37.01714 4
[30,] 36.96674 37.47745 36.53390 36.87150 4
[31,] 36.82324 37.23622 36.21721 36.68085 4
[32,] 36.73433 36.96207 36.01076 36.44930 4
[33,] 36.78201 36.94274 36.15842 36.56135 4
[34,] 36.68991 36.92524 36.17984 36.35769 0
[35,] 36.71720 36.94031 36.21548 36.68985 4
[36,] 36.89332 37.12322 36.25921 36.80057 0
[37,] 36.77870 36.88471 35.94958 36.46900 0
[38,] 36.53080 36.89801 36.08650 36.48348 0
[39,] 36.38996 36.65730 35.73058 36.31767 0
[40,] 35.93152 36.19707 35.44496 35.69141 0
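... float and double, where double is more precise. I know little about the details of R, but it almost certainly uses one of those types (possibly both). If your C++ code relies heavily on float, R is probably the more accurate one, and you can change that by using double more. The reverse holds if the C++ code uses double. - David Thornley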
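To see how much the accumulator type matters, the same calculation can be run once with float and once with double. This is only a sketch under assumptions: `mean_sd` and `Stats` are hypothetical names, and the one-pass sum-of-squares formula is a guess at how such code is often written, not the asker's actual implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration.
struct Stats {
    double mean;
    double sd;
};

// Mean and sample standard deviation of integer scores, accumulated in type Acc
// (float or double) so the two precisions can be compared directly.
template <typename Acc>
Stats mean_sd(const std::vector<int>& scores) {
    if (scores.size() < 2) {
        // Too few values for a sample standard deviation.
        return {scores.empty() ? 0.0 : static_cast<double>(scores[0]), 0.0};
    }
    Acc sum = 0;
    Acc sum_sq = 0;
    for (int s : scores) {
        const Acc x = static_cast<Acc>(s);
        sum += x;
        sum_sq += x * x;
    }
    const Acc n = static_cast<Acc>(scores.size());
    const Acc mean = sum / n;
    // One-pass formula: (sum of squares - n * mean^2) / (n - 1).
    Acc var = (sum_sq - n * mean * mean) / (n - 1);
    if (var < 0) {
        var = 0;  // rounding can push a tiny variance below zero
    }
    return {static_cast<double>(mean), std::sqrt(static_cast<double>(var))};
}
```

A float has a 24-bit significand, so once a running total reaches 16777216 (2^24), adding 1 to it no longer changes it. For example, `mean_sd<float>({16777216, 1, 1, 1, 1})` returns a mean below the 3355444 that `mean_sd<double>` returns for the same input. If switching the accumulators in the real code to double moves the C++ numbers toward R's, float rounding is the likely cause. If nothing changes, the difference probably lies elsewhere in the logic.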