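I'm working on a project that performs a number of calculations on lists of data. I've split the lists up and defined some functions that I'm sure are correct, including a mean function and a standard deviation function. The problem is that when I test my lists, I get the correct mean and the correct standard deviation, but the wrong correlation coefficient. Could my math be off? I need to find the correlation coefficient using only the Python standard library.
My code: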
def correlCo(someList1, someList2):
    # First establish the means and standard deviations for both lists.
    xMean = mean(someList1)
    yMean = mean(someList2)
    xStandDev = standDev(someList1)
    yStandDev = standDev(someList2)
    zList1 = []
    zList2 = []
    # Create 2 new lists taking (a[i]-a's Mean)/standard deviation of a
    for x in someList1:
        z1 = ((float(x)-xMean)/xStandDev)
        zList1.append(z1)
    for y in someList2:
        z2 = ((float(y)-yMean)/yStandDev)
        zList2.append(z2)
    # Mapping out the lists to be float values instead of string
    zList1 = list(map(float,zList1))
    zList2 = list(map(float,zList2))
    # Multiplying each value from the lists
    zFinal = [a*b for a,b in zip(zList1,zList2)]
    totalZ = 0
    # Taking the sum of all the products
    for a in zFinal:
        totalZ += a
    # Finally calculating correlation coefficient
    r = (1/(len(someList1) - 1)) * totalZ
    return r
Sample run:
I have the lists [1,2,3,4,4,8] and [3,3,4,5,8,9].
I expect the correct answer r = 0.8848, but I get r = 0.203727 instead.
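As a cross-check on the expected value, here is a minimal sketch that computes Pearson's r directly from the sums of deviations, without going through any standard deviation helper. The function name `pearson_r` is hypothetical and is not part of the code above.

```python
import math


def pearson_r(xs, ys):
    """Pearson r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Deviations of each value from its list's mean.
    dxs = [x - x_mean for x in xs]
    dys = [y - y_mean for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dxs, dys))
    x_sq = sum(dx * dx for dx in dxs)
    y_sq = sum(dy * dy for dy in dys)
    return cross / math.sqrt(x_sq * y_sq)


print(round(pearson_r([1, 2, 3, 4, 4, 8], [3, 3, 4, 5, 8, 9]), 4))  # 0.8848
```

On Python 3.10 or later, `statistics.correlation` from the standard library should give the same value for these two lists.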
Edit: added the mean and standard deviation functions I wrote.
def mean(someList):
    total = 0
    for a in someList:
        total += float(a)
    mean = total/len(someList)
    return mean
def standDev(someList):
    newList = []
    sdTotal = 0
    listMean = mean(someList)
    for a in someList:
        newNum = (float(a) - listMean)**2
        newList.append(newNum)
    for z in newList:
        sdTotal += float(z)
    standardDeviation = sdTotal/(len(newList))
    return standardDeviation
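One possible explanation, offered as a sketch rather than a definitive diagnosis: `standDev` as written divides the sum of squared deviations by n and never takes a square root, so it appears to return the population variance rather than a standard deviation. Since `correlCo` scales the sum of z-score products by 1/(n - 1), the z-scores would need the sample standard deviation (divide by n - 1, then take the square root). Dividing by those variances instead reproduces the 0.203727 reported above. The names `sample_std_dev` and `correlation` below are hypothetical replacements, not part of the original code.

```python
import math


def mean(values):
    """Arithmetic mean of a sequence of numbers (or numeric strings)."""
    return sum(float(v) for v in values) / len(values)


def sample_std_dev(values):
    """Sample standard deviation: sqrt of squared deviations over (n - 1)."""
    m = mean(values)
    squared = sum((float(v) - m) ** 2 for v in values)
    return math.sqrt(squared / (len(values) - 1))


def correlation(xs, ys):
    """Pearson r as the sum of z-score products divided by (n - 1)."""
    x_mean, y_mean = mean(xs), mean(ys)
    x_sd, y_sd = sample_std_dev(xs), sample_std_dev(ys)
    total = sum(
        ((float(x) - x_mean) / x_sd) * ((float(y) - y_mean) / y_sd)
        for x, y in zip(xs, ys)
    )
    return total / (len(xs) - 1)


print(round(correlation([1, 2, 3, 4, 4, 8], [3, 3, 4, 5, 8, 9]), 4))  # 0.8848
```

The two divisors have to agree: pairing a population standard deviation (divide by n) with the 1/(n - 1) factor would give a value that is too large by a factor of n/(n - 1).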