Python 3: Most common color in an image with kmeans, data type mismatch

4
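I'm trying to adapt some Python 2.4 code to Python 3.5. I'm using the top answer from this thread: Python - Find dominant/most common color in an image, but it's giving me trouble. The author of this question also ran into trouble, though of a different kind: Error with hex encode in Python 3.3. Specifically, mine seems to be related to scipy and the variable types that kmeans accepts? The code and traceback are below. Many thanks for your help! -S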
import struct
from PIL import Image
import scipy
import scipy.misc
import scipy.cluster
import numpy as np

NUM_CLUSTERS = 3

print('reading image')
im = Image.open('image.jpg')
im = im.resize((150, 150))      # optional, to reduce time
ar = scipy.misc.fromimage(im)
shape = ar.shape
ar = ar.reshape(scipy.product(shape[:2]), shape[2])

print ('finding clusters')
print(ar)
print("Variable type:", type(ar))
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
print('cluster centres:\n', codes)

vecs, dist = scipy.cluster.vq.vq(ar, codes)         # assign codes
counts, bins = scipy.histogram(vecs, len(codes))    # count occurrences
index_max = scipy.argmax(counts)                    # find most frequent
peak = codes[index_max]

colour = ''.join(format(c, '02x') for c in peak).encode('hex_codec')
print ('most frequent is %s (#%s)' % (peak, colour))

And the traceback is:

=========== RESTART: /Users/splash/Dropbox/PY/image-dom-color.py ============
reading image
finding clusters
[[255 255 255]
 [255 255 255]
 [255 255 255]
 ..., 
 [255 255 255]
 [255 255 255]
 [255 255 255]]
Variable type: <class 'numpy.ndarray'>
Traceback (most recent call last):
  File "/Users/splash/Dropbox/PY/image-dom-color.py", line 20, in <module>
    codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/scipy/cluster/vq.py", line 568, in kmeans
    book, dist = _kmeans(obs, guess, thresh=thresh)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/scipy/cluster/vq.py", line 436, in _kmeans
    code_book, has_members = _vq.update_cluster_means(obs, obs_code, nc)
  File "scipy/cluster/_vq.pyx", line 347, in scipy.cluster._vq.update_cluster_means (scipy/cluster/_vq.c:4695)
TypeError: type other than float or double not supported
>>> 
1 Answer

8
The stack trace tells us that scipy.cluster._vq.update_cluster_means() only supports the float and double data types. The scipy source code confirms this:
def update_cluster_means(np.ndarray obs, np.ndarray labels, int nc):
    """
    The update-step of K-means. Calculate the mean of observations in each
    cluster.
    Parameters
    ----------
    obs : ndarray
        The observation matrix. Each row is an observation. Its dtype must be
        float32 or float64.
    ...

Source: _vq.pyx on GitHub

To fix the problem, first convert the input to a supported data type with numpy.ndarray.astype():

codes, dist = scipy.cluster.vq.kmeans(ar.astype(float), NUM_CLUSTERS)
# Or:       = scipy.cluster.vq.kmeans(ar.astype('double'), NUM_CLUSTERS)
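
The conversion can be tried in isolation with a small sketch. It assumes a synthetic uint8 pixel array in place of the reshaped image, and it passes explicit starting centroids to kmeans (the `initial` array is made up for this example) so that the result is reproducible. It also uses numpy.bincount where the question used scipy.histogram.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Synthetic stand-in for the reshaped image (assumption, not the asker's data):
# 300 white pixels and 100 reddish pixels, stored as uint8 like PIL image data.
pixels = np.array([[255, 255, 255]] * 300 + [[200, 30, 30]] * 100, dtype=np.uint8)

# kmeans only accepts float32/float64 observations, so convert first.
obs = pixels.astype(np.float64)

# Explicit starting centroids (illustrative values) instead of a random
# choice of NUM_CLUSTERS points, to keep the output deterministic.
initial = np.array([[250.0, 250.0, 250.0], [190.0, 40.0, 40.0]])
codes, distortion = kmeans(obs, initial)

# Assign every pixel to its nearest centroid and count cluster sizes.
labels, _ = vq(obs, codes)
counts = np.bincount(labels, minlength=len(codes))

# The centroid of the largest cluster is the dominant colour (as floats).
peak = codes[np.argmax(counts)]
print('cluster centres:\n', codes)
print('most frequent centroid:', peak)
```

Note that the centroids come back as floats, so they need converting to integers before being formatted as hex.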

1
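Thanks Johan - that solved that problem for me :) Now working through the next bug. I had tried several ways of converting the variable to a number, but none of them worked. Thanks a lot for the tip! - Plashkes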
1
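If the next error is about printing the hex value of the colour, the final encode('hex_codec') does not seem to be needed. You also need to convert the result back to integers: `peak = peak.astype(int)` `colour = ''.join(format(c, '02x') for c in peak)` `print ('most frequent is %s (#%s)' % (peak, colour))` - elmis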
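
The hex formatting step can be sketched on its own in plain Python. The helper name `rgb_to_hex` is hypothetical, and it differs from the comment in one way: it rounds each channel to the nearest integer and clamps it to 0-255, whereas `astype(int)` truncates.

```python
def rgb_to_hex(peak):
    """Format an (R, G, B) centroid, possibly made of floats, as a hex string."""
    channels = []
    for c in peak:
        # The 'x' format code rejects floats, so convert to int first.
        value = int(round(float(c)))
        # Clamp to the valid 8-bit range in case of rounding overshoot.
        value = max(0, min(255, value))
        channels.append(format(value, '02x'))
    return ''.join(channels)


# Example centroid with float channels, as kmeans would return (made-up values).
colour = rgb_to_hex([254.6, 128.2, 0.0])
print('most frequent is #%s' % colour)  # prints: most frequent is #ff8000
```

The helper accepts any iterable of numbers, so a numpy centroid such as `peak` can be passed in directly.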
