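I am trying to do real-time image classification with caffe and Python. I use OpenCV to stream video from my webcam in one process, and in a separate process I use caffe to classify the frames pulled from the webcam. I then pass the classification result back to the main thread to caption the webcam stream.

The problem is that even though I have an NVIDIA GPU and run the caffe predictions on the GPU, the main thread slows down. Normally, without any predictions, my webcam stream runs at 30 fps; with predictions, it reaches 15 fps at best.

I have verified that caffe is indeed using the GPU when predicting, and that neither my GPU nor my GPU memory is maxed out. I have also verified that my CPU cores are not maxed out at any point while the program runs. I am wondering whether I am doing something wrong, or whether there is no way to keep these two processes truly separate. Any advice is appreciated. Here is my code for reference.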
import multiprocessing

import caffe
import cv2


class Consumer(multiprocessing.Process):
    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue
        # other initialization stuff

    def run(self):
        caffe.set_mode_gpu()
        caffe.set_device(0)
        # Load caffe net -- code omitted
        while True:
            image = self.task_queue.get()
            # crop image -- code omitted
            text = net.predict(image)
            self.result_queue.put(text)


tasks = multiprocessing.Queue()
results = multiprocessing.Queue()
consumer = Consumer(tasks, results)
consumer.start()

# Create the window and start capturing video from the camera
cv2.namedWindow("preview")
vc = cv2.VideoCapture(0)

# Try to get the first frame
if vc.isOpened():
    rval, frame = vc.read()
else:
    rval = False

# Separate buffer holding the frame that is handed to the consumer
frame_copy = frame.copy() if rval else None
task_empty = True

while rval:
    # Keep at most one frame in flight to the consumer
    if task_empty:
        tasks.put(frame_copy)
        task_empty = False
    if not results.empty():
        text = results.get()
        # Add text to frame (position, font, and color are arbitrary)
        cv2.putText(frame, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                    1.0, (255, 255, 255), 2)
        task_empty = True
    # Show the frame with all the applied modifications
    cv2.imshow("preview", frame)
    # Get the next frame from the camera
    rval, frame = vc.read()
    if rval:
        frame_copy[:] = frame
    # Get keyboard input
    key = cv2.waitKey(1)
    # Exit on ESC
    if key == 27:
        break

# The consumer loops forever, so stop it explicitly before exiting
consumer.terminate()
vc.release()
cv2.destroyAllWindows()
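I believe the problem lies in the caffe prediction, because when I comment out the prediction and pass dummy text back and forth between the processes, I get 30 fps again.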
import multiprocessing

import caffe
import cv2


class Consumer(multiprocessing.Process):
    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue
        # other initialization stuff

    def run(self):
        caffe.set_mode_gpu()
        caffe.set_device(0)
        # Load caffe net -- code omitted
        while True:
            image = self.task_queue.get()
            # crop image -- code omitted
            # text = net.predict(image)
            text = "dummy text"  # prediction disabled for this test
            self.result_queue.put(text)


tasks = multiprocessing.Queue()
results = multiprocessing.Queue()
consumer = Consumer(tasks, results)
consumer.start()

# Create the window and start capturing video from the camera
cv2.namedWindow("preview")
vc = cv2.VideoCapture(0)

# Try to get the first frame
if vc.isOpened():
    rval, frame = vc.read()
else:
    rval = False

# Separate buffer holding the frame that is handed to the consumer
frame_copy = frame.copy() if rval else None
task_empty = True

while rval:
    # Keep at most one frame in flight to the consumer
    if task_empty:
        tasks.put(frame_copy)
        task_empty = False
    if not results.empty():
        text = results.get()
        # Add text to frame (position, font, and color are arbitrary)
        cv2.putText(frame, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                    1.0, (255, 255, 255), 2)
        task_empty = True
    # Show the frame with all the applied modifications
    cv2.imshow("preview", frame)
    # Get the next frame from the camera
    rval, frame = vc.read()
    if rval:
        frame_copy[:] = frame
    # Get keyboard input
    key = cv2.waitKey(1)
    # Exit on ESC
    if key == 27:
        break

# The consumer loops forever, so stop it explicitly before exiting
consumer.terminate()
vc.release()
cv2.destroyAllWindows()
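I ran a test with time.sleep(1) and did not see any slowdown in my program. I also ran caffe in CPU_ONLY mode and noticed an even more severe slowdown, although I am not sure why a single-frame prediction would put so much pressure on the CPU. - user3543300

How long does net.predict(image) take? - Imanol Luengo

Have you timed cv2.waitKey(1)? There is some "magic" (event processing) going on in that call (see the documentation), and I have run into strange interactions with it when code only reaches 15 fps. If that is not the problem, you could time the other parts of the loop (for example vc.read()) to narrow down which statement causes the drop to 15 fps. - Ulrich Stern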
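
To act on the timing suggestions in the comments, here is a minimal sketch of a per-section timer for the main loop. It is not from the original post: `SectionTimer` and its methods are hypothetical names, it assumes Python 3 (for `time.perf_counter`), and the `time.sleep` calls merely stand in for `vc.read()` and `cv2.waitKey(1)`. For reference, 30 fps leaves about 33 ms per loop iteration, while 15 fps corresponds to roughly 67 ms.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionTimer(object):
    """Hypothetical helper: accumulates wall-clock time per named section."""

    def __init__(self):
        self.totals = defaultdict(float)  # section name -> total seconds
        self.counts = defaultdict(int)    # section name -> number of passes

    @contextmanager
    def section(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the timed code raises
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self, name):
        """Average duration of one pass through `name`, in milliseconds."""
        return 1000.0 * self.totals[name] / self.counts[name]

    def report(self):
        """List of (name, mean milliseconds), slowest section first."""
        rows = [(name, self.mean_ms(name)) for name in self.totals]
        return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    timer = SectionTimer()
    for _ in range(5):
        with timer.section("read"):
            time.sleep(0.002)  # stand-in for vc.read()
        with timer.section("waitKey"):
            time.sleep(0.010)  # stand-in for cv2.waitKey(1)
    for name, ms in timer.report():
        print("%-8s %.1f ms" % (name, ms))
```

In the real loop, each statement of interest (`tasks.put(frame_copy)`, `cv2.imshow(...)`, `vc.read()`, `cv2.waitKey(1)`) would get its own `with timer.section(...)` block, and the report would be printed after the loop exits. Timing `net.predict(image)` would need a separate timer inside the consumer process, since the two processes do not share memory.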