I have a multi-threaded program that I run in the following environment:
- Windows 10 PRO x64
- Python 3.8.2 (x64) (Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32)
- Also tried with Python 3.8.0, with the same error
- VS Code (x64) 1.43.0
- ms-python extension for VS Code (ms-python.python-2020.2.64397)
I get this error:
Could not connect to 127.0.0.1: 63323
Could not connect to 127.0.0.1: 63323
Traceback (most recent call last):
Traceback (most recent call last):
File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 514, in start_client
s.connect((host, port))
ConnectionRefusedError: [WinError 10061] No connection could be established because the target computer actively rejected it
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 514, in start_client
s.connect((host, port))
ConnectionRefusedError: [WinError 10061] No connection could be established because the target computer actively rejected it
Traceback (most recent call last):
File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2536, in settrace
File "<string>", line 1, in <module>
File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2536, in settrace
Could not connect to 127.0.0.1: 63323
_locked_settrace(
_locked_settrace( File "c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\_vendored\pydevd\pydevd.py", line 2610, in _locked_settrace
Could not connect to 127.0.0.1: 63323
In this application I use:
import functions as fnc
from multiprocessing import freeze_support
and in the functions.py file:
import sys
import csv
import time
import datetime
import argparse
import itertools as it
from os import system, name
from enum import Enum, unique
from tqdm import tqdm
from math import ceil
from multiprocessing import Pool, cpu_count
import codecs
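The program runs fine on another computer with Python 3.8.0 installed, and I do not understand this error.
The program only compares two files and shows the differences; it does not use any connection to another server or to the Internet.
The only difference is that I am now using an Intel i9-9900 (8 cores / 16 threads), whereas the second computer only has a 4-core i5-7500.
EDIT
When I set the number of cores from 16 to 8, the program runs without problems. My processor has 8 physical cores and 16 logical cores, and I use cpu_count() to check the number of CPUs, like this: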
threads = 8 #cpu_count()
p = Pool(threads)
Where is the problem?
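For reference, the manual override above can be written as a small helper instead of editing the constant by hand. This is only a sketch: `pick_worker_count` and its `cap` parameter are hypothetical names, not part of the program, and it assumes that `cpu_count()` reports logical CPUs (16 on the i9-9900 described here).

```python
from multiprocessing import cpu_count


def pick_worker_count(cap=None):
    """Return the pool size: the logical CPU count, optionally capped."""
    logical = cpu_count()  # logical CPUs, not physical cores
    if cap is None:
        return logical
    # Never go below one worker, never above the cap or the CPU count.
    return max(1, min(logical, cap))


if __name__ == '__main__':
    print(pick_worker_count())   # all logical CPUs
    print(pick_worker_count(8))  # at most 8 workers
```

With this helper, `Pool(pick_worker_count(8))` would behave like the hard-coded `threads = 8` on a 16-thread machine and still use fewer workers on smaller machines.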
EDIT - 09.03.2020 - Source code
main.py
import functions as fnc
from multiprocessing import freeze_support

# Run main program
if __name__ == '__main__':
    freeze_support()
    fnc.main()
functions.py
import sys
import csv
import time
import datetime
import argparse
import itertools as it
from os import system, name
from enum import Enum, unique
from tqdm import tqdm
from math import ceil
from multiprocessing import Pool, cpu_count
import codecs
# ENUM
@unique
class Type(Enum):
    TT = 1
# CLASS
class TextFormat:
    PURPLE = '\033[95m'
    CYAN = '\033[96m'
    DARKCYAN = '\033[36m'
    BLUE = '\033[94m'
    GREEN = '\033[92m'
    YELLOW = '\033[93m'
    RED = '\033[91m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'
class CrossReference:
    def __init__(self, pn, comp, comp_name, type, diff):
        self.pn = pn
        self.comp = comp
        self.comp_name = comp_name
        self.type = type
        self.diff = diff

    def __str__(self):
        return f'{self.pn} {get_red("|")} {self.comp} {get_red("|")} {self.comp_name} {get_red("|")} {self.type} {get_red("|")} {self.diff}\n'

    def __repr__(self):
        return str(self)

    def getFullRow(self):
        return self.pn + ';' + self.comp + ';' + self.comp_name + ';' + self.type + ';' + self.diff + '\n'
class CrossDuplication:
    def __init__(self, pn, comp, cnt):
        self.pn = pn
        self.comp = comp
        self.cnt = cnt

    def __str__(self):
        return f'{self.pn};{self.comp};{self.cnt}\n'

    def __repr__(self):
        return str(self)

    def __hash__(self):
        return hash(('pn', self.pn,
                     'competitor', self.comp))

    def __eq__(self, other):
        return self.pn == other.pn and self.comp == other.comp
# FUNCTIONS
def get_formated_time(mili):
    sec = mili / 1000.0
    return str(datetime.timedelta(seconds = sec))

def get_green(text): # return green text
    return(TextFormat.GREEN + str(text) + TextFormat.END)

def get_red(text): # return red text
    return(TextFormat.RED + str(text) + TextFormat.END)

def get_yellow(text): # return yellow text
    return(TextFormat.YELLOW + str(text) + TextFormat.END)

def get_blue(text): # return blue text
    return(TextFormat.BLUE + str(text) + TextFormat.END)

def get_bold(text): # return bold text format
    return(TextFormat.BOLD + str(text) + TextFormat.END)

def print_info(text): # print info text format
    print("=== " + str(text) + " ===")
# ### LOADER ### Load Cross Reference file
def CSVCrossLoader(file_url, type):
    try:
        print(get_yellow("============ LOAD CROSS CSV DATA ==========="))
        print_info(get_green(f"Try to load data from {file_url}"))
        destination = []
        with open(file_url, encoding="utf-8-sig") as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            line_count = 0
            for row in csv_reader:
                if row[0].startswith('*'):
                    continue
                if Type[row[3]] is not type:
                    continue
                cr = CrossReference(row[0], row[1], row[2], row[3], row[4])
                destination.append(cr)
                line_count += 1
            filename = file_url.rsplit('\\', 1)
            print(
                f'Processed {get_red(line_count)} lines for {get_red(type.name)} from {filename[1]}')
        print_info(get_green(f"Data was loaded successfully"))
        return destination
    except Exception as e:
        print(e)
        print_info(get_red(f"File {file_url} could not be loaded"))
        print_info(get_red("Program End"))
        exit(0)
# ### LOADER ### Load Catalog with PN details (load only first row)
def CSVCatalogLoader(file_url):
    try:
        print(get_yellow("=========== LOAD CATALOG CSV DATA =========="))
        print_info(get_green(f"Try to load data from {file_url}"))
        destination = []
        with open(file_url, encoding="utf-8-sig") as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=';')
            line_count = 0
            for row in csv_reader:
                if row[0].startswith('*'):
                    continue
                destination.append(row[0])
                line_count += 1
            filename = file_url.rsplit('\\', 1)
            print(f'Processed {get_red(line_count)} lines from {filename[1]}')
        print_info(get_green(f"Data was loaded successfully"))
        return destination
    except:
        print_info(get_red(f"File {file_url} could not be loaded"))
        print_info(get_red("Program End"))
        exit(0)
def FindDuplications(tasks):
    dlist, start, count = tasks
    duplicates = []
    for r in tqdm(dlist[start:start + count]):
        matches = [x for x in dlist if r.pn == x.pn and r.comp == x.comp]
        duplicates.append(CrossDuplication(r.pn, r.comp, len(matches)))
    return {d for d in duplicates if d.cnt > 1}
def CheckDuplications(cross_list):
    threads = cpu_count()
    tasks_per_thread = ceil(len(cross_list) / threads)
    tasks = [(cross_list, tasks_per_thread * i, tasks_per_thread) for i in range(threads)]
    p = Pool(threads)
    duplicates = p.map(FindDuplications, tasks)
    p.close()
    p.join()
    duplicates = {item for sublist in duplicates for item in sublist}
    return duplicates
def main():
    # Main Title of app
    print_info(get_yellow("Run app"))
    # VARIABLES
    catalog_list = []
    cross_list = []
    # Start calculate program running time
    start_time = int(round(time.time() * 1000))
    # load URL param from program input arguments
    validation_type = Type[sys.argv[1]]
    cross_ref_url = sys.argv[2]
    catalog_url = sys.argv[3]
    # Get info about tested type
    print_info(get_blue(f"||| Validate data for {validation_type.name} |||"))
    print("Number of processors: ", cpu_count())
    print()
    # load data
    cross_list = CSVCrossLoader(cross_ref_url, validation_type)
    catalog_list = CSVCatalogLoader(catalog_url)
    # Check data in Cross Reference for Duplications [ MULTITHREAD ]
    duplicates = CheckDuplications(cross_list)
    # Print duration of execution script
    mili = int(int(round(time.time() * 1000)) - start_time)
    print(f'Script duration - {mili} ms | {get_formated_time(mili)}')
    # End of program
    print_info(get_yellow(""))
    print()
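The way CheckDuplications splits the list across workers can be checked in isolation. The sketch below mirrors only the (start, count) arithmetic from that function; `make_ranges` and `covered` are hypothetical helper names introduced for illustration, and nothing here touches `Pool` or the debugger.

```python
from math import ceil


def make_ranges(n_items, workers):
    """Build (start, count) pairs the same way the tasks list is built."""
    per_worker = ceil(n_items / workers)
    return [(per_worker * i, per_worker) for i in range(workers)]


def covered(n_items, ranges):
    """Return how many items each (start, count) slice really contains."""
    # Slicing past the end is safe in Python, so trailing slices
    # may be shorter than count, or empty.
    return [len(range(n_items)[start:start + count]) for start, count in ranges]


if __name__ == '__main__':
    ranges = make_ranges(10, 4)
    print(ranges)               # [(0, 3), (3, 3), (6, 3), (9, 3)]
    print(covered(10, ranges))  # [3, 3, 3, 1]
```

Because the chunk size is rounded up, the last workers can receive a short or even empty slice (for example, 100 items over 16 workers leaves the final slice empty). That is harmless for the result, since every item is still covered exactly once.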
launch.json - VS Code configuration file
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "First",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "args": [
                "TT",
                "..\\url\\ref.csv",
                "..\\url\\catalog.csv"
            ]
        }
    ]
}
Data file - ref.csv (example)
xxx;123;
ccc;dgd;
xxx;323;
xxx;dgd;
xxx;123;
...etc:.
Data file - catalog.csv (example)
xxx;
ccc;
vvv;
fff;
xyx;
xxx;
cff;
ccc;
www;
...etc:.
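The application loads 2 CSV files and looks for duplicate rows in ref.csv. That file has 100k+ rows, and a for-each loop compares the first and second column of each row against the same data.
For loop over a list of objects with multithreading in Python - my previous question, about how to do the multithreading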
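To make the duplicate check described above concrete, here is a minimal sketch that counts repeated (first column, second column) pairs. Note that it swaps in `collections.Counter` rather than the pairwise comparison used in functions.py; `count_duplicates` is a hypothetical name, and the sample rows are taken from the ref.csv example.

```python
from collections import Counter


def count_duplicates(rows):
    """Return {(pn, comp): count} for pairs that occur more than once.

    rows is any iterable of sequences whose first two items are the
    part number and the competitor code (layout assumed for this sketch).
    """
    counts = Counter((row[0], row[1]) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


if __name__ == '__main__':
    sample = [
        ('xxx', '123'),
        ('ccc', 'dgd'),
        ('xxx', '323'),
        ('xxx', 'dgd'),
        ('xxx', '123'),
    ]
    print(count_duplicates(sample))  # {('xxx', '123'): 2}
```

A single pass like this visits each row once, whereas comparing every row against the whole list grows quadratically with the number of rows.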
EDIT - 10.03.2020 - Third computer
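Today I tried it on my laptop (Lenovo T480s) with an Intel Core i7-8550U processor, which has 4 cores / 8 threads.
I ran it with threads = cpu_count(); the function returns 8 cores/threads and everything works fine. The configuration is the same as on the two previous PCs, but the error occurs only on the Intel Core i9-9900.
In addition, I tried the following settings on the i9-9900: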
threads = 8 # OK
threads = 12 # OK
threads = 14 # ERROR
threads = 16 # ERROR
===========================================
Running the program natively in CMD or PowerShell works fine with 16 threads - OK
C:\Users\test\AppData\Local\Programs\Python\Python38\python.exe 'c:\Users\test\Documents\Work\Sources\APP\src\APP\cross-validator.py' 'TT' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\ref.csv' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\Tan\catalog.csv'
Running it through VS Code debug adds a parameter with a link to ms-python - ERROR
${env:PTVSD_LAUNCHER_PORT}='49376'; & 'C:\Users\test\AppData\Local\Programs\Python\Python38\python.exe' 'c:\Users\test\.vscode\extensions\ms-python.python-2020.2.64397\pythonFiles\lib\python\new_ptvsd\no_wheels\ptvsd\launcher' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\cross-validator.py' 'TAN' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\ref.csv' 'c:\Users\test\Documents\Work\Sources\APP\src\APP\APP\Definitions\Tan\catalog.csv'
Thank you for your help