问题描述
我尝试重写一些 csv 读取代码,以便能够在 Python 3.2.2 的多个内核上运行它.我尝试使用多处理的 Pool 对象,该对象是我从工作示例中改编而来的(并且已经为我的项目的另一部分工作).我遇到了一条难以解读和排除故障的错误消息.
I tried to rewrite some csv-reading code to be able to run it on multiple cores in Python 3.2.2. I tried to use the Pool object of multiprocessing, which I adapted from working examples (and already worked for me for another part of my project). I ran into an error message I found hard to decipher and troubleshoot.
错误:
Traceback (most recent call last): File "parser5_nodots_parallel.py", line 256, in <module> MG,ppl = csv2graph(r) File "parser5_nodots_parallel.py", line 245, in csv2graph node_chunks) File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/multiprocessing/pool.py", line 251, in map return self.map_async(func, iterable, chunksize).get() File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/multiprocessing/pool.py", line 552, in get raise self._value AttributeError: __exit__
相关代码:
import csv import time import datetime import re from operator import itemgetter from multiprocessing import Pool import itertools def chunks(l,n): """Divide a list of nodes `l` in `n` chunks""" l_c = iter(l) while 1: x = tuple(itertools.islice(l_c,n)) if not x: return yield x def csv2nodes(r): strptime = time.strptime mktime = time.mktime l = [] ppl = set() pattern = re.compile(r"""[A-Za-z0-9"/]+?(?=[, ])""") for row in r: with pattern.findall(row) as f: cell = int(f[3]) id = int(f[2]) st = mktime(strptime(f[0],'%d/%m/%Y')) ed = mktime(strptime(f[1],'%d/%m/%Y')) # collect list l.append([(id,cell,{1:st,2: ed})]) # collect separate sets ppl.add(id) return (l,ppl) def csv2graph(source): MG=nx.MultiGraph() # Remember that I use integers for edge attributes, to save space! Dic above. # start: 1 # end: 2 p = Pool() node_divisor = len(p._pool) node_chunks = list(chunks(source,int(len(source)/int(node_divisor)))) num_chunks = len(node_chunks) pedgelists = p.map(csv2nodes, node_chunks) ll = [] ppl = set() for l in pedgelists: ll.append(l[0]) ppl.update(l[1]) MG.add_edges_from(ll) return (MG,ppl) with open('/Users/laszlosandor/Dropbox/peers_prisons/python/codetenus_test.txt','r') as source: r = source.readlines() MG,ppl = csv2graph(r)
解决此问题的好方法是什么?
What's a good way to troubleshoot this?
推荐答案
问题出在这一行:
with pattern.findall(row) as f:
您正在使用 with 语句.它需要一个带有 __enter__ 和 __exit__ 方法的对象.但是pattern.findall返回一个list,with试图存储__exit__方法,但是找不到它,并引发错误.只需使用
You are using the with statement. It requires an object with __enter__ and __exit__ methods. But pattern.findall returns a list, with tries to store the __exit__ method, but it can't find it, and raises an error. Just use
f = pattern.findall(row)
改为.