Problem description
I was playing around to get a better feeling for itertools groupby, so I grouped a list of tuples by the number and tried to get a list of the resulting groups. When I convert the result of groupby to a list however, I get a strange result: all but the last group are empty. Why is that? I assumed turning an iterator into a list would be less efficient but never change behavior. I guess the lists are empty because the inner iterators are traversed but when/where does that happen?
    import itertools

    l = list(zip([1, 2, 2, 3, 3, 3], ['a', 'b', 'c', 'd', 'e', 'f']))
    # [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (3, 'e'), (3, 'f')]

    grouped_l = list(itertools.groupby(l, key=lambda x: x[0]))
    # [(1, <itertools._grouper at ...>), (2, <itertools._grouper at ...>), (3, <itertools._grouper at ...>)]

    [list(x[1]) for x in grouped_l]
    # [[], [], [(3, 'f')]]

    grouped_i = itertools.groupby(l, key=lambda x: x[0])
    # <itertools.groupby at ...>

    [list(x[1]) for x in grouped_i]
    # [[(1, 'a')], [(2, 'b'), (2, 'c')], [(3, 'd'), (3, 'e'), (3, 'f')]]
Recommended answer
From the itertools.groupby() documentation:
The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible.
Turning the output from groupby() into a list advances the groupby() object.
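To see where the inner iterators get used up, it may help to advance the groupby() object by hand instead of through list(). The following is a minimal sketch; the names pairs, groups, stale and fresh are made up for illustration. Calling list() on the groupby() object does the same thing as the repeated next() calls here, just all the way to the end.

```python
import itertools

pairs = [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd')]
groups = itertools.groupby(pairs, key=lambda x: x[0])

# Take the first group but do not consume it yet.
key1, group1 = next(groups)

# Advancing the outer iterator moves the shared source past group 1.
key2, group2 = next(groups)

stale = list(group1)  # group 1 is no longer visible
fresh = list(group2)  # group 2 is still the current group, so it is intact

print(key1, stale)  # 1 []
print(key2, fresh)  # 2 [(2, 'b'), (2, 'c')]
```

What a stale group yields once the groupby() object has been run to exhaustion (the last group in the question) appears to depend on the Python version, so it is best not to rely on it.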
Hence, you shouldn't convert the itertools.groupby object to a list directly. If you want to store the values in a list, build each group's list while that group is still the current one, for example with a list comprehension:
    grouped_l = [(a, list(b)) for a, b in itertools.groupby(l, key=lambda x: x[0])]
This lets you iterate over the resulting list multiple times. However, if you only need to iterate over the result once, the second approach from your question is sufficient.
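As a rough illustration of the "multiple times" point, the sketch below materializes the groups once and then makes two separate passes over them. The names grouped, sizes and letters are hypothetical and only serve the example.

```python
import itertools

l = [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (3, 'e'), (3, 'f')]

# Materialize each group while it is still the current one.
grouped = [(key, list(group)) for key, group in itertools.groupby(l, key=lambda x: x[0])]

# The groups are now plain lists, so they can be traversed repeatedly.
sizes = {key: len(items) for key, items in grouped}
letters = {key: [letter for _, letter in items] for key, items in grouped}

print(sizes)    # {1: 1, 2: 2, 3: 3}
print(letters)  # {1: ['a'], 2: ['b', 'c'], 3: ['d', 'e', 'f']}
```

The trade-off is memory: every group is held in full, whereas the lazy single-pass form keeps only the current item.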