Problem description
I am working with a Python object that implements __add__ but does not subclass int. MyObj1 + MyObj2 works fine, but sum([MyObj1, MyObj2]) raises a TypeError, because sum() first attempts 0 + MyObj. In order to use sum(), my object needs __radd__ to handle 0 + MyObj, or I need to provide an empty object as the start parameter. The object in question is not designed to be empty.
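The failure described above can be reproduced with a minimal sketch. The `Interval` class and its `spans` attribute are hypothetical stand-ins, not the actual module; the only property that matters is that the class defines `__add__` and nothing else.

```python
class Interval:
    """Hypothetical stand-in: supports + with its own type, but not with int."""

    def __init__(self, start, end):
        self.spans = [(start, end)]

    def __add__(self, other):
        if not isinstance(other, Interval):
            return NotImplemented
        result = Interval(0, 0)
        result.spans = self.spans + other.spans
        return result


a, b = Interval(3, 6), Interval(10, 13)
combined = a + b  # works: Interval.__add__ handles it

error = None
try:
    sum([a, b])  # sum() starts from 0, so it first evaluates 0 + a
except TypeError as exc:
    error = exc  # int cannot add an Interval, and Interval has no __radd__
```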
Before anyone asks, the object is not list-like or string-like, so use of join() or itertools would not help.
Edit for details: the module has a SimpleLocation and a CompoundLocation. I'll abbreviate Location to Loc. A SimpleLoc contains one right-open interval, i.e. [start, end). Adding SimpleLocs yields a CompoundLoc, which contains a list of the intervals, e.g. [[3, 6), [10, 13)]. End uses include iterating through the union, e.g. [3, 4, 5, 10, 11, 12], checking length, and checking membership.
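The three end uses can be sketched directly on endpoint pairs, without any classes. This assumes the intervals are stored as plain `(start, end)` tuples and do not overlap; the variable names are illustrative only.

```python
from itertools import chain

intervals = [(3, 6), (10, 13)]  # right-open [start, end) pairs

# Iterating through the union of the intervals
positions = list(chain.from_iterable(range(s, e) for s, e in intervals))

# Length: total number of positions covered (valid only if nothing overlaps)
length = sum(e - s for s, e in intervals)

# Membership: is 11 inside any interval?
contains_11 = any(s <= 11 < e for s, e in intervals)
```

Only the endpoints are touched for length and membership, so the cost depends on the number of intervals rather than on how long they are.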
The numbers can be relatively large (say, smaller than 2^32 but commonly 2^20). The intervals probably won't be extremely long (100-2000, but could be longer). Currently, only the endpoints are stored. I am now tentatively thinking of attempting to subclass set such that the location is constructed as set(xrange(start, end)). However, adding sets will give Python (and mathematicians) fits.
Questions I've looked at:
- python's sum() and non-integer values
- why there's a start argument in python's built-in sum function
- TypeError after overriding the __add__ method
I'm considering two solutions. One is to avoid sum() and use the loop offered in this comment. I don't understand why sum() begins by adding the 0th item of the iterable to 0 rather than adding the 0th and 1st items (like the loop in the linked comment); I hope there's an arcane integer optimization reason.
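The two strategies can be compared side by side in plain Python. This is a rough model of the behaviour, not the actual implementation of the built-in, and `sum_from_first` is a hypothetical helper rather than the loop from the linked comment. One practical consequence of seeding with a start value is that the sum of an empty iterable is well defined.

```python
def sum_like(iterable, start=0):
    """Roughly what the built-in sum() does: fold every item onto `start`."""
    total = start
    for item in iterable:
        total = total + item
    return total


def sum_from_first(iterable):
    """Hypothetical alternative: seed the total with the first item instead of 0."""
    iterator = iter(iterable)
    try:
        total = next(iterator)
    except StopIteration:
        # With no items and no start value there is nothing sensible to return.
        raise TypeError("sum_from_first() needs at least one item") from None
    for item in iterator:
        total = total + item
    return total
```

`sum_like([])` returns 0, while `sum_from_first([])` has to raise, which is the trade-off between the two designs.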
My other solution is as follows; while I don't like the hard-coded zero check, it's the only way I've been able to make sum() work.
    # ...
    def __radd__(self, other):
        # This allows sum() to work (the default start value is zero)
        if other == 0:
            return self
        return self.__add__(other)
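For reference, here is a runnable variation of that workaround. The `Loc` class is a hypothetical minimal container, and returning `NotImplemented` for anything other than 0 is a swapped-in choice (the snippet above delegates to `__add__` instead), so that unsupported left operands still produce the usual TypeError.

```python
class Loc:
    """Hypothetical minimal class; only the zero check mirrors the snippet above."""

    def __init__(self, spans):
        self.spans = list(spans)

    def __add__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        return Loc(self.spans + other.spans)

    def __radd__(self, other):
        # sum() evaluates 0 + first_item, which lands here with other == 0
        if other == 0:
            return self
        return NotImplemented


total = sum([Loc([(3, 6)]), Loc([(10, 13)])])
```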
In summary, is there another way to use sum() on objects that can neither be added to integers nor be empty?
Recommended answer
Instead of sum, use:
    import operator
    from functools import reduce
    reduce(operator.add, seq)
In Python 2, reduce was a built-in, so this looks like:
    import operator
    reduce(operator.add, seq)
Reduce is generally more flexible than sum: you can provide any binary function, not only add, and the initial element is optional, whereas sum always uses one.
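A short sketch of those differences, using plain integers as sample data (the variable names are illustrative):

```python
import operator
from functools import reduce

values = [2, 3, 4]

total = reduce(operator.add, values)        # 2 + 3 + 4, no start value involved
product = reduce(operator.mul, values)      # any binary function works
seeded = reduce(operator.add, values, 100)  # optional initial element

empty_failed = False
try:
    reduce(operator.add, [])                # empty input and no initial element
except TypeError:
    empty_failed = True
```

The last case is the price of leaving out the initial element: with nothing to return for an empty sequence, reduce raises a TypeError.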
Also note: (Warning: maths rant ahead)
Providing support for add w/r/t objects that have no neutral element is a bit awkward from the algebraic point of view.
Note that:
- naturals
- reals
- complex numbers
- N-d vectors
- NxM matrices
- strings
together with addition form a monoid, i.e. their addition is associative and has some kind of neutral element.
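As a small illustration, the neutral element is exactly the value that can safely seed a fold. The sketch below checks this for a few of the types listed above; the sample values are arbitrary, and reduce is used instead of sum because the built-in sum refuses a string start value.

```python
import operator
from functools import reduce

# (sample values, neutral element) pairs for a few of the types listed above
cases = [
    ([1, 2, 3], 0),
    ([1.5, 2.5], 0.0),
    ([1 + 2j, 3 - 1j], 0j),
    (["ab", "cd"], ""),
]

for values, neutral in cases:
    # Adding the neutral element on either side changes nothing.
    assert all(v + neutral == v and neutral + v == v for v in values)
    # Folding an empty sequence from the neutral element returns it unchanged.
    assert reduce(operator.add, [], neutral) == neutral

# Fold each sample list starting from its neutral element.
folded = [reduce(operator.add, values, neutral) for values, neutral in cases]
```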
If your operation isn't associative and doesn't have a neutral element, then it doesn't "resemble" addition. Hence, don't expect it to work well with sum.
In such a case, you might be better off using a function or a method instead of an operator. This may be less confusing, since users of your class, seeing that it supports +, are likely to expect it to behave in a monoidal way (as addition normally does).
Thanks for expanding; I'll refer to your particular module now.
There are two concepts here:
- simple locations,
- compound locations.
It indeed makes sense that simple locations can be added, but they don't form a monoid, because their addition doesn't satisfy the basic property of closure: the sum of two SimpleLocs isn't a SimpleLoc. It is, in general, a CompoundLoc.
OTOH, CompoundLocs with addition look like a monoid to me (a commutative monoid, while we're at it): a sum of those is a CompoundLoc too, their addition is associative and commutative, and the neutral element is an empty CompoundLoc that contains zero SimpleLocs.
If you agree with me (and the above matches your implementation), then you'll be able to use sum as follows:
    sum([SimpleLoc1, SimpleLoc2, SimpleLoc3], CompoundLoc())
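A minimal sketch of classes for which that call works. The constructors, the `parts` attribute and the `intervals()` helper are assumptions made for illustration and are not the real module's API. The start value is passed positionally because sum() accepts `start` as a keyword only from Python 3.8 on.

```python
class SimpleLoc:
    """Hypothetical simple location: one right-open interval [start, end)."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __add__(self, other):
        # Promote to a CompoundLoc and let it do the work.
        return CompoundLoc([self]) + other


class CompoundLoc:
    """Hypothetical compound location: a list of SimpleLocs, possibly empty."""

    def __init__(self, parts=()):
        self.parts = list(parts)  # zero parts: the neutral element

    def __add__(self, other):
        if isinstance(other, SimpleLoc):
            return CompoundLoc(self.parts + [other])
        if isinstance(other, CompoundLoc):
            return CompoundLoc(self.parts + other.parts)
        return NotImplemented

    def intervals(self):
        return [(p.start, p.end) for p in self.parts]


locs = [SimpleLoc(3, 6), SimpleLoc(10, 13), SimpleLoc(20, 22)]
total = sum(locs, CompoundLoc())    # CompoundLoc() + locs[0] + locs[1] + ...
nothing = sum([], CompoundLoc())    # an empty input yields the neutral element
```

This sketch only concatenates parts. It does not merge overlapping intervals, so addition is commutative only up to the order of the stored parts.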
Indeed, this appears to work.
"I am now tentatively thinking of attempting to subclass set such that the location is constructed as set(xrange(start, end)). However, adding sets will give Python (and mathematicians) fits."
Well, locations are sets of numbers, so it makes sense to put a set-like interface on top of them (so __contains__, __iter__, __len__, perhaps __or__ as an alias of +, __and__ as the product, etc.).
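One way such an interface might look while still storing only endpoints. This is a sketch under assumptions: the single `Loc` class, its `intervals` attribute and the `_normalize` helper are hypothetical, and the intersection is a simple pairwise loop rather than anything optimized.

```python
class Loc:
    """Hypothetical location: sorted, disjoint right-open (start, end) pairs."""

    def __init__(self, intervals=()):
        self.intervals = self._normalize(intervals)

    @staticmethod
    def _normalize(intervals):
        merged = []
        # Drop empty intervals, then merge any that overlap or touch.
        for start, end in sorted(i for i in intervals if i[0] < i[1]):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    def __contains__(self, n):
        return any(start <= n < end for start, end in self.intervals)

    def __iter__(self):
        for start, end in self.intervals:
            yield from range(start, end)

    def __len__(self):
        return sum(end - start for start, end in self.intervals)

    def __or__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        return Loc(self.intervals + other.intervals)

    __add__ = __or__  # + as an alias of union

    def __and__(self, other):
        if not isinstance(other, Loc):
            return NotImplemented
        pieces = [
            (max(s1, s2), min(e1, e2))
            for s1, e1 in self.intervals
            for s2, e2 in other.intervals
        ]
        return Loc(pieces)  # empty pieces (start >= end) are dropped on normalize


union = Loc([(3, 6)]) + Loc([(10, 13)])
overlap = union & Loc([(5, 11)])
```

With this layout an empty `Loc()` is the neutral element for `+`, so `sum(locs, Loc())` works without any `__radd__` special case.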
As for construction from xrange, do you really need it? If you know that you're storing sets of intervals, then you're likely to save space by sticking to your representation of [start, end) pairs. You could throw in a utility method that takes an arbitrary sequence of integers and translates it to an optimal SimpleLoc or CompoundLoc if you feel it's going to help.
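Such a utility could be sketched as below. The function name `to_intervals` is hypothetical, and it returns plain tuples rather than Loc objects; a single resulting pair would correspond to a SimpleLoc, several pairs to a CompoundLoc.

```python
def to_intervals(numbers):
    """Collapse integers into sorted, disjoint right-open (start, end) pairs."""
    intervals = []
    for n in sorted(set(numbers)):
        if intervals and intervals[-1][1] == n:
            # n directly follows the current run: extend its open end.
            intervals[-1] = (intervals[-1][0], n + 1)
        else:
            # Gap (or first value): start a new run of length one.
            intervals.append((n, n + 1))
    return intervals


pairs = to_intervals([10, 11, 12, 3, 4, 5])
```

Here `pairs` holds two tuples for six integers, which is the space saving the endpoint representation gives over materializing every position in a set.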