Problem description
Update: my problem has been solved. I updated the code in my question to match Jason's answer. Note that rikitikitik's answer addresses a different case: picking cards from a sample with replacement.
I want to select x random elements from a weighted list, sampling without replacement. I found this answer with an implementation in Python: https://stackoverflow.com/a/2149533/57369. I implemented it in C# and tested it, but the results (described below) did not match what I expected. I have no knowledge of Python, so I am fairly sure I made a mistake while porting the code to C#, but I can't see where, as the Python code was really well documented.
I picked one card 10000 times and these are the results I obtained (the results are consistent across executions):
Card 1: 18.25 % (10.00 % expected)
Card 2: 26.85 % (30.00 % expected)
Card 3: 46.22 % (50.00 % expected)
Card 4: 8.68 % (10.00 % expected)
As you can see, Card 1 and Card 4 both have a weight of 1, but Card 1 is always picked far more often than Card 4 (even if I pick 2 or 3 cards).
Test data:
var cards = new List<Card>
{
    new Card { Id = 1, AttributionRate = 1 }, // 10 %
    new Card { Id = 2, AttributionRate = 3 }, // 30 %
    new Card { Id = 3, AttributionRate = 5 }, // 50 %
    new Card { Id = 4, AttributionRate = 1 }, // 10 %
};
Here is my implementation in C#:
using System;
using System.Collections.Generic;

// ICardsAttributor is part of the asker's project and is not shown here.
public class CardAttributor : ICardsAttributor
{
    private static Random random = new Random();

    // Builds a binary heap stored in a list; index 0 is unused so that
    // the children of node i sit at 2*i and 2*i + 1.
    private List<Node> GenerateHeap(List<Card> cards)
    {
        List<Node> nodes = new List<Node>();
        nodes.Add(null);
        foreach (Card card in cards)
        {
            nodes.Add(new Node(card.AttributionRate, card, card.AttributionRate));
        }
        // Accumulate subtree totals from the last node up towards the root.
        for (int i = nodes.Count - 1; i > 1; i--)
        {
            nodes[i >> 1].TotalWeight += nodes[i].TotalWeight;
        }
        return nodes;
    }

    // Picks one card with probability proportional to its weight and
    // removes it from further draws by zeroing its weight.
    private Card PopFromHeap(List<Node> heap)
    {
        Card card = null;
        int gas = random.Next(heap[1].TotalWeight); // 0 <= gas < total weight
        int i = 1;
        while (gas >= heap[i].Weight)
        {
            gas -= heap[i].Weight;
            i <<= 1; // move to the left child
            if (gas >= heap[i].TotalWeight)
            {
                gas -= heap[i].TotalWeight;
                i += 1; // move to the right child
            }
        }
        int weight = heap[i].Weight;
        card = heap[i].Value;
        heap[i].Weight = 0;
        // Remove the picked weight from every ancestor's total.
        while (i > 0)
        {
            heap[i].TotalWeight -= weight;
            i >>= 1;
        }
        return card;
    }

    public List<Card> PickMultipleCards(List<Card> cards, int cardsToPickCount)
    {
        List<Card> pickedCards = new List<Card>();
        List<Node> heap = GenerateHeap(cards);
        for (int i = 0; i < cardsToPickCount; i++)
        {
            pickedCards.Add(PopFromHeap(heap));
        }
        return pickedCards;
    }
}

class Node
{
    public int Weight { get; set; }
    public Card Value { get; set; }
    public int TotalWeight { get; set; }

    public Node(int weight, Card value, int totalWeight)
    {
        Weight = weight;
        Value = value;
        TotalWeight = totalWeight;
    }
}

public class Card
{
    public int Id { get; set; }
    public int AttributionRate { get; set; }
}
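For cross-checking frequencies like the ones reported above, it can help to have a much simpler sampler to compare against. The following is a minimal sketch, not the heap algorithm from the linked answer: it swaps in a plain linear scan over the remaining weights, which is slower for long lists but easy to verify by eye. The names ReferenceSampler and Pick are hypothetical and do not appear in the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference sampler (linear scan, NOT the heap algorithm).
public static class ReferenceSampler
{
    // Picks up to `count` distinct indices; at each draw, the chance of a
    // remaining index is proportional to its weight.
    public static List<int> Pick(int[] weights, int count, Random random)
    {
        int[] remaining = (int[])weights.Clone();
        int total = 0;
        foreach (int w in remaining) total += w;

        var picked = new List<int>();
        for (int n = 0; n < count && total > 0; n++)
        {
            int gas = random.Next(total); // 0 <= gas < total
            int i = 0;
            while (gas >= remaining[i])   // also skips zero-weight entries
            {
                gas -= remaining[i];
                i++;
            }
            picked.Add(i);
            total -= remaining[i];
            remaining[i] = 0;             // without replacement
        }
        return picked;
    }

    public static void Main()
    {
        int[] weights = { 1, 3, 5, 1 };   // the question's test data
        var random = new Random();
        const int runs = 10000;
        int[] hits = new int[weights.Length];
        for (int r = 0; r < runs; r++)
        {
            hits[Pick(weights, 1, random)[0]]++;
        }
        for (int i = 0; i < weights.Length; i++)
        {
            Console.WriteLine($"Card {i + 1}: {100.0 * hits[i] / runs:F2} %");
        }
    }
}
```

With a single card per run, the printed percentages should land close to 10 / 30 / 50 / 10; a heap implementation that disagrees noticeably with this reference is worth a second look.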
Recommended answer
There are two minor bugs in the program. First, the range of the random number should be exactly equal to the total weight of all the items, so that gas falls in [0, TotalWeight):
int gas = random.Next(heap[1].TotalWeight);
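The reason this range works is that Random.Next(maxValue) returns a non-negative integer strictly less than maxValue, so there are exactly as many possible gas values as there are units of weight. A small sketch of that behaviour follows; GasRangeCheck and Histogram are hypothetical names used only for illustration, and the fixed seed is there just to make runs repeatable.

```csharp
using System;

// Hypothetical helper illustrating the range of Random.Next(maxValue).
public static class GasRangeCheck
{
    // Draws `draws` values with Random.Next(totalWeight) and counts how
    // often each value occurs. The array has exactly totalWeight slots, so
    // an out-of-range draw would throw IndexOutOfRangeException.
    public static int[] Histogram(int totalWeight, int draws, int seed)
    {
        var random = new Random(seed);
        int[] histogram = new int[totalWeight];
        for (int n = 0; n < draws; n++)
        {
            histogram[random.Next(totalWeight)]++;
        }
        return histogram;
    }

    public static void Main()
    {
        // Total weight 10 matches the question's test data (1 + 3 + 5 + 1).
        int[] histogram = Histogram(10, 10000, 42);
        for (int gas = 0; gas < histogram.Length; gas++)
        {
            Console.WriteLine($"gas = {gas}: {histogram[gas]} draws");
        }
    }
}
```

Every draw lands in one of the values 0 through 9, one value per unit of weight, which is what the heap walk relies on.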
Second, change both places where it says gas > to say gas >=.
(The original Python code is OK because gas is a floating-point number there, so the difference between > and >= is negligible. That code was written to accept either integer or floating-point weights.)
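With integer weights the difference is not negligible, and it can be checked without any randomness by walking the heap once for every possible gas value. The sketch below assumes the question's test weights {1, 3, 5, 1} and hard-codes the Weight and TotalWeight arrays that GenerateHeap would produce for them; ComparisonCheck, Walk and CountPicks are hypothetical names. It only isolates the comparison operator, so the skew it shows for > is not expected to match the percentages reported in the question, whose pre-fix code is not shown.

```csharp
using System;

// Hypothetical deterministic check of ">" versus ">=" in the heap walk.
public static class ComparisonCheck
{
    // Heap for the weights {1, 3, 5, 1}; index 0 is the unused slot.
    static readonly int[] Weight = { 0, 1, 3, 5, 1 };
    // TotalWeight[i] = Weight[i] plus the weights of all descendants of i.
    static readonly int[] TotalWeight = { 0, 10, 4, 5, 1 };

    // Walks the heap for one fixed gas value and returns the node reached.
    // strict == true compares with ">", strict == false with ">=".
    public static int Walk(int gas, bool strict)
    {
        int i = 1;
        while (strict ? gas > Weight[i] : gas >= Weight[i])
        {
            gas -= Weight[i];
            i <<= 1;
            if (strict ? gas > TotalWeight[i] : gas >= TotalWeight[i])
            {
                gas -= TotalWeight[i];
                i += 1;
            }
        }
        return i;
    }

    // Counts, for every gas in [0, total weight), which node gets picked.
    public static int[] CountPicks(bool strict)
    {
        int[] counts = new int[Weight.Length];
        for (int gas = 0; gas < TotalWeight[1]; gas++)
        {
            counts[Walk(gas, strict)]++;
        }
        return counts;
    }

    public static void Main()
    {
        // The first number in each line belongs to the unused index 0.
        Console.WriteLine(">=: " + string.Join(", ", CountPicks(false))); // >=: 0, 1, 3, 5, 1
        Console.WriteLine(">:  " + string.Join(", ", CountPicks(true)));  // >:  0, 2, 3, 4, 1
    }
}
```

With >= each card is reached by exactly as many gas values as its weight. With > the boundary values spill over to the wrong node: Card 1 gains a value and Card 3 loses one.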
Update: OK, you made the recommended changes in your code. I think that code is correct now!