Placing Marbles Into Buckets - Expected Max Marbles
This question is inspired by an estimation problem I noticed in the workplace. We wanted to predict the QPS our service needs to support when client requests are rare but come in bursts when they do arrive.
Problem Statement
We have N marbles and M buckets. Randomly and uniformly distribute these marbles among the buckets. How many marbles do we expect to be in the bucket with the most marbles?
Solution
This is not the first "expected value of max/min" problem I've addressed. You might remember Unconventional Statistics With Dice – Expected Minimum Roll from 3 dice. We will follow a similar strategy here.
Firstly, lets define the key output of the problem:
With that defined, let's jump into the solution:
We can express the final equation as the following Python function:
def cdf(N,M,a):
if N <= a:
return 1
if M == 1:
return 0
ans = 0
for take in range(a+1): # {0,1,...a}
ans = cdf(N-take, M-1, a)
return ans
And that's it!
Check out this Google Colab for an executable implementation.
Contact me on Twitter or at contact [at] thornewolf [dot] com!