Kristofer Karlsson's development blog: August 2010

Friday, August 20, 2010

Random numbers without repetition

The origin of this problem comes from picking K elements from a collection of elements of size N (N >= K, obviously). If K = N, this is the same thing as shuffling, which can be done using the excellent Fisher-Yates algorithm in O(N). This algorithm can be used when K is smaller than N as well, you just have to disregard N - K of the elements. However, if K is very small and N is very large, this algorithm is quite far from optimal.

Another approach is to simply pick elements randomly from the collection and keep track of which elements have already been picked. If you hit the same element again, you simply pick another one. This would work if the entropy source was good enough to ensure that would always pick numbers in a range with equal probability, to avoid getting stuck picking numbers already used. If K is much smaller than N, this would work well, but if K was very close to N, you could probably end up doing far too many retries to be reasonable as soon as you start filling up the result.

Instead, you could use the original Fisher-Yates strike-out approach and simply stop after K elements. This could be done in O(K*(log(K) + K)), which can be simplified to O(K^2) with a naive implementation:


public <T> List<T> take(List<T> source, int k) {
int n = source.size();
if (k > n) {
throw new IllegalStateException(
"Can not take " + k +
" elements from a list with " +
n + " elements");
}
List<T> result = new ArrayList<T>(k);
SortedSet<Integer> offsets = new TreeSet<Integer>();

for (int i = 0; i < k; i++) {  // O(K)
int off = random.nextInt(n - i);

for (int offset : offsets) {  // O(K)
if (off >= offset) {
off++;
} else {
break;
}
}
offsets.add(off); // O(log(K))
result.add(source.get(off));
}
return result;
}

if K >= O(sqrt(N)), then the plain shuffle is better, but when K is smaller than sqrt(N), this approach would be the winner.

I wonder if it's possible to do it faster than O(K^2). One idea is to use some sort of binary search to quicker find the actual offset without having to iterate the list of offsets. This would make the algorithm O(K*log(K)) instead which would be much better.

I experimented a bit by repeatedly running a binary search of the offset until the offset can not be incremented further. If N is large and K is small, this would often stop after the first iteration since it's sparse between findings, but you can still easily construct worst case scenarios which would need to iterate O(K) times.

Does anyone know any good articles / papers / blogs about algorithms for picking K distinct random numbers in the range 1..N?

Update:

After having eaten lunch, I figured out how to make it run in O(K):


public <T> List<T> take(List<T> source, int k) {
int n = source.size();
if (k > n) {
throw new IllegalStateException(
"Can not take " + k +
" elements from a list with " + n +
" elements");
}
List<T> result = new ArrayList<T>(k);
Map<Integer,Integer> used = new HashMap<Integer,Integer>();
int metric = 0;
for (int i = 0; i < k; i++) {
int off = random.nextInt(n - i);
while (true) {
metric++;
Integer redirect = used.put(off, n - i - 1);
if (redirect == null) {
  break;
}
off = redirect;
}
result.add(source.get(off));
}
assert metric <= 2*k;
return result;
}

The idea here is to instead of recalculating off to the actual position, as in Fisher-Yates, just assume that the random index is correct. Then, use the map to check if the index has already been used, and if so, use another guaranteed free index instead.

The while loop make look bad, but it never runs more than 2*K times in total.
(I haven't proven this formally, I just ran a lot of benchmarks on random input.)

If I were to prove it though, my general strategy would be that the loop is entered exactly K times, and you will follow at most K links, since only K links will be constructed and no link is followed more than once.

Update 2:
Apparently Robert Floyd already invented this algorithm (well, it's very similar at least) (published in Communications of the ACM, September 1987, Volume 30, Number 9).

public <T> List<T> take(List<T> source, int k) {
 int n = source.size();
 List<T> result = new ArrayList<T>(k);
 Set<Integer> used = new HashSet<Integer>();
 for (int i = n - k; i < n; i++) {
   int off = random.nextInt(i + 1);
   if (!used.add(off)) {
     off = i;
     used.add(off);
   }
   result.add(source.get(off));
 }
 return result;
}

The only problem with this is that the ordering of result is not random, only which elements have been chosen. This is easily fixed by running Collections.shuffle(result, random) though.

I see two significant differences with this approach.

The first (which is of the least relevance) is the amount of entropy used.
If we assume that calling random.nextInt(n) consumes log_2(n) bits of entropy then my algorithm uses: log_2(n) + log_2(n-1) + log_2(n-2) + ... + log_2(n - k + 1) = log_2(n! / (n - k)!) = log_2(n!) - log_2((n - k)!) bits of entropy.

Robery Floyd with an additional plain shuffle uses:
log_2(n!) - log_2((n - k)!) + log_2(k!) bits of entropy which may be significant if k is large compared to n, and if entropy is expensive where the algorithm is used.

The second difference is much more important. The Robert Floyd-algorithm can't be run interactively, picking one or more numbers at a time. You need to know the value of k (or an upper bound of k) before requesting a random number, since the first number chosen by the algorithm will be in the range [0, n - k).
My algorithm on the other hand can be run partially and resumed without knowing the value of k at all.

Friday, August 6, 2010

Python annoyances - generators

Another annoyance in Python is generators. Since Python 2.5, the generator concept was extended to include coroutine support. (See http://www.python.org/dev/peps/pep-0342/ )
I for one think it's an odd design of coroutines.
Note that this comes from someone who is used to (and spoiled by) Lua's coroutines.

I have two problems with coroutines/generators in Python:

First of all, "def" gets multiple meanings. It's not just a function definition anymore.
It could also be a generator definition! And the only way to notice this is to
scan the entire function definition for a yield statement.

I would not have a problem with this if it wasn't for the case that generator
definitions and function definitions behave completely differently.

def gen():
 return 123

I can call this with gen() and get 123 back. Simple!

def gen():
 yield 123

If I call gen() now, I don't get 123 back, I get a generator object.
So I have to do this instead:

g = gen()
value = g()

That by itself isn't much trickier, but it's hard to know by just looking at the code if it's a generator. It's completely invisible at the call site, and at the definition it could be anywhere in the function. This is probably why most generators I've seen in Python have been fairly short and simple (which is a good thing for function design over all anyway).

Let's compare it to Lua, which was the language that introduced me to coroutines:

-- Simple function call
function gen()
 return 123
end
value = gen()

-- Using a generator / coroutine
function gen()
 coroutine.yield(123)
end
c = coroutine.wrap(gen)
value = c()

Here it's quite clear at the call site that we're dealing with a coroutine. And if we were to call gen as a regular function, we'd get a runtime error unless we were actually running in a coroutine already.

That gets us into the second annoyance. Python only supports yield in the generator definition. If you were to build a more complex generator, you'd have to write it as one long function instead of splitting it up into many small functions.

As it happens, I have a good example to use. The following is actual code I have for a hobby project. It's an AI implemented with Lua coroutines. To keep it small, not all functions have been defined, but I've kept everything that deals with coroutines.

function resume(entity)
 if not entity.coroutine then
   entity.coroutine = coroutine.create(function()
     return entity:runAI()
   end)
 end
 coroutine.resume(entity.coroutine)
end

function Entity:moveTowards(x, y)
 local dx = sign(x - self.x)
 local dy = sign(y - self.y)
 local dir = getDirection(dx, dy)
 if dir ~= -1 then
   TryMove(self.entity, dir)
   coroutine.yield()
 end
end

function Entity:moveTo(x, y)
 while self.x ~= x or self.y ~= y do
   self:moveTowards(x, y)
 end
end

function Entity:attack(dx, dy)
 local dir = getDirection(dx, dy)
 if dir ~= -1 then
   TryAttack(self.entity, dir)
   coroutine.yield()
 end
end

function Entity:chasePlayer()
 while true do
   local distance = self:distanceToPlayer()
   if distance > 8 then
     break
   end
   if distance <= 2 then
     local dx = px - self.x
     local dy = py - self.y
     self:attack(dx, dy)
   else
     self:moveTowards(px, py)
   end
 end
end

function Entity:patrolTo(x, y)
 while self.x ~= x or self.y ~= y do
   local distance = self:distanceToPlayer()
   if distance <= 5 then
      self:chasePlayer()
   else
     self:moveTowards(x, y)
   end
 end
end

-- Populate world with deadly enemies
local entity = Entity:new(20, 10)
function entity:runAI()
 while true do
   self:moveTo(20, 8)
   self:moveTo(20, 12)
 end
end
addEntity(entity)

local entity = Entity:new(15, 15)
function entity:runAI()
 while true do
   self:patrolTo(15, 15)
   self:patrolTo(15, 10)
   self:patrolTo(10, 10)
   self:patrolTo(10, 15)
 end
end
addEntity(entity)

As you can see, each entity type can be written by using very high level
constructs, which themselves are defined by simple logic and calls to lower level
functions. This lets you write an AI as if it was running in its own thread,
but in reality, the game engine only calls the resume-function for each entity
on its main thread.

Without coroutines, you'd have to resort to writing a state machine instead,
which can be more painful to write and maintain.

Now, I've tried converting this to Python to see what it would look like.
This was my first iteration, which won't run because the generators itself
don't contain the actual yield-call.

def resume(entity):
 if entity.coroutine == None:
   entity.coroutine = entity.runAI()
 return entity.coroutine()

class Entity:
 def __init__(self, x, y):
   self.x = x
   self.y = y

 def moveTowards(self, x, y):
   dx = sign(x - self.x)
   dy = sign(y - self.y)
   dir = getDirection(dx, dy)
   if dir != -1:
     TryMove(self.entity, dir)
     yield

 def moveTo(self, x, y):
   while self.x != x or self.y != y:
     self.moveTowards(x, y)

 def attack(self, dx, dy):
   dir = getDirection(dx, dy)
   if dir != -1:
     TryAttack(self.entity, dir)
     yield

 def chasePlayer(self):
   while True:
     distance = self.distanceToPlayer()
     if distance > 8:
       break
     if distance <= 2:
       dx = px - self.x
       dy = py - self.y
       self.attack(dx, dy)
     else:
       self.moveTowards(px, py)

 def patrolTo(self, x, y):
   while self.x != x or self.y != y:
     distance = self.distanceToPlayer()
     if distance <= 5:
       self.chasePlayer()
     else:
       self.moveTowards(x, y)

# Populate world with deadly enemies
class Runner(Entity):
 def runAi(self):
   while True:
     self.moveTo(20, 8)
     self.moveTo(20, 12)
entity = Runner(20, 10)
addEntity(entity)

class Patroller(Entity):
 def runAi(self):
   while True:
     self.patrolTo(15, 15)
     self.patrolTo(15, 10)
     self.patrolTo(10, 10)
     self.patrolTo(10, 15)

entity = Patroller(15, 15)
addEntity(entity)

Now, let's try refactoring until all yields are in the actual generator.
To do this, we could either simply inline all the functions that would yield,
until they're all in the runAi-method. The problem with this is a lot of
duplicated code, so that's not a good option.

Another way would be to only have the yields in the runAi, and let the currently
yielding functions return some value instead. The problem with that is that we
lose the knowledge of where to resume from, and thus we have to build some sort
of state maching to remember it, which would remove the point of having
coroutines in the first place.

If anyone can see a good way of refactoring that Python code (I am fairly new
to Python after all) to work as a coroutine, please let me know!

Until then, I'll just assume that python coroutines are inferior to Lua's. ;-)

[Here's an interesting article about coroutines: http://www.inf.puc-rio.br/~roberto/docs/MCC15-04.pdf ]

Python annoyance - scoping

I recently tried writing a small program in Python, and I have to give it credit here - it was pretty easy to get started, using the online language reference and standard library reference for help.

One thing I stumbled over was a subtlety in the scoping that I didn't expect. Here's an example:

foo = 123
def fun():
 print(foo)
fun()
print(foo)

This prints 123 and 123, as you would expect. This implies that function blocks can access variables in outer blocks. Let's try another example:

foo = 123
def fun():
 foo = 456
 print(foo)
fun()
print(foo)

This prints 456 and 123. This means that assignment of a variable in a function doesn't assign the outer variable. Instead it creates a new variable local to the function block. Let's look at the final example:

foo = 123
def fun():
 print(foo)
 foo = 456
 print(foo)
fun()
print(foo)

I can think of two reasonable ways to interpret this:

First print 123 (from the outer block), then 456 (creating a new variable in the function block), then 123 (outer block variable unchanged)
First print 123 (from the outer block), then 456 (modifying the outer variable), then 456.

Instead, Python throws an error:

UnboundLocalError: local variable 'foo' referenced before assignment

This means that Python actually inspects the entire function and searches for assignment operations. If it finds any, it creates a new local variable. If it doesn't, it refers to the outer.

You can override this behaviour by doing this:

foo = 123
def fun():
 global foo
 print(foo)
 foo = 456
 print(foo)
fun()
print(foo)

This forces python to only refer to the outer variable instead of creating a new. This outputs 123, 456, 456.

The annoyance is that Pythons automatic choice between "read outer" and "create new local" comes from inspecting the entire function, which means that an assignment at the end of the function can create an error at the first line of the function, which can be very confusing for a beginner.

I think a better solution would have been to always have outer by default, except for in loops and function parameter definitions. You could then add the keyword "local, my, var or val" for when you really want to create a new local variable (like lua, perl, javascript, scala et.c. does).

Kristofer Karlsson's development blog

Friday, August 20, 2010

Random numbers without repetition

Friday, August 6, 2010

Python annoyances - generators

Python annoyance - scoping

My Java projects

My WoW projects

Blog Archive

Labels