15 July 2012

My weekend hack is a Markov Chain baby name generator. @MarkovBaby comes up with a new baby name once an hour and tweets it out.

The result: Follow @markovbaby on Twitter.

What is a Markov Chain?

Read the Wikipedia entry for a more thorough introduction, but in our case, a Markov Chain is a simple random process for generating text that looks sort of like other text.

For example: as we’re generating baby names, say we start with the letter “C”. What should be our next letter? Well, what kind of letters usually come after a “C” in names? Let’s look through our list of existing names and see what usually comes next.

[Image: A Markov Baby Name Generator]

Great. Let’s pick the next character at random from within this list, weighting each possibility by how often it appears. If we get ‘end of word’, we’re done.

But let’s say we’ve picked ‘h’. Great - so far the name starts with a ‘ch’ - let’s look for the next character: what tends to follow an ‘h’ in our existing names? And so on.
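The counting and weighted pick described above can be sketched roughly like this. It is an illustration rather than the bot's actual code: the sample names, the `$` end-of-word marker, and the helper names `follower_counts` and `pick_next` are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sample; a real run would use a much larger list of names.
NAMES = ["charlie", "chloe", "carl", "cora", "claire", "cecil"]

END = "$"  # assumed marker meaning 'end of word'


def follower_counts(names):
    """Map each character to a Counter of the characters seen right after it."""
    counts = defaultdict(Counter)
    for name in names:
        # Pair every character with its successor; the last one pairs with END.
        for current, following in zip(name, name[1:] + END):
            counts[current][following] += 1
    return counts


def pick_next(counts, current):
    """Pick one follower of `current`, weighted by how often it was seen."""
    followers = counts[current]
    letters = list(followers)
    weights = [followers[letter] for letter in letters]
    return random.choices(letters, weights=weights, k=1)[0]


counts = follower_counts(NAMES)
print(counts["c"])             # what tends to follow a 'c' in the sample
print(pick_next(counts, "c"))  # one weighted random pick
```

In this tiny sample, ‘h’ follows ‘c’ twice and every other follower appears once, so ‘h’ is twice as likely to be picked as any single alternative.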

Results

The results are a rather eclectic set of names. Some are silly and nonsensical (C, Ieahaholijayson), while others seem pretty reasonable (Marin, Gacon). A lot of them sound like they belong in Middle Earth (Miaviria) or Westeros (Josth, Mindron). Occasionally it’ll accidentally recreate a real name. Those are my favorite.

@MarkovBaby may be suitable for an expecting couple with just the appropriate amount of eccentricity and love of statistics.

The code

Is available on my GitHub. Hopefully nothing too complicated; random.choice and collections.defaultdict proved rather helpful. I hadn’t touched Markov Chains since proving things about them in Randomized Algorithms class, so it was good to know that with a bit of clever Python you could write one in a few dozen lines. For reference, mine was an ‘order-1’ (i.e., only look at one previous character) chain.
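For a flavor of how random.choice and collections.defaultdict fit together, here is a minimal order-1 sketch. It is not the code from the repository: the `^` and `$` sentinels, the `max_length` guard, the function names, and the sample names are assumptions for illustration. The trick is that each follower is appended to a list once per occurrence, so a plain random.choice over that list is already weighted by frequency.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # assumed sentinels for start and end of a name


def build_chain(names):
    """Map each character to a list of every character that followed it."""
    chain = defaultdict(list)
    for name in names:
        padded = START + name.lower() + END
        for current, following in zip(padded, padded[1:]):
            # Duplicates are kept on purpose: they make common followers likelier.
            chain[current].append(following)
    return chain


def generate_name(chain, max_length=20):
    """Walk the chain from START until END is drawn or max_length is reached."""
    letters = []
    current = START
    while len(letters) < max_length:  # guard added here so a name cannot run forever
        current = random.choice(chain[current])
        if current == END:
            break
        letters.append(current)
    return "".join(letters).capitalize()


# Hypothetical sample; a real run would use a much larger list of names.
names = ["Charlie", "Chloe", "Carl", "Cora", "Marin", "Jason"]
chain = build_chain(names)
print(generate_name(chain))
```

Because only the previous character is consulted at each step, a name can end after a single letter or wander for a long time, which matches the mix of very short and very long results described above.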

See also a discussion on Markov Chain implementations in Programming Pearls.

Possible extensions

  • Favorite or RT the baby names you like, and I’ll put up a leaderboard of the most popular ones.
  • Apply the same techniques (and same code) to startup names, using Crunchbase: Markov 2.0.

If you’ve got a Twitter bot missing in your life, follow @markovbaby on Twitter. Or follow me. That would be cool too.

Tags: #technical