Richard Scarry and hexadecimal

When you read a hexadecimal number out loud, how do you pronounce the letters?

At my workplace, I’ve grown used to our custom of pronouncing the letters using the Joint Army/Navy Phonetic Alphabet standardized in 1941. The letter digits are pronounced Able, Baker, Charlie, Dog, Easy and Fox. Under this scheme, the hexadecimal number 0x7F8D3BC0 would be pronounced “Seven Fox Eight Dog Three Baker Charlie Zero.” This was disorienting to me at first, but after eight years this is now so natural that this is how I pronounce the digits in my mind even if I’m not speaking them.

We’ve started collecting Richard Scarry’s children’s books. Richard Scarry writes with a degree of detail and whimsy that holds an adult’s interest — much like old-school Sesame Street. (How far it has fallen — modern-day Sesame Street is much too postmodern, pluralistic, saccharine and juvenile for my taste. I console myself by searching for old Sesame Street clips on Youtube.) Recently I was amused and pleased to discover that one of Richard Scarry’s characters is named Able Baker Charlie! What a strange juxtaposition of worlds for me — programming and children’s books.

Able Baker Charlie is a mouse. He is a baker, and assists Baker Humperdink, a pig. Despite his small size, Able Baker Charlie is capable assisting with any step of the baking process, from stoking the oven, to mixing the dough, to putting loaves in the oven, and even delivering bread around Busytown. Below you may see a picture of Able Baker Charlie ably distributing French baguettes to Louie’s Restaurant.

Richard Scarry served in the U. S. Army during World War II. No doubt this is the source of the Able Baker Charlie aptonym. It still gives me a chuckle every time we read it.

Programming Fonts

I’ve long been dissatisfied with Courier New as a programming font; I’ve found the characters to be bulky and the serifs distracting.  For the past year or so I’ve settled on the beautiful Lucida Console as my favorite font to code in.  Lucida Console is bundled with Windows XP and Windows Vista, and unfortunately it doesn’t seem to be available for free redistribution.  However, you can find some other aesthetically pleasing fixed-width programming fonts, some of which are available for free download, at Hamstu’s Typography of Code.

HT: Woot

Python generators – saving time and memory

Python version 2.2 introduced support for generator functions. A generator is a function that returns something kind of like a list. You can think of a generator as a function that gets to return more than once; each time it returns, it produces a new item in the list. Here is a simple example of a generator:

def mygen() :
  print "calling generator"
  for x in [1, 2, 3, 4] :
    print "yielding next value"
    yield x

for item in mygen() :
  print item

This example prints the following:

calling generator
yielding next value
1
yielding next value
2
yielding next value
3
yielding next value
4

Instead of return, we used the yield statement inside our generator. yield returns an item from the generator, but marks that point in the code so that we continue processing when we go back to the generator to fetch the next item in the pseudo-list. Notice how the generator is only called once, but the yield points are interleaved with the print statements in the calling code. Each time the generator needs to produce a new value, it picks up from the previous yield point. When the generator reaches the end of the function, no more values are produced. You cannot use return within a generator.

Let’s look at some code where the use of generators might help us. The following code instantiates a list of objects and then creates HTML to display them. We’ll assume the existence of a database API to execute queries and retrieve results:

def get_objects() :
  result = []
  query = db_execute("...")
  row = query.fetchrow()
  while not row is None :
    result.append(MyObject(row))  # Build object from DB, append to result
    row = query.fetchrow()
  return result
. . .
for object in get_objects() :
  print object.getHTML()

This code creates the entire list of objects before it can print any of them. There are two problems with this — first, there is a huge delay to create all of the objects before any progress is made in printing; this means that the user’s browser has no partial results to display. Second, all of the objects must be held in memory at the same time; if there are many objects, this can cause significant overhead.

Generators allow us to attack both of these problems. Since a generator produces items one at a time, on demand, it avoids both of these problems. We don’t have to wait to construct all of the objects in the list before we use the first one. And once we are done using an object, the Python garbage collector is now free to immediately clean it up. Here’s how we might rewrite the above code to use generators:

def get_objects() :
  query = db_execute("...")
  row = query.fetchrow()
  while not row is None :
    yield MyObject(row)    # Build object from DB, yield to caller
    row = query.fetchrow()
. . .
for object in get_objects() :
  print object.getHTML()

With a small change we have now significantly improved our code’s memory footprint — all of the objects do not need to be held in memory at the same time. And we are now producing the output for each object as we create it, without needing to wait for all the objects to be instantiated first. This is a significant improvement!

We can also build a chain of generators. Let’s assume that we need to modify the code above to optionally display only those objects that have an “active” flag set. Ordinarily, if we use Python’s filter function to accomplish this, it needs to create the entire list all at once. But if we use a generator to perform the filtering, then we can still keep our optimizations since we are only ever creating and filtering one object at a time. Here’s an example:

my_objs = get_objects()         # This returns a generator object

if display_active_only :
  def active_filter(objects) :  # A filtering generator
    for object in objects :
      if object.active :
        yield object

  my_objs = active_filter(my_objs)

for object in my_objs :
  print object.getHTML()

A generator function doesn’t produce a real list; instead, it produces a generator object that behaves like something called an iterator. You can’t write either of the following statements for a generator or iterator:

print len(get_objects())
print get_objects()[3]

The for statement is smart enough to traverse a generator, and it will probably be sufficient for your needs. Perhaps you can get the count of objects by other means, such as executing an SQL COUNT request. If you absolutely need to access a generator as a list, you can coerce it to a list as follows:

objlist = list(get_objects())

But be aware that this removes all of the advantages that we’ve discussed here, since this causes all of the objects returned by the generator to be created at once and stored in the list. If you find yourself needing to do this, you should consider rewriting your code so that you don’t need to do so. Or perhaps generators aren’t the right solution for your particular problem.

Python combinations

In September 2007, Brian Adkins posted a simple Logo program to print all possible combinations of lists of items, and asked for alternatives in other languages. He provided a Ruby alternative which would translate fairly easily into Python. For my own Python implementation I decided to try a completely different approach:

prod1   = lambda elem, vec : map(lambda x : elem+' '+x, vec)
xprod   = lambda vec1, vec2 : reduce(list.__add__, map(lambda elem : prod1(elem ,vec2), vec1))
choices = lambda x : 'n'.join(reduce(xprod, x))+'n'

q = [['small', 'medium', 'large'],
     ['vanilla', 'ultra chocolate', 'lychee', 'rum raisin', 'ginger'],
     ['cone', 'cup']]

print choices(q),

Blocking ads

For awhile now I’ve been using a tool to block advertisements in my web browsing. Here’s how to install it:

  1. Do you use “Firefox” to surf the web? If not, I recommend it; you can get it here. This ad blocking tool only works with Firefox (though I understand there are ad blockers for other browsers).
  2. Once you have Firefox installed, visit Adblock Plus to install the ad blocker:
    1. Click on the “Install Adblock Plus” link.
    2. A bar will pop up indicating that the install was blocked.
    3. Click the “Options” button on the bar and enable installations for this website.
    4. Click on the “Install Adblock Plus” link again, then click on the “Install” button.
  3. When it asks you which blocking list to subscribe to, choose the “EasyList” blocking list. Once a week your computer will check this list to see if there are any new known ads to be blocked.

After you do this you should rarely see ads in Firefox.

Resume keywords

I’ve had my resume posted online and from time to time I receive a solicitation for job openings. Over time I’ve paid attention to my referral logs to see how folks find my resume. Almost every search includes the following keywords, and almost always they are required keywords:

+objective +education +experience

A large majority of searches also exclude the following keywords, which was a little surprising to me. I imagine that these keywords are meant to exclude job listings. If you post your resume online you should take care not to use these words!

-job -career

I have also seen the following keywords and exclusions. They are less common than the above, but it is worth considering them:

  • +senior +programmer
  • +engineer developer
  • engineer developer
  • intitle:Resume OR inurl:Resume
  • “application programmer” OR “application engineer”
  • -sample -example
  • -your -we
  • -opportunities
  • -apply -submit
  • -tips -guidelines
  • -interview
  • -service

It’s possible that I’m missing other common searches because my own resume doesn’t meet the search criteria! If you are aware of other common search terms please let me know.

Functional Python

A friend of mine is learning Python and was curious about functional programming, so I have written this brief tutorial. Python isn’t a full-fledged functional language, but it supports some very useful functional idioms.

It’s best to approach this tutorial by programming along at the Python interactive prompt. Try typing everything in to see what the results are.

Introduction

Imagine you have a list of numbers and want to filter out all even numbers:

q = [1,2,5,6,8,12,15,17,20,23,24]

def is_even(x) :
  return x % 2 == 0

result = filter(is_even, q)

(Remember that the “%” operator is the modulus or remainder operator. If the remainder when a number is divided by two is zero, then the number must be even.)

If we only use is_even once, it’s kind of annoying that we have to define it. Wouldn’t it be nice if we could just define it inside of the call to filter? We want to do something like “q = filter(x % 2 == 0, q)“, but that won’t quite work, since the “x % 2 == 0” is evaluated before the call to filter. We want to pass a function to filter, not an expression.

We can do this using the lambda operator. lambda allows you to create an unnamed throw-away function. Try this:

q = [1,2,5,6,8,12,15,17,20,23,24]
result = filter(lambda x : x % 2 == 0, q)

lambda tells Python that a function follows. Before the colon (:) you list the parameters of the function (in this case, only one, x). After the colon you list what the function returns. This function doesn’t have a name, and Python disposes of it as soon as filter is done with it.

You can, however, assign the function to a variable to give it a name. The following two statements are identical:

def is_odd(x) :
  return x % 2 == 1
 
is_odd = lambda x : x % 2 == 1

map

Here’s another example. The map function applys a function to every item in a list, and returns the result. This example adds 1 to every element in the list:

result = map(lambda x : x + 1, [1,2,3,4,5])

(At this point, result holds [2,3,4,5,6].)

The map function can also process more than one list at a time, provided they are the same length. This allows you to do things like add all of the elements in a pair of lists. Note that our lambda here has two parameters:

result = map(lambda x,y : x+y, [1,2,3,4,5], [6,7,8,9,10])

(At this point, result holds [7, 9, 11, 13, 15].)

reduce

Python also provides the reduce function. This is a bit more complicated; you provide it a list and a function, and it reduces that list by applying the function to pairs of elements in the list. An example is the best way to understand this. Let’s say we want to find the sum of all the elements in a list:

sum = reduce(lambda x,y : x+y, [1,2,3,4,5])

reduce will first apply the function to 1,2, yielding 3. It will then apply the function to 3,3 (the first 3 is the result of adding 1+2), yielding 6. It will then apply the function to 6,4 (the 6 is the result of adding 3+3), yielding 10. Finally, it will apply the function to 10,5, yielding 15. The result stored in sum is 15.

Similarly, we can find the cumulative product of all items in a list. The following example stores 120 (=1*2*3*4*5) in product:

product = reduce(lambda x,y : x*y, [1,2,3,4,5])

Conditionals

Beginning in Python 2.5 you can even express conditions within your lambdas, using Python’s conditional expressions. So, for example:

reciprocals = map(lambda x : 1.0/x if x !=  0 else None, [0, 1, 2, 3, 4, 5])

At this point, reciprocals holds the list [None, 1.0, 0.5, 0.333..., 0.25, 0.2]. The expression “1/x if x != 0 else None” is the conditional expression, and it is used to prevent division by zero. If x is not 0, then the result is 1/x, but otherwise it is None.

Background

One of the advantages of lambda-functions is that they allow you to write very concise code. A result of this, however, is that the code is dense with meaning, and it can be hard to read if you are not accustomed to functional programming. In our first example, filter(is_even, q) was fairly easy to understand (fortunately, we chose a descriptive function name), while filter(lambda x : x % 2 == 0, q) takes a little longer to comprehend.

If you are a masochistic mathematical geek, you’ll probably enjoy this (I do). If not, you might still prefer to break things into smaller pieces by defining all of your functions first and then using them by name. That’s ok! Functional programming isn’t for everyone.

Another advantage of functional programming is that it corresponds very closely to the mathematical notion of functions. Note that our lambda-functions didn’t store any results into variables, or have any other sort of side effect. Functions without side effects cause fewer bugs, because their behavior is deterministic (this is related to the dictum that you aren’t supposed to use global variables). It is also much easier to mathematically prove that such functions do what you think they are doing. (Note that it’s still possible to write a lambda function that has side effects, if the lambda function calls another function that has side effects; for example lambda x : sys.stdout.write(str(x)) has the side effect of printing its parameter to the screen.)

Gödel, Escher, Bach: an Eternal Golden Braid

Hofstadter, Douglas. Gödel, Escher, Bach: an Eternal Golden Braid. New York: Basic Books, 1999.

This book is an excellent and fun, if lengthy, romp through art (visual, literary, and musical), mathematics, logic, provability and computability, linguistics, cognitive science and artificial intelligence, and more. Hofstadter cleverly explores a myriad of amazing connections between all these fields. And while he ends up drawing no substantive conclusions about his final hypothesis of emergent intelligence, the journey is no less exciting.

I disagree with Hofstadter’s opinion of the nature of intelligence. But I thoroughly enjoyed this book for a number of reasons. First, despite its imposing size, it is accessible; Hofstadter presents mathematical proofs in an easily understood way, using examples, analogies, and much explanatory prose. Second, despite its imposing size, it is delightfully fun; this book is brimming with humor, wit, cleverness, and exciting coincidences. Lastly, it is an excellent introduction to a broad variety of topics.

I first read this book in eighth grade and deeply enjoyed it. In part this book served as my introduction to the exciting fields of logic, mathematics, and computer science.

Python odd word problem

I crafted this compact solution to Dijkstra’s OddWordProblem:

from sys import stdin, stdout
def even(char) : stdout.write(char); return select(even, 0, "");
def odd(char)  : q = select(odd, 0, ""); stdout.write(char); return q
def select(fn, skipspace, prefix) :
  char = stdin.read(1)
  if char.isspace() and skipspace : return select(fn, 1, prefix)
  elif char.isspace()             : return 0
  elif char.isalpha()             : stdout.write(prefix); return fn(char)
  elif char == "."                : return 1
  else                            : raise "Invalid input"

if not select(even, 1, "") :
while not select(odd, 1, " ") and not select(even, 1, " ") : pass
stdout.write(".")

Python list partition

I came up with these simple solutions to RonJeffriesCodingProblem. The first uses Python 2.2’s generators, the second uses the functional idiom, and the third uses list comprehensions. The third is my favorite:

def split1(list, num) :
  for index in range(num) :
    yield list[index * len(list) / num:(index + 1) * len(list) / num]

def split2(list, num) :
  return map(lambda x: list[x * len(list) / num:(x+1) * len(list) / num], range(num))

def split3(list, num) :
  return [list[x*len(list)/num : (x+1)*len(list)/num] for x in range(num)]