Rails pattern: trim spaces on input

Problem: Your Rails application accepts user input for a number of models. For many or most of these fields, leading and trailing spaces are a significant inconvenience — they cause problems for your validators (email address, phone number, etc.) and they cause normalization and uniqueness problems in your database.

Solution: Just as the Rails ActiveRecord class uses methods like belongs_to and validates_format_of to define model relationships and behaviors, create a new class method to express trimming behavior. There are a number of ways to do this; I will present one possibility that I have used in my own code. I created a file lib/trimmer.rb with the following contents:

module Trimmer
  # Make a class method available to define space-trimming behavior.
  def self.included base
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Register a before-validation handler for the given fields to
    # trim leading and trailing spaces.
    def trimmed_fields *field_list
      before_validation do |model|
        field_list.each do |n|
          model[n] = model[n].strip if model[n].respond_to?('strip')
        end
      end
    end
  end
end

Then I write the following in my models:

require 'trimmer'
class MyModel < ActiveRecord::Base
  include Trimmer
  . . .
  trimmed_fields :first_name, :last_name, :email, :phone
  . . .
end

While this makes the behavior available to particular models explicitly, you may prefer to make this behavior available to all of your models implicitly. In that case, you can extend the ActiveRecord::Base class behavior by adding the following to config/environment.rb:

require 'trimmer'
class ActiveRecord::Base
  include Trimmer
end

If you do this, the trimmed_fields class method will be available to all of your models.

My experience with Django and Rails

I’ve had the opportunity to work on both Django and Rails frameworks recently as part of one project.  The core application for the church administrative tools that I am working on is written in Rails, while the church guest follow-up application that I am responsible for is written in Django.  Why two separate stacks?  GuestView began its life independently from Gospel Software, and I chose Django there because of my familiarity with Python.  Three developers are sharing responsibility for the core of Gospel Software, however, and we chose Rails as the most reasonable lingua franca.

What follows are my personal opinions and observations.  These are mostly aesthetic or other value judgments, and I offer them simply for your consideration.

Language

I’ve used the Python programming language for a number of years and like it a lot.  I particularly enjoy its functional aspects, although lately I’ve become more of a fan of using list comprehensions and generator expressions wherever possible compared to map() and filter() with lambdas.  Compared to Python, Ruby has much more powerful functional capabilities, although some things don’t feel natural to me (Ruby’s design choice to not require parentheses to denote function invocation means that you must use .call to call a lambda, which feels clunky).  There are also some cases in Ruby where choosing one of several alternative forms of an expression can have a significant impact on your performance.  Lambdas seem particularly costly in Ruby as of version 1.8.
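As a small illustration of that preference, here is a sketch that computes the same values three ways. The numbers list and the variable names are made up for the example.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Functional style: map() and filter() with lambdas.
squares_functional = list(map(lambda n: n * n,
                              filter(lambda n: n % 2 == 0, numbers)))

# List comprehension: same result, with the filter and transform in one expression.
squares_comprehension = [n * n for n in numbers if n % 2 == 0]

# Generator expression: values are produced lazily and consumed by sum(),
# so no intermediate list is built.
total = sum(n * n for n in numbers if n % 2 == 0)
```

All three forms visit only the even numbers; the comprehension and generator versions avoid the nested lambdas.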

Overall I think the languages are fairly on par.  Right now I prefer Python for aesthetic rather than technical reasons.  As I grow in familiarity with Ruby, and as it matures and its performance improves I think I may eventually grow to prefer it.

Object-Relational Mapping (ORM)

The Django ORM is very powerful and you can express complicated queries very efficiently using it.  Django queries are not executed until they are actually used, so you can construct your queries piecemeal, which helps in writing readable code.  Django also allows you some flexibility with adding custom SQL to your queries, but for anything complicated I’ve found that I have to break down and write my own SQL.

Rails 2.1 introduced the ActiveRecord named_scope functionality.  Prior to this, Rails lagged significantly behind Django’s expressive power for query construction, but named_scope pretty much levels the playing field.  And for complicated queries, which you will surely face in any real-world project as you seek to tweak performance, ActiveRecord gives you a degree of control over your SQL that really puts Django to shame.

Both Django and Rails seem to have adequate support for PostgreSQL, my database of choice.

URLs

Django lets you express your URLs using regular expressions; Rails accomplishes this using routes.   I personally prefer Django’s method, but both work well enough.

Templates

While Rails’ Embedded Ruby allows you to include arbitrary code in your templates, Django’s template engine is much more spartan.  It provides ways of getting at variables passed to the template, including objects, dictionaries, lists, and even methods.  It also has some simple control structures, though they do not cover the full expressive power of Python.  I yearned for a more powerful template language in Django at first.  But I found over time that the discipline of a simple template language was helpful to me, forcing me to move any complicated behaviors to the controller (or “view” as Django calls it), which was in most cases the right thing to do anyway.

There are still some areas where I think the Django template language is lacking.  However, there is an open-source alternative to the Django template engine that is similar but sufficiently more powerful to meet my needs: Jinja2.

For me it is a toss-up between Embedded Ruby and Jinja2.

Performance

I suspect it’s common knowledge that Rails has a little ways to go in performance.  For our own purposes, I didn’t find too much difference in time measurements between Django and Rails.  However, Rails clearly has a much larger memory footprint than Django.

I was surprised to learn that even with a FastCGI or WSGI model, Django still opens and closes a database connection for each request.  While there may be technical reasons that the Django architecture requires this, it was still a surprise to me.  Django performance still seems on par with Rails in spite of this.  Interestingly, having Django use pgpool to connect to PostgreSQL didn’t improve my performance at all, perhaps because my application and database are currently located on the same host.

Console

Both Django and Rails allow you to run a REPL session for your application.  The Rails script/console command beats out Django hands-down, because Rails’ internal magic automatically imports pretty much everything you need.  In Django you still need to import any models or framework modules before you can use them.

Debugging

The Rails built-in log is enabled out of the box and is very handy.  Django provides logging functionality but you have to do a little extra work to enable it.  Rails wins out on logging.  Django is better at in-browser rendering of exception tracebacks.  Overall, the handiness of its logging means a slight win for Rails here for me.

Admin Application

Django’s admin application is truly its crown jewel.  If you need a private admin interface to your web application, Django will give you a very attractive and powerful interface almost entirely for free.  I’m not aware of any equivalent for Rails that even comes close to this.

Deployment

I’ve deployed Django using FastCGI and Rails using Mongrel.  Right now I am using Nginx to proxy to Mongrel, and to connect directly to the Django FastCGI instance.  Neither Django nor Rails seems to have a unique advantage or disadvantage in deployment.

Summary

I’ve spent more cumulative time with Django than with Rails, so I feel subjectively more at home with Django.  If I were going to write a small toy project, I’d choose Django mainly for ease and efficiency.  In fact, I took this route for the meal and potluck scheduler application that I recently wrote for Google App Engine.  GAE has many similarities with Django, and even allows you to run much of the Django stack on it.

However, for larger projects my current framework of choice is Rails.  With the named_scope functionality in Rails 2.1, ActiveRecord is finally on par with Django’s ORM.   And for any complicated queries ActiveRecord is superior to Django’s ORM.  While Django’s admin application is handy, I don’t make much use of it.  And while Rails falls slightly behind in performance and storage characteristics, I believe that Ruby and Rails will both continue to improve in this regard.

PostgreSQL foreign keys and indexes

If you’re a frequent user of MySQL, you may be familiar with the fact that all MySQL table constraints automatically create indexes for you.  This is true of the InnoDB foreign key constraints, for which “an index is created on the referencing table automatically if it does not exist.”

If you’re switching or considering a switch to PostgreSQL, you should be aware that not all PostgreSQL table constraints will automatically create indexes for you.  In PostgreSQL, a UNIQUE or PRIMARY KEY constraint on one or more fields will implicitly create an index for you.  However, a FOREIGN KEY constraint will not.

For each of your foreign key constraints, you should evaluate whether you want to create an index.  You may want to do this for optimizing your own queries, but be aware that it can also help to speed up DELETE queries on the referenced table and UPDATE queries on the referenced field.  This is because any foreign key reference must be located to enforce whatever ON DELETE and ON UPDATE behavior is in effect for the constraint.
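If you decide an index is worthwhile, the statement to create it is short. The sketch below is a small Python helper that builds one; the table and column names (orders, customer_id) and the naming scheme for the index are hypothetical, and the helper does no identifier quoting, so treat it as an illustration rather than a tool.

```python
def fk_index_sql(table, column):
    """Build a CREATE INDEX statement for a foreign key column.

    Assumes plain lowercase identifiers that need no quoting.
    """
    index_name = '%s_%s_idx' % (table, column)
    return 'CREATE INDEX %s ON %s (%s);' % (index_name, table, column)

# Hypothetical example: orders.customer_id references customers.id.
statement = fk_index_sql('orders', 'customer_id')
print(statement)
```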

SmugMug uploader

I’ve written a small Python script to upload pictures to a SmugMug gallery. I love SmugMug and use it extensively for family photos. I’m using this script for my personal use because it’s much simpler and much less of a resource hog than a browser-based uploader, and also because it was a fun exercise to try out the SmugMug API. You can run this script as follows to upload one or more files:

python upload.py gallery-name picture-file-name . . .

On Windows I’ve set up a desktop shortcut pointing to the script, and I can drag and drop a pile of picture files onto the icon and it will upload away. I’ve tested it with Python 2.5 using simplejson, and with Python 2.6, which includes simplejson in its standard library as the json module. Earlier versions of Python may require you to change the import of hashlib to md5, and change the hashlib.md5() invocation to an md5.new() invocation. You’ll also need to modify the script to contain your email address and SmugMug password, and obtain a SmugMug API key for your own development use, but this is a very painless process. Here is the script:

#!/usr/bin/python

##########
# Requirements: Python 2.6 or
#               simplejson from http://pypi.python.org/pypi/simplejson
##########

EMAIL='...'
PASSWORD='...'

##########
APIKEY='...'
API_VERSION='1.2.2'
API_URL='https://api.smugmug.com/services/api/json/1.2.2/'
UPLOAD_URL='http://upload.smugmug.com/photos/xmlrawadd.mg'

import sys, re, urllib, urllib2, urlparse, hashlib, traceback, os.path
try    : import json
except : import simplejson as json

if len(sys.argv) < 3 :
  print 'Usage:'
  print '  upload.py  album  picture1  [picture2  [...]]'
  print
  sys.exit(0)

album_name = sys.argv[1]
su_cookie  = None

def safe_geturl(request) :
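  global su_cookie
  # Fallback so the final "return result" is defined even if no response was ever parsed
  result = {'stat': 'fail'}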

  # Try up to five times
  for x in range(5) :
    try :
      response_obj = urllib2.urlopen(request)
      response = response_obj.read()
      result = json.loads(response)

      # Test for presence of _su cookie and consume it
      meta_info = response_obj.info()
      if meta_info.has_key('set-cookie') :
        match = re.search(r'(_su=\S+);', meta_info['set-cookie'])
        if match and match.group(1) != "_su=deleted" :
          su_cookie = match.group(1)
      if result['stat'] != 'ok' : raise Exception('Bad result code')
      return result
    except :
      if x < 4 :
        print "  ... failed, retrying"
      else :
        print "  ... failed, giving up"
        print "  Request was:"
        print "  " + request.get_full_url()
        try :
          print "  Response was:"
          print response
        except :
          pass
        traceback.print_exc()
        #sys.stdin.readline()
        #sys.exit(1)
        return result

def smugmug_request(method, params) :
  global su_cookie

  paramstrings = [urllib.quote(key)+'='+urllib.quote(params[key]) for key in params]
  paramstrings += ['method=' + method]
  url = urlparse.urljoin(API_URL, '?' + '&'.join(paramstrings))
  request = urllib2.Request(url)
  if su_cookie :
    request.add_header('Cookie', su_cookie)
  return safe_geturl(request)

result = smugmug_request('smugmug.login.withPassword',
                         {'APIKey'       : APIKEY,
                          'EmailAddress' : EMAIL,
                          'Password'     : PASSWORD})
session = result['Login']['Session']['id']

result = smugmug_request('smugmug.albums.get', {'SessionID' : session})
album_id = None
for album in result['Albums'] :
  if album['Title'] == album_name :
    album_id = album['id']
    break
if album_id is None :
  print 'That album does not exist'
  sys.exit(1)

for filename in sys.argv[2:] :
  data = open(filename, 'rb').read()
  print 'Uploading ' + filename
  upload_request = urllib2.Request(UPLOAD_URL,
                                   data,
                                   {'Content-Length'  : len(data),
                                    'Content-MD5'     : hashlib.md5(data).hexdigest(),
                                    'Content-Type'    : 'none',
                                    'X-Smug-SessionID': session,
                                    'X-Smug-Version'  : API_VERSION,
                                    'X-Smug-ResponseType' : 'JSON',
                                    'X-Smug-AlbumID'  : album_id,
                                    'X-Smug-FileName' : os.path.basename(filename) })
  result = safe_geturl(upload_request)
  if result['stat'] == 'ok' :
    print "  ... successful"

print 'Done'
# sys.stdin.readline()

I am donating this script to the public domain. You are welcome to use and modify it as you please without conditions. I’d appreciate hearing about your experience with this script or any changes and improvements you’ve made; please leave a comment. Thanks!
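For Python versions that predate hashlib, the import change mentioned before the script could be handled with a fallback instead of a manual edit. This is only a sketch; the name md5_new is my own and does not appear in the script.

```python
try:
    import hashlib
    md5_new = hashlib.md5      # Python 2.5 and later
except ImportError:
    import md5                 # older Pythons ship a separate md5 module
    md5_new = md5.new

# Either way, md5_new(data).hexdigest() yields the Content-MD5 value.
digest = md5_new('abc'.encode('ascii')).hexdigest()
print(digest)
```

With this in place, the upload loop would call md5_new(data) where it currently calls hashlib.md5(data).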

Update 2010-07-20

Since I first posted this, I’ve updated it as follows:

  1. Add a Content-Type header of ‘none’. This is to work around a bug in the SmugMug API.
  2. Use basename() to send only the file’s basename for X-Smug-FileName.
  3. Rewrite safe_geturl() to loop up to five times if the upload attempt fails. I’ve found that uploading is surprisingly unreliable, and re-attempting the upload generally works fine.
  4. Add a commented call to readline() at the end of the script. In my case, I run my script by dragging files onto an icon on my Windows desktop, which causes it to run in a DOS window and vanish when done. If you uncomment this line, it will wait for you to press Enter when it is done uploading. You’ll be able to see any files that weren’t uploaded successfully.
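The retry idea from item 3 can be shown in isolation. The sketch below is a generic helper, not the script’s safe_geturl(): it re-raises the last error instead of printing diagnostics, and the names retry and flaky_upload are invented for the example.

```python
import time

def retry(action, attempts=5, delay=0.0):
    """Call action() until it succeeds or the attempts run out.

    Returns action()'s result, or re-raises the last error.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except Exception as error:
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay)   # optional pause before the next try
    raise last_error

# Simulated upload that fails twice and then succeeds.
calls = {'count': 0}

def flaky_upload():
    calls['count'] += 1
    if calls['count'] < 3:
        raise IOError('simulated upload failure')
    return {'stat': 'ok'}

result = retry(flaky_upload)
```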

Update 2010-11-28

SmugMug made a recent change to their API’s login behavior which broke this script. While the new login behavior is not documented in the API docs, the fix is apparently to use a session cookie along with the session ID. While it’s a bit of a kludge, I’ve updated the script above to save this cookie in a global variable and submit it on subsequent requests.

Update 2011-06-24

I’ve fixed a bug in the script causing it to wrongly report a failure for certain requests that don’t send back the session cookie. The fix involves testing whether a set-cookie header was returned before accessing the header.
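The cookie handling from the last two updates can be sketched on its own. This version assumes the response headers are available as a plain dict with lowercase keys, and it uses a slightly stricter pattern than the script does; extract_su_cookie is a name made up for the example.

```python
import re

def extract_su_cookie(headers):
    """Return the '_su=...' pair from a Set-Cookie header, or None."""
    set_cookie = headers.get('set-cookie')
    if set_cookie is None:
        # Some responses do not send the cookie back at all.
        return None
    match = re.search(r'(_su=[^;\s]+)', set_cookie)
    if match and match.group(1) != '_su=deleted':
        return match.group(1)
    return None

cookie = extract_su_cookie({'set-cookie': '_su=abc123; path=/'})
```

Checking for the header first is what avoids the false failure report described above.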

Update 2013-10-01

Version 1.2.0 of the SmugMug API has stopped working, so I have updated the script to use version 1.2.2 of the API.

Automatically minify your Javascript and CSS

For best performance, it is recommended that you minify the Javascript and CSS that your web application uses.  What this involves is removing all unnecessary whitespace and comments.  So, for example, the following CSS:

body
{
  margin: 5px 10px 10px 10px;
  font-family: arial;
}

would look like this after being minified:

body{margin:5px 10px 10px 10px;font-family:arial;}

And similarly for Javascript. It is common to configure your server to perform GZip compression on files that it serves, including Javascript and CSS, and this can significantly reduce the time that it takes for browsers to load your pages. But minification when used with GZip usually helps to compress the files just a little bit further. And unlike GZip, which compresses the file only as it is sent over the internet, minification shrinks the file as it is seen by the browser. This allows the browser to parse it faster; additionally, smaller files are more likely to be cached by the browser.
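To make the whitespace-and-comment stripping concrete, here is a deliberately naive Python sketch applied to the CSS example above. It is not how jsmin or cssmin work internally, and it would mangle some valid CSS (for instance, a descendant selector followed by a pseudo-class such as "a :hover"), so use a real minifier in practice. The function name is made up for the example.

```python
import re

COMMENT = re.compile(r'/\*.*?\*/', re.S)

def naive_css_minify(css):
    css = COMMENT.sub('', css)                     # drop /* ... */ comments
    css = re.sub(r'\s+', ' ', css)                 # collapse runs of whitespace
    css = re.sub(r'\s*([{};:,])\s*', r'\1', css)   # no spaces around punctuation
    return css.strip()

EXAMPLE_CSS = """body
{
  margin: 5px 10px 10px 10px;
  font-family: arial;
}
"""

minified = naive_css_minify(EXAMPLE_CSS)
print(minified)
```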

It is common to manually minify your Javascript and CSS as part of deploying your application, saving a minified copy on your server either manually or as part of an automatic deployment script. But it is also possible to create custom Apache output filters to perform the minification for you. This gives you the best of both worlds — you can edit your files directly without their being minified, but you don’t have to engineer a minification process for when you deploy your application. Here’s how to do it, first for Javascript and then for CSS.

Javascript

  1. Ensure you have the Apache mod_ext_filter extension installed.

  2. Download the jsmin.py Python script from Douglas Crockford’s website. (There are also other languages available.)  Save it in your Python installation’s site-packages folder (possibly /usr/lib/python2.x/site-packages/).

  3. Add the following lines to your main Apache config file (httpd.conf, apache2.conf, etc.):

    <IfModule mod_ext_filter.c>
      ExtFilterDefine jsmin \
                      mode=output \
                      intype=application/x-javascript \
                      outtype=application/x-javascript \
                      cmd="/usr/bin/python /usr/lib/python2.4/site-packages/jsmin.py"
    </IfModule>
    
  4. Add the following statement to the context where you would like to minify your Javascript files (you can place this in your server config, but also within a virtual host configuration, a directory directive, or even a .htaccess file if FileInfo overrides are allowed):

    AddOutputFilter jsmin js
    

    This will cause all files with extensions ending in .js to be run through the Javascript minify filter before being sent to a browser. If you have some Javascript without the .js extension, you can add additional extensions, or you can use the AddOutputFilterByType directive instead to apply the filter to any content with the application/javascript MIME type. With appropriate mod_expires directives you can cause these files to be cached for a long time by browsers, thereby ensuring that the minify filter is not run more than necessary.

For debugging purposes you should ensure that the minify filter is applied only to your production server and not to your development server. Until you have verified the correctness of your Javascript it will be harder to locate Javascript errors within minified code!

CSS

  1. Ensure you have the Apache mod_ext_filter extension installed, as above.

  2. Install the cssmin Ruby gem:

    gem install cssmin

  3. Add the following lines to your main Apache config file (httpd.conf, apache2.conf, etc.):

    <IfModule mod_ext_filter.c>
      ExtFilterDefine cssmin \
                      mode=output \
                      intype=text/css \
                      outtype=text/css \
                      cmd="/usr/bin/ruby -e 'require \"rubygems\"; require \"cssmin\"; puts CSSMin.minify(STDIN)'"
    </IfModule>
    
  4. Add the following statement to the context where you would like to minify your CSS files (you can place this in your server config, but also within a virtual host configuration, a directory directive, or even a .htaccess file if FileInfo overrides are allowed):

    AddOutputFilter cssmin css
    

    This will cause all files with extensions ending in .css to be run through the CSS minify filter before being sent to a browser. If you have some CSS without the .css extension, you can add additional extensions, or you can use the AddOutputFilterByType directive instead to apply the filter to any content with the text/css MIME type. With appropriate mod_expires directives you can cause these files to be cached for a long time by browsers, thereby ensuring that the minify filter is not run more than necessary.

Unobtrusive Javascript: Self-labeling text inputs

Some web sites have self-labeling text input boxes; for example, see the text box at the top right of the page on memberhub. When these self-labeling form fields are empty, they contain helpful text that labels or further explains their purpose, such as “Search” or “Enter your favorite color.” As soon as you click on these fields, the help text vanishes and you can type in a value.

Using unobtrusive Javascript (see introduction), we can add behavior to a text input element to automatically label it with this sort of help text contained within the element. We will take advantage of the title attribute of input elements to do this. The title attribute is already widely used in many browsers to provide help text in a tool-tip when you hover over an element with your mouse, and we will steal this title text from any input text field to use for self-labeling purposes. Here is a script that accomplishes that:

autolabel.js

Event.onReady(function() {
  $$('input[type="text"][title]').each(function(inputElement) {
    var e = inputElement;
    var color = e.getStyle('color');
    var fontStyle = e.getStyle('fontStyle');

    if(e.value == e.title) {            // FF reload behavior.
      e.value = '';
    }

    var blank = !$F(e);

    var blurHandler = function(ev) {
      blank = !$F(e);
      if(blank) {
        e.setStyle({ 'color'     : 'darkgray',
                     'fontStyle' : 'italic' });
        e.value = e.title;
      }
    }
    e.observe('focus', function(ev) {
      if(blank) {
        if($F(e) == e.title) {
          e.value = '';
        }
        e.setStyle({ 'color'     : color,
                     'fontStyle' : fontStyle });
      }
    });
    e.observe('blur', blurHandler);
    blurHandler(null);

    Event.observe(e.form, 'submit', function(ev) {
      if(blank) {
        e.value = '';
      }
    });
  });
});

Here’s how to use it. Note that you need to include the Prototype and Low Pro Javascript libraries:

example.html

<script src="/js/prototype.js" type="text/javascript"></script>
<script src="/js/lowpro.js" type="text/javascript"></script>
<script src="/js/autolabel.js" type="text/javascript"></script>
. . .
<input type="text" name="color" title="Enter your favorite color" />

How it works

The script first searches for all input elements in the document that are of type “text” and also have the title attribute. For each of these, it executes a function to add the self-labeling behavior to the element.  This function does a number of things:

  1. When you refresh a page in Firefox, as long as you don’t do a full reload, Firefox preserves whatever values were previously in form fields.  The function tests if the current value of the form field is equal to the title attribute (meaning that someone refreshed the page while the self-labeling description was present in the field), and if so, the function clears the form field.
  2. For all other purposes, the function uses the variable blank to track whether the field is blank and should have the label inserted.  We do this rather than comparing the field’s content to the title attribute, in case the user actually types in the value of the title attribute (for something like “Search”).
  3. The function adds a handler for the focus event (cursor enters field).  If the field is blank, it clears the label text so the user can enter their own text, and restores the field’s color and font style to their original values.  Otherwise, the field contains user text so it is left unchanged.
  4. The function adds a handler for the blur event (cursor leaves field).  If the field is blank, it remembers the fact that it is blank, sets the style of the field so that the text is italic and dark gray (you may modify this as you wish), and then inserts the title text.
  5. The function adds a handler for the submit event on the field’s form.  If the form is submitted and the field is blank, it will be cleared so that the correct value is submitted for the form contents.

Improving Firefox session restore

I use Firefox session restore regularly.  This saves all my open tabs whenever Firefox closes (or crashes) and restores them when it reopens.  Unfortunately, there are still cases where Firefox will forget my open tabs.  This happens, for example, when I close my main window, thinking that I’m closing Firefox, but then I realize that there is still a pop-up window open.  When I restart Firefox, that popup will be restored instead of my old tab list!

I have not yet found a way to recover the tabs.  What a sinking feeling this leaves you with!  My open tabs are sometimes the result of several weeks’ worth of browsing.  I’ve grown better at saving links using del.icio.us rather than holding them in open tabs, and obviously I need to work harder at this.  But I often still have up to 20 tabs open at any given time, waiting to be read.

This happened to me again today.  So I resolved one more time to try to find a solution.  And I did!  I found and installed Tab Mix Plus (I installed the latest TMP development build).  This Firefox plugin remembers your sessions from more than just the most recently closed window.  So if you close your main window first but find a popup lingering, you can close the popup without worrying that you have lost the tabs from your main window.  What a relief!

Here are the settings that I chose in TMP:

  1. When you first enable Tab Mix Plus, it asks you whether you want to use the Firefox session restore feature.  The TMP session restore is much better than Firefox’s, so choose No.
  2. Go to Tools|Tab Mix Plus Options to choose other options.
    1. Click the “Events” icon.  On the “Tab Features” tab, un-check “Ctrl-Tab navigates tabs in the most recently used order.”  This will set Ctrl+Tab to behave the way it usually does in Firefox.
    2. Click the “Session” icon.
      1. Make sure that “Use Firefox’s built-in Session Restore feature” is not checked.
      2. Under “When Browser Starts”, select “Restore”.
      3. Under “When Browser Exits”, make sure “Save Session” is selected.

Tab Mix Plus provides additional control over your browser sessions.  Using the Tools|Session Manager menu, you can manually save sessions, and also restore older sessions.