Tuesday, August 28, 2012

jQuery/Raphael Virtual Card Punch

I've been brushing up on JavaScript and jQuery. I'm not really a web developer. These days I mostly work in the Java/Linux space deep inside the server, but sometimes a little web development is in the mix. I was working on a future post about backing up Flickr images and data to a static HTML slideshow. While setting up the slideshow, it occurred to me that I could use jQuery to create a Virtual Card Punch that would run entirely in the browser. So back to punch cards one final time.



A minimal version of the page is available here for anyone who wishes to crawl around the source and find out exactly how it was coded.
Programming using a card-punch was a noisy affair. You can hear what it was like at http://ibm-1401.info; listen to http://ibm-1401.info/IBM026KeyPunch.mp3.

The card-punch could also be programmed to do things like duplicate a deck. This was achieved by punching instructions onto a card and wrapping the card around the card-punch's program-cylinder.  The cylinder was installed in the card-punch, and as it turned, little cogs engaged with the holes and bumped little levers, signalling instructions to the card-punch. Duplicating a deck took the noise to a whole new level.


The following JavaScript libraries were used:
  • Raphael - SVG (Scalable Vector Graphics) JavaScript library.
  • jQuery - the write less, do more, JavaScript library.
  • Fancybox - floating lightbox.
The jQuery library grabs and edits the input text as it is typed. The Raphael library dynamically adds SVG elements to the page. The Fancybox library creates a popup window large enough to produce a scannable image of a card. I've run the code in Chrome, Firefox and IE8. IE8 seems a little buggy, but it's OK once you start typing. I used Raphael and SVG to learn a bit about something new. For true portability it would probably be better to use a table of precomputed images, one per character. That would also be very easy to code, and jQuery could be used to dynamically update the visible images. Yet another approach can be seen at www.kloth.net - that site uses an HTTP server to generate JPGs or PNGs for a wide variety of card encodings.
&-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ:#@'="`.<(+|!$*);^~,%_>?


Wednesday, August 1, 2012

Punch Card Reader - the FAQ


Can I have a card to scan?
You could use a screen grab from my Javascript Virtual Cardpunch. Or you can use the following Python script, punchcardgen.py, which generates card images from text read from stdin (how about a t-shirt with a message punched into it?):
#!/usr/bin/env python
#
# punchcardgen.py 
#
# Copyright (C) 2011: Michael Hamilton
# The code is GPL 3.0(GNU General Public License) ( http://www.gnu.org/copyleft/gpl.html )
#
import Image
import sys

CARD_COLUMNS = 80
CARD_ROWS = 12

# found measurements at http://www.quadibloc.com/comp/cardint.htm
CARD_WIDTH = 7.0 + 3.0/8.0 # Inches
CARD_HEIGHT = 3.25 # Inches
CARD_COL_WIDTH = 0.087 # Inches
CARD_HOLE_WIDTH = 0.055 # Inches IBM, 0.056 Control Data
CARD_ROW_HEIGHT = 0.25 # Inches
CARD_HOLE_HEIGHT = 0.125 # Inches
CARD_TOPBOT_MARGIN = 3.0/16.0 # Inches at top and bottom
CARD_SIDE_MARGIN = 0.2235 # Inches on each side

DARK = (0,0,0)
BRIGHT = (255,255,255)  # RGB white: used for the background and the punched holes
REDUCE_IN_SIZE=8

IBM_MODEL_029_KEYPUNCH = """
    /&-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ:#@'="`.<(+|!$*);^~,%_>? |
12 / O           OOOOOOOOO                        OOOOOO             |
11|   O                   OOOOOOOOO                     OOOOOO       |
 0|    O                           OOOOOOOOO                  OOOOOO |
 1|     O        O        O        O                                 |
 2|      O        O        O        O       O     O     O     O      |
 3|       O        O        O        O       O     O     O     O     |
 4|        O        O        O        O       O     O     O     O    |
 5|         O        O        O        O       O     O     O     O   |
 6|          O        O        O        O       O     O     O     O  |
 7|           O        O        O        O       O     O     O     O |
 8|            O        O        O        O OOOOOOOOOOOOOOOOOOOOOOOO |
 9|             O        O        O        O                         | 
  |__________________________________________________________________|"""

translate = None
if translate == None:
    translate = {}
    # Turn the ASCII art sideways and build a hash look up for 
    # column values, for example:
    #   A:(O, , ,O, , , , , , , , )
    #   B:(O, , , ,O, , , , , , , )
    #   C:(O, , , , ,O, , , , , , )
    rows = IBM_MODEL_029_KEYPUNCH[1:].split('\n');
    rotated = [[ r[i] for r in rows[0:13]] for i in range(5, len(rows[0]) - 1)]
    for v in rotated:
        translate[v[0]] = tuple(v[1:])

if __name__ == '__main__':

    scale = 1000
    margin = 200
    card_x_pixels = int(CARD_WIDTH * scale)
    card_y_pixels = int(CARD_HEIGHT * scale)

    img_size = (2 * margin + card_x_pixels, 2 * margin + card_y_pixels)

    side_margin_pixels = int(CARD_SIDE_MARGIN * scale)
    col_width_pixels = int(CARD_COL_WIDTH * scale)

    top_bot_margin = int(CARD_TOPBOT_MARGIN * scale)
    row_height_pixels = int(CARD_ROW_HEIGHT * scale)

    hole_width = int(CARD_HOLE_WIDTH * scale)
    hole_height = int(CARD_HOLE_HEIGHT * scale)

    card_area = (margin, margin, margin + card_x_pixels, margin + card_y_pixels)
    
    proto_img = Image.new('RGB', img_size, BRIGHT)
    proto_pix = proto_img.load()
    proto_img.paste(DARK, card_area)
    
    # Remove the top left corner (don't know the standard for this - guess)
    i = 0
    for x in xrange(margin, margin + side_margin_pixels):
        for y in xrange(margin, margin + top_bot_margin + hole_height - i):
            proto_pix[x,y] = BRIGHT
        i += 2
        
    card_number = 1
    for line in sys.stdin:
        img = proto_img.copy()
        x = margin + side_margin_pixels
        for char in line:
            if char in translate:
                values = translate[char] 
                y = margin + top_bot_margin
                for row in xrange(0, CARD_ROWS):
                    if values[row] == 'O':
                        img.paste(BRIGHT, (x, y, x + hole_width, y + hole_height))
                    y += row_height_pixels
            x += col_width_pixels
            if x > margin + card_x_pixels:
                break
        img = img.resize((img_size[0]/REDUCE_IN_SIZE, img_size[1]/REDUCE_IN_SIZE))
        filename =  "%010.10d.jpg" % ( card_number )
        print filename, line
        img.save(filename)
        card_number += 1

The script has no command line options. Just feed it uppercase text, for example:
 
% python punchcardgen.py
PROGRAM FORTRAN; WRITE(*,*)'HELLO WORLD'; END PROGRAM
In this case the script produces a single image:
The full-sized image can be rescanned to text by using my original punchcard script, for example:
% python punchcard.py 0000000001.jpg > prog.f90
% gfortran prog.f90                    
% ./a.out 
 HELLO WORLD
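
To make the encoding behind these card images a little more concrete, here is a minimal sketch of the mapping for digits and uppercase letters only. It rebuilds that part of the IBM 029 table from the zone/digit rule instead of rotating the ASCII art as punchcardgen.py does. The names build_alnum_table and column_pattern are made up for this illustration and are not part of the script, and the punctuation characters are deliberately left out.

```python
# Row labels from the top of the card to the bottom, as in the script's table.
ROW_LABELS = ("12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9")


def build_alnum_table():
    """Map each digit and uppercase letter to the row labels it punches."""
    table = {}
    for digit in "0123456789":
        table[digit] = (digit,)          # a digit punches only its own row
    zones = (("12", "ABCDEFGHI", 1),     # zone 12 plus digit rows 1-9
             ("11", "JKLMNOPQR", 1),     # zone 11 plus digit rows 1-9
             ("0", "STUVWXYZ", 2))       # zone 0 plus digit rows 2-9
    for zone, letters, first_digit in zones:
        for offset, letter in enumerate(letters):
            table[letter] = (zone, str(first_digit + offset))
    return table


ALNUM_TABLE = build_alnum_table()


def column_pattern(char):
    """Return a 12-tuple of 'O' (punched) or ' ' (blank) for one card column."""
    punched = ALNUM_TABLE.get(char, ())  # unknown characters leave a blank column
    return tuple("O" if label in punched else " " for label in ROW_LABELS)


if __name__ == "__main__":
    for ch in "HELLO":
        print(ch, "".join(column_pattern(ch)).replace(" ", "."))
```

The tuples have the same shape as the values stored in the script's translate dictionary, so this is only a different way of arriving at the same 12-row column patterns for the alphanumeric subset.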

If you have your own cards, you can just hold them up to an even light and take their picture. For example, you might use a monitor displaying white, or a cloudy sky. Just make sure the resulting image background is smooth and the picture is straight and square, and hold the card by the bottom corner, for example:

To get the best scan, try the python script with the -d or -i options for debug info, and use -b N to change the threshold light levels.  Use a full sized image - if the image is too small the calculations introduce errors.

Why not use an auto-feed scanner with a straight through paper path?
That would be the way to go if I wanted to spend money on it – and didn't want to learn a little electronics.  

Why not add a motor and automate the feed?
I was worried about jams on older decks of cards. I think I could build a better feed by copying my photo printer's paper feed in Lego. (possible patent violation?)  

I did consider using my photo-printer as a feeder, that would probably have worked quite well.  I would have to figure out how to collect cards as they exit the printer.

Some way along I figured I could get the job done with stuff at hand without buying anything.  Once I set that constraint, options narrowed considerably and decisions were easier to make.

There was also the case of the minicomputer with a crank fitted over the instruction single-step toggle-switch – variable speed debugging – consider my approach a homage to that earlier clever hardware hack.

Why not use an array of detectors connected to the Arduino and eliminate the camera?
That would be cool but - this is my first attempt at electronics. Advancing each column past a single column scan would seem hard to calibrate correctly. I imagine that could be solved with a grid/wheel of calibration holes moving with the card or moving with the scanner.  

If the card moved at a constant speed it might be possible to detect start and end of card, and from the timing figure out what went past and what row it belonged to. 
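
As a rough illustration of that timing idea (nothing like this was built), the sketch below assumes the card travels lengthwise past a single detector at constant speed, and that the detector reports when the leading and trailing edges of the card go by. The function name column_at and its parameters are hypothetical; the dimensions are the same ones used in the Python scripts.

```python
CARD_WIDTH = 7.0 + 3.0 / 8.0   # inches
CARD_COL_WIDTH = 0.087         # inches
CARD_SIDE_MARGIN = 0.2235      # inches on each side
CARD_COLUMNS = 80


def column_at(t, t_start, t_end):
    """Return the 0-based column under the detector at time t, or None.

    t_start and t_end are the times at which the leading and trailing
    edges of the card passed the detector (any consistent time unit).
    """
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    # Constant speed means elapsed time is proportional to distance travelled.
    inches = (t - t_start) / (t_end - t_start) * CARD_WIDTH
    column = int((inches - CARD_SIDE_MARGIN) // CARD_COL_WIDTH)
    if 0 <= column < CARD_COLUMNS:
        return column
    return None  # still in a side margin, or off the card entirely
```

In practice the speed of a gravity-fed card would not be constant, which is one reason a moving grid of calibration holes looks like the more robust option.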

Why not just read the text printed at the top of the card?
I didn't think the text would be good enough for OCR. Some cards were quite worn. I did not want to manually read and enter each card. OCR seems a tough problem compared to reading the holes.

What is the Arduino for exactly?
The Arduino stops the card, detects that a card has stopped, signals the camera to focus and shoot, then opens the servo to let the card go.  It plays a key role in keeping the cards in order, both the order of the images in the camera, and the order of the physical cards in the output bin.  It could do more, such as run a feed motor.   But really, the Card Reader is an integration with an Arduino in the mix - it's not a pure Arduino project.  

Why not use a webcam, Android camera?
The Canon S2 IS employed here is old, but produces reasonably distortion-free images - with the CHDK firmware hack it seemed a shame not to use it.  It would be nice to feed directly to the PC - Android or a webcam would accomplish this.  Perhaps a wireless capable SD-card might also work.


Why would anyone want to go to this much time/effort?
For me, learning by doing works best. This was a well bounded problem that looked solvable. It wasn't all that much effort, I just kept the problem in the back of my mind over the last year. There was only the occasional burst of activity when ideas solidified – I was not working to a deadline.

Why not tweak-it/finish-it/enhance-it in some respect?
I've scanned the cards I wanted to - some MIX, some FORTRAN.  In the process I achieved my goal of learning a little about Arduino, electronics, CHDK, Fritzing, and PIL.  I hope to apply some of what I've learnt to some of my other interests, for example, nature photography:





Thursday, July 26, 2012

Punch Card Reader - The Hardware

Having used PIL to create software that could scan a card, I spent quite some time mulling over possible ways to make a scanner.   At some point it occurred to me that I could recycle some old curtain rails and use gravity to do the card transport.   This led to a final design that could be built from materials already to hand:  some old curtain rails, an old piece of shelving, tracing paper, a desk lamp, some masking tape, and Blu-Tack.  I made some cutouts in the rails so that the software scanner could identify the side edges of each card and automatically calibrate the card's width (the power drill grabbed as it made it through, so it's best to clamp the rail down securely).  I rounded off all cut edges with a file to minimise the chance of a card catching on entering the rails or while passing through the cutout.



The breadboard connected to the Arduino was wired up by combining aspects of the designs that came with the SparkFun Inventor's Kit for Arduino (I'm in New Zealand, so I ordered it from mindkits.co.nz).   I subsequently found the Fritzing circuit design tool which I have now used to document my design (I've not had the courage to dismantle and rewire the breadboard to test the design schematics, so the schematics are untried).  




The card feed was initially a bit of a problem.  I had originally thought about making the whole reader from Lego, but then thought why torture myself?  I also considered photo printers with straight through paper paths. I tried mine and it worked quite well, but I still needed to figure out how to get the card from the printer's output tray into the rails - probably by tipping the printer on an angle to let gravity feed the output onto a chute (perhaps with relatively empty ink cartridges).  In the end I went with a crude manual Lego feeder.  It has the advantage that jams can be dealt with before any cards are mangled.  It's not perfect, but it's adequate.




The camera was fairly easy: I have an old Canon S2 IS, so I just put CHDK enhanced firmware on an SD card.  I cut open an old USB cable to attach the camera to the Arduino.  The hardest part of dealing with the camera was experimenting with the timing sequence on the USB shutter release.  The documentation is a bit technical, but I eventually settled on using a short half press delay to focus, followed by a full press delay to take the photo. Here is the Arduino-Sketch code for the controller:
// Shutter Controller
// by Michael Hamilton  
// The code is GPL 3.0(GNU General Public License)


#include <Servo.h>

int SERVO_PIN = 9;
int SERVO_CLOSE = 120;
int SERVO_OPEN = 15;
int SERVO_DELAY = 250;
int SERVO_CARD_MOVE_DELAY = 1000;

int PHOTO_RESIST_PIN = 0;
int PHOTO_RESIST_DIFF = 40;
int PHOTO_RESIST_DELAY = 1000;

int CAMERA_PIN = 11;
int CAMERA_FULL_PRESS_DELAY = 1000;
int CAMERA_HALF_PRESS_DELAY = 500;
int CAMERA_FOCUS_DELAY = 400;
int CAMERA_CAPTURE_DELAY = 1500;

Servo myservo;  // create servo object to control a servo 
                // a maximum of eight servo objects can be created 
 
int lastlevel = 0; 
 
void setup() 
{ 
  Serial.begin(9600);
  Serial.println("Card Rdr");
  myservo.attach(SERVO_PIN);  // attaches the servo on pin 9 to the servo object 
  myservo.write(SERVO_CLOSE); 
  delay(SERVO_DELAY);
  myservo.detach();
} 
 
 
void loop() 
{ 
   int lightlevel = analogRead(PHOTO_RESIST_PIN);
   if (lastlevel != lightlevel) {  
      Serial.print("lightlevel="); 
      Serial.println(lightlevel);
      int diff = lastlevel - lightlevel;
      if (diff >= PHOTO_RESIST_DIFF) {  
          digitalWrite(CAMERA_PIN, HIGH);
          delay(CAMERA_HALF_PRESS_DELAY);      // press shutter -focus
          digitalWrite(CAMERA_PIN, LOW);
          delay(CAMERA_FOCUS_DELAY);
          digitalWrite(CAMERA_PIN, HIGH);
          delay(CAMERA_FULL_PRESS_DELAY);      // press shutter
          digitalWrite(CAMERA_PIN, LOW);
          
          delay(CAMERA_CAPTURE_DELAY);   // wait for photo to be taken
          myservo.attach(SERVO_PIN);   
          myservo.write(SERVO_OPEN);
          delay(SERVO_DELAY + SERVO_CARD_MOVE_DELAY);      // wait for servo to open and card to move
          myservo.write(SERVO_CLOSE); 
          delay(SERVO_DELAY);       // wait for servo to close
          myservo.detach();
      } 
      lastlevel = lightlevel;
      delay(PHOTO_RESIST_DELAY);
   }
} 
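
The trigger condition in loop() can be tried out away from the hardware. The following is a minimal Python sketch of just that comparison, assuming a list of made-up analog readings; the function name trigger_points is hypothetical, and the camera and servo timing is left out entirely.

```python
PHOTO_RESIST_DIFF = 40  # same threshold as the Arduino sketch


def trigger_points(readings, initial_level=0):
    """Return the indexes of readings that would fire the shutter sequence.

    Mirrors the comparison in loop(): a photo is taken when the light
    level falls by at least PHOTO_RESIST_DIFF between two consecutive
    differing readings, which is what happens when a card arrives and
    shades the photoresistor.
    """
    last_level = initial_level
    triggers = []
    for index, level in enumerate(readings):
        if level != last_level:
            if last_level - level >= PHOTO_RESIST_DIFF:
                triggers.append(index)
            last_level = level
    return triggers


if __name__ == "__main__":
    # Invented readings: bright, card arrives, card leaves, next card arrives.
    print(trigger_points([600, 598, 540, 541, 600, 550]))
```

Because each reading is compared only with the previous one, a card that darkens the sensor gradually over several readings would not trigger a photo, so the one second PHOTO_RESIST_DELAY between readings matters as much as the threshold.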

Wednesday, July 25, 2012

Punch Card Reader - The Software

Note: The FAQ now includes a script that can generate punch card images from text. I guess this new script could be used as a basis for a card maker - or to create punch card t-shirt logos?
When I originally started the Punch Card Reader project my first step was to obtain a few sample images by holding a camera in one hand, and a punch-card up to the window in the other. I then located detailed card specifications at http://www.quadibloc.com/comp/cardint.htm, a site that documents all the essential dimensions along with quite a bit of background history. Using these dimensions I was able to experiment with the sample images by using the Python Image Library (PIL - python 2.7). PIL makes it very easy to walk the x/y grid of an image inspecting each pixel's RGB values.
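
As a small illustration of that pixel inspection, here is a sketch of the brightness measure and the default threshold heuristic used by the final script, written against plain RGB tuples so it runs without PIL. The helper names brightness and pick_threshold are made up for this example, and the heuristic assumes the first and last pixels of the probed row show the backlight and not the card.

```python
BRIGHTNESS_THRESHOLD = 200  # upper limit for the cutoff, as in punchcard.py


def brightness(pixel):
    """Brightness is the average of the R, G and B values (0-255)."""
    red, green, blue = pixel[:3]
    return (red + green + blue) // 3


def pick_threshold(row_pixels):
    """Choose a cutoff a little darker than the backlight at both ends of a row."""
    left = brightness(row_pixels[0])
    right = brightness(row_pixels[-1])
    return min(left, right, BRIGHTNESS_THRESHOLD) - 10


if __name__ == "__main__":
    # Invented row: backlight, dark card, backlight.
    row = [(255, 255, 255), (10, 10, 10), (240, 240, 240)]
    print(pick_threshold(row))
```

Any pixel at or above the cutoff is treated as light shining past the card or through a hole; anything below it is treated as card.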
I tried to come up with a heuristic to recognise the card-edges and the punched-holes. Initially I accumulated brightness values across the entire surface and averaged them into discrete rows and columns.  This worked reasonably well but was quite slow. I soon realised that recognising the tall horizontal rows required far less precision than the smaller and more numerous vertical columns. I was able to shortcut the vertical scan by just examining a one pixel wide line across the estimated middle of each row. You can get some feel for the tolerances required from the following debug dump:




The faint red marks show where the scanning algorithm has decided it has found an edge or a hole. The faint blue rectangles plot where holes were expected to be located - you can see that the vertical drift isn't going to be much of a problem so long as the image is reasonably square and flat.  Notice that the red marks at the start of each horizontal row exhibit some drift from the true vertical and the script has to compensate for this to maintain an accurate allocation of holes to the correct columns.  On the other hand I found it adequate to calibrate the vertical height from one reading only - this is why the guide rails have holes cut in the middle - the middle reading is clearly marked on this image.
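
The way a single scan line is turned into column numbers can be sketched in isolation. The version below is a simplified take on the inner loop of the script's _scan method, working on a plain list of brightness values; the function name find_holes is hypothetical, and unlike the script it scans the whole list instead of only the span between the data margins.

```python
def find_holes(brightness_row, threshold, x_data_left, col_width, hole_width):
    """Return {column_number: run_length} for the bright runs found in one row.

    brightness_row: brightness values along a one pixel high scan line
    x_data_left:    x position where column 0 begins
    col_width:      column pitch in pixels
    hole_width:     expected hole width in pixels
    """
    holes = {}
    left_edge = None
    for x, value in enumerate(brightness_row):
        if value >= threshold:
            if left_edge is None:
                left_edge = x            # a bright run starts here
        elif left_edge is not None:
            run_length = x - left_edge   # the bright run has just ended
            if run_length >= hole_width * 0.75:   # ignore specks and noise
                centre = left_edge + run_length / 2.0
                column = int((centre - x_data_left) / col_width + 0.25)
                holes[column] = run_length
            left_edge = None
    return holes


if __name__ == "__main__":
    row = [50] * 60
    for x in list(range(5, 11)) + list(range(25, 31)) + [40, 41]:
        row[x] = 250                     # two holes plus a 2 pixel speck
    print(find_holes(row, 200, 5, 10.0, 6.0))
```

Requiring a run of at least three quarters of the expected hole width is what lets the scan ignore small bright specks, and recomputing x_data_left for every row is how the script copes with the horizontal drift visible in the debug image.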

The final script accepts some parameters to help it adjust to the characteristics of the scanning hardware and to enable some debugging feedback, here is a summary of the script's parameters output by its help option:
 % python punchcard.py --help
Usage: punchcard.py [options] image [image...]
    decode punch card image into ASCII.

Options:
  -h, --help            show this help message and exit
  -b BRIGHT, --bright-threshold=BRIGHT
                        Brightness (R+G+B)/3, e.g. 127.
  -s SIDE_MARGIN_RATIO, --side-margin-ratio=SIDE_MARGIN_RATIO
                        Manually set side margin ratio (sideMargin/cardWidth).
  -d, --dump            Output an ASCII-art version of the card.
  -i, --display-image   Display an anotated version of the image.
  -r, --dump-raw        Output ASCII-art with raw row/column accumulator
                        values.
  -x XSTART, --x-start=XSTART
                        Start looking for a card edge at y position (pixels)
  -X XSTOP, --x-stop=XSTOP
                        Stop looking for a card edge at y position
  -y YSTART, --y-start=YSTART
                        Start looking for a card edge at y position
  -Y YSTOP, --y-stop=YSTOP
                        Stop looking for a card edge at y position
  -a XADJUST, --adjust-x=XADJUST
                        Adjust middle edge detect location (pixels) 
To assist with adjusting the scan for the best results the script can optionally display marked up images (seen above). Plus the script can produce an ASCII art dump, for example:
           SLAX 1            MOVE ALL CHARS ONE LEFT                            
 Card Dump of Image file: mix1/img_1961.jpg Format Dump threshold= 190
 123456789-123456789-123456789-123456789-123456789-123456789-123456789-123456789-
 ________________________________________________________________________________ 
/           SLAX 1            MOVE ALL CHARS ONE LEFT                            |
|.............O..................O.O...OOO.....O..OO.............................|
|............O................OO....OO....O..OO..O...............................|
|...........O..O................O..........O........O............................|
|.............O..O.................O.....O.......................................|
|...........O..............................O.....................................|
|............O......................OO.O.........O..O............................|
|.............................O..................................................|
|...............................OO............OO..O..............................|
|..............................O.............O.....O.............................|
|..............O.................................................................|
|.......................................O........................................|
|.........................................O......................................|
`--------------------------------------------------------------------------------'
 123456789-123456789-123456789-123456789-123456789-123456789-123456789-123456789-
That concludes this brief overview of the recognition script. The next post will describe the hardware in more detail. Full script code follows below.

The code (punchcard.py):

#!/usr/bin/env python
#
# punchcard.py 
#
# Copyright (C) 2011: Michael Hamilton
# The code is GPL 3.0(GNU General Public License) ( http://www.gnu.org/copyleft/gpl.html )
#
import Image
import sys
from optparse import OptionParser

CARD_COLUMNS = 80
CARD_ROWS = 12

# found measurements at http://www.quadibloc.com/comp/cardint.htm
CARD_WIDTH = 7.0 + 3.0/8.0 # Inches
CARD_HEIGHT = 3.25 # Inches
CARD_COL_WIDTH = 0.087 # Inches
CARD_HOLE_WIDTH = 0.055 # Inches IBM, 0.056 Control Data
CARD_ROW_HEIGHT = 0.25 # Inches
CARD_HOLE_HEIGHT = 0.125 # Inches
CARD_TOPBOT_MARGIN = 3.0/16.0 # Inches at top and bottom
CARD_SIDE_MARGIN = 0.2235 # Inches on each side


CARD_SIDE_MARGIN_RATIO = CARD_SIDE_MARGIN/CARD_WIDTH # as proportion of card width (margin/width)
CARD_TOP_MARGIN_RATIO = CARD_TOPBOT_MARGIN/CARD_HEIGHT # as proportion of card height (margin/height)
CARD_ROW_HEIGHT_RATIO = CARD_ROW_HEIGHT/CARD_HEIGHT # as proportion of card height - works
CARD_COL_WIDTH_RATIO = CARD_COL_WIDTH/CARD_WIDTH # as proportion of card width - works
CARD_HOLE_HEIGHT_RATIO = CARD_HOLE_HEIGHT/CARD_HEIGHT # as proportion of card height - works
CARD_HOLE_WIDTH_RATIO = CARD_HOLE_WIDTH/CARD_WIDTH # as a proportion of card width

BRIGHTNESS_THRESHOLD = 200  # pixel brightness value (i.e. (R+G+B)/3)

IBM_MODEL_029_KEYPUNCH = """
    /&-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ:#@'="`.<(+|!$*);^~,%_>? |
12 / O           OOOOOOOOO                        OOOOOO             |
11|   O                   OOOOOOOOO                     OOOOOO       |
 0|    O                           OOOOOOOOO                  OOOOOO |
 1|     O        O        O        O                                 |
 2|      O        O        O        O       O     O     O     O      |
 3|       O        O        O        O       O     O     O     O     |
 4|        O        O        O        O       O     O     O     O    |
 5|         O        O        O        O       O     O     O     O   |
 6|          O        O        O        O       O     O     O     O  |
 7|           O        O        O        O       O     O     O     O |
 8|            O        O        O        O OOOOOOOOOOOOOOOOOOOOOOOO |
 9|             O        O        O        O                         | 
  |__________________________________________________________________|"""

translate = None
if translate == None:
    translate = {}
    # Turn the ASCII art sideways and build a hash look up for 
    # column values, for example:
    #   (O, , ,O, , , , , , , , ):A
    #   (O, , , ,O, , , , , , , ):B
    #   (O, , , , ,O, , , , , , ):C
    rows = IBM_MODEL_029_KEYPUNCH[1:].split('\n');
    rotated = [[ r[i] for r in rows[0:13]] for i in range(5, len(rows[0]) - 1)]
    for v in rotated:
        translate[tuple(v[1:])] = v[0]
    #print translate

# generate a range of floats
def drange(start, stop, step=1.0):
    r = start
    while (step >= 0.0 and r < stop) or (step < 0.0 and r > stop):
        yield r
        r += step

# Represents a punchcard image plus scanned data
class PunchCard(object):
    
    def __init__(self, image, bright=-1, debug=False, xstart=0, xstop=0, ystart=0, ystop=0, xadjust=0):
        pass
        self.text = ''
        self.decoded = []
        self.surface = [] 
        self.debug = debug
        self.threshold = 0
        self.ymin = ystart
        self.ymax = ystop
        self.xmin = xstart
        self.xmax = xstop
        self.xadjust = xadjust
        self.image = image
        self.pix = image.load()
        self._crop()
        self._scan(bright)
    
    # Brightness is the average of RGB values
    def _brightness(self, pixel):
        #print max(pixel)
        return ( pixel[0] + pixel[1] + pixel[2] ) / 3

    # For highlighting on the debug dump
    def _flip(self, pixel):
        return max(pixel)

    # The search is started from the "crop" edges.
    # Either use crop boundary of the image size or the values supplied
    # by the command line args
    def _crop(self):
        self.xsize, self.ysize = self.image.size
        if self.xmax == 0:
            self.xmax = self.xsize
        if self.ymax == 0:
            self.ymax = self.ysize
        self.midx = self.xmin + (self.xmax - self.xmin) / 2 + self.xadjust
        self.midy = self.ymin + (self.ymax - self.ymin) / 2

    # heuristic for finding a reasonable cutoff brightness
    def _find_threshold_brightness(self):
        left = self._brightness(self.pix[self.xmin, self.midy])
        right = self._brightness(self.pix[self.xmax - 1, self.midy])
        return min(left, right, BRIGHTNESS_THRESHOLD) - 10
        vals = []
        last = 0
        for x in xrange(self.xmin,self.xmax):
            val = self._brightness(self.pix[x, self.midy])
            if val > last:
                left = val
            else:
                break
            last = val
        for x in xrange(self.xmax,self.xmin, -1):
            val = self._brightness(self.pix[x, self.midy])
            if val > last:
                right = val
            else:
                break
            right = val
        print left, right
        return min(left, right,200)
        
        for x in xrange(self.xmin,self.xmax):
            val = self._brightness(self.pix[x, self.midy])
            vals.append(val)
        vals.sort()
        last_val = vals[0]
        biggest_diff = 0
        threshold = 0
        for val in vals:
            diff = val - last_val
            #print val, diff
            if val > 127 and val < 200 and diff >= 5:
                biggest_diff = diff
                threshold = val
            last_val = val
        if self.debug:
            print "Threshold diff=", biggest_diff, "brightness=", val
        return threshold - 10
    
    # Find the left and right edges of the data area at probe_y and from that
    # figure out the column and hole horizontal dimensions (widths) at probe_y.
    def _find_data_horiz_dimensions(self, probe_y):
        left_border, right_border = self.xmin, self.xmax - 1
        for x in xrange(self.xmin, self.midx):            
            if self._brightness(self.pix[x,  probe_y]) < self.threshold:
                left_border = x
                break
        for x in xrange(self.xmax-1,  self.midx,  -1):
            if self._brightness(self.pix[x,  probe_y]) < self.threshold:
                right_border = x
                break
        width = right_border - left_border
        card_side_margin_width = int(width * CARD_SIDE_MARGIN_RATIO)
        data_left_x = left_border + card_side_margin_width
        #data_right_x = right_border - card_side_margin_width
        data_right_x = data_left_x + int((CARD_COLUMNS * width) * CARD_COL_WIDTH/CARD_WIDTH)
        col_width = width * CARD_COL_WIDTH_RATIO
        hole_width = width * CARD_HOLE_WIDTH_RATIO
        #print col_width
        if self.debug:
            # mark left and right edges on the copy
            for y in xrange(probe_y - self.ysize/100, probe_y + self.ysize/100):
                self.debug_pix[left_border if left_border > 0 else 0,y] = 255
                self.debug_pix[right_border if right_border < self.xmax else self.xmax - 1,y] = 255
            for x in xrange(1, (self.xmax - self.xmin) / 200):
                self.debug_pix[left_border + x, probe_y] = 255
                self.debug_pix[right_border - x, probe_y] = 255
                
        return data_left_x, data_right_x,  col_width, hole_width
 
    # find the top and bottom of the data area and from that the 
    # row and hole vertical dimensions (heights) 
    def _find_data_vert_dimensions(self):
        top_border, bottom_border = self.ymin, self.ymax
        for y in xrange(self.ymin, self.midy):
            #print pix[midx,  y][0] 
            if self._brightness(self.pix[self.midx,  y]) < self.threshold:
                top_border = y
                break
        for y in xrange(self.ymax - 1,  self.midy, -1):
            if self._brightness(self.pix[self.midx,  y]) < self.threshold:
                bottom_border = y
                break
        card_height = bottom_border - top_border
        card_top_margin = int(card_height * CARD_TOP_MARGIN_RATIO)
        data_begins = top_border + card_top_margin
        hole_height = int(card_height * CARD_HOLE_HEIGHT_RATIO)
        data_top_y = data_begins + hole_height / 2
        col_height = int(card_height * CARD_ROW_HEIGHT_RATIO)
        if self.debug:
            # mark up the copy with the edges
            for x in xrange(self.xmin, self.xmax-1):
                self.debug_pix[x,top_border] = 255
                self.debug_pix[x,bottom_border] = 255
        if self.debug:
            # mark search parameters 
            for x in xrange(self.midx - self.xsize/20, self.midx + self.xsize/20):
               self.debug_pix[x,self.ymin] = 255
               self.debug_pix[x,self.ymax - 1] = 255
            for y in xrange(0, self.ymin):
               self.debug_pix[self.midx,y] = 255
            for y in xrange(self.ymax - 1, self.ysize-1):
               self.debug_pix[self.midx,y] = 255
        return data_top_y, data_top_y + col_height * 11, col_height, hole_height

    def _scan(self, bright=-1):
        if self.debug:
            # if debugging make a copy we can draw on
            self.debug_image = self.image.copy()
            self.debug_pix = self.debug_image.load()
            
        self.threshold = bright if bright > 0 else self._find_threshold_brightness()    
        #x_min, x_max,  col_width = self._find_data_horiz_dimensions(image, pix, self.threshold, self.ystart, self.ystop)
        y_data_pos, y_data_end, col_height, hole_height = self._find_data_vert_dimensions()
        data = {}
        
        # Chads are narrow so find them heuristically by accumulating pixel brightness
        # along the row.  Should be forgiving if the image is slightly wonky.
        y = y_data_pos #- col_height/8
        for row_num in xrange(CARD_ROWS):
            probe_y = y + col_height if row_num == 0 else ( y - col_height if row_num == CARD_ROWS -1 else y )  # Line 0 has a corner missing
            x_data_left, x_data_right,  col_width, hole_width = self._find_data_horiz_dimensions(probe_y)
            left_edge = -1 # of a punch-hole
            for x in xrange(x_data_left,  x_data_right):
                # Chads are tall so we can be sure if we probe around the middle of their height
                val = self._brightness(self.pix[x, y])
                if val >= self.threshold:
                    if left_edge == -1:
                        left_edge = x
                    if self.debug:
                        self.debug_pix[x,y] = self._flip(self.pix[x,y])
                else:
                    if left_edge > -1:
                        hole_length = x - left_edge
                        if hole_length >= hole_width * 0.75:
                            col_num = int((left_edge + hole_length / 2.0 - x_data_left) / col_width + 0.25)  
                            data[(col_num, row_num)] = hole_length
                        left_edge = -1
            if (self.debug):
                # Plot where holes might be on this row
                expected_top_edge = y - hole_height / 2
                expected_bottom_edge = y + hole_height / 2
                blue = 255 * 256 * 256
                for expected_left_edge in drange(x_data_left, x_data_right - 1, col_width):
                    for y_plot in drange(expected_top_edge, expected_bottom_edge, 2):
                        self.debug_pix[expected_left_edge,y_plot] = blue
                        #self.debug_pix[x + hole_width/2,yline] = 255 * 256 * 256
                        self.debug_pix[expected_left_edge + hole_width,y_plot] = blue
                    for x_plot in drange(expected_left_edge, expected_left_edge + hole_width):
                        self.debug_pix[x_plot, expected_top_edge] = blue
                        self.debug_pix[x_plot, expected_bottom_edge] = blue
            y += col_height

        if self.debug:
            self.debug_image.show()
            # prevent runaway debug shows causing my desktop to run out of memory
            raw_input("Press Enter to continue...")
        self.decoded = []
        # Could fold this loop into the previous one - but would it be faster?
        for col in xrange(0, CARD_COLUMNS):
            col_pattern = []
            col_surface = []
            for row in xrange(CARD_ROWS):
                key = (col, row)
                # average for 1/3 of a column is greater than the threshold
                col_pattern.append('O' if key in data else ' ')
                col_surface.append(data[key] if key in data else 0)
            tval = tuple(col_pattern)
            global translate
            self.text += translate[tval] if tval in translate else '@'
            self.decoded.append(tval)
            self.surface.append(col_surface)
           

        return self

    # ASCII art image of card
    def dump(self, id, raw_data=False):
        print ' Card Dump of Image file:', id, 'Format', 'Raw' if raw_data else 'Dump', 'threshold=', self.threshold
        print ' ' + '123456789-' * (CARD_COLUMNS/10)
        print ' ' + '_' * CARD_COLUMNS + ' '
        print '/' + self.text +  '_' * (CARD_COLUMNS - len(self.text)) + '|'
        for rnum in xrange(len(self.decoded[0])):
            sys.stdout.write('|')
            if raw_data:
                for val in self.surface:
                    sys.stdout.write(("(%d)" % val[rnum]) if val[rnum] != 0 else '.' )
            else:
                for col in self.decoded:
                    sys.stdout.write(col[rnum] if col[rnum] == 'O' else '.')
            print '|'
        print '`' + '-' * CARD_COLUMNS + "'"
        print ' ' + '123456789-' * (CARD_COLUMNS/10)
        print ''
         
            
if __name__ == '__main__':
    
    usage = """usage: %prog [options] image [image...]
    decode punch card image into ASCII."""
    parser = OptionParser(usage)
    parser.add_option('-b', '--bright-threshold', type='int', dest='bright', default=-1, help='Brightness (R+G+B)/3, e.g. 127.')
    parser.add_option('-s', '--side-margin-ratio', type='float', dest='side_margin_ratio', default=CARD_SIDE_MARGIN_RATIO, help='Manually set side margin ratio (sideMargin/cardWidth).')
    parser.add_option('-d', '--dump', action='store_true', dest='dump', help='Output an ASCII-art version of the card.')
    parser.add_option('-i', '--display-image', action='store_true', dest='display', help='Display an annotated version of the image.')
    parser.add_option('-r', '--dump-raw', action='store_true', dest='dumpraw', help='Output ASCII-art with raw row/column accumulator values.')
    parser.add_option('-x', '--x-start', type='int', dest='xstart', default=0, help='Start looking for a card edge at x position (pixels)')
    parser.add_option('-X', '--x-stop', type='int', dest='xstop', default=0, help='Stop looking for a card edge at x position')
    parser.add_option('-y', '--y-start', type='int', dest='ystart', default=0, help='Start looking for a card edge at y position')
    parser.add_option('-Y', '--y-stop', type='int', dest='ystop', default=0, help='Stop looking for a card edge at y position')
    parser.add_option('-a', '--adjust-x', type='int', dest='xadjust', default=0, help='Adjust middle edge detect location (pixels)')
    (options, args) = parser.parse_args()
    
    for arg in args:
        image = Image.open(arg)
        card = PunchCard(image,  bright=options.bright, debug=options.display, xstart=options.xstart, xstop=options.xstop, ystart=options.ystart, ystop=options.ystop, xadjust=options.xadjust)
        print card.text
        if (options.dump):
            card.dump(arg)
        if (options.dumpraw):
            card.dump(arg, raw_data=True)
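
The row scan in the listing above can be tried out without an image. What follows is only a minimal sketch of the same idea, assuming a row is a plain list of brightness values; the function name find_holes and its parameters are made up for illustration and are not part of the PunchCard class. It is written so that it also runs on Python 3, unlike the Python 2 listing.

```python
def find_holes(row, threshold, x_left, col_width, hole_width):
    """Return {column_index: run_length} for the bright runs in one pixel row.

    Illustrative helper only: `row` is a sequence of brightness values and
    `x_left` is the x position where column 0 starts.
    """
    holes = {}
    left_edge = -1  # start of the current bright run, -1 when outside a run
    # A dark sentinel closes any run that touches the right-hand end of the row.
    for x, val in enumerate(list(row) + [float('-inf')]):
        if val >= threshold:
            if left_edge == -1:
                left_edge = x
        elif left_edge > -1:
            run = x - left_edge
            # Ignore specks narrower than three quarters of a nominal hole.
            if run >= hole_width * 0.75:
                centre = left_edge + run / 2.0
                col = int((centre - x_left) / col_width + 0.25)
                holes[col] = run
            left_edge = -1
    return holes
```

With columns 10 pixels wide and holes 6 pixels wide, a bright run covering pixels 22 to 27 has its centre at 25 and is assigned to column 2, while a 2 pixel speck is dropped by the three-quarter-width test.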

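The final loop of the decoder turns each column into a tuple of 'O' and ' ' marks and looks it up in the translate table, falling back to '@' for anything unknown. The translate table itself is defined elsewhere in the script, so the sketch below uses a hypothetical and deliberately partial table (blank, digits, and the letters A to I) and assumes the twelve rows run top to bottom as 12, 11, 0, 1 to 9. The names ROWS, pattern, TRANSLATE and decode are illustrative only.

```python
# Assumed row order, top to bottom: 12, 11, 0, 1, ..., 9.
ROWS = ['12', '11', '0'] + [str(d) for d in range(1, 10)]

def pattern(*punched):
    """Build a 12-element tuple with 'O' in each punched row and ' ' elsewhere."""
    return tuple('O' if r in punched else ' ' for r in ROWS)

# Hypothetical partial table, not the one used by the original script.
TRANSLATE = {pattern(): ' '}
for d in range(10):
    TRANSLATE[pattern(str(d))] = str(d)          # digits: a single punch
for i, letter in enumerate('ABCDEFGHI', start=1):
    TRANSLATE[pattern('12', str(i))] = letter    # A-I: row 12 plus a digit row

def decode(columns):
    """Map each column pattern to a character, using '@' for unknown patterns."""
    return ''.join(TRANSLATE.get(col, '@') for col in columns)
```

Using tuples as dictionary keys keeps the lookup to a single hash per column, and the '@' fallback makes misread columns easy to spot in the output text.
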
Punch Card Reader - The Movie


Last year I bought an Arduino microcontroller and spent some time building the experiments that came with the kit. Having rediscovered some old punch cards, I wondered if I could combine the Arduino, the CHDK firmware for Canon cameras, and my Linux desktop to read in these old card decks. This is the result: