Wednesday, August 1, 2012

Punch Card Reader - the FAQ


Can I have a card to scan?
You could use a screen grab from my Javascript Virtual Cardpunch. Or you use the following python punchcardgen.py script that generates card images from text read from stdin (how about a t-shirt with a message punched into it):
#!/usr/bin/env python
#
# punchcardgen.py 
#
# Copyright (C) 2011: Michael Hamilton
# The code is GPL 3.0(GNU General Public License) ( http://www.gnu.org/copyleft/gpl.html )
#
import Image
import sys

CARD_COLUMNS = 80
CARD_ROWS = 12

# found measurements at http://www.quadibloc.com/comp/cardint.htm
CARD_WIDTH = 7.0 + 3.0/8.0 # Inches
CARD_HEIGHT = 3.25 # Inches
CARD_COL_WIDTH = 0.087 # Inches
CARD_HOLE_WIDTH = 0.055 # Inches IBM, 0.056 Control Data
CARD_ROW_HEIGHT = 0.25 # Inches
CARD_HOLE_HEIGHT = 0.125 # Inches
CARD_TOPBOT_MARGIN = 3.0/16.0 # Inches at top and bottom
CARD_SIDE_MARGIN = 0.2235 # Inches on each side

DARK = (0,0,0)
BRIGHT = (255,255,255)  # pixel brightness value (i.e. (R+G+B)/3)
REDUCE_IN_SIZE=8

IBM_MODEL_029_KEYPUNCH = """
    /&-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ:#@'="`.<(+|!$*);^~,%_>? |
12 / O           OOOOOOOOO                        OOOOOO             |
11|   O                   OOOOOOOOO                     OOOOOO       |
 0|    O                           OOOOOOOOO                  OOOOOO |
 1|     O        O        O        O                                 |
 2|      O        O        O        O       O     O     O     O      |
 3|       O        O        O        O       O     O     O     O     |
 4|        O        O        O        O       O     O     O     O    |
 5|         O        O        O        O       O     O     O     O   |
 6|          O        O        O        O       O     O     O     O  |
 7|           O        O        O        O       O     O     O     O |
 8|            O        O        O        O OOOOOOOOOOOOOOOOOOOOOOOO |
 9|             O        O        O        O                         | 
  |__________________________________________________________________|"""

translate = None
if translate == None:
    translate = {}
    # Turn the ASCII art sideways and build a hash look up for 
    # column values, for example:
    #   A:(O, , ,O, , , , , , , , )
    #   B:(O, , , ,O, , , , , , , )
    #   C:(O, , , , ,O, , , , , , )
    rows = IBM_MODEL_029_KEYPUNCH[1:].split('\n');
    rotated = [[ r[i] for r in rows[0:13]] for i in range(5, len(rows[0]) - 1)]
    for v in rotated:
        translate[v[0]] = tuple(v[1:])

if __name__ == '__main__':

    scale = 1000
    margin = 200
    card_x_pixels = int(CARD_WIDTH * scale)
    card_y_pixels = int(CARD_HEIGHT * scale)

    img_size = (2 * margin + card_x_pixels, 2 * margin + card_y_pixels)

    side_margin_pixels = int(CARD_SIDE_MARGIN * scale)
    col_width_pixels = int(CARD_COL_WIDTH * scale)

    top_bot_margin = int(CARD_TOPBOT_MARGIN * scale)
    row_height_pixels = int(CARD_ROW_HEIGHT * scale)

    hole_width = int(CARD_HOLE_WIDTH * scale)
    hole_height = int(CARD_HOLE_HEIGHT * scale)

    card_area = (margin, margin, margin + card_x_pixels, margin + card_y_pixels)
    
    proto_img = Image.new('RGB', img_size, BRIGHT)
    proto_pix = proto_img.load()
    proto_img.paste(DARK, card_area)
    
    # Remove the top left corner (don't know the standard for this - guess)
    i = 0
    for x in xrange(margin, margin + side_margin_pixels):
        for y in xrange(margin, margin + top_bot_margin + hole_height - i):
            proto_pix[x,y] = BRIGHT
        i += 2
        
    card_number = 1
    for line in sys.stdin:
        img = proto_img.copy()
        x = margin + side_margin_pixels
        for char in line:
            if char in translate:
                values = translate[char] 
                y = margin + top_bot_margin
                for row in xrange(0, CARD_ROWS):
                    if values[row] == 'O':
                        img.paste(BRIGHT, (x, y, x + hole_width, y + hole_height))
                    y += row_height_pixels
            x += col_width_pixels
            if x > margin + card_x_pixels:
                break
        img = img.resize((img_size[0]/REDUCE_IN_SIZE, img_size[1]/REDUCE_IN_SIZE))
        filename =  "%010.10d.jpg" % ( card_number )
        print filename, line
        img.save(filename)
        card_number += 1

The script has no command line options, just feed it uppercase text, for example:
 
% python punchcardgen.py
PROGRAM FORTRAN; WRITE(*,*)'HELLO WORLD'; END PROGRAM
In this case the script produces a single image:
The full-sized image can be rescanned to text by using my original punchcard script, for example:
% python punchcard.py 0000000001.jpg > prog.f90
% gfortran prog.f90                    
% ./a.out 
 HELLO WORLD

If you have your own cards, you can just hold them up to an even light and take their picture, for example you might use a monitor displaying white or a cloudy sky - just make sure the resulting image background is smooth and the picture is straight and square - and hold it by the bottom corner, for example:

To get the best scan, try the python script with the -d or -i options for debug info, use -b N to change the threshold light levels.  Use a full sized image - if the image is too small the calculations introduce errors.

Why not use an auto-feed scanner with a straight through paper path?
That would be the way to go if I wanted to spend money on it – and didn't want to learn a little electronics.  

Why not add a motor and automate the feed?
I was worried about jams on older decks of cards. I think I could build a better feed by copying my photo printer's paper feed in Lego. (possible patent violation?)  

I did consider using my photo-printer as a feeder, that would probably have worked quite well.  I would have to figure out how to collect cards as they exit the printer.

Some way along I figured I could get the job done with stuff at hand without out buying anything.  Once I set that constraint, options narrowed considerably and decisions were easier to make.

There was also the case of the minicomputer with a crank fitted over the instruction single-step toggle-switch – variable speed debugging – consider my approach a homage to that earlier clever hardware hack.

Why not use an array of detectors connected to the Arduino and eliminate the camera?
That would be cool but - this is my first attempt at electronics. Advancing each column past a single column scan would seem hard to calibrate correctly. I imagine that could be solved with a grid/wheel of calibration holes moving with the card or moving with the scanner.  

If the card moved at a constant speed it might be possible to detect start and end of card, and from the timing figure out what went past and what row it belonged to. 

Why not just read the text printed at the top of the card?
I didn't think the text would be good enough for OCR. Some cards were quite worn. I did not want to manually read and enter each card. OCR seems a tough problem compared to reading the holes.

What is the Arduino for exactly?
The Arduino stops the card, detects that a card has stopped, signals the camera to focus and shoot, opens the servo to let the card go.  It plays a key role in keeping the cards in order, both the order of the images in the camera, and the order of the physical cards in the output bin.  It could do more, such as run a feed motor.   But really, the Card Reader is a integration with an Arduino in the mix - it's not a pure Arduino project.  

Why not use a webcam, Android camera?
The Canon S2 IS employed here is old, but produces reasonable distortion free images - with the CHDK firmware hack it seemed a shame not to use it.  It would be nice to feed directly to the PC - Android or a webcam would accomplish this.  Perhaps a wireless capable SD-card might also work.


Why would anyone want to go to this much time/effort?
For me, learning by doing works best. This was a well bounded problem that look solvable. It wasn't all that much effort, I just kept the problem in the back of my mind over the last year. There was only the occasional burst of activity when ideas solidified – I was not working to a deadline.

Why not tweak-it/finish-it/enhance-it in some respect?
I've scanned the cards I wanted to - some MIX, some FORTRAN.  In the process achieved my goal of learning a little about Arduino, electronics, CHDK, fritzing, and PIL.  I hope to apply some of what I've learnt to some of my other interests, for example, nature photography:





No comments:

Post a Comment

These days we're only getting spam, so comments are now disabled.

Note: Only a member of this blog may post a comment.