DIY book scanner
English translation of notes made after the build.
There were no building plans.
  __/\_/\__
 /_(*_l_*)_/\    ASCII art test. If you see the cat in a box, you will likely see text drawings
//______W_/\/    in this document correctly.
|         ||
|_________|/MCbx
A scanner is a device that digitizes printed material into computer-readable form.
Digitized data allows quick building of searchable databases, data mining and
easy, fast research. Scanning books makes it possible to quickly search, recall and
share information, especially for someone with many, many books on the shelf. Many companies try to present book digitizing as theft. It isn't. I have books which were published in runs of 100 or 500 copies. Ask any publisher for a replacement of such a book that you lost, e.g. in a flood, and you will see how much all those licenses are worth to them.
Here is one of the scanned pages after processing with ScanTailor and thresholding
to monochrome. Because of the image size, it has been resized from ca. 4900 px to 2048 px
in height, keeping the aspect ratio.
DIMENSIONS (usually unneeded, as you will likely have different pieces of wood):
Single plate (base) board: 50x50cm
Vertically, the workplace is ca. 30cm because of: a. the platten, b. the stator mounting, c. the range of the cameras.
Base board: 45x110cm. The stator sticks out a bit to the front.
Camera heights were found experimentally. Start with lower positions and go upwards until you get a more or less straight view.
As sources of components, the author used the remains of a wardrobe, a bookshelf and some steel junk (C-shaped bars). All components except the cameras were taken from recycling junk. Complete building time: two weekends.
It is very important to keep the plates at an angle of exactly 45 degrees. In this document I also call the sliding piece of plates the "Stator", as opposed to the constantly moving glass "platten".
Every screw hole has to be countersunk so that the screw's head sits flush with the surface. OK, maybe except the topmost stator mounting, where the screw enters the board at 45 degrees.
DO NOT use rubber pads under the base board, or you will have to wait for the cameras to stop shaking.
Scanner under construction
Cameras used: any 10MPix or better. They should support taking photos over a USB cable using PTP (gPhoto2) or CHDK. Canon cameras were too expensive for this project, so my unit is built around Nikon Coolpix L20 cameras. In fact, very good cameras have German or Japanese lenses.
1. Stator right-hand side construction
2. Stator left-hand side construction
3. Stator shifting mechanism
4. Camera column
5. Crank mechanism
6. Mounting camera to column
8. Right-hand camera column
10. Electric part
14. Appendices and scripts
STATOR RIGHT-HAND SIDE CONSTRUCTION:
[Drawing: the stator plate on the base, with "Cut 1" and the "v" part marked.]
v - wooden part with a cross-section like this:
[Drawing: cross-section of the "v" part.]
Cut 1 and its mounting:
[Drawing: Cut 1 and its mounting screw; the screw's head is marked "T".]
The screw's head (T) limits the available vertical space to ca. 43cm. The whole platten goes below it.
[Drawing: the stator part with its "v" parts along the bottom; Cut 1 is in the center of the stator part.]
Both the vertical and the horizontal part should be fastened to the base plate from the underside: the horizontal part in 3 points, the vertical part in 2 points to prevent rotation. The horizontal part should be fixed to the stator plate between the screws that fasten it to the base. All holes should be countersunk to fit the screws' heads, because the bottom of the plate must be smooth!
[Drawing: the stator plate with the "v" part at its foot.]
Use the sliding part only if you plan to scan VERY thick books, like yearly magazine archives. If you want to scan typical 40-400 page books, it is not needed and sometimes gets in the way.
[Drawing: the "Stator" plate reaching up to Cut 1, with the horizontal base and the vertical part below it.]
Fix the horizontal base to the vertical part using two screws coming from the left, then one screw from the right through the "V" part, then fix the plate to the V-part and to Cut 1.
LEFT-HAND STATOR SHIFTING MECHANISM
Fix (preferably from the bottom) two flat parts, spaced exactly as wide as the base of the triangle. The moving part should slide between them like the slide of a slide rule ("slipstick"). Allow ca. 15cm of plate-to-plate drawback. Seen from the top through the plates:
[Drawing: top view through the plates, with the still part and the moving part labelled.]
Where "=" are vertical flat parts. "===" are fixed to the base, making a track for "======".
To shift the part before scanning, a threaded rod with a crank is used together with the camera column. For most books the plain V shape is enough; for some very thick ones, like yearly magazine archives, some drawback has to be applied so that the platten goes into the book smoothly.
Both photos show the sliding mechanism, the camera column and the unfinished crank mechanism. Use a rod that does not deform sideways when it is squeezed across its width. There is no need to cover the "slipstick" from the top, as it slides smoothly and does not lift (in fact I had to stick a few layers of tape underneath to raise it a bit, so that it matches the right-hand stator part).
CAMERA COLUMN (LEFT)
In my first model, only the left-hand camera was used. Later I had the opportunity to get a second camera and added a right-hand camera to shoot two photos at once.
On the left, the vertical camera beam is mounted perpendicularly to the base. Mark the geometrical center of the stator (in depth) and the center of the beam, then mark a center line on the base and move the beam along this line to the proper distance. Then drill the holes and fasten 2 screws from the bottom of the base. This puts the column exactly in the center of the plate.
To prevent vibration and keep everything firmly in place, add another piece of wood that keeps the column from wiggling back and forth. See the picture:
[Drawing: the camera column with its supporting piece of wood.]
This piece of wood is cut in a special way (at 45 degrees) and fastened to the base and to the column, with two screws on each side:
[Drawing: the supporting piece between the base and the column, with the fastening point marked.]
Above this piece of wood, drill a hole with a long drill, then drill a corresponding hole in the vertical part of the sliding piece of the plate.
[Drawing: side view of the crank mechanism, with the rod "===", the "PN" washer-and-nut pairs, the embedded nut "X" and the sliding part of the stator labelled.]
From the left:
- The thread on the rod starts at "PN": P is a washer, N is a nut tightened hard against the end of the thread, so that the crank spins freely but does not move back and forth (right-left in the picture). This nut must not unscrew during normal work.
- "===" is the threaded part of the rod. The rod ends with another "PN", a washer and a nut. To keep the nut from unscrewing, use plumber's PTFE (Teflon-like) tape.
- "X" is a specially embedded nut:
First, find the largest nut whose hole matches the thread on the rod. Seen from the front, it looks like this (hole omitted):
 _.....
/ \    |<-- a - nut's height
\_/....
Now drill part of the hole in the slider with a drill of diameter a. It should look like:
[Drawing: the stepped hole in the slider, with the dimensions "a" and "b" marked.]
b is the thickness of your nut.
|         |
|   _W_   |
|  /   \  |
|  | O |  |
| W\___/W |
|         |
W - screws. We can see O, the hole in the washer; the nut is pressed in under the washer. The screws secure the washer in place.
It usually takes a few tries to get the proper camera height. Once the height is right, tighten the bolts. It should look like this:
  ||
  |||_      <--A hinge, the angle of its opening is different than in this picture
  || \\
  ||  \\/   <-Bolt for camera
I=||===\\
  ||
I===== - Screw to set the angle precisely.
The base is a small wooden block, about 5x20cm, with the hinge fastened to its shorter side. At the end of this block a hole is drilled and the camera bolt is screwed in. A rubber washer (like the one in a water tap valve) is applied to secure the camera and hold it in the proper rotation.
This part is used to keep pages opened at 90 degrees. My platten is made of glass from a bookshelf. The platten should be a bit longer than the stator plate. It is made of two glass sheets arranged in a "V".
We have our glass sheet:
 ____
|    |
|    |
|    | <-B
|____|
  A
We need a C-shaped bar a bit shorter than 4*A+2*B. In cross-section the C-shaped bar looks like this:
[Drawing: C-shaped cross-section, with the spacing between the arms marked.]
The spacing between the arms has to fit:
\    /
 \  /
  \/
We need to cut a rectangular piece out of one side of the channel, and a triangle out of the "Comb" of the channel. Then the bar only needs to be bent. The bar becomes much weaker in this place!
One arm seen from the front. Dots are the edge of the far arm. Comb at the bottom.
_____  ____
_____\/____
At the top you can see the cut from Step 1.
Now try the bar on the glass sheets lying on the stator plates (check the 90-degree angle using a set square!). Mark the ends of the glass sheets on the bar and cut the arms at this height, preserving the comb. One side goes in, the other out; bend them 90 degrees to wrap the edges of the glass sheets.
So 2 glass sheets with a single metal bar, seen from the top, look like this:
  _______
 |   |   |
 |   |   |
||   |   ||
||   |   ||
||===|===||
Lines "||" and "=" are the C-shaped bar wrapping the glass sheets.
The second bar is placed on the other side.
To fix the glass to the bar, use the following "sandwich":
======
|DDDDDDD          <-A wooden strip (e.g. from floor panel)
|XXX              <-Window foam pad
|ZZZZZZZZZZZZZZZZ <-GLASS SHEET
|XXX              <-Window foam pad
======
To make a full frame, join two bars using joints:
======
|DDDDD            <- A wooden block protected with insulation tape
|DDDDD
|ZZZZZZZZZZZZZZZZ <-GLASS SHEET
|XXX              <-Window foam pad
======
The wooden block goes into both C-shaped bars. Drill the bar in this place and fix it using screws.
The platten should be squeezed with wide cable ties to keep the 90-degree angle. At the corner, the sharp glass edge can easily cut the ties, so use a piece of aluminium (e.g. from cans) to protect them.
_____________      <--Cable tie goes around
\\         //
 \\       //
 O----------O      <--U-shaped handle kept with cable tie
   \\   //
    \\ //
     \\//
      \/           <--Cable tie goes to the edge of glass, protect it!
Such a platten is quite durable and can be used without a lifting mechanism: just lean it back to lift it.
Remember to glue foam pads on the stator in such a way that they won't collide with a book. They keep the platten glass from getting scratched.
CAMERA COLUMN (Right one)
I found that there is no space for a support beam here. I used a piece of wood (XX) to fasten the column to the stator support beam:
[Drawing: the right-hand column fastened to the stator support beam through the piece of wood (XX).]
LIGHTING
[Drawing: the scanner seen from the front, with the light bulbs (X) above the camera columns.]
Normal 40W light bulbs (X) are used, fastened on metal pipes bent by about 45 degrees and mounted in holes drilled in the tops of the camera columns. The cable goes through the pipe, the hole in the column and a hole in the side of the column:
[Drawing: section through the top of the column, showing the pipe and the hole for the cable.]
From top:
 _____
|  :  | <- Hole for cable (unseen, drilled in wood)
|  O  | <-Pipe nest
|_____|
OK, mains voltage is running here. Do not connect cables just by twisting them, use insulation tape, cover critical parts, don't leave any wire exposed and it will be fine. If you have a power cable with a plug, the safest way to service the electrical installation is to keep the plug in your pocket.
3 wall-mounted switches are used. 2 are single and turn the light bulbs on and off; one is double and turns on the power supply for the cameras and a socket for the computer.
230V--+---[SW]---Light 1
      |
      +---[SW]---Light 2                 ----Monitor
      |                                 /
      |   |SW|---------------[Socket]------Computer
      +---|  |
          |SW|---------------[TRAFO+G]----- Regulator ----<>-----Camera 1
                                          \ ---- Regulator -------<>------Camera 2
                                                         Connector
Regulator (the trimmer's slider is connected to one of its end terminals):
           3 _______ 2
        +>--| LM317 |--+-------+---->Out
 ___        |_______|  |       |
| O |           |1    |-|      |+
|---|           |     |_|240R ===
|___|           +--+---|      --- 470uF/16V
 |||               |           |-
 123              |-|/         |
                  |/| 5K      ===
                 /|_|
                   |
                  ===
Fasten separate or insulated heatsinks to the LM317 circuits (the metal tab of an LM317 is connected to its output). For two cameras, 2 regulators are needed because:
1. The current from one regulator is only 1.5A.
2. 2 different cameras may require 2 different voltages.
In many cases a small fan has to be added to cool the linear regulators. In my unit it runs from a linearly regulated 5V supplied from the transformer. It is a low-power 12V fan; at 5V it runs quietly but efficiently.
Remember, the power dissipated by a linear regulator is ca.:
P = (Uin-Uout)*I
Where: Uin - input voltage, Uout - output voltage, I - current drawn.
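The trimmer range and the heat to get rid of can be estimated before soldering. The sketch below uses the usual LM317 relation Vout = 1.25 V * (1 + R2/R1), with R1 being the 240R resistor and R2 the 5K trimmer, and the dissipation formula given above. The function names and the 12 V / 3.3 V / 1 A operating point are assumptions for illustration only; the small current of the adjust pin is ignored.

```shell
# Approximate LM317 output voltage for R1 (OUT to ADJ) and R2 (ADJ to ground), in ohms.
# Uses Vout = 1.25 * (1 + R2/R1); the adjust-pin current is ignored.
lm317_vout() {
    LC_ALL=C awk -v r1="$1" -v r2="$2" 'BEGIN { printf "%.2f\n", 1.25 * (1 + r2 / r1) }'
}

# Power dissipated by a linear regulator in watts: P = (Uin - Uout) * I.
regulator_power() {
    LC_ALL=C awk -v uin="$1" -v uout="$2" -v i="$3" 'BEGIN { printf "%.2f\n", (uin - uout) * i }'
}

# Theoretical range with the 240R resistor and the 5K trimmer.
# The real maximum is lower: the output cannot exceed the input voltage.
echo "Trimmer at 0R: $(lm317_vout 240 0) V"
echo "Trimmer at 5K: $(lm317_vout 240 5000) V"

# Hypothetical operating point: 12 V in, 3.3 V out, 1 A drawn by the camera.
echo "Dissipation:   $(regulator_power 12 3.3 1) W"
```

Several watts per regulator is already enough to justify the heatsinks and the fan mentioned above.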
Triggering is easy with CHDK if you have a Canon camera, or with high-end cameras that have an external shutter release. I don't recommend taking cameras apart to get at their shutter switches, as you may find a button with 12 tracks going to it, which need to be connected in a special (and unknown) way to focus and take a photo. If there is no way to trigger the camera with an external switch, use PTP and Linux shell scripts with the gPhoto2 program.
Generally, you need a switch which presses Return (Enter). There are 2 ways to do it:
1. Bring out the wires of the Return key from your keyboard. This requires taking the keyboard apart and fitting e.g. a small socket to it. In my small PoS keyboard there was space for a small jack socket. You also have to figure out which pins of the IC are shorted when Return is pressed. "Shorted" may mean even 130 Ohms between them, because the foil has its own resistance. To protect the chip ports, if you have measured a significant resistance, add some resistance to your connector.
2. Use the PCB from an old keyboard, connect the proper wires and plug this keyboard in as a second one (two USB keyboards, or one USB and one PS/2 keyboard, may be used at the same time).
Button      Connector   ____PCB
  \--------------<>----|____|---------
Thanks to the connector, we can remove the platten for cleaning.
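On the software side, the button then only has to end a line of input. A minimal sketch of such a loop with gPhoto2 follows; it is not the author's operate.sh. The port values, the left/right file prefixes and the DRY_RUN switch are assumptions made for the example, while --port, --capture-image-and-download and --filename are regular gphoto2 options.

```shell
# Camera ports are placeholders; take the real ones from "gphoto2 --auto-detect".
CAM1="usb:001,004"
CAM2="usb:001,005"

# Take one photo on one camera and download it as <prefix>_<page>.jpg.
# With DRY_RUN set, the command is only printed, so this runs without cameras.
capture() {
    local port="$1" prefix="$2" page="$3" file
    file=$(printf '%s_%04d.jpg' "$prefix" "$page")
    ${DRY_RUN:+echo} gphoto2 --port "$port" --capture-image-and-download --filename "$file"
}

# Every Return (from the keyboard or from the wired button) shoots both cameras.
scan_loop() {
    local page=1 _line
    while read -r _line; do
        capture "$CAM1" left "$page" &
        capture "$CAM2" right "$page" &
        wait    # both cameras must finish before the next page
        page=$((page + 1))
    done
}

# Dry run for page 1; unset DRY_RUN and call scan_loop to scan for real.
DRY_RUN=1
capture "$CAM1" left 1
capture "$CAM2" right 1
```

Running both captures in the background and waiting for them keeps the two shots close together in time, at the cost of two gphoto2 processes per page.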
It is useful to add some measuring scales to the plates. They let you monitor the book position during scanning and measure resolution easily.
Measuring scales can be made of freely available paper measuring tapes from building/hobby shops. They can be fixed with wide transparent tape. It is handy to add the following measuring tapes:
1. Resolution-marking, with inches marked (1in=2.54cm), to measure pixels per inch. The vertical scale goes in the center of the plates, with zero at the bottom; the horizontal one goes outwards from the user, a few centimeters above the bottom. This scale is also used to check that the book has not moved too much during scanning.
2. Additional scales showing maximum range
3. Plate drawback scale.
 ___________   ___________
|           | |           |
|    ---I---| |---I---    |
|    |  I   | |   I  |    |
|  =====I===| |===I=====  |
|    |  I   | |   I  |    |
|    ---I---| |---I---    |
|___________|~|___________|
Where: ==== - vertical inch scale, I - horizontal inch scale, --- and | - range scales, ~ - drawback scale (below stator).
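Reading the resolution off the inch scale is a single division: the number of pixels between two marks, divided by the inches between them. A small sketch with made-up pixel positions; the function names are not from the original scripts.

```shell
# Pixels per inch from two scale marks found in the photo.
# Arguments: pixel position of the first mark, of the second mark,
# and the distance between the marks in inches.
measure_ppi() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v inches="$3" \
        'BEGIN { d = b - a; if (d < 0) d = -d; printf "%.0f\n", d / inches }'
}

# The same for a scale marked in centimetres (1in = 2.54cm).
measure_ppi_cm() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v cm="$3" \
        'BEGIN { d = b - a; if (d < 0) d = -d; printf "%.0f\n", d * 2.54 / cm }'
}

# Made-up example: the 0in and 5in marks sit at pixel rows 4210 and 2710.
measure_ppi 4210 2710 5
```

Comparing the mark positions between the first and the last photo of a session also shows whether the book has moved.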
Any small PC with USB ports (preferably USB 2.0) and Linux with the gPhoto2 program will do. It is good to have a network adapter too, to send the finished files over the network. I started with a small embedded system with a 300MHz Cyrix Kahlua (like the National Semiconductor GX1, comparable to a Pentium MMX 266MHz) and 64MB of RAM, but if you have 256MB or more you can afford a simple GUI. If you have more than 2 cores and 1GB of RAM, you can even process with ScanTailor on this PC (but it's rather slow). Currently I'm using an embedded Celeron 1.5GHz with 512MB of RAM, running Debian with the software.
In general, scripts for losslessly rotating 10MPix JPEG files (not to be confused with EXIF rotation, which only changes a few bytes in the JPEG, as opposed to transposing the matrices and recompressing them losslessly) are very slow with 64MB of RAM. Linux manages memory differently than Windows, DOS or some older Unix systems. If some memory is unused, Linux uses it to speed things up, storing frequently accessed data in buffers. If a program wants more memory, the buffers are simply freed: the memory is still all in use, but the system runs more slowly. So if you add more memory, in most cases you just get a faster system. A 10MPix photo requires 30-40MB of memory to be processed if it is processed in an optimal way, and in Open Source tools it may not be. When 64MB of RAM is installed, most of it (ca. 50MB) is used by system services (display, console, network) and the processing program, so not much is left for the photo. In such a situation there are no buffers at all, and the system must fall back on disk-based space (swap), which is much slower. In general, prefer computers with >=128MB, because 64MB is really slow.
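The 30-40MB figure agrees with a simple estimate: about 10 million pixels at 3 bytes each (red, green, blue). A sketch of that arithmetic, assuming a typical 4:3 10MPix frame of 3648x2736 pixels; the resolution and the function names are assumptions, the 64MB and 50MB figures come from the text above.

```shell
# Rough size of an uncompressed RGB image in MB (10^6 bytes):
# width * height pixels, 3 bytes per pixel.
raw_image_mb() {
    local width="$1" height="$2"
    echo $(( width * height * 3 / 1000000 ))
}

# Does the image fit into what is left of RAM? Prints "ok" or "swap".
fits_in_ram() {
    local image_mb="$1" total_mb="$2" used_mb="$3"
    if [ $(( total_mb - used_mb )) -ge "$image_mb" ]; then
        echo ok
    else
        echo swap
    fi
}

size=$(raw_image_mb 3648 2736)
echo "One decoded frame: about ${size}MB"
echo "64MB machine, 50MB in use:  $(fits_in_ram "$size" 64 50)"
echo "128MB machine, 50MB in use: $(fits_in_ram "$size" 128 50)"
```

Any working copy made by the tool adds to this, which is why a single frame can already push a 64MB machine into swap.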
To operate the scanner, special scripts are needed. I've made Bash
scripts for use either on a general-purpose machine or on a specialized
embedded PC configured for scanning only.
Before installing these scripts, get the following packages (Debian Jessie):
aptitude install dialog mc fbi libjpeg-turbo-progs gphoto2 beep imagemagick sudo htop rsync ncftp
In Debian Wheezy:
aptitude install dialog mc fbi libjpeg-progs gphoto2 beep imagemagick sudo htop rsync ncftp
You can also install usbmount (a USB drive
automounter) if it doesn't collide with PTP.
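The two package lists above differ only in the name of the JPEG tools package. A small sketch that picks the right one by release codename and prints the install command; the function names are made up, and only the two releases named in the text are covered.

```shell
# JPEG tools package for a Debian release codename (only the releases from the text).
jpeg_tools_pkg() {
    case "$1" in
        jessie) echo "libjpeg-turbo-progs" ;;
        wheezy) echo "libjpeg-progs" ;;
        *)      echo "unknown release: $1" >&2; return 1 ;;
    esac
}

# Print the full install command for the given release.
install_cmd() {
    local jpeg_pkg
    jpeg_pkg=$(jpeg_tools_pkg "$1") || return 1
    echo "aptitude install dialog mc fbi $jpeg_pkg gphoto2 beep imagemagick sudo htop rsync ncftp"
}

# The codename could also be read with "lsb_release -sc" where that tool is installed.
install_cmd wheezy
```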
WARNING: In Debian Jessie there may be a problem with gPhoto2. The symptoms are as follows: a single photo is taken (or two, one from each camera), then the second shot hangs the camera (or both), the request times out and gPhoto2 cannot take any further pictures. The only fix I have found is to go back to Debian Wheezy.
If you use a GUI, make sure to permanently disable PTP camera automounting/detection, or you won't be able to access the cameras from gPhoto2. In LXDE you usually have to forcefully remove the gvfs packages, or it won't work, as the options in PCManFM do not work for cameras (tested under Wheezy).
Remember to check (gphoto2 --list-ports) how your cameras are identified and change the CAM1= and CAM2= lines from usb:001,... to your own values. On your mainboard it may be e.g. usb:004; it depends on how the USB bridge is built.
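The ports can also be picked out of gPhoto2's output by a script instead of by eye. This sketch uses gphoto2 --auto-detect, which lists the detected cameras together with their ports, and keeps the last column of every line that looks like a USB port. The sample text stands in for real output and is illustrative only; the model names and numbers on a real machine will differ.

```shell
# Extract the port column (usb:BUS,DEVICE) from "gphoto2 --auto-detect" output.
list_camera_ports() {
    awk '$NF ~ /^usb:[0-9]+,[0-9]+$/ { print $NF }'
}

# Illustrative sample only, used so that the sketch runs without cameras.
sample_output='Model                          Port
----------------------------------------------------------
Nikon Coolpix L20 (PTP mode)   usb:004,002
Nikon Coolpix L20 (PTP mode)   usb:004,003'

# On the real machine use:  ports=$(gphoto2 --auto-detect | list_camera_ports)
ports=$(printf '%s\n' "$sample_output" | list_camera_ports)
CAM1=$(printf '%s\n' "$ports" | sed -n 1p)
CAM2=$(printf '%s\n' "$ports" | sed -n 2p)
echo "CAM1=$CAM1"
echo "CAM2=$CAM2"
```

Which of the two detected cameras is the left one still has to be checked by hand, for example by taking a test shot.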
On-demand scripts package
Contains operate.sh, a script to operate the cameras, and join.sh, a script to join two directories and rotate the images. It also contains a Windows batch file to rotate images, which requires JPEGTran (http://jpegclub.org/jpegtran/).
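To show what joining and rotating involves, here is a sketch that only prints the commands; it is not the author's join.sh. The -rotate, -copy and -outfile switches are regular jpegtran options. Which side needs 90 and which 270 degrees, and the odd/even page numbering, are assumptions that depend on how the cameras are mounted.

```shell
# Print the jpegtran command that losslessly rotates one file.
# side: "left" or "right"; the 90/270 assignment is an assumption.
rotate_cmd() {
    local side="$1" src="$2" dst="$3" angle
    case "$side" in
        left)  angle=90 ;;
        right) angle=270 ;;
        *)     echo "unknown side: $side" >&2; return 1 ;;
    esac
    echo "jpegtran -rotate $angle -copy all -outfile $dst $src"
}

# Interleave the two directories: shot N of the left set becomes page 2N-1,
# shot N of the right set becomes page 2N (assumed page order).
joined_index() {
    local side="$1" shot="$2"
    if [ "$side" = left ]; then
        echo $(( 2 * shot - 1 ))
    else
        echo $(( 2 * shot ))
    fi
}

# Example: third shot of each camera.
rotate_cmd left  left/0003.jpg  "book/$(joined_index left 3).jpg"
rotate_cmd right right/0003.jpg "book/$(joined_index right 3).jpg"
```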
Software used to process scans on a high-resource computer:
- ScanTailor - Requires Qt and returns nice, processed scans. Doesn't build easily on CPUs other than Intel.
- ScanKromsator - No longer developed, complicated, was used as a "test site" for ScanTailor algorithms.
- LizardTech DjVu Solo - Runs in Wine, makes DjVu files smaller than Linux tools do, because it uses segmentation algorithms not available in Linux programs.
- ImageMagick - For general image processing.
- JPEGTran - Lossless JPEG rotation.
- DjVu Small and DjVu Imager - Work under Wine. Make small DjVu files. It is recommended to install WinDjView under Wine along with them.
- Tesseract OCR and OCRFeeder - Best free OCR.
- DjVu Toy - Small GUI tool to join DjVu files. Works under Wine.
Do not blindly use the compression from DjVuLibre, because it lacks pre-processing. Pre-processing is the main thing that makes DjVu files small. DjVu Solo is a bit outdated, but it still usually generates smaller files than DjVuLibre. Linux DjVu tools cannot easily compress a merged DjVu, and this makes compression even worse.
Environment scripts package
Things to customize first:
- As always, adjust usb:001,... to your needs; sometimes it may be usb:002 or another number. This is set in operate.sh and acquire.sh.
- In join.sh there is a function called SHIT (SHow Image in Terminal), which should be modified for use either in text mode only or with an X viewer (feh by default).
- The buffer is in the home directory by default.
- Welcome.msg and Goodbye.msg are there for the user to customize.
- The last used FTP password is saved on disk in a dangerous, non-encrypted form. I think that if you can set up Debian, you will also be able to make a limited account on your network drive/NAS.
Start with bootstrap.sh. It checks for the presence of Dialog, makes the Buffer directory in your home folder and starts operation by greeting you with the contents of WELCOME.MSG. Then it displays a menu, as shown in the picture.
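The start-up steps described above can be sketched roughly as follows. This is not the author's bootstrap.sh: the function and variable names are made up, and the window size passed to dialog is arbitrary. The --textbox option is a regular dialog widget.

```shell
# The directory name "Buffer" follows the text; the variable name is an assumption.
BUFFER_DIR="${BUFFER_DIR:-$HOME/Buffer}"

# Succeeds if the given program is installed.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Create the buffer directory if it does not exist yet.
ensure_buffer() {
    mkdir -p "$BUFFER_DIR"
}

bootstrap() {
    if ! have_tool dialog; then
        echo "dialog is not installed, see the package list" >&2
        return 1
    fi
    ensure_buffer || return 1
    # Greet the user if the message file is present (20 rows by 70 columns).
    if [ -f WELCOME.MSG ]; then
        dialog --textbox WELCOME.MSG 20 70
    fi
    # The real script would show its menu at this point.
}
```

Nothing runs until bootstrap is called, so the functions can be tried one by one first.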