Using photos to 'scan' documents has become more popular because snapping a photo is faster and easier than finding a scanner somewhere in the campus library. The problem is that these photos look less professional than a scanned copy: they are often shot at an angle and sometimes include things other than the document itself, like the table it was sitting on. That is what prompted me to build an app that could crop such an image and produce output indistinguishable from something out of a scanning machine.

What it does

The app is intended to work in two modes. In the automatic mode, the computer extracts the document from the image completely by itself, first detecting the edges and then finding the prominent lines in the photo. Alternatively, the user can select the vertices of the page and let the computer handle the cropping. Either way, the intended output can then be sent via SMS or perhaps uploaded to Dropbox.
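The "prominent lines" step can be illustrated with a small Hough transform. This is a minimal sketch in plain Java, not the app's actual code: it assumes the edge detector has already produced a non-empty rectangular boolean edge map, and the names `HoughSketch`, `Line`, `detectLines`, and `minVotes` are made up for illustration. Each edge pixel votes for every line that could pass through it, and accumulator cells with enough votes are reported as lines.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal Hough line transform over a binary edge map (illustrative sketch). */
final class HoughSketch {

    /** A line in normal form: x*cos(theta) + y*sin(theta) = rho. */
    static final class Line {
        final double rho;
        final double thetaDegrees;
        final int votes;

        Line(double rho, double thetaDegrees, int votes) {
            this.rho = rho;
            this.thetaDegrees = thetaDegrees;
            this.votes = votes;
        }
    }

    /**
     * Votes every edge pixel into a (theta, rho) accumulator and returns
     * all cells that collected at least minVotes votes.
     * Assumes edges is non-empty and rectangular, indexed as edges[y][x].
     */
    static List<Line> detectLines(boolean[][] edges, int minVotes) {
        int height = edges.length;
        int width = edges[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(width, height));
        int thetaSteps = 180; // 1-degree resolution, theta in [0, 180)
        int[][] acc = new int[thetaSteps][2 * maxRho + 1];

        // Precompute the trigonometric tables once.
        double[] cos = new double[thetaSteps];
        double[] sin = new double[thetaSteps];
        for (int t = 0; t < thetaSteps; t++) {
            cos[t] = Math.cos(Math.toRadians(t));
            sin[t] = Math.sin(Math.toRadians(t));
        }

        // Each edge pixel votes once per angle; rho may be negative,
        // so it is shifted by maxRho to get a valid array index.
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!edges[y][x]) {
                    continue;
                }
                for (int t = 0; t < thetaSteps; t++) {
                    int rho = (int) Math.round(x * cos[t] + y * sin[t]);
                    acc[t][rho + maxRho]++;
                }
            }
        }

        // Collect every accumulator cell that passes the vote threshold.
        List<Line> lines = new ArrayList<>();
        for (int t = 0; t < thetaSteps; t++) {
            for (int r = 0; r < acc[t].length; r++) {
                if (acc[t][r] >= minVotes) {
                    lines.add(new Line(r - maxRho, t, acc[t][r]));
                }
            }
        }
        return lines;
    }
}
```

A row of edge pixels at y = 5 shows up as a line with theta = 90 degrees and rho = 5. Because the accumulator is coarse, neighbouring angles can pass the threshold too, so a real pipeline would probably keep only local maxima before intersecting the lines to find the page corners.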

How I built it

Using Android Studio, and with great advice from the people at Stack Overflow, I was able to resolve most of my issues with setting up the camera and working with bitmaps programmatically. Some of the algorithms that I used I already knew, such as Canny edge detection and the Hough transform; others, like the affine transform, I developed during the course of the hackathon.
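For readers unfamiliar with the affine transform, here is a minimal sketch of one common way to set it up: solving for the six coefficients from three point correspondences with Cramer's rule. It is not the implementation from the hackathon, and the names `AffineSketch`, `solve`, and `apply` are hypothetical. The sketch assumes points are given as `{x, y}` pairs.

```java
/** Solves for the 2D affine map defined by three point correspondences (illustrative sketch). */
final class AffineSketch {

    /**
     * Returns {a, b, c, d, e, f} such that
     *   x' = a*x + b*y + c
     *   y' = d*x + e*y + f
     * maps each src[i] onto dst[i] for i = 0, 1, 2.
     */
    static double[] solve(double[][] src, double[][] dst) {
        double x0 = src[0][0], y0 = src[0][1];
        double x1 = src[1][0], y1 = src[1][1];
        double x2 = src[2][0], y2 = src[2][1];

        // Determinant of [[x0, y0, 1], [x1, y1, 1], [x2, y2, 1]];
        // it is zero exactly when the three source points are collinear.
        double det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1);
        if (Math.abs(det) < 1e-12) {
            throw new IllegalArgumentException("source points are collinear");
        }

        double[] m = new double[6];
        // row 0 solves for (a, b, c) from the x' targets,
        // row 1 solves for (d, e, f) from the y' targets.
        for (int row = 0; row < 2; row++) {
            double u0 = dst[0][row], u1 = dst[1][row], u2 = dst[2][row];
            // Cramer's rule: replace one column of the matrix with the targets.
            m[3 * row] = (u0 * (y1 - y2) - y0 * (u1 - u2) + (u1 * y2 - u2 * y1)) / det;
            m[3 * row + 1] = (x0 * (u1 - u2) - u0 * (x1 - x2) + (x1 * u2 - x2 * u1)) / det;
            m[3 * row + 2] = (x0 * (y1 * u2 - y2 * u1) - y0 * (x1 * u2 - x2 * u1)
                    + u0 * (x1 * y2 - x2 * y1)) / det;
        }
        return m;
    }

    /** Applies the coefficients returned by solve to a single point. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] { m[0] * x + m[1] * y + m[2], m[3] * x + m[4] * y + m[5] };
    }
}
```

For example, mapping (0,0), (1,0), (0,1) onto (10,20), (12,20), (10,23) gives a scale of 2 in x and 3 in y plus a translation of (10,20). One caveat: an affine map keeps parallel lines parallel, so it can only approximate the correction for a page photographed at an angle; fully undoing perspective would need a four-point projective transform.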

Challenges I ran into

Coming up with a good algorithm is always much faster on paper than actually implementing it. That's why, even though the affine transform seems to work in my C++ example, I had trouble with it when I ported it to Java. The Android camera also turned out to be very difficult to use: there is a simple version that returns a small thumbnail and a much harder version that returns a high-resolution image. For the purposes of this hackathon I used a scaled-up version of the thumbnail image.

Accomplishments that I'm proud of

I implemented the edge detection and affine transformation algorithms, even though the second one is buggy in the Java implementation. I was also pleasantly surprised by the performance of the custom view that I used to contain the photograph. The view was built entirely programmatically (no XML!) and displayed the images quite well.

What's next for Page Scanner

Getting it to work 100%, and then using the full capabilities of the camera rather than just the thumbnail that I've been working with so far.
