User drops pages into the drag-and-drop uploader
Popup after uploading images: choose service level
page thumbnails show pages are now in typing queue
Typist logs in and chooses a page to type
Typist enters typing mode and types the pages
Typist submits and confirms that pages are accurate
User attention items: pages are awaiting the user's approval
User can quickly read and approve pages in a row if there are many to approve.
User sees the full-size image next to the text to determine if the typing was accurate
The pages were marked as approved, as page thumbnails now indicate
Journal can be synced to FamilySearch to match up tags with FamilySearch person memories
User can share a link with anyone to read the journal in the reading and sharing area.
Tags are indexed and added to the tag search database
If there were a fire in your house, and you could save one possession, what would it be? For me it's always been an obvious answer: my journals. I've learned that this is common among people who write frequently and seriously in personal journals, and among people who store the journals of deceased ancestors.
Throughout my life I've filled many journals with hundreds of pages, and as the stacks grew, so did my worry that I might lose them, along with a desire to digitize and protect them. Immediately after I got back from my mission I started scanning the pages of all my journals. It was a tedious process, but every page scanned and uploaded to Dropbox was a small weight lifted off my shoulders.
Soon, others started asking if I could do theirs. People even brought me the journals of their deceased ancestors that they'd been storing. Because I now had my journal pages in the cloud, it was possible to share the experiences in them with friends or family by sending a direct link to the image or emailing a copy of it. As I did this from time to time, I quickly realized how hard it was to find the right image to send, and how hard it was for others to read my bad handwriting. I decided that typing my journals would help.
After about 3 hours and only 15 pages in, I decided to look for a typist to do it for me. After seeing the high prices of manuscript typists, I started looking on outsourcing sites like Elance to find someone who could do it cheaply and well. I found someone from the Philippines named Anthony who looked promising. Sixty dollars and several days later I had hundreds of pages typed in one large text file. I could keyword search my whole journal in a split second, and I could copy and paste text to send to friends. It was awesome!
After going to college for computer science and working professionally as a web developer for a few years, I decided to revisit this concept of journal digitization as a business model. I talked to lots of people who had an interest and came up with a design for a platform to automate the process and leave people with a much better final result than a folder on Dropbox and a huge text file.
I created Legacy Scribes LLC, bought the domain, got the repo, and started coding.
What it does
Legacy Scribes takes handwritten journals and turns them into text. It doesn't do this automatically through a magical handwriting-conversion algorithm (that doesn't exist), but it may as well for all the user knows. Users simply drop their journal image files into our site, and a few days later the pages are typed and tagged as accurately as any professional typist can manage. The digitized text is attached to the journal image in the database and can be shown side by side with the image when it is searched for or shared. It can also be exported into different formats, including a PDF to be printed as a book.
The text isn't only typed; it is also tagged. Our typists use special characters while they type to mark a name, a date, or a place. These tags are indexed separately, and our entire database of journals can be searched instantly for any tag. Through these tags we can programmatically find the FamilySearch persons they belong to and attach the pages as sources or memories to those persons through the FamilySearch API. Researchers can also use the tags to find handwritten journal sources by area, time, or name.
Users are in charge of the privacy levels of their pages and can use passwords or special links for private content. They can share entire journals or specific pages on social networks or through other channels to give someone access. Using these tools, someone might have us digitize their great-grandfather's journals and then send everyone in the family an invitation link so they can all access the journals from a computer, tablet, or phone at any time. The family can then read the journals together and gain a greater understanding of their ancestor. Later, perhaps someone researching World War I will find the journals through the tag search and request access to read that ancestor's entries about the war.
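As a rough illustration of the tagging and tag index described in this section, here is a minimal Python sketch. The bracket markup ([n:...] for a name, [d:...] for a date, [p:...] for a place) and the function names are assumptions made up for this example; the actual special characters our typists use are not shown here.

```python
import re
from collections import defaultdict

# Hypothetical markup (an assumption for this sketch):
# [n:...] marks a name, [d:...] a date, [p:...] a place.
TAG_PATTERN = re.compile(r"\[(n|d|p):([^\]]+)\]")
TAG_KINDS = {"n": "name", "d": "date", "p": "place"}


def extract_tags(typed_text):
    """Return (clean_text, tags); tags is a list of (kind, value) pairs."""
    tags = [
        (TAG_KINDS[m.group(1)], m.group(2).strip())
        for m in TAG_PATTERN.finditer(typed_text)
    ]
    # Replace each marker with its inner value so readers see plain text.
    clean_text = TAG_PATTERN.sub(lambda m: m.group(2).strip(), typed_text)
    return clean_text, tags


def build_tag_index(pages):
    """Map (kind, lowercased value) to a sorted list of page ids."""
    index = defaultdict(set)
    for page_id, typed_text in pages.items():
        _, tags = extract_tags(typed_text)
        for kind, value in tags:
            index[(kind, value.lower())].add(page_id)
    return {key: sorted(ids) for key, ids in index.items()}
```

Calling build_tag_index on a dictionary of page ids and typed text gives a lookup such as index[("name", "john smith")], which returns every page that mentions that name. Keeping the index separate from the page text is what would let a tag search run without scanning every journal.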
There are already apps built to help people do this work themselves. Our app is built for someone who doesn't want to do the work themselves, or doesn't have the time or skills for it. We use crowdsourcing to get the work done quickly. When a typist logs into the platform they are given a page to type. When a user submits their pages to be typed, they choose whether they would like to approve the pages for accuracy themselves or have us arbitrate (recommended). When a typist finishes a page and submits it, it goes to the arbitrator, who either approves it or rejects it (if it has errors). The typist's rating comes from their approval percentage, and a typist with too many errors is suspended or removed. Before a journal comes into the queue, we take a quick look and rate its difficulty level, and we pay the typists more or less depending on that difficulty (about $0.0002 per character on average). Higher-rated typists are given the more difficult journals.
Because file storage is so cheap now, it isn't a high cost for us to store the files for free. We keep all of the image files in Amazon S3 and don't charge a monthly price to keep them in the cloud. Currently we charge $1.00 per page (or image) that we type, but we've been able to discount that to $0.40 per page and still make a profit. This price pays for the typing and tagging service and lifelong hosting of the images and text. This model may change in the future, but for now it seems to work.
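To make the rating and pay rules concrete, here is a small Python sketch. Only the $0.0002 average per-character rate comes from the description above; the difficulty multipliers, the suspension threshold, the minimum number of reviewed pages, and all names are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

BASE_RATE_PER_CHAR = 0.0002  # average rate in dollars, as quoted above

# Hypothetical values: the real multipliers and thresholds are not stated.
DIFFICULTY_MULTIPLIER = {1: 0.75, 2: 1.0, 3: 1.5}
SUSPEND_BELOW = 0.80   # assumed minimum approval rate
MIN_REVIEWED = 10      # assumed pages reviewed before suspension can apply


@dataclass
class Typist:
    approved: int = 0
    rejected: int = 0

    def record(self, was_approved):
        """Record one arbitration decision for a submitted page."""
        if was_approved:
            self.approved += 1
        else:
            self.rejected += 1

    @property
    def rating(self):
        """Approval percentage as a fraction, or None before any review."""
        reviewed = self.approved + self.rejected
        return self.approved / reviewed if reviewed else None

    @property
    def suspended(self):
        reviewed = self.approved + self.rejected
        return reviewed >= MIN_REVIEWED and self.rating < SUSPEND_BELOW


def page_pay(char_count, difficulty):
    """Pay in dollars for one typed page, scaled by journal difficulty."""
    rate = BASE_RATE_PER_CHAR * DIFFICULTY_MULTIPLIER[difficulty]
    return round(char_count * rate, 4)
```

Under these assumed numbers, a 1,500-character page at the middle difficulty pays 1500 × 0.0002 = $0.30, and a typist with 7 approvals out of 10 reviewed pages falls below the assumed 80% threshold and is flagged as suspended.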
For people who don't want to scan their pages themselves, we can scan them for a small fee. If we have a scanner in the area (currently only St. George, UT) we can go to their house to scan the pages. We use portable high definition overhead scanners to scan quickly and effectively. If we don't have a scanner in the area, they can send us the journals and we will take museum-quality care of them to scan them, and send them back.
We've built the platform to be a one-stop solution for organizing and digitizing the entire collection of journals from someone's life. There are thousands of boxes of journals sitting in attics and closets. We want to help take those down from the attic and put them where they will be useful.
How I built it
Before building anything I had customers who wanted me to do the work for them. I was using oDesk to manage my typists, and I was the middleman. The platform was designed step by step to replace me as the middleman. As the platform progressed I made my customers users on the system and slowly shimmied myself out of the picture, until they were using the platform to manage their things without needing me. As I signed up new people I paid careful attention to the questions they had, and refined the process and UI to reduce the questions to almost zero. Most of my typists were from the Philippines, and it took a lot of training at first to get them used to the system and doing everything correctly. Every question they had, or mistake they made, gave me a new project to work on. Slowly I was able to shimmy myself out of the picture with them too, to the point where I could sign up a new typist with a few paragraphs of explanation and the system would get them through the rest.
As with any product there is still a lot to be done. But right now it is being used by a handful of people in production with hardly any communication or training on my part.
Challenges I ran into
One of the hard challenges was dealing with bad typists. There were great typists who would get every page almost perfect, but there were some who would get everything wrong. We were having our users arbitrate the pages themselves, which reflected badly on us when they received a poorly typed page. This is what prompted us to come up with the typist rating system, as well as to provide our own arbitration, which has turned out to be a great solution.
Another challenge is page numbering. If someone sends us 300 pages as image files, but the file names aren't numbered and the pages aren't numbered in handwriting, we have no way to know how to order them in the system. When that happens we can only hope they were in order when they were dropped into our system. We do let the user rearrange them afterward if they want, but that is not easy if they are all shuffled. So far our solution is to prompt the user before they drop the files in to make sure they're ordered correctly. When we do the scanning ourselves (which is most of the time now), we spend a few seconds handwriting a small number on the corner of each page if there isn't already one. This way, when the typist does the work, they provide the page number, and the pages can easily be arranged automatically.
Another challenge has been the question, "Will people actually send their journals in the mail?" The answer is sometimes. Some people are too nervous to place the journals in a box and send them away. We understand how they feel, and simply encourage people to use tracked and trustworthy shipping. For those who are too nervous to send them to us, we point them toward helpful scanning references, or to their nearest family history center for help scanning the journals.
In getting opinions about it, many people say, "Why would somebody want to share their journals or their deceased family member's journals? Aren't those meant to be private? Is it okay to make public the journal of a deceased ancestor?" The answer is sometimes. A journal can contain many different kinds of content. It could be the diary of a 14-year-old girl writing about her feelings about boys. Or writings of deep sadness felt by a mother after a miscarriage. Or a middle-aged man writing about how he's been playing too many video games. Or the feelings of a proud new father after having a son. Or the sadness of an 87-year-old man who has lost his lifelong wife. In the right context, anything can be shareable and beneficial to someone. Reading the journal of someone who has experienced something similar to you can be profoundly helpful and meaningful. Privacy will be an evolving, ongoing development with this project. It is up to the discretion of whoever possesses the hard copy of a journal to decide how much they will share or make public, whether it is their own or their deceased great-grandfather's. In reality they are already in that position and can share it with, or withhold it from, whomever they wish.
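The automatic arrangement mentioned here can be sketched in a few lines of Python. The field names page_number and upload_index are hypothetical; the sketch assumes each page record carries the number the typist entered (or None if there was none) and the position in which the image was uploaded.

```python
def order_pages(pages):
    """Order pages by typist-provided page number.

    Pages without a number keep their upload order and go last, so the
    user can place them by hand afterward.
    """
    numbered = [p for p in pages if p.get("page_number") is not None]
    unnumbered = [p for p in pages if p.get("page_number") is None]
    # Ties on page number fall back to upload order.
    numbered.sort(key=lambda p: (p["page_number"], p["upload_index"]))
    unnumbered.sort(key=lambda p: p["upload_index"])
    return numbered + unnumbered
```

Putting unnumbered pages at the end, rather than guessing where they belong, is a design choice for this sketch: it keeps the numbered run intact and leaves only the leftovers for manual rearranging.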
Accomplishments that I'm proud of
Finding our first non-related, non-friend customers was exciting. We used flyers and business cards to try drumming up some traffic to test out the St. George area, and received a few calls from that. I also gave a presentation to the managers of the FamilySearch center in St. George, and in addition to selling to some of them, I saw them start sending referrals our way. We've done social media marketing and PPC to gain a few customers online.
All of the coding was done between the hours of 11pm and 3am for the most part. Working full time and taking care of a family makes it difficult to put much time into a project during normal hours, so I had a lot of late nights.
As a career I lead the development of software that is accessed daily by millions of users around the world. I have much experience in polishing software made for both administration and end-user engagement.
What I learned
I have learned a lot about journals. In doing this project I have read hundreds of pages of journals from all kinds of people in all kinds of situations. That is my favorite part. My hope is that if people are given an easy place to find journals to read, they will do it more, and it will enlarge our understanding of humanity.
What's next for Legacy Scribes
We are currently using the sandbox API for FamilySearch, so we're not yet a FamilySearch partner. Becoming one is our next obvious step to get some publicity. After that, we will do interviews and news articles to put ourselves in front of people. For a few months we will continue to work the St. George area for local customers, using traditional advertising (signs, flyers, and cards) and the FamilySearch center, until we have everything polished. Then we will branch out to SLC. We are in talks with mortuaries and funeral directors to make our services an add-on package to theirs.
We will actively seek funding throughout this coming year to bring Legacy Scribes to everyone by the end of the year.