How to Avoid Headaches Translating Scanned Medical Documents

Posted by Lee Densmer on April 20, 2017 at 2:40 PM

4 Things Life Sciences Project Managers Need to Know About Working with Scans

A bad scan of a handwritten document? These can be impossible to read! Even worse when it’s in a doctor’s handwriting. But in Life Sciences localization projects, sometimes it’s the best you’ll get for a source file.

Translation companies frequently have to work with scanned documents such as clinical documentation, patient questionnaires, doctors’ notes, pathology reports, death certificates, hospital release forms, documents about patient health, and many other patient- or clinical-related documents.

There are significant challenges when working with these source documents, because they weren’t created for localization in the first place. What do Life Sciences project managers need to know about working with scans?

1. Documents must be recreated

When an LSP receives a scan, they have to make a choice about how to work with it.

The best practice is to recreate a scan in an electronic format. This generates the source file for a regular translation project: an electronic file that can be prepared for localization and translated using technology such as CAT tools with Translation Memories (TM), and automated quality checks that either look for errors or compare source to target. The use of these tools ensures the highest possible quality.

The three choices for recreation and translation are:

  1. The translator works directly from the scanned document, whether handwritten or not, and creates an electronic translated file, but this process takes place entirely outside of translation tools. It yields the lowest quality, and the translations aren’t stored in a TM for future reuse.

  2. A translator or production person recreates the original source scan as an electronic file. The translator can then use CAT tools to create a bilingual file during translation. This takes time but can produce the highest quality.

  3. The scan is converted to a source document via Optical Character Recognition (OCR) technology. The result is an electronic document that the translator can use to perform the translation, but OCR often struggles with images and poor handwriting, and some source languages, like Arabic or Hebrew, are tricky for it. The translator may need to refer to the original scan to include any elements the OCR didn’t pick up accurately. However, if the scan is mostly clear and the source language is an easy one to process, this method can produce adequate quality.

Regardless of what method you choose, formatting must be completed by a second-party production specialist so the document has the same layout and formatting as the source. Layout can be important when documents are submitted to authorities.

Knowing the advantages and disadvantages of each process allows you to choose with your eyes wide open.

2. Costs for recreation are high, but worth it

Most often these documents are only translated into one language, which means that the upfront costs of recreating the source are high relative to the scope of the whole project. Sometimes the recreation is more than 50% of the localization cost.

But remember, this is how you’re getting the highest quality. And at times, TM leverage can save more than the cost of recreation, especially on larger projects. When a project involves more than one target language, the upfront costs are spread across them: the source is recreated once and used for all.

Be sure to weigh the cost against the benefit, both in the short and longer terms.
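
To make the arithmetic concrete, here is a minimal sketch of how a one-time recreation cost behaves as target languages are added. The function names and all figures are hypothetical and chosen only for illustration; they are not real project rates.

```python
def total_cost(recreation_cost, translation_cost_per_language, num_languages):
    """Recreation is paid once; translation is paid once per target language."""
    return recreation_cost + translation_cost_per_language * num_languages


def recreation_share(recreation_cost, translation_cost_per_language, num_languages):
    """Fraction of the total project cost that goes to recreating the source."""
    total = total_cost(recreation_cost, translation_cost_per_language, num_languages)
    return recreation_cost / total


# Hypothetical figures: 600 to recreate the source, 500 to translate per language.
for languages in (1, 2, 4):
    share = recreation_share(600, 500, languages)
    per_language = total_cost(600, 500, languages) / languages
    print(f"{languages} language(s): recreation is {share:.0%} of the total, "
          f"{per_language:.0f} per language")
```

With these made-up numbers, recreation is more than half the cost of a single-language project, and its share falls as languages are added. The sketch ignores TM leverage, which would reduce the per-language translation cost further on repeat work.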

3. Quality is variable and must be defined

When working with such difficult source material, there has to be a conversation about quality requirements. Factors impacting the final quality include the quality of the source file, quality of the scan, legibility of handwriting, and the process chosen. What are the expectations? For example, if the handwriting is terrible, there must be an agreement on how to handle unreadable text.

Also, the need for quality may vary between deliverables: when is top quality required? When do you only need a gist?

Once the quality target is agreed upon, be sure to map the process to the quality requirement.
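
One way to picture this mapping is as a small decision rule over the three processes described in section 1. The sketch below is only an illustration: the function, its parameters, and the order of the checks are assumptions, not an established rule, and a real decision would also weigh cost and deadlines.

```python
from enum import Enum


class Method(Enum):
    DIRECT = "translate directly from the scan, outside translation tools"
    MANUAL_RECREATION = "recreate the source by hand, then translate in a CAT tool"
    OCR = "convert the scan with OCR, then translate in a CAT tool"


def suggest_method(needs_top_quality, scan_is_clear, ocr_friendly_language):
    """Hypothetical rule of thumb mapping a quality target to a process."""
    # Top quality calls for manual recreation, the highest-quality option.
    if needs_top_quality:
        return Method.MANUAL_RECREATION
    # OCR can give adequate quality when the scan is clear and the
    # source language is easy to process.
    if scan_is_clear and ocr_friendly_language:
        return Method.OCR
    # Otherwise, for a gist, working straight from the scan may be enough,
    # accepting the lowest quality and no TM reuse.
    return Method.DIRECT


print(suggest_method(needs_top_quality=False, scan_is_clear=True,
                     ocr_friendly_language=True).value)
```

Writing the rule down, even informally, makes it easier for both sides to see which inputs drive the choice and to agree on them before work starts.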

4. References can help

Make sure to gather all the references you can and provide them to the translators.

The original document can help both the production person recreating the document and the translator; they can compare their results with the original. Also, often the original document was a form, filled out from a template. Providing a blank version of that template can help to speed up the process of source recreation.

And of course, providing translation memories will help the translators’ productivity and boost consistency.


Regardless of which process you choose, the expectations must be clear for outputs and quality levels. The key is that all sides understand the factors involved and agree on the process. By knowing what you’re dealing with, understanding production realities, and following a few best practices, scans can be successfully translated and meet the quality requirements of Life Sciences enterprises and their customers worldwide.

Thanks to Milos Tlustak from Moravia Life Sciences for his contributions to this blog.