Text Detection and Recognition for Images of Medical Laboratory Reports With a Deep Learning Approach


The adoption of electronic health records (EHRs) is an important step in the development of modern medicine. However, complete health records are not often available during treatment because of the functional problem of the EHR system or information barriers. This paper presents a deep-learning-based approach for textual information extraction from images of medical laboratory reports, which may help physicians solve the data-sharing problem. The approach consists of two modules: text detection and recognition. In text detection, a patch-based training strategy is applied, which can achieve the recall of 99.5% in the experiments. For text recognition, a concatenation structure is designed to combine the features from both shallow and deep layers in neural networks. The experimental results demonstrate that the text recognizer in our approach can improve the accuracy of multi-lingual text recognition. The approach will be beneficial for integrating historical health records and engaging patients in their own health care.

View this article on IEEE Xplore