Optical character recognition (OCR) aims to recognize the text content in a fixed image area. It is one of the key applications in the field of computer vision (CV).
OCR has been widely used in many industry scenarios such as ticket information extraction, manufacturing product traceability, and government medical document processing.
Text recognition is a sub-task of OCR. In a two-stage OCR algorithm, it is the step that follows text detection, and it converts image information into text information.
In this Learning Path, you will learn how to apply deep learning (DL) to the OCR text recognition task and set up a development flow from model training to application deployment.
You will learn how to:
This project is a collaboration between Arm and Baidu to improve PaddlePaddle model deployment to Arm Cortex-M devices. This gives developers more choices by increasing the number of deep learning models supported on Cortex-M.
PaddleOCR provides an OCR system named PP-OCR. This is a practical, ultra-lightweight OCR system created by the Baidu Paddle Team.
It is a two-stage OCR system in which the text detection algorithm is called DB and the text recognition algorithm is called CRNN.
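The two-stage structure can be sketched in a few lines of Python. This is only an illustration of the control flow, not PaddleOCR code: `run_two_stage_ocr`, `crop`, `OcrResult`, and the `detect` and `recognize` callables are hypothetical names introduced here, and a real system may add steps that this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Image = List[List[int]]           # grayscale image as rows of pixel values


@dataclass
class OcrResult:
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_two_stage_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],   # stage 1: find text regions
    recognize: Callable[[Image], str],      # stage 2: read one region
) -> List[OcrResult]:
    """Run detection once, then recognition on every detected region."""
    results = []
    for box in detect(image):
        region = crop(image, box)
        results.append(OcrResult(box=box, text=recognize(region)))
    return results
```

In this sketch the recognition stage only ever sees a cropped region, which is why a text recognition model can be trained and deployed on its own, independently of the detector.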
As seen in Figure 2, the overall pipeline of PP-OCRv3 is similar to PP-OCRv2 with some further optimizations to the detection model and recognition model.
For example, the text recognition model builds on PP-OCRv2 by introducing SVTR (Scene Text Recognition with a Single Visual Model). The model also uses GTC (Guided Training of CTC) to guide training and model distillation. For more details, please refer to the PP-OCRv3 technical report.
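Recognition models trained with CTC produce one score vector per time step across the width of the text image, and a decoding step turns those scores into a string. The following is a minimal sketch of greedy CTC decoding in plain Python, not PaddleOCR's own post-processing. The function name `ctc_greedy_decode`, the three-letter character set, and the choice of index 0 as the blank symbol are assumptions made for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumption for this sketch: class index 0 is the CTC blank


def ctc_greedy_decode(
    scores: Sequence[Sequence[float]],
    charset: str,
    blank: int = BLANK,
) -> str:
    """Decode per-time-step class scores into text.

    scores[t][k] is the score of class k at time step t.
    Class k (for k >= 1) maps to charset[k - 1].
    """
    # 1. Take the highest-scoring class at every time step.
    best_path: List[int] = [
        max(range(len(step)), key=step.__getitem__) for step in scores
    ]

    # 2. Merge consecutive repeats, and 3. drop blanks.
    chars = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            chars.append(charset[index - 1])
        previous = index
    return "".join(chars)


# Time steps read as: a, a, blank, a, b, b, blank
example_scores = [
    [0.10, 0.80, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.10],
    [0.90, 0.05, 0.03, 0.02],
    [0.20, 0.60, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.10, 0.20, 0.60, 0.10],
    [0.80, 0.10, 0.05, 0.05],
]
print(ctc_greedy_decode(example_scores, "abc"))  # prints "aab"
```

The blank between the second and third "a" is what allows a doubled letter to survive the merge step. Without it, the repeated predictions would collapse into a single character.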
In the next section, you will deploy a trained PP-OCR text recognition model on the Arm Corstone-300 FVP.