Deploy PaddlePaddle on Arm Cortex-M with Arm Virtual Hardware

Overview of optical character recognition (OCR)

Optical character recognition (OCR) aims to recognize the text content within a fixed image area. It is one of the key applications in the field of computer vision (CV).

OCR has been widely used in many industry scenarios such as ticket information extraction, manufacturing product traceability, and government medical document processing.

Text recognition is a sub-task of OCR. In a two-stage OCR algorithm, it is the step that follows text detection, and it converts image information into text information.

Figure 1: Example of English text recognition. The sample image shows the word GREENGUARD in uppercase letters on a textured background, a typical OCR text recognition input used for testing English character recognition models.

In this Learning Path, you’ll learn how to apply deep learning (DL) to the OCR text recognition task and set up a development flow from model training to application deployment.

You’ll learn how to:

  • Use PaddleOCR to obtain a trained English text recognition model
  • Export the Paddle inference model
  • Compile the Paddle inference model with TVMC for the target device
  • Build a text recognition application and deploy it on the Corstone-300 FVP with Arm Cortex-M55

This project is a collaboration between Arm and Baidu to improve PaddlePaddle model deployment to Arm Cortex-M devices. This gives developers more choices by increasing the number of deep learning models supported on Cortex-M.

PP-OCRv3

PaddleOCR provides an OCR system named PP-OCR. This is a practical, ultra-lightweight OCR system created by the Baidu Paddle Team.

It is a two-stage OCR system, in which the text detection algorithm is called DB (Differentiable Binarization) and the text recognition algorithm is called CRNN (Convolutional Recurrent Neural Network).
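The two-stage structure described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not PaddleOCR code: `run_two_stage_ocr`, `crop`, `fake_detect`, and `fake_recognize` are hypothetical names, and the two stand-in functions take the place of the real DB detector and CRNN recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom) in pixel coordinates
Box = Tuple[int, int, int, int]
# Grayscale image stored as rows of pixel values
Image = List[List[int]]


@dataclass
class OcrResult:
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def run_two_stage_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[OcrResult]:
    """Stage 1 finds text regions; stage 2 reads each cropped region."""
    results = []
    for box in detect(image):
        results.append(OcrResult(box=box, text=recognize(crop(image, box))))
    return results


# Hypothetical stand-ins for a text detector and a text recognizer.
def fake_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 1), (2, 1, 4, 2)]


def fake_recognize(region: Image) -> str:
    return f"{len(region)}x{len(region[0])} region"


image = [[0, 0, 255, 255], [255, 255, 0, 0]]
results = run_two_stage_ocr(image, fake_detect, fake_recognize)
for result in results:
    print(result.box, result.text)
```

Because the recognizer only ever sees a cropped region, the recognition stage can be developed and deployed on its own, which is the part this Learning Path focuses on.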

As seen in Figure 2, the overall pipeline of PP-OCRv3 is similar to PP-OCRv2 with some further optimizations to the detection model and recognition model.

For example, the text recognition model builds on the PP-OCRv2 recognizer and introduces SVTR (Scene Text Recognition with a Single Visual Model). The model also uses GTC (Guided Training of CTC) to guide training and model distillation. For more details, refer to the PP-OCRv3 technical report.

Figure 2: PP-OCRv3 pipeline diagram. The architecture diagram shows the PP-OCRv3 pipeline with text detection and recognition stages, including SVTR-based recognition model improvements over PP-OCRv2.
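The recognition model is trained with CTC, so its raw output is a sequence of per-timestep character scores rather than a finished string. As a rough illustration of how such an output can be turned into text, here is a minimal greedy CTC decoding sketch. It assumes that index 0 is the blank symbol and that the remaining indices map onto a character set in order; the function name `ctc_greedy_decode`, the tiny character set, and the scores are made up for this example and are not taken from PaddleOCR.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def ctc_greedy_decode(scores: Sequence[Sequence[float]], charset: str) -> str:
    """Pick the best class per timestep, merge repeats, then drop blanks."""
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    chars: List[str] = []
    previous = BLANK
    for index in best:
        # Emit a character only when it is not blank and not a repeat
        # of the class chosen at the previous timestep.
        if index != BLANK and index != previous:
            chars.append(charset[index - 1])
        previous = index
    return "".join(chars)


# Hypothetical scores for 7 timesteps over the classes [blank, "A", "B"].
scores = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
    [0.9, 0.05, 0.05],
]
decoded = ctc_greedy_decode(scores, "AB")
print(decoded)  # AAB
```

The blank between the two runs of "A" is what lets the decoder keep both letters; without it, the repeated class would be merged into a single character.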

In the next section, you’ll deploy a trained PP-OCR text recognition model on the Arm Corstone-300 FVP.
