Abstract: The product names, specifications, and other items in the details section of VAT invoices vary widely in format and content, and the section lacks complete gridlines to separate its information fields. Existing methods for all-element structural recognition of VAT invoices suffer from low element recognition rates and high computational complexity. A new method based on morphological operations is proposed for the structural recognition of full invoice information. The method uses morphological operations to detect the invoice gridlines, then crops and identifies the text in the different regions of the invoice. It then combines the implicit layout rules of the commodity details section with the connected text regions obtained through morphological operations to construct a complete table structure. Finally, DBNet and CRNN are used for text detection and recognition, respectively. Tested on a dataset of 49 VAT invoices in three formats, the proposed method achieved element recognition rates of 99.9%, 97.4%, and 98.8%, with average processing times per invoice of 0.90 s, 0.47 s, and 0.82 s, respectively. The full-invoice recognition performance of the proposed method surpasses that of multiple comparative table recognition models and methods reported in the literature.
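The gridline-detection step can be illustrated with a minimal sketch of morphological opening using a long horizontal structuring element. This is not the authors' implementation: the function names, the kernel length `k`, and the toy binary image are all assumptions made for illustration, and a real pipeline would more likely use an image-processing library on a binarized scan.

```python
# Illustrative sketch only. The kernel length and the toy image are
# assumptions, not values taken from the paper.

def erode_h(img, k):
    """Horizontal erosion: keep a pixel only if the k-pixel window
    starting at it (anchor at the left edge) is entirely foreground."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w - k + 1):
            if all(img[y][x + d] for d in range(k)):
                out[y][x] = 1
    return out

def dilate_h(img, k):
    """Horizontal dilation: every surviving anchor repaints its
    k-pixel window, restoring the full extent of long runs."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for d in range(k):
                    if x + d < w:
                        out[y][x + d] = 1
    return out

def open_h(img, k):
    """Opening = erosion followed by dilation. Runs shorter than k
    (text strokes) vanish; runs of length >= k (gridlines) remain."""
    return dilate_h(erode_h(img, k), k)

def horizontal_line_rows(img, k):
    """Return the row indices that still contain foreground after opening."""
    return [y for y, row in enumerate(open_h(img, k)) if any(row)]

# Toy binary "invoice": row 1 is a full-width gridline, rows 3-4 are
# short text-like strokes that should be removed by the opening.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

print(horizontal_line_rows(page, 8))  # prints [1]
```

Vertical gridlines could be found the same way by transposing the image before calling `open_h`. In the details section, where gridlines are incomplete, the abstract indicates that connected text regions and layout rules take over from this kind of line detection.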