Proceedings:
No. 3: AAAI-22 Technical Tracks 3
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 36
Track:
AAAI Technical Track on Computer Vision III
Downloads:
Abstract:
In recent years, tree decoders become more popular than LaTeX string decoders in the field of handwritten mathematical expression recognition (HMER) as they can capture the hierarchical tree structure of mathematical expressions. However previous tree decoders converted the tree structure labels into a fixed and ordered sequence, which could not make full use of the diversified expression of tree labels. In this study, we propose a novel tree decoder (TDv2) to fully utilize the tree structure labels. Compared with previous tree decoders, this new model does not require a fixed priority for different branches of a node during training and inference, which can effectively improve the model generalization capability. The input and output of the model make full use of the tree structure label, so that there is no need to find the parent node in the decoding process, which simplifies the decoding process and adds a prior information to help predict the node. We verified the effectiveness of each part of the model through comprehensive ablation experiments and attention visualization analysis. On the authoritative CROHME 14/16/19 datasets, our method achieves the state-of-the-art results.
DOI:
10.1609/aaai.v36i3.20172
AAAI
Proceedings of the AAAI Conference on Artificial Intelligence, 36