PictoBERT: Transformers for Next Pictogram Prediction

Photo by Authors

Augmentative and Alternative Communication (AAC) boards are essential tools for people with Complex Communication Needs (e.g., people with Down syndrome, autism, or cerebral palsy). These boards let users construct messages by arranging pictograms in sequence. In this context, a pictogram is a picture with a label that denotes an action, object, person, animal, or place. Predicting the next pictogram to be placed in a sentence under construction is an essential feature of AAC boards because it facilitates communication. Previous work on this task used n-gram statistical language models and knowledge bases. However, the literature suggests that neural networks can cope better with the task, and transformer-based models such as BERT (Bidirectional Encoder Representations from Transformers) are revolutionizing the field. In this paper, we present PictoBERT, an adaptation of BERT for the next pictogram prediction task. We changed BERT’s input embeddings to use word-senses instead of words, considering that a word-sense represents a pictogram better than a simple word. The proposed model outperforms the n-gram models and knowledge bases. Moreover, PictoBERT can be fine-tuned to adapt to different users’ needs, making transfer learning its main characteristic.
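
To make the task concrete, the following is a minimal sketch of next pictogram prediction with a bigram count model, the kind of n-gram baseline that earlier work relied on. It is not PictoBERT itself. The tiny corpus, the sense-style labels (such as `want.v.01`), and the function names `train_bigram` and `predict_next` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each sentence is a sequence of pictograms,
# written here as illustrative word-sense labels rather than plain words.
corpus = [
    ["i.n.01", "want.v.01", "water.n.01"],
    ["i.n.01", "want.v.01", "water.n.01"],
    ["i.n.01", "want.v.01", "juice.n.01"],
    ["you.n.01", "eat.v.01", "apple.n.01"],
]


def train_bigram(sentences):
    """Count how often each pictogram directly follows another."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev, k=3):
    """Return up to k pictograms that most often followed `prev`."""
    followers = counts.get(prev, Counter())
    return [pictogram for pictogram, _ in followers.most_common(k)]


counts = train_bigram(corpus)
suggestions = predict_next(counts, "want.v.01", k=2)
print(suggestions)
```

A model like this only sees the previous pictogram and returns nothing for contexts it has never observed, which is one limitation that motivates a contextual, fine-tunable model.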


David Macêdo, PhD
Deep Learning

My interests include everything related to deep learning.