Vision Transformer-Based Emotion Detection in HCI for Enhanced Interaction (Book Chapter)

Soni, J., Prabakar, N., & Upadhyay, H. (2024). Vision Transformer-Based Emotion Detection in HCI for Enhanced Interaction. Lecture Notes in Computer Science, vol. 14531, pp. 76–86. DOI: 10.1007/978-3-031-53827-8_8

cited authors

  • Soni, J; Prabakar, N; Upadhyay, H


abstract

  • Emotion recognition from facial expressions is pivotal in enhancing human-computer interaction (HCI). Spanning diverse applications such as virtual assistants, mental health support, and personalized content recommendation, it promises to revolutionize how we interact with technology. This study explores the effectiveness of the Vision Transformer (ViT) architecture for emotion classification, leveraging a rich dataset. Our methodology is characterized by careful preprocessing, extensive data augmentation, and fine-tuning of the ViT model. For our experiments, we use the FER-2013 emotion detection dataset, and rigorous evaluation metrics are employed to gauge the model's performance. The research underscores the potential for enhancing user experiences, facilitating mental health monitoring, and navigating the ethical considerations inherent in emotion-aware technologies. Our model achieved a testing accuracy of 70%. As we chart new horizons in HCI, future work should focus on improving model accuracy across the various emotion categories and addressing the complexities of real-world deployment.

publication date

  • January 1, 2024

Digital Object Identifier (DOI)

  • 10.1007/978-3-031-53827-8_8

International Standard Book Number (ISBN) 13

start page

  • 76

end page

  • 86


volume

  • 14531 LNCS