August 2023  |  Volume 23, Issue 9  |  Open Access
Vision Sciences Society Annual Meeting Abstract
Comparison of human observers and a deep learning model in recognition of static robot facial expressions
Author Affiliations & Notes
  • Dongsheng Yang
    Kyoto University
    Guardian Robot Project, RIKEN
  • Wataru Sato
    Kyoto University
    Guardian Robot Project, RIKEN
  • Takashi Minato
    Guardian Robot Project, RIKEN
  • Shushi Namba
    Guardian Robot Project, RIKEN
  • Shin’ya Nishida
    Kyoto University
    NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Atsugi, Japan
  • Footnotes
    Acknowledgements  This work was supported in part by JST, the establishment of university fellowships towards the creation of science technology innovation, Grant Number JPMJFS2123.
Journal of Vision August 2023, Vol.23, 5033.
      Dongsheng Yang, Wataru Sato, Takashi Minato, Shushi Namba, Shin’ya Nishida; Comparison of human observers and a deep learning model in recognition of static robot facial expressions. Journal of Vision 2023;23(9):5033.

      © ARVO (1962-2015); The Authors (2016-present)

Visual emotion recognition, a critical skill for social interaction, has attracted extensive attention in human-robot interaction research (e.g., Matteo, 2020). Deep neural network (DNN) based models trained on human facial expression images can now recognize basic human facial emotions with high accuracy (see Li & Deng, 2018). For both engineering and vision science, it is interesting to clarify how machine recognition models and humans differ in recognizing facial expressions made by artificial agents. Our study used Nikola, a FACS-based robot with 35 degrees of freedom in its face, as a 3D stimulus generating human-like facial expressions. We chose a set of Nikola's action-unit (AU) parameters for each of the seven basic facial expressions (Anger, Disgust, Fear, Happiness, Sadness, Surprise, Neutral) in two ways. (A) We adjusted the AU parameters for prototype expressions based on Ekman's theory (Sato et al., 2022). (B) Using a Bayesian Optimization algorithm, we searched for the AU parameters whose corresponding expression image received the maximum rating from Py-Feat, a DNN model for human expression classification. We then asked forty human participants, aged between 18 and 40 years, to evaluate the robot expressions produced by the two methods in a 7-point rating task. The results showed that Py-Feat-based optimization outperformed the prototype expressions for Anger, Disgust, Sadness, and Surprise (p<0.001), suggesting the effectiveness of the DNN-based model for shaping Nikola's facial expressions. However, Py-Feat-based optimization did not achieve higher ratings for the Happiness and Fear expressions, despite the Py-Feat ratings predicting significant improvements.
The observed discrepancy between Py-Feat scores and human ratings for certain facial expressions made by a human-like agent reveals a difference in processing strategy between humans and a popular DNN-based model in facial expression recognition.
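The AU-parameter search in method (B) can be sketched as follows. This is a minimal, dependency-free illustration, not the study's implementation: the scoring function below is a hypothetical stand-in for Py-Feat (which in the study rated rendered images of Nikola's face), and plain random search replaces the Bayesian Optimization the authors actually used, purely to keep the example self-contained.

```python
import random

def mock_happiness_score(au_params):
    """Hypothetical stand-in for a Py-Feat emotion rating.

    Toy target: the score peaks when AU6 (cheek raiser) and AU12
    (lip corner puller) are both near intensity 0.8; other AUs are
    ignored. The real objective would render the robot's face with
    these AU intensities and return the classifier's happiness score.
    """
    target = {"AU6": 0.8, "AU12": 0.8}
    return 1.0 - sum((au_params[k] - v) ** 2 for k, v in target.items())

def optimize_aus(score_fn, au_names, n_trials=500, seed=0):
    """Search AU intensities in [0, 1] that maximize the classifier score.

    Random search is used here only for simplicity; the study used
    Bayesian Optimization, which samples new AU settings based on a
    surrogate model of previous scores instead of uniformly.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = {name: rng.random() for name in au_names}
        score = score_fn(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score

best, score = optimize_aus(mock_happiness_score, ["AU6", "AU12", "AU25"])
```

The abstract's key finding is visible even in this toy setup: the loop maximizes the classifier's score, not human judgment, so an expression that scores highly for the model can still fail to convince human raters.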
