Abstract
Because of the COVID-19 pandemic, online lectures have become very popular. The lack of interactivity in online lectures makes it difficult for instructors to estimate how much attention students pay to their lectures. The aim of this study was to develop a method for estimating students' attentional state from facial features while they participate in an online lecture. We conducted an experiment that measured attention levels during a video lecture using reaction times (RTs) to stimuli (white noise) irrelevant to the lecture. We assumed that RTs to such stimuli would be longer when participants were focused on the lecture than when they were not, so the RT at each white-noise presentation serves as an index of how focused a learner is at that moment. During the experiment, each learner's face was recorded with a video camera for the purpose of predicting RTs. We applied a machine learning method (LightGBM) to estimate the RTs from facial features extracted as action units (AUs) by open-source software (OpenFace). The model obtained with LightGBM showed that RTs to the irrelevant stimuli can be estimated to some extent from AUs. This suggests that facial expressions are useful for predicting attentional state, or concentration level, while learning. An alternative interpretation of RT lengthening is a decrease in arousal level: some participants occasionally appeared sleepy, and we identified sleepy faces from blink-related AUs. We re-analyzed the data after excluding RTs recorded when these AUs classified the face as sleepy and found similar results. This supports the conclusion that facial expressions are useful for predicting concentration level while learning.