Addressing the question whether decisions can be decoded from pupil dilation at the single-trial level, a few investigations using support vector machine (SVM) classifiers have been put forward over the past years (
Bednarik et al., 2012;
Jangraw, Wang, Lance, Chang, & Sajda, 2014;
Medathati et al., 2020). Although several articles have found random forest classifiers particularly useful for predictions based, among other features, on pupil dilation (
Kootstra et al., 2020;
Pasquali et al., 2020), no prior work has focused on predicting intentions only using pupil size.
Jangraw et al. (2014) used electroencephalography and ocular parameters to infer objects that were of subjective interest to users in a virtual reality environment and found both to be predictive. During informational intent, defined as search for an object under the assumption that it exists, the pupil was found to be larger than during scanning of a scene. Together with other gaze features, such as fixation durations, an accuracy of over 85% could be reached using an SVM (
Jang, Mallipeddi, Lee, Kwak, & Lee, 2014).
Bednarik et al. (2012) explored the potential of pupil dilation and other gaze characteristics as a means to infer the intention to select one out of the possible puzzle tiles in an eight-tile slide puzzle. A classification performance of about 60% was reached using models with SVMs when considering pupil dilation only; by including further gaze characteristics, classification performance reached up to 80% (
Bednarik et al., 2012). Using pupil sizes obtained from a variety of tasks, a classification accuracy of about 60% was reported for changes in cognitive state, which in most (but not all) cases likely reflected a binary decision on a target’s presence in visual search. Here, SVM models applied to rolling windows of 1 to 2 seconds achieved the best results when data were locally z-standardized (
Medathati et al., 2020). However, the considerable differences between the tasks leave open what classification accuracy could be reached for binary decisions in particular. Deriving valid conclusions from pupil dilation usually requires highly controlled experiments, as motor execution, such as a key press (
Richer & Beatty, 1985), changes in gaze position, if not corrected for (Hayes & Petrov, 2016), and brightness all affect pupil dilation. In the foregoing investigations most closely related to recognizing user intention, understood as the outcome of a binary decision on a stimulus’ relevance from pupillometric data, changes in gaze position were not controlled for (
Bednarik et al., 2012;
Medathati et al., 2020). More importantly, however, key presses were necessary to indicate the intent to select. Pressing a key alone leads to strong pupil dilations (see
Figure 3 for the effect of a key press in this investigation), usually exceeding effect sizes of binary decisions (
Richer & Beatty, 1985;
Strauch, Greiter, & Huckauf, 2018). Thus, the pupil picks up not only on the change in activation elicited by the intention (
de Gee et al., 2014), but also on the substantially larger change elicited by motor execution. In other words, it cannot be fully excluded that the observed effects were driven not exclusively by the decision regarding a stimulus’ relevance, but also by the motor execution associated with that very decision in the analyzed tasks. Hence, previous studies using SVM algorithms (
Bednarik et al., 2012;
Medathati et al., 2020) remain inconclusive as to whether
intention alone can be decoded from the pupillary signal. Machine learning features derived solely from pupil size can further foster the understanding of which signal characteristics are most informative and thus most promising for pupillometric methods and investigations.
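To make the preprocessing described above concrete, the following minimal sketch shows how a pupil-size trace can be sliced into rolling windows that are each locally z-standardized, in the spirit of the approach reported by Medathati et al. (2020). The function name, the 50 Hz sampling rate, and the window and step lengths are illustrative assumptions, not values taken from the cited studies; the resulting windows could then serve as feature vectors for an SVM classifier.

```python
import numpy as np

def rolling_zscore_windows(pupil, sr=50, win_s=1.0, step_s=0.5):
    """Slice a pupil-size trace into overlapping windows and
    z-standardize each window locally.

    Hypothetical parameters: sr (sampling rate in Hz), win_s and
    step_s (window and step length in seconds) are illustrative;
    Medathati et al. (2020) report 1-2 s windows working best.
    """
    win = int(win_s * sr)
    step = int(step_s * sr)
    windows = []
    for start in range(0, len(pupil) - win + 1, step):
        w = pupil[start:start + win].astype(float)
        sd = w.std()
        # Guard against flat segments (e.g., interpolated blinks),
        # where the standard deviation would be zero.
        windows.append((w - w.mean()) / sd if sd > 0 else w - w.mean())
    return np.array(windows)

# Synthetic 4 s trace at 50 Hz standing in for a pupil recording.
rng = np.random.default_rng(0)
trace = np.sin(np.linspace(0, 6, 200)) + 0.1 * rng.normal(size=200)
X = rolling_zscore_windows(trace)  # one z-scored window per row
```

Each row of `X` is a locally z-standardized window; such rows could be passed, for example, to `sklearn.svm.SVC` as training samples, with labels indicating the cognitive state or decision associated with each window.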