Abstract
In recent years, deep learning (DL) has made breakthroughs in many fields. However, DL depends on repeated re-training with large amounts of data, which is often costly to obtain and leads to data scarcity in the medical domain. Active learning (AL), by contrast, aims to reduce the amount of training data needed while retaining similar performance by exploring the added value of each piece of data.
In this study, we designed an AL system for training Artificial Intelligence (AI) models that aims for faster learning curves while minimizing the need for new data. The model, a Deep Neural Network (DNN), was designed for vessel object detection in scanning laser ophthalmoscopy (SLO) retinal images. Image embedding vectors were used to proactively select the most informative data. We established a baseline using a k-fold cross-validation framework, in which 809 annotated images were randomly selected from our database and divided into multiple regions of interest with minimal selection bias. Under this framework, the vessel junction detection model achieved a mean average precision (mAP) of 0.648, a precision of 0.645, and a recall of 0.703 when tested on 135 images. We then tested our active learning system by dividing the dataset into two clusters using image embedding vectors and machine learning clustering algorithms. One cluster converged to the desired mAP performance 2.3x faster than the other.
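The clustering step described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the embedding model, so everything here is an assumption: plain k-means with k = 2 stands in for the "machine learning clustering algorithms", the function name `two_cluster_kmeans` is hypothetical, and the embeddings are synthetic random vectors rather than real SLO image embeddings.

```python
import numpy as np


def two_cluster_kmeans(embeddings, n_iter=50):
    """Split embedding vectors into two clusters with plain k-means (k=2).

    Assumption: k-means is used only for illustration; the study's actual
    clustering algorithm is not specified in the abstract.
    """
    X = np.asarray(embeddings, dtype=float)
    # Deterministic initialisation: the first sample and the sample farthest from it.
    first = X[0]
    farthest = X[np.linalg.norm(X - first, axis=1).argmax()]
    centroids = np.stack([first, farthest])
    for _ in range(n_iter):
        # Distance from every sample to each centroid, shape (n_samples, 2).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members (keep it if empty).
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(2)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic stand-in for image embeddings: two well-separated groups of 16-d vectors.
rng = np.random.default_rng(42)
fake_embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(40, 16)),
    rng.normal(loc=5.0, scale=0.5, size=(40, 16)),
])
labels, centroids = two_cluster_kmeans(fake_embeddings)
```

In a setup like the one described, each resulting cluster would then be used as a separate training pool so that their learning curves can be compared; that training loop is outside the scope of this sketch.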
Funding: This research was supported by grants from the National Institutes of Health TL1 TR 001871 (C.K.S.) and R41 NS100222-01A1 (A.J.G. and C.K.S.), and That Man May See (A.J.G. and C.K.S.).