Abstract
Many studies of animals, children and adults point to an inherent mechanism for number perception in human and animal brains. Several computational models of visual number perception exist. The on-centre off-surround recurrent neural network (RNN) of Sengupta et al. (2014) accounts for some major behavioural findings regarding number perception, but it takes an abstract form of input and output. In the current work, we devise a biologically interpretable method that uses real images as the network's input and retrieves a numerical estimate from the network's output (mean activation), resulting in an end-to-end model for the enumeration of images. The model uses information from saliency maps of images as the input current to the neurons for a short duration (the presentation time). The activation of the neurons then evolves according to the internal dynamics of the network, characterised by self-excitation and mutual inhibition. If mean activation increases monotonically with numerosity, it can serve as a basis for magnitude comparison in enumeration tasks in the brain. Based on this hypothesis, we formulate a method to retrieve numerosity from the mean activation of the network at steady state. For each image, the algorithm compares the network's mean activation with the known relation between numerosity, inhibition value and mean activation, which yields the corresponding numerosity. Using a set of images, we tested the model's accuracy with different saliency maps (AIM (2009), RARE2012 and others) and different network inhibition values (0.01 to 0.15). A network with lower inhibition is found to be suitable for estimating large numbers, and one with higher inhibition for small numbers, which corroborates the claim made by Sengupta et al. (2014). Our model follows a bottom-up approach to enumeration and gives favourable results, mainly for images with a small number of spatially well-separated objects. Its design leaves scope for adding a top-down mechanism for more accurate and biologically relevant visual enumeration.
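The pipeline described above (input current during a presentation window, relaxation under self-excitation and mutual inhibition, then a lookup from mean activation to numerosity) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from this abstract: the form of the update equation, the saturating firing-rate function, the self-excitation value `ALPHA = 2.2`, the 64-node network, the noise-free Euler integration, the averaging over all nodes, and the use of a binary vector in place of a real saliency map. The function names are hypothetical.

```python
import numpy as np

ALPHA = 2.2  # assumed self-excitation strength (illustrative value)


def firing_rate(x):
    """Assumed saturating nonlinearity: 0 for x <= 0, x / (1 + x) otherwise."""
    xp = np.maximum(x, 0.0)
    return xp / (1.0 + xp)


def mean_steady_activation(input_current, beta, dt=0.01,
                           present_steps=100, total_steps=3000):
    """Euler-integrate the network and return the mean activation at the end.

    The input current is applied only for the first `present_steps` steps
    (the presentation time); afterwards the network relaxes on its own.
    """
    x = np.zeros(input_current.shape, dtype=float)
    for step in range(total_steps):
        f = firing_rate(x)
        drive = input_current if step < present_steps else 0.0
        # Each node is inhibited by the summed output of all other nodes.
        inhibition = beta * (f.sum() - f)
        x = x + dt * (-x + ALPHA * f - inhibition + drive)
    return float(x.mean())


def build_lookup(beta, n_nodes=64, max_numerosity=6):
    """Tabulate mean activation for each numerosity at a given inhibition."""
    table = {}
    for n in range(1, max_numerosity + 1):
        current = np.zeros(n_nodes)
        current[:n] = 1.0  # n equally salient items (stand-in for a saliency map)
        table[n] = mean_steady_activation(current, beta)
    return table


def decode_numerosity(mean_activation, table):
    """Return the numerosity whose tabulated mean activation is closest."""
    return min(table, key=lambda n: abs(table[n] - mean_activation))


# Usage: three salient locations, low inhibition.
beta = 0.01
table = build_lookup(beta)
saliency_input = np.zeros(64)
saliency_input[[3, 20, 41]] = 1.0
estimate = decode_numerosity(mean_steady_activation(saliency_input, beta), table)
```

The lookup only works while mean activation is monotonic in numerosity for the chosen inhibition value, which is why the sketch restricts itself to a small range of numerosities at a low inhibition. With real saliency maps the input currents would be graded rather than binary, so the decoded value would be an estimate rather than an exact match.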