**Objects have a variety of different features that can be represented as probability distributions. Recent findings show that in addition to mean and variance, the visual system can also encode the shape of feature distributions for features like color or orientation. In an odd-one-out search task we investigated observers' ability to encode two feature distributions simultaneously. Our stimuli were defined by two distinct features (color and orientation), while only one was relevant to the search task. We investigated whether the irrelevant feature distribution influences learning of the task-relevant distribution and whether observers also encode the irrelevant distribution. Although considerable learning of feature distributions occurred, especially for color, our results also suggest that adding a second, irrelevant feature distribution negatively affected the encoding of the relevant one and that little learning of the irrelevant distribution occurred. There was also an asymmetry between the two features: Searching for the oddly oriented target was more difficult than searching for the oddly colored target, which was reflected in worse learning of the orientation distribution. Overall, the results demonstrate that it is possible to encode information about two feature distributions simultaneously but also reveal considerable limits to this encoding.**

*SD*, and shape) were held constant during learning streaks. In different conditions we tested the influence of a secondary feature and, in addition, whether observers form an internal representation of that task-irrelevant feature. The detailed procedure and the design of the stimulus display are described below.

*SD* of 15° (distribution range: 60°; values of the Gaussian distribution outside this range were removed). All parameters were based on previous research using uniform and Gaussian distractor distributions of orientation and color (Chetverikov et al., 2016, 2017a, 2017b). All distractor lines on the test trial were drawn from a Gaussian distribution with an *SD* of 10°. The color space was based on 48 isoluminant (in DKL color space) hues. Adjacent hues were separated by approximately one average just-noticeable difference (JND; based on data provided by Witzel, from Witzel & Gegenfurtner, 2013, 2015). That is, the color space was perceptually linearized with respect to the average differences in color discrimination thresholds. This color space has already been used successfully to examine feature distribution learning (Chetverikov et al., 2017b). The color distribution during learning streaks was either uniform or Gaussian with an *SD* of 6 JND; on test trials distractor colors had an *SD* of 3 JND. The distractor mean was chosen randomly at the beginning of a block and kept constant during a learning streak.
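As an illustration, the sampling scheme described above (truncated Gaussian orientations with an *SD* of 15° and a 60° range, and colors drawn as indices on a 48-step, JND-linearized hue circle) can be sketched as follows. This is a minimal sketch, not the authors' code: the function names, the set size, and the assumption that the uniform color distribution spans the same ±2 *SD* range as the Gaussian are ours.

```python
import random

def sample_truncated_gaussian(mean, sd, n):
    """Draw n values from a Gaussian, discarding values more than
    2 SD from the mean (total range = 4 SD, e.g. 60 deg for SD = 15 deg)."""
    values = []
    while len(values) < n:
        x = random.gauss(mean, sd)
        if abs(x - mean) <= 2 * sd:
            values.append(x)
    return values

def sample_distractor_hues(mean_hue, sd_jnd, n, n_hues=48, uniform=False):
    """Distractor hues as indices on a circular, JND-linearized 48-hue
    space: Gaussian with SD in JND units, or uniform over an assumed
    +/- 2 SD range (the uniform range is our assumption)."""
    if uniform:
        return [(mean_hue + random.randint(-2 * sd_jnd, 2 * sd_jnd)) % n_hues
                for _ in range(n)]
    return [round(mean_hue + h) % n_hues
            for h in sample_truncated_gaussian(0, sd_jnd, n)]

# Learning streak: orientations with SD = 15 deg, colors with SD = 6 JND
orientations = sample_truncated_gaussian(90, 15, 35)
hues = sample_distractor_hues(10, 6, 35)
```

Because the hue space is circular, the modulo operation keeps every sampled index within the 48 available hues regardless of the randomly chosen distractor mean.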

*SD* were removed. To assess the influence of distribution shape and the effects of repetition within a learning streak, we conducted two-way repeated-measures ANOVAs, with Greenhouse–Geisser corrections where applicable, after testing for sphericity with Mauchly tests. ANOVAs were conducted in the open-source software R (R Development Core Team, 2012) using a random-effects model from the *ez* package (Lawrence, 2016). We compared the shapes of the RT CT-PD functions using segmented regression in R (Muggeo, 2008). Confidence intervals are presented on the non-log data, but all statistical tests were done on log-transformed search times.
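Conceptually, the segmented regression fits a model with a single breakpoint, RT = a + b₁x + b₂·max(0, x − c), and asks whether it outperforms a simple line. A minimal sketch of this idea (a grid search over candidate breakpoints fitted by ordinary least squares — an illustration, not the *segmented* R package actually used in the analysis):

```python
import numpy as np

def fit_segmented(x, y, candidates):
    """Fit y = a + b1*x + b2*max(0, x - c) by ordinary least squares for
    each candidate breakpoint c; return (rss, breakpoint, coefficients)
    for the candidate with the smallest residual sum of squares."""
    best = None
    for c in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(0, x - c)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, c, coef)
    return best

# Toy data: flat up to x = 20, then decreasing
# (the pattern expected after a uniform distractor distribution)
x = np.arange(0, 60, dtype=float)
y = np.where(x < 20, 5.0, 5.0 - 0.1 * (x - 20))
rss, bp, coef = fit_segmented(x, y, candidates=range(5, 55))
```

On such flat-then-decreasing data the recovered breakpoint sits where the slope changes; the slope before it (`coef[1]`) is near zero and the slope after it is `coef[1] + coef[2]`, mirroring the before/after-breakpoint slope tests reported below.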

*SD* = 119, accuracy = 0.96, *SD* = 0.02) than when features were drawn from a uniform distribution (RT = 797 ms, *SD* = 148, accuracy = 0.95, *SD* = 0.02). Search times within a learning streak decreased rapidly after the first repetition (Figure 2b and 2d). A two-factor (distribution shape × trial number within learning streak) repeated-measures ANOVA revealed a main effect of distribution shape, *F*(1, 9) = 68.9, *p* < 0.001, *η*^{2} = 0.05, and a main effect of trial number within learning streaks, *F*(1.36, 12.22) = 226.98, *p* < 0.001, *η*^{2} = 0.3. We found a small but significant interaction between distribution shape and trial number within learning streaks, *F*(3, 27) = 6.49, *η*^{2} = 0.003.

Searching for an oddly oriented line yielded similar results. Observers were faster and more accurate when orientations were drawn from a Gaussian distribution (RT = 940 ms, *SD* = 166, accuracy = 0.90, *SD* = 0.05) than from a uniform distribution (RT = 1002 ms, *SD* = 189, accuracy = 0.86, *SD* = 0.06). Search times also decreased rapidly after the first search trial within a learning streak (Figure 2a and 2c). A two-factor (distribution shape × trial number within learning streak) repeated-measures ANOVA revealed a main effect of distribution shape, *F*(1, 9) = 43.86, *p* < 0.001, *η*^{2} = 0.03, and a main effect of trial number in the learning streak, *F*(3, 27) = 80.72, *p* < 0.001, *η*^{2} = 0.084. We found no significant interaction between distribution shape and trial number within learning streaks.

*p* < 0.001 (the Davies test assesses whether a segmented regression provides a better fit than a simple linear model). We tested the slopes before and after the breakpoint against zero. The slope of the first part, b = 2.73, CI = [−9.72, 15.19], did not differ significantly from zero. The slope after the breakpoint was significantly negative: b = −11.60, CI = [−13.70, −9.51]. Conversely, search times following a Gaussian distractor distribution did not reveal any significant breakpoint: Davies' *p* > 0.05. Search times decreased monotonically as a function of target-to-distractor distance, and the slope was significantly negative: b = −9.33, CI = [−10.65, −8.01].

*SD* = 6, a linear model, and a “uniform with decrease” model, which contains a flat part within the distribution range and a linear decrease outside it. Each model includes a Gaussian-distributed error term (see Chetverikov et al., 2017b, for the model equations). We fitted the models to our data, obtained the best-fitting parameters using maximum likelihood estimation, and used the Bayesian information criterion (BIC) for comparison. Figure 4b shows participants' data and the resulting fits. Following a Gaussian distribution, the best fit was obtained with the linear model (BIC = 418.18), followed by the “uniform with decrease” model (ΔBIC = 24.81). We also fitted the same models to individual participants. Following the Gaussian distribution, the best fit was provided by the linear model (*N* = 6 subjects), and for four subjects the “uniform with decrease” model provided better fits. Following a uniform distribution, the best fits were obtained with the “uniform with decrease” model (BIC = 515.33), followed by the linear model (ΔBIC = 2.78). When the models were fitted to individual subjects, the best fit was provided by the “uniform with decrease” model (*N* = 6 subjects), and for four subjects the linear (*N* = 3) or the uniform (*N* = 1) model provided better fits.
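To make the comparison concrete: with a Gaussian error term whose *SD* is set to its maximum-likelihood estimate, BIC = k·ln(n) − 2·ln(L̂), and lower BIC wins. The sketch below is illustrative only (all names are ours, and for brevity the “uniform with decrease” model is evaluated at its generating parameters rather than fitted); it shows how a ΔBIC comparison favors the model that matches the data's shape:

```python
import numpy as np

def bic_gaussian(y, pred, k):
    """BIC = k*ln(n) - 2*ln(L), with the Gaussian error SD at its
    maximum-likelihood estimate (counted among the k parameters)."""
    n = len(y)
    sigma2 = float(np.sum((y - pred) ** 2)) / n
    nll = 0.5 * n * (np.log(2 * np.pi * sigma2) + 1)  # minimized -log L
    return k * np.log(n) + 2 * nll

def uniform_with_decrease(x, level, slope, range_edge):
    """Flat inside the distractor range, linear decrease outside it."""
    return level - slope * np.maximum(0.0, x - range_edge)

# Toy data with a flat-then-decreasing shape plus Gaussian noise
rng = np.random.default_rng(0)
x = np.arange(0, 40, dtype=float)
y = uniform_with_decrease(x, 5.0, 0.1, 20.0) + rng.normal(0, 0.2, x.size)

# Linear model fitted by least squares (k = 3: intercept, slope, sigma)
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
bic_linear = bic_gaussian(y, X @ coef, k=3)

# Uniform-with-decrease model at its generating parameters (k = 4)
bic_uwd = bic_gaussian(y, uniform_with_decrease(x, 5.0, 0.1, 20.0), k=4)
delta_bic = bic_linear - bic_uwd  # positive favors uniform-with-decrease
```

Note that the extra parameter of the “uniform with decrease” model raises its complexity penalty (k·ln n), so it wins only when the flat-then-decreasing shape reduces the residuals enough to pay for that penalty.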

*p* = 0.047. We tested the slopes before and after the breakpoint against zero. The slope of the first part, b = 2.49, CI = [−2.37, 7.71], did not differ significantly from zero. The slope after the breakpoint was significantly negative: b = −1.00, CI = [−1.59, −0.42]. Search times following a Gaussian distractor distribution did not reveal any significant breakpoint: Davies test, *p* > 0.05. Search times decreased monotonically as a function of target-to-distractor distance. The slope was significantly negative: b = −0.95, CI = [−1.53, −0.38].

*SD* = 15, a linear model, and a “uniform with decrease” model, which contains a flat part within the distribution range and a linear decrease outside it. Each model includes a Gaussian-distributed error term. We fitted the models to our data, obtained the best-fitting parameters using maximum likelihood estimation, and used the Bayesian information criterion for comparison. Figure 5b shows participants' data and the resulting fits. Following a Gaussian distribution shape, the linear model (BIC = 1096.05) and the “uniform with decrease” model (BIC = 1096.04) provided equally good fits (ΔBIC = 0.0039). We also fitted the same models to individual participants, and for the majority of subjects a null model provided the best fit (*N* = 7). This suggests that for most participants the orientation search was difficult and did not yield distribution-shape learning, or that the individual-participant results contain too much noise. Following a uniform distribution, the best fit was provided by the “uniform with decrease” model (BIC = 920.44), followed by the linear model (ΔBIC = 2.45). However, fitting these models to individual participants again revealed that most participants were best fit by a null model (*N* = 8), which does not assume any distribution-shape learning.

*irrelevant* feature distribution. Figure 6a shows response times plotted against the distance between the target on the test trial and the mean of that feature distribution on the learning trials. Figure 6 contains only trials from condition three, in which participants searched for an oddly oriented line during learning streaks and for an oddly colored line during test trials. Overall, search times for targets within the range of the preceding distractor distribution were slower than for targets outside it. However, participants also responded faster when the target was close to the mean of that feature distribution in the preceding learning streak.

*p* < 0.001 for both the uniform and the Gaussian distributions. We tested the slopes preceding and following the breakpoint against zero. Following a uniform distribution, the slope of the first part, b = 7.84, CI = [−3.52, 19.20], did not differ significantly from zero. The slope following the breakpoint was significantly negative: b = −4.45, CI = [−7.17, −1.90].

*N* = 4) the “uniform with decrease” model provided the best fit for both distribution shapes (Gaussian: BIC = 688.19; uniform: BIC = 583.19), whereas for the remaining participants (*N* = 6) the null model yielded the best fit, indicating that the distribution shape of the irrelevant feature was not encoded during learning (Figure 6).

*p* > 0.05. Following a Gaussian distribution, search times as a function of target-to-distractor distance also did not significantly decrease: b = −0.095, CI = [−0.66, 0.47]. However, we found a significant negative slope for the uniform distribution: b = −0.61, CI = [−1.11, −0.11].

*SD* = 106) than in our experiment (RT = 940 ms, *SD* = 166), although comparisons between studies must be made with caution because of different samples and testing conditions.

*Psychological Bulletin*, 142 (12), 1–32.

*Trends in Cognitive Sciences*, 15 (3), 122–131.

*Psychological Science*, 19 (4), 392–398.

*Psychological Science*, 12, 157–162.

*Vision Research*, 35 (22), 3131–3144, https://doi.org/10.1016/0042-6989(95)00057-7.

*Attention, Perception, & Psychophysics*, 77 (4), 1116–1131.

*Journal of Experimental Psychology: Human Perception and Performance*, 40 (4), 1440.

*Spatial Vision*, 10, 433–436.

*Psychological Science*, 25 (7), 1394–1403.

*Cognition*, 153, 196–210, https://doi.org/10.1016/j.cognition.2016.04.018.

*Journal of Vision*, 17 (2): 21, 1–15, https://doi.org/10.1167/17.2.21.

*Psychological Science*, 28 (10), 1510–1517, https://doi.org/10.1177/0956797617713787.

*Vision Research*, 140, 144–156, https://doi.org/10.1016/j.visres.2017.08.003.

*Spatial learning and attention guidance*. Neuromethods. New York, NY: Springer Nature.

*Vision Research*, 43 (4), 393–404.

*Vision Research*, 45, 891–900.

*Vision Research*, 37 (22), 3181–3192, https://doi.org/10.1016/S0042-6989(97)00133-8.

*Biometrika*, 74 (1), 33–43.

*Perception & Psychophysics*, 70 (5), 789–794.

*Perception & Psychophysics*, 70 (6), 946–954.

*Cognitive Psychology*, 8 (1), 98–123.

*Journal of Experimental Psychology: General*, 107 (3), 287–308.

*Cognitive Psychology*, 1 (3), 225–241.

*Nature Neuroscience*, 14 (7), 926–932.

*Human Perception and Performance*, 35, 718–734.

*Psychonomic Bulletin & Review*, 18 (5), 855–859.

*From perception to consciousness: Searching with Anne Treisman*(pp. 1–21). Oxford, UK: Oxford University Press.

*Current Opinion in Psychology*, 29, 71–75.

*Attention, Perception, & Psychophysics*, 72 (1), 5–18.

*Vision Research*, 48 (10), 1217–1232.

*Memory & Cognition*, 22 (6), 657–672, http://www.ncbi.nlm.nih.gov/pubmed/7808275.

*JOSA A*, 31 (4), A93–A102.

*Journal of Experimental Psychology*, 81, 16–21.

*Tutorial in Quantitative Methods for Psychology*, 4 (2), 61–64.

*Journal of Vision*, 8 (11): 9, 1–8, https://doi.org/10.1167/8.11.9.

*R News*, 8 (1), 20–25.

*Journal of Vision*, 15 (4): 3, 1–14, https://doi.org/10.1167/15.4.3.

*Nature Neuroscience*, 4, 739–744.

*Psychological Monographs: General and Applied*, 74 (11), 1–29.

*Current Opinion in Neurobiology*, 6 (2), 171–178.

*Philosophical Transactions: Biological Sciences*, 353 (1373), 1295–1306.

*Cognition*, 152, 78–86.

*Vision Research*, 32 (5), 931–941.

*Perception & Psychophysics*, 60 (2), 191–200.

*Vision Research*, 29 (1), 47–59.

*JOSA A*, 31 (4), A283–A292.

*Annual Review of Psychology*, 69, 105–129, https://doi.org/10.1146/annurev-psych-010416-044232.

*Journal of Vision*, 13 (7): 1, 1–33, https://doi.org/10.1167/13.7.1.

*Journal of Vision*, 15 (8): 22, 1–33, https://doi.org/10.1167/15.8.22.

*Nature Human Behaviour*, 1 (3), 1–8.