We measured human ability to detect texture patterns in a signal detection task. Observers viewed sequences of 20 blue or yellow tokens placed horizontally in a row. They attempted to discriminate sequences generated by a random generator (“a fair coin”) from sequences produced by a disrupted Markov sequence (DMS) generator. The DMSs were generated in two stages: first, a sequence was generated using a Markov chain with probability *p*_{r} = 0.9 that a token would be the same color as the token to its left. The Markov sequence was then disrupted by flipping each token from blue to yellow or vice versa with probability *p*_{d}, the probability of disruption. Disruption played the role of noise in signal detection terms. We can frame what observers are asked to do as detecting Markov texture patterns disrupted by noise. The experiment included three conditions differing in *p*_{d} (0.1, 0.2, 0.3). Ninety-two observers participated, each in only one condition. Overall, human observers’ sensitivities to texture patterns (*d′* values) were markedly less than those of an optimal Bayesian observer. We considered the possibility that observers based their judgments not on the entire texture sequence but on specific features of the sequences, such as the length of the longest repeating subsequence. We compared human performance with that of multiple optimal Bayesian classifiers based on such features. We identify the single- and multiple-feature models that best match the performance of observers across conditions and develop a pattern feature pool model for the signal detection task considered.

*d′*, but this measure alone does not determine the observer's response. In normative equal-variance Gaussian SDT, the response is also affected by the choice of a sensory criterion *c* (Figure 1A). The criterion *c* is set to maximize expected gain given the penalties and rewards associated with each outcome and the prior probability that a signal is present.
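In the equal-variance Gaussian model, both *d′* and the criterion *c* can be computed directly from hit and false-alarm rates. A minimal sketch (the function name is ours, not from the paper):

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, fa_rate):
    """Equal-variance Gaussian SDT: d' = z(H) - z(F); c = -(z(H) + z(F)) / 2."""
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate), -0.5 * (z(hit_rate) + z(fa_rate))

d, c = dprime_and_criterion(0.84, 0.16)   # symmetric performance -> c near 0
```

With equal priors and symmetric payoffs, the gain-maximizing criterion lies midway between the two distributions, so *c* is near 0.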

*p*_{r}, the probability that each successive token, left to right, would be the same color as the preceding. The leftmost token, lacking a predecessor, was equally likely to be blue or yellow. In Figure 2B, *p*_{r} = 0.9, and the sequence tends to alternate less often than a random sequence.

With probability *p*_{d}, we independently flipped each of the tokens in the Markov sequence from blue to yellow or vice versa (Figure 2C). If *p*_{d} is 0, the resulting DMS is just a Markov sequence similar to those used by Lopes (1982) and Lopes and Oden (1987). If *p*_{d} is 0.5, the resulting DMS is effectively a random sequence: the disruption process with *p*_{d} equal to 0.5 removes any pattern in the disrupted sequence.

By varying *p*_{r} we can generate sequences for which the Gambler's fallacy is not a fallacy (*p*_{r} < 0.5) and the hand can be, from time to time, truly hot (*p*_{r} > 0.5). The level of disruption *p*_{d} affects the difficulty of the task and will help us in investigating how human observers carry out the task and in discriminating between candidate models of human performance.

Observers were asked to judge whether the observed sequence was *generated* by the random generator or the patterned (DMS) generator, not to judge whether the observed sequence was intrinsically random or patterned. There is a rich literature (Bar-Hillel & Wagenaar, 1991; Tversky & Kahneman, 1971; Tversky & Kahneman, 1974; Wagenaar, 1970a; Wagenaar, 1970b; Wagenaar, 1972) on observers’ judgments of the intrinsic randomness of binary sequences. Given two sequences produced by independent tosses of a fair coin, such as HHHHTTTT and HTTHTHTH, many judge the first sequence to be less random or less probable, even though the two have equal probability of occurrence. Parallel work by Solomonoff, Chaitin, and Kolmogorov permits assigning an objective intrinsic randomness to any sequence (Chaitin, 1987; Kolmogorov, 1963; Kolmogorov, 1965; Solomonoff, 1960; see Li & Vitányi, 2019 for a review). By framing the task in SDT terms, we obviate any need to decide whether patterns are truly random or truly patterned. Even so, our results may prove relevant to determining the features of texture sequences that lead human observers to classify them as intrinsically random or nonrandom. We return to this point in the Conclusion.
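The two-stage generation procedure described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code:

```python
import random

def markov_sequence(n=20, p_r=0.9):
    """Stage 1: each token repeats the color of the token to its left with probability p_r."""
    seq = [random.randint(0, 1)]              # leftmost token: blue (1) or yellow (0), equally likely
    for _ in range(n - 1):
        repeat = random.random() < p_r
        seq.append(seq[-1] if repeat else 1 - seq[-1])
    return seq

def disrupt(seq, p_d):
    """Stage 2: independently flip each token with probability p_d."""
    return [1 - s if random.random() < p_d else s for s in seq]

dms = disrupt(markov_sequence(), p_d=0.2)     # one disrupted Markov sequence
```

With p_d = 0 the output is a pure Markov sequence; with p_d = 0.5 it is indistinguishable from a fair-coin sequence.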

*p*_{d}, which was 0.1 in LD, 0.2 in MD, and 0.3 in HD. The repetition probability used in generating the DMS sequences was *p*_{r} = 0.9 in all conditions.

*S* was produced by the DMS generator or the random generator,

“*L*” and the likelihood ratio criterion *L* is estimated from data. Here, however, we are considering an ideal Bayesian rule that maximizes the proportion of correct responses, and for such a rule *L* = 1 necessarily as the prior odds (random or patterned) are 1. In effect we compute the probability that the sequence came from the RANDOM generator and the probability that it came from the PATTERN generator and choose whichever generator is more probable given the sequence. The details of computing λ are described in the Appendix.
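The ideal rule reduces to comparing λ with 1; equivalently, since every length-*n* sequence has probability 2^{−*n*} under the RANDOM generator, one compares *P*[*S* | DMS] with 2^{−*n*}. A sketch of the comparison step only, assuming the DMS likelihood has already been computed (function name is ours):

```python
def ideal_response(p_s_given_dms, n=20):
    """Respond PATTERN iff lambda = P[S|DMS] / P[S|RANDOM] > 1, with P[S|RANDOM] = 2**-n."""
    p_s_given_random = 0.5 ** n
    return "PATTERN" if p_s_given_dms > p_s_given_random else "RANDOM"
```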

*φ*(*S*) applied to a sequence *S* that returns a number, for example, the “length of the longest repeating subsequence” in the sequence *S*. Our choice of potential features is defined in Figure 3. These features were chosen based on observers’ own reports of how they carried out the task (see the Supplement). We do not assume observers are aware of the features they use in carrying out the task or that they used the features they claimed they used. We eliminated features whose use is not supported by the data.
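Two plausible feature functions of this kind can be sketched as follows. These are illustrations only; Figure 3 defines the actual feature set used:

```python
def longest_run(seq):
    """Length of the longest run of same-colored tokens (longest repeating subsequence)."""
    best = cur = 1
    for left, right in zip(seq, seq[1:]):
        cur = cur + 1 if left == right else 1
        best = max(best, cur)
    return best

def n_repetitions(seq):
    """Number of tokens that match the color of the token to their left."""
    return sum(left == right for left, right in zip(seq, seq[1:]))
```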

*K* features would be

*φ*_{k}, *k* = 1, ⋅⋅⋅, *K* and estimates

where *E*[ ] denotes mathematical expectation and ψ( ) is the logistic function

*β*_{k} that represent the contribution of each feature *φ*_{k}(*S*) to the decision. The features may be highly correlated (it is likely they are), but the GLM analysis compensates for any correlation. We base our analyses on these weights. We compare human performance with performance predicted by the full model, by GLM models based on single features, and by GLM models based on two or more features.
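The GLM prediction is a logistic function of a weighted sum of feature values. A sketch of the prediction step only (the weights β_k would be fitted by standard GLM software; names here are ours):

```python
import math

def logistic(x):
    """psi( ): the logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def p_respond_pattern(feature_values, betas, intercept=0.0):
    """Predicted probability of a PATTERN response from weighted feature values."""
    return logistic(intercept + sum(b * f for b, f in zip(betas, feature_values)))
```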

*p*_{d} = 0.1, 0.2, and 0.3 applied to sequences generated by a Markov generator with Markov parameter *p*_{r} = 0.9. In Figure 4, we plot sensitivity *d′* versus *p*_{d} for each observer (white circles) and mark, with a blue square, the mean *d′* of the observers in each condition plotted versus *p*_{d}. The value of sensitivity *d′* was estimated using the equal-variance Gaussian signal detection model (Figure 1A) (Green & Swets, 1966/1974). The heavy black contour is the expected performance (*d′*) of the full BDT model, the maximum possible performance. The dashed contours are the expected performances for the four feature models based on single features shown in Figure 3. The performance of one of the features (F3: number of repetitions) is only slightly below that of the full model. A human observer who used only this feature would do almost as well as the full BDT model.

*p*_{d} and *d′*), F(1, 182) = 42.436, *p* < 0.0001. Sensitivity *d′* was significantly greater than 0. Indeed, all 92 of the *d′* values are greater than 0 and 91 of the 92 *d′* values are less than the prediction of the full model.

logit(*Y*).

*d′* across conditions (Figure 5).

*d′* value in that condition to the *d′* value of the full BDT model in the same condition.

*p*_{d} = 0.3) observers do not seem to depend on some features to the exclusion of others.

We considered *n*-feature models based on the four features in Figure 3 and all possible subsets of these four features. We denote each model by listing the indices of the features in Figure 3 that are included in it; for example, 124 is the model based on F1, F2, and F4.

*p*_{d}. In three conditions, *p*_{d} was set to 0.1, 0.2, or 0.3.

We compared the *d′* values for each single feature model to that of the full model. These features were the features observers spontaneously claimed to be using (Supplement). We then matched human performance to that based on single features and found no satisfactory match for any of the single-feature models considered when their performance was compared to models based on multiple features (Figure 7). The pattern of results in Figure 6 for the more efficient observers in LD indicates that many observers are using more than a single feature. The pattern is present in MD, but not obviously in HD.

*k*^{th} feature is weighted by a weight *β*_{k} and then the weighted values are summed.

The volume of Knuth's *The Art of Computer Programming* devoted to pseudo-random number generators and tests of randomness amplified Reichenbach's point: “If we were to give some man a pencil and paper and ask him to write down 100 random decimal digits, chances are very slim that he will give a satisfactory result. People tend to avoid things which seem nonrandom, such as pairs of equal adjacent digits…. And if we were to show someone a table of truly random digits, he would quite probably tell us they are not random at all; his eye would spot certain apparent regularities” (Knuth, 1971, p. 34ff). Knuth is outlining a model of generation based on avoiding certain features (“apparent regularities”), those associated with various kinds of patterns.

*C* or counterclockwise \(\bar{C}\). The motion seen and its direction were illusory, with clockwise and counterclockwise equally likely overall. However, observers tended to see sequences of repetitions such as *CCCC* or \(\overline{CCCC}\) or sequences of alternations \(\bar{C}C\bar{C}C\) or \(C\bar{C}C\bar{C}\) more often than would be expected if successive trials were statistically independent. Given ambiguous stimuli, observers effectively hallucinated repeating and alternating patterns. We conjecture that we could use methods analogous to those used here to identify the regularities people avoid and compare them with the features we identify here.

*Introduction to mathematical learning theory*. Hoboken, NJ: Wiley.

*Memory & Cognition*, 32(8), 1369−1378. [PubMed]

*Psychology of Sport and Exercise*, 7(6), 525−553.

*Advances in Applied Mathematics*, 12(4), 428−454.

*Journal of Behavioural Decision Making*, 23(1), 117−129.

*Journal of Experimental Psychology: Animal Learning and Cognition*, 40(3), 280−286. [PubMed]

*Journal of Vision*, 6(2), 2−2.

*Model selection and multimodel inference: A practical information-theoretic approach*, 2nd edition. New York: Springer.

*Stochastic models for learning*. Hoboken, NJ: Wiley.

*Algorithmic information theory*. Cambridge, UK: Cambridge University Press.

*Cognitive, Affective, & Behavioral Neuroscience*, 2(4), 283–299. [PubMed]

*Die beginnende Schizophrenie; Versuch einer Gestaltanalyse des Wahns* [*The onset of schizophrenia: An attempt at a Gestalt analysis of delusion*] (in German). Stuttgart: Georg Thieme Verlag.

*Behavioral and Brain Sciences*, 26(1), 26–27.

*An introduction to generalized linear models*, 2nd edition. Boca Raton, FL: Chapman & Hall.

*Proceedings of the Royal Society B*, 276(1654), 31−37.

*Psychological Review*, 96(2), 267–314. [PubMed]

*Simple heuristics that make us smart*. Oxford, UK: Oxford University Press.

*Adaptive thinking: Rationality in the real world*. Oxford, UK: Oxford University Press.

*Annual Review of Psychology*, 62, 451−482. [PubMed]

*Cognitive Psychology*, 17(3), 295−314.

*How we know what isn't so: The fallibility of human reason in everyday life*. New York: Free Press.

*Signal detection theory and psychophysics* (reprint with corrections). Huntington, NY: Krieger. (Original work published 1966)

*Contemporary developments in mathematical psychology: Learning, memory, and thinking* (Vol. 1, pp. 1−43). San Francisco: Freeman.

*Quarterly Journal of Experimental Psychology*, 8, 163–171.

*The theory of probability*, 3rd edition (p. 432). Oxford, UK: Oxford University Press.

*Cognitive Psychology*, 3(3), 430−454.

*The art of computer programming: Volume 2, Seminumerical algorithms*. Boston: Addison-Wesley.

*Sankhyā: The Indian Journal of Statistics, Ser. A*, 25(4), 369−376.

*Problems of Information Transmission*, 1(1), 3−11.

*The visual neurosciences* (Vol. II, pp. 1106−1118). Cambridge, MA: MIT Press.

*Vision Research*, 35(3), 389−412. [PubMed]

*Sensor Fusion III: 3D Perception and Recognition*, 1383, 247−254.

*Nature*, 5(6), 605−609.

*An introduction to Kolmogorov complexity and its applications* (Texts in Computer Science), 4th edition. New York: Springer.

*Journal of Experimental Psychology: Learning, Memory and Cognition*, 8(6), 626−636.

*Journal of Experimental Psychology: Learning, Memory and Cognition*, 13(3), 392−400.

*Proceedings of the SPIE: Visual Communications and Image Processing IV*, 1199, 1154−1163.

*Proceedings of the National Academy of Sciences of the United States of America*, 102(8), 3164−3169.

*Visual Neuroscience*, 26(1), 147−155. [PubMed]

*Vision Research*, 50(23), 2362−2374. [PubMed]

*Generalized linear models*, 2nd edition. Boca Raton, FL: Chapman & Hall.

*American Statistician*, 58(3), 218−223.

*Schizophrenia Bulletin*, 36(1), 9−13. [PubMed]

*Journal of Experimental Psychology: General*, 115(1), 62−75.

*Psychological Bulletin*, 135(2), 262−285. [PubMed]

*Perception*, 14(2), 97–103. [PubMed]

*Artificial intelligence and the eye* (pp. 21–77). New York: Wiley.

*The theory of probability: An inquiry into the logical and mathematical foundations of the calculus of probability*. Berkeley: University of California Press.

*Journal of Experimental Psychology*, 82(2), 205−257. [PubMed]

*Thinking & Reasoning*, 15(2), 197–210.

*Proceedings of the 35th Annual Conference of the Cognitive Science Society*. Austin, TX: Cognitive Science Society.

*Pattern recognition: Theory, experiment, computer simulations, and dynamic models of form perception and discovery* (pp. 339–348). Hoboken, NJ: Wiley.

*Scientific American*, 299(6), 48. [PubMed]

*Journal of Experimental Psychology*, 30(6), 495–503.

*Journal of Experimental Psychology: Human Perception and Performance*, 11(5), 598–616.

*Journal of the Acoustical Society of America*, 33, 1046–1054.

*Science*, 193, 1142−1146. [PubMed]

*Markov learning models for multiperson interactions*. Stanford, CA: Stanford University Press.

*Journal of the Optical Society of America A, Optics, Image Science & Vision*, 20(7), 1419−1433.

*Spatial Vision*, 16(3–4), 255−275. [PubMed]

*Trends in Cognitive Science*, 12(8), 291−297.

*Chance: New Directions for Statistics & Computing*, 2(1), 16−21.

*Chance: New Directions for Statistics & Computing*, 2(4), 31−34.

*Psychological Bulletin*, 76(2), 105−110.

*Science*, 185, 1124−1131. [PubMed]

*Pattern recognition: Theory, experiment, computer simulations, and dynamic models of form perception and discovery*. Hoboken, NJ: Wiley.

*Pattern recognition: Theory, experiment, computer simulations, and dynamic models of form perception and discovery* (pp. 349–364). Hoboken, NJ: Wiley.

*Acta Psychologica*, 33, 233−242.

*Acta Psychologica*, 34(2–3), 348−356.

*Psychological Bulletin*, 77(1), 65−72.

Let *S* = *S*_{1}*S*_{2} ⋅⋅⋅ *S*_{n} denote the sequence of blue and yellow tokens, with *S*_{i} taking values 0 (yellow) or 1 (blue) and *n* = 20. There were two generating processes for sequences, RANDOM and DMS. On each trial one of the two generators was chosen at random with equal probability. We use Bayes theorem in odds form to compute the posterior odds in favor of DMS

*P*[*S* ∣ *RANDOM*] is just 2^{−*n*} and we rewrite Equation (A2) as

*P*[*S* ∣ *DMS*]. We are given the probability of repetition *p*_{r} and the probability of disruption *p*_{d} that characterize the DMS.

Define *D* = *D*_{1}*D*_{2} ⋅⋅⋅ *D*_{n} to be a vector of 1's and 0's with a 1 in each disrupted location. The sequence length is, as before, *n* = 20. If we knew the disruption vector of a DMS we could “undisrupt” it and recover the underlying Markov sequence (Figure 2B). Let *M* = *M*_{1}*M*_{2} ⋅⋅⋅ *M*_{n} denote that (unknown) Markov sequence.

Using *S*, *D*, and *M*, we can represent the disruption operation as *S* = *D* ⊕ *M* and recover the Markov sequence from the DMS by *M* = *S* ⊕ *D*.

To compute the probability of *S* when the generator is DMS, we must sum the probabilities of all the possible ways it might have come about. The key insight is that, whatever the output of the Markov chain, there is a disruption vector that would transform it into any target sequence *S*. As an example, let's consider sequences of length *n* = 3 and assume we observe the sequence blue–yellow–blue, coded as 101. It could have come about because the Markov chain produced 101 and there were no disruptions: the disruption vector is 000. Or the Markov chain could have produced 001 with disruption vector 100. Or the Markov chain could have produced 111 with disruption vector 010. We must consider all the possible combinations of Markov chain and disruption in computing the probability of the generation of a particular sequence *S* by the DMS:

\[P[S \mid DMS] = \sum_{D} P[D]\, P[S \oplus D \mid MS],\]

where the sum is over all 2^{*n*} disruption vectors *D* of length *n* and *MS* denotes generation by the Markov sequence generator. Each term in the sum is a combination involving a Markov pattern *M* and the unique disruption *D* such that *S* = *M* ⊕ *D*. The terms in the sum are mutually exclusive and we are justified in adding them. For any disruption pattern *D*, the probability *P*[*D*] = *P*[*S* ⊕ *M*] is determined by the probability of disruption *p*_{d}. If σ(*D*) is the number of disruptions in the disruption vector (the sum of the entries of the vector), then

\[P[D] = p_d^{\sigma(D)} (1 - p_d)^{n - \sigma(D)}.\]
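For short sequences the sum over disruption vectors can be computed exhaustively. A sketch of the computation just described, using the *n* = 3 example above (our illustrative code, not the authors'):

```python
from itertools import product

def p_markov(m, p_r):
    """P[M | MS(p_r)]: fair first token, then repetition with probability p_r."""
    p = 0.5
    for left, right in zip(m, m[1:]):
        p *= p_r if left == right else 1 - p_r
    return p

def p_dms(s, p_r, p_d):
    """P[S | DMS]: sum of P[D] * P[S xor D | MS] over all disruption vectors D."""
    n = len(s)
    total = 0.0
    for d in product((0, 1), repeat=n):
        m = tuple(si ^ di for si, di in zip(s, d))   # undisrupt: M = S xor D
        k = sum(d)                                   # sigma(D), the number of disruptions
        total += p_d ** k * (1 - p_d) ** (n - k) * p_markov(m, p_r)
    return total

likelihood = p_dms((1, 0, 1), p_r=0.9, p_d=0.2)
```

Summing p_dms over all 2^n sequences gives 1, a useful sanity check on the enumeration.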

We next compute the probability *P*[*M* ∣ *MS*(*p*_{r})] that a sequence *M* is generated by a Markov process *MS*(*p*_{r}) with probability of repetition *p*_{r}. The entries in the sequence *M* = *M*_{1}*M*_{2} ⋅⋅⋅ *M*_{n} are not statistically independent, but we can recode the sequence in terms of its transitions, which are independent. Let *V*_{i} = ¬(*M*_{i − 1} ⊕ *M*_{i}) for *i* = 2, ⋅⋅⋅, *n*, where ⊕ is, as before, exclusive-or: *V*_{i} is TRUE (1) if the current token is the same color as its predecessor and FALSE (0) otherwise. We recode the sequence *M* = *M*_{1}*M*_{2} ⋅⋅⋅ *M*_{n} as *M*′ = *M*_{1}*V*_{2} ⋅⋅⋅ *V*_{n} with *M*_{1} the initial state; the *V*_{i} specify how to generate each succeeding state from its predecessor: *V*_{i} = 1 (repetition) or *V*_{i} = 0 (alternation). These random variables are independent. Let *V* denote the (*n* − 1)-length binary vector of alternations/repetitions *V*_{2} ⋅⋅⋅ *V*_{n} and σ(*V*) the number of 1s in the vector (the number of repetitions). Then

\[P[M \mid MS(p_r)] = \tfrac{1}{2}\, p_r^{\sigma(V)} (1 - p_r)^{(n - 1) - \sigma(V)}.\]

the *k*^{th} feature is

*λ*_{k} > 1, where *λ*_{k} is the likelihood ratio based on the *k*^{th} feature and *L*_{k} a likelihood criterion. Because the prior odds of the two generators are 1, *L*_{k} = 1.

Each feature *φ*_{k}(*S*) can take on only a finite number of integer values *v*_{1}, *v*_{2}, ⋅⋅⋅, *v*_{p}. For example, the longest repeating subsequence in a sequence of length 20 must be an integer between 1 and 20. We can compute *P*[*φ*_{k}(*S*) = *v* ∣ *DMS*] by Monte Carlo simulation. We generate a large number of DMS sequences and observe what proportion have *φ*_{k}(*S*) = *v* for each value of *v*. This proportion approximates *P*[*φ*_{k}(*S*) = *v* ∣ *DMS*].
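The Monte Carlo estimate can be sketched as follows. The code is self-contained; the generator and the feature used here are our illustrative stand-ins for the actual generators and features of the study:

```python
import random

def dms_sequence(n=20, p_r=0.9, p_d=0.2):
    """Sample one disrupted Markov sequence (illustrative DMS generator)."""
    seq = [random.randint(0, 1)]
    for _ in range(n - 1):
        seq.append(seq[-1] if random.random() < p_r else 1 - seq[-1])
    return [1 - s if random.random() < p_d else s for s in seq]

def longest_run(seq):
    """Example feature: length of the longest same-color run."""
    best = cur = 1
    for left, right in zip(seq, seq[1:]):
        cur = cur + 1 if left == right else 1
        best = max(best, cur)
    return best

def feature_distribution(feature, sampler, trials=20000):
    """Monte Carlo estimate of P[feature(S) = v | generator] for each observed v."""
    counts = {}
    for _ in range(trials):
        v = feature(sampler())
        counts[v] = counts.get(v, 0) + 1
    return {v: c / trials for v, c in counts.items()}

dist = feature_distribution(longest_run, dms_sequence)
```

The estimated probabilities sum to 1 over the observed values of the feature, and the accuracy of each estimate improves as the number of trials grows.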