Repeated presentation of a visual sequence usually reinforces short-term memory, but it can sometimes impair recall of order. This impairment is called the midstream order deficit (MOD). For example, Holcombe et al. (2001) reported that the relative order of a 4-letter sequence was easier to recall after a single presentation than after repeated presentation. Using Kanji characters, which are logographic symbols used in Japanese and have many homophones, Yokosawa (VSS, 2005) examined whether the MOD is tied to phonological encoding of relative order. The MOD was not found with stimulus sets of phonologically identical characters and written responses, whereas it was obtained with stimulus sets of phonologically different characters and verbal responses. These results suggest an influence of phonological encoding on the MOD, although stimulus set and response method were confounded. In Experiment 1, sets of four phonologically different Kanji characters were used with written responses. After each sequence presentation, participants reported the order by connecting the printed characters on the response sheet with arrows. Accuracy of order recall was significantly lower in the cycling presentation trials than in the single presentation trials. This shows that the MOD does not depend on the reporting method or on whether the report itself requires phonological encoding. To determine the influence of phonological encoding more precisely, Experiment 2 added an articulatory suppression task, and the MOD disappeared. Thus the MOD disappears when phonological encoding of relative order is prevented, suggesting that phonological encoding is a critical determinant of the MOD.