
Saturday, December 8, 2007

probabilities

Hi all,
I have been rereading Matt's recent article on phonotactic learning with phonotactic probabilities, and thinking about the upcoming workshop on cues in phonological learning, and I am wondering about the different weightings of cues in infants versus adults, and about the different kinds of cues we will discuss at the workshop: distributional, articulatory, acoustic, etc. In Matt's paper he found that adults were fairly insensitive to the probability variation of /s/ in onset position, and I wonder whether we might expect something different from infants. Let me see what you all think of my predictions. While infants are clearly *very* sensitive to probabilities in the input, they might also be less sensitive to articulatory ease, since they don't really articulate yet (this would not be true for toddlers, of course), so they might show a different performance pattern here than adults do. This also raises the question of why adults show this lack of sensitivity (perhaps something Matt can help me with): is it because they become less sensitive to probabilities over time, or because they show a heightened sensitivity to ease of articulation?
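
To make "probability of /s/ in onset position" concrete, here is a toy Python sketch of the kind of positional probability at issue. The mini-corpus is invented, orthography stands in for phonological transcription, and this is of course not Matt's actual method:

```python
from collections import Counter

# Invented mini-corpus; the first letter stands in for the syllable onset.
corpus = ["sip", "sit", "top", "sun", "pat", "nap", "sap", "tin"]

onset_counts = Counter(word[0] for word in corpus)
total = sum(onset_counts.values())

# Positional probability of each segment in onset position
for seg, n in sorted(onset_counts.items()):
    print(f"P({seg} | onset) = {n}/{total} = {n / total:.2f}")
```
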
-Amanda

Friday, November 30, 2007

Some questions on the ideas put forward by y'all

Hello everyone,

I was re-reading everyone's abstracts and thought that there were some interesting coincidences and intriguing mismatches.

For example, I might be misreading Susan's abstract, but it seems that she predicts that, with a one-to-one mapping from articulation to acoustics, the magnitude of the change in the articulation predicts basically two levels of structure: one corresponding to slower, larger movements and one to more abrupt changes. Only the latter would, on my interpretation of her abstract, provide acoustic cues. This matches Ying and Jeff's finding that an automatic feature extractor operating on the raw acoustic signal is good at finding manner features but not so good with place features. Conversely, motor cues are most useful for place of articulation but not so helpful with manner.
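
To make the two-timescale idea concrete for myself, here is a toy numpy sketch (nothing like Ying and Jeff's actual extractor, and with all parameters invented) contrasting a coarse, slowly varying energy envelope with fine frame-to-frame spectral change in a simulated signal:

```python
import numpy as np

sr = 16000                                  # sample rate (Hz), invented
t = np.arange(sr) / sr                      # one second of signal
rng = np.random.default_rng(0)
# noisy carrier with a slow, syllable-rate (4 Hz) amplitude modulation
signal = (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t)) * rng.normal(0, 1, sr)

frame, hop = 400, 160                       # 25 ms frames, 10 ms hop
frames = np.lib.stride_tricks.sliding_window_view(signal, frame)[::hop]

# Coarse level: slow energy contour (the larger, slower movements)
envelope = np.sqrt((frames ** 2).mean(axis=1))
# Fine level: frame-to-frame spectral change (the abrupt events)
spectra = np.abs(np.fft.rfft(frames, axis=1))
flux = np.sqrt((np.diff(spectra, axis=0) ** 2).sum(axis=1))

print(envelope.round(2)[:5], flux.round(1)[:5])
```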

Susan also suggests that it is not only the level of fine acoustic cues that affects speech perception, but also the other level. Might this be related to Jessica, Robert and Matt (JRM)'s learner of phonotactic constraints and segments? If I get their point right, both infants and adults would rely on acoustic cues in context to discover/perceive phonological categories, and this process may go from a holistic chunk (perhaps syllable-based?) to phone-sized elements. In relation to this, I was thinking that while one might expect articulatory experience to aid the formation of prosodic categories (like syllabic/word templates), it was at first surprising to me that it is also important for the learning of segments. Now I see that this fits really well with JRM's proposal that we may learn phonotactics and segments at the same time, don't you think?

Further, this learning strategy (of phonotactics + segments) must be flexible and still active in adulthood, given that Lisa D's adults are able to learn a minimal pair of words that relies on a non-native phonotactic contrast; but her results also suggest that the presence of a minimal pair (which, through semantic cues, forces listeners to focus on the relevant acoustic cues) is a necessary condition for learning in adults. On the other hand, the presence of minimal pairs cannot be a necessary condition in toddlers (cf. Pater et al. 2004), as Ying and Jeff point out, while already in childhood semantic information is actually helpful, according to Lisa G's results. Is there a developmental trend here? Might it be driven by a difference in processing abilities? (If so, would we predict that adults in a high-processing-load task would behave like Pater's toddlers and have a *harder* time with the minimal pair than with the non-minimal pair?)

Another question that follows from Susan's hypothesis that these two levels interact in perception is how language experience affects this interaction. Grant's paper seems to suggest that this interaction is mediated by specific experience, such that sheer experience with one of the categories (AmEng speakers must have experience with retroflex sounds, and even with the retroflex fricative in words like 'shrill') is not enough; rather, speakers must have been exposed to the particular contrast that relies on the acoustic/higher-level features under study. This is particularly important for early category formation: Chandan's corpus analysis looks at the correlation of pitch and VOT, which could be interpreted as respective examples of the larger and finer categories proposed by Susan. If IDS takes advantage of infants' apparent predilection for pitch, will its voiced segments be marked primarily by pitch? However, given young infants' very limited articulatory abilities, we'd expect them to rely primarily on acoustic-feature-based contrasts, which, now following Ying and Jeff, would predict a primacy of VOT.
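
Here is a hypothetical sketch of the kind of corpus analysis I take Chandan to be doing: measure VOT and onset F0 for a set of stop tokens and test whether they covary. All values below are invented; real numbers would come from acoustic measurements of an IDS corpus:

```python
from scipy import stats

vot_ms = [12, 8, 65, 70, 10, 58, 15, 72]           # short lag vs. long lag stops
f0_hz = [195, 190, 230, 240, 200, 225, 198, 245]   # F0 at onset of following vowel

# Do the two cues covary across tokens?
r, p = stats.pearsonr(vot_ms, f0_hz)
print(f"r = {r:.2f}, p = {p:.3f}")
```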

Monday, July 30, 2007

Integrality of acoustic cues depends on language- and contrast-specific experience

By Grant McGuire

Phonetic categories are generally assumed to have multiple cues available for listeners to use reliably in identification. Specifically, it has been shown that fricatives are identified through static spectral information resulting from the narrow fricative constriction and through dynamic formant transition information provided by movement into or out of a preceding or following vowel (Harris 1958, Heinz and Stevens 1961, Whalen 1991), and that these cues are perceived linguistically, but not psycho-acoustically, as an integral unit (Tomiak et al. 1986). The weighting of these cues varies such that formant transitions are weighted more heavily for contrasts between spectrally similar fricatives, while fricative noise is the primary cue for spectrally distinct fricatives (Harris 1958, Wagner et al. 2006). This weighting also changes over development, as young children listen preferentially to dynamic cues (Nittrouer et al. 1993, Nittrouer and Miller 1997), leading to the general hypothesis that young children preferentially attend to large-scale dynamic changes in the speech signal (the Developmental Weighting Shift, Nittrouer and Miller 1997).
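
As an illustration of the general technique (a generic sketch, not this paper's analysis): relative cue weighting is often estimated by regressing listeners' category choices on normalized cue values and comparing coefficient magnitudes. The listener below is simulated to weight formant transitions twice as heavily as fricative noise:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
noise_cue = rng.normal(0, 1, n)       # z-scored fricative-noise spectrum cue
transition_cue = rng.normal(0, 1, n)  # z-scored formant-transition cue
# simulated listener weighting transitions 2x as heavily as noise
label = (0.5 * noise_cue + 1.0 * transition_cue + rng.normal(0, 0.5, n)) > 0

X = np.column_stack([noise_cue, transition_cue])
model = LogisticRegression().fit(X, label)
w_noise, w_trans = model.coef_[0]
print(f"relative weight of transitions: "
      f"{abs(w_trans) / (abs(w_noise) + abs(w_trans)):.2f}")
```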

This paper presents results from two experiments exploring the relative weighting and integration of sibilant fricative cues by non-native speakers. Specifically, Mandarin and American English listeners discriminated and labeled examples of Polish alveopalatal [ɕ] and retroflex [ʂ] sibilants. These two language groups were chosen because this contrast is extremely similar to a native Mandarin one (Stevens et al. 2004) but is difficult for English listeners to discriminate, as they generally assimilate the two sounds to their native palato-alveolar category (Lisker 2001). The first experiment used a same-different paradigm to assess both language groups' discrimination of (A) modified naturally produced pairs of syllables with fully correlated fricative noise and formant transition cues, (B) pairs differing in a single dimension (fricative noise or formant transition), and (C) fully conflicting cue pairs. In the second experiment the same listeners labeled a two-dimensional continuum of the same modified naturally produced Polish sounds varying in fricative noise and formant transition cues, where syllables had differing degrees of correlated and conflicting cues.
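
The design can be pictured as a grid over the two cue dimensions. The sketch below is hypothetical (the step counts and the agree/conflict criterion are invented for illustration, not taken from the paper) and shows how correlated, single-cue/ambiguous, and conflicting stimuli tile such a space:

```python
import itertools

def cue_vote(step):
    # which category a single cue points to on a 5-step continuum
    return "alveopalatal" if step <= 2 else ("retroflex" if step >= 4 else "ambiguous")

for noise, transition in itertools.product(range(1, 6), repeat=2):
    votes = {cue_vote(noise), cue_vote(transition)}
    if votes == {"alveopalatal"} or votes == {"retroflex"}:
        kind = "correlated cues"
    elif "ambiguous" in votes:
        kind = "single-cue / ambiguous"
    else:
        kind = "conflicting cues"
    print(f"noise step {noise}, transition step {transition}: {kind}")
```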

An analysis of subjects' reaction times demonstrates that Mandarin listeners were significantly slowed by conflicting-cue stimuli in both experiments, unlike English listeners, who showed no such effect. Moreover, in the labeling experiment, English listeners relied solely on formant transition information and ignored fricative noise variation. This contrasts with Mandarin listeners, who used both dimensions for categorization. These results demonstrate that English listeners do not integrate the two cues: although they are able to use fricative noise information in the discrimination experiment, and although they rely heavily on such cues for their native sibilant contrast, they do not use static spectral information for this contrast. This suggests that listeners do not generalize knowledge of cues beyond the contexts in which they experience them natively, and that cue integration is a learned phenomenon heavily dependent on local context. The English listeners' reliance on formant transitions for a novel contrast also supports Nittrouer's Developmental Weighting Shift hypothesis as a general perceptual strategy not limited to language acquisition.
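
The logic of the reaction-time comparison can be illustrated with a paired test on per-subject mean RTs. The values below are invented to mimic the Mandarin-like pattern (a reliable conflicting-cue slowdown); the paper's actual statistics may differ:

```python
from scipy import stats

# invented per-subject mean RTs (ms) in the two cue conditions
rt_correlated = [612, 580, 655, 601, 590, 640, 575, 620]
rt_conflicting = [680, 610, 700, 655, 650, 690, 600, 665]

# within-subject comparison: are conflicting-cue trials slower?
t, p = stats.ttest_rel(rt_conflicting, rt_correlated)
print(f"t = {t:.2f}, p = {p:.3f}")
```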

Incentive to focus: Word learning helps listeners distinguish native and non-native sequences

By Lisa Davidson

Previous research in cross-language perception has shown that non-native listeners often assimilate both single phonemes and phonotactic sequences to native-language categories (e.g., Best 1995, Kuhl and Iverson 1995, Dupoux et al. 1999, Flege et al. 2003). These findings suggest that it would be difficult for second-language learners to overcome these phonetic barriers to learning new sounds or sequences. To study whether higher-level cues can assist learners, two experiments examined whether associating meaning with unfamiliar words helps listeners distinguish phonotactically possible and unattested sequences (see also Hayes-Harb 2007 for phoneme discrimination).

In Experiment 1, American English listeners were trained on word-picture pairings of words containing a phonological contrast between CC and CəC sequences, but which were not minimal pairs (e.g., [ftake], [fətalu]). In Experiment 2, the word-picture pairings specifically consisted of minimal pairs (e.g., [ftake], [fətake]). In the test phase, listeners saw a picture, heard words spoken by a new speaker, and had to indicate which of the words matched the picture. Results showed that participants chose the accurate CC or CəC form more often when they had learned minimal pairs than when they had learned the phonological contrast alone. Nevertheless, there was a significant asymmetry in accuracy in Experiment 1: listeners were more accurate on CəC word-picture pairings than on CC pairings. Subsequent investigation of individual listeners revealed that the participants could be divided into a high-performing and a low-performing group: the high performers were much more capable of learning the contrast between native and non-native words, while the low performers remained at chance. Results for high performers are shown in Figure 1 (attached).
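
One reproducible way to make such a high/low split (a sketch of a standard criterion, not necessarily the authors' own) is a one-sided binomial test of each participant's accuracy against chance in the two-alternative test. The trial counts and scores below are invented:

```python
from scipy.stats import binomtest

n_trials = 48  # invented trial count; chance = 0.5 in a two-alternative task
for subj, n_correct in {"S1": 38, "S2": 26, "S3": 35, "S4": 23}.items():
    p = binomtest(n_correct, n_trials, 0.5, alternative="greater").pvalue
    group = "high" if p < 0.05 else "low"
    print(f"{subj}: {n_correct}/{n_trials} correct, p = {p:.3f} -> {group} performer")
```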

These findings suggest that, at least for high performers, learning minimal pairs provides greater incentive to distinguish non-native sequences like CC from native counterparts like CəC. These experiments can be compared to a previous AX discrimination task using the same stimuli, which did not include any training on the stimuli beforehand (Davidson to appear). In the AX task, listeners were at chance in discriminating between CC and CəC tokens. Whereas evidence from the infant literature suggests that infants may encounter processing limitations in tasks requiring them to assign meaning to contrastive sounds (Pater, Stager and Werker 2004), adults have ample resources that allow them to use meaning cues to better learn the distinction between native and non-native sequences. Furthermore, greater accuracy on phonotactically legal CəC sequences may be due to listeners' ability to establish a more robust phonetic representation. For words learned as CC, participants seem to accept a greater variety of productions, suggesting that the native-language phonological prohibition on the CC sequences used in this study hampers detailed phonetic encoding of these items.

Attention to cues and phonological categorization: Motor contributions

By Lisa Goffman


The acquisition of phonological units relies on perceptual and motor experiences and biases. Perceptual factors have been the emphasis of much research, with infants responding to prosodic and segmental cues in the input. Such perceptual sensitivities are thought to provide a mechanism for parsing important language units, such as words and sentences.

Motor contributions to the acquisition of production units have been less well investigated than perceptual ones. MacNeilage and Davis (2000) have argued that motor primitives associated with the syllable emerge in the context of canonical babbling and that the segmental content of these oscillatory open-close movements is detailed over the course of development. For adult speech, Browman and Goldstein (1992) have developed the theory of Articulatory Phonology, in which movement primitives associated with each individual articulator (e.g., tongue body, lips, velum) are coordinated, with phonological units emerging from this coordination of gestures. Current models of language production (e.g., Levelt et al., 1999) include articulation as a component, but do not develop how this processing level interacts with the phonological level, especially during acquisition.

The research discussed in this paper attempts to bridge this gap by measuring articulatory movement output as children and adults produce various language units. The working hypothesis is that motor capacities interact with phonological units and that these interactions change over the course of development. I incorporate methodologies from speech motor control and from psycholinguistics to assess how grammatical, lexical, and phonological processing levels are linked to articulatory output.

In this work I recorded oral articulatory movements (using the Optotrak system) while children and adults produced different segments, syllables, words, and sentences. The patterning of oral movements, as well as the variability of that patterning, is assessed across these different speech production tasks. Three groups of results (segmental, prosodic, and semantic) are summarized. In the segmental domain, changes in a single phoneme influence the oral movement patterning of a single articulator (Goffman & Smith, 1999). Further, changes in a single segment exert broad coarticulatory effects that cross word and phrase boundaries. Another group of findings focuses on prosody, demonstrating that children produce late-developing (in English) prosodic structures with relatively stable, small, and short articulatory movements. Early-developing trochees are, counter-intuitively, produced with relatively poorly controlled and variable articulatory patterns (e.g., Goffman, 1999; Goffman, Heisler, & Chakraborty, 2006). That is, stress patterns that are acquired relatively early by English-learning children are actually produced with less motor precision. It may be that children either omit syllables or produce equal stress in their attempts to produce more precise iambs. Finally, data will be presented showing that 4-year-old children produce more stable articulatory movement patterns when a novel phonetic form is provided with a visual or functional referent. Together, this work provides clues about how articulatory cues contribute to children's developing phonological units.
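
For readers unfamiliar with how stability of articulatory movement patterning is quantified in this line of work: the usual measure (e.g., in Goffman & Smith 1999) is the spatiotemporal index (STI), the sum of standard deviations across amplitude- and time-normalized repetitions of a movement record. Below is a minimal sketch with simulated trajectories, not actual Optotrak data:

```python
import numpy as np

rng = np.random.default_rng(1)
# 10 simulated repetitions of a lower-lip movement record, unequal durations
records = []
for _ in range(10):
    n = rng.integers(80, 121)
    records.append(np.sin(np.linspace(0, 2 * np.pi, n)) + rng.normal(0, 0.1, n))

def sti(records, points=50):
    norm = []
    for r in records:
        r = (r - r.mean()) / r.std()          # amplitude normalization
        t = np.linspace(0, 1, len(r))
        # linear time normalization onto a fixed number of points
        norm.append(np.interp(np.linspace(0, 1, points), t, r))
    # sum of SDs across normalized time points
    return np.vstack(norm).std(axis=0).sum()

print(f"STI = {sti(records):.2f}")  # higher = more variable patterning
```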