Speech Perception
What is the meaning behind the speech sounds? How do we determine the meaning?
∙ If speakers differ in the acoustic properties of their vowels, then how is it that a listener knows which vowel a given speaker is trying to produce?
∙ Because speech perception involves a listener, we need to know something about the hearing mechanism.
∙ The listener hears a speech signal and perceives its meaning.
∙ Audition relates to hearing the speech signal, and perception relates to the actual decoding of the stream of sound.
Hearing
∙ If a child is born with some type of disorder that affects speech perception, can the child hear?
∙ Pathway for hearing:
  o Outer ear, middle ear, cochlea (inner ear), auditory nerve, brain.
Anatomy of Hearing
∙ The Outer Ear
  o Pinna and ear canal
∙ The Middle Ear
  o Tympanic membrane (eardrum) and ossicles
  o Eustachian tube
∙ The Inner Ear
  o Cochlea
  o Semicircular canals
Outer Ear
∙ Outer Ear (Acoustic energy)
  o Pinna or auricle: funnels sound to the ear canal and helps localize sound
  o External auditory canal (meatus): leads to the tympanic membrane
    ▪ Cerumen: wax
    ▪ Cilia: small hairs
  o Sound is captured by the auricle (pinna)
  o Funneled to the external auditory meatus (ear canal)
  o The ear canal is a tube open at one end and closed at the other
    ▪ Approximately 2.5 cm long; boosts high frequencies
Middle Ear
∙ Middle Ear (Mechanical energy)
  o Air-filled cavity
  o Outer and middle ears are separated by the eardrum (tympanic membrane)
  o Tympanic membrane: thin, tough, elastic, cone-shaped membrane.
    ▪ Vibrates as a whole for low frequencies.
    ▪ Different areas respond to high frequencies.
  o Ossicular chain: three small bones
    ▪ Malleus (hammer)
    ▪ Incus (anvil)
    ▪ Stapes (stirrup)
  o Eustachian tube (auditory tube): connects the middle ear to the nasopharynx
    ▪ Equalizes pressure and aerates the middle ear
    ▪ The tube can be opened by yawning and swallowing (lets fresh air into the middle ear)
  o Why can’t we just have the cochlea attached directly to the outer ear?
  o Impedance mismatch
    ▪ Impedance: resistance to transmission
    ▪ Liquid has higher impedance than air
    ▪ If sound pressure travels through air and reaches liquid, most of the signal will be reflected back
    ▪ Since the cochlea is fluid filled, a transformer is needed
  o The transformer increases pressure so that more of the signal is admitted into the fluid
  o What is the transformer for the ear?
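The impedance mismatch described above can be put into rough numbers. The sketch below is only an illustration: the impedance of air and of water (standing in for cochlear fluid), the eardrum and oval window areas, and the ossicular lever ratio are assumed round textbook-order values, not figures from these notes. It also assumes the usual textbook answer to the transformer question, namely that the middle ear (the large eardrum driving the small oval window through the ossicles) plays that role.

```python
import math

# Assumed round-number values for illustration; none of these appear in the notes.
Z_AIR = 415.0                # characteristic impedance of air, Pa*s/m
Z_FLUID = 1.48e6             # water, used here as a stand-in for cochlear fluid
EARDRUM_AREA_MM2 = 55.0      # effective vibrating area of the tympanic membrane
OVAL_WINDOW_AREA_MM2 = 3.2   # area of the stapes footplate at the oval window
OSSICLE_LEVER_RATIO = 1.3    # mechanical advantage of the ossicular chain


def power_transmission(z1: float, z2: float) -> float:
    """Fraction of sound power crossing a boundary between two media (normal incidence)."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


def middle_ear_pressure_gain() -> float:
    """Pressure gain from the area ratio multiplied by the ossicular lever ratio."""
    return (EARDRUM_AREA_MM2 / OVAL_WINDOW_AREA_MM2) * OSSICLE_LEVER_RATIO


if __name__ == "__main__":
    t = power_transmission(Z_AIR, Z_FLUID)
    loss_db = 10.0 * math.log10(t)
    gain_db = 20.0 * math.log10(middle_ear_pressure_gain())
    print(f"transmitted without a transformer: {t:.4%} of power ({loss_db:.1f} dB)")
    print(f"middle-ear pressure gain: {gain_db:.1f} dB")
```

Under these assumptions only about a tenth of a percent of the sound power would cross directly from air into fluid (a loss close to 30 dB), and the pressure gain of the middle ear (roughly 27 dB) makes up most of that loss.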
Inner Ear
∙ Two primary structures:
  o Three semicircular canals: help maintain balance
  o Cochlea: primary inner ear structure of hearing
∙ Vestibular System
  o Three semicircular canals; help control balance, posture, and movement
∙ Inner Ear Fluids
  o Cochlea: snail-shaped, coiled tunnel that is filled with fluid.
    ▪ The inner ear is a system of interconnecting tunnels called labyrinths; the tunnels are filled with fluid
    ▪ Perilymph: fluid that fills the scala vestibuli and scala tympani
    ▪ Endolymph: fluid that fills the scala media
    ▪ Scala media: middle of the cochlea; contains the sensory organ of hearing (organ of Corti)
    ▪ Scala vestibuli: peripheral cavity of the cochlea that communicates with the middle ear via the oval window
    ▪ Scala tympani: peripheral cavity of the cochlea that communicates with the middle ear via the round window
    ▪ Oval window: point where the inner ear begins. Allows communication between the scala vestibuli and the middle ear space
    ▪ Round window: allows communication between the scala tympani and the middle ear
    ▪ Vibrations at the oval window cause pressure waves in the fluid-filled tunnels of the cochlea
∙ Round and oval windows allow movement of fluid in the cochlea. When the oval window moves in, the round window moves out
  o Basilar membrane: floor of the cochlear duct (scala media)
∙ Organ of Corti (bathed in endolymph): rests on the basilar membrane.
  o The inner ear’s most important structure of hearing.
  o Contains thousands of hair cells (topped with cilia) that respond to sound.
  o The basilar membrane is narrower and stiffer at the base than at the tip (apex)
  o High-frequency sounds stimulate the base; low-frequency sounds stimulate the tip
  o Mechanical impulses from the middle ear are converted to electrical impulses in the inner ear
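The base-to-tip frequency layout described above can be sketched with the Greenwood place-frequency function, a commonly used approximation that these notes do not mention. The human constants (165.4, 2.1, 0.88) and the 35 mm membrane length are assumptions brought in for illustration, and the function names are hypothetical.

```python
import math

# Assumed Greenwood constants for the human cochlea (not taken from the notes).
A_HZ = 165.4        # scaling constant in Hz
ALPHA = 2.1         # slope, with position given as a proportion of membrane length
K = 0.88            # low-frequency correction term
LENGTH_MM = 35.0    # assumed basilar membrane length


def greenwood_frequency(x_from_apex: float) -> float:
    """Best frequency (Hz) at position x, where 0.0 is the apex and 1.0 is the base."""
    return A_HZ * (10 ** (ALPHA * x_from_apex) - K)


def place_from_apex(freq_hz: float) -> float:
    """Inverse map: proportional distance from the apex for a given frequency."""
    return math.log10(freq_hz / A_HZ + K) / ALPHA


def distance_from_base_mm(freq_hz: float, length_mm: float = LENGTH_MM) -> float:
    """Distance from the base (oval window end) in millimetres."""
    return (1.0 - place_from_apex(freq_hz)) * length_mm


if __name__ == "__main__":
    # Hypothetical probe frequencies, from low to high.
    for f in (250.0, 1000.0, 4000.0, 8000.0):
        print(f"{f:7.0f} Hz -> {distance_from_base_mm(f):5.1f} mm from the base")
```

Higher frequencies come out closer to the base and lower frequencies closer to the apex, which matches the description in the notes. The exact millimetre values depend entirely on the assumed constants.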
∙ Inner Ear
  o The sound /i/ produces a traveling wave along the basilar membrane with two points of maximum displacement
    ▪ One near the apex for the low resonance (Formant 1)
    ▪ One near the base for the higher resonance (Formant 2)
  o If a person said /si/, the membrane displacement would initially be closer to the base because of the high-frequency /s/
Anatomy of Hearing
∙ Acoustic nerve (cranial nerve VIII)
  o Carries electrical impulses from the cochlea to the brain
    ▪ Primary auditory area in the temporal lobe
    ▪ 2 branches: the vestibular branch and the acoustic branch
Normal Hearing
Sound is conducted to the inner ear by:
∙ Air conduction: sound waves travel through the air to the outer ear and are transmitted through the middle ear to the cochlea
∙ Bone conduction: vibrations of bone cause movement in the fluids of the inner ear
∙ Larger bones of the skull conduct sound because they vibrate in response to airborne sound waves
∙ We normally hear our own voice through a combination of air- and bone-conducted sound
Physiology of Hearing
1. Sound waves are directed by the pinna into the ear canal
2. The waves strike the eardrum and cause it to vibrate
3. The eardrum is connected to the malleus; its vibration moves the ossicles back and forth
4. As the stapes moves, it pushes the oval window in and out
5. The movement of the oval window makes waves in the fluid
6. The pressure of the waves causes the basilar membrane to vibrate, moving the hair cells in the organ of Corti
7. The movement of the hair cells generates nerve impulses
8. The nerve impulses are passed on to the 8th nerve and transmitted to the auditory area of the brain
∙ Transformation of sound wave
  o Air disturbances are converted to mechanical vibrations by the ossicles in the middle ear.
  o Mechanical vibrations are transformed to fluid vibrations in the cochlea.
  o Fluid vibrations are converted to electrochemical changes by the cilia and nerve endings in the cochlea.
    ▪ Electrochemical changes are sent to the brain as nerve impulses.
Review
1. What is speech perception?
2. Where does it occur in the speech chain and why?
3. Identify the energy transitions in the hearing mechanism and what part of the ear performs each transition.
4. Describe which portions of the cochlea respond to which frequencies; where would /si/ be heard?
5. Name some disorders that have a speech perception component and why.
6. How would you assess them?
7. How would you treat them?
Speech Perception
∙ Ability to seek and recognize acoustic patterns
∙ Speech sounds are often not discrete or separate
∙ Context and suprasegmentals are used to help decode the message
Vowels
∙ F1 and F2 are used primarily to distinguish vowels
∙ One theory is that we use formant patterns, not absolute formant frequencies, to distinguish vowels
∙ Another theory states that we use the point vowels to normalize formant frequencies to help with identification
∙ Problems with normalization
  o Studies show that there is no scaling formula that helps a listener normalize frequencies
  o Listeners may not have to normalize to identify; remember, listeners can identify the vowel of an unknown speaker without ever having heard the speaker before
∙ What is the consensus?
  o Unknown; there are many theories.
  o Formant frequencies, u, and patterns play a role in identification
  o Information about articulation is coded in the acoustic signal
Other Speech Sounds
∙ Diphthongs
  o Listener perceives rapid changes in formants
∙ Semivowels
  o Frequency changes in F2 and sometimes F3
  o Fast formant transitions contribute to their perception as consonant-like rather than as diphthongs
  o F3 distinguishes /r/ from /l/
∙ Nasals
  o Listener needs to decide two things: nasal or not? Labial, alveolar, or palatal-velar?
  o Formant transitions of the vowels preceding a nasal distinguish nasals as a class
  o Other distinguishing features:
    ▪ Weakening of upper formant intensity because of antiresonances (antiformants)
    ▪ Additional resonance below 500 Hz (usually around 250 Hz) called the nasal formant
  o Transitions to and from /m/ are lowest in frequency and shortest in duration
    ▪ /n/: higher in frequency and longer in duration
    ▪ /ŋ/: highest and most variable in frequency and longest in duration
∙ Stops
  o Cues: stop gap, release burst, voice onset time, formant transitions
    ▪ Bilabials: F2 increases from stop release to vowel
    ▪ Alveolars: F2 decreases from stop release to vowel, except for high front vowels
    ▪ Velars: F2 decreases from stop release to vowel
∙ Fricatives
  o Most distinguishable feature is aperiodic noise.
  o Stridents /s, z, ʒ, ʃ/ have high-frequency spectral peaks; nonstridents /f, v, h, ð, θ/ have flat, diffuse spectra
  o /ð/ is the most confused because of its small to nonexistent resonating cavity
∙ Affricates
  o Characteristics of a stop with the addition of a fricative
  o Listener listens for durational and spectral cues in production
Development of Speech Perception
∙ Infant Speech Perception
  o Fetuses may respond to sound stimuli in their normal environment
  o The fetal auditory system is functional after 24 weeks gestational age (GA) and improves during the last trimester
  o A large number of speech components (mostly prosodic) can be transmitted through amniotic fluid
  o The fetus is able to perform speech-relevant acoustic discrimination towards the end of gestation
  o These responses have been tested using electrophysiological measures (microphones, auditory evoked potentials), neurochemical measures (fetal brain activity via cerebral glucose utilization, in animal models), and, mostly, behavioral responses (startle response seen on ultrasound, heart rate changes)
  o Prefer: maternal voice, familiar story, musical sequences, speech sequence sung by mother, maternal language
  o Initially, maternal vocalizations gain and maintain the infant’s attention and arousal state
  o Prosodic elements are used to communicate maternal affect to the infant (approval, happiness, anger, warning); expanded intonation may facilitate infant identification of the mother
  o Prosodic patterns could contribute to the development of speech perception skills
  o Discrimination tasks are tested by changes in sucking rate
∙ Theories of Speech Perception
  o Motor Theories
    ▪ Listener refers to articulation to address variations in the acoustic signal
    ▪ Speech is perceived by referencing it to how it is produced.
    ▪ Speech is heard; the listener then accesses his or her own knowledge of how phonemes are articulated
    ▪ If it fits, then it is correct.
    ▪ McGurk Effect: visual and auditory information may both play a role in speech perception. We store visual as well as auditory memories of phonetic gestures.
  o Auditory Theories
    ▪ Listener identifies acoustic patterns and matches them to learned and/or stored acoustic-phonetic features
    ▪ Speech production plays a minimal role.
    ▪ Listeners are sensitive to distinctive patterns of the speech wave
    ▪ Auditory features of phonemes are detected
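The auditory-theory idea of matching an incoming acoustic pattern to stored features can be illustrated with a toy vowel matcher built on F1 and F2. This is a sketch, not a model of how listeners actually do it: the template values are assumed, roughly adult-male averages for the three point vowels, and the log-frequency distance is one simple way to compare formant patterns rather than absolute frequencies. All names are hypothetical.

```python
import math

# Assumed (F1, F2) templates in Hz for the point vowels; illustrative values only.
TEMPLATES = {
    "i": (270.0, 2290.0),   # /i/
    "a": (730.0, 1090.0),   # /ɑ/, written with an ASCII key
    "u": (300.0, 870.0),    # /u/
}


def log_distance(a: tuple, b: tuple) -> float:
    """Euclidean distance between two (F1, F2) pairs on a log-frequency scale."""
    return math.hypot(math.log(a[0] / b[0]), math.log(a[1] / b[1]))


def identify_vowel(f1: float, f2: float, templates: dict = TEMPLATES) -> str:
    """Return the label of the stored template closest to the measured formants."""
    return min(templates, key=lambda label: log_distance((f1, f2), templates[label]))


if __name__ == "__main__":
    # Hypothetical tokens from a speaker with higher formants than the templates.
    for f1, f2 in ((310.0, 2790.0), (850.0, 1220.0), (370.0, 950.0)):
        print(f"F1={f1:.0f} Hz, F2={f2:.0f} Hz -> /{identify_vowel(f1, f2)}/")
```

Because distances are taken on a log scale, a token whose formants are all shifted upward by a similar proportion still lands nearest the same template. That is one way to picture pattern-based identification without an explicit normalization step.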