These 72-page class notes were uploaded by Marco Wolf on Monday, September 7, 2015. They belong to PSY 341K at the University of Texas at Austin, taught by David Gilden in the fall. Since their upload, they have received 75 views.

Date Created: 09/07/15
The Journal of Neuroscience, August 6, 2003, 23(18):7160-7168

Behavioral/Systems/Cognitive

The Statistical Structure of Human Speech Sounds Predicts Musical Universals

David A. Schwartz, Catherine Q. Howe, and Dale Purves

Department of Neurobiology and Center for Cognitive Neuroscience, Duke University Medical Center, Duke University, Durham, North Carolina 27710

The similarity of musical scales and consonance judgments across human populations has no generally accepted explanation. Here we present evidence that these aspects of auditory perception arise from the statistical structure of naturally occurring periodic sound [...] that the probability distribution of amplitude-frequency combinations in human utterances predicts both the structure of the chromatic scale [...] so that what we hear [...] by the [...] stimuli and their naturally occurring sources, rather than by the physical parameters of the stimulus per se.

Key words: audition; auditory system; perception; music; scales; consonance; tones; probability

Introduction

All human listeners perceive tones in the presence of regularly repeating patterns of sound pressure fluctuation over a wide range of frequencies. This quality of audition forms the basis of tonal music, a behavioral product characteristic of most, if not all, human populations. The widely shared features of tonal music that are deemed to be "musical universals" include: (1) a division of the continuous dimension of pitch into iterated sets of 12 intervals that define the chromatic scale (Nettl, 1956; Deutsch, 1973; Kallman and Massaro, 1979; Krumhansl and Shepard, 1979); (2) the preferential use in musical composition and performance of particular subsets of these 12 intervals [e.g., the intervals of the diatonic or (anhemitonic) pentatonic scales] (Budge, 1943; Youngblood, 1958; Knopoff and Hutchinson, 1983); and (3) the similar consonance ordering of chromatic scale tone combinations reported by most listeners (Malmberg, 1918; Krumhansl, 1990). Although the response
properties of some auditory neurons to musical tone combinations (Tramo et al., 2001) and other complex time-varying signals (Escabi and Schreiner, 2002) are now known, as are some neuroanatomical correlates of music perception (Peretz et al., 2001; Janata et al., 2002), these perceptual phenomena have no generally accepted explanation in either physiological or psychological terms. Thus, the basis for tonal music, one of the most fascinating and widely appreciated aspects of human audition, remains obscure.

Received April 10, 2003; revised May 21, 2003; accepted May 22, 2003. Copyright 2003 Society for Neuroscience.

Here we explore a conceptual framework for understanding musical universals suggested by recent work on human vision (for review, see Knill and Richards, 1996; Purves et al., 2001; Rao et al., 2002; Purves and Lotto, 2003). A fundamental challenge in understanding what we see is that the physical source of a retinal image cannot be derived directly from the information in the stimulus (a quandary referred to as the "inverse optics problem"). In audition as well, the physical characteristics of the stimulus at the ear cannot uniquely specify the generative source to which the listener must respond (Gordon et al., 1992; Hogden et al., 1996; Driscoll, 1997), presenting a similar "inverse acoustics problem." Acoustical stimuli are inherently ambiguous because a given variation in sound pressure can arise from many different combinations of initiating mechanical force, resonant properties of the body or bodies acted on, and qualities of the medium and structural environment intervening between the source and the listener. Even though the physical sources of sound stimuli are not specified by the sound pressure variations at the receptor surface, it is these sources that the auditory brain must decipher to generate successful behavior. This fundamental problem suggests that the auditory system, like the visual system, may generate percepts statistically (i.e., based on the relative
number of times different possible sources have been associated with a particular stimulus in the history of the species and the individual).

The hypothesis examined here is therefore that tonal percepts and the musical universals that characterize this aspect of auditory perception are determined by the statistical relationship between [periodic stimuli at] the ear and [their possible sources]. Because this statistical linkage derives from the structure of [naturally occurring periodic stimuli], the wide[ly shared aspects] of music perception, such as musical scale intervals and consonance ordering, should be predicted by the statistical structure of the periodic stimuli to which human beings have always been exposed. Although periodic sound energy derives from many different natural sources, including nonhuman animal calls (Fletcher, 1992) and circumstances in which mechanical forces generate periodic stimuli (e.g., "blowholes" or other resonant structures that occasionally produce periodic sounds from the action of wind or water), human vocalization is a principal source of periodic sound energy to which human beings have been chronically exposed over both evolutionary and individual time. Accordingly, speech sounds provide a first approximation of the universe of tone-evoking stimuli for humans. We therefore examined a database of recorded human speech to ask whether the relative likelihood of different amplitude-frequency combinations in these utterances predicts the phenomenology of musical scale structure and consonance ordering.

Materials and Methods

The primary source of speech sounds we analyzed was the Texas Instruments/Massachusetts Institute of Technology (TIMIT) Acoustic-Phonetic Continuous Speech Corpus (Garofolo et al., 1990). This corpus, created for linguistics and telecommunications research, comprises 6300 utterances of brief sentences by native English speakers. The corpus was generated by having 441 male and 189 female speakers representing eight major dialect
regions of the United States each utter the same set of 10 sentences (Garofolo et al., 1990). Technical specifications regarding the selection of speakers, construction of the sentence list, recording conditions, and signal processing can be found in Fisher et al. (1986) and Lamel et al. (1986) or obtained from the Linguistic Data Consortium at the University of Pennsylvania (http://www.ldc.upenn.edu). To ensure that any conclusions reached on the basis of the TIMIT analysis were not dependent on the specific features of English or any particular language, we also analyzed the Oregon Graduate Institute of Science and Technology (OGI) Multi-language Telephone Speech Corpus (Muthusamy et al., 1992). This second corpus comprises 1000 utterances in Farsi, French, German, Hindi, Japanese, Korean, Mandarin Chinese, Spanish, Tamil, and Vietnamese, respectively.

Figure 1A shows the change in sound pressure over time for a representative spoken sentence in the TIMIT corpus. For each speaker, we randomly sampled 40 0.1 sec speech sounds from each of the 10 utterances. To avoid silent intervals, segments in which the maximum amplitude was <5% of the maximum amplitude in the utterance as a whole were excluded, eliminating 8% of the initial sample. A discrete fast Fourier transform (Nyquist frequency, 8000 Hz) was applied to each of the remaining segments. To identify patterns of spectral energy common to the large number of individual spectra in the corpus, these data were normalized by expressing all amplitude and frequency values in a given spectrum as ratios with respect to the amplitude maximum of the spectrum and the frequency at that maximum. Thus, the abscissa and ordinate in Figures 2 and 4 are, respectively, Fn = F/Fm and An = A/Am, where Am and Fm are the maximum amplitude and its associated frequency, A and F are any given amplitude and frequency values in the spectrum, and An and Fn are the normalized values. This method of normalization avoids any assumptions about the structure of human speech sounds (e.g., that such sounds should
be conceptualized in terms of ideal harmonic series).

For each speaker, a probability distribution of amplitude-frequency pairs was generated by summing the number of occurrences of all possible amplitude-frequency combinations in the randomly sampled speech sounds; a normalized spectrum for the speaker was then obtained by taking the average amplitude at a given frequency ratio. The normalized spectrum for the corpus as a whole was generated by plotting the group mean amplitude values for all 630 individual speakers at each frequency ratio value. The functions in Figure 2 thus represent the relative concentration of sound energy at different frequency ratios relative to the maximum amplitude of a spectrum in the stimuli generated by spoken American English. The same procedure was used to generate normalized spectra for the other languages studied, except that the OGI data were not analyzed by individual speaker. File conversion and acoustic analyses were performed using Praat (Boersma and Weenink, 2001) and Matlab (Mathworks, 1996) software running on a Macintosh G4 computer.

The robustness of the relative consonance rankings across the seven different empirical studies reviewed in Malmberg (1918) was assessed by reliability analysis, in which we treated each musical interval as an item and each of the studies as a measure (see Fig. 7). To quantify the relative concentration of power at each of the maxima in the normalized spectrum, we performed regression analysis and obtained the residual mean normalized amplitude values at each maximum from a logarithmic function fit to the data (r2 = 0.97). A second measure of power concentration was obtained by calculating the slope of each local maximum.

Figure 1. Analysis of speech segments. A, Variation of sound pressure level over time for a representative utterance from the TIMIT corpus (the sentence in this example is "She had your dark suit in greasy wash water all year"). B, Blowup of a 0.1 sec segment extracted from the utterance (in this example, the vowel sound in "dark"). C, The spectrum of the extracted segment in B, generated by application of a fast Fourier transform. [Plots not reproduced; axes are amplitude against time (s) in A and B, and amplitude against frequency (Hz), 0-8000, in C.]

Results

The statistical structure of American English speech sounds

Figure 2A shows the probability distribution of amplitude-frequency pairs in the speech sounds sampled from the TIMIT corpus over three octaves; the mean amplitude values over this same range are shown in Figure 2B. The blowup of the normalized spectrum over the single octave range 1-2 in Figure 2C shows statistical concentrations of power not only at integer multiples of the global maximum, as would be expected for any set of periodic stimuli, but also at frequency ratios that are not simply integer multiples of the maximum. Figure 2D shows this portion of the spectrum separately for male and female speakers. The variation in the normalized amplitude values is least at frequency ratios where power is concentrated for both male and female utterances (e.g., 2.0, 1.5) and greatest at frequency ratios where [...] (see below).

Figure 2. [Caption illegible in source. Panels A-D plot the normalized speech spectra against frequency ratio; panel D shows male (blue) and female (red) speakers separately.]

The structure of the normalized data in Figure 2 is a direct consequence of the acoustics of human vocalization. In physical terms, the human vocal apparatus has been modeled as a source and filter (Lieberman and Blumstein, 1988; Stevens, 1999). During [...] folds into sustained harmonic oscillation, which generates a roughly triangular sound pressure wave that is approximately periodic over short time
intervals (Ladefoged, 1962). This complex waveform has maximum power at the fundamental frequency [...] frequency values approximating integer multiples (i.e., harmonics) [...]. The power [...] decreases exponentially with increasing frequency, such that the power [...] harmonic number n is equal to [...] the power at the fundamental, accounting for the exponential decrease in [...] are determined by both the length and shape of the vocal tract [...] the power of the laryngeal pressure wave will be least attenuated [...] sounds, the length of the vocal tract for a given speaker is relatively fixed. Consequently, the resonance related to tract length is [...]. For an ideal pipe closed at one end and measuring 17 cm (the approximate length of the vocal tract in adult human males), resonances occur at 500 Hz and its odd harmonics (e.g., 1500 Hz, 2500 Hz, etc.). The human vocal tract is not, of course, an ideal [...] in voiced speech sounds range from 440-1000 Hz (Hillenbrand et al., 1995). Considering the vocal tract as an ideal pipe, however, is a useful simplification here, given that the power of the complex sound signal generated by the laryngeal source de[...] through the vocal tract, it is modified (filtered) by the natural [...] frequency associated with much of the power in any speech sound.

Figure 3B values (frequency ratios in the range 1-2 at which power concentrations are expected, by harmonic number of the amplitude maximum):

Harmonic number   Frequency ratios
1                 1, 2
2                 1, 1.5, 2
3                 1, 1.33, 1.67, 2
4                 1, 1.25, 1.5, 1.75, 2
5                 1, 1.2, 1.4, 1.6, 1.8, 2
6                 1, 1.17, 1.33, 1.5, 1.67, 1.84, 2

Figure 3. Probability distribution of the harmonic number at which the maximum amplitude occurs in speech sound spectra derived from the TIMIT corpus. A, The distribution for the first 10 harmonics of the fundamental frequency of each spectrum (probability plotted against harmonic number). More than 75% of the amplitude maxima occur at harmonic numbers 2-5. B, The frequency ratio values at which power concentrations are expected within the frequency ratio range 1-2 (Fig. 2) when the maximum
amplitude in the spectrum of a periodic signal occurs at different harmonic numbers. There are no peaks in Figure 2 at intervals corresponding to the reciprocals of integers >6, reflecting the paucity of amplitude maxima at harmonic numbers >6 (A). See Materials and Methods for further explanation.

Thus, for a male utterance having a fundamental frequency of 100 Hz, for example, the fifth harmonic is likely to be the frequency at which power is concentrated in the spectrum of that utterance (because the harmonics at or near 500 Hz are likely to be least attenuated). Similarly, for a female utterance having a fundamental frequency of 250 Hz, the second harmonic will tend to be the frequency at which the amplitude in the vocal spectrum is greatest. Because the fundamental frequency of most adult human utterances lies between 100 and 250 Hz (Stevens, 1999), the frequency of the third or fourth harmonic should most often be the frequency at which the power is maximal in adult human utterances; conversely, power maxima at harmonic numbers less than two or greater than five will be relatively rare. The empirical distribution of amplitude maxima plotted according to harmonic number for the speech sound spectra in the TIMIT corpus accords with this general analysis (Fig. 3A).

Figure 3B shows why the distribution of amplitude maxima in Figure 3A produces the statistical concentrations of power observed in Figure 2. If the maximum power in a given utterance were to occur at the fundamental frequency of the spectrum in question, additional peaks of power would be present at frequency ratios of 2, 3, 4, ... n with respect to the maximum. Note that "peaks" refer to concentrations of power at integer multiples of the fundamental frequency of a speech sound, and not to the formants of the vocal tract (see above). If, however, the maximum amplitude were to be at the second harmonic, additional amplitude peaks would occur at frequency ratios of 1.5, 2, 2.5, etc., with respect to the maximum. And if the maximum amplitude were to occur at the third harmonic, additional amplitude peaks would be apparent at ratios of 1.33, 1.67, 2, etc. Therefore, in any normalized speech sound spectrum where frequency values are indexed to the value at the amplitude maximum, sound energy will tend to be concentrated at intervals equal to the reciprocal of the harmonic number of the amplitude maximum.

Figure 4. Statistical structure of speech sounds in Farsi, Mandarin Chinese, and Tamil, plotted as in Figure 2 (mean normalized amplitude in dB against frequency ratio, range 1-2). American English is included for comparison. The functions differ somewhat in average amplitude but are remarkably similar both in the frequency ratios at which amplitude peaks occur and the relative heights of these peaks.

The distribution of amplitude maxima in speech sound spectra thus explains why power in human utterances analyzed in this way tends to be concentrated at frequency ratios that correspond to the reciprocals of 2, 3, 4, and 5, and not simply at integer multiples of the frequency at the maximum amplitude (as would be the case for complex periodic stimuli that have maximum power at the fundamental frequency). The important corollary for the present analysis is that the occurrence of amplitude maxima at the positions evident in Figure 2 follows directly from the empirically determined probability distribution of the harmonic numbers at which the maximum power tends to be concentrated in speech sounds (Fig. 3) and cannot be derived from any a priori analysis of an idealized harmonic series.

The statistical structure of speech sounds in other languages

Figure 4 shows the normalized spectrum for English together with corresponding analyses for speech sounds in Farsi, Mandarin Chinese, and Tamil (see Materials and Methods). Although the average amplitude differs somewhat across languages, the frequency ratio values at which amplitude maxima occur, as well as the relative prominence of
these maxima, are remarkably consistent. Thus, the relative concentration of power at different frequency ratios in the normalized spectrum of speech sounds is largely independent of the language spoken, as expected if these data are determined primarily by the physical acoustics of the larynx and vocal tract. It is reasonable to suppose, therefore, that the statistical structure of speech sounds shown in Figures 2 and 4 is a universal feature of the human acoustical environment. By the same token, musical perceptions predicted by the normalized spectrum of the speech sounds in any particular language should apply to all human populations.

Rationalizing musical universals on the basis of speech sound statistics

The widely shared phenomena in musical perception that require explanation in terms of the probabilistic relationship of auditory stimuli and their sources are: (1) the partitioning of the continuous dimension of pitch into the iterated sets of 12 intervals that define the chromatic scale; (2) the preferential use of particular subsets of these intervals in musical composition and performance; and (3) similar consonance ordering of chromatic scale tone combinations across human populations.

The chromatic scale

All musical traditions employ a relatively small set of tonal intervals for composition and performance, each interval being defined by its relationship to the lowest tone of the set. Such sets are called musical scales. Despite some interesting variations, such as the pélog scale used by Gamelan orchestras in Indonesia, whose metallophone instruments generate nonharmonic overtones, the scales predominantly used in all cultures over the centuries have used some (or occasionally all) of the 12 tonal intervals that in Western musical terminology are referred to as the chromatic scale (Nettl, 1956; Carterette and Kendall, 1999). There is at present no explanation for this widely shared tendency to use these particular tones for the
composition and performance of music from among all the possible intervals within any given octave.

Figure 5A shows the frequency ratio values of the nine peaks that are apparent in the spectra of all four languages illustrated in Figures 2 and 4. As indicated, the local amplitude maxima within any octave in the normalized spectrum of speech sounds occur at frequency ratios corresponding to intervals of the chromatic scale. For any of the three tuning systems that have been used over the centuries, the frequency ratios that define the octave, fifth, fourth, major third, major sixth, minor third, minor seventh, minor sixth, and tritone fall on or very close to the relative frequency values of the mean amplitude maxima in the normalized spectrum of human speech sounds (Fig. 5B). The remaining intervals of the chromatic scale (the major second, the major seventh, and the minor second) are not apparent as peaks within frequency ratio range 1-2. Within the octave interval defined by the normalized frequency ratio range of 2-4, however, the spectral peaks at frequency intervals of 2.25 and 3.75 correspond in this higher octave to the frequency ratios that define the major second (1.125) and the major seventh (1.875) in the lower octave. Only the frequency ratio of the minor second (1.067) has no apparent peak in the statistical analysis we have done. Recall that these concentrations of power cannot be derived from an ideal vibrating source but are specific empirical attributes of sound stimuli generated by the human vocal tract.

The fact that the frequency ratios that define most of the 12 intervals of the chromatic scale correspond to the empirical concentrations of power in human speech sounds supports the hypothesis that the chromatic scale arises from the statistical structure of tone-evoking stimuli for human listeners.

The preferred use of particular intervals in the chromatic scale

Some intervals of the chromatic scale, such as the octave, the fifth, the fourth, the major third, and the major sixth, are
more often used in composition and performance than others (Budge, 1943; Youngblood, 1958; Knopoff and Hutchinson, 1983). These, along with the major second, form the intervals used in the pentatonic scale and the majority of the seven intervals in a diatonic major scale, the two most frequently used scales in music worldwide (Carterette and Kendall, 1999).

Figure 5B values (frequency ratios with respect to the tonic):

Interval    Just intonation   Equal temperament   Pythagorean
octave      2.000             2.000               2.000
maj 7th     1.875             1.888               1.900
min 7th     1.750             1.782               1.788
maj 6th     1.667             1.682               1.688
min 6th     1.600             1.587               1.602
fifth       1.500             1.498               1.500
tritone     1.406             1.414               1.407
fourth      1.333             1.335               1.333
maj 3rd     1.250             1.26                1.265
min 3rd     1.200             1.189               1.201
maj 2nd     1.125             1.122               1.125
min 2nd     1.067             1.059               1.068
unison      1.000             1.000               1.000

Figure 5. Comparison of the normalized spectrum of human speech sounds and the intervals of the chromatic scale. A, The majority of the musical intervals of the chromatic scale (arrows) correspond to the mean amplitude peaks in the normalized spectrum of human speech sounds, shown here over a single octave (Fig. 2C). The names of the musical intervals and the frequency ratios corresponding to each peak are indicated. B, A portion of a piano keyboard indicating the chromatic scale tones over one octave, their names, and their frequency ratios with respect to the tonic in the three major tuning systems that have been used in Western music. The frequency ratios at the local maxima in A closely match the frequency ratios that define the chromatic scale intervals.

The preference for these particular intervals among all the possible intervals in the chromatic scale is also predicted by the normalized spectrum of human speech sounds illustrated in Figures 2 and 4. As is apparent in Figure 5, the frequency ratios that define
the octave, fifth, fourth, major third, and major sixth are those that, among the ratios that define the 12 chromatic scale tones within an octave, correspond to the five greatest statistical concentrations of power in the signals generated by human utterances (see also next section).

The fact that the most frequently used musical intervals correspond to the greatest concentrations of power in the normalized spectrum of human speech sounds again supports the hypothesis that musical percepts reflect a statistical process by which the human auditory system links ambiguous sound stimuli to their physical sources.

Figure 6. Consonance ranking of chromatic scale tone combinations (dyads) in the seven psychophysical studies reported by Malmberg (1918), Faist (1897), Meinong and Witasek (1897), Stumpf (1898), Buch (1900), Pear (1911), and Kreuger (1913). Graph shows the consonance rank [...] of the 12 chromatic [...]. The median values are indicated by open circles connected by a dashed line.

Consonance and dissonance

Perhaps the most fundamental question in music perception, and arguably the common denominator of all the musical phenomena considered here, is why certain combinations of tones are perceived as relatively consonant or "harmonious" and others as relatively dissonant or "inharmonious." These perceived differences among the possible combinations of tones making up the chromatic scale are the basis for harmony in music composition. The more compatible of these combinations are typically used to convey "resolution" at the end of a musical phrase or piece, whereas less compatible combinations are used to indicate a transition, a lack of resolution, or to introduce a sense of tension in a chord or melodic sequence.
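The harmonic-number argument made earlier in the Results (Fig. 3B) can be sketched in a few lines: if the amplitude maximum of a periodic signal sits at harmonic k, the other harmonics fall at ratios n/k relative to that maximum. The function names below are hypothetical, and the mapping of the just-intonation decimals of Figure 5B onto small-integer fractions (e.g., 1.667 as 5/3) is an assumption made for illustration; the tritone (1.406) is left out because it does not reduce to one of these simple ratios. This is a minimal sketch of the reasoning, not the authors' analysis, which was empirical.

```python
from fractions import Fraction

def expected_peak_ratios(k):
    """Ratios in [1, 2] of harmonics k..2k to harmonic k (the assumed amplitude maximum)."""
    return [Fraction(n, k) for n in range(k, 2 * k + 1)]

# Assumed fractional forms of the just-intonation column of Figure 5B.
JUST_INTONATION = {
    Fraction(1, 1): "unison",
    Fraction(6, 5): "minor third",
    Fraction(5, 4): "major third",
    Fraction(4, 3): "fourth",
    Fraction(3, 2): "fifth",
    Fraction(8, 5): "minor sixth",
    Fraction(5, 3): "major sixth",
    Fraction(7, 4): "minor seventh",
    Fraction(2, 1): "octave",
}

def named_peaks(harmonic_numbers):
    """Collect expected peak ratios over several harmonic numbers and name the matches."""
    ratios = set()
    for k in harmonic_numbers:
        ratios.update(expected_peak_ratios(k))
    return [(r, JUST_INTONATION[r]) for r in sorted(ratios) if r in JUST_INTONATION]

# Harmonic numbers 2-5 carry most of the amplitude maxima (Fig. 3A).
for ratio, name in named_peaks(range(2, 6)):
    print(f"{float(ratio):.3f}  {name}")
```

With harmonic numbers 2 through 5, the expected ratios also include 7/5 and 9/5, which have no entry in the assumed table and are therefore not printed.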
Malmberg (1918) has provided the most complete data about this perceptual ordering, based on seven psychophysical studies of some or all of the 12 combinations of simultaneously sounded chromatic scale tones (Fig. 6) (Kameoka and Kuriyagawa, 1969; Hutchinson and Knopoff, 1978; Krumhansl, 1990; Huron, 1994). Although 72 combinations of the tones in the chromatic scale are possible, others are redundant with the 12 combinations tested. There is a broad agreement across these studies about the relative consonance of a given tone combination, the concordance being greater for combinations rated high in consonance than for the combinations rated low. The coefficient of reliability of the several rankings shown in Figure 6 (Cronbach's α) is 0.97.

To examine whether consonance ordering is also predicted by the statistical relationship between tone-evoking stimuli and their generative vocal sources, we compared the consonance ranking of chromatic scale tone combinations to the relative concentrations of power in human speech sounds at the frequency ratios that define the respective chromatic scale tone combinations (i.e., musical dyads) (Fig. 7). The two measures of power concentration used were the residual amplitude at each local maximum in Figure 2C (Fig. 7A) and the slopes of these local peaks (Fig. 7B). The Spearman rank-order correlation coefficient for the data plotted in Figure 7A is rs = 0.91 (t(8) = 6.05; p < 0.001); for the data plotted in Figure 7B, rs = 0.89 (t(8) = 5.45; p < 0.01). For the nine maxima evident in the octave range 1-2, both metrics show that the relative concentration of power in human speech sounds at a particular frequency ratio matches the relative consonance of musical dyads. The absence of maxima corresponding to the major second, major seventh, and minor second in Figure 5A predicts that these intervals should be judged the least consonant of the 12 possible chromatic scale tone combinations, as indeed they are (Fig. 6).

This evidence that consonance ordering is also
predicted by the statistical structure of human speech sounds further supports the hypothesis that musical universals reflect a probabilistic process underlying the perception of periodic auditory stimuli.

Discussion

The results of these analyses show that the statistical acoustics of human speech sounds successfully predict several widely shared aspects of music and tone perception. Here we consider earlier explanations of these phenomena in relation to the evidence in the present report and the implications of our results for further studies of auditory processing.

Earlier explanations of these musical phenomena

Explanations of consonance and related aspects of scale structure put forward by previous investigators fall into two general categories: psychoacoustical theories and pattern recognition theories (Burns, 1999). The inspiration for both lines of thought can be traced back to Pythagoras, who, according to ancient sources, demonstrated that the musical intervals corresponding to octaves, fifths, and fourths in modern musical terminology are produced by physical sources whose relative proportions (e.g., the relative lengths of two plucked strings) have ratios of 2:1, 3:2, or 4:3, respectively (Gorman, 1979; Iamblichus, c. 300/1989). This coincidence of numerical simplicity and perceptual effect is so impressive that attempts to rationalize phenomena such as consonance and scale structure solely in terms of mathematical or geometrical relationships have continued to the present day (Balzano, 1980; Janata et al., 2002).

These long-recognized mathematical relationships are explicitly the foundation for modern psychoacoustical theories of consonance. Helmholtz (1877/1954), the most vigorous exponent of this approach to the problem in the nineteenth century, attempted to provide a physical basis for Pythagoras's observation by explaining consonance in terms of the ratio between the periods of two complex acoustical stimuli. In his view, consonance was simply the absence of the low-frequency amplitude
modula tions that occur when the spectral components of two musical tones are close enough in frequency to generate destructive inter ference ie when the two tones are within each other s critical band When such interference occurs listeners perceive beat ing or roughness which Helmholtz 18771954 took to be the signature of dissonance More recent investigators have re 7166 J Neurosci August 6 2003 23187160 7168 Schwartz et al 0 Speech Sounds and Musical Universals fined this general idea Plomp and Levelt A B 1965 Pierce 1966 1983 Kameoka and g Kuriyagawa 1969 and have explored 9 quot35 50 To how such effects might arise from the me 239 8 45 chanics of the cochlea von Bekesy 1962 g 7 40 Pattern recognition theories Gold E 6 35 stein 1973 Terhardt 1974 were pro 2 5 330 posed primarily to explain observations g 4 Com 3 T 25 inconsistent with psychoacoustical mod g 3 9V 20 els These discrepancies include the per g 2 521 15 live 5th ception of dissonance in the absence of a 1 43 Mfrd mm 7 10 4th Mfrd physical beating recognized by Preyer as 3 o Mg th 39 T 9 m 5 MOGthmg hmfrd I math earlyas 1879 and the persistence ofdisso 40 1 2 3 4 5 6 7 ntgoneg 0 1 2 3 4 6 7Tr39tgne nance Whenfhe FWD tones of a dyad are Median consonance rank Median consonance rank presented d1chot1cally 1e one tone to eaCh ear HOUtsma and GOldStein 1972 Figure 7 Consonance rankings predicted from the normalized spectrum of speech soundsA Median consonance rank of Pattern recognition theories however like the psychoacoustical models they sought to improve on also focus on the numerical relationships among the frequency values of idealized tone combinations Terhardt 1974 for instance proposed that the perception of musical intervals derives from the familiarity of the auditory system with the speci c pitch relations among the frequencies of the lower harmonics of com plex tones A few authors working within this general framework have noted that amplitude relationships among the frequency compo nents of 
tone-evoking stimuli might also have some influence in determining consonance. For example, Kameoka and Kuriyagawa (1969) reported that the relative consonance of tone combinations depends in part on the degree and direction of the sound pressure level difference between the tones in a dyad. More recently, Sethares (1998) also suggested that consonance depends in part on the amplitudes of the frequency components of a complex tone ("timbre" in his usage) and that the relative consonance of almost any musical interval varies as a function of these relationships. An awareness that consonance depends on the distribution of power as well as on frequency relationships in the stimulus did not, however, lead these authors to suggest a fundamental revision of the traditional approaches to explaining music perception. Both Kameoka and Kuriyagawa (1969) and Sethares (1998), for example, explicitly espouse the psychoacoustical theory that the overall consonance of a musical interval is a function of the physical interaction or "local consonance" among the sinusoidal components of the complex tones defining the interval.

Musical universals and the statistical structure of speech sounds

A different approach to understanding these musical phenomena, and perhaps audition in general, is suggested by the inevitably uncertain nature of the stimulus-source relationship. Auditory stimuli, like visual stimuli, are inherently ambiguous: the physical characteristics of the stimulus at the ear do not, and cannot, specify the physical properties of the generative source (Gordon et al., 1992; Hogden et al., 1996). Nevertheless, it is toward the stimulus sources that behavior must be directed if auditory, or indeed any, percepts are to be biologically useful. Thus, the physical similarities and differences among the sources of stimuli must somehow be preserved in a corresponding perceptual space.

A wide range of recent work in vision is consistent with the hypothesis that the visual system meets the challenge of stimulus
ambiguity by relating stimuli to their possible sources and constructing the corresponding perceptual spaces for lightness, color, form, and motion probabilistically (Knill and Richards, 1996; Purves et al., 2001; Rao et al., 2002; Purves and Lotto, 2003). By generating percepts determined according to the statistical distribution of possible stimulus sources previously encountered, rather than by the physical properties of the stimuli as such, the perceiver brings the lessons derived from the success or failure of individual and ancestral behavior to bear on the quandary of stimulus ambiguity. The fact that musical scale structure, the preferred subsets of chromatic scale intervals, and consonance ordering all can be predicted from the distribution of amplitude-frequency pairings in speech sounds suggests this same probabilistic process underlies the tonal aspects of music.

[Figure 7, continued: musical intervals (from Fig. 6) plotted against the residual mean normalized amplitude at different frequency ratios. B, Median consonance rank plotted against the average slope of each local maximum. By either index, consonance rank decreases progressively as the relative concentration of power at the corresponding maxima in the normalized speech sound spectrum decreases.]

The biological rationale for generating auditory percepts probabilistically is thus the same as the rationale for this sort of process in vision, namely, to guide biologically successful behavior in response to inherently ambiguous stimuli. In the case of tone-evoking stimuli, this way of generating percepts would enable listeners to respond appropriately to the biologically significant sources of the information embedded in human vocalization. This information includes not only the distinctions among different speech sounds that are important for understanding spoken language, but also indexical information, such as the probable sex, age, and emotional state of the speaker. Indeed, the hedonic aspects of musical percepts may also be rooted in probabilities.
Unlike pitch, which is more or less affectively neutral, tone combinations judged to be consonant are preferred over dissonant combinations (Butler and Daston, 1968). Such preferences may reflect the relative probability of different amplitude-frequency combinations in the human acoustical environment (Zajonc, 1968, 2001).

Neural correlates

How the statistical structure of acoustic stimuli is instantiated in the auditory nervous system to achieve these biological advantages is, of course, an entirely open question. Perhaps the most relevant physiological observation is a recent study of the responses of cat auditory neurons to dyads that human listeners rank as consonant (a perfect fifth or a perfect fourth) compared with dyads that listeners deem relatively dissonant (a tritone or a minor second) (Tramo et al., 2001). Autocorrelation of the spike trains elicited by such stimuli mirrors the autocorrelation of the acoustic stimulus itself; moreover, the characteristics of the population interspike interval correctly predict the relative consonance of these four musical intervals.

Another pertinent observation is that the cat inferior colliculus is tonotopically organized into laminas exhibiting constant frequency ratios between corresponding locations in adjacent layers (Schreiner and Langner, 1997). The authors suggest that reciprocal lateral inhibition between neighboring laminas might be the anatomical basis of the critical band phenomenon apparent in psychoacoustical studies (Moore, 1995). It has also been suggested that the architecture of the inferior colliculus in the cat is an adaptation for the extraction of the fundamental frequency of complex naturally occurring sounds and that perceptions of consonance and dissonance might be a consequence of this functional organization (Braun, 1999).

To the extent that these physiological findings can be generalized to the organization of the human auditory system, the dynamical representation
of neuronal firing patterns and/or the laminar structure of the colliculus could embody some aspects of the statistical structure of acoustic stimuli, but this remains a matter of speculation. The implication of the present results is that one important aspect of the enormously complex neuronal circuitry underlying the appreciation of music by human beings may be best rationalized in terms of the statistical link between periodic stimuli and their physical sources.

The evidence presented here is consistent with the hypothesis that musical scale structure and consonance ordering derive from the necessarily statistical relationship between sensory stimuli and their physical sources. Generating perceptual responses to ambiguous periodic stimuli on this statistical basis takes the full complement of the stimulus characteristics into account, thus facilitating a listener's ability to glean biologically significant information about the sources of periodic sound energy, human speakers in particular. Finally, this conceptual framework for understanding the major features of tonal music allows audition and vision to be considered in the same general terms.

References

Balzano GJ (1980) The group-theoretic description of 12-fold and microtonal pitch systems. Comp Mus J 4:66-84.
Boersma P, Weenink D (2001) PRAAT 4.0.7: Doing phonetics by computer. Department of Phonetic Sciences, University of Amsterdam. There is no print version.
Braun M (1999) Auditory midbrain laminar structure appears adapted to f0 extraction: further evidence and implications of the double critical bandwidth. Hear Res 129:71-82.
Buch E (1900) Uber die Verschmelzungen von Empfindungen besonders bei klangeindrucken. Phil Stud 15:240.
Budge H (1943) A study of chord frequencies. New York: Bureau of Publications, Teachers College, Columbia University.
Burns EM (1999) Intervals, scales, and tuning. In: The psychology of music (Deutsch D, ed), pp 215-264. New York: Academic.
Butler JW, Daston PG (1968) Musical consonance as musical preference: a cross-cultural study. J Gen Psychol 79:129
-142.
Carterette EC, Kendall RA (1999) Comparative music perception and cognition. In: The psychology of music (Deutsch D, ed), pp 725-791. New York: Academic.
Deutsch D (1973) Octave generalization of specific interference effects in memory for tonal pitch. Percept Psychophys 13:271-275.
Driscoll TA (1997) Eigenmodes of isospectral drums. SIAM Rev 39:1-17.
Escabi MA, Schreiner CE (2002) Nonlinear spectrotemporal sound analysis by neurons in the auditory midbrain. J Neurosci 22:4114-4131.
Faist A (1897) Versuche uber Tonverschmelzung. Zsch Psychol Physio Sinnesorg 15:102-131.
Fisher WM, Doddington GR, Goudie-Marshall KM (1986) The DARPA speech recognition research database: specifications and status. Proceedings of the DARPA speech recognition workshop, Report SAIC-86/1546, Palo Alto, CA, February.
Fletcher NH (1992) Acoustic systems in biology. New York: Oxford UP.
Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL (1990) DARPA-TIMIT acoustic-phonetic continuous speech corpus [CD-ROM]. Gaithersburg, MD: US Department of Commerce.
Goldstein JL (1973) An optimum processor theory for the central formation of the pitch of complex tones. J Acoust Soc Am 54:1496-1516.
Gordon C, Webb D, Wolpert S (1992) One cannot hear the shape of a drum. Bull Am Math Soc 27:134-138.
Gorman P (1979) Pythagoras, a life. London: Routledge and K Paul.
Helmholtz H (1877/1954) On the sensations of tone (Ellis AJ, translator).
Hillenbrand J, Getty LA, Clark MJ, Wheeler K (1995) Acoustic characteristics of American English vowels. J Acoust Soc Am 97:3099-3111.
Hogden J, Loqvist A, Gracco V, Zlokarnik I, Rubin P, Saltzman E (1996) Accurate recovery of articulator positions from acoustics: new conclusions based upon human data. J Acoust Soc Am 100:1819-1834.
Houtsma AJM, Goldstein JL (1972) The central origin of the pitch of complex tones: evidence from musical interval recognition. J Acoust Soc Am 51:520-529.
Huron D (1994) Interval-class content in equally tempered pitch-class sets: common scales exhibit optimum tonal consonance. Music Percept 11:289-305.
Hutchinson W, Knopoff L (1978) The
acoustic component of Western consonance. Interface 7:1-29.
Iamblichus (c.300/1989) On the Pythagorean life (Clark G, translator). Liverpool: Liverpool UP.
Janata P, Birk JL, Van Dorn JD, Leman M, Tillman B, Bharucha JJ (2002) The cortical topography of tonal structures underlying Western music. Science 298:2167-2170.
Kallman HJ, Massaro DW (1979) Tone chroma is functional in melody recognition. Percept Psychophys 26:32-36.
Kameoka A, Kuriyagawa M (1969) Consonance theory, part II: consonance of complex tones and its calculation method. J Acoust Soc Am 45:1460-1469.
Knill DC, Richards W (1996) Perception as Bayesian inference. New York: Cambridge UP.
Knopoff L, Hutchinson W (1983) Entropy as a measure of style: the influence of sample length. J Mus Theory 27:75-97.
Kreuger F (1913) Consonance and dissonance. J Phil Psychol Scient Meth 10:158.
Krumhansl CL (1990) Cognitive foundations of musical pitch. New York: Oxford UP.
Krumhansl CL, Shepard RN (1979) Quantification of the hierarchy of tonal functions within a diatonic context. J Exp Psychol 5:579-594.
Ladefoged P (1962) Elements of acoustic phonetics. Chicago: University of Chicago.
Lamel LF, Kassel RH, Seneff S (1986) Speech database development: design and analysis of the acoustic-phonetic corpus. Proceedings of the DARPA speech recognition workshop, Report SAIC-86/1546, Palo Alto, CA, February.
Lieberman P, Blumstein SE (1988) Speech physiology, speech perception, and acoustic phonetics. Cambridge, UK: Cambridge UP.
Malmberg CF (1918) The perception of consonance and dissonance. Psych Monogr 25:93-133.
Mathworks (1996) Matlab (Version 5). Natick, MA: Mathworks.
Meinong A, Witasek S (1897) Zur Experimentallen Bestimmung der Verschmelzungsgrade. Zsch Psychol Physio Sinnesorg 15:189-205.
Moore BCJ (1995) Frequency analysis and masking. In: Hearing (Moore BCJ, ed). New York: Academic.
Muthusamy YK, Cole RA, Oshika BT (1992) The OGI multi-language telephone speech corpus. Proceedings of the 1992 International Conference on Spoken Language Processing, ICSLP 92, Banff, Alberta, Canada, October.
Nettl B (1956) Music in primitive culture. Cambridge, MA: Harvard UP.
Pear TH (1911)
Differences between major and minor chords. Br J Psychol 7 4
Peretz I, Blood AJ, Penhune V, Zatorre R (2001) Cortical deafness to dissonance. Brain 124:928-940.
Pierce JR (1966) Attaining consonance in arbitrary scales. J Acoust Soc Am 40:249.
Pierce JR (1983) The science of musical sound. New York: Freeman.

From http://psychclassics.yorku.ca/Miller

Classics Editor's Note: Footnotes are in square brackets; references in round brackets.

The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information [1]

George A. Miller (1956)
Harvard University

First published in Psychological Review, 63, 81-97.

My problem is that I have been persecuted by an integer. For seven years this number has followed me around, has intruded in my most private data, and has assaulted me from the pages of our most public journals. This number assumes a variety of disguises, being sometimes a little larger and sometimes a little smaller than usual, but never changing so much as to be unrecognizable. The persistence with which this number plagues me is far more than a random accident. There is, to quote a famous senator, a design behind it, some pattern governing its appearances. Either there really is something unusual about the number or else I am suffering from delusions of persecution.

I shall begin my case history by telling you about some experiments that tested how accurately people can assign numbers to the magnitudes of various aspects of a stimulus. In the traditional language of psychology these would be called experiments in absolute judgment. Historical accident, however, has decreed that they should have another name. We now call them experiments on the capacity of people to transmit information. Since these experiments would not have been done without the appearance of information theory on the psychological scene, and since the results are analyzed in terms of the concepts of information theory, I shall have to preface my discussion with a few remarks about this theory.

Information Measurement

The
"amount of information" is exactly the same concept that we have talked about for years under the name of "variance." The equations are different, but if we hold tight to the idea that anything that increases the variance also increases the amount of information we cannot go far astray.

The advantages of this new way of talking about variance are simple enough. Variance is always stated in terms of the unit of measurement (inches, pounds, volts, etc.), whereas the amount of information is a dimensionless quantity. Since the information in a discrete statistical distribution does not depend upon the unit of measurement, we can extend the concept to situations where we have no metric and we would not ordinarily think of using [p. 82] the variance. And it also enables us to compare results obtained in quite different experimental situations where it would be meaningless to compare variances based on different metrics. So there are some good reasons for adopting the newer concept.

The similarity of variance and amount of information might be explained this way: When we have a large variance, we are very ignorant about what is going to happen. If we are very ignorant, then when we make the observation it gives us a lot of information. On the other hand, if the variance is very small, we know in advance how our observation must come out, so we get little information from making the observation.

If you will now imagine a communication system, you will realize that there is a great deal of variability about what goes into the system and also a great deal of variability about what comes out. The input and the output can therefore be described in terms of their variance (or their information). If it is a good communication system, however, there must be some systematic relation between what goes in and what comes out. That is to say, the output will depend upon the input, or will be correlated with the input. If we measure this correlation, then we can say how much of the output variance is
attributable to the input and how much is due to random fluctuations or "noise" introduced by the system during transmission. So we see that the measure of transmitted information is simply a measure of the input-output correlation.

There are two simple rules to follow. Whenever I refer to "amount of information," you will understand "variance." And whenever I refer to "amount of transmitted information," you will understand "covariance" or "correlation."

The situation can be described graphically by two partially overlapping circles. Then the left circle can be taken to represent the variance of the input, the right circle the variance of the output, and the overlap the covariance of input and output. I shall speak of the left circle as the amount of input information, the right circle as the amount of output information, and the overlap as the amount of transmitted information.

In the experiments on absolute judgment, the observer is considered to be a communication channel. Then the left circle would represent the amount of information in the stimuli, the right circle the amount of information in his responses, and the overlap the stimulus-response correlation as measured by the amount of transmitted information. The experimental problem is to increase the amount of input information and to measure the amount of transmitted information. If the observer's absolute judgments are quite accurate, then nearly all of the input information will be transmitted and will be recoverable from his responses. If he makes errors, then the transmitted information may be considerably less than the input. We expect that, as we increase the amount of input information, the observer will begin to make more and more errors; we can test the limits of accuracy of his absolute judgments. If the human observer is a reasonable kind of communication system, then when we increase the amount of input information the transmitted information will increase at first and will eventually
level off at some asymptotic value. This asymptotic value we take to be the channel capacity of the observer: it represents the greatest amount of information that he can give us about the stimulus on the basis of an absolute judgment. The channel capacity is the upper limit on the extent to which the observer can match his responses to the stimuli we give him.

Now just a brief word about the bit [p. 83] and we can begin to look at some data. One bit of information is the amount of information that we need to make a decision between two equally likely alternatives. If we must decide whether a man is less than six feet tall or more than six feet tall, and if we know that the chances are 50-50, then we need one bit of information. Notice that this unit of information does not refer in any way to the unit of length that we use (feet, inches, centimeters, etc.). However you measure the man's height, we still need just one bit of information.

Two bits of information enable us to decide among four equally likely alternatives. Three bits of information enable us to decide among eight equally likely alternatives. Four bits of information decide among 16 alternatives, five among 32, and so on. That is to say, if there are 32 equally likely alternatives, we must make five successive binary decisions, worth one bit each, before we know which alternative is correct. So the general rule is simple: every time the number of alternatives is increased by a factor of two, one bit of information is added.

There are two ways we might increase the amount of input information. We could increase the rate at which we give information to the observer, so that the amount of information per unit time would increase. [...] confusions begin to occur. Confusions will appear near the point that we are calling his channel capacity.

Absolute Judgments of Unidimensional Stimuli

[...] asked [...] steps [...] correct identification of the tone. [...] With fourteen different tones the listeners made many mistakes.

These data are plotted in Fig. 1. Along the bottom is
the amount of input information in bits per stimulus. As the number of alternative tones was increased from 2 to 14, the input information increased from 1 to 3.8 bits. On the ordinate is plotted the amount of [...] much the way we would expect a communication channel [...] to about 2 bits and then bends off toward an asymptote at about 2.5 bits. This value, 2.5 bits, therefore, is what we are calling the channel capacity of the listener for absolute judgments of pitch.

[Fig. 1. The amount of information that is transmitted by listeners who make absolute judgments of pitch.]

[...] have the number 2.5 bits. What does it mean? First, note that 2.5 bits corresponds to about six equally likely alternatives. The result means that we cannot pick more than six different pitches that the listener will never confuse. Or, stated slightly differently, no matter how many [...] judgment enables us to narrow down the particular stimulus to one out of N/6.

[...] Of course, [...] 60 different [...] I say it is fortunate [...] confused. [...]

[...] ask how reproducible this result is. Does it [...] transmitted more than a small [...] Different groupings of the pitches decreased the transmission, but the loss was small. For example, if you can discriminate five [...] another series, it is reasonable to expect that you could combine all ten into a single series and still tell them all apart without error. When you try it, however, it does not work. The channel capacity for pitch seems to be about six and that is the best you can do.

[...] summarized in Fig. 2. [...] intensity range from 15 to 110 db. He used 4, 5, 6, 7, 10, and 20 different stimulus intensities. [...] [p. 85] The channel capacity [...] is 2.3 bits. Since these two studies were done in different laboratories with slightly different techniques and methods of analysis, we are not in a good position
to argue whether five [...] Probably the difference is in the right direction, and absolute judgments of pitch are slightly more accurate than absolute judgments of loudness. The important point, however, is that the two answers are of the same order of magnitude.

The experiment has also been done for taste intensities. In Fig. 3 are the results obtained by Beebe-Center, Rogers, and O'Connell [...] absolute judgments of [...] salt solutions. The concentrations ranged from 0.3 to 34.7 gm NaCl per 100 cc tap water in equal subjective steps. They used 3, 5, 9, and 17 different concentrations. The channel capacity is 1.9 bits, which is about four [...] but again the order of magnitude is not far off.

On the other hand, [...] Their results are shown in Fig. 4. [...] 10, 20, [...] 3.25 bits.

The Hake-Garner experiment has been repeated by Coonan and Klemmer. Although they have not yet published their results, they have given me permission to say that they obtained channel capacities ranging from 3.2 bits for [p. 86] very short exposures of the pointer position to 3.9 bits for longer exposures. These values are slightly higher than Hake and Garner's, so we must conclude that there are between [...] This is the largest channel capacity that has been measured for any unidimensional variable.

At the present time these four experiments on absolute judgments of simple, unidimensional stimuli are all that have appeared in the psychological journals. However, a great deal of work on other stimulus variables has not yet appeared in the journals. For example, Eriksen [...] or about five [...] found 2.8 bits for size, 3.1 bits for hue, and 2.3 bits for brightness. Geldard has measured the channel [...] intensities, about five durations, and about seven locations.

[...] visual displays [...]
[...] when the length of the chord was constant, the result was only 1.6 bits. This last value is the lowest that anyone [...]

[...] the channel [...] In terms of [...] includes from 4 to 10 [...] to 15 categories. [...] one simple sensory attribute to another.

[p. 87] Absolute Judgments of Multidimensional Stimuli

[...] independently variable attributes of the stimuli that are being judged. Objects, faces, words, and the like differ from [...] from one another in only one respect.

Fortunately, there are a few data on what happens when we [...] another in several ways. Let us look first at the results Klemmer and Frick have reported for the absolute judgment of the position of a dot in a square. In Fig. 5 we see their results. Now the channel capacity seems to have increased to 4.6 bits, which means that people can identify [...]

The position of a dot in a square is clearly a two-dimensional [...] be identified. Thus it seems natural to compare the 4.6-bit capacity for a square with the 3.25-bit capacity for the [...] increase from 3.25 to 4.6, but it falls short of the perfect addition that would give 6.5 bits.

Another example is provided by Beebe-Center, Rogers, and O'Connell. When they asked people to [...] conceivably might.

[...] pitch [...] second dimension augments the channel capacity but not so much as it might.

[...] fourth example [...] of equal [p. 88] [...] estimate that there are about 11 to 15 identifiable colors or, in our terms, about 3.6 bits. Since these [...] judgment [...] Eriksen's 3.1 [...]

It is still a long [...] provided by faces, words, etc. To fill this gap we have only one experiment, an auditory study done by [...] frequency, intensity, rate of interruption, on-time fraction, total duration, and spatial location. Each one [...] dimensions [...] 7.2 bits [...] up into the range that
ordinary experience would lead us to expect.

[...] Fig. 6. In a moment of [...] taking. Clearly, the addition of independently variable attributes to the stimulus increases the channel capacity, but at a decreasing rate. It is interesting to note that the channel capacity is increased even when the several variables are not independent. Eriksen (5) reports that, when size, brightness, and hue all vary together in perfect correlation, the transmitted information is 4.1 bits as compared with an average of about 2.7 bits when these attributes are varied one at a time. By confounding three attributes, Eriksen increased the dimensionality of the input without increasing the amount [...] Fig. 6 would lead us to expect.

[...] but we decrease the accuracy for any particular variable. In other words, we can make relatively crude judgments of several things simultaneously.

[...] the one we seem to have made is clearly the more adaptive.

[...] speech [...] or at most ternary in nature. For example, a binary distinction is made between vowels and consonants, a binary decision is made between oral and nasal consonants, a ternary decision is made among front, middle, and back phonemes, etc. [...] that there is not time to discuss it here. [...] unexplored [...] indefinitely in this way.

In human speech there is clearly a limit to the number of dimensions that we use. In this instance, however, it is not known whether the limit is imposed by the nature of the perceptual machinery that must recognize the sounds or by the nature of the speech machinery that must produce them. Somebody will have to do the experiment to find out. There is a limit, however, at about eight or nine distinctive features in every language that has been studied, and so when we talk we must resort to still another trick for increasing our channel capacity. Language uses sequences of phonemes, so we make several judgments successively when we listen to words and sentences. That is to say, we
use both simultaneous and successive discriminations in order to expand the rather rigid limits imposed by the inaccuracy of our absolute judgments of simple magnitudes.

These multidimensional judgments are strongly reminiscent of the abstraction experiment of Kulpe (14). As you may remember, Kulpe showed that observers report more accurately on an attribute for which they are set than on attributes for which they are not set. For example, Chapman (4) used three different attributes and compared the results obtained when the observers were instructed before the tachistoscopic presentation with the results obtained when they were not told until after the presentation which one of the three attributes was to be reported. When the instruction was given in advance, the judgments were more accurate. When the instruction was given afterwards, the subjects presumably had to judge all three attributes in order to report on any one of them, and the accuracy was correspondingly lower. This is in complete accord with the results we have just been considering, where the accuracy of judgment on each attribute decreased as more dimensions were added. The point is probably obvious, but I shall make it anyhow, that the abstraction experiments did not demonstrate that people can judge only one attribute at a time. They merely showed what seems quite reasonable, that people are less accurate if they must judge more than one attribute simultaneously.

[p. 90] Subitizing

I cannot leave this general area without mentioning, however briefly, the experiments conducted at Mount Holyoke College on the discrimination of number (12). In experiments by Kaufman, Lord, Reese, and Volkmann random patterns of dots were flashed on a screen for 1/5 of a second. Anywhere from 1 to more than 200 dots could appear in the pattern. The subject's task was to report how many dots there were.

The first point to note is that on patterns containing up to five or six dots the subjects simply did not make errors. The performance on these small numbers
of dots was so different from the performance with more dots that it was given a special name. Below seven the subjects were said to subitize; above seven they were said to estimate. This is, as you will recognize, what we once optimistically called "the span of attention."

This discontinuity at seven is, of course, suggestive. Is this the same basic process that limits our unidimensional judgments to about seven categories? The generalization is tempting, but not sound in my opinion. The data on number estimates have not been analyzed in informational terms; but on the basis of the published data I would guess that the subjects transmitted something more than four bits of information about the number of dots. Using the same arguments as before, we would conclude that there are about 20 or 30 distinguishable categories of numerousness. This is considerably more information than we would expect to get from a unidimensional display. It is, as a matter of fact, very much like a two-dimensional display. Although the dimensionality of the random dot patterns is not entirely clear, these results are in the same range as Klemmer and Frick's for their two-dimensional display of dots in a square. Perhaps the two dimensions of numerousness are area and density. When the subject can subitize, area and density may not be the significant variables, but when the subject must estimate, perhaps they are significant. In any event, the comparison is not so simple as it might seem at first thought.

This is one of the ways in which the magical number seven has persecuted me. Here we have two closely related kinds of experiments, both of which point to the significance of the number seven as a limit on our capacities. And yet when we examine the matter more closely, there seems to be a reasonable suspicion that it is nothing more than a coincidence.

The Span of Immediate Memory

Let me summarize the situation in this way. There is a clear and definite limit to the accuracy with which we can identify absolutely the
magnitude of a unidimensional stimulus variable. I would propose to call this limit the span of absolute judgment, and I maintain that for unidimensional judgments this span is usually somewhere in the neighborhood of seven. We are not completely at the mercy of this limited span, however, because we have a variety of techniques for getting around it and increasing the accuracy of our judgments. The three most important of these devices are (a) to make relative rather than absolute judgments; or, if that is not possible, (b) to increase the number of dimensions along which the stimuli can differ; or (c) to arrange the task in such a way that we make a sequence of several absolute judgments in a row.

The study of relative judgments is one of the oldest topics in experimental psychology, and I will not pause to review it now. The second device, increasing the dimensionality, we have just considered. It seems that by adding [p. 91] more dimensions and requiring crude, binary, yes-no judgments on each attribute we can extend the span of absolute judgment from seven to at least 150. Judging from our everyday behavior, the limit is probably in the thousands, if indeed there is a limit. In my opinion, we cannot go on compounding dimensions indefinitely. I suspect that there is also a span of perceptual dimensionality and that this span is somewhere in the neighborhood of ten, but I must add at once that there is no objective evidence to support this suspicion. This is a question sadly needing experimental exploration.

Concerning the third device, the use of successive judgments, I have quite a bit to say because this device introduces memory as the handmaiden of discrimination. And, since mnemonic processes are at least as complex as are perceptual processes, we can anticipate that their interactions will not be easily disentangled.

Suppose that we start by simply extending slightly the experimental procedure that we have been using. Up to this point we have presented a single stimulus and asked the observer to name
it immediately thereafter. We can extend this procedure by requiring the observer to withhold his response until we have given him several stimuli in succession. At the end of the sequence of stimuli he then makes his response. We still have the same sort of input-output situation that is required for the measurement of transmitted information. But now we have passed from an experiment on absolute judgment to what is traditionally called an experiment on immediate memory.

Before we look at any data on this topic I feel I must give you a word of warning to help you avoid some obvious associations that can be confusing. Everybody knows that there is a finite span of immediate memory and that for a lot of different kinds of test materials this span is about seven items in length. I have just shown you that there is a span of absolute judgment that can distinguish about seven categories and that there is a span of attention that will encompass about six objects at a glance. What is more natural than to think that all three of these spans are different aspects of a single underlying process? And that is a fundamental mistake, as I shall be at some pains to demonstrate. This mistake is one of the malicious persecutions that the magical number seven has subjected me to.

My mistake went something like this. We have seen that the invariant feature in the span of absolute judgment is the amount of information that the observer can transmit. There is a real operational [...] amount of information. For example, decimal digits are worth 3.3 bits apiece. We can recall about seven of them, for a total of 23 bits of information. Isolated English words are worth about 10 bits apiece. If the total [...] definitive. [...] kinds of test materials: binary digits, decimal digits, letters of the alphabet, letters plus decimal digits, and with [...] Woodworth was used to score the responses. The results are shown by the filled circles in Fig. 7. Here the dotted line indicates what the span should have been if the amount of information in the
span were constant. The solid curves represent the data. Hayes repeated the experiment using test vocabularies of different sizes but all containing only English monosyllables (open circles in Fig. 7). This more homogeneous test material did not change the picture significantly. With binary items the span is about nine and, although it drops to about five with monosyllabic English words, the difference is far less than the hypothesis of constant information would require.

[Fig. 7: span of immediate memory plotted as a function of the amount of information per item in the test materials.]

There is nothing wrong with Hayes's experiment, because Pollack repeated it much more elaborately and got essentially the same result. Pollack took pains to measure [...] the traditional procedure for scoring the responses. His [...] by the amount of information. Immediate memory is limited by the number of items. In order to capture [...] immediate memory [...] bits per chunk, at least over the range that has been examined to date.

The contrast of the terms bit and chunk also serves to highlight the fact that we are not very definite about what constitutes a chunk of information. For example, the memory [...] 1000 English monosyllables might just as appropriately have been called a memory span of 15 phonemes, since each word had about three phonemes in it. Intuitively, it is [...] the subjects were [...] five words, not 15 phonemes, but the logical distinction is not immediately apparent. We are dealing here with a process of organizing or grouping the input into familiar units or chunks, and a great deal of learning has gone into the formation of these familiar units.

[Figure: amount of information retained after one presentation plotted as a function of the amount of information per item in the test materials.]

Recoding

In order to speak more precisely, therefore, we must [...] Since the [...] larger [...] before. [...] Soon he [...] Then the chunks [...] during the [...] increase the bits per
chunk [...] the new name rather than the original input events. I want to [...] explicit [...] This [...] Association in 1954 [...] binary digits. In Table 1 [...] of 18 binary digits [...] 01 is renamed 1, 10 is renamed 2, and 11 is renamed 3. That is to say, we recode from a base-two [...] so we give each [...] 15, and from 0 to 31 [...]

[Table 1: recoding of binary digit sequences into chunks.]

[...] octal [...] minutes while they tried to use the recoding schemes they had studied. [...] digits [...] in every translation [...] the last group. Since the [...] predict on the basis of his span for octal digits. He could remember 12 octal digits. With the 2:1 recoding these 12 chunks were worth 24 binary digits. With the 3:1 recoding they were worth 36 binary digits. With the 4:1 and 5:1 recodings they were worth about 40 binary digits.

[...] The point is [...] deal with. In one form or another we use recoding constantly in our daily behavior. [...] code. When there is a story or an argument or an idea that we want to remember, we usually try to rephrase it in our own words. When we witness some event we want to remember, we make a verbal [...] verbalization. Upon recall we recreate by secondary elaboration the details that seem consistent with the particular verbal recoding we happen to have made. The well-known experiment by Carmichael, Hogan, and Walter on the influence that names have on the recall of visual figures is one demonstration of the process. The inaccuracy of the testimony of eyewitnesses is well known in legal psychology, but the distortions of testimony are [...] recoding that the witness used, and the particular recoding he used depends upon his whole life history. Our language is [...] chunks rich in information. I suspect that imagery is a form [...] operationally and to study experimentally than the more
symbolic kinds of recoding. [...] enough. [...] clustering in the recall of words is especially interesting in this respect.

Summary

[...] summarizing remarks. First, [...] we manage to break (or at least stretch) this informational bottleneck. Second, [...] because [...] the uncharted wilderness of individual differences. [...] information can be useful in the study of concept formation. A lot of questions that seemed fruitless twenty or thirty years ago may now be worth another look. In fact, I feel that my story here must stop just as it begins to get really interesting.

And finally, what about the magical number seven? What about the seven wonders of the world, the seven seas, the seven deadly sins, the seven daughters of Atlas in the Pleiades, the seven ages of man, the seven levels of hell, the seven primary colors, the seven notes of the musical scale, and the seven days of the week? What about the seven-point rating scale, the seven categories for absolute judgment, the seven objects in the span of attention, and the seven digits in the span of immediate memory? For the present I propose to withhold judgment. Perhaps there is something deep and profound behind all these sevens, something just calling out for us to discover it. But I suspect that it is only a pernicious, Pythagorean coincidence.

Footnotes

1. This paper was first read as an Invited Address before the Eastern Psychological Association in Philadelphia on April 15, 1955. Preparation of the paper was supported by the Harvard Psycho-Acoustic Laboratory under Contract N5ori-76 between Harvard University and the Office of Naval Research, U.S. Navy (Project NR142-201, Report PNR-174). Reproduction for any purpose of the U.S. Government is permitted.

References

1. Beebe-Center, J. G., Rogers, M. S., & O'Connell, D. N. Transmission of information about sucrose and saline solutions through the sense of taste. J. Psychol., 1955, 39, 157-160.
2. Bousfield, W. A., & Cohen, B. H. The occurrence of clustering in the recall
of randomly arranged words of different frequencies-of-usage. J. gen. Psychol., 1955, 52, 83-95.
3. Carmichael, L., Hogan, H. P., & Walter, A. A. An experimental study of the effect of language on the reproduction of visually perceived form. J. exp. Psychol., 1932, 15, 73-86.
4. Chapman, D. W. Relative effects of determinate and indeterminate Aufgaben. Amer. J. Psychol., 1932, 44, 163-174.
5. Eriksen, C. W. Multidimensional stimulus differences and accuracy of discrimination. USAF, WADC Tech. Rep., 1954, No. 54-165.
6. Eriksen, C. W., & Hake, H. W. Absolute judgments as a function of the stimulus range and the number of stimulus and response categories. J. exp. Psychol., 1955, 49, 323-332.
7. Garner, W. R. An informational analysis of absolute judgments of loudness. J. exp. Psychol., 1953, 46, 373-380.
8. Hake, H. W., & Garner, W. R. The effect of presenting various numbers of discrete steps on scale reading accuracy. J. exp. Psychol., 1951, 42, 358-366.
9. Halsey, R. M., & Chapanis, A. Chromaticity-confusion contours in a complex viewing situation. J. Opt. Soc. Amer., 1954, 44, 442-454.
10. Hayes, J. R. M. Memory span for several vocabularies as a function of vocabulary size. In Quarterly Progress Report, Cambridge, Mass.: Acoustics Laboratory, Massachusetts Institute of Technology, Jan.-June, 1952.
11. Jakobson, R., Fant, C. G. M., & Halle, M. Preliminaries to speech analysis. Cambridge, Mass.: Acoustics Laboratory, Massachusetts Institute of Technology, 1952. (Tech. Rep. No. 13.)
12. Kaufman, E. L., Lord, M. W., Reese, T. W., & Volkmann, J. The discrimination of visual number. Amer. J. Psychol., 1949, 62, 498-525.
13. Klemmer, E. T., & Frick, F. C. Assimilation of information from dot and matrix patterns. J. exp. Psychol., 1953, 45, 15-19.
14. Külpe, O. Versuche über Abstraktion. Ber. ü. d. I. Kongr. f. exper. Psychol., 1904, 56-68.
15. Miller, G. A., & Nicely, P. E. An analysis of perceptual confusions among some English consonants. J. Acoust. Soc. Amer., 1955, 27, 338-352.
16. Pollack, I. The assimilation of sequentially encoded information. Amer. J. Psychol., 1953, 66, 421.
17. Pollack, I. The information of elementary auditory displays. J. Acoust. Soc. Amer., 1952, 24, 745-749.
18. Pollack, I. The
information of elementary auditory displays. II. J. Acoust. Soc. Amer., 1953, 25, 765-769.
19. Pollack, I., & Ficks, L. Information of elementary multidimensional auditory displays. J. Acoust. Soc. Amer., 1954, 26, 155-158.
20. Woodworth, R. S. Experimental psychology. New York: Holt, 1938.

Received May 4, 1955.

Self-Organized Criticality (SOC)
Tino Duong
Biological Computation

o Introduction
o Background material
o Self-Organized Criticality Defined
o Examples in Nature
o Experiments
o Conclusion

SOC in a Nutshell
o Is the attempt to explain the occurrence of complex phenomena

Background Material

What is a System?
o A group of components functioning as a whole

Obey the Law
o Single components in a system are governed by rules that dictate how the component interacts with others

System in Balance
o Predictable
o States of equilibrium
o Stable: small disturbances in the system have only local impact
[Diagram: Order to Chaos]

Systems in Chaos
o Unpredictable
o Boring
[Diagram: Order to Chaos]

Example Chaos: White Noise

Edge of Chaos
[Diagram: Emergent Complexity and Self-Organized Criticality, between order and chaos]

Self-Organized Criticality: Defined
o Self-Organized Criticality can be considered as a characteristic state of criticality which is formed by self-organization in a long transient period at the border of stability and chaos

Characteristics
o Open dissipative systems
o The components in the system are governed by simple rules

Characteristics (continued)
o Thresholds exist within the system
o Pressure builds in the system until it exceeds the threshold

Characteristics (continued)
o Naturally progresses towards the critical state
o Small agitations in the system can lead to system effects called avalanches
o This happens regardless of the initial state of the system
[Diagram: Order to Chaos]

Domino Effect: System-wide events
o The same perturbation may lead to small avalanches up to system-wide avalanches
[Figure: example of the domino effect. By Bak 1]

Characteristics (continued)
o Power Law
o Events in the system follow a simple power law: $N(s) = s^{-\tau}$

Power Law graphed
$N(s) = s^{-\tau}$
$\log N(s) = -\tau \log s$
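
The log-log form of the power law, log N(s) = -tau log s, suggests one simple way to estimate the exponent: fit a straight line to log N(s) against log s and take the negative of the slope. The following is a minimal sketch under stated assumptions, not a method taken from the slides. The function name `fit_power_law_exponent` and the synthetic counts are hypothetical, and an ordinary least-squares fit is only a rough estimator for real, noisy event data.

```python
import math

def fit_power_law_exponent(counts):
    """Estimate tau in N(s) ~ s^(-tau) from a {size: count} mapping.

    Fits a least-squares line to (log s, log N(s)) and returns the
    negative of its slope. Hypothetical helper for illustration only.
    """
    points = [(math.log(s), math.log(n)) for s, n in counts.items() if s > 0 and n > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive (size, count) pairs")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return -slope

# Synthetic counts constructed to follow N(s) = 1000 * s^(-1.5) (illustrative data, not measurements).
synthetic = {s: 1000.0 * s ** -1.5 for s in (1, 2, 4, 8, 16, 32)}
tau = fit_power_law_exponent(synthetic)
```

A constant prefactor in front of the power law only shifts the intercept of the fitted line, so it does not affect the estimated exponent.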
Characteristics (continued)
o Most changes occur through catastrophic events rather than gradual change
o Punctuations: large catastrophic events that affect the entire system

How did they come up with this?

Nature can be viewed as a system
o It has many individual components working together
o Each component is governed by laws
  o e.g. basic laws of physics

Nature is full of complexity
o Gutenberg-Richter Law
o Fractals
o 1/f noise

Earthquake distribution
[Figure: earthquake distribution, 1974-1983. By Bak 1]

Gutenberg-Richter Law
[Figure: number of earthquakes per year against magnitude (log E). By Bak 1]

Fractals
o Geometric structures with features of all length scales (e.g. scale free)
o Ubiquitous in nature
  o Snowflakes
  o Coast lines

Fractal: Coast of Norway
[Figure: log length vs. log box size. By Bak 1]

1/f Noise
[Figure: time series by year. By Bak 1]

1/f noise has interesting patterns
[Figure: 1/f noise compared with white noise]

Can SOC be the common link?
o Ubiquitous phenomena
o No self-tuning
o Must be self-organized
o Is there some underlying link?

Experimental Models

Sand Pile Model
o An MxN grid Z
o Energy enters the model by randomly adding sand to the model
o We want to measure the avalanches caused by adding sand to the model

Example Sand pile grid
o Grey border represents the edge of the pile
o Each cell represents a column of sand

Model Rules
o Drop a single grain of sand at a random location on the grid
  o Random (x,y)
o Update model at that point: Z(x,y) -> Z(x,y) + 1
o If Z(x,y) > Threshold, spark an avalanche
  o Threshold = 3

Adding Sand to pile
o Choose a random (x,y) position on the grid
o Increment that cell: Z(x,y) -> Z(x,y) + 1
o Number of sand grains indicated by colour code
By Maslov 6

Avalanches
o When the threshold has been exceeded, an avalanche occurs
o If Z(x,y) > 3:
  o Z(x,y) -> Z(x,y) - 4
  o Z(x±1,y) -> Z(x±1,y) + 1
  o Z(x,y±1) -> Z(x,y±1) + 1
By Maslov 6

Before and After
[Figure: sandpile grid before and after an avalanche. By Bak 1]

Domino Effect
o Avalanches may propagate

DEMO
By Sergei Maslov, Sandpile Applet
http://cmth.phy.bnl.gov/~maslov/Sandpile.htm
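
The model rules and avalanche step described on the slides above can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the applet. The names `add_grain` and `simulate` are hypothetical, avalanche size is taken to be the number of topplings, and grains pushed past the edge of the grid are assumed to leave the pile.

```python
import random

THRESHOLD = 3  # a cell topples when its height exceeds this value (from the slides)

def add_grain(grid, x, y):
    """Drop one grain at (x, y), relax the grid, and return the number of topplings."""
    rows, cols = len(grid), len(grid[0])
    grid[x][y] += 1
    unstable = [(x, y)]
    topplings = 0
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] <= THRESHOLD:
            continue  # stale entry: this cell has already relaxed
        grid[i][j] -= 4  # Z(x,y) -> Z(x,y) - 4
        topplings += 1
        if grid[i][j] > THRESHOLD:
            unstable.append((i, j))
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            # Assumption: grains that would fall outside the grid are lost.
            if 0 <= ni < rows and 0 <= nj < cols:
                grid[ni][nj] += 1  # each in-grid neighbour gains one grain
                if grid[ni][nj] > THRESHOLD:
                    unstable.append((ni, nj))
    return topplings

def simulate(rows, cols, grains, seed=0):
    """Add grains at random cells of an empty grid and record each avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * cols for _ in range(rows)]
    sizes = []
    for _ in range(grains):
        sizes.append(add_grain(grid, rng.randrange(rows), rng.randrange(cols)))
    return grid, sizes
```

Calling `simulate(50, 50, 20000)` would return the relaxed grid and a list of avalanche sizes, which could then be binned to look at the size distribution. The grid size and grain count here are arbitrary choices.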
Observations
o Transient/stable phase
o Progresses towards the critical phase
  o At which avalanches of all sizes and durations occur
o Critical state was robust
  o Various initial states: random, not random
o Measured events follow the desired power law

Size Distribution of Avalanches
[Figure: size distribution of avalanches on a 50x50 grid. By Bak 1]

Sandpile Model Variations
o Rotating Drum
  o Done by Heinrich Jaeger
  o Sand pile forms along the outside of the drum

Rotating Drum
[Figure: rotating drum]

Other applications
o Evolution
o Mass Extinction
o Stock Market Prices
o The Brain

Conclusion
o Shortfalls
  o Does not explain why or how things self-organize into the critical state
  o Cannot mathematically prove that systems follow the power law
o Benefits
  o Gives us a new way of looking at old problems

References
1. P. Bak, How Nature Works, Springer-Verlag, NY, 1986.
2. H. J. Jensen, Self-Organized Criticality: Emergent Complex Behavior in Physical and Biological Systems, Cambridge University Press, NY, 1998.
3. T. Krink, R. Tomsen, Self-Organized Criticality and Mass Extinction in Evolutionary Algorithms, Proc. IEEE Int. Conf. on Evolutionary Computing, 2001, 1155-1161.
4. P. Bak, C. Tang, K. Wiesenfeld, Self-Organized Criticality: An Explanation of 1/f Noise, Physical Review Letters, Volume 59, Number 4, July 1987.

References (continued)
5. P. Bak, C. Tang, K. Wiesenfeld, Self-Organized Criticality, Physical Review A, Volume 38, Number 1, July 1988.
6. S. Maslov, Simple Model of a limit order-driven market, Physica A, Volume 278, pg. 571-578, 2000.
7. P. Bak, Website: http://cmth.phy.bnl.gov/~maslov/Sandpile.htm, downloaded on March 15th, 2003.
8. Website: httpplatoneeduthgrsoeist7tLessonslessons4htm, downloaded March 3rd, 2003.

Questions

