Paper
Is vowel perception inevitably compromised once the F0 of a sound exceeds the F1 of the intended vowel as given in formant statistics for "normal" speech? To date, no general consensus regarding the intelligibility of high-pitched vowel sounds has been established. However, three positions can be identified: (1) sounds with F0 above the statistical F1 of a vowel lose their intelligibility [1]; (2) a correct vowel identification rate of 80% can be maintained up to F0 = c. 500 Hz, and a rate of 50% up to c. 650 Hz, with the exception of sounds of /a/, which may exhibit correspondingly higher identification rates [2]; (3) vowel intelligibility can be maintained up to c. 880 Hz if the sounds are produced with a raised larynx position or in a CVC context, and up to F0 = c. 1050 Hz if both conditions are combined [3].
With few exceptions [4, 5], however, empirical investigations of vowel sounds at varying F0 relate to "Western" styles of singing, mostly to the "classical" style ("legit"). "Classical" singing is characterized by a particular coloring and harmonization of the vowels (the phenomenon of "vowel modification") as well as by a specific voice characteristic and timbre that allow the voice to be heard over a large orchestra, a vocal strategy in which sound projection and timbre are often favored over text or vowel intelligibility. Therefore, the question arises of how intelligible high-pitched vowels are when produced in singing and speaking styles and traditions other than "Western" ones.
In this context, the investigation of singing and speaking in the style of Cantonese Opera is of specific interest for several reasons: there is no strict separation of speaking and singing; sound projection and timbre are not given priority over text intelligibility; in some pieces, male and female roles are sung by women; and some roles involve longer passages of singing and speaking at pitches in the range of 500–900 Hz which, according to the style of Cantonese Opera, are intended to be intelligible. Thus, the investigation of singing and speaking at high pitches in Cantonese Opera offers a promising opportunity to clarify the relationship between vowel intelligibility and F0.
Above all, for singing, the investigation of long vowels allows for more reliable results than that of short vowels, both with regard to acoustic analysis (fewer transitions in the vowel nuclei) and perception (higher constancy in vowel quality). In Cantonese, there are seven long and seven medium-long vowels in (C)V and (C)V:S syllables. According to the corresponding statistical formant frequencies for women and "normal" speech of Cantonese, F1 for /i, y, u/ is given as < 380 Hz, and F1 for /ɛ, œ, ɔ/ as < 720 Hz [6, 7]. Thus, for the present investigation, sounds at F0 > 550 Hz are of special interest: for sounds intended as /i, y, u/, the first (lowest) spectral energy maximum of the sound will always be near or equal to the statistical F1 of another long vowel. (Note also the statistical frequency distance F1-F2 < 550 Hz for /u, o, a/.)
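The reasoning behind the 550 Hz threshold can be illustrated with a minimal sketch. It assumes an idealized harmonic sound whose spectral energy lies only at integer multiples of F0, and it takes the F1 values quoted above as upper bounds; the function names and the `max_hz` parameter are hypothetical and serve illustration only, not the analysis procedure of the study.

```python
# Upper bounds (Hz) on statistical F1 for women's "normal" Cantonese speech,
# as quoted in the text [6, 7]. They are limits, not measured means.
F1_UPPER_HZ = {"i": 380, "y": 380, "u": 380, "ɛ": 720, "œ": 720, "ɔ": 720}


def harmonics(f0_hz, max_hz=4000.0):
    """Return the harmonic frequencies k * F0 (k = 1, 2, ...) up to max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def f0_exceeds_f1(f0_hz, vowel):
    """True if F0 lies above the statistical F1 upper bound of the vowel."""
    return f0_hz > F1_UPPER_HZ[vowel]


def lowest_harmonic_gap(f0_hz, vowel):
    """Distance (Hz) from the lowest harmonic, i.e. F0, to the F1 upper bound."""
    return harmonics(f0_hz)[0] - F1_UPPER_HZ[vowel]


if __name__ == "__main__":
    f0 = 600  # example F0 above the 550 Hz threshold
    for vowel in F1_UPPER_HZ:
        print(vowel, f0_exceeds_f1(f0, vowel), lowest_harmonic_gap(f0, vowel))
```

For an F0 of 600 Hz, the lowest harmonic already lies 220 Hz above the F1 bound given for /i, y, u/, while it remains below the bound given for /ɛ, œ, ɔ/. This is the situation described above, in which the lowest spectral maximum of an intended close vowel falls into the F1 region of another long vowel.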
Against this background, the utterances of a famous actress performing a female character in a Cantonese Opera piece were investigated, focusing on syllables whose vowel nuclei had an average F0 > 550 Hz.