What is the CECD? 
The CECD is an AHRC-funded research group dedicated to examining the evolutionary underpinnings of human cultural behaviour, past and present.

Phase 2: Theme B - Cultural and linguistic diversity: Project B002
Different methods of measuring phonetic similarity

SUPERVISORS: April McMahon, Simon Kirby

Jennifer Sullivan (Linguistics and English Language, University of Edinburgh)
April McMahon (School of Philosophy, Psychology and Language, Uni. of Edinburgh)
Simon Kirby (Language Evolution and Computation Research Unit, University of Edinburgh)

PROJECT FUNDING: AHRC
Measuring and comparing phonetic similarity.

The comparison of languages and dialects, which is crucial to the construction and testing of hypotheses on family relationships, contact, and convergent change, has been facilitated considerably by the development of quantitative and computational methods. Nonetheless, such methods remain for the most part restricted to lexical comparisons, often involving set lists of basic meanings. Coding for such comparisons, as in the much-used Dyen, Kruskal and Black (1992) database, is essentially second-order, relying on linguists’ judgements of whether items in two languages are plausibly cognate, or likely borrowings. It is vital that linguists should now develop first-order means of comparison which are not based on lexical cognacy judgements. This will allow validation of the lexical results, and extend comparison and classification to the many languages where we have little or no recorded history, and no opportunity for prior application of the traditional comparative method. The obvious area for such direct structural comparison is phonetics, since its articulatory and acoustic bases are universal, allowing universal systems of measurement and comparison to be constructed.

Indeed, various systems for phonetic comparison are already under development, and this is the core of the problem to be investigated in this doctoral project. Some methods have been developed within speech therapy, to measure the distance between a target pronunciation and a child’s attempts; others are constructed within speech technology, and measure acoustic distance. Some articulatory methods rely on Levenshtein distances (Nerbonne and Heeringa); others are based on weighted feature comparisons (Heggarty). Not all of these groups are aware of the work of the others, and the field is already fragmenting, leading to a loss of insights and opportunities.
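To make the contrast between the two articulatory approaches concrete, here is a minimal sketch in Python. It computes a plain Levenshtein distance over sequences of segments and then swaps in a feature-based substitution cost. The feature table, the function names and the normalisation by the longer sequence are assumptions made for illustration only; they are not the actual systems of Nerbonne and Heeringa or of Heggarty.

```python
def unit_cost(x, y):
    """Plain Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if x == y else 1.0


def edit_distance(a, b, sub_cost=unit_cost):
    """Edit distance between two segment sequences.

    Insertions and deletions cost 1; substitutions cost sub_cost(x, y).
    """
    prev = [float(j) for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [float(i)]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                 # delete x
                curr[j - 1] + 1.0,             # insert y
                prev[j - 1] + sub_cost(x, y),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def normalised_distance(a, b, sub_cost=unit_cost):
    """Distance divided by the longer length (one simple normalisation)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b, sub_cost) / longest if longest else 0.0


# Toy binary features for a few consonants. Hypothetical values chosen
# only to illustrate the idea; not taken from any published system.
FEATURES = {
    "p": {"voiced": 0, "labial": 1, "stop": 1},
    "b": {"voiced": 1, "labial": 1, "stop": 1},
    "t": {"voiced": 0, "labial": 0, "stop": 1},
    "d": {"voiced": 1, "labial": 0, "stop": 1},
    "s": {"voiced": 0, "labial": 0, "stop": 0},
}


def feature_cost(x, y):
    """Substitution cost as the share of differing features (0 to 1)."""
    if x == y:
        return 0.0
    fx, fy = FEATURES.get(x), FEATURES.get(y)
    if fx is None or fy is None:
        return 1.0  # unknown segment: fall back to the unit cost
    return sum(fx[k] != fy[k] for k in fx) / len(fx)


if __name__ == "__main__":
    print(edit_distance("bat", "pat"))                # unit cost
    print(edit_distance("bat", "pat", feature_cost))  # feature-based cost
```

With the unit cost, every mismatch counts the same; with the feature-based cost, "b" versus "p" (differing only in voicing in this toy table) is cheaper than "b" versus "s". How such costs should be weighted is exactly the kind of design decision on which the existing methods differ.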

The aims of this doctoral project are:
a) to collect information on all current methods of measuring phonetic similarity,
b) to compare these,
c) to make recommendations about the most promising approaches, and to move towards the design of an optimal means of comparison for both language- and dialect-level work, and
d) to measure phonetic similarity in a new domain, that of intonation. Intonation refers to the speaker's use of pitch to make utterance-level contrasts, e.g. to distinguish statements from questions. Existing measures of phonetic similarity involve measurements between vowels and consonants in isolated words from basic meaning lists. Intonation needs a completely different approach, as it cannot be broken down into discrete features as easily. It requires acoustic measurements of the precise timing and height of high and low points in specific parts of the speaker's pitch contour.
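As a rough illustration of the kind of acoustic measurement mentioned in aim (d), the sketch below takes a pitch contour given as parallel lists of times (in seconds) and f0 values (in Hz), finds the highest and lowest samples, expresses the peak time as a proportion of a chosen interval, and converts a frequency to semitones relative to a reference. The function names, the sample values and the interval boundaries are hypothetical; the sketch assumes the contour contains voiced samples only and is not the measurement procedure used in the thesis.

```python
import math


def turning_points(times, f0):
    """Return ((t_high, f_high), (t_low, f_low)) for a pitch contour."""
    if len(times) != len(f0) or not f0:
        raise ValueError("times and f0 must be non-empty and equal in length")
    hi = max(range(len(f0)), key=lambda i: f0[i])
    lo = min(range(len(f0)), key=lambda i: f0[i])
    return (times[hi], f0[hi]), (times[lo], f0[lo])


def relative_alignment(t, start, end):
    """Position of time t within [start, end] as a proportion."""
    if end <= start:
        raise ValueError("end must be later than start")
    return (t - start) / (end - start)


def semitones(f, ref):
    """Height of frequency f in semitones relative to ref (both in Hz)."""
    return 12.0 * math.log2(f / ref)


if __name__ == "__main__":
    # Invented sample contour, for illustration only.
    times = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25]
    f0 = [180.0, 175.0, 190.0, 220.0, 210.0, 200.0]

    high, low = turning_points(times, f0)
    print("peak:", high, "valley:", low)
    # Peak timing relative to a hypothetical syllable from 0.10 s to 0.30 s.
    print("alignment:", relative_alignment(high[0], 0.10, 0.30))
    print("rise in semitones:", semitones(high[1], low[1]))
```

Expressing timing as a proportion of a linguistic unit and height in semitones is one common way to make such measurements comparable across speakers; other normalisations are possible.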

Exploring phonetic similarity in this new domain of intonation offers the following potential benefits:
1) Quantification may enable the evaluation of hypotheses on historical change in intonation, as quantitative methods have aided the study of historical change in vowels and consonants;
2) The approach to intonation may offer new ways of tackling phonetic similarity in vowels and consonants.

The main aim of my doctoral thesis is to begin to extend phonetic distance measurements to the domain of intonation. Existing studies of vowel and consonant phonetic distance have strong associations with historical linguistic questions. I begin with this context and use the phylogenetic distance-based method NeighborNet in conjunction with vowel and consonant distances from a small subset of data from the extensive Germanic language variety data set of McMahon et al. (2005-07). I demonstrate problems with the use of feature systems in these vowel and consonant measures. I also show that phonetic similarity measures by their nature almost always produce a network rather than a tree structure. Therefore, it is inappropriate to use phonetic data alone to argue that language varieties are related in a network-like rather than a tree-like way. I then attempt to draw together strands from the disparate fields of quantitative historical linguistics and intonation. The intonation of Belfast and Glasgow English provides a central case study for this. Previous work suggests that both varieties display a special rising phenomenon on statements, yet they have never been formally compared.
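One way to see why fine-grained distance measures tend to yield networks is the four-point condition: a distance matrix can be drawn exactly as a tree only if, for every set of four varieties, the two largest of the three pairwise sums d(i,j)+d(k,l), d(i,k)+d(j,l) and d(i,l)+d(j,k) are equal. The sketch below checks this condition on two invented matrices. It is an illustrative check under that assumption, not the NeighborNet algorithm and not the analysis carried out in the thesis; the function names and values are hypothetical.

```python
from itertools import combinations


def four_point_gap(d, i, j, k, l):
    """Gap between the two largest pairwise sums for one quadruple.

    A gap of zero means the quadruple is compatible with a tree.
    """
    sums = sorted([
        d[i][j] + d[k][l],
        d[i][k] + d[j][l],
        d[i][l] + d[j][k],
    ])
    return sums[2] - sums[1]


def max_four_point_gap(d):
    """Largest gap over all quadruples of a symmetric distance matrix."""
    quads = combinations(range(len(d)), 4)
    return max((four_point_gap(d, *q) for q in quads), default=0)


if __name__ == "__main__":
    # Distances read off a tree ((A,B),(C,D)) with all edge lengths 1.
    tree_like = [
        [0, 2, 3, 3],
        [2, 0, 3, 3],
        [3, 3, 0, 2],
        [3, 3, 2, 0],
    ]
    # The same matrix with one distance (A to C) perturbed.
    conflicting = [
        [0, 2, 4, 3],
        [2, 0, 3, 3],
        [4, 3, 0, 2],
        [3, 3, 2, 0],
    ]
    print(max_four_point_gap(tree_like))
    print(max_four_point_gap(conflicting))
```

Because continuous phonetic measurements will rarely satisfy the equality exactly, some conflicting signal is to be expected in almost any such matrix, which is consistent with the caution above about reading network structure in phonetic data as evidence about language history.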

This thesis presents two main hypotheses on the source of these statement rises: the Alignment hypothesis and the Transfer hypothesis. The Alignment hypothesis posits that statement rises were originally more typical statement falls but have changed into rises over time through gradual phonetic change to the location of the pitch peak. The Transfer hypothesis considers that statement rises have come about through transfer of rises onto a statement context, either from question rises or continuation rises. I evaluate these hypotheses using primary acoustic parameters. The main data set consists of data from three Belfast English and three Glasgow English speakers in a Sentence reading task and a semi-spontaneous Map task game.

The results crucially indicate that the origin of the statement rises in Belfast and Glasgow English respectively may be different. The Glasgow statements show support for the Alignment hypothesis, while the Belfast statements fit best with the Transfer hypothesis. The fundamental difference between Glasgow and Belfast is the presence of a final low (L) pitch target in Glasgow and a final high (H) target in Belfast. The height of the final H in Belfast statements suggests that the transfer may be from continuation rather than from question rises.

I then present a proposal for an overall measure of intonational distance and show how it could be incorporated into a phylogenetic distance network method like NeighborNet. However, I show problems with parameter weighting, comparing like with like and distinguishing between chance resemblance and genuine historical connections. I argue that intonation may have more in common with Character-based approaches to studying language history (than with vowel and consonant Distance-based approaches) even though it would be impossible at present to use Character methods with the intonational measurements I have made. The thesis concludes with an assessment of the benefits that intonational analysis could bring to improving vowel and consonant phonetic distance measures.
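The parameter-weighting problem can be illustrated with a small sketch. It combines two intonational parameters into a single weighted Euclidean distance and shows that the variety judged closest can change with the weights. The parameter names, the three varieties "A", "B" and "C", and all values are invented for illustration; they are not measurements from the thesis, and the weighted Euclidean form is an assumption rather than the measure proposed there.

```python
import math


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance over the parameters named in weights."""
    return math.sqrt(sum(w * (x[k] - y[k]) ** 2 for k, w in weights.items()))


def nearest(name, data, weights):
    """Name of the variety closest to the given one under these weights."""
    others = (n for n in data if n != name)
    return min(others, key=lambda n: weighted_distance(data[name], data[n], weights))


# Invented values: peak alignment as a proportion of the syllable,
# final target height in semitones above a reference.
DATA = {
    "A": {"peak_alignment": 0.2, "final_height_st": 1.0},
    "B": {"peak_alignment": 0.8, "final_height_st": 1.5},
    "C": {"peak_alignment": 0.3, "final_height_st": 6.0},
}

EQUAL_WEIGHTS = {"peak_alignment": 1.0, "final_height_st": 1.0}
ALIGNMENT_HEAVY = {"peak_alignment": 100.0, "final_height_st": 0.01}

if __name__ == "__main__":
    print(nearest("A", DATA, EQUAL_WEIGHTS))
    print(nearest("A", DATA, ALIGNMENT_HEAVY))
```

With equal weights the semitone-scaled parameter dominates simply because its numerical range is larger, so "A" groups with "B"; weighting alignment heavily makes "A" group with "C" instead. Any distance matrix passed on to a network method inherits this choice.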

The PhD thesis has now been submitted, and examiners have been appointed. We await a date for the viva, but anticipate that it will take place before the end of the calendar year.

