Traduzione e Interpretazione

Exmaralda: a possible integration between time-based and hierarchical data models

Empirical examinations of transcription methods and their implications are singularly lacking in the research literature. This post aims to fill this gap by recalling the main differences between time-based and hierarchical data models, and by providing some technical information on the feasibility of integrating them in one piece of software, Exmaralda. Like the post on transcription tools I uploaded a few days ago, this work is a technical complement to the mainly theoretical paper (forthcoming) I have written on the transcription of oral interactions.

Assuming that one complete, exhaustive system that suits all purposes may not be achievable, I divide attempts at finding the best transcription methods into two groups, according to the way in which oral data are treated: either on a time base or hierarchically.

Time-based data models “take the temporal relation between elements as the main principle for the organization of transcription data” (Schmidt, 2005: 1). Alternatively called “stave” or “single timeline, multiple tiers” data models, they place simultaneous events at the same horizontal position, and associate the left-to-right direction with temporal sequence (Schmidt, 2004: 1). Such time-based systems enable the transcriber to represent the words uttered by different speakers on different tiers, to display the overlaps between speakers, and to add any number of further tiers to code verbal and non-verbal phenomena that are associated with a particular point in time and hence relevant for the analysis (see Schmidt 2004 for a detailed account of structural relations and Document Type Definition in this type of data model, and for an overview of transcription tools based on musical score notation).
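To make the “single timeline, multiple tiers” idea more concrete, here is a minimal sketch in Python. It is not Exmaralda's actual file format or code: the class names, the speaker labels DOC and PAT, the utterances and the time offsets are all invented for illustration. The only point is that every event refers to points on one shared timeline, so an overlap between speakers is simply an intersection of intervals.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    start: int  # index of the timeline point where the event begins
    end: int    # index of the timeline point where the event ends
    text: str


@dataclass
class Tier:
    speaker: str
    category: str
    events: list = field(default_factory=list)


def overlaps(a, b):
    """Two events overlap when their timeline intervals intersect."""
    return a.start < b.end and b.start < a.end


def find_overlaps(tier_a, tier_b):
    """Return the pairs of texts that are (partly) simultaneous."""
    return [
        (a.text, b.text)
        for a in tier_a.events
        for b in tier_b.events
        if overlaps(a, b)
    ]


# One shared, ordered timeline (hypothetical offsets in seconds).
timeline = [0.0, 1.2, 1.9, 2.6, 3.4]

# Two verbal tiers, one per speaker (hypothetical data).
doctor = Tier("DOC", "v", [Event(0, 2, "how are you feeling today"),
                           Event(3, 4, "good")])
patient = Tier("PAT", "v", [Event(1, 3, "much better thanks")])

simultaneous = find_overlaps(doctor, patient)
```

Nothing in this structure says what kind of thing an event is or how it relates to other events: position on the timeline is the only organizing principle, which is exactly the property discussed above.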

In hierarchical data models, on the other hand, “the principal relation between any two elements […] is not defined by their respective positions on a timeline, but by their positions in an ordered hierarchy” (Schmidt, 2005: 2). Following an interpretation of events and language functions, entities of the same nature are placed in the same division, and their order of appearance corresponds to temporal sequence. Transcription methods based on HTML and XML technology can generally be defined as hierarchical, since both the HyperText Markup Language (HTML) and the eXtensible Markup Language (XML) encode electronic documents hierarchically in order to ease their interchange over the internet (see the TEI guidelines for a detailed account of structural relations and Document Type Definition in this type of data model, and Dodd 2008 for an introduction to Xaira, a tool based on TEI-conformant hierarchical XML).
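By way of contrast, a hierarchical representation can be sketched with Python's standard xml.etree.ElementTree module. The element names div and u and the who attribute are borrowed from the TEI vocabulary for transcribed speech, but the fragment below is only an illustration and makes no claim to be a valid TEI document; the division type, speakers and words are invented.

```python
import xml.etree.ElementTree as ET


def build_division(div_type, turns):
    """Place utterances of the same nature inside one typed division.

    The order of the <u> children stands for temporal sequence;
    no timeline is involved.
    """
    div = ET.Element("div", {"type": div_type})
    for speaker, words in turns:
        u = ET.SubElement(div, "u", {"who": "#" + speaker})
        u.text = words
    return div


body = ET.Element("body")
body.append(build_division("history-taking",
                           [("DOC", "any allergies"), ("PAT", "no none")]))

# Serialize to a string for inspection.
xml_text = ET.tostring(body, encoding="unicode")
speakers_in_order = [u.get("who") for u in body.iter("u")]
```

Here the relation between the two utterances is given by their position inside the same division, not by timeline points, so simultaneous speech needs some extra device to be expressed.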

Both time-based and hierarchical data models try to transcribe strings and undo knots of information (Tilley, 2003: 758) so as to “re”-present audio-recorded interactions. The former uses the timeline as a benchmark, transcribing equally important strings of sounds that speakers produce one after another or at the same time, while the latter detects, or artificially constructs, the event structure, and then transcribes strings whose value differs according to their position in the hierarchy.

In spite of this huge difference, there has been one attempt at bringing the time-based and hierarchical data models together, in the tool Exmaralda. It was as recently as 2005 that Schmidt concretely formulated “one scenario where one particular time-based data model is brought into accordance with one particular subset of the TEI guidelines for transcription of speech” (2005: 3). While admitting that there might be other ways of bringing the two data models together, Schmidt (2005) puts special emphasis on the technical aspects of his solution, and provides practical guidelines for converting the Exmaralda Basic-Transcription format to the TEI format.

Before I provide some theoretical and technical information on the feasibility of the integration I finally opted for, a step backwards is necessary in order to outline its rationale. For my Master’s Degree Thesis (Niemants, 2008) I also dealt with oral data, which I transcribed in TEI-XML and subsequently queried with the software Xaira (Dodd, 2009). Xaira is a general-purpose XML search engine which operates on any corpus of well-formed XML documents, and which is best used with TEI-conformant documents. Through the use of a CSS style sheet, the “apparent dichotomy between machine-friendly and reader-friendly formats” (Cencini & Aston, 2002: 57) can be resolved, and one can end up with easily and intuitively readable displays. Given that TEI and Xaira had brought good results in the MA, when I embarked on my PhD I had strong reasons to believe in hierarchical data models and to repeat the experience with Xaira. But since there appears to be no good transcription tool that supports TEI natively, I needed software which could help me with the transcribing activity and which would also allow me to export or easily convert transcripts into TEI-XML. When I first heard of Schmidt’s (2005) integration between time-based and hierarchical data models, Exmaralda appeared to be the best possible solution to my needs. But when I downloaded it, I found even its basic functions quite complex to use, and I therefore decided to test three other tools (visit for a technical account of such a testing) before making a final, and well-informed, decision.

What follows is the justification of that decision, along with some technical remarks on the integration of time-based and hierarchical models in Exmaralda.

By and large, time-based data models have the huge advantage of putting things in relation to time, without necessarily requiring the transcriber to distinguish them according to their function or to their relation with other things happening in the same or in other divisions. As such, they may be suitable for carrying out a Conversation Analysis (CA) of interactions, which basically tries to answer the question “Why that now?” (Heritage & Clayman, 2010: 17, my emphasis). Hierarchical data models, on the other hand, do run the risk of over-interpreting or artificially constructing the hierarchy of what is happening in conversation, but they also offer a big advantage: by relating things to one another, and not simply to the timeline, they can also provide answers to the basic question of CA, shedding light on particular turn patterns and practices occurring within specific subdivisions (e.g. question-answer sequences in history-taking).

In principle, the strengths of the two approaches are complementary, and combining them should result in a greatly enhanced view of interactions. In practice, such an integration is still far from easy: Schmidt’s (2005) guidelines are useful for a start, but too generic to cover the specific needs of particular research projects and transcription conventions (not to mention the fact that the exported XML is not entirely TEI-conformant).

Here is a summarized version of Schmidt’s guidelines, which first made me dream of a possible integration between time-based and hierarchical data models, then helped me make this dream come partially true:

-    Follow a set of simple conventions when creating or editing a transcription in the Exmaralda Partitur-Editor (Schmidt, 2005: 23);
-    Use the export filter to transform the resulting Exmaralda Basic-Transcription file into a TEI-XML file. More precisely, the filter automatically performs three steps:
o    transform the Basic-Transcription into a Segmented-Transcription in which consecutive events in transcription tiers are grouped into segment chains;
o    calculate a so-called List-Transcription on the basis of these segment chains;
o    apply XSL transformations to turn what is already structurally very similar to a TEI transcription into a TEI-conformant document.
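The three steps of the export can be approximated in a few lines of Python. This is only my own rough reading of the logic, not Exmaralda's implementation: the function names, the tuple layout (start point, end point, text) and the sample data are hypothetical, the real filter relies on XSL transformations rather than Python, and the output below is TEI-like rather than TEI-conformant.

```python
import xml.etree.ElementTree as ET


def segment_chains(events):
    """Step 1: group consecutive events of one tier into segment chains.

    Two events are treated as consecutive when the end point of the first
    is the start point of the second (an assumption of this sketch).
    """
    chains = []
    for start, end, text in sorted(events):
        if chains and chains[-1]["end"] == start:
            chains[-1]["end"] = end
            chains[-1]["text"] += text
        else:
            chains.append({"start": start, "end": end, "text": text})
    return chains


def list_transcription(tiers):
    """Step 2: flatten the chains of all tiers into one list ordered by start."""
    rows = []
    for speaker, events in tiers.items():
        for chain in segment_chains(events):
            rows.append((chain["start"], chain["end"], speaker, chain["text"]))
    return sorted(rows)


def to_tei_like(rows):
    """Step 3: turn each row into a <u> element that points back to the timeline."""
    body = ET.Element("body")
    for start, end, speaker, text in rows:
        u = ET.SubElement(body, "u", {"who": "#" + speaker,
                                      "start": f"#T{start}",
                                      "end": f"#T{end}"})
        u.text = text.strip()
    return ET.tostring(body, encoding="unicode")


# Hypothetical basic transcription: speaker -> (start point, end point, text).
tiers = {
    "DOC": [(0, 1, "how are "), (1, 2, "you "), (3, 4, "good ")],
    "PAT": [(2, 3, "fine thanks ")],
}

rows = list_transcription(tiers)
tei = to_tei_like(rows)
```

In this toy example the doctor's first two events merge into one chain, while the later “good” starts a new one because the patient's turn comes in between. The references to timeline points are kept on each utterance, which is what allows the time-based information to survive inside the hierarchical document.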

If I say partially, it is because the implementation of the import and export Schmidt (2005) talks about is a “proof-of-concept”: it shows that compatibility between Exmaralda and TEI is possible in principle by showing that it works for a certain class of documents. It does not, however, necessarily work for every kind of Exmaralda and TEI document. In other words, I had to work out what I aimed to capture from the oral data, so that I could ask for the export filter to be modified according to my own transcription conventions and research needs.

Fabricated example of what my transcriptions look like in Exmaralda Partitur Editor

It is very hard to account for changes in transcripts which have been long in the making, and whose conventions result from a complex decision-making process that only ends for good when the transcribing itself is finished. But I can try to sketch the main decisions that affected my transcription conventions in the Exmaralda Partitur Editor, so as to give you an idea of what they look like. By and large, I decided to separate what is uttered (things like words, laughter or pauses attributable to a particular speaker are transcribed in Tier T category V, where T stands for transcription and V for verbal) from how it is uttered (things like tempo, volume, intonation and prosody are transcribed in Tier A category P, where A stands for annotation and P for prosody), and from the function that it may have (labels like narrative or interactional speech are transcribed in Tier A category F, where A stands for annotation and F for functional). I also decided to dedicate one line (Tier D category S) to the description of silences and noise that cannot be attributed to any particular speaker, and to reserve another line (Tier D category E) for non-phonological and non-lexical elements I may need to describe (e.g. an external element such as a phone call, the noise of a printer and the like).
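For readers who prefer a tabular view, the conventions just described can be written down as a small Python structure. The tier types and categories are the ones listed above; the per_speaker flag is my own assumption for this sketch (I assume that the prosodic and functional annotation tiers are repeated for each speaker, like the verbal tier), and the speaker labels are invented.

```python
from typing import NamedTuple


class TierSpec(NamedTuple):
    tier_type: str     # T = transcription, A = annotation, D = description
    category: str      # V, P, F, S or E
    per_speaker: bool  # assumption: True if the tier is repeated for each speaker
    content: str


CONVENTIONS = [
    TierSpec("T", "V", True, "what is uttered: words, laughter, pauses"),
    TierSpec("A", "P", True, "how it is uttered: tempo, volume, intonation, prosody"),
    TierSpec("A", "F", True, "function, e.g. narrative or interactional speech"),
    TierSpec("D", "S", False, "silences and noise not attributable to a speaker"),
    TierSpec("D", "E", False, "external elements such as a phone call or a printer"),
]


def tiers_for(speakers):
    """List the (speaker, type, category) tiers a transcript would contain."""
    tiers = [(s, spec.tier_type, spec.category)
             for s in speakers
             for spec in CONVENTIONS if spec.per_speaker]
    tiers += [(None, spec.tier_type, spec.category)
              for spec in CONVENTIONS if not spec.per_speaker]
    return tiers


# Hypothetical two-party interaction.
layout = tiers_for(["DOC", "PAT"])
```

Under these assumptions a two-party interaction yields eight tiers: three for each speaker, plus the two description lines shared by the whole transcript.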

As you may remember from the post I wrote on Transcription tools, Exmaralda allows for multiple exports in different formats, so that one is not bound to use the programme’s default visualization (Partitur) and query device (Exact). The .rtf output does not cause any problems, since the information contained in all of my verbal, prosodic and functional tiers is output in that format. Some problems do arise, however, when exporting data in .txt, .html and, what interests me most, .xml. More precisely, some pieces are unfortunately lost in the .html segment chain list (mainly overlaps), and some others are still difficult for Xaira to interpret in the .xml export (the main problem is that Exmaralda does too much with <seg>, which potentially leads to problems of overlapping elements).

Fortunately, however, the two scholars who are in charge of software development and dissemination, Thomas Schmidt and Bernt Meyer, are extremely helpful to people wanting to transcribe with Exmaralda. They always replied to my often confused emails, they repeatedly provided feedback and advice, and they extended Exmaralda with a filter they called TEI Modena export, named after the town where I carry out my PhD project and the TEI-conformant XML I need for querying the corpora with Xaira (visit to see all changes to Exmaralda).

Although I still have some problems with the .html and .xml export, I am confident enough that we can solve them by the end of my PhD project, ending up with a suitable .txt, .html and .xml export I won’t fail to share on this website.

In the meantime, if it is true that “experience is what you get when you do not get what you want” (Dan Stanford), I think I got a lot of it while preparing this post.

References may be found on the Reference page:
