Welcome to BangorTalk

The ESRC Centre

This site holds the conversational corpora assembled by the ESRC Centre for Research on Bilingualism in Theory & Practice at the University of Wales, Bangor.

We are seeking to gain a greater understanding of how bilingual individuals in a variety of communities manage both their languages within the same conversation.

The questions we consider include:

  1. Do bilinguals in different types of communities handle their two languages in different ways in conversation?
  2. How do social variables such as class, age and gender affect the way people handle their two languages in informal conversations?

The corpora

To date, we have assembled three corpora: Siarad, Patagonia and Miami.

Summary data for each corpus:
Corpus      Welsh   English   Spanish   Indeterminate   Total (words)
Siarad      84%     4%        ---       13%             447507
Patagonia   78%     <0.5%     17%       5%              192939
Miami       ---     63%       34%       3%              264579

Search for a word across all conversations

A new search page is available, which returns 20 instances of a word from all conversations in the Siarad or Patagonia corpora. For searching, the conversations are combined into one file, and some information, such as glosses and (optionally) transcription marking, is removed.
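
As a rough illustration of this kind of lookup, the sketch below scans the lines of a combined text file and returns at most 20 lines that contain a given word. It is not the code behind the search page: the function name find_instances, the whole-word and case-insensitive matching, and the sample lines are all assumptions made for the example, and the sample lines are invented rather than taken from any corpus.

```python
import re
from typing import Iterable, List


def find_instances(lines: Iterable[str], word: str, limit: int = 20) -> List[str]:
    """Return up to `limit` lines that contain `word` as a whole word.

    Matching is case-insensitive; both choices are assumptions of this sketch.
    """
    # Lookarounds keep "ma" from matching inside "mae".
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)", re.IGNORECASE)
    hits: List[str] = []
    for line in lines:
        if len(hits) >= limit:
            break  # stop scanning once the limit is reached
        if pattern.search(line):
            hits.append(line.rstrip("\n"))
    return hits


# Invented lines standing in for a combined file (not real corpus data).
sample = [
    "SPK1: mae hi 'n braf heddiw .",
    "SPK2: yes mae 'n lovely .",
    "SPK1: dw i 'n mynd rŵan .",
]

for hit in find_instances(sample, "mae"):
    print(hit)
```

To run the same function over a file on disk, pass an open file object in place of the sample list, since iterating over a file yields its lines.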

Publications

A number of publications and presentations have resulted from mining the corpora for the linguistic information they contain.

Collaborators

The researchers have received input and assistance from a variety of collaborators around the world. We have also received help in translating the Miami corpus from a number of people, listed on this page.

Tools

Our corpus material is transcribed and annotated using the CHAT and CLAN applications developed by Prof Brian MacWhinney and Leonid Spektor at Carnegie Mellon University. Our Siarad data is also available via the Talkbank portal (although the version there differs slightly from the one on this website).

To gloss the Miami and Patagonia corpora we are using autoglossing software we have developed in-house. To mine all three corpora we are using a variety of techniques, including the output from the autoglosser.

Ethics

The ESRC Centre has collected these materials following the ethical guidelines set out in the Talkbank Code of Ethics.

Licensing

The material on Talkbank and on this site is available under the GNU General Public License, published by the Free Software Foundation. This means that you can access it freely and use it however you like. We would be grateful, however, if any such use could also acknowledge the ESRC Centre.


Contact us

bilingualism@bangor.ac.uk



Acknowledgements

The support of the Arts and Humanities Research Council (AHRC), the Economic and Social Research Council (ESRC), the Higher Education Funding Council for Wales (HEFCW) and the Welsh Government is gratefully acknowledged.