Hamburg MapTask Corpus
The Hamburg MapTask Corpus (HAMATAC) is a spoken language corpus
documenting the performance of 24 L2 learners of German in a map task.
HAMATAC was recorded and transcribed in project Z2 at the Research Centre on Multilingualism.
The current version, 0.3, adds a new communication with a video recording to the resources known from the previous version, e.g. orthographic transcriptions of the recordings, manual annotation of disfluencies, and automatic annotation of parts of speech and lemmas.
We plan to add further annotations in future versions.
By using the HAMATAC corpus, you agree to:
- use it for non-commercial research and teaching purposes only
- not redistribute it to third parties
- cite the following source in any published work which is based on the corpus:
The following documentation is available:
- A PDF document explaining the design and structure of the corpus.
- The maps used in the experiment: Map 1 [with path] [without path] / Map 2 [with path] [without path]
- A HIAT transcription manual explaining the conventions used for orthographic transcription in the corpus
- Documentation of the STTS tag set used for the automatic POS tagging with the TreeTagger and for the manual tagging of main parts of speech
- The annotation guidelines for the disfluency annotation in the corpus
- A PDF document explaining online and offline use of EXMARaLDA corpora
Online data (password protected)
The following data can be viewed online:
- A corpus overview which links to all transcriptions, recordings, visualization and export documents
- Corpus statistics organised by communication
- Corpus statistics organised by speaker
- A wordlist for the whole corpus
- [NEW] Online query interface
Downloadable data (password protected)
The following data can be downloaded for offline use:
- A zip archive with all data in EXMARaLDA formats (basic transcriptions, segmented transcriptions, Coma file)
- A zip archive with transcriptions in FOLKER format
- A zip archive with transcriptions in ELAN (*.eaf) format
- A zip archive with transcriptions in TEI format
- A zip archive with transcriptions in Praat TextGrid format
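As an illustration of offline use, the ELAN (*.eaf) files are XML documents and can be read with a standard XML parser. The following is a minimal sketch in Python, not an official tool of the corpus. It relies only on the general EAF layout (TIME_SLOT, TIER, ALIGNABLE_ANNOTATION, ANNOTATION_VALUE). The sample fragment, the tier name "SPK0" and the utterance are invented for illustration, and the tiers actually present in the HAMATAC export may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written EAF fragment for illustration only.
# Tier names and contents in the real HAMATAC files will differ.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="SPK0" LINGUISTIC_TYPE_REF="default">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>Also du gehst nach links.</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_eaf_tiers(root):
    """Return {tier_id: [(start_ms, end_ms, text), ...]} for time-aligned annotations."""
    # Map each time slot ID to its offset in milliseconds.
    # TIME_VALUE is optional in EAF, so unaligned slots become None.
    slots = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        slots[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None

    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, text))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


tiers = read_eaf_tiers(ET.fromstring(SAMPLE_EAF))
for tier_id, rows in tiers.items():
    for start, end, text in rows:
        print(tier_id, start, end, text)
```

For a file unpacked from the zip archive, the root element would be obtained with ET.parse(path).getroot() instead of ET.fromstring, where path is the location of the *.eaf file on your machine. Annotations that refer to other annotations rather than to time slots (REF_ANNOTATION elements) are not covered by this sketch.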