Parsed Corpus Projects

As an experimental technique, the “acceptability judgment task” has led to much progress in our understanding of the human capacity for language, as I discuss with my colleagues in this paper. But the theory itself (and therefore the effectiveness of this methodology) is limited by our understanding of the empirical facts of natural linguistic production. To build theories and construct effective experiments, we need an accurate picture of the syntactic patterns found in vernacular speech. The variability of individual linguistic behavior poses a further challenge to experimentation, as I discuss in Chapter 10 of this book.

These concerns gave rise to my interest in building audio-aligned and parsed corpora of vernacular Englishes, an interest inspired entirely by the work of Anthony Kroch and Beatrice Santorini on the Penn Parsed Corpora of Historical English. My collaborators and I are currently building two parsed corpora of vernacular speech that are large enough to provide the frequency data that researchers need in order to understand syntactic variation.

Please explore the links below to learn more about these corpus projects and the research they have generated:


Tortora, C., B. Santorini, F. Blanchette, & C.E.A. Diertani. to appear. The Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE).


Tortora, C., C. Cutler, B. Haddican, M. Newman, B. Santorini, & C.E.A. Diertani. in progress. A Corpus of New York City English (CUNY-CoNYCE).