...wanted a means of obtaining corpus evidence regarding spoken language, but until recently the difficulty of transcribing large quantities of text prevented the construction of a spoken corpus of the necessary size (Crowdy 224). However, Longman, as part of the British National Corpus (BNC) project, has created an orthographically transcribed corpus of ten million words that covers a wide range of speech variation (Crowdy 224).
The BNC is a very large corpus of modern English, consisting of roughly 100 million words of spoken and written text (What is the BNC?). It provides a unique "snapshot" of the English language, presented in a form that makes possible almost any kind of computer-based research on the nature of language (UsesCorpus). John Sinclair defines "corpus" as the...