Have you ever done a Google search to check if your writing is correct? Many of us do it all the time – especially when writing in our second language. The idea behind this approach is simple: The more results Google gives us (i.e. the more often our chunk is found on the Internet), the more ‘accepted’ it apparently is. For example, if we are not sure if the correct form is ‘looking forward to seeing you’ or ‘looking forward to see you’, Google will tell us it might be better to use the first (148,000,000 versus 15,800,000 results). This way, Google can serve as an incredibly useful tool to help us in our (academic) writing.
The idea of using a large language database to extract information is not new. The field of corpus linguistics uses such databases (or ‘corpora’) to analyse language systematically. For example, you can use two freely accessible corpora, the British National Corpus (http://corpus.byu.edu/bnc/) and the Corpus of Contemporary American English (http://corpus.byu.edu/coca/), to analyse how a word or chunk might be used differently (e.g. more or less often) in British and American English. The field of language learning and teaching also makes great use of corpus linguistics (Aijmer, 2009; St.John, 2001). The most widely used sources are perhaps the Academic Word List (Coxhead, 2000) and the PHRASE List (Martinez & Schmitt, 2012), both providing learners of English with the most frequently used (academic) words and phrases in the English language. Overall, there is general agreement that the statistics derived from corpora show people what language they should learn and use – provided, of course, that these corpora contain accurate language.
This is, in fact, the main drawback of using Google as our corpus: the Internet covers texts from a lot of different sources and writers, containing a great deal of sloppy language. For example, searching for ‘is effected by’ (usually a misspelling of ‘is affected by’) shows us 38,000,000 results. This form may be accepted online, but it is clearly not something we want to use in our academic writing. Another drawback of using Google as our corpus is that we have to revisit the browser constantly to check the frequency of different chunks.
In order to overcome these two issues, I developed an app called Writefull (http://writefullapp.com/). This app allows us writers to carry out these frequency-searches without having to visit the browser: we can select a chunk from our text and a popover will appear, displaying the information from the corpus. It also gives more accurate language results, as it uses the Google Books database (which contains language from over 5 million published books) instead of the Google web search engine.
These are the five options currently provided:
1) Checking the number of results of your chunk in the corpus:
This is basically the same as doing a Google search.
2) Comparing the number of results of two chunks in the corpus:
This is the same as doing two Google searches, but at the same time – this is perfect for finding out which of the two options might be better to use.
3) Seeing examples of your chunk in different contexts:
Sentences that include your chunk are extracted from various online sources and displayed. This gives you an idea of how your chunk is normally used in context.
4) Seeing which word is most frequently used in a gap in your chunk, which you specify with a star (*):
This is perfect for those cases in which you are not sure (or cannot remember) which word is normally used at a particular spot. For example, ‘This is of * importance’ will list ‘This is of particular importance’, ‘This is of great importance’, ‘This is of special importance’, etc.
5) Seeing which synonyms for a specified word are more frequently used in your chunk, where you mark the word by placing a star on either side of it (*word*):
This option is very useful if you want a synonym for a word while making sure that this synonym actually suits your context. For example, ‘a *major* finding’ will list ‘a significant finding’, ‘a key finding’, ‘a principal finding’, etc.
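To make the idea behind options 1, 2 and 4 concrete, here is a minimal sketch in Python. It is not how Writefull itself works: the tiny CORPUS list and the function names count_chunk, compare_chunks and fill_gap are made up for illustration, and a real tool would query a far larger database such as Google Books.

```python
import re
from collections import Counter

# Hypothetical toy corpus, for illustration only.
CORPUS = [
    "This is of great importance to the field.",
    "This is of particular importance here.",
    "This is of great importance for learners.",
    "I am looking forward to seeing you.",
    "We are looking forward to seeing you soon.",
    "He is looking forward to see you.",
]


def count_chunk(chunk, corpus=CORPUS):
    """Option 1: count how often a chunk occurs (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(chunk) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(sentence)) for sentence in corpus)


def compare_chunks(first, second, corpus=CORPUS):
    """Option 2: return both counts so the more frequent chunk can be picked."""
    return {first: count_chunk(first, corpus), second: count_chunk(second, corpus)}


def fill_gap(chunk, corpus=CORPUS):
    """Option 4: rank the words found at the position marked with a star (*)."""
    before, after = chunk.split("*")
    pattern = re.compile(
        r"\b" + re.escape(before.strip())
        + r"\s+(\w+)\s+"
        + re.escape(after.strip()) + r"\b",
        re.IGNORECASE,
    )
    words = Counter()
    for sentence in corpus:
        # findall returns the captured gap word for every match in the sentence
        words.update(word.lower() for word in pattern.findall(sentence))
    return words.most_common()


print(compare_chunks("looking forward to seeing you", "looking forward to see you"))
print(fill_gap("This is of * importance"))
```

On this toy corpus, ‘looking forward to seeing you’ is found twice and ‘looking forward to see you’ once, and the gap in ‘This is of * importance’ is filled by ‘great’ (twice) and ‘particular’ (once). The counts only mean something when the corpus is large and contains accurate language, which is exactly why the choice of corpus matters.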
I believe that the first two options will be welcomed by all of those who are in the habit of using Google to check their writing. Options 3, 4 and 5 will come in handy for anybody who writes and could use some feedback.
At this point, the whole idea of using corpus linguistics to check for accuracy is still unknown to most students. However, I do believe the corpus method can help a lot of writers – and this little app makes it easy to do.
The app is free to download from the website and can be used on both Mac OS X and Windows.
- Aijmer, K. (2009). Corpora and Language Teaching. Amsterdam: Benjamins.
- St.John, E. (2001). A case for using a parallel corpus and concordancer for beginners of a foreign language. Language Learning & Technology, 5(3), 185-203.
- Coxhead, A. (2000). A New Academic Word List. TESOL Quarterly, 34(2), 213-238.
- Martinez, R., & Schmitt, N. (2012). A Phrasal Expressions List (‘The PHRASE List’). Applied Linguistics, 33(3), 299-320.