Overview:

In my own experience as a high school student, and in my recent observations of high school students, demonstrating the relevance of books, especially old ones, to some members of this age group can be challenging. Teachers can use graphs generated by Google Ngrams to connect their students' present day to texts across time. Students can see the interests of a culture reflected in its writings. Teachers can also use the tool to demonstrate the fluid, living nature of language.

The user enters a word or phrase of interest into the Google Ngrams Viewer. The Viewer then generates a graph of how often this word or phrase has occurred since the 1500s. It samples from a collection of 5.2 million books that have been digitized by Google Books. For instance, in the example below, I entered the word “technology.” The Viewer returned a graph showing a sharp increase in the use of the word after 1960. Perhaps these results show a change in language, or perhaps they reflect a cultural shift. With these results, I could speculate with students about whether the increase occurred because people had previously been using a word other than “technology,” or because people began writing a lot more about technology during the Cold War.

Example 1: http://mjhofe.people.wm.edu/modules/TechnologyNgram.png

Users can enter more than one search term for comparison. For instance, students might be interested in comparing how frequently people have written about vampires versus zombies or werewolves. Users can also click on the hyperlinked years below the bottom axis (see above screenshot) to view sample occurrences of the word or phrase within Google Books.

These capabilities make the tool an intriguing one for language study in the classroom. Students can easily enter search terms and parameters into the Viewer. Teacher bloggers have various suggestions for using the Viewer to search for or demonstrate linguistic and cultural change. However, the tool can easily be misused to “support” erroneous conclusions, and both teachers and students should know its limitations in order to use it effectively.

 


 

How to Get Started with Google Ngrams:

As noted above, the basics of the Viewer are almost intuitive. To begin, visit http://books.google.com/ngrams

1.       Enter words or phrases of interest in the box at the top of the screen.

2.       Users can choose which years between 1500 and 2010 they would like to search by typing the start and end years into the boxes directly below.

3.       Users can also select a corpus, or the body of texts which will be searched. They can choose between several languages. There is also a distinction between American English and British English. The user can further narrow by selecting only English works of fiction.

4.       The user can also select the amount of smoothing. Google explains this in the “About Google Books NGram Viewer” tab. Smoothing averages the values around any given point so that the resulting graph shows a smoother, easier-to-interpret pattern.

5.       Click on “Search lots of books” and view your graph!

6.       Explore in-context occurrences of the word by clicking on a time span on the bottom bar.
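
The smoothing described in step 4 can be illustrated with a short Python sketch. This is only a minimal illustration of a centered moving average, not Google's own code: the function name `smooth`, the sample values, and the way the ends of the date range are handled (averaging whatever neighbors exist) are assumptions made for the example.

```python
def smooth(values, smoothing):
    """Centered moving average: each point is averaged with up to
    `smoothing` neighbors on either side (fewer near the edges)."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)
        hi = min(len(values), i + smoothing + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical yearly values for one word; NOT real Ngram data.
raw = [0.0, 3.0, 0.0, 3.0, 6.0]

print(smooth(raw, 0))  # smoothing of 0 returns the raw values
print(smooth(raw, 1))  # [1.5, 1.0, 2.0, 3.0, 4.5]
```

With a smoothing of 1, the jagged up-and-down pattern in the raw values becomes a gentler upward trend, which is the effect students see when they raise the smoothing setting in the Viewer.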

Interpreting results correctly, and manipulating the tool in ways that maximize learning, requires more background. Below are some helpful links:

Jean-Baptiste Michel and Erez Lieberman Aiden, both instrumental in the tool’s creation, present an enticing introduction to Ngrams, including some tips for using the technology and some hints at its possibilities. Their TED talk offers an inspiring starting point.

http://www.ted.com/talks/what_we_learned_from_5_million_books.html

The following video by Tekzilla provides a less ingenious but shorter overview of Google Ngrams. This could be useful for introducing the tool to students in a limited amount of time.

https://www.youtube.com/watch?feature=player_embedded&v=gQZydEZ6G0U

Google itself provides a fairly clear outline of how to set the parameters, along with answers to frequently asked questions, here:

http://books.google.com/ngrams/info

Finally, this site offers some practical tips and cautions for approaching the corpus, or data set. For instance, it notes that the most reliable values are those gathered for English between 1800 and 2000.

http://www.culturomics.org/Resources/A-users-guide-to-culturomics


 Classroom Examples:

Teacher thoughts

Introduced at the end of 2010, Google Ngrams does not yet have many solid examples of its actual use in the classroom. However, it has clearly captured the interest of the English teachers-who-blog crowd, several of whom have intriguing suggestions for its use, albeit no follow-up on whether they successfully used it in their own classrooms.

Peter Pappas suggests using the viewer as an introduction to research methodology. Because of the ease with which students can manipulate the data, they could practice making testable hypotheses with the tool in less than a class period.  Pappas does not have a write-up of actual teaching experience.

http://www.peterpappas.com/2010/12/how-to-quantify-culture-google-ngram-viewer-explore-500-billion-published-words.html

This blogger has vocabulary-related suggestions. For instance, students can view the earliest use of a word and comment on whether its usage has changed.

http://teachingwithinfographics.wordpress.com/2011/06/09/google-books-ngram-viewer/

Coolcatteacher had students look up their names to start off. This is the first instance I could find of a secondary English teacher actually implementing the technology. A teacher commenting on her blog suggested tracing words coined by Shakespeare and Chaucer.

http://coolcatteacher.blogspot.com/2011/09/new-authentic-research-frontier-google.html

College professor bloggers reported some limited use of the tool. One used it to graph the use of the word “Caucasian,” while another suggested it to a student for thesis research.

 

http://hastac.org/blogs/derekattig/whiteness-different-color-google-ngram-version

If a teacher were to compile a good lesson plan, this site would be a good place to share it:

http://www.google.com/apps/intl/en/edu/lesson_plans.html

Teachers have already shared various lesson plans there that use Google Apps such as Google Maps or Google Docs.

In addition to these “leads” on using Google Ngrams in the classroom, there are many existing lesson plans on language change in which Google Ngrams could very easily be incorporated.

Example 1

       “Language Change: The Origin of Names,” from TeachLing. http://www.teachling.wwu.edu/node/35

This lesson introduces students to the idea that English borrows words from different languages, and allows students to speculate about why English borrowed from certain languages. The teacher provides students with the etymology of their names, and students discuss patterns. Incorporating Google Ngrams into this activity could add a different dimension. Students could see when their names became popular in print and compare the popularity of their names over time.

Example of graphed name: http://mjhofe.people.wm.edu/modules/NameNgram.png

 Example 2

      “Linguistic Application in the Classroom: Introducing Language Exploration” from North Carolina State University http://www.ncsu.edu/linguistics/filsoncurriculum.php

This 9th grade language unit, aligned with North Carolina state standards, offers linked lessons introducing students to the fluidity and variance within a language. While the lessons are structured and interwoven, Google Ngrams could be a useful tool to introduce in “Day 3: Language, It’s Alive!” This lesson includes an activity in which students must compare dictionaries from different decades to understand that language change still occurs today. If teachers do not have dictionaries spanning different decades, or if students are struggling and may shy away from a dictionary-centered activity, teachers could have students manipulate Google Ngrams to make the same point.

Example 3

“Language Constantly Changes,” from Washington Post Integrated Curriculum https://nie.washpost.com/nielessonplans.nsf/0/EB1F98C3208AF36A852570FC0081D86A/$File/LanguageFinal.pdf

This lesson plan includes articles that students read about language change. A brief Google Ngrams search after reading an article could illustrate its point. For instance, after reading “Language Constantly Changes to Fit Our Needs and Interests,” students could search for terms the article references, such as “quidditch” and “spam.”

Example 4

 “The Punny Language of Shakespeare,” by Jan Madden, from PBS In Search of Shakespeare http://www.pbs.org/shakespeare/educators/language/lessonplan2.html

Madden’s lesson plan aims to make Shakespeare’s language more accessible to students, an imperative for engaging students in his works. In one part of her lesson plan, she suggests familiarizing students with the different words used by both Shakespeare and his Elizabethan contemporaries by having them look through an online glossary. Incorporating Google Ngrams can give students a better sense of the change in language between the 1500s and today by allowing them to enter unfamiliar words and trace them over time. They can also, as suggested by a teacher blogger, enter words whose coinage is attributed to Shakespeare to appreciate his impact on their language.

 

 


 

Assessing ... for the Classroom:

Pros

Google Ngrams offers a novel way to engage students. Teacher-created examples, or examples from the TED talk, will quickly intrigue students, even if a teacher simply incorporates one of these graphics into his or her presentation of material.

Google Ngrams is easy to use. Students can easily test the patterns they suspect or wonder about. As Peter Pappas noted, the tool’s potential for student research is large. Students can authentically test hypotheses. By connecting students to such a large quantity of data and providing them with the means to manipulate it, the tool offers an intrinsically motivating entry point to begin talking about research and empirical support without becoming bogged down in data collection or statistical analysis.

Google Ngrams is quick and results are immediate. Student use of this tool could be incorporated into an existing computer lab activity, so testing it out in the classroom is not a huge time liability.

The focus of the tool on the empirical and concrete may introduce students to a different dimension of English and language study, engaging students who may usually dislike English.

Cons

It’s very easy for students to misinterpret data, as Michel and Aiden wittily demonstrate in their TED talk with their comparison of “best” and “beft.” For this to reach its potential as a meaningful tool, teachers would need to spend time determining student thinking behind the conclusions they draw and probing students to find logical errors. For instance, teachers should be wary of assertions that a change in word frequency signifies a shift in what a culture thinks about or is concerned about, as it could just mean that the culture now favors other terms to describe a concept.  The lesson plans examined use the viewer primarily for exploring how language itself shifts, so this phenomenon is not as problematic. Teachers should also point out that the tool can only measure what has been printed, which may not always reflect what the general public is talking about.

The corpus suffers from optical character recognition (OCR) errors, so that words or phrases within a text may have been incorrectly transcribed by equipment. This skews graphs and could potentially confuse students.

Scaling may frustrate or confuse students. For instance, if you compare an incredibly popular search term, such as Shakespeare, with a well-known but significantly less frequent one, such as Zora Neale Hurston, Hurston becomes a flat line at the bottom of the graph.
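
The scaling problem can be shown with a few lines of Python. This is only a sketch: the numbers are invented for illustration (they are not real Ngram values for any term), and the function names `share_of_axis` and `normalize` are hypothetical helpers, not part of the Viewer.

```python
def share_of_axis(series, axis_max):
    """Fraction of the vertical axis each point reaches on a shared linear scale."""
    return [value / axis_max for value in series]


def normalize(series):
    """Rescale a series so its own peak equals 1.0, exposing its shape."""
    peak = max(series)
    return [value / peak for value in series]


# Hypothetical occurrences per billion words; NOT real Ngram data.
popular = [2000, 2500, 3000, 4000]
rare = [10, 20, 40, 40]

# When both terms share one graph, the axis stretches to fit the larger series.
axis_max = max(popular + rare)

print(share_of_axis(rare, axis_max))  # [0.0025, 0.005, 0.01, 0.01]
print(normalize(rare))                # [0.25, 0.5, 1.0, 1.0]
```

On the shared axis, the less frequent term never rises above 1% of the graph's height, so it looks flat even though it quadrupled. Searching for that term on its own graph, which is what `normalize` imitates, makes the change visible again.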


 

Considerations for Teachers:

Provide students with a framework for using the tool, rather than simply turning them loose with it. While they can certainly experiment on their own, guiding them through one approach and talking about the results may give them more ideas for their own tests and more familiarity with the tool.

-Because the corpus is most reliable for English between the years 1800 and 2000, consider activities that could fit within this time span. For instance, many terms I searched, such as “equality” and “Civil Rights,” spiked right after 1960.

-If using Google Ngrams specifically to illustrate language change, consider integrating this with other things your class is currently studying. For instance, the fourth featured lesson plan integrates Shakespeare and language change. For a unit on WWI or WWII poetry, the focus could be on terms that relate to warfare.

Explain the parameters and variables to students. For example, explain to them that the smoothing tool shows trends more clearly by presenting an average rather than a raw data point, and show how changing the smoothing value changes the graph. Show how the tool searches different bodies of data by switching between American English, British English, and English Fiction. An understanding of these settings will give them more ownership over their own projects.

-Along these lines, have students search in the other available language corpora. The first lesson plan, which uses the etymology of student names, could benefit from this. This could also appeal to ESL learners if their language is represented.

Caution students against using the tool to search for obscenities or words that they would not be allowed to use in school, as this might be tempting for them.

If time permits and Google Ngrams is a large part of the lesson, the TED talk might be worthwhile student viewing. The speakers give a motivating picture of the tool’s potential and add to a discussion of language change by taking such a “big picture” view of culture and cultural change.

Thanks to the innovativeness of Google, we can expect this tool to rapidly evolve. Continue monitoring its development for improvements in the data set and increased or improved features.

 

 

 

 

