Twitter Dialects - Carnegie Mellon University
Microbloggers may think they're interacting in one big Twitterverse.
But researchers at Carnegie Mellon University's School of Computer Science find regional slang is as common in tweets as it is in conversations.
Postings on Twitter reflect some well-known regionalisms: Southerners' "y'all" and Pittsburghers' "yinz." There are also the usual regional divides in references to soda, pop and Coke.
Jacob Eisenstein, a post-doctoral fellow in CMU's Machine Learning Department, says more regionalisms are evolving within the medium.
It shows in the automated method he and colleagues developed for analyzing Twitter word use.
In northern California, something cool is "koo." In southern California, it's "coo." In many cities, something is "sumthin," but New York City tweeters favor "suttin."
Eisenstein said some of this usage clearly is shaped by the 140-character limit of Twitter messages. But geography's influence also is apparent.
Automated analysis of Twitter message streams offers linguists an opportunity to watch regional dialects evolve in real time.
"It will be interesting to see what happens. Will 'suttin' remain a word we see primarily in New York City, or will it spread?" Eisenstein asked.
The CMU team used a statistical model to recognize regional variation in word use and topics.
The model predicted the location of a microblogger in the continental U.S. with a median error of about 300 miles.
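As a rough illustration of what a "median error" in miles means, the sketch below scores a set of location predictions against known coordinates. It is not the CMU team's model or evaluation code. The function names, the haversine great-circle distance, and the Earth radius constant are assumptions chosen for this example.

```python
import math
from statistics import median

# Mean Earth radius in miles (an assumption for this sketch).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def median_error_miles(predicted, actual):
    """Median distance between predicted and true (lat, lon) pairs."""
    errors = [
        haversine_miles(p[0], p[1], a[0], a[1])
        for p, a in zip(predicted, actual)
    ]
    return median(errors)
```

A median is less sensitive than a mean to a few wildly wrong predictions, which may be one reason to report it for a task like this.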
Eisenstein recently presented the study at the Linguistic Society of America annual meeting in Pittsburgh. The paper is available online.
For this study, Eisenstein and his co-authors collected a week's worth of Twitter messages in March 2010.
They selected geotagged messages from Twitter users who wrote at least 20 messages. That yielded a database of 9,500 users and 380,000 messages.
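The selection step described above can be sketched as a simple filter. This is a minimal illustration, not the researchers' pipeline. The message layout (dictionaries with hypothetical `user`, `lat` and `lon` keys) is an assumption, as is the choice to count only geotagged messages toward the 20-message threshold.

```python
from collections import defaultdict

# Threshold taken from the article; everything else here is illustrative.
MIN_MESSAGES = 20


def select_users(messages, min_messages=MIN_MESSAGES):
    """Group geotagged messages by user and keep users with enough of them.

    Each message is assumed to be a dict with "user", "lat" and "lon" keys,
    where "lat"/"lon" are None (or absent) when the message is not geotagged.
    """
    by_user = defaultdict(list)
    for msg in messages:
        if msg.get("lat") is not None and msg.get("lon") is not None:
            by_user[msg["user"]].append(msg)
    # Drop users who fall below the minimum message count.
    return {user: msgs for user, msgs in by_user.items()
            if len(msgs) >= min_messages}
```

The reported totals work out to an average of 40 messages per user (380,000 divided by 9,500).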
Eisenstein was joined by co-authors Eric P. Xing, Noah A. Smith, and Brendan O'Connor.
Xing is an associate professor of machine learning. Smith is an assistant professor in the Language Technologies Institute (LTI). And O'Connor is a machine learning graduate student.
The research was supported, in part, by funding from Google, the Air Force Office of Scientific Research, the Office of Naval Research, the National Science Foundation and the Alfred P. Sloan Foundation.