Intro to Text Analysis in R
March 22 @ 12:00 pm - 1:30 pm
Working with text is an important part of many academic and industry projects. Whether your aim is to understand public reactions to the latest political event on Twitter, determine the appropriate target demographic for your product, or analyze DNA sequences, manipulating and restructuring text is a fundamental task. Fortunately, recent developments in the R programming environment have made many of these tasks easier. The tidytext package, which is designed to work with the tidyverse collection of packages, provides an intuitive interface for manipulating text data. In this workshop, you will use tidytext to take unstructured song lyrics, wrangle them into a “tidy” format for analysis, and perform a basic sentiment analysis on the data. The focus will be on getting text data into a format ready for analysis, regardless of the software you plan to use for the analysis itself (although R has strong capabilities in this area too). This is an intermediate-level workshop: we expect participants to be familiar with R and the tidyverse, including pipes (%>%) and the dplyr functions select(), filter(), mutate(), group_by(), and summarize().
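
For a rough sense of the workflow described above, here is a minimal sketch, not the actual workshop material. The lyrics are invented for illustration, the object names (lyrics, tidy_lyrics, sentiment_counts) are hypothetical, and the choice of the "bing" lexicon is an assumption; the workshop may use different data and a different lexicon.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical lyrics, invented for illustration: one row per line of a song
lyrics <- tibble(
  line = 1:3,
  text = c(
    "I love the happy sunshine",
    "but the sad rain makes me cry",
    "still I smile"
  )
)

# Wrangle into a "tidy" format: one row per word (lowercased by default)
tidy_lyrics <- lyrics %>%
  unnest_tokens(word, text)

# Basic sentiment analysis: keep only words found in the lexicon,
# then count positive and negative words
# (assumption: the "bing" lexicon is used)
sentiment_counts <- tidy_lyrics %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(sentiment)

print(tidy_lyrics)
print(sentiment_counts)
```

Because tidy_lyrics is an ordinary data frame with one word per row, it can also be written out with write.csv() and analyzed in other software.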