Digital Research Institute – My Experience

This is a guest post by Jenna Freedman, a librarian at Barnard College and Master’s student at the Graduate Center. Jenna writes about her experience at the January 2017 Digital Research Institute. Follow her on Twitter.

Going into the Digital Research Institute, my fear was that I’d be the biggest doofus in the room, the one everyone would be waiting for to catch on or catch up. I’m grateful to say that fear was not realized. I felt like the cohort was selected, or self-selected, to have comparable skills and aptitudes for learning new technology tools. Even better, I think that even if I had turned out to be someone who needed a lot of help, it would have been a safe and comfortable environment for that. There were enough digital fellows to go around, and they were all astonishingly patient and helpful.

We stuck color-coded sticky notes on our computers to indicate if we needed help, and later in the week, if we needed a break. Public domain sticky note photo from OpenGridScheduler on Flickr.

The four-day institute was designed so that each workshop built on the last. We started with an introduction to the command line, taught by Mary Catherine Kinniburgh. I’ve tried to learn the command line before, but never with as much success. It seems super obvious now, but Mary Catherine’s instruction made it clear that creating folders and files and moving them around in a GUI is the same thing as doing it on the command line. What was once abstract was now concrete! MC, like all the other fellows, was invested in our learning, kind about it, and funny, too.
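For anyone who hasn’t seen the mapping, here’s roughly the kind of thing we practiced, with the GUI equivalents in the comments (the folder and file names are just made up for the example):

```bash
# Make a new folder (File > New Folder in a GUI)
mkdir workshop

# Move into it (like double-clicking the folder)
cd workshop

# Create an empty file (File > New Document)
touch notes.txt

# Copy it, rename the copy, and list what's in the folder
cp notes.txt backup.txt
mv backup.txt old-notes.txt
ls
```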

We put what we’d learned on the command line to work in the next session, on Git and GitHub, led by JoJo Karlin, who has been the fairy godmother of the DH Praxis class. Git and GitHub are two different things! Who knew? I’ve had a GitHub account for ages, but I’d never tried to tackle it from the command line, which means I couldn’t do very much with it.
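For the curious, the basic cycle we practiced looks something like this. The repository URL is a placeholder, not the institute’s actual repo:

```bash
# Git is the version-control tool on your own machine;
# GitHub is the website that hosts copies of Git repositories.

# Download a repository from GitHub
git clone https://github.com/username/repository.git
cd repository

# Stage a changed file and commit it with a message
git add notes.txt
git commit -m "Add my workshop notes"

# Send the commit back up to GitHub
git push origin master
```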

Next up was Python, enthusiastically shared by Patrick Smyth and Rachel Rakov. The rest of the week would incorporate Python. It was just a taste, and at this point I couldn’t do anything with Python without following instructions with screenshots and circles and arrows, but it was a good foundation. Another thing that seems ridiculously easy to programmers but was an essential lesson for us n00bs: how to toggle back and forth between the plain old command line (known by many names, but I’ll use Bash here) and the Python interpreter. Cmd/Ctrl L 4 ever! Pardon my enthusiasm, but I really love keyboard commands. (Cmd/Ctrl K and Cmd/Ctrl Z are some other favorites.)
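For my fellow n00bs, the toggle looks something like this at the terminal, where $ means Bash is listening and >>> means Python is:

```
$ python        # from Bash, start the Python interpreter
>>> 2 + 2       # now you're talking to Python, not Bash
4
>>> exit()      # and back to Bash (Ctrl D works too)
$
```

(Yes, Cmd/Ctrl L clears the screen at both prompts.)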

Attendees at the 2017 Digital Research Institute.

I think all of the students’ attention and ability to absorb information waxed and waned, depending on the time of day and what was in our stomachs, as well as the content of the workshops. One of my wane times was the Databases workshop led by Ian Phillips. I’d even taken a similar workshop during the fall semester, and yet I still have a hard time making the connection between the commands and the rows and columns. My table neighbor was all over it, and I was jealous of her comprehension. I hope the next time I work with MySQL it will finally click the way the command line did on the first day.
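Writing this post, I tried to reconstruct the connection for myself. Here’s a minimal sketch, with a table and columns I invented for the example rather than the workshop’s actual data:

```sql
-- The columns are defined once, when the table is created
CREATE TABLE books (id INT, title VARCHAR(200), year INT);

-- Each INSERT adds one row
INSERT INTO books VALUES (1, 'Moby Dick', 1851);
INSERT INTO books VALUES (2, 'Jane Eyre', 1847);

-- SELECT pulls rows back out, filtered by column values
SELECT title FROM books WHERE year > 1850;
```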

Rachel Rakov, the passionate Python teacher, made Text Analysis/NLTK fun by choosing a varied library of texts to work with, like a Monty Python script and Moby Dick. Maybe that’s a standard library, but it was new to us and fun to compare. One of my biggest takeaways from this session was how large a percentage of the dataset you need to set aside for training the tool versus testing it: 75-80%! I found myself being critical of what someone may do with a tool developed from the sample we used. I’m not clear on whether the technique is developed for that corpus only, or whether it’s meant to be applied elsewhere, even if the one corpus is composed of writings that don’t represent multiple demographics. I was still brain fatigued, and my notes don’t make a ton of sense to me. I’d like to repeat this workshop and then take a part two, as text analysis is highly appealing to me.
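The training-versus-testing split is the part of my notes I can still reconstruct. Here’s a sketch of the general approach using NLTK’s movie reviews corpus, which is not the corpus we used, just one that’s easy to get:

```python
import random
import nltk
from nltk.corpus import movie_reviews

nltk.download('movie_reviews')

# Pair each review's words with its label (positive or negative)
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)

def bag_of_words(words):
    # The simplest possible features: which words appear at all
    return {word: True for word in words}

featuresets = [(bag_of_words(words), label) for words, label in documents]

# Set aside 80% of the data for training the tool; test on the rest
split = int(len(featuresets) * 0.8)
train_set, test_set = featuresets[:split], featuresets[split:]

classifier = nltk.NaiveBayesClassifier.train(train_set)
print(nltk.classify.accuracy(classifier, test_set))
```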

The clearly brilliant Hannah Aizenman led the ultimate brain-smushing workshop: Machine Learning, aka Quantitative Analysis. She used the Titanic manifest as her dataset, which I thought was hilarious, especially given that it was Inauguration Day, something about which many? most? all? of us in the room had a deeply sinking feeling. I don’t know if it was the timing, or if I’d reached the limit of what my brain could absorb, but I only kept up for the first half of the workshop.
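From what I can piece together from my notes, the first half went something like this. This is a sketch assuming a titanic.csv with the usual passenger columns, and definitely not Hannah’s actual notebook:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load the manifest and encode 'Sex' as a number the model can use
df = pd.read_csv('titanic.csv')
df['Sex'] = (df['Sex'] == 'female').astype(int)
df = df.dropna(subset=['Age'])  # drop passengers with no recorded age

features = df[['Pclass', 'Sex', 'Age', 'Fare']]
labels = df['Survived']

# The same train/test discipline as in the text analysis session
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2)

model = LogisticRegression()
model.fit(X_train, y_train)

# Share of test passengers whose survival the model predicts correctly
print(model.score(X_test, y_test))
```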

I was glad that the final training of the day, making Twitter bots, team-taught by Patrick Smyth and Steven Zweibel, was pretty straightforward. The session was a fun and silly introduction to APIs, with several references to a certain political candidate whose followers had made much use of Twitter bots. We made our bots with Python, and it was satisfying to come back to the Python command line after using Jupyter Notebook for the previous sessions. It almost felt easy!
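The whole bot fits in about a dozen lines. Here’s a sketch using the tweepy library; I don’t remember exactly which library we used, and the credentials below are placeholders you’d get by registering an app with Twitter:

```python
import random
import tweepy

# Placeholder keys; never put real credentials in a public repo!
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET')
api = tweepy.API(auth)

# The bot itself: pick a line at random and post it
lines = ['Beep.', 'Boop.', 'I am a very good bot.']
api.update_status(random.choice(lines))
```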

I was so brain-tired after the week that I failed to make it to the Women’s March the next day, instead sleeping until past noon. I intended to go through all of the exercises in the repository again afterward, but because I moved apartments and the rest of my life busted back in, that never happened. I’m grateful that Patrick asked me to write this blog post so I could get a chance to revisit the work I did at the institute. I would definitely attend again and recommend it highly to fellow GC students and CUNY professors.