PDIG Training Grant Spotlight: Yoshiko Oka

By: Yoshiko Oka

I was selected as one of the recipients of the Provost’s Digital Innovation Training Grant for the 2018-2019 academic year, which enabled me to attend the “Python and Machine Learning Bootcamp Series” at General Assembly. This blog post is about my experience with this program.

As a doctoral student in economics, I conduct research that involves collecting data and running regressions. We usually use the statistical software Stata to run regressions; however, some of the datasets we use are so complicated that Stata does not have the functionality to handle them. This situation made me realize that knowing a programming language such as Python or R could be a huge help when conducting these special types of analysis.

Unfortunately, our economics program does not offer courses that involve Python coding, so I began teaching myself Python and machine learning, which I hoped to incorporate into my research in the future. I still looked for in-class training, but the courses offered by ed-tech companies are usually very pricey. The Provost’s Digital Innovation Training Grant gave me the opportunity to take an in-class bootcamp in Python coding called the “Python and Machine Learning Bootcamp Series,” which is an intensive workshop offered by the ed-tech company General Assembly.

This workshop was a two-day bootcamp scheduled from 10 am to 5 pm on Saturday and Sunday, located in the General Assembly building on 21st Street in Manhattan. We were required to bring a laptop with Anaconda, which we were asked to install in advance. This introductory course included about 15 students, most of whom had never touched Python and had no experience with other programming languages. The instructor started by teaching us how to use Jupyter Notebook and some of its basic functions. Then, he moved on to teaching NumPy and pandas. The class was very beginner-friendly: the instructor taught some functions and then students tried exercises using those functions on their laptops. While students were working, the instructor went around the room to make sure everyone got the same result and no questions went unanswered. By the end of the first day, we were able to download a large dataset and had learned how to run regressions and plot the data. Since I had taught myself Python and was already using it, the first day was a review of what I already knew. Still, I really enjoyed the atmosphere of the workshop and the interaction with the instructor, who was a very knowledgeable data scientist. That interaction was something I was not able to get from self-teaching or taking online courses.
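To give a flavor of a first-day exercise, here is a minimal sketch of loading data into pandas and running a simple regression with NumPy. It is not the workshop's actual material: the dataset is synthetic, and the column names (`education`, `wage`) and the helper `fit_ols` are hypothetical names made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data standing in for a downloaded dataset (assumption:
# the real exercise used a different, larger dataset).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"education": rng.uniform(8, 20, size=n)})
df["wage"] = 5.0 + 1.5 * df["education"] + rng.normal(0, 1.0, size=n)


def fit_ols(data, y_col, x_cols):
    """Ordinary least squares with an intercept (hypothetical helper)."""
    # Design matrix: a column of ones for the intercept, then the regressors.
    X = np.column_stack(
        [np.ones(len(data))] + [data[c].to_numpy() for c in x_cols]
    )
    y = data[y_col].to_numpy()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pd.Series(coef, index=["intercept"] + list(x_cols))


coefs = fit_ols(df, "wage", ["education"])
print(df.describe())  # summary statistics for each column
print(coefs)          # estimated intercept and slope
```

Because the data are generated with a slope of 1.5 and an intercept of 5.0, the estimates should land close to those values, which is a quick way to check that the regression is set up correctly.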

The second day focused on machine learning: we learned the basics of setting up, implementing, and evaluating machine learning models using scikit-learn. The day covered the basic concepts of training and test sets, with hands-on exercises that included K-fold cross-validation. The curriculum also covered L1 and L2 regularization.
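The second-day topics fit together roughly as in the sketch below, which holds out a test set, runs 5-fold cross-validation on the training set, and compares an L1-regularized model (Lasso) with an L2-regularized one (Ridge) in scikit-learn. This is an illustration under my own assumptions, not the workshop's code: the data are synthetic, and the `alpha` values and variable names are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic data: only the first two of ten features matter (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
true_coef = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coef + rng.normal(scale=0.5, size=300)

# Hold out a test set that is never used during cross-validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "lasso_l1": Lasso(alpha=0.1),  # L1 penalty: can set coefficients to zero
    "ridge_l2": Ridge(alpha=1.0),  # L2 penalty: shrinks coefficients toward zero
}

cv_r2, test_r2 = {}, {}
for name, model in models.items():
    # Mean R^2 across the five folds of the training set.
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="r2")
    cv_r2[name] = scores.mean()
    # Refit on the full training set, then score once on the held-out test set.
    model.fit(X_train, y_train)
    test_r2[name] = model.score(X_test, y_test)

# Count how many Lasso coefficients were driven exactly to zero.
n_zero_lasso = int(np.sum(models["lasso_l1"].coef_ == 0))
print(cv_r2, test_r2, n_zero_lasso)
```

The practical difference between the two penalties shows up in the coefficients: Lasso tends to drop irrelevant features entirely, while Ridge keeps every feature with a smaller weight.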

My honest feeling about the workshop overall is that combining an introduction to Python coding and machine learning into a two-day bootcamp may not be ideal for most students. For absolute beginners with no prior knowledge of Python, the course would be very challenging. Furthermore, students may not retain much after learning basic Python coding in one day and then studying machine learning the next. Someone who already knows basic Python would learn more new techniques from a course that focuses on machine learning and covers a wider variety of topics. But the quality of this workshop was great, and the instructor was fantastic, so I would recommend taking courses at General Assembly. I also recommend choosing a course that matches the student’s level.

It has been a few months since I participated in the workshop, and I have been using Python to work on the data for my dissertation. Knowledge of Python coding has given me more flexibility and has made it easier to deal with any kind of dataset. Since I am in the economics program, I’m surrounded by people with a lot of knowledge of statistics and data analysis. However, when it came to Python coding, I didn’t know whom to ask. I appreciate the Provost’s Digital Innovation Training Grant for giving me the opportunity to take a formal training class on Python, because the knowledge I gained from the experience has expanded the range of research topics I can pursue in the future.