December 11, 2018
The Data Open is a series of competitions we host at universities around the world to allow students to work through complex datasets and showcase their problem-solving skills. At each event, participants present their approach, findings, and insights to a panel of judges. We facilitated more than 20 Data Opens in 2018 and we’re thrilled to bring you advice from some of the winners over the next few months. We recently spoke with two-time Data Open winner Jason Liang. Jason is currently pursuing his PhD in Operations Research at the Massachusetts Institute of Technology (MIT). Read below for excerpts from our conversation with Jason, lightly edited for style, to learn about how he first discovered his passion for data science, how he approached the assigned problem at the Data Open, and what advice he has for future Data Open participants.
Jason, how did you first become interested in data science?
My first exposure to the field of data science was at Columbia University, where I delved into computational biology and machine learning research during my undergraduate years. What was most exciting to me was unearthing insights from massive amounts of data. I felt rewarded when I could get past surface-level conclusions and discover the dynamics that truly drove behavior.
Can you describe what aspect of your studies interests you the most?
At MIT, I’m focused on machine learning, as well as game theory. What’s become most interesting to me is how these two research methods differ. For example, when leveraging machine learning, you’re generally given data that you run through a model in order to receive a result. But with game theory, you are able to design a mechanism to identify agents that might be “untruthful” and train the data to capture these unfamiliar agents in order to get a truthful result.
Additionally, through my studies in operations research, I’m focused on the decision-making process that occurs after you run data through a model, which to me is more interesting than the actual data.
What intrigued you about the Data Open?
I first heard about the Data Open through my involvement with a student organization at Columbia University. I was drawn to the unique format of the Data Open, which requires you to submit a comprehensive report detailing your approach to the problem and the results you produced.
While I looked into other competitions, I particularly enjoyed the Data Open because it offered me and my peers the opportunity to address real-world problems with real-world data. To be successful, I had to think through defining the problem, contextualizing the data, and testing specific conclusions.
Beyond this, the Data Open provided me the opportunity to meet interesting people with diverse data science backgrounds, and I ended up making friends through the competition. It was fascinating to see how other people approached the same problem.
What advice would you give to future participants?
I would encourage future participants to think outside of the box and leverage their team’s unique strengths and skillsets. A lot of people tend to use a standard, well-established machine learning package to analyze the data. However, my teammates all had diverse backgrounds and we were successful in incorporating our own academic research into the analysis to discover a novel approach.
At the same time, it’s important to convey how you solved the problem in a clear and concise way while producing your report. I would encourage participants to set aside significant time to think about how they’re going to communicate their conclusions; communication is almost as important as the conclusions themselves. We found success in clearly explaining the models we used and the results that were produced in a non-technical way. We believe this helped when presenting to people who might not have familiarity with the methods used.
Lastly, it was invaluable to talk with other participants and learn how they solved the problem and understand their methods. You’re able to expand your own knowledge and can incorporate these new disciplines into your own research, studies, and career after the competition.
Do you have a passion for tackling the most interesting and critical changes that impact the global markets? Learn more about the Data Open and how you can compete amongst the brightest intellectual athletes here.