Project Luther
Lessons from Luther
Project Luther has finally concluded, and it was a fantastic learning experience. While I wasn’t able to get a good R² score, I learned a few things that I would tell anyone starting this project:
Get Started Quickly
This is the most important step and, in retrospect, the best thing I did. Baseball Reference was one of the recommended websites, and I immediately jumped into pulling the pages and data I needed. By the middle of the first week, I was already pulling the data I wanted, which gave me a good baseline from which to pivot if I needed to. When I realized that my original idea wouldn’t be sufficient, I went back to my source and was quickly able to adapt my code to get new data points to combine with the data I already had.
You’re going to want to change ideas
Continuing from the previous point, halfway through the project I found a new data source that would have provided more advanced statistics that Baseball Reference did not have. I investigated how difficult it would be to switch, using the time my original pull had taken as a baseline for estimating the pivot, and after some initial discovery I opted to stick with my original source. While the new source had the data I originally wanted, I had already sunk a lot of time and energy into pulling data from Baseball Reference. Additionally, I had no baseline for estimating how long the rest of the project would take. I would have liked to assume it would be smooth sailing from there, but I understand the principles behind the planning fallacy, so I decided to move forward with what I had.
You don’t know what you don’t know
This is again linked to the previous point. It’s dangerous to overestimate your capabilities, especially early in the bootcamp. Delays and distractions come up and will impede your progress. Whether it is a code issue or a hardware issue, these setbacks will impact you significantly. In my case, I had to repeat the process of pulling data from the site multiple times (it took 8 hours to pull one of my data tables). In one instance, I did not have a try block to catch errors and the script fell over quickly. In other instances, I had made changes to my code and did not account for them when I ran it later. It took a significant amount of troubleshooting to figure out why I was having issues, and it delayed and, more importantly, distracted me from my end goal.
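For anyone facing a similarly long pull, here is a minimal sketch of the kind of loop I wish I had started with. It is not the code from my repo: the names scrape_all, fetch, and out_path are hypothetical, and fetch stands in for whatever function requests and parses a single page. The idea is simply to wrap each page in a try block, write every result to disk as soon as it arrives, and skip pages that an earlier run already saved.

```python
import json
import os


def scrape_all(urls, fetch, out_path):
    """Fetch each URL, saving every result as soon as it arrives.

    `fetch` is a hypothetical callable that takes a URL and returns a dict;
    in a real scraper it would wrap the HTTP request and the HTML parsing.
    """
    # Reload whatever an earlier run already saved so a restart skips it.
    done = set()
    if os.path.exists(out_path):
        with open(out_path) as f:
            done = {json.loads(line)["url"] for line in f if line.strip()}

    failed = []
    with open(out_path, "a") as out:
        for url in urls:
            if url in done:
                continue
            try:
                row = fetch(url)
            except Exception as exc:  # one bad page should not end the run
                failed.append((url, repr(exc)))
                continue
            out.write(json.dumps({"url": url, "data": row}) + "\n")
            out.flush()  # push each row out right away instead of holding it in memory
    return failed
```

The function returns the list of pages that failed, so they can be inspected and retried afterwards. Running it a second time with the same output file only fetches the pages that are still missing, which would have saved me from repeating an 8 hour pull from scratch.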
Keep your eye on the prize
At the end of the day, any project needs to accomplish two things: it needs to be thorough and it needs to be actionable. The bootcamp is designed to get you ready for real-world business problems, and keeping that in mind will help drive you toward the results you want. While you can dive into rabbit holes and manipulate your data to get the highest R² value, the results then become harder to interpret for someone who is not as technical, which increases the difficulty of any project you do. By focusing on interpretability, you can simplify your presentation, and your audience will be able to take your findings and draw effective conclusions. In the end, always subscribe to the phrase, “Keep it simple, stupid.”
Link to Repo: https://github.com/stokvis4/projectLutherGit