Building on solid foundations

When I started grad school, a good professor of mine warned me about the risk of a “lack of foundations”. As a young econ student, I initially registered this advice as fair yet almost obvious. It took me a good ten years to fully appreciate its implications.

Living in a time when change has never been faster and good resources are so plentiful, setting out to master a technical skill can feel overwhelming. So many things to know, so much change in tools and techniques, so many smart people whose skills appear to be out of reach. Impostor syndrome at its best!

At the same time, the wide and ill-defined world of data science attracts talented individuals from wildly different backgrounds: from computer science to research, philosophy and everything in between. Everyone brings unique skills and perspectives to the table and, almost certainly, everyone has gaps they need to fill. There is no shame in identifying and addressing them as best we can.

Here is a little secret: even experienced professionals have their own gaps. Perhaps someone entered the field through their research skills and needs to brush up on databases or, conversely, someone is a strong software developer who feels intimidated by advanced algorithms.

In my case, I come from a research background. I use SQL frequently in my work, I use R extensively for data analysis, and I’ve delivered one international project in Python. Nonetheless, I still very much feel that there is more I can do to get to the bottom of this thing called data science.

So, here is a potential solution for dealing with the chaos: invest in one’s foundations.

This approach is, in my view, the safest way to ensure that our skills won’t depreciate quickly in a fast-evolving market, while shielding ourselves from the omnipresent impostor syndrome.

Ok, cool words, but what does it mean in practice? While searching online, I stumbled upon a post by Danny Ma listing some key skills to learn when approaching data science from scratch. Here I’ll share a slightly modified version of his list:

SQL
Git
Python (as a programming language)
Probability/statistics
A/B testing
Traditional ML
ML experiments
Deep learning
My addendum: cloud technologies, leadership and communication skills

I really like Danny Ma’s approach. There is nothing I find less productive in a data scientist than immediately jumping to a fancy library before analysing the dataset’s features. Danny preaches the exact opposite: always start from a solid understanding of the data; cool techniques come later.

In 2022 I plan to go through this learning cycle for the reasons I’ve explained above. I’ll take the first step by immersing myself in the 8 Week SQL Challenge. I’ll keep you posted on developments.

Here’s Danny’s original post. Check it out!