Common Misconceptions About Data Science
I have been involved in the online data science scene for about three years now. Over that time, I've seen some advice that, in my opinion, is not very good.
While it's a good idea to look for guidance on how to break into data science, you should be careful who you listen to. Ensure they have some credibility in the field and are not trying to sell you something. Also, see if others agree with their claims!
That said, in my experience, many people are genuinely honest about what worked for them; you just have to tailor their advice to your personal situation.
So, in this article, I want to offer my two cents on some advice that may lead aspiring data scientists down the wrong path.
AI Will Take Your Job
This is one I see all the time and often appears in my comments section.
"Don't learn data science because AI will take over the job in a few years."
If AI took over data science jobs, I fail to see what other jobs it wouldn't take over.
- Software engineer? Gone.
- Accountant? Gone.
- Lawyer? Gone.
If AI got so smart that it could do all the mathematical reasoning and logical deduction required to be a data scientist, then literally every other job would be gone too.
You can even argue that data scientists, Machine Learning engineers, and statistics specialists would be the last ones to go as we have the most knowledge about AI systems; hence, we would need to maintain them and keep them ticking over.
However, don't get me wrong: I don't think AI in its current state will take over data science jobs. That would take a sudden AGI breakthrough in the next decade, which I think is highly unlikely.
I doubt the answer to AGI is the cross-entropy loss function!
Current AI systems still struggle with reliable mathematical reasoning, which is arguably the most fundamental skill for a data scientist.
Even the so-called "software engineer killer", Devin, wasn't all it was hyped up to be, and the company has walked back some of its initial claims.
Don't worry about AI if you want to be a data scientist; there are much, much bigger fish to fry before we cross that bridge!
You Don't Need To Learn Maths
Now, don't get me wrong: you certainly don't need a PhD in maths to be a data scientist. Some jobs do list one as a requirement, but that is rare!
But saying you don't need to know any maths at all because all the machine learning libraries handle this for you is a bit reductive and can be damaging for your career.
To be brutally honest, you'll never be a "good" data scientist unless you learn some of the underlying maths. Again, you don't need to be a whiz, but you need grounding in calculus, linear algebra, and probability to understand what you are doing.
Nothing is worse than being asked by a stakeholder or senior manager to explain how your model works or talk through your statistical analysis and not knowing how to respond. I have seen this happen and it is very awkward.
I know it may seem scary, but taking a year or two to learn the maths to a good standard is worth it for the long-term payoff in your career that will span several decades.
So, make sure you learn the required maths. I have a whole article dedicated to the exact things you should know and useful resources that you can check out below.
You Need To Be Smart
Along the same lines, I often hear that you must be super bright to become a data scientist. While many data scientists come from STEM backgrounds, which most people would classify as "smart," this view misses a fundamental component: effort and hard work.
I know that "hard work" is not seen favourably nowadays and may not sound like helpful advice, and I am not promoting a "grind" mindset. However, I believe that consistent effort can achieve anything in data science and many other professions.
It's not glamorous or glitzy, but truth be told, there is no real "secret" to becoming a data scientist. I always say that everything is quite simple; it is just difficult. It's hard to put in consistent effort every day with no guaranteed payoff at the end.
The people who succeed stick to it and have faith that something will work out. It's all about making small incremental gains all the time.
Many of the data scientists I know had to overcome significant challenges to get to where they are today. Some even quit their jobs and went back to university to pursue a data science master's degree.
I am not saying you should do that, but that's the kind of effort and faith you need to put in sometimes to succeed. It is all about risk vs reward.
Focus On Deep Learning & LLMs
Don't go into data science if you only want to work on deep learning and large language models (LLMs) because you will be bitterly disappointed.
It's important to note that only a few companies actually use deep learning and LLMs. Regular machine learning, on the other hand, is sufficient for most tasks and is widely applicable across various domains and companies.
Research labs and domains like recommendation systems do use deep learning, but again, they are the minority.
Additionally, even if you do want to work in these areas, don't start by learning only deep learning or LLMs.
You should first learn the fundamentals of machine learning, statistics, and basic algorithms. You need to gain an understanding of things like:
- Linear and logistic regression
- Bayesian statistics
- Random forest and gradient-boosted trees
- Probability distributions
- Central limit theorem and probability theory
And that's just scratching the surface.
My advice to any beginner data scientist is to start with the basics. These fundamental concepts will give you the most return throughout your career as you will use them most of the time.
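To make one of these fundamentals concrete, here is a minimal sketch of the central limit theorem using only Python's standard library. The exponential source distribution, the sample size of 50, and the 2,000 repetitions are arbitrary choices for illustration, and `sample_means` and `draw_exponential` are hypothetical helper names.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the means of n_samples independent samples of size sample_size."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential(rng):
    """One draw from a heavily skewed distribution (mean 1, std dev 1)."""
    return rng.expovariate(1.0)


# Illustrative assumption: 2,000 samples of 50 observations each.
means = sample_means(draw_exponential, sample_size=50, n_samples=2000)

# The CLT says the sample means cluster around the true mean (1.0)
# with a spread close to sigma / sqrt(n) = 1 / sqrt(50).
print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"std of sample means:  {statistics.stdev(means):.3f}")
print(f"theory predicts:      {1 / 50 ** 0.5:.3f}")
```

Even though the underlying data is far from bell-shaped, the averages behave predictably, which is the reasoning behind a lot of everyday work such as confidence intervals and A/B tests.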
Tech, Then Business
I fell victim to this one too often in my early data science career. I focused too much on the new technology or algorithm I would use to solve the problem instead of just focusing on actually solving the business problem in the most efficient way.
Stakeholders and senior managers literally couldn't care less whether you use reinforcement learning or a simple "if" statement to solve their pain points. As long as it meets their requirements and is a net positive for the business, that's all that matters at the end of the day.
If you are a new data scientist, I want you to focus solely on generating impact for the business in any way possible. Use the best and easiest tools for the job and don't overthink it.
Generating business value is the best way to fast-track your career and get consistent promotions. I know it's not all about money and status, but you don't want to be that guy who knows all the technical details but doesn't know how to apply them in a business setting.
It is good to learn all the fancy algorithms, and they certainly have their place in particular situations. Continuous learning is also how you level up as a data scientist, but make sure you stay focused on the business outcomes along the way.
I have a separate article explaining my advice for getting promoted as a data scientist.
Summary & Further Thoughts
There is a lot of data science advice out there, so you must be careful who you listen to (even me, to be honest!). In general, if anyone is telling you there is a "fast track" to becoming a data scientist, this is often untrue. Like anything in life, there are no shortcuts; you have to put in effort for a long enough time to see the benefits of your work.
Another Thing!
I have a free newsletter, Dishing the Data, where I share weekly tips and advice as a practising data scientist. Plus, when you subscribe, you will get my FREE data science resume and a short PDF version of my AI roadmap!