Almost a decade ago, Harvard Business Review declared that the job of a data scientist is the sexiest job of the 21st century. The word “sexiest” was synonymous with “most desirable/most lucrative”, and data science as a field gained tremendous momentum. What followed was a big influx of professionals from many different fields transitioning into data science.
Data was already available across the value chains of almost all organisations, from sales to operations, from finance to inventory, from HR activities to CSR initiatives. It gained the spotlight as new tools and technologies became increasingly available around the same time and were democratised through open-source platforms like R and Python.
Cut to the present, and the whole landscape has changed drastically. Data scientists, who were once notorious for making workflows “efficient” and “effective” and thereby cutting down some regular, monotonous jobs, are now faced with “Auto-ML” tools that claim advanced machine learning models can be built, operationalised, and monitored with little or no knowledge of machine learning. Some of these tools even go to the extent of claiming that data scientists are not required at all. There is nothing wrong with democratising the whole of data science and enabling everyone to harness the power of data. But the pros and cons of handing data science to everyone in the organisation are beyond the scope of this article. Let’s say that Auto-ML tools are a boon, and everyone can and should use them.
From doing the sexiest job of the century, data scientists now face the threat of tools that can do as well as they can, if not better. Some people might argue that “Auto-ML” tools are not meant to replace data scientists but to facilitate them, and that I got the message wrong. But, if we can be a bit honest here, every now and then there is a tug of war between the “tool data scientist” and the “human data scientist”. C-level executives host a champion-challenger competition between the two. They make the decision based on who is cheaper and more efficient, and a data scientist obviously has no choice but to put the best model forward to continue in the “sexiest” job and be “lusted” after by one and all. In this article, I have tried to list some combinations of hard and soft skills that all data scientists should develop to survive in this human vs tool tussle.
1. LEARNING NEW SKILLS
There is no dearth of advancements in the data science field. It’s always good to keep learning new skills and updating your knowledge. It can never hurt to keep reading in this knowledge industry, and keeping a learning hat on enhances job security.
2. PROJECT SCOPING
Very often, when business teams explain their problem areas to the analytics team, they are not able to articulate exactly what they want, and they are unable to break the problems into projects that can be delivered as analytical solutions. While it’s very morale-boosting to hear about a plethora of business problems that need analytical solutions, it can be a daunting task to convert these into analytical problems, break down the pain points, and prioritise them. Scoping the project completely and clearly defining the objectives and end goal, considering all the hypotheses, is an art that no data scientist can choose to ignore. Project scoping and requirements gathering also require a very good business understanding. And this strategic role has not been automated yet.
3. INSIGHT ANALYTICS
As simple and unglamorous as it may sound, providing insights and being able to tell the data story is another skill that can lead to job security. Understanding what just happened or what is happening, and connecting it to changes in strategy, macro-economic factors, and competitors’ strategy (if I am allowed to daydream), can go a long way. Data visualisation skills come in handy for presenting the story and lay a solid foundation for the model-building exercise as well. Doing data discovery, finding outliers, and being able to connect insights from different variables can be a game changer. Regular reports were automated long ago, but getting the relevant insights is still human-driven.
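As a small illustration of the outlier-finding step mentioned above, here is a minimal sketch using the common 1.5 × IQR rule of thumb. The function name, the multiplier, and the weekly sales figures are all hypothetical and chosen only for illustration; which rule suits a real dataset is a judgment call that still needs a human.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartiles via the standard library; "inclusive" treats the data
    # as the whole population rather than a sample.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical weekly sales figures with one unusual week.
weekly_sales = [102, 98, 105, 110, 97, 101, 99, 250, 103, 100]
outliers = iqr_outliers(weekly_sales)
print(outliers)
```

Flagging the unusual week is the easy part. Explaining whether it was a promotion, a data error, or a competitor’s stock-out is the insight the business is actually paying for.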
4. COLLABORATION
There are often many different teams working together on one solution. If you are building a model for the sales team, it’s highly likely that the pricing and operations teams will get involved as well. The ability to “get things done” by collaborating across different teams, different personalities, and external vendors/collaborators, and to work with everyone without being a headache for the manager, is a great soft skill. The “go-getters” are often managers’ pets, and they enjoy having others depend on them.
5. ANALYTICS TRANSLATION
Analytics translators are much loved and highly regarded. If you can explain the different KPIs of a confusion matrix to a businessperson as simply as you would explain them to a toddler, the next project’s funding gets committed much more easily. To develop this skill, a complete understanding or a very good intuition of the subject matter is mandatory. This can take a lot of research and reading up.
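To make the confusion-matrix example concrete, here is a minimal sketch that turns the four cells of a binary confusion matrix into the usual KPIs. The function name and the counts (a hypothetical churn model scored on 200 customers) are assumptions made up for illustration, not figures from any real project.

```python
def confusion_matrix_kpis(tp, fp, fn, tn):
    """Compute common KPIs from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts: 40 churners caught, 10 false alarms,
# 20 churners missed, 130 loyal customers correctly left alone.
kpis = confusion_matrix_kpis(tp=40, fp=10, fn=20, tn=130)
for name, value in kpis.items():
    print(f"{name}: {value:.2f}")
```

The translation is what matters. With these made-up numbers, precision becomes “of every 10 customers we call, 8 really were about to leave”, and recall becomes “we reach 2 out of every 3 customers who are about to leave”.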
6. KNOW YOUR TOOLS
All said and done, there is a lot of merit in knowing the Auto-ML tools. There are times when Auto-ML is not a threat but a facilitator, as advertised in the first place. It’s good to use these tools yourself, learn their pros and cons, and get first-hand experience with them. This ensures that, as a data scientist, you are the tool evangelist, and you will have many people seeking your expertise.
It’s possible that I have a wrong perception of Auto-ML tools. Nevertheless, the points mentioned above are relevant skills to be developed by all data science professionals.
Abhigya Chetna is a data science professional. Over the years, she has worked to provide analytical solutions in a wide range of industries across the globe. This diverse experience gives her an edge in combining her skills to deliver impactful solutions.
She enjoys being an analytics translator for non-data professionals and aims to promote data literacy.