
Data Scientist Qualifications

Being a data scientist requires extensive experience with statistical modeling and the ability to synthesize results in order to adjust one’s analysis appropriately. Interview questions revolve around statistical modeling, understanding the results of algorithms and improving code, conducting root cause analysis, and knowing prominent data scientists who have contributed to research in the field (Piatetsky, 2016).


In my career, I have engaged with the above topics when my projects needed them. However, that usually meant relying on Google and other sources, and the work took a fairly long time to complete. A data scientist would be able to take data sets, perform a complete set of analyses, and refine the results until satisfied in a fraction of the time it would take me. To become a data scientist, I would need many years of consistently performing statistical analysis on projects.


A study by Accenture shows that organizations need only 1% of their talent to be Analytics Champions, but 5-10% to be Analytical Scientists (Hernandez, Berkey & Bhattacharya, 2013). Champions are able to lead initiatives, while Analytical Scientists are able to build models at scale. An additional study showed that of these specialists, roughly 30% are “Excelling” in their field, while another 30% are “Satisfactory”.


Lastly, data scientists must know a multitude of technologies. Hadoop is one important framework of which all data scientists must have a strong grasp. They must know Hadoop, the HDFS backend, Hadoop YARN, and Hadoop MapReduce (Bappalige, 2014). Understanding the complexity of the architecture and how its processes work together is important for a data scientist to accurately run MapReduce applications on top of YARN, support additional programming models on the Hadoop infrastructure, and be agile with both company and external libraries.
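
To make the MapReduce programming model a little more concrete, here is a minimal sketch of a word count in plain Python. It does not use Hadoop, HDFS, or YARN at all; it only imitates the map, shuffle, and reduce steps in a single process. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are my own assumptions for illustration, not part of any Hadoop API.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is imitated with an in-memory dictionary (an assumption).
    """
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored on HDFS.
    sample = ["big data big models", "data at scale"]
    print(word_count(sample))
```

On a real cluster, the map and reduce functions would run in parallel across many machines, with YARN scheduling the work and HDFS holding the input and output. That coordination is the part that takes experience to use well.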


Resources

Bappalige, S. P. (2014, August 26). An introduction to Apache Hadoop for big data. Retrieved from https://opensource.com/life/14/8/intro-apache-hadoop-big-data.



Piatetsky, G. (2016). 21 Must-Know Data Science Interview Questions and Answers. Retrieved from https://www.kdnuggets.com/2016/02/21-data-science-interview-questions-answers.html.
