What is a data scientist?

Job titles in the data community can be confusing. When looking at job postings and data applicants’ resumes, you must dig deeper than the title. Data architect is a case in point: many organizations use the title for data modelers, while others reserve it for those who oversee the architecture of data solutions.

Enter the data scientist. I began to see this title thrown around quite a bit late this year, usually in conjunction with talk about big data. It is definitely taking a spot front and center among the must-use analyst buzzwords. The idea of data combined with science doesn’t seem that unusual, so I decided to dig a little deeper and see exactly what this buzzword is all about.

I found that there is quite a bit of discussion about what a data scientist is, but not necessarily a firm job description. Steve Miller, in his Information Management blog, referenced a quote from Google Chief Economist Hal Varian that puts the job into perspective for me.

“I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s? The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.”

Data is exploding today. Big data is becoming a reality for a larger set of organizations, and the volume of data is increasing exponentially. Where we were once concerned about collecting data entered into online applications, we are now concerned about the data that can be collected about the user’s experience in entering that data. I’ve seen this referred to as data exhaust: the information we leave behind as we shop, surf, and socialize on the web.

Over the past twenty years or so, IT has become skilled in collecting and analyzing structured data. We have managed that data very well and have even enabled some pretty sophisticated analysis of it by non-IT users. We now have a large amount of unstructured data before us and need to enable easy-to-use analysis of that data. More importantly, this analysis needs to speak to the relationships that exist between structured and unstructured data.

This is where the data scientist enters the picture. It’s a new job that is a mashup of responsibilities from data management, business intelligence, marketing, statistics, and analytics. It takes a unique individual to understand and analyze the data and then present it in a way that makes it critical to the organization.

Being part statistician is one of the key components of this position. What distinguishes a data scientist from a Business Intelligence analytics role is the ability to think more strategically and to merge structured and unstructured data in a way that presents a very telling strategic story to the organization. Being part marketer is almost equally important. Telling that story in a compelling manner, with a heavy dose of data and statistics, needs to be tempered with a clean, easy-to-understand front end.

How many people do you know who possess all of these skills? I personally know none. It is a rare combination. In this blog, I often speak about the importance of data professionals stepping outside of their technical shell, becoming more conversational, and marketing their expertise to the organization. A data scientist is an example of stepping outside of the shell. The data scientist can very well be a non-IT person with strong statistical, marketing, and communication skills who is able to take on data and Business Intelligence expertise.

The data scientist job title is quite new. As with other data professional job titles, there will be many interpretations of this job. I expect to see many organizations grant this role to their senior Business Intelligence analytics professionals. It will most likely be a status symbol in the Business Intelligence and big data communities. It will take some time for the data scientist job to become a mainstream concept with a well-defined list of roles and responsibilities. Ah, the convergence of data and science is an exciting place indeed.

Tom Bilcze
Modeling Global User Community President

I’ve reached 3NF in “Why Be Normal?” How normalized are you?

Find out more about “Why Be Normal?” at http://erwin.com/whybenormal/. Want to know what “We Be Normal!” is all about? Visit https://communities.ca.com/web/ca-modeling-global-user-community/ to get the whole story.

Follow ERwin online through Twitter, Facebook, and LinkedIn.
Twitter: @ERwinModeling and #YBNormal
