First Series by Shaku Atre- Part 2 of 10
Q6: What should be the Strengths of Big Data Analysts, aka “Data Scientists”?
A6: Open Source
With or without Big Data, one needs a very good understanding of, and experience with, Open Source Software in today’s computing environment. Getting that expertise is not difficult: one can acquire it from the comfort of one’s laptop.
- The Ultimate Open Source Software
- List 30 Essential Pieces Of Free & Open Software For Windows
And then the person has to get her hands dirty by working with Big Data. Theory alone is never sufficient. You have to “live it” to feel the data.
Database software such as the NoSQL databases Cassandra and HBase:
Data architecting of databases that hold petabytes of data (1,000 terabytes is 1 petabyte; the terabyte is a measure of yesteryear) and keep growing every minute. Learning to install a small database on your laptop is not a big deal either. By doing it, one can dip one’s toes in the water.
Expertise in NoSQL databases such as Cassandra and HBase
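To get a feel for the scale mentioned above, here is a minimal sketch of the terabyte-to-petabyte arithmetic. It assumes the decimal convention used in the text (1,000 terabytes per petabyte); the function names and the disk sizes are made up for illustration.

```python
import math

# Decimal convention used in the text: 1 petabyte = 1,000 terabytes.
TB_PER_PB = 1000


def petabytes_to_terabytes(petabytes):
    """Convert a size in petabytes to terabytes."""
    return petabytes * TB_PER_PB


def disks_needed(petabytes, disk_size_tb):
    """Number of disks of a given size (in TB) needed to hold the data,
    ignoring replication and file-system overhead."""
    return math.ceil(petabytes_to_terabytes(petabytes) / disk_size_tb)


if __name__ == "__main__":
    # Hypothetical example: 2 petabytes stored on 4 TB disks.
    print(disks_needed(2, 4))
```

Real clusters replicate data several times, so the actual hardware count would be higher than this back-of-the-envelope figure.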
Experience managing software frameworks such as Hadoop with MapReduce, as well as Spark
Hadoop Project Based Training
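For readers who have not yet worked with MapReduce, the idea can be tried on a laptop before touching a cluster. The following is a minimal, single-machine sketch in plain Python, not Hadoop's actual API: it counts words using separate map, shuffle, and reduce steps. All function names and the sample sentences are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data tells a story"]
    print(word_count(docs))
```

In a real framework the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; here everything runs in one process so the structure is easy to see.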
Languages such as R or Pig:
Expertise with analytics programming languages and facilities, such as R or Pig
Managing thousands of CPUs:
Ability to manage hardware with hundreds or thousands of “small” CPUs, for many petabytes of data.
Q7: What should be the Soft Skills of Big Data Analysts / Data Scientists?
A7: Soft skills are not specific to Big Data (these skills are necessary as well as sufficient in a “Small Data” environment too). They are needed in many organizations:
Check: Necessity and Sufficiency
- Understanding of the “ins and outs” of the business, especially the business processes that could give substantial returns on investment if insights are identified
- Understanding of the “bottom line” of the business and of its competitors
- Ability to discern which analytics will answer the bottom-line questions
- Communication skills to explain the analytics results
And keep in mind, all the time, that you are working because there are people who use the data.
Data Storytelling is an essential skill (refer to my other series, “01-Data Storytelling FAQs – Second Series by Shaku Atre- Part 1 of 10”).
One has to learn how to put emotions into data so that the data starts to talk by itself. That means understanding not only transactions (as we have been doing all along) but also interactions (such as people buying products on the web, or not buying after having looked at them) and observations (such as machines or sensors measuring and reporting on what is happening or not happening).
Q8: What is different now?
A8: Well, we have never complained that we had too little data. We always had more data than we could handle. So what is different now?
In the past, even in the recent past, we practically reported “what happened”. No one could say that we were liars. We just reported what was. We even said “Data doesn’t lie”. Well, that is true too. But now it is not sufficient just to report what happened; we have to say what may happen, based on the data that we have collected. The main purpose of Big Data is to be able to look at data in new ways so that accurate predictions can be made. To make accurate predictions at the speed at which Big Data is arriving, an organization has to have skills in a number of areas.
In the next part, “Big Data FAQs – First Series by Shaku Atre- Part 3 of 10”, I will continue to cover the skills that need to be present in an organization to manage and use Big Data to provide accurate predictions and, possibly, to make those predictions a reality.
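The difference between reporting “what happened” and saying “what may happen” can be illustrated with a very small sketch. It is only one simple technique (a straight-line trend fitted by least squares), not a description of how any particular Big Data system makes predictions, and the monthly sales figures are made up for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical data: sales observed in months 1 to 4.
    months = [1, 2, 3, 4]
    sales = [10.0, 12.0, 14.0, 16.0]

    # "What happened": a plain report of the past.
    print("Total sales so far:", sum(sales))

    # "What may happen": an estimate for month 5 from the fitted trend.
    slope, intercept = fit_line(months, sales)
    print("Estimate for month 5:", predict(slope, intercept, 5))
```

The report only summarizes the past, while the estimate makes a claim about the future that can turn out to be wrong, which is why the skills discussed in this series matter.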