Every organization wants a data analytics team that delivers good results. Learn here what roles to hire for your company project and how to structure a winning data team.
Everyone needs a Data Team
We are at the end of 2021, and if your company doesn’t have a dedicated Data Team, then you are out of the game. No matter which sector you are working in, data is currently driving every decision that companies make.
However, if you already have your Data Team, does it have what it takes to give you the proper information to make that game-changing decision?
In this post, we will go over a couple of things you should consider for your team, whether you are looking to build a new one or change the scope of the existing one.
Before we dive too deep into what roles I believe you should hire for your company project, please keep this in mind:
Today, everything is Big Data, or it will be very soon…
With that premise in mind, let’s begin with some questions.
Do you have the platform you want?
What you need is different than what you want
Most companies already have a data platform that stores what is currently generated by their core business processes. Some of them create a massive silo to hold most of the data, while others build small silos to act as a distributed platform where every team can decide independently and have faster implementations. At least, that is the purpose.
What are the skills in this step?
Every company may have multiple roles distributed throughout different teams. When we think about platforms, the first role that comes to most people’s minds is the Database Administrator (DBA), the most common data role within a company.
Although the DBA is an important role in a Data Team, today there are plenty of variables you need to think about when building your platform. Here is a list with some of them:
- Access concurrency
Didn’t think about some of them? Actually, this is not the full list. That is why there is a difference between a DBA and a Data Engineer. In my opinion, there are two types of Data Engineers: platform-oriented and data-oriented. At this stage, I will focus on the skills you should look for in the former.
- Basic bash scripting knowledge (of course Linux would be better)
- Distributed systems implementation
- Systems monitoring and dashboarding experience
- Cloud implementation experience (Terraform, Serverless, etc.)
- Basic level of data manipulation experience
Notice I am not mentioning Hadoop, Hive, and Spark, among others. That is because my focus is on skills tied to the person rather than to a specific technology; at the end of the day, the solution depends on what you need.
So, once we have the people to build the platform, what is next?
Do you have the context for your data?
What is context?
In today’s world, data needs context, and context is what happens around your client. Social networks have changed everything, and if it is not part of your Data Warehouse or Data Lake, you are missing the most important input you can have from your end-user.
If you don’t know that most of Google’s income comes from ads (80% in 2020 according to CNBC), then you probably don’t realize that you generate a cost every time you click on that sponsored link to the company you are looking for. We can add Facebook and Apple under the same business model, and there you go, everything you type (yep, including WhatsApp!) is being used to create ads to fulfill consumer trends.
The good thing is that all that data is available (yes, at a cost) for you to grab and use to create context for your business. So here is the second need: add context to your data.
Acquiring the context
Retrieving data is not simple: it lives in different places and is mostly in JSON (JavaScript Object Notation) format. This means you will need to add people to your team who can make the data flow to your selected repository. But remember, social network data never stops growing.
Taking this into consideration, here are the most common skills you will need:
- Web requests handling
- ELT mindset
Seems simple, right? Matt Turck made us realize in 2012 that working with data is not straightforward. He has a yearly publication that shows the community how many tools are out there to work with data on many levels. For ingestion, acquisition, or streaming purposes alone, there are over 24 different applications to move data from outside your company into your desired repository. In my experience, nobody can handle all 24 in depth, but don’t worry, that is where the mindset part comes into play.
A Data Engineer in this case is someone data-oriented: a person who handles ETL/ELT (Extract, Transform, Load / Extract, Load, Transform) applications efficiently and can adapt to other tools in a short period of time.
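To make the ELT idea concrete, here is a minimal sketch in Python using only the standard library. It assumes a hypothetical JSON payload of social posts (in practice it would be the body of a web response) and uses an in-memory SQLite database as a stand-in for your repository. The table names, field names, and sample records are illustrative assumptions, not a reference to any specific tool.

```python
import json
import sqlite3

# Hypothetical payload; in practice this would come from a web request.
RAW_PAYLOAD = json.dumps({
    "posts": [
        {"id": 1, "user": {"name": "ana"}, "text": "Love the new app", "likes": 12},
        {"id": 2, "user": {"name": "luis"}, "text": "Checkout keeps failing", "likes": 3},
    ]
})


def load_raw(conn, payload):
    """Extract + Load: store every JSON record untouched in a staging table."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_posts (doc TEXT)")
    records = json.loads(payload)["posts"]
    conn.executemany(
        "INSERT INTO raw_posts (doc) VALUES (?)",
        [(json.dumps(record),) for record in records],
    )
    return len(records)


def transform(conn):
    """Transform: reshape the staged documents into typed columns afterwards."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(id INTEGER PRIMARY KEY, author TEXT, text TEXT, likes INTEGER)"
    )
    for (doc,) in conn.execute("SELECT doc FROM raw_posts").fetchall():
        post = json.loads(doc)
        conn.execute(
            "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?)",
            (post["id"], post["user"]["name"], post["text"], post["likes"]),
        )


conn = sqlite3.connect(":memory:")  # stand-in for the real repository
loaded = load_raw(conn, RAW_PAYLOAD)
transform(conn)
result = conn.execute("SELECT author, likes FROM posts ORDER BY id").fetchall()
print(loaded, result)
```

The raw documents are kept exactly as they arrived, so the transformation can be changed and rerun later without going back to the source. That is the practical difference between ELT and ETL in this sketch.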
Put your data in play
Nothing happens if your teams can’t see the data
Having a massive platform with tons of data is only useful if your business-oriented team members can see it. These teams are the ones able to relate core business data with the context, and it is very important for them to have the skills to pull data out in the proper way.
New skills for analytical people
When we started collecting that massive amount of data, it became more difficult to use on a day-to-day basis. In today’s world, spreadsheets often cannot hold the amount of data an analyst needs, or they process it too slowly. So which skills should you look for in a person to analyze your data?
- Working with spreadsheets (don’t get me wrong, it IS a basic skill for an analyst)
- Dashboarding (e.g., Tableau, Power BI)
With these skills, you will have an analyst who can talk to a Data Engineer, define better requirements for the data, and present findings clearly (show and tell), which will result in better information for you to make a decision.
Remember, from this stage, what you are looking for is to get data connected to your business, track your KPIs, and monitor your business performance.
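As a rough illustration of tracking a KPI with SQL instead of a spreadsheet, the sketch below aggregates monthly order count, revenue, and average order value. The `orders` table, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for whatever repository your analysts actually query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the shared data repository
conn.execute("CREATE TABLE orders (order_id INTEGER, month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2021-10", 120.0),
        (2, "2021-10", 80.0),
        (3, "2021-11", 200.0),
        (4, "2021-11", 50.0),
        (5, "2021-11", 50.0),
    ],
)


def monthly_kpis(conn):
    """Return (month, order count, revenue, average order value) per month."""
    query = """
        SELECT month,
               COUNT(*)    AS orders,
               SUM(amount) AS revenue,
               AVG(amount) AS avg_order_value
        FROM orders
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()


kpis = monthly_kpis(conn)
for month, orders, revenue, avg_value in kpis:
    print(f"{month}: {orders} orders, revenue {revenue:.2f}, average {avg_value:.2f}")
```

The same query can feed a dashboard directly, which keeps the KPI definition in one place rather than in several copies of a spreadsheet.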
Going deep into your data
Nothing better than good insight
Data is different from information, which is different from knowledge, which is different from wisdom. The more insights you produce using data and context, the closer you get to really understanding your business.
Skills for generating insights are difficult to find, mainly because the person generating them must know your business. Another thing to understand is that in this stage, you are not looking for known outputs but for that missing piece to make a game-changing decision.
Insights should either change or strongly confirm your beliefs; otherwise, it is just another analysis. The hard part of coming up with good insights is to consider all possible variables and produce outputs based on evidence, without bias. If you are now thinking about research, science, and published papers, you are correct. This is the part where you need to be patient, because results will not come quickly or frequently.
The rise of Data Science
Of course, you will need Data Scientists in your team, which is not a surprise at this point (or in this year). You have probably seen long lists of skills required for this kind of role, but have you taken a look inside your organization? Most organizations have very analytical people with skills that might surprise you. With that in mind, here is a list of the skills required in this stage:
- Business knowledge
- Mathematical background
- Coding skills (most common are Python or R)
You have probably already seen all but the last one of them. And ironically, it is probably the most important one, given that the person responsible needs to convince the people who make decisions to go one way or the other.
Putting things together
There is a lot to digest in this article, and since we are talking about data skills for making decisions, here is a takeaway for you to assemble your team.
Data Engineer (platform-oriented):
- Basic bash scripting (Linux)
- Distributed platform experience such as Hadoop, Hive, NiFi, etc.
- Terraform/Serverless knowledge for Cloud deployments
Data Engineer (data-oriented):
- Coding experience (mostly Python: Pandas, NumPy, Requests, Multiprocessing, PySpark, SQLAlchemy)
- ETL tools experience (NiFi, Pentaho, Alteryx, etc.)
- Strong SQL skills
Data Analyst:
- Python (Pandas, NumPy)
- Basic statistics skills
- Basic SQL skills
- Strong dashboarding (Tableau, Power BI, etc.)
Data Scientist:
- Python (Pandas, NumPy, scikit-learn, NLTK, etc.)
- Strong mathematical skills
- Strong business knowledge
I hope you enjoyed reading, and see you in the next article!