Introvert’s Guide to the Data Analyst Workflow
Data analysts have a variety of workflows depending on the field that they’re in, the team they’re on, and even the project they’re involved in. What follows here is the most common workflow that I’ve used in my career, working first in Consumer Packaged Goods and now in Tech.
This is specifically how I handle my workflow as an introvert. I focus on trying to create environments where I can thrive, which typically means ensuring I have the ability to communicate and work how introverts do best (processing internally, slowly absorbing information, and listening over talking).
The tools mentioned here can be swapped as needed (Tableau for Power BI, Mode, Excel, etc., and SQL for any other way to get your hands on data). What matters most is the soft skills involved.
Note: Throughout this article, I will use the words dashboard and report interchangeably for simplicity.
Start with a reason to dig into some data
Data analysis only starts once there’s a question to answer. We don’t dig into data wildly. We always want to have a question in mind to answer. Yes, it’s true that sometimes that question is something as broad as “I wonder what our data looks like”, but that inevitably has a more narrow question attached to it. This narrow question could be “How many customers do we have?” or “Is there seasonality in our data?”. These are still broad questions but they are answerable with data.
These first sorts of questions are proactive ones that can help you get more familiar with your data. Spending a little time on these early will help you out later on. For example, if you know that you have 1,000 customers in the database then you’ll notice that you did something wrong when a query says that you have 100,000! This type of info has saved me lots of time in my career - it’s worth spending some time on.
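One way to put this kind of baseline knowledge to work is a small sanity check that compares a fresh query result against a number you already trust. What follows is only a minimal sketch in Python: the function name `check_against_baseline`, the 20% tolerance, and the figures are all assumptions made up for illustration, not part of any particular tool.

```python
def check_against_baseline(observed, expected, tolerance=0.2):
    """Return True when `observed` is within `tolerance` (a fraction) of `expected`."""
    if expected <= 0:
        raise ValueError("expected must be a positive baseline")
    relative_gap = abs(observed - expected) / expected
    return relative_gap <= tolerance


# Hypothetical figure: we already know the database holds roughly 1,000 customers.
known_customer_count = 1_000

# A result close to the baseline passes the check.
plausible = check_against_baseline(1_050, known_customer_count)

# A result of 100,000 is far outside the tolerance, so the query deserves a second look.
suspicious = check_against_baseline(100_000, known_customer_count)

print(plausible, suspicious)
```

The tolerance is a judgement call. A customer count that grows quickly may need a looser threshold than one that barely moves.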
A second, less traditional reason to dig into the data is that someone on your team has identified a question worth answering. This workflow may deviate from the norm, given that internal questions are often related to the data itself and not necessarily to moving the business forward. These questions can seem unimportant but, in my experience, they often end up being some of the most valuable. They tend to uncover gaps in the data or perhaps highlight some logic that missed an edge case. These usually result in you becoming a stakeholder when you follow up with the party responsible for correcting the problem.
Overall, the third and most common reason to dig into data is when a stakeholder asks a question. This will be a question that is tied to the business and should have some impact. It’s your job as an analyst to first ensure that answering this question will move the business forward. To do this, start by asking probing questions to ensure that your stakeholders are asking the right questions. This can be hard, but discussing it with the stakeholder and your manager is a good place to start! I prefer to ask questions through written communication here - it allows me to process information at my own speed and double-check anything that I need to. As an introvert, I find that unexpected questions take me slightly longer to answer than they take my extroverted colleagues. Writing helps to even that out.
Knowing which scenario you’re in is important as it will inform the rest of your workflow.
Communicate with your stakeholder(s)
This is the most important part of your workflow. You’ll need to communicate multiple times during your analysis and this is a critical part of what sets a good analyst apart. The first step here is to set up some face-to-face (or video-to-video) time. As an introvert, I prefer asynchronous communication (instant messaging mostly), yet I do this first step in a synchronous fashion. Being able to talk through ideas in real-time can get a lot of confusion out of the way, saving you time in the long run. However, it won’t eliminate all the confusion, even if it seems like it has.
This is why you’ll want to take some light notes. I like to focus on high-level items such as: any caveats that are brought up, what this analysis will be used for, the agreed-upon deadline, etc. This should all be written down and agreed upon. The reason for this is simply to ensure you’re on the same page. It’s frustrating for both parties when a deadline is misunderstood, or the scope of work isn’t aligned. Writing it down helps tremendously. It also gives you the ability to ensure you’re properly heard, a common problem for introverts, especially in a group setting.
I also like to include the next action items in these notes. Oftentimes that’s just “Dylan to do a first look at the data and confirm that the agreed-upon timeline can be met”. This lets everyone know what the next step is and who needs to take the requisite action. As well, always include when you’ll reach out next. Your stakeholders are excited and you don’t want them to lose that energy - keep them in the loop! This helps them feel as though they’re part of the project, which is important for buy-in, which I’ll cover later.
If you don’t have a direct stakeholder, which would be the case if you’re just looking to get a better handle on the data, then use your manager. There’s a big difference between work that has been explicitly thought through and that which hasn’t. Being accountable to someone will force you to communicate your ideas more clearly and ultimately help you be more explicit about what you’re trying to uncover/achieve.
Our brains are great at the high-level fuzzy ‘details’, so any time we communicate we necessarily have to refine this understanding - that’s why having a stakeholder to provide updates and insights to is so valuable, even for your own digging. You’ll notice holes in your logic, and things will suddenly become obvious once they’re said out loud, both of which improve the quality of your analysis.
Regardless of the situation you’re in, you’ll want to have a refined problem statement that can be answered with your analysis. This could be anything from “What countries do my customers live in?” to “Is it possible to predict when a customer will buy next?”. Both of these can be answered with metrics, even if they look quite different.
Find and organize your data
With a refined problem statement and some metrics in mind to answer it, the time has come to find the data! If you’re lucky then you’ll know where the data are and how to access them. For me, this has mostly meant using SQL to dig into a data warehouse and do all the transformations needed to get my data into my desired format. In a perfect world (about 80% of the time for me) I simply join my tables together, do some aggregation, filtering, and segmentation, and I’m done with SQL. This sounds easy, but even when all the data exist it can still take days to bring everything into the proper format to join together.
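To make the “join, aggregate, filter, segment” pattern concrete, here is a minimal sketch that runs SQL from Python against an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the rows in them are all invented for this example. A real warehouse will have its own schema and SQL dialect.

```python
import sqlite3

# Build a tiny throwaway database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL,
        status TEXT
    );
    INSERT INTO customers VALUES (1, 'Canada'), (2, 'Canada'), (3, 'France');
    INSERT INTO orders VALUES
        (10, 1, 20.0, 'paid'),
        (11, 1, 35.0, 'paid'),
        (12, 2, 50.0, 'refunded'),
        (13, 3, 15.0, 'paid');
""")

# Join the tables, keep only paid orders (filtering), and
# summarise per country (aggregation and segmentation).
query = """
    SELECT c.country,
           COUNT(DISTINCT c.customer_id) AS paying_customers,
           SUM(o.amount) AS revenue
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.status = 'paid'
    GROUP BY c.country
    ORDER BY c.country
"""
rows = conn.execute(query).fetchall()
conn.close()

for country, paying_customers, revenue in rows:
    print(country, paying_customers, revenue)
```

Even in a toy like this, the refunded order shows how a single filter changes the answer, which is exactly the kind of decision worth writing down for your stakeholder.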
There are also situations where I need to lean on an analytics or data engineering team. I won’t go too deep into these cases, but they generally arise when the data needs to be reshaped or changed significantly and it makes sense to have those changes exist in a permanent table.
You may be in a different situation where not all of your data exists in a data warehouse. In that scenario, it’s time to go gathering! If this is a one-time analysis that isn’t likely to be repeated, then simply bringing everything into Excel or a Python notebook is usually good enough. If this is an ongoing project then I would suggest going to the additional effort of putting it into a more permanent place, like a data warehouse. Either way, once this work is done you’ll want to make a copy of the data to work on (unless the data has been backed up already). If you modify something irreversibly then you’ll want the option to go back to your initial data. This step seems small, but needing your backup data even once makes it worth having done this 100 times over.
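If your gathered data lives in files, the backup can be as simple as copying the file before you touch it. The sketch below is one possible way to do that with Python’s standard library. The helper name `back_up`, the `.bak` suffix, and the sample CSV are assumptions for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def back_up(source, backup_dir):
    """Copy `source` into `backup_dir` and return the path of the copy."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    destination = backup_dir / (Path(source).name + ".bak")
    if destination.exists():
        # Never silently replace an earlier backup.
        raise FileExistsError(f"refusing to overwrite existing backup: {destination}")
    shutil.copy2(source, destination)
    return destination


# Demonstration in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as workdir:
    raw_file = Path(workdir) / "customers.csv"
    raw_file.write_text("customer_id,country\n1,Canada\n2,France\n")

    backup_file = back_up(raw_file, Path(workdir) / "backups")

    # Simulate an irreversible mistake on the working file.
    raw_file.write_text("customer_id,country\n")

    # The backup still holds the original rows.
    restored_contents = backup_file.read_text()

print(restored_contents)
```

Refusing to overwrite an existing backup is a deliberate choice here: the first copy is the one you most want to keep.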
Start massaging your data
Now comes the time to get all of your data into your desired format. A good rule of thumb here is to extract the data in the most granular way (if you have daily-level data but want to look at it monthly, bring it in at the daily level). This is of course within reason - if you only care about yearly data then daily is likely to slow you down. As well, if the size becomes unwieldy then paring it down is a worthwhile endeavour. Use your judgement or that of senior colleagues here!
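As a small illustration of why granular data is the safer starting point, the sketch below keeps hypothetical daily sales and rolls them up to months only when needed. The reverse (recovering days from monthly totals) isn’t possible. The figures and the `roll_up_to_month` helper are made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily-level extract: (day, sales amount).
daily_sales = [
    (date(2024, 1, 30), 120.0),
    (date(2024, 1, 31), 80.0),
    (date(2024, 2, 1), 50.0),
    (date(2024, 2, 2), 70.0),
]


def roll_up_to_month(rows):
    """Sum daily amounts into {(year, month): total}."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


monthly_sales = roll_up_to_month(daily_sales)
print(monthly_sales)
```

Because the daily rows are still around, a later request for weekly or day-of-week views would only need a different grouping key, not a fresh extract.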
I often find that this step takes more time than I expect. Data can be fickle and getting it into the exact shape you need can take a bit of wrangling and creativity (or sometimes just a bit of manual effort).
Because of this fickle nature, I very highly recommend having a peer review at this stage. This is the first Quality Assurance (QA) step that I do and it’s normal to find errors in my thinking at this point in time. I view an error here as a good thing. Errors are common and catching them early is key to keeping a project to the timeline. The further that we go in this workflow, the more costly errors get.
If you haven’t done a peer review before, here’s an easy guide:
Sit down with a peer and walk them through your SQL script - live is usually better here but asynchronous is okay too
Point out any oddities or things that you found that were unexpected - these tend to be the spots where you might have misunderstood something
Have them ask questions and ensure that you can answer them. If you can’t do so on the spot then write them down and ensure you get back to them with an answer - there’s nothing wrong with taking the time to be sure of your answers
Ask them to validate your output in one way or another
This could be them trying to pull just 1 or 2 customer accounts and ensure they line up with yours
It could also be pulling summary statistics and ensuring they line up (if your query says there are 100 customers in Canada, does your peer also find that through a different method of pulling the data?)
Manually aggregate a subset of data and ensure the same outcome
If another report already exists, are you able to validate against a subset of that report?
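The summary-statistics check above can be sketched in a few lines. Assume you and your peer each end up with a count of customers per country, pulled by different methods. The helper name `compare_counts` and the numbers are hypothetical.

```python
def compare_counts(mine, peer):
    """Return {key: (my_value, peer_value)} for every key where the two pulls disagree."""
    mismatches = {}
    for key in sorted(set(mine) | set(peer)):
        my_value = mine.get(key)      # None when the key is missing from my pull
        peer_value = peer.get(key)    # None when the key is missing from the peer's pull
        if my_value != peer_value:
            mismatches[key] = (my_value, peer_value)
    return mismatches


# Hypothetical customer counts per country from two independent pulls.
my_counts = {"Canada": 100, "France": 40, "Japan": 12}
peer_counts = {"Canada": 100, "France": 42}

differences = compare_counts(my_counts, peer_counts)
print(differences)
```

An empty result means the two pulls agree. Anything else is a prompt for a conversation, not proof of who is right.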
In a small number of cases, this will be the end of your analysis - for example, if you were just trying to figure out how many customers you have. However, it’s still worth doing a bit of a write-up of your findings, even if they were just for your own info. You never know when someone else will ask and you’ll save yourself a lot of time if you’re able to just link a short document. I also recommend storing your SQL script somewhere in case you want to reuse any of the logic elsewhere. This takes little time and I’ve found that I’m often glad that I’ve done it.
Communicate again
If this is a case where your data work is now done, it’s time to communicate that to your stakeholders. If this is just your manager then I still recommend spending 15 minutes writing up a quick summary of what you found and showing it to them.
If your work is still going and you’re simply at the point where you’ve finished pulling your data, it’s still a good time to reach out. This can be a quick message letting them know that you’ve managed to wrangle all the data that you need and are excited to start on the actual analysis. I recommend also including anything that you’ve found along the way so that they feel part of the process. It may seem small but the more involved they get to be the better for both parties!
This also gives them a chance to mention anything that they may have forgotten earlier. Yes, this can be frustrating, but it isn’t uncommon for a stakeholder to also be thinking about the problem and have additional thoughts. We’ll go into how to handle this scenario shortly but know that data projects tend to be somewhat fluid and regularly change as they go.
Dashboarding and reporting
Now is the time to take your freshly pulled data and start to massage it into its final form. In most cases, this is when you move your data into a data visualization tool, such as Tableau.
As an aside, some people will tell you that Excel shouldn’t be used at this point. While I agree that purpose-built tools like Tableau will give you better results faster, Excel is also extremely capable. You can build charts and graphs in it and basically everyone understands how to interact with it. It also already exists at virtually every company (Google Sheets works too) so you don’t need to convince the company to go buy it, which can be harder to do in some places than others. The downside of Excel comes from its flexibility: because it can do so many different tasks, you’ll need to put extra effort in to produce output that looks clean. It’s also harder to stop people from accidentally making a change to your report, though not impossible. There are loads of tips and tricks out there to look up and I recommend spending a bit of time digging into them if you’re using Excel.
If you’re working in a purpose-built tool then you have a few other advantages. Generally speaking, you can easily set them up to refresh data on whatever cadence you like. This means your stakeholders can use your report continuously with fresh data without effort from you. In a perfect world, this frees up your time to answer questions that may arise from said report and move on to a fresh project!
Once you have your dashboard/report in a working state, with everything where you want it to be, it’s time to ask for another peer review. I suggest going through a similar exercise to the first peer review, where you walk through what you did and why. It’s good to get a second set of eyes on your work, especially in places where you’ve done any additional data work (segmenting, aggregating, filtering, etc.).
It’s also worth doing your own QA before asking for this review to avoid eating up someone else’s time with things that you could catch on your own. This makes you look more competent and ensures your reviewer’s time is being used effectively.
Polishing your work
With your dashboard/report complete and accurate, you might think you’re done - you’d be wrong. Polishing and refining your work is a step that you always want to do, even if few people will see your work. It’s likely that at this point you just want to get the project shipped and off your plate, but skipping this step does a huge disservice to you. Ultimately, the impact of your work is measured by the outcomes that arise from it. In other words, if no one looks at your dashboard then your work was wasted.
Here is where I look to see what I can easily improve. Some items to consider are:
Size and font of the text
Ensure the text is all the same size and font
Proper number formatting
Add the thousands comma for big numbers and remove any decimals that aren’t needed
Look at chart axes
Are they clearly labelled and logical?
If there are two vertical axes, can I align the two or do they need to be separate?
Look at colours
If the company has brand colours then use those
If not, find a colour palette that already exists and use that
Pay extra attention to any prominent dimensions and ensure the colours are consistent
If Canada is red on your bar chart but blue on your line chart then you’ll want to align them
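If any of your report is built in code rather than in a dashboarding tool, two of the items above are easy to automate. The sketch below shows one possible approach in Python. The `format_number` and `colour_for` helpers, the hex codes, and the grey fallback are illustrative assumptions, not a standard.

```python
def format_number(value, decimals=0):
    """Add thousands separators and keep only the requested number of decimals."""
    return f"{value:,.{decimals}f}"


# One shared mapping, so a dimension gets the same colour on every chart.
# The hex codes are placeholders; swap in brand colours if your company has them.
DIMENSION_COLOURS = {
    "Canada": "#D62728",
    "France": "#1F77B4",
}


def colour_for(dimension, default="#7F7F7F"):
    """Look up the agreed colour for a dimension, falling back to a neutral grey."""
    return DIMENSION_COLOURS.get(dimension, default)


formatted_total = format_number(1234567.891)        # big number, no decimals needed
formatted_rate = format_number(0.4567, decimals=2)  # small number, two decimals

# Both charts ask the same mapping, so Canada cannot drift between red and blue.
bar_colour = colour_for("Canada")
line_colour = colour_for("Canada")

print(formatted_total, formatted_rate, bar_colour, line_colour)
```

Keeping the colour mapping in one place means a change to the palette is made once and picked up everywhere.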
This may feel tedious but it makes a big difference to how stakeholders view you. They’ll see that you care about your work and are a professional. It also shows them that their project is important to you, increasing the odds of them taking action on your insights.
Communicate yet again!
Your dashboard/report is now complete. At this point, you’ve spent a bunch of time on this project, and it’s likely taken longer than you expected. You’re right near the end - try to be patient with the last few steps.
You now want to reach out to your stakeholders and let them know that your work is done! You also want to set up some time to talk it through with them. This is your chance to explain what you’ve found, what you’ve learned, and most importantly, what actions they can take to improve the business. It’s also a time for them to ask questions and challenge you - expect that they will. If you’ve communicated regularly and kept them up to date then this portion may be limited, but it’s rare to not have a few questions.
After this meeting, I like to reach out a few days later to see if they have any questions now that they’ve had some time to sit with the project. It’s normal for them to have some follow-up questions and I’ve found that stakeholders appreciate it when I reach out instead of them needing to. It’s also much easier to answer questions a few days later vs a few weeks later. I find that even 2 weeks later I’ve forgotten half of what I did and why. Even with good documentation, it takes me a bit of time to find the answers to questions that may feel somewhat basic.
Notice how there are multiple sections on communication? That’s because this is the single most important part of your work. It deserves multiple sections and more effort than we tend to give it.
Celebration time
Close out your ticket and take a moment to bask in the success of a project well done. I mean it, take the time to note all the effort that you’ve gone to and be proud of your work. It’s far too easy to just jump into the next project without taking a bit of time to reflect on your work. This is also a good time to note down anything that you’d want a peer to know about the process you went through. Anything unexpected with the data or tooling is worth communicating.
Bonus: How to handle scope creep
Scope creep is the unfortunate reality of a project starting off small and then branching off to become significantly larger. It’s a problem that all data folks struggle with and there isn’t one simple solution to it.
However, I have found a few tactics to be helpful in reducing scope creep:
Being aware of it, to begin with. You can’t notice scope creep if you don’t even know what it is, so now that’s been handled for you!
Agree on an initial scope of work and have that in writing. This isn’t to cover your butt, but it’s normal for a stakeholder to get excited about something and assume it’s small. Or sometimes they just forget what was agreed to and are disappointed when you don’t deliver more.
If the initial scope is going to take you more than 2 weeks of dedicated work, then agree to deliver it in pieces. This will allow you to get some wins along the way AND it allows your stakeholders to see what you’re up to. You do not want to be a black box to them!
It also allows them to see some wins and will help with buy-in to the project
In the end, scope creep is manageable as long as you’re aware of it. You may also end up being the problem here, so keep that in mind. I find a good sign that I’m the problem is when I start getting excited about all the cool and interesting things I can add or automate. In this situation, just writing down all my ideas as a phase 2 or 3 is great. Most of the time I realize later that the effort to do these things isn’t worth it.
Bonus 2: How to get buy-in from stakeholders
Stakeholder buy-in is critical for your project to create impact. In most companies, your power to create change is through others, indirectly. This means that you need your stakeholders to be excited about what they’ve learned as a result of your work.
There are a few simple but critical things you can do to increase your odds of success here:
Involve your stakeholders from the start. The very start. When you’re ideating and coming up with your plan, involve them. This makes them feel like they’re part of the work and at the end of the project, it isn’t just you standing behind it. They are there with you.
Communicate regularly and clearly, and ask for their thoughts where it makes sense to do so.
Keep them in mind throughout your analysis. If you have actions that you feel they should take, make it easy for them. Show what impact you believe those actions should make and give them additional data to jump in right away if they want.
For example, if you say they should reach out to certain customers and offer an additional service, provide them with a list that they can act on.
If you feel a new experiment might be useful, ask them what their thoughts are before you’re done with your work
If you learned from this and feel that you could benefit from my advice, go check out the Coaching & Resources page!
Thanks for reading :)