I Turned My Friend into a Data Analyst in One Year
At the start of 2023, a friend said he wanted to become a Data Analyst. He had worked for a bank for the past 6 years and wanted to make a career change. He loved the light Excel analysis in his current job and wanted to get deeper into data.
So we met every fortnight, anywhere from 30 to 120 minutes at a time. We used that time to go over what he had been learning and further solidify the information in his brain.
He started with SQL and bought the same $20 Udemy course I tell everyone to start with - The Complete SQL Bootcamp by Jose Portilla (I have no affiliation, I just love this course). I picked this as the starting point because it isn’t glamorous but is a good indicator of real Data Analyst work. If he hated the course he would likely hate being a Data Analyst. I wanted him to make this decision early on and not waste months.
Thankfully he loved it! Over the coming weeks, his SQL skills were blossoming and he was quickly forming the strong technical foundation needed for Data Analytics.
But this is when I decided we would take a new approach to his learning, one that differed from traditional learning paths. I hadn’t tried this before but I knew my experiences would guide me well.
My method centred around best equipping him for a real-world Data Analyst role. After all, what better way to train a Data Analyst than to treat them like a real Data Analyst?
From then on, I played the role of stakeholder in all of our conversations. Everything he did needed a purpose and “I thought this was cool” was not an acceptable answer. My first manager constantly hit me with “Those are your findings… So what?” and it forced me to think through my work better. So, I did the same to him.
We used this method to jump into his first project. We did this very quickly - in the first month or two. I knew that project-based learning would reveal the gaps in his knowledge and we addressed them as they arose. We optimized his learning for getting into a role as quickly as possible, not for learning every tool on the market. Trying to learn them all is a common trap for over-eager fresh Data Analysts.
Let me repeat that. DON'T LEARN EVERY TOOL ON THE MARKET.
So he needed data to work on, but Kaggle and other datasets were too clean and unrelated to normal life, so we hunted for something that didn’t have those problems. We settled on government datasets, having some knowledge of what we would find.
He grabbed a few CSVs off the Statistics Canada website, focusing on data that could be joined on dates and geographic locations. He ended up with a mixture of housing, population, and income data, which he loaded into a PostgreSQL database. While I wouldn’t suggest loading your data into a database to most people (it’s unlikely you’ll do this as a fresh DA), I found that step useful myself. And with my guidance as backup, I knew he would complete this step in a few hours. It gave him a closer representation of the real world, working with a database instead of just flat files.
He now had data and a way to explore it using SQL! The start of a good project.
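For anyone curious what "load the CSVs and join them on date and geography" can look like, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so it runs without a server. The table names, column names, and numbers are all made up for illustration and are not the real Statistics Canada layout.

```python
import csv
import io
import sqlite3

# Hypothetical stand-ins for two downloaded CSV files (made-up values).
HOUSING_CSV = """ref_date,province,avg_price
2022-01,Ontario,900000
2022-01,Alberta,450000
"""
INCOME_CSV = """ref_date,province,median_income
2022-01,Ontario,70000
2022-01,Alberta,75000
"""


def load_csv(conn, table, text):
    """Create `table` from CSV text and insert every data row.

    Table and column names are interpolated directly, which is only
    acceptable here because the inputs are hard-coded in this sketch.
    Values are stored as text because no column types are declared.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


conn = sqlite3.connect(":memory:")
load_csv(conn, "housing", HOUSING_CSV)
load_csv(conn, "income", INCOME_CSV)

# Join the two tables on the shared date and geography columns.
joined = conn.execute(
    """
    SELECT h.ref_date, h.province, h.avg_price, i.median_income
    FROM housing AS h
    JOIN income AS i
      ON i.ref_date = h.ref_date AND i.province = h.province
    ORDER BY h.province
    """
).fetchall()

for row in joined:
    print(row)
```

The join itself would read much the same in PostgreSQL; the loading step is where a real setup would differ most.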
But a database is useless without business questions to answer. So I went into stakeholder mode, wanting him to get a good handle on the data but not without direction.
We started with some basic questions to get a better understanding of the dataset:
What’s the most expensive province in Canada? What about city?
What’s the cheapest province to live in? City?
What’s growing the fastest? Slowest?
He quickly realized that finding these answers would be harder than expected.
Should he just use the latest month of data or do an average of the last year or two? What was the impact of each of those decisions?
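To make that trade-off concrete, here is a small sketch with made-up monthly prices for one hypothetical region. The last month contains a spike, so "latest month" and "12-month average" give noticeably different answers to the same question.

```python
from statistics import mean

# Made-up monthly average prices for one hypothetical region, oldest first.
monthly_prices = [500, 505, 510, 520, 515, 525, 530, 540, 535, 545, 550, 610]

# Option 1: use only the most recent month (reacts to every spike).
latest_month = monthly_prices[-1]

# Option 2: average the trailing 12 months (smoother, but slower to react).
trailing_year = mean(monthly_prices[-12:])

print(f"latest month: {latest_month}")
print(f"12-month average: {trailing_year:.1f}")
print(f"gap between the two: {latest_month - trailing_year:.1f}")
```

Neither option is wrong. The point is that the choice changes the answer, which is exactly the kind of thing a stakeholder needs to decide.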
He had questions to ask of his stakeholder - me. And I wasn’t immediately available, much like on the job. So he had to think through his options and present them to me in a way that allowed for a quick decision. He was learning stakeholder management already - a skill that most don’t learn until their 2nd or 3rd year on the job.
We continued on that way for a few weeks. I’d ask questions that I thought he could get answers to and he’d go off and try to get them. From time to time he couldn’t get to the answer with his current knowledge, so he looked to see what he needed to learn. Again, he was filling just the gaps in his knowledge through courses instead of spending all his time in them. Oftentimes, he just used Stack Overflow or ChatGPT to explain what to do and then figured out how to do that with his data, simulating the real world once again. That was a big part of the plan.
It took about 3 months for him to be at the higher end of an intermediate SQL understanding, including a few advanced concepts he needed to use, like window functions. He was by no means an expert but with an internet connection he could answer anything expected of an Intermediate Data Analyst.
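If window functions are new to you, here is a minimal sketch of one. It uses `LAG` to compare each year's price with the prior year within the same province. It runs on Python's built-in `sqlite3` (SQLite 3.25 or newer is assumed for window function support), and the same query shape works in PostgreSQL. The table, columns, and values are invented for illustration and are not his actual queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (province TEXT, year INTEGER, avg_price REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("Ontario", 2020, 700.0),
        ("Ontario", 2021, 800.0),
        ("Alberta", 2020, 400.0),
        ("Alberta", 2021, 420.0),
    ],
)

# LAG looks back one row within each province, ordered by year.
# The first year in each province has no prior row, so the result is NULL.
rows = conn.execute(
    """
    SELECT province,
           year,
           avg_price - LAG(avg_price) OVER (
               PARTITION BY province ORDER BY year
           ) AS change_from_prior_year
    FROM prices
    ORDER BY province, year
    """
).fetchall()

for row in rows:
    print(row)
```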
That convinced me that he could answer all of my questions so I stopped asking them and we moved to the next stage of the Data Analysis lifecycle - visualization.
We had been talking about what sorts of companies he was interested in to decide what visualization tool would be the most relevant to learn. We figured this out by looking at 20 job postings he found intriguing and tracking the tools listed using my free Job Analyzer. Tableau came out on top for him.
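The tallying idea is simple enough to sketch by hand. This is not how the Job Analyzer works internally, just a rough illustration with three made-up postings instead of 20: list the tools each posting mentions, then count how often each one appears.

```python
from collections import Counter

# Hypothetical tool lists pulled from job postings (made-up examples).
postings = [
    {"sql", "tableau", "excel"},
    {"sql", "power bi"},
    {"sql", "tableau", "python"},
]

# Count how many postings mention each tool.
tool_counts = Counter(tool for posting in postings for tool in posting)

# Most frequently requested tools first.
for tool, count in tool_counts.most_common():
    print(f"{tool}: {count}")
```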
Truthfully, the tool didn’t matter that much. They all function similarly and moving between Power BI and Tableau isn’t hard, especially at the beginner/intermediate level.
Initially, we wanted to go all out with Tableau. He had a database already set up and could connect directly to it via Tableau, allowing him to work directly off his data. But we looked at it from a cost-of-time perspective and decided that, given his static data, working off of a CSV would be easier. So he looked up how to do a database to Tableau connection in case he ever needed to and we moved on.
Taking even a day or two for setup wasn’t a good use of his time - most Data Analysts don’t do this anyway and those that do can figure it out with the internet.
He already had a good handle on his data and understood it well, which made the Tableau stage more about learning visualization than about exploration. We returned to our earlier questions and looked to squeeze more value out of them. Instead of finding just the most expensive province, he compared them. He started to notice trends and explained them, prompting further questions.
Could we now use population data to create a growth rate metric and track that against housing costs?
He found a way to do it and was able to compare population growth against house price growth - something potentially useful!
But he found the next issue - was he comparing like to like? It was time to use the income data to create an index of house prices based on income. Then he could feel more confident in his findings and would be able to explain one of the biggest variables with house prices - affordability.
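Here is a rough sketch of those two metrics: a growth rate, and a price-to-income ratio as a simple affordability index. The regions and numbers are invented, and the exact definitions he used may well have differed.

```python
def growth_rate(start, end):
    """Fractional change from start to end (0.10 means 10% growth)."""
    return (end - start) / start


def price_to_income(avg_price, median_income):
    """House price expressed as a multiple of income."""
    return avg_price / median_income


# Made-up figures: (start, end) pairs for population and price,
# plus a single end-of-period income value.
regions = {
    "Region A": {"population": (100_000, 110_000), "price": (400_000, 500_000), "income": 80_000},
    "Region B": {"population": (50_000, 51_000), "price": (300_000, 330_000), "income": 75_000},
}

summary = {}
for name, data in regions.items():
    pop_start, pop_end = data["population"]
    price_start, price_end = data["price"]
    summary[name] = {
        "population_growth": growth_rate(pop_start, pop_end),
        "price_growth": growth_rate(price_start, price_end),
        "price_to_income": price_to_income(price_end, data["income"]),
    }

for name, metrics in summary.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Putting price growth next to population growth and the income ratio is what lets you ask whether two regions are really comparable.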
And just when he felt confident in his analysis, he looked and found demographics within the population data. Would age have an impact on the analysis? Older people may be past their earning years and likely care less about prices if they bought 20 or 30 years ago. This required him to segment data further, incorporating age.
At this point, we were getting too granular. He would probably do this sort of work at a company, but it wasn’t yielding large incremental learning anymore. It became tedious and was about avoiding small mistakes rather than learning concepts. This is a great sign he had exhausted the basics and was moving from intermediate to advanced concepts. The jobs he would be applying for didn’t require advanced concepts and he was butting up against those limits. It was time to move on.
While I felt he had a great handle on SQL and Tableau, I didn’t feel confident in his ability to work with different datasets. He had only worked off this government data and I wanted him to face a different dataset. Seeing a wider range of data myself has given me a big advantage in my career and I figured it would do the same for him.
We could have explored the next dataset with SQL, but I didn’t want him to get bored and lose motivation. This was all self-learning after all and he had to motivate himself, with me to keep things on track.
Thankfully Python excited him so we agreed that he would use a fresh dataset to play around with the language. The focus wasn’t on becoming a Python expert, but instead on being able to use it to explore a new dataset. This was much more about the data than the tool.
We went back on the hunt for a new dataset, with less of an emphasis on how well it mirrored the real world. He already knew how to handle messy data and, while more practice there would be good, just keeping motivation was more important. So we went for something he was interested in - cars.
I took on the role of stakeholder again and we created a fake relationship. I acted as the manager of a used car lot and it was his job to help me make money. I wanted to know what any car was worth and what features best indicated the price for quick decision-making. I was a greedy manager.
He started to segment the data and quickly ran into problems. The dataset wasn’t all that large and getting signal was hard when he drilled into it. So we needed to remove granularity - something that isn’t overly common. He looked for a way to group the cars that was most likely to yield good results and settled on a few methods. Thankfully he cares a lot about cars and had subject matter expertise to guide him - something he would normally get from a stakeholder.
He grouped cars by body type (Car, SUV, Truck, etc.) and bucketed anything older than 10 years into decades. Initially it looked good, but his results were confusing. He had massive price ranges even when looking at cars that were 30+ years old. Digging in with Python only led to frustration. He spent a full week banging his head against the problem and came up empty-handed.
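A sketch of that kind of grouping, under my own assumptions: cars up to 10 years old keep their model year, older cars are bucketed by the decade of their model year, and the reference year is 2023. The cars and prices are invented. Looking at the minimum and maximum price per group is one quick way to spot the kind of wide ranges described above.

```python
from collections import defaultdict


def age_bucket(model_year, current_year):
    """Keep recent model years as-is; group older cars by decade."""
    age = current_year - model_year
    if age <= 10:
        return str(model_year)
    decade = model_year // 10 * 10
    return f"{decade}s"


# Made-up listings: (body type, model year, price).
cars = [
    ("SUV", 2019, 32000),
    ("Car", 1994, 2500),
    ("Car", 1991, 48000),  # hypothetical outlier in the same bucket
    ("Truck", 2008, 9000),
]

# Collect prices for each (body type, age bucket) group.
groups = defaultdict(list)
for body_type, model_year, price in cars:
    groups[(body_type, age_bucket(model_year, 2023))].append(price)

# A very wide (min, max) range within one group is a hint to dig deeper.
price_ranges = {key: (min(prices), max(prices)) for key, prices in groups.items()}

for key, price_range in price_ranges.items():
    print(key, price_range)
```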
This is when he learned a VERY valuable lesson - tools are just tools. Use whichever you need to get the job done. (He looked over this article and below is his comment on this section).
He dropped his data into Tableau and within minutes found the problem - some of the cars were classified as collector vehicles, meaning the usual relationship between price and age didn’t hold for them!
I told him my car lot only handled common vehicles and he could remove those collector cars. As if by magic, his data made sense. He was ready to move forward!
His interest in statistics drove the final piece of this project - he wanted to use regression to predict prices. This was exactly what I wanted as manager of my car lot. I could plug in the most important variables and have a price spit out at the end.
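To give a feel for the idea, here is a minimal sketch of the simplest version: a straight-line fit of price against age using ordinary least squares, written by hand with no libraries. Which variables and tools he actually used isn't covered here, and the data below is made up and deliberately tidy.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(age, slope, intercept):
    """Predicted price for a car of the given age."""
    return slope * age + intercept


# Made-up training data: car age in years and sale price.
ages = [1, 3, 5, 8, 10]
prices = [28000, 24000, 20000, 14000, 10000]

slope, intercept = fit_line(ages, prices)
estimate = predict(6, slope, intercept)

print(f"price change per year of age: {slope:.0f}")
print(f"estimated price for a 6-year-old car: {estimate:.0f}")
```

Real listings would be far noisier, and a useful model would likely need more than one variable (body type, mileage, and so on).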
He took a few weeks to figure it out and ended up being able to predict prices with reasonable accuracy. He wanted to refine his work more, but it was time to focus on his resume. He now had the skills needed and a few projects to back them up. Working on more projects may have been useful, but we didn’t see a reason that he wouldn’t be able to land an entry-level job with his current skills. It was time to put that theory to the test.
Something unexpected happened at this point. After months of pouring effort into learning, he stalled and pushed back. “Maybe I need to learn just a bit more?”, “I don’t think I’ll actually land any interviews”, and “There’s no way I’m ready or qualified” were all phrases that came out of his mouth.
I had missed out on the human aspect of our journey together. He hadn’t been on the job hunt for many years and was nervous. Understandably so, given he was putting all of his trust in me and my teaching method.
But I was confident he was ready. I had interviewed dozens of people at that point and knew what other applicants looked like. So we spent a bit of time talking through his qualifications and compared them to job postings. Sure, he only met 50% of the requirements for some roles, but when we looked at those requirements more closely we found that they didn’t make sense. Why would a Junior Data Analyst need to know ETL? That’s the role of a Data or Analytics Engineer. We found a lot of these “reaching” requirements in postings, usually put there to help filter down candidates. Discussing the reality of the job calmed his nerves and reminded him that there are always going to be new things to learn. That’s part of the fun of working in data!
So we dialled in his resume, which focused on outcomes and concrete language. Each of his bullet points had an outcome and there wasn’t any fluff. I was developing my Email Course at this time and he had the chance to use my bullet point cheat sheet. Much like the rest of his upskilling, we were looking to have the biggest impact without additional work. The course materials helped him immensely because he didn’t need to piece together advice from a bunch of places. It was all there for him without any searching needed.
A few months later his opportunity came. He interviewed and ultimately landed a role as an Intermediate Data Analyst! Not only did he get the role quickly, but he started off higher up than expected. He’s been there for a few months now and has received positive feedback on his work, proving that his skills were more than adequate from our work together.
This isn’t the typical path for people upskilling. He had the advantage of working with me throughout that year - it made a big difference. I caught him before he fell into all the common traps that I fell into myself, which saved months of work. When I was trying to learn what he was, I spent 3 months scraping my own data - we skipped that step with him.
Even if you don’t have a close friend to mentor you, it is possible to do something similar to this. You can swap a lot of the data advice I gave him with podcasts, books, and conferences. You can use ChatGPT as a stakeholder, feeding it information about the dataset and asking for questions. It would be more work, but the outcome would be similar.
Or, if you want to pursue a self-learning path like my friend did, reach out to me. I don’t know what the format and cost will be, given this was the first time I did it. But at the least, I can give some direction and pointers to save you time and unnecessary effort.
P.S. I asked if anyone had questions before I finished this and I got a great one from Rachel Ingham that I’ve included here with our back-and-forth.
Rachel: When transitioning from a different career field into Data Analysis, it seems like all of the intermediate positions want 4+ years of experience in data and all of the entry level positions want recent college grads? How do you know which jobs to apply to? How do you avoid self-eliminating yourself from positions you could apply to?
Dylan: My short answer is that you need to apply to jobs where you meet 50-80% of the criteria. Those criteria are generally loosely ranked by how high up in the job description they appear.
That, and the requirements are there as a filter for the real skills behind them. If you can prove that you can write SQL as well as someone with 4 years of experience then you don't need to actually have those 4 years. But simply getting the opportunity to prove that is really hard. Posting on social media and networking so that people get to know you better are both great ways to push through that barrier.
Rachel: Thanks so much for your response! I'm definitely struggling with not having the opportunity to prove my knowledge in an interview setting because it's so hard to get past the initial resume based cut of job applicants. Do you see a value in creating a short data portfolio to add to the materials that I submit with an application?
Dylan: That makes it hard for sure! Have you looked into Applicant Tracking Systems (ATS) to ensure your resume is functionally designed to make it through the initial filter before it even hits a human?
Yes, I 100% think portfolios are a great way to start proving that you can do what they need. Truthfully, most people will never look at your portfolio, but it gives you the ability to talk about your projects in a more professional and robust way. It goes from "this little thing I'm working on" to a real project with an outcome, which is important.
I've even used portfolio projects to fill a criteria gap for a role that I wanted (and ultimately was hired for). I hadn't had the chance to use the skill on the job so I built a project around it instead and talked about it in interviews and even mentioned it on my cover letter.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Whenever you’re ready, here are 2 ways I can help you:
1. Data Analyst Launchpad - The course I mentioned above. It covers how to build a resume and cover letter that get results, among other things. I share 7 years of data analyst experience, including interviewing and hiring for most of 2023.
2. A Coaching Call - If you’re struggling with applications or want to level up your skills, I’ll give you a plan to get there and the resources that I’ve used to get there myself. If you’re unsure if I can help, shoot me a message and I’ll provide any guidance I can right then and there, no cost to you!