
# Introduction
Raise your hand if you started your data analyst career in Excel. Yup, me too. Excel is a powerful tool for data analysis and visualization—and you know it. Let’s keep the Excel jokes for another article. However, despite improvements in handling larger datasets, there’s a point where Excel starts to creak under the load.
At this point, you might think, “Ah, screw Excel, I should’ve learned Python.” You still can. (Learn Python, not screw Excel.) Also, making the shift doesn’t mean abandoning Excel. Think of Python as a natural extension of the skills you already have; the steps below show how to build on them.
# Step 1: Map Excel Skills to Python Equivalents
Many Excel skills transfer to Python, even though Python is a programming language rather than a spreadsheet. You can think of it as “Excel without the grid,” since many Excel functions have close Python equivalents. Here are some examples.
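As a rough illustration, here is how a handful of everyday Excel formulas might look in pandas. The `sales` table, its column names, and its numbers are made up for this sketch, and the Excel formulas in the comments are approximate counterparts rather than exact translations.

```python
import pandas as pd

# Hypothetical sales data standing in for a small worksheet.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "West"],
    "units": [10, 5, 8, 12],
    "price": [2.5, 4.0, 2.5, 3.0],
})

# A calculated column, like =B2*C2 filled down.
sales["revenue"] = sales["units"] * sales["price"]

# =SUM(D:D) and =AVERAGE(B:B)
total_revenue = sales["revenue"].sum()
average_units = sales["units"].mean()

# =SUMIF(A:A, "North", D:D)
north_revenue = sales.loc[sales["region"] == "North", "revenue"].sum()

# =COUNTIF(B:B, ">=10")
big_orders = int((sales["units"] >= 10).sum())

# AutoFilter: keep only the rows where units is greater than 8.
filtered = sales[sales["units"] > 8]

print(total_revenue, average_units, north_revenue, big_orders)
print(filtered)
```

The pattern to notice is that you operate on whole columns by name instead of pointing at cell ranges.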
While you’ll still have to learn Python’s syntax and language fundamentals, you’re not starting from scratch—you already understand the analytics part of the job. Now it’s about doing in Python what you already do in Excel.
# Step 2: Learn Python Fundamentals
Before you start coding, familiarize yourself with the language fundamentals. I recommend starting with:
- Basic syntax
- Variables, data types, loops, conditionals
- Lists and dictionaries (they’re similar to named ranges or lookup tables)
- Functions for reusing code (they’re like reusable formulas in Excel)
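To give a feel for these building blocks, here is a minimal sketch that touches each item on the list. The product names, prices, and the `revenue` helper are invented for illustration.

```python
# Variables and basic data types
product = "Widget"      # text (str)
units_sold = 12         # whole number (int)
unit_price = 2.5        # decimal number (float)

# A list holds an ordered series of values, a bit like a column of cells.
monthly_units = [12, 7, 15]

# A dictionary maps keys to values, a bit like a two-column lookup table.
prices = {"Widget": 2.5, "Gadget": 4.0}


def revenue(item, units):
    """Reusable calculation, similar in spirit to a formula you fill down."""
    return prices[item] * units


# A loop with a conditional: label each month as "high" or "low".
labels = []
for units in monthly_units:
    if units >= 10:
        labels.append("high")
    else:
        labels.append("low")

print(revenue(product, units_sold))
print(labels)
```

If you can read this snippet and predict what it prints, you have enough of the basics to move on.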
# Step 3: Set Up Your Environment
You don’t need a complicated Python environment. If you want to work locally, install Anaconda. It comes with Python and the key libraries you’ll need at the beginning (pandas, NumPy, and Matplotlib). It also includes Jupyter Notebook; think of a notebook as a workbook where you write code alongside text notes.
You can make it even easier. If you have a Google account, you can use Colab. It’s Google’s version of the Jupyter Notebook and comes with even more libraries installed than Anaconda.
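Whichever route you pick, a quick sanity check in your first notebook cell can confirm that the libraries are there. This is one possible sketch using only the standard library; `check_libraries` is a helper name made up for this example, not part of Anaconda or Colab.

```python
from importlib import metadata


def check_libraries(names):
    """Return {package name: installed version, or None if not installed}."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in check_libraries(["pandas", "numpy", "matplotlib"]).items():
    print(f"{name}: {version or 'not installed'}")
```

If anything shows up as not installed, that is your cue to install it before moving on.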
# Step 4: Start with Pandas
Python is famous for its ecosystem, rich with libraries that extend its capabilities. One of them is pandas, a library designed for data analysis and manipulation. It’s so common in data analysis that it’s practically inseparable from Python itself; once you start learning Python, you also learn pandas. Some things you should practice are:
- Creating DataFrames from Excel or CSV files
- Filtering, sorting, merging, aggregating
- Replicating your Excel workflows: pivot tables, lookups, conditional calculations
In general, try to translate everything you do in Excel into Python code.
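Here is a small sketch of what such a translation could look like: load a table, do a VLOOKUP-style lookup, sort, and build a pivot table. The data is inline and entirely hypothetical so the example runs on its own; the column names are assumptions, not a required schema.

```python
import io

import pandas as pd

# Inline CSV text stands in for a real file; with a file on disk you would
# call pd.read_csv("orders.csv") or pd.read_excel("orders.xlsx") instead.
orders_csv = io.StringIO(
    "order_id,product_id,quarter,units\n"
    "1,A,Q1,10\n"
    "2,B,Q1,4\n"
    "3,A,Q2,6\n"
    "4,B,Q2,9\n"
)
orders = pd.read_csv(orders_csv)

# A small lookup table, like the range you would point VLOOKUP at.
products = pd.DataFrame({
    "product_id": ["A", "B"],
    "product_name": ["Widget", "Gadget"],
    "unit_price": [2.5, 4.0],
})

# Lookup: a left merge pulls name and price onto every order row.
merged = orders.merge(products, on="product_id", how="left")
merged["revenue"] = merged["units"] * merged["unit_price"]

# Sort: largest revenue first.
ranked = merged.sort_values("revenue", ascending=False)

# Pivot table: products as rows, quarters as columns, summed revenue as values.
pivot = merged.pivot_table(
    index="product_name", columns="quarter", values="revenue", aggfunc="sum"
)
print(ranked)
print(pivot)
```

A left merge is used here because it keeps every order row even when the lookup finds no match, which is close to how a lookup column behaves in a worksheet.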
Once you get the hang of pandas, start using NumPy, a library for numerical computing that underpins pandas.
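For a first taste of NumPy, the sketch below does arithmetic on whole arrays at once, with no loop. The numbers are invented for the example.

```python
import numpy as np

units = np.array([10, 5, 8, 12])
prices = np.array([2.5, 4.0, 2.5, 3.0])

# Element-wise multiplication: each unit count times its matching price.
revenue = units * prices
total = revenue.sum()

# np.where works like an =IF() applied to every element.
labels = np.where(units >= 10, "large", "small")

print(revenue, total, labels)
```

This column-at-a-time style is the same one pandas exposes, which is why the two libraries feel so similar in practice.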
# Step 5: Practice on Real Data
The fastest way to learn is by doing. There are several options. You can solve analytical questions on StrataScratch and LeetCode and practice on real interview questions. You get the data and the problem to solve; all you have to do is write the solution in Python.
Another option is to pick a publicly available dataset and solve problems you come up with yourself. Some great dataset sources are Kaggle Datasets, data.gov, and Awesome Public Datasets.
If you need some suggestions for problems to solve, start with:
- Data cleaning (removing duplicates, standardizing dates, filling missing values)
- Building simple reports you’d normally do in Excel
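The two suggestions above can be sketched in a few lines of pandas. The `raw` table below is a made-up messy export, and filling gaps with the median is just one reasonable choice for this example, not a rule.

```python
import pandas as pd

# Hypothetical messy export: a repeated row, text dates, and a missing value.
raw = pd.DataFrame({
    "customer": ["Ann", "Ann", "Bob", "Cy"],
    "signup": ["15/01/2024", "15/01/2024", "03/02/2024", "28/02/2024"],
    "spend": [120.0, 120.0, None, 80.0],
})

# Remove exact duplicate rows (like Data > Remove Duplicates).
clean = raw.drop_duplicates().reset_index(drop=True)

# Standardize dates: parse day/month/year text into real datetime values.
clean["signup"] = pd.to_datetime(clean["signup"], format="%d/%m/%Y")

# Fill the missing spend with the column median.
clean["spend"] = clean["spend"].fillna(clean["spend"].median())

# A simple report: total spend per signup month.
report = clean.groupby(clean["signup"].dt.strftime("%Y-%m"))["spend"].sum()
print(report)
```

Passing an explicit `format` to `pd.to_datetime` avoids any guessing about whether 03/02 means February 3 or March 2.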
# Step 6: Start Visualizing Data
The next step is to start visualizing your analyses. A great start is to recreate in Python the charts you already have in Excel. The two most popular data visualization Python libraries are:
- Matplotlib – for basic plots (line, bar, scatter)
- seaborn – for advanced visualizations with minimal code
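As a starting point, here is a minimal Matplotlib sketch of the kind of bar chart you would build from a small Excel summary. The regions, figures, and output filename are placeholders. The `Agg` backend line is only there so the script also runs outside a notebook; in Jupyter or Colab you can usually drop it and the chart appears under the cell.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical summary data, like a two-column range in a worksheet.
regions = ["North", "South", "West"]
revenue = [45.0, 20.0, 36.0]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(regions, revenue)
ax.set_title("Revenue by region")
ax.set_xlabel("Region")
ax.set_ylabel("Revenue")
fig.tight_layout()

# Save the chart as an image file next to the script.
fig.savefig("revenue_by_region.png")
```

Once this feels comfortable, try rebuilding the same chart in seaborn and compare how much code each version needs.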
# Step 7: Combine Excel and Python
You don’t need to abandon Excel. Even if you wanted to, you couldn’t, because most stakeholders around you are wedded to Excel.
The ideal combination is to use openpyxl or xlwings to write back into Excel files from Python. In other words, Python does the heavy lifting in the background, but the final output lands in Excel for stakeholders. No need to stop there; currently, Microsoft is testing the new COPILOT() function that allows you to use AI in Excel.
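A minimal sketch of that hand-off, assuming pandas and openpyxl are both installed: Python builds the summary, writes it to an .xlsx file, and reads it back as a check. The data, sheet name, and temporary output path are all placeholders for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical result of an analysis done in Python.
summary = pd.DataFrame({
    "region": ["North", "South", "West"],
    "revenue": [45.0, 20.0, 36.0],
})

# Placeholder output location; in practice this would be a shared folder path.
output_path = os.path.join(tempfile.mkdtemp(), "revenue_summary.xlsx")

# pandas delegates the actual .xlsx writing to openpyxl here.
summary.to_excel(output_path, sheet_name="Summary", index=False, engine="openpyxl")

# Read the file back to confirm what a stakeholder would see on opening it.
round_trip = pd.read_excel(output_path, sheet_name="Summary", engine="openpyxl")
print(round_trip)
```

xlwings takes a different route and talks to a running Excel application, so it is worth a look if you need to update a workbook that is already open.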
# Conclusion
As you can see, transitioning from Excel to Python doesn’t mean you’re starting from zero. If you do data analysis in Excel, you already have the fundamental knowledge. You know your data analysis; what’s left is to make it more technically sophisticated by transferring that knowledge to a programming language.
Follow the steps in this article, and your transition will be smoother than you think.
Nate Rosidi is a data scientist and works in product strategy. He’s also an adjunct professor teaching analytics, and is the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.