With so many people pushing into the data analyst role across industries, it’s become more important than ever to set yourself up for success when starting out in your new internship.
Just like any job, you will need fundamental skills in place so you can start helping to solve problems for your organization from day one. However, deep knowledge of a specific organization or industry probably isn’t available to you unless you’ve worked or studied in that area before.
Analytics and data science programs generally don’t provide enough context for working in, say, a media company or in finance. Needless to say, there will be lots of on-the-job learning, which is why you’re doing your internship in the first place. So, what are the essential skills you need to succeed regardless of the industry?
Data Analysis Toolkit
There are three hard skills we recommend to everyone who is entering the data analysis field:
- Excel
- SQL
- Python
These hard skills apply across industries and job titles, and they are easily accessible for beginners because they cost little or nothing to learn.
The Best Excel Training
Excel, or preferably (from my perspective) Google Sheets, is inescapable on the job. You can do as much hard number crunching as you want in Python or SQL, but the vast majority of business stakeholders will want their data in Excel.
Excel also enables additional layers of calculations, data manipulations, visualizations, and integrations with things like PowerPoint that aren’t as quickly available through Python.
Moral of the story: you’re not going to get far ignoring Excel. Instead, I would recommend taking a different approach: master Excel. Whether you want to go into consulting, finance, or even the non-profit world, you want to be seen as the go-to person with Excel-wizard skills. You’ll be approached with interesting and challenging questions people want to solve, and you’ll be just as valuable for answering them.
While Excel can seem quite boring, it actually has add-ons that let it tackle some interesting business problems through statistical tests, linear regression, time-series analysis, and constrained optimization problems.
So, how do you get started? I got started with an older edition of Microsoft Excel Data Analysis and Business Modeling back in 2008, but the newest editions have kept up with the times. That said, some of the use cases can be a little boring and feel less interactive than an online course.
If you want a video component and assignments to test your knowledge as you acquire it, I’d recommend checking out Udacity’s Business Analytics program to get a better sense of how Excel is applied in the overall business context versus other data analysis and visualization tools.
The Best SQL Training
Much like Excel, SQL is a must-learn skill. It’s everywhere: in almost every organization that stores data, you’re going to find you need it.
We’re highly biased toward getting started with the free datasets available in Google Cloud Platform’s BigQuery. Not only is BigQuery used by many digital and cloud-based organizations, it is also readily accessible through any existing Google account: sign up for free and you get immediate access to many real datasets to practice analyzing.
Check out this super lengthy, and free, course to get started. It includes actual SQL queries to run, sample datasets, and follow-along videos you can learn from.
If you want a more formal online certificate course, try this Udacity Program. It’s a long course, but it contains more intermediate and advanced SQL commands that you won’t find in the free BigQuery SQL tutorial mentioned above.
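To give a feel for the kind of query you’ll write early on, here is a minimal sketch that runs a grouped aggregation from Python using the built-in sqlite3 module. The orders table and its rows are made up for illustration, and SQLite’s dialect differs from BigQuery’s in places, so treat this as a local practice setup rather than anything BigQuery-specific.

```python
import sqlite3

# Hypothetical sample data: (order_id, region, amount)
orders = [
    (1, "East", 120.0),
    (2, "West", 80.0),
    (3, "East", 200.0),
    (4, "North", 50.0),
    (5, "West", 40.0),
]

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Count orders and total the amounts per region, keep regions with at
# least 100 in total sales, and sort the biggest regions first.
query = """
    SELECT region,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_amount
    FROM orders
    GROUP BY region
    HAVING SUM(amount) >= 100
    ORDER BY total_amount DESC
"""
rows = conn.execute(query).fetchall()
for region, order_count, total_amount in rows:
    print(region, order_count, total_amount)

conn.close()
```

SELECT, GROUP BY, HAVING, and ORDER BY cover a large share of everyday reporting questions, so they are a reasonable place to start before moving on to joins and window functions.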
The Best Python Training
It’s very useful, if not essential, to learn a scripting or programming language if you want to be a true Data Analyst. Once you have a handle on SQL, focus your time on a single scripting language.
When it comes to statistical analysis, languages and platforms like Stata, SPSS, or SAS are generally a waste of time for beginners to learn because the skills only apply at an ever-shrinking number of firms and industries. While they will teach you basic statistical techniques, they have their gaps.
When it comes to narrowing down the field, there are basically two options: R or Python. R is second to Python in the eyes of many data scientists. Were I to go back in time, I would have focused only on Python and mastered it earlier, but I was in an academic program which pointed me in several directions. Don’t let this happen to you. Choose Python and jump into the deep end.
There are several aspects of Python you need to know as a Data Analyst, mainly Python’s Standard Library and the Pandas library. In my experience, the quickest way to learn these two things was to read the two books below and jump into some real-world number crunching to apply those learnings as fast as possible.
Learn Python the Hard Way – This book is really focused on teaching you programming basics so that you understand terms such as libraries, variables, classes, and more. It doesn’t cover very exciting topics, but it is a foundation for training your mind to work with a programming language effectively. The idea behind “the Hard Way” is that it’s set up to make you struggle a little bit and learn quickly from your failures. It’s worth it.
Python for Data Analysis – The Pandas library is by far the most used library among data analysts, scientists, and engineers, and is likely one of the most popular Python libraries there is. Why? It allows you to use Python to manipulate data in a way very similar to what you’d do in Excel, except instead of dealing with MBs of data, it can easily process GBs. This book was written by Wes McKinney, who created the library, and it’s a fantastic read from end to end.
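As a rough illustration of that Excel-like feel, the sketch below uses Pandas to add a calculated column, filter rows, and build a pivot table. The sales data and column names are invented for the example, and it assumes Pandas is installed.

```python
import pandas as pd

# Hypothetical sales records, the kind of table you might keep in a spreadsheet.
sales = pd.DataFrame(
    {
        "region": ["East", "West", "East", "West", "East"],
        "product": ["A", "A", "B", "B", "A"],
        "units": [10, 4, 7, 3, 5],
        "unit_price": [2.5, 2.5, 4.0, 4.0, 2.5],
    }
)

# Like adding a calculated column in a spreadsheet.
sales["revenue"] = sales["units"] * sales["unit_price"]

# Like applying a filter to the region column.
east_only = sales[sales["region"] == "East"]

# Like a pivot table: rows are regions, columns are products,
# and each cell holds the summed revenue.
pivot = sales.pivot_table(
    index="region", columns="product", values="revenue", aggfunc="sum"
)
print(pivot)
```

The same three steps work unchanged whether the table has five rows or several million, which is where the advantage over a spreadsheet starts to show.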
Another option for you is to dive in the deep end with a full course from Udacity. I recommend the Python for Data Science course to get ramped up quickly. It covers both basic and advanced data analysis libraries.
For more on getting set up with Python, we’ve covered some of the basics in a few other articles.
Soft Skills For Data Analysts
While you’re focusing on the parts of your role that involve actually crunching data, you can’t forget what comes before and after that step. Spend some time thinking about the questions below and about how your general workflow will play out in a professional environment:
- How much time have you spent validating requirements for an analysis project?
- What’s the expected output of your analysis? Is it just a raw data dump, or are stakeholders looking for analytical findings?
- How will you communicate with your colleagues if what they’re asking for simply isn’t possible with the available data?
- Which parts of this analysis are most impactful to the business?
- How will you present issues or challenges that may make actionable takeaways difficult for the business?
- What is the best medium for communicating your results to a stakeholder?
- Will your data visualizations match the type of insights your client is looking for?
These are just some of the considerations at stake when you’re learning on the job. You’ll figure a lot of these out in the context of the organization you’re working in, and once you encounter the real-world data that your academic or technical training may not have prepared you for (hint: data in the wild can be very messy). There is a reason analytics professionals often refer to themselves as “data janitors.”
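To make “messy” concrete, here is a small sketch of the janitorial work involved, using only Python’s Standard Library. The raw export, its column names, and the cleaning rules are all hypothetical; real projects will need rules agreed with whoever owns the data.

```python
import csv
import io

# Hypothetical raw export with typical problems: stray whitespace,
# inconsistent capitalization, a duplicate row, and unusable amounts.
raw = """customer,region,amount
 Acme ,east,100
Acme,East,100
Globex,WEST,"1,250"
Initech,west,n/a
Umbrella,East,
"""


def clean_rows(text):
    """Return (cleaned, rejected) rows from CSV text."""
    cleaned = []
    rejected = []
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        customer = row["customer"].strip()
        region = row["region"].strip().title()
        # Remove thousands separators before converting to a number.
        amount_text = row["amount"].strip().replace(",", "")
        try:
            amount = float(amount_text)
        except ValueError:
            # Keep unparseable rows aside so they can be reported, not hidden.
            rejected.append(row)
            continue
        key = (customer, region, amount)
        if key in seen:
            continue  # skip exact duplicates after cleaning
        seen.add(key)
        cleaned.append({"customer": customer, "region": region, "amount": amount})
    return cleaned, rejected


cleaned, rejected = clean_rows(raw)
print(cleaned)
print(len(rejected), "rows need follow-up")
```

Keeping the rejected rows, rather than silently dropping them, gives you something specific to take back to your colleagues when the data can’t support what they asked for.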
Areas where you can focus your soft skills can include data visualization and basic consulting skills.
This is an internship, and your hard skills will improve as you work on the job. Do some background study of Excel, Python, and SQL to get yourself as ready as you can be.
Remember the basics of being a professional and ensure you’re getting acquainted with the industry.
Lastly, reach out to your employer before you start to understand what skills they expect from you, so you can focus your attention on what you’ll need on day one.