Getting row and column counts in Pandas is a simple operation that shows how much data a given dataset contains. It is usually one of the first steps in exploratory data analysis (EDA). In this tutorial, we’ll pull data from an open-source dataset from FSU and perform these operations on a DataFrame.
DataFrame.shape is the fastest way to get both the number of rows and the number of columns of a DataFrame in Pandas. Once you have initialized your DataFrame, the shape attribute returns a tuple: the first value is the number of rows and the second value is the number of columns.
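As a minimal sketch of reading the shape tuple: the small DataFrame below is hypothetical sample data standing in for the tutorial's dataset, and the column names are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical sample data (not the FSU dataset used in the tutorial).
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 29, None, 41],
    "city": ["Tampa", None, "Miami", "Orlando"],
})

# shape is an attribute, not a method, so there are no parentheses.
n_rows, n_cols = df.shape

print(df.shape)        # (4, 3)
print(n_rows, n_cols)  # 4 3
```

Unpacking the tuple into two names keeps later code readable when only one of the two counts is needed.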
When you want the number of records in each column rather than a single row count, DataFrame.count() is the most useful approach. Calling it returns a Pandas Series with the column names as the index and the count of non-null (non-NaN) records in each column as the values.
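A short sketch of the per-column count, again using a hypothetical DataFrame with a few missing values so the difference between columns is visible:

```python
import pandas as pd

# Hypothetical sample data with one missing value in "age" and one in "city".
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 29, None, 41],
    "city": ["Tampa", None, "Miami", "Orlando"],
})

# count() returns a Series indexed by column name,
# holding the number of non-null values in each column.
counts = df.count()

print(counts["name"])  # 4
print(counts["age"])   # 3
print(counts["city"])  # 3
```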
One drawback of the prior approach is that, on a dataset with sparse data, DataFrame.count() can return a count lower than the true number of records, because a column with missing values does not count every row. Counting the DataFrame.index object, however, gives the complete number of rows in the DataFrame regardless of missing values. len(DataFrame.index) returns an integer with the number of records in the index.
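The contrast between the two approaches can be sketched as follows. The DataFrame is hypothetical sample data, built so that one column is sparse:

```python
import pandas as pd

# Hypothetical sample data: "age" is missing for one of the four rows.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 29, None, 41],
})

# count() skips the missing value, so the sparse column under-reports rows.
age_count = df["age"].count()

# The index has one entry per row, whether or not values are missing.
row_count = len(df.index)

print(age_count)  # 3
print(row_count)  # 4
```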
Counting Columns with DataFrame.columns
DataFrame.columns returns a Pandas Index object holding the column names. The simplest way to get the count of columns is to pass this object to the standard Python len() function.
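A minimal sketch of counting columns this way, using hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data with three columns.
df = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "age": [34, 29],
    "city": ["Tampa", "Miami"],
})

# df.columns is an Index of column labels; len() gives the column count.
n_cols = len(df.columns)

print(list(df.columns))  # ['name', 'age', 'city']
print(n_cols)            # 3
```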
Counting Columns and Rows with DataFrame.axes
DataFrame.axes is an attribute available on every DataFrame that returns a list with two elements describing the DataFrame's rows and columns. The first element is the row index, which can be counted using the len() function to get the number of rows.
The second element is the column Index, the same object returned by DataFrame.columns, which can also be counted using the len() function to get the count of columns.
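Both counts can be read from the axes list, as in this short sketch on hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data: four rows, three columns.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 29, None, 41],
    "city": ["Tampa", None, "Miami", "Orlando"],
})

# axes is a two-element list: [row index, column index].
row_axis, col_axis = df.axes

n_rows = len(row_axis)  # same as len(df.axes[0])
n_cols = len(col_axis)  # same as len(df.axes[1])

print(n_rows, n_cols)  # 4 3
```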
Counting Rows with the DataFrame Object
The simplest approach is to pass the DataFrame itself to the len() function, which returns the length of its index, that is, the number of rows.
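A minimal sketch on hypothetical sample data, showing that len() on the DataFrame matches the length of its index:

```python
import pandas as pd

# Hypothetical sample data with four rows.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 29, None, 41],
})

# len() on a DataFrame returns the number of rows.
n_rows = len(df)

print(n_rows)                   # 4
print(n_rows == len(df.index))  # True
```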
After this tutorial you should be able to get row and column counts in Pandas. For the code used in this tutorial, please visit our GitHub repository on Data Analysis. More information on common Pandas operations can be found in our detailed tutorials.