# Exploring pandas DataFrame.median(): A Comprehensive Guide

## Introduction

The median is a vital statistical measure that helps to understand the central tendency of a dataset. In pandas, a popular Python library for data manipulation and analysis, the `median()` function is used to calculate the median value of each column in a DataFrame. This guide will explore how to use the `median()` function in pandas, discuss its parameters, and provide examples to illustrate its application.

## Understanding the Median

The median is the middle value of a dataset when it is ordered from smallest to largest. If the dataset has an even number of observations, the median is the average of the two middle numbers.
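
As a quick check of this definition, the following sketch computes the median of an odd-length and an even-length `Series`. The sample values are made up for illustration.

```python
import pandas as pd

# Odd number of observations: the median is the single middle value.
odd = pd.Series([7, 1, 3, 9, 5])      # sorted: 1, 3, 5, 7, 9
odd_median = odd.median()

# Even number of observations: the median is the mean of the two middle values.
even = pd.Series([8, 1, 3, 10])       # sorted: 1, 3, 8, 10
even_median = even.median()

print(odd_median)    # 5.0
print(even_median)   # 5.5, the mean of 3 and 8
```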

## Using the `median()` Function in pandas

The `median()` function in pandas can be applied to a DataFrame to calculate the median of all the numeric columns. The syntax is as follows:

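`DataFrame.median(axis=0, skipna=True, level=None, numeric_only=None)`

- `axis`: {0 or 'index', 1 or 'columns'}, default 0. The axis along which the median is calculated: 0 gives one median per column, 1 gives one median per row.
- `skipna`: default True. Exclude NA/null values when computing the result.
- `level`: If the axis is a MultiIndex, compute the median along a particular level, collapsing into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use `groupby(level=...)` instead.
- `numeric_only`: Include only float, int, and boolean columns. The default was None in pandas 1.x and is False from pandas 2.0 onward.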
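
To illustrate `numeric_only`, here is a minimal sketch using a hypothetical `orders` table that mixes a text column with numeric ones. How a text column is treated when `numeric_only` is left at its default depends on the pandas version, so the example passes it explicitly.

```python
import pandas as pd

# Hypothetical sample data: one text column and two numeric columns.
orders = pd.DataFrame({
    "item": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.0],
    "qty": [10, 2, 1],
})

# Restrict the calculation to numeric columns; "item" is left out of the result.
numeric_medians = orders.median(numeric_only=True)
print(numeric_medians)
```

The result is a Series indexed by the numeric column names only, here `price` and `qty`.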

## Examples

Let’s go through some examples to understand how to use the `median()` function.

Example 1: Calculating Median of a DataFrame

```python
import pandas as pd
import numpy as np

# Column C contains one missing value (NaN)
data = {
    "A": [1, 2, 3, 4, 5],
    "B": [5, 4, 3, 2, 1],
    "C": [2, 3, np.nan, 3, 2]
}

df = pd.DataFrame(data)

# Median of each column; NaN is skipped by default
median_values = df.median()
print(median_values)
```

Output:

```
A    3.0
B    3.0
C    2.5
dtype: float64
```

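In the above example, the median of each column is calculated, excluding the `NaN` value in column C. The four remaining values in C are 2, 2, 3, and 3, so its median is the average of the two middle values, 2.5.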

Example 2: Calculating Median along Rows

To calculate the median along the rows, set the `axis` parameter to 1.

```python
median_values_rows = df.median(axis=1)
print(median_values_rows)
```

Output:

```
0    2.0
1    3.0
2    3.0
3    3.0
4    2.0
dtype: float64
```
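
The `axis` parameter also accepts the names listed in the parameter description. The sketch below rebuilds the example DataFrame, shows that `axis="columns"` gives the same result as `axis=1`, and cross-checks the row medians against NumPy's `nanmedian`, which likewise ignores `NaN`. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [5, 4, 3, 2, 1],
    "C": [2, 3, np.nan, 3, 2],
})

by_number = df.median(axis=1)
by_name = df.median(axis="columns")   # same axis, referred to by name

# Independent check: NumPy's nanmedian also skips NaN values.
numpy_check = np.nanmedian(df.to_numpy(), axis=1)

print(by_number.equals(by_name))                        # True
print(np.allclose(by_number.to_numpy(), numpy_check))   # True
```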

Example 3: Handling Missing Values

By default, the `median()` function skips null values. Setting `skipna` to False keeps them in the calculation, so any column that contains a missing value returns `NaN`.

```python
median_values_with_na = df.median(skipna=False)
print(median_values_with_na)
```

Output:

```
A    3.0
B    3.0
C    NaN
dtype: float64
```
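
Skipping missing values is not the only option. One common alternative is to replace each missing value with its column's median before further analysis. The sketch below shows this with `fillna()`; it is one possible approach under the assumption that median imputation suits the data, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [5, 4, 3, 2, 1],
    "C": [2, 3, np.nan, 3, 2],
})

# df.median() skips NaN by default, so column C's median is 2.5.
# fillna() aligns the medians by column name and fills each gap with them.
filled = df.fillna(df.median())

print(filled["C"].tolist())          # [2.0, 3.0, 2.5, 3.0, 2.0]

# With no missing values left, skipna=False no longer produces NaN.
print(filled.median(skipna=False))
```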

## Conclusion

The `median()` function in pandas summarizes the central tendency of each column or row of a DataFrame. Use `axis` to choose between column-wise and row-wise medians, and set `skipna` according to your dataset's needs and the context of your analysis, since skipping a `NaN` and propagating it lead to different results.