Python `statistics` Module

⚡ Quick Overview

The Python statistics module is part of the standard library and provides ready-made functions for basic descriptive statistics on numeric data. With a few function calls you can calculate averages, middle values, most frequent values, and how spread out your data is.

Key Concepts

Mean – Arithmetic average of a dataset.
Median – Middle value of the sorted data.
Mode – Most frequently occurring value.
Variance – Average of the squared deviations from the mean; measures how far data values spread out.
Standard Deviation – Square root of the variance; a measure of spread in the same units as the data.
Sample vs Population – Use variance()/stdev() for samples, and pvariance()/pstdev() for population data.
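
The sample/population distinction is easiest to see side by side. The following is a minimal sketch, assuming the same dataset used later on this page; the variable names are chosen only for illustration.

```python
import statistics

data = [10, 20, 30, 40, 50]

# Sample versions divide the sum of squared deviations by n - 1
sample_var = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Population versions divide the sum of squared deviations by n
population_var = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

print("Sample variance:", sample_var)
print("Population variance:", population_var)
print("Sample stdev:", round(sample_sd, 2))
print("Population stdev:", round(population_sd, 2))
```

The sample values come out larger because dividing by n - 1 compensates for estimating the mean from the same data.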

Syntax & Theory

To use the statistics module, you must import it first:

# Import the statistics module
import statistics

Most functions in this module accept an iterable (like a list or tuple) of numeric values:

statistics.mean(data)
statistics.median(data)
statistics.mode(data) or statistics.multimode(data)
statistics.variance(data), statistics.pvariance(data)
statistics.stdev(data), statistics.pstdev(data)

Code Examples

➕ Mean (Average)

Calculates the arithmetic average of a dataset.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the mean of the dataset
mean_value = statistics.mean(data)

# Print the mean value
print("Mean:", mean_value)

Median (Middle Value)

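Returns the middle value of the data once it is sorted; the input itself does not need to be sorted. With an even number of values, it returns the average of the two middle values.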

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the median of the dataset
median_value = statistics.median(data)

# Print the median value
print("Median:", median_value)
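
The example above has an odd number of values, so there is a single middle element. As a rough sketch of the even-length case, assuming a shortened, unsorted version of the same data, median() averages the two middle values, while median_low() and median_high() return one of the actual data points instead.

```python
import statistics

# Even number of values, deliberately unsorted
even_data = [40, 10, 30, 20]

# Average of the two middle values (20 and 30)
median_value = statistics.median(even_data)

# Lower and higher of the two middle values
low_value = statistics.median_low(even_data)
high_value = statistics.median_high(even_data)

print("Median:", median_value)
print("Median low:", low_value)
print("Median high:", high_value)
```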

Mode (Most Frequent Value)

Returns the most common value in the dataset.

# Import the statistics module
import statistics

# Define a dataset with repeated values
data2 = [1, 2, 2, 3, 4, 4, 4, 5]

# Calculate the mode (most frequent value)
mode_value = statistics.mode(data2)

# Print the mode value
print("Mode:", mode_value)
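
When several values tie for most frequent, mode() and multimode() behave differently. This is a small sketch, assuming Python 3.8 or later (where multimode() is available and mode() returns the first mode it encounters); the datasets are made up for illustration.

```python
import statistics

# Two values (1 and 2) tie for most frequent
tied_data = [1, 1, 2, 2, 3, 4]

# On Python 3.8+, mode() returns the first mode encountered
first_mode = statistics.mode(tied_data)

# multimode() returns every most-frequent value as a list
all_modes = statistics.multimode(tied_data)

# mode() also works on non-numeric (categorical) data
favourite = statistics.mode(["red", "blue", "blue", "green"])

print("First mode:", first_mode)
print("All modes:", all_modes)
print("Favourite colour:", favourite)
```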

⚙️ Variance (Spread of Data)

Measures how much the data varies from the mean, based on the squared deviations. variance() treats the data as a sample and divides by n - 1.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the variance of the dataset
variance_value = statistics.variance(data)

# Print the variance value
print("Variance:", variance_value)

Standard Deviation

Measures the typical distance of the data values from the mean. It is the square root of the variance, so it is expressed in the same units as the data.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the standard deviation of the dataset
std_dev = statistics.stdev(data)

# Print the standard deviation value
print("Standard Deviation:", std_dev)

Live Output & Explanation

For the dataset [10, 20, 30, 40, 50]:

Sample Output

Mean: 30
Median: 30
Variance: 250
Standard Deviation: ≈ 15.81

Here, the values are evenly spread around the mean (30). A higher variance or standard deviation would indicate that the data points are more spread out from the mean.
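
To make the point about spread concrete, here is a hedged sketch comparing the dataset above with a second, hypothetical dataset that has the same mean but values packed closer together.

```python
import statistics

# Same mean (30), different spread; tight_data is invented for comparison
wide_data = [10, 20, 30, 40, 50]
tight_data = [28, 29, 30, 31, 32]

wide_mean = statistics.mean(wide_data)
tight_mean = statistics.mean(tight_data)

wide_sd = statistics.stdev(wide_data)
tight_sd = statistics.stdev(tight_data)

print("Means:", wide_mean, tight_mean)
print("Wide stdev:", round(wide_sd, 2))
print("Tight stdev:", round(tight_sd, 2))
```

Both means are identical, so only the standard deviation reveals that the first dataset is far more spread out.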

Use Cases

Analyzing exam scores to find average performance and spread.
Summarizing sales, prices, or sensor readings.
Finding the most popular choice in survey or poll data using mode.
Comparing variability between different datasets (e.g., performance of two classes).

Tips & Best Practices

Use mean() when your data is fairly balanced without extreme outliers.
Prefer median() when your data has outliers or is skewed.
Use mode() (or multimode()) for categorical or repeated values.
Remember that variance() and stdev() need at least two data points; with fewer they raise statistics.StatisticsError.
For population data, use pvariance() and pstdev() instead of sample versions.
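When several values tie for most frequent, mode() returns only the first one it encounters (Python 3.8+; earlier versions raised StatisticsError instead). Use statistics.multimode() to get all of the most frequent values.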
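
Two of the tips above can be checked quickly in code. This is an illustrative sketch only: the salary figures are invented, and the single-value list exists just to trigger the error.

```python
import statistics

# Hypothetical salaries (in thousands); the last value is an outlier
salaries = [30, 32, 35, 38, 400]

# The outlier pulls the mean up sharply, while the median barely moves
mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
print("Mean:", mean_salary)
print("Median:", median_salary)

# variance() and stdev() need at least two data points
try:
    statistics.variance([42])
    error_raised = False
except statistics.StatisticsError as exc:
    error_raised = True
    print("Error:", exc)
```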

Try It Yourself

Create a list of exam marks and calculate the mean, median, and mode using the statistics module.
Compute the variance and standard deviation for a dataset of monthly expenses.
Use statistics.multimode() to find multiple modes in a dataset like [1, 1, 2, 2, 3, 4].
Experiment with adding outliers to your dataset and observe how mean and median change differently.