Python statistics Module

⚡ Quick Overview

The Python statistics module provides ready-made functions for basic descriptive statistics on numeric data. You can quickly calculate averages, middle values, most frequent values, and how spread out your data is using a few simple function calls.

Key Concepts

  • Mean – Arithmetic average of a dataset.
  • Median – Middle value of the sorted data (the average of the two middle values when the count is even).
  • Mode – Most frequently occurring value.
  • Variance – Average squared distance of the data values from the mean.
  • Standard Deviation – Square root of the variance; a measure of spread in the same units as the data.
  • Sample vs Population – Use variance()/stdev() when the data is a sample drawn from a larger population, and pvariance()/pstdev() when the data is the entire population.
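
The sample and population functions differ only in the divisor: the sample versions divide the sum of squared deviations by n - 1, the population versions by n. A minimal sketch of the difference, using an illustrative dataset:

```python
import statistics

# Illustrative dataset (the same values used in the examples on this page)
data = [10, 20, 30, 40, 50]

# Sample variance: sum of squared deviations divided by n - 1
sample_var = statistics.variance(data)

# Population variance: the same sum divided by n
population_var = statistics.pvariance(data)

print("Sample variance:", sample_var)          # 1000 / 4 = 250
print("Population variance:", population_var)  # 1000 / 5 = 200
```

The sample variance is always the larger of the two, and the gap shrinks as the dataset grows.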

Syntax & Theory

To use the statistics module, you must import it first:

# Import the statistics module
import statistics

Most functions in this module accept an iterable (like a list or tuple) of numeric values:

  • statistics.mean(data)
  • statistics.median(data)
  • statistics.mode(data) or statistics.multimode(data)
  • statistics.variance(data), statistics.pvariance(data)
  • statistics.stdev(data), statistics.pstdev(data)
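
As a quick illustration of the "iterable" requirement, the sketch below passes a list, a tuple, and a generator expression to mean(). The variable names are made up for this example.

```python
import statistics

# The same values stored in two different container types
scores_list = [10, 20, 30, 40, 50]
scores_tuple = (10, 20, 30, 40, 50)

list_mean = statistics.mean(scores_list)
tuple_mean = statistics.mean(scores_tuple)

# A generator expression is also an iterable; it yields 10, 20, 30, 40, 50
generator_mean = statistics.mean(x * 10 for x in range(1, 6))

print(list_mean, tuple_mean, generator_mean)  # 30 30 30
```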

Code Examples

➕ Mean (Average)

Calculates the arithmetic average of a dataset.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the mean of the dataset
mean_value = statistics.mean(data)

# Print the mean value
print("Mean:", mean_value)

Median (Middle Value)

Returns the middle value of the dataset. median() sorts the values internally, so the input does not need to be sorted first.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the median of the dataset
median_value = statistics.median(data)

# Print the median value
print("Median:", median_value)
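
The example above has an odd number of values, so there is a single middle value. With an even count there are two middle values, and median() returns their average. A small sketch with an illustrative dataset, also showing median_low() and median_high(), which return one of the two middle values instead:

```python
import statistics

# Illustrative dataset with an even number of values
even_data = [10, 20, 30, 40]

# The two middle values are 20 and 30; median() averages them
even_median = statistics.median(even_data)

# median_low() and median_high() pick the lower or higher middle value
low_median = statistics.median_low(even_data)
high_median = statistics.median_high(even_data)

print("Median:", even_median)        # 25.0
print("Median low:", low_median)     # 20
print("Median high:", high_median)   # 30

# Unsorted input gives the same result because median() sorts internally
unsorted_median = statistics.median([50, 10, 30, 20, 40])
print("Median of unsorted data:", unsorted_median)  # 30
```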

Mode (Most Frequent Value)

Returns the most common value in the dataset.

# Import the statistics module
import statistics

# Define a dataset with repeated values
data2 = [1, 2, 2, 3, 4, 4, 4, 5]

# Calculate the mode (most frequent value)
mode_value = statistics.mode(data2)

# Print the mode value
print("Mode:", mode_value)
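
mode() also works on non-numeric data, and multimode() handles ties. The sketch below assumes Python 3.8 or later (multimode() was added in 3.8); the survey data is invented for illustration.

```python
import statistics

# Hypothetical survey answers (categorical data)
survey = ["red", "blue", "red", "green", "blue", "red"]
favorite = statistics.mode(survey)
print("Most popular:", favorite)  # red

# A dataset where 1 and 2 are tied for most frequent
tied = [1, 1, 2, 2, 3, 4]

# multimode() returns every most-frequent value, in order of first appearance
all_modes = statistics.multimode(tied)
print("All modes:", all_modes)  # [1, 2]

# On Python 3.8+, mode() returns the first mode it encounters
first_mode = statistics.mode(tied)
print("First mode:", first_mode)  # 1
```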

⚙️ Variance (Spread of Data)

Measures how much the data varies from the mean. variance() computes the sample variance; use pvariance() when the data is the entire population.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the variance of the dataset
variance_value = statistics.variance(data)

# Print the variance value
print("Variance:", variance_value)

Standard Deviation

Measures how far the data values typically lie from the mean, expressed in the same units as the data. stdev() computes the sample standard deviation.

# Import the statistics module
import statistics

# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]

# Calculate the standard deviation of the dataset
std_dev = statistics.stdev(data)

# Print the standard deviation value
print("Standard Deviation:", std_dev)
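
Since the standard deviation is the square root of the variance, the two results can be checked against each other. A minimal sketch of that check, using math.sqrt() and math.isclose() from the standard library:

```python
import math
import statistics

data = [10, 20, 30, 40, 50]

variance_value = statistics.variance(data)
std_dev = statistics.stdev(data)

# stdev() should match the square root of variance(), up to rounding
matches = math.isclose(std_dev, math.sqrt(variance_value))

print("Variance:", variance_value)                    # 250
print("Standard Deviation:", round(std_dev, 2))       # 15.81
print("stdev equals sqrt(variance):", matches)        # True
```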

Live Output & Explanation

For the dataset [10, 20, 30, 40, 50]:

Sample Output

  • Mean: 30
  • Median: 30
  • Variance: 250
  • Standard Deviation: ≈ 15.81

For the mode example's dataset [1, 2, 2, 3, 4, 4, 4, 5], the output is Mode: 4.
Here, the values are evenly spread around the mean (30). A higher variance or standard deviation would indicate that the data points are more spread out from the mean.
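
To make the point about spread concrete, the sketch below compares two illustrative datasets that share the same mean but differ in how far their values sit from it. The dataset names are invented for this example.

```python
import statistics

# Two illustrative datasets with the same mean (30) but different spread
tight = [28, 29, 30, 31, 32]
wide = [10, 20, 30, 40, 50]

tight_mean = statistics.mean(tight)
wide_mean = statistics.mean(wide)

tight_sd = statistics.stdev(tight)
wide_sd = statistics.stdev(wide)

print("Means:", tight_mean, wide_mean)      # 30 30
print("Tight stdev:", round(tight_sd, 2))   # 1.58
print("Wide stdev:", round(wide_sd, 2))     # 15.81
```

The mean alone cannot distinguish these datasets; the standard deviation can.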

Use Cases

  • Analyzing exam scores to find average performance and spread.
  • Summarizing sales, prices, or sensor readings.
  • Finding the most popular choice in survey or poll data using mode.
  • Comparing variability between different datasets (e.g., performance of two classes).

Tips & Best Practices

  • Use mean() when your data is fairly balanced without extreme outliers.
  • Prefer median() when your data has outliers or is skewed.
  • Use mode() (or multimode()) for categorical or repeated values.
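  • Remember that variance() and stdev() need at least two data points; with fewer they raise statistics.StatisticsError.
  • For population data, use pvariance() and pstdev() instead of the sample versions.
  • On Python 3.8 and later, mode() returns the first of several equally common values instead of raising an error (earlier versions raised StatisticsError). Use statistics.multimode() to get all of the most frequent values.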
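
Two of the tips above can be seen in a short sketch: an outlier shifts the mean far more than the median, and variance() rejects a single data point. The income figures are made up for illustration.

```python
import statistics

# Hypothetical values without and with one extreme outlier
incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [500]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("Mean:", mean_before, "->", mean_after)        # 35 -> 112.5
print("Median:", median_before, "->", median_after)  # 35 -> 36.5

# variance() needs at least two data points
try:
    statistics.variance([42])
    variance_error = None
except statistics.StatisticsError as exc:
    variance_error = exc
    print("variance() failed:", exc)
```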

Try It Yourself

  • Create a list of exam marks and calculate the mean, median, and mode using the statistics module.
  • Compute the variance and standard deviation for a dataset of monthly expenses.
  • Use statistics.multimode() to find multiple modes in a dataset like [1, 1, 2, 2, 3, 4].
  • Experiment with adding outliers to your dataset and observe how mean and median change differently.