statistics Module
The Python statistics module provides ready-made functions for basic descriptive statistics on numeric data. You can quickly calculate averages, middle values, most frequent values, and how spread out your data is using a few simple function calls.
Use variance()/stdev() for samples, and pvariance()/pstdev() for population data.
To use the statistics module, you must import it first:
# Import the statistics module
import statistics
Most functions in this module accept an iterable (like a list or tuple) of numeric values:
statistics.mean(data)
statistics.median(data)
statistics.mode(data) or statistics.multimode(data)
statistics.variance(data), statistics.pvariance(data)
statistics.stdev(data), statistics.pstdev(data)
statistics.mean() calculates the arithmetic average of a dataset.
# Import the statistics module
import statistics
# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]
# Calculate the mean of the dataset
mean_value = statistics.mean(data)
# Print the mean value
print("Mean:", mean_value)
statistics.median() returns the middle value of the dataset once it is sorted.
# Import the statistics module
import statistics
# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]
# Calculate the median of the dataset
median_value = statistics.median(data)
# Print the median value
print("Median:", median_value)
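When the dataset has an even number of values there is no single middle element, and median() returns the average of the two middle values. The input does not need to be sorted beforehand. A small sketch, using made-up lists purely for illustration:

```python
import statistics

# Hypothetical example data: an even number of values, deliberately unsorted
even_data = [40, 10, 30, 20]

# median() sorts internally; the two middle values here are 20 and 30
even_median = statistics.median(even_data)
print("Median (even count):", even_median)

# With an odd number of values, the middle element itself is returned
odd_median = statistics.median([3, 1, 2])
print("Median (odd count):", odd_median)
```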
statistics.mode() returns the most common value in the dataset.
# Import the statistics module
import statistics
# Define a dataset with repeated values
data2 = [1, 2, 2, 3, 4, 4, 4, 5]
# Calculate the mode (most frequent value)
mode_value = statistics.mode(data2)
# Print the mode value
print("Mode:", mode_value)
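If several values are tied for most frequent, statistics.multimode() (available from Python 3.8) returns all of them as a list. The sketch below also shows mode() on strings, since the mode is not limited to numbers. Both lists are invented for illustration:

```python
import statistics

# Illustrative dataset in which 1 and 2 are tied for most frequent
tied = [1, 1, 2, 2, 3, 4]

# multimode() returns every most-frequent value, in order of first appearance
all_modes = statistics.multimode(tied)
print("Modes:", all_modes)

# mode() also works on non-numeric (categorical) data such as strings
colors = ["red", "blue", "red", "green"]
top_color = statistics.mode(colors)
print("Most common color:", top_color)
```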
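statistics.variance() measures how far the data spreads from the mean. For a sample, it is the sum of squared deviations from the mean divided by n - 1.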
# Import the statistics module
import statistics
# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]
# Calculate the variance of the dataset
variance_value = statistics.variance(data)
# Print the variance value
print("Variance:", variance_value)
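To see where the sample and population versions differ, the variance can be worked out by hand. This is only a sketch of the arithmetic, not the module's internal implementation: the sample variance divides the sum of squared deviations by n - 1, while the population variance divides by n.

```python
import statistics

data = [10, 20, 30, 40, 50]
n = len(data)
mean_value = statistics.mean(data)

# Sum of squared deviations from the mean
squared_devs = sum((x - mean_value) ** 2 for x in data)

# Sample variance divides by n - 1; population variance divides by n
manual_sample = squared_devs / (n - 1)
manual_population = squared_devs / n

print("Sample variance:", statistics.variance(data), "manual:", manual_sample)
print("Population variance:", statistics.pvariance(data), "manual:", manual_population)
```

The population figure is always the smaller of the two, and the gap shrinks as the dataset grows.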
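statistics.stdev() returns the sample standard deviation, the square root of the sample variance. It expresses the typical distance of values from the mean in the same units as the data.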
# Import the statistics module
import statistics
# Define a sample dataset of numbers
data = [10, 20, 30, 40, 50]
# Calculate the standard deviation of the dataset
std_dev = statistics.stdev(data)
# Print the standard deviation value
print("Standard Deviation:", std_dev)
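The relationship between standard deviation and variance can be checked directly. A minimal sketch, comparing each standard deviation with the square root of the matching variance:

```python
import math
import statistics

data = [10, 20, 30, 40, 50]

sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

# Each standard deviation is the square root of the matching variance
print(math.isclose(sample_sd, math.sqrt(statistics.variance(data))))
print(math.isclose(population_sd, math.sqrt(statistics.pvariance(data))))

print("Sample standard deviation:", round(sample_sd, 2))
print("Population standard deviation:", round(population_sd, 2))
```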
For the dataset [10, 20, 30, 40, 50]:
Mean: 30
Median: 30
Variance: 250
Standard Deviation: ≈ 15.81
Here, the values are evenly spread around the mean (30). A higher variance or standard deviation would indicate that the data points are more spread out from the mean.
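As a rough illustration of how spread shows up in the numbers, the sketch below compares two datasets that share the same mean. The "tight" list is invented for this comparison and does not appear elsewhere in the tutorial.

```python
import statistics

# Two datasets with the same mean (30) but different spread
tight = [28, 29, 30, 31, 32]   # hypothetical, values close to the mean
wide = [10, 20, 30, 40, 50]    # the dataset used above

for name, values in (("tight", tight), ("wide", wide)):
    mean_value = statistics.mean(values)
    spread = statistics.stdev(values)
    print(name, "mean:", mean_value, "stdev:", round(spread, 2))
```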
Use mean() when your data is fairly balanced without extreme outliers.
Use median() when your data has outliers or is skewed.
Use mode() (or multimode()) for categorical or repeated values.
variance() and stdev() need at least two data points.
For a full population, use pvariance() and pstdev() instead of the sample versions.
Before Python 3.8, mode() raises an error when there is no unique most common value; from 3.8 on it returns the first mode encountered. Use statistics.multimode() to get all of the most frequent values.
… statistics module.
Use statistics.multimode() to find multiple modes in a dataset like [1, 1, 2, 2, 3, 4].
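Two of the points above can be checked with a short sketch: how a single outlier pulls the mean but not the median, and what happens when variance() receives fewer than two values. The dataset and variable names are made up for illustration.

```python
import statistics

# Made-up data: the last value is an extreme outlier
with_outlier = [10, 20, 30, 40, 500]

outlier_mean = statistics.mean(with_outlier)      # pulled upward by 500
outlier_median = statistics.median(with_outlier)  # stays at the middle value
print("Mean:", outlier_mean, "Median:", outlier_median)

# variance() and stdev() raise StatisticsError with fewer than two values
try:
    statistics.variance([42])
    too_few_error = None
except statistics.StatisticsError as exc:
    too_few_error = exc
    print("Error:", exc)
```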