Rajiv Gopinath

Skewness and Kurtosis: Understanding Data Distribution Shape

Last updated: April 05, 2025

Skewness and Kurtosis

In exploratory data analysis (EDA), skewness and kurtosis are two measures that describe the shape of a data distribution. Each captures a different characteristic. Skewness quantifies the asymmetry of the distribution, whereas kurtosis measures how heavy its tails are relative to a normal distribution, which is often described informally as how peaked or flat the distribution looks.

Skewness helps identify whether a dataset is symmetrically distributed around the central value or whether it exhibits a longer tail on one side. A perfectly symmetric distribution has zero skewness, while positive or negative skewness indicates an imbalance in the tail lengths.

Kurtosis, on the other hand, describes how heavy the tails of a distribution are, which usually goes along with how peaked or flat it appears. A normal distribution serves as the reference point: heavier tails and a sharper peak indicate a leptokurtic distribution, while lighter tails and a flatter shape indicate a platykurtic one.

Table of Contents

  1. Understanding Skewness and Kurtosis
  2. Applications of Skewness and Kurtosis
  3. Significance of Skewness and Kurtosis
  4. Implementation of Skewness and Kurtosis in Python
  5. Conclusion

Understanding Skewness and Kurtosis

Skewness

Skewness describes the asymmetry of a dataset's distribution. Many datasets do not follow a perfectly symmetrical distribution and may have data points more concentrated on one side. There are two main types of skewness:

  1. Positive Skewness (Right-Skewed Distribution):
    • The tail extends towards the right side of the graph.
    • The mean is typically greater than the median, which is typically greater than the mode: Mean > Median > Mode
  2. Negative Skewness (Left-Skewed Distribution):
    • The tail extends towards the left side of the graph.
    • The mean is typically less than the median, which is typically less than the mode: Mean < Median < Mode
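The ordering of mean, median, and mode can be checked on a small sample. The sketch below uses a made-up right-skewed list (the values are hypothetical, chosen only for illustration) and its mirror image. The ordering is a common rule of thumb rather than a guarantee, so other samples may not follow it exactly.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 3, 4, 10, 20]
# Mirroring the values around zero produces a left-skewed sample.
left_skewed = [-x for x in right_skewed]


def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


right_mean, right_median, right_mode = summarize(right_skewed)
left_mean, left_median, left_mode = summarize(left_skewed)

print(f"Right-skewed: mean={right_mean}, median={right_median}, mode={right_mode}")
print(f"Left-skewed:  mean={left_mean}, median={left_median}, mode={left_mode}")
```

For this sample the long right tail pulls the mean above the median, and the median sits above the most frequent value. Negating the data reverses the order.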

Mathematically, skewness (Sk) is defined as:

$$Sk = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^3$$

Where:

  • n is the number of observations
  • x_i represents individual data points
  • x̄ is the sample mean
  • σ is the standard deviation
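As a rough check of the definition, skewness can be computed by hand as the average cubed standardized deviation and compared with SciPy. This is a minimal sketch that assumes σ is the population standard deviation (dividing by n). The helper name `skewness_from_definition` is made up for this example.

```python
import numpy as np
from scipy.stats import skew


def skewness_from_definition(values):
    """Mean of cubed standardized deviations, using the population std."""
    x = np.asarray(values, dtype=float)
    sigma = x.std()  # ddof=0, i.e. divide by n
    z = (x - x.mean()) / sigma
    return np.mean(z ** 3)


data = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 100])

manual_skewness = skewness_from_definition(data)
scipy_skewness = skew(data)  # default bias=True uses the same n-based moments

print(f"Manual skewness: {manual_skewness:.6f}")
print(f"SciPy skewness:  {scipy_skewness:.6f}")
```

The two values agree because `scipy.stats.skew` uses the uncorrected (biased) estimator by default. Passing `bias=False` applies a sample-size correction and gives a slightly different number.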

 

What is Kurtosis?

Kurtosis measures the "tailedness" of a distribution, describing whether data points are more concentrated around the mean or dispersed. It is categorized into three types:

  1. Mesokurtic (Normal Distribution):
    • The kurtosis value is approximately 3.
    • The distribution has a similar peak to a normal distribution.
  2. Leptokurtic (High-Peak Distribution):
    • The kurtosis value is greater than 3.
    • The distribution has a higher peak and fatter tails, so extreme values are more likely than under a normal distribution.
  3. Platykurtic (Flat Distribution):
    • The kurtosis value is less than 3.
    • The distribution is flatter and has lighter tails, meaning data points are more evenly spread and extreme values are less likely.

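Mathematically, kurtosis (K) is calculated as:

$$K = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^4$$

Where:

  • n is the number of observations
  • x_i represents individual data points
  • x̄ is the sample mean
  • σ is the standard deviation

Under this definition a normal distribution has K = 3. The quantity K − 3 is called excess kurtosis and is zero for a normal distribution.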
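Kurtosis can be computed by hand in the same way, as the average fourth power of the standardized deviations. The sketch below assumes the population standard deviation and uses a made-up helper name, `kurtosis_from_definition`. It also shows that `scipy.stats.kurtosis` reports excess kurtosis (kurtosis minus 3) unless `fisher=False` is passed.

```python
import numpy as np
from scipy.stats import kurtosis


def kurtosis_from_definition(values):
    """Mean of fourth-power standardized deviations (normal distribution -> 3)."""
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()  # population std, ddof=0
    return np.mean(z ** 4)


data = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 100])

manual_kurtosis = kurtosis_from_definition(data)
pearson_kurtosis = kurtosis(data, fisher=False)  # same scale as the manual value
excess_kurtosis = kurtosis(data)                 # default fisher=True subtracts 3

print(f"Manual kurtosis:  {manual_kurtosis:.6f}")
print(f"SciPy (Pearson):  {pearson_kurtosis:.6f}")
print(f"SciPy (excess):   {excess_kurtosis:.6f}")
```

When comparing a result against the threshold of 3, it helps to confirm which convention the library is using. An excess kurtosis of 0 and a kurtosis of 3 describe the same tail behavior.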

 

Types of Kurtosis:

  • Mesokurtic: A distribution that exhibits moderate kurtosis, similar to a normal distribution. Its tails are neither unusually heavy nor unusually light, so extreme values occur about as often as they would under a normal distribution.
  • Leptokurtic: A distribution with higher kurtosis than a normal distribution. It has longer and more pronounced tails, suggesting that a greater proportion of data points are concentrated in the tails. This indicates the presence of more extreme values compared to a mesokurtic distribution.
  • Platykurtic: A distribution with lower kurtosis than a normal distribution. It has shorter, less pronounced tails, meaning fewer extreme values exist. The data is more evenly spread, resulting in a flatter distribution compared to a mesokurtic one.
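The three categories can be expressed as a small labeling function. This is only an illustrative sketch: the function name `classify_kurtosis` and the tolerance of 0.5 around 3 are assumptions made here, not standard cut-offs. The inputs are the theoretical kurtosis values of the normal (3), Laplace (6), and continuous uniform (1.8) distributions.

```python
def classify_kurtosis(kurtosis_value, tolerance=0.5):
    """Label a kurtosis value on the scale where a normal distribution scores 3.

    The tolerance is an arbitrary illustrative band around 3, not a standard rule.
    """
    if kurtosis_value > 3 + tolerance:
        return "leptokurtic"
    if kurtosis_value < 3 - tolerance:
        return "platykurtic"
    return "mesokurtic"


# Theoretical kurtosis of three well-known distributions.
reference_values = {"normal": 3.0, "laplace": 6.0, "uniform": 1.8}

labels = {name: classify_kurtosis(value) for name, value in reference_values.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```

In practice a sample estimate rarely equals 3 exactly, which is why some tolerance band is needed. How wide to make it depends on the sample size and the application.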

 

Key Differences Between Skewness and Kurtosis

  • Skewness quantifies the asymmetry of a distribution, whereas kurtosis measures the heaviness of its tails, often described as its peakedness or flatness.
  • Skewness is based on the third standardized moment of a distribution, while kurtosis is based on the fourth.
  • Skewness can take any value from negative infinity to positive infinity. Kurtosis cannot be less than 1, so excess kurtosis (kurtosis minus 3) cannot be less than -2.
  • A normal distribution has a skewness of zero and a kurtosis of 3 (an excess kurtosis of zero). The converse does not hold: a distribution with these values is not necessarily normal.
  • Skewness affects the relationship between measures of central tendency, whereas kurtosis reflects the behavior of the tails of the distribution.
  • Both skewness and kurtosis are essential for understanding the shape and behavior of a dataset.

Comparison Between Skewness and Kurtosis

| Parameter | Skewness | Kurtosis |
|---|---|---|
| Definition | Measures the asymmetry of a distribution. | Measures the heaviness of the tails, often described as the degree of peakedness or flatness of a distribution. |
| Calculation | Based on the third standardized moment of the distribution. | Based on the fourth standardized moment of the distribution. |
| Range of Values | Can take values from -∞ to +∞. | Can take values from 1 to +∞ (excess kurtosis from -2 to +∞). |
| Interpretation | Negative skewness: left tail is longer; zero skewness: tails are balanced, as in a symmetrical distribution; positive skewness: right tail is longer. | Platykurtic (kurtosis below 3): flatter than a normal distribution; kurtosis of 3 (excess kurtosis of 0): matches a normal distribution; leptokurtic (kurtosis above 3): more peaked than a normal distribution. |
| Impact on Distribution | Affects the central tendency, as asymmetry influences the mean more than the median. | Affects the tails of the distribution, with high kurtosis indicating more extreme values. |
| Examples | Positively skewed: income, wealth distribution; negatively skewed: retirement age. | Leptokurtic: financial returns, Laplace-distributed data; platykurtic: uniformly distributed data. |

This structured comparison highlights the fundamental differences and significance of skewness and kurtosis in statistical analysis.

Applications of Skewness and Kurtosis

Skewness and kurtosis are fundamental statistical measures used to analyze the shape and characteristics of probability distributions. They are widely applied across various fields to gain insights into data behavior and distributional properties.

Applications of Skewness

  1. Financial Analysis
    • In finance, skewness is used to evaluate the symmetry of return distributions for stocks, portfolios, and other financial instruments. This helps investors understand potential risks and returns associated with different investments.
  2. Economics
    • Skewness plays a role in analyzing the distribution of income and wealth, allowing economists to assess disparities and asymmetries in economic data.
  3. Risk Management
    • Risk managers utilize skewness to analyze the asymmetry of return distributions in investment portfolios. Understanding skewness helps in evaluating the likelihood of extreme losses or gains.
  4. Biostatistics and Epidemiology
    • In medical research, skewness is used to study the distribution of health-related variables, such as patient recovery times or disease progression patterns.
  5. Marketing and Consumer Behavior
    • Market researchers use skewness to analyze consumer preferences and behaviors. For instance, skewness can help interpret survey responses related to product satisfaction.
  6. Environmental Studies
    • Skewness is applied to study environmental data distributions, such as pollution levels across different regions, helping researchers identify anomalies or extreme values.

Applications of Kurtosis

  1. Finance and Investment
    • Kurtosis is essential in finance for assessing investment risk and return volatility. It helps investors understand the likelihood of extreme fluctuations or tail risks.
  2. Actuarial Science
    • In the insurance industry, kurtosis is used to model the distribution of insurance claims and assess the probability of extreme losses.
  3. Quality Control and Manufacturing
    • Kurtosis helps in quality control by examining the distribution of product measurements. It assists manufacturers in determining process consistency and product reliability.
  4. Stock Market Analysis
    • Analysts use kurtosis to evaluate the probability of extreme stock price movements, helping in risk assessment for market indices and individual stocks.
  5. Machine Learning and Data Science
    • In data science, kurtosis aids in preprocessing by identifying datasets with heavy tails, which can impact model selection and performance.

These applications highlight the widespread utility of skewness and kurtosis in various industries, aiding analysts and researchers in understanding the distributional properties of their data.
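The heavy-tail screening mentioned under Machine Learning and Data Science could look roughly like the following. This is a sketch on synthetic data: the feature names, the seed, and the excess-kurtosis threshold of 1.0 are all assumptions made for illustration, not recommended defaults.

```python
import numpy as np
from scipy.stats import kurtosis

# Synthetic features drawn from distributions with different tail behavior.
rng = np.random.default_rng(0)
features = {
    "normal": rng.normal(size=5000),
    "laplace": rng.laplace(size=5000),
    "uniform": rng.uniform(-1, 1, size=5000),
}

# Illustrative cut-off on excess kurtosis (SciPy's default scale, normal = 0).
EXCESS_KURTOSIS_THRESHOLD = 1.0

excess = {name: kurtosis(values) for name, values in features.items()}
heavy_tailed = [
    name for name, value in excess.items() if value > EXCESS_KURTOSIS_THRESHOLD
]

for name, value in excess.items():
    print(f"{name}: excess kurtosis = {value:.2f}")
print("Flagged as heavy-tailed:", heavy_tailed)
```

Features flagged this way might be candidates for a transformation or for models that are less sensitive to extreme values. That decision depends on the task.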

Significance of Skewness and Kurtosis

Both skewness and kurtosis provide deeper insights into data distributions beyond simple measures like mean and standard deviation. Their significance lies in their ability to describe the shape and behavior of datasets.

Significance of Skewness

  1. Assessing Symmetry
    • Skewness helps determine whether a dataset is symmetrical. A skewness value of zero indicates a perfectly balanced distribution, while positive and negative values indicate right and left asymmetry, respectively.
  2. Impact on Mean and Median
    • In skewed distributions, the mean and median differ. Positively skewed distributions typically have a mean greater than the median, while negatively skewed distributions have a mean lower than the median. This distinction is essential for accurately summarizing central tendency.
  3. Risk Analysis
    • Skewness is particularly important in finance, as it helps assess the potential for extreme positive or negative returns, influencing investment strategies and risk management.
  4. Understanding Distribution Shape
    • Skewness provides insight into the presence of extreme values and outliers, which is valuable in fields such as economics, epidemiology, and environmental science.
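The effect of a single extreme value on the mean, the median, and skewness can be seen with the small sample used in the Python section of this article. The sketch below compares the sample with and without its largest value. The variable names are chosen here for illustration.

```python
import numpy as np
from scipy.stats import skew

with_outlier = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 100])
without_outlier = with_outlier[:-1]  # drop the value 100

summary = {}
for label, values in [("with 100", with_outlier), ("without 100", without_outlier)]:
    summary[label] = (np.mean(values), np.median(values), skew(values))
    mean_value, median_value, skew_value = summary[label]
    print(
        f"{label}: mean={mean_value:.1f}, "
        f"median={median_value:.1f}, skewness={skew_value:.2f}"
    )
```

Without the extreme value the sample is symmetric, so the mean and median coincide and skewness is zero. Adding the value 100 moves the mean much further than the median and produces clear positive skewness.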

Significance of Kurtosis

  1. Measuring Tailedness and Outliers
    • Kurtosis quantifies the heaviness of a distribution’s tails. A normal distribution has a kurtosis value of 3, while values higher than 3 (leptokurtic) suggest heavier tails, and values lower than 3 (platykurtic) indicate lighter tails.
  2. Financial Risk Assessment
    • Investors use kurtosis to analyze the likelihood of extreme market events. Leptokurtic distributions indicate a greater chance of significant price fluctuations, impacting portfolio management.
  3. Decision-Making in Quality Control
    • In manufacturing, kurtosis helps determine whether product variations fall within acceptable limits, ensuring quality control in production processes.
  4. Detecting Deviations from Normality
    • Kurtosis helps identify departures from normality, which can influence the choice of statistical methods and impact the accuracy of hypothesis testing.
  5. Influence on Statistical Methods
    • Many statistical tests assume normality in data distributions. Understanding kurtosis ensures that researchers select appropriate models and methods for analysis.
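One way to check for departures from normality is SciPy's `normaltest`, which combines skewness and kurtosis into a single test statistic (D'Agostino and Pearson's test). The sketch below applies it to synthetic samples. The seed, the sample sizes, and the 0.05 significance level are assumptions made for this example.

```python
import numpy as np
from scipy.stats import normaltest

rng = np.random.default_rng(42)
samples = {
    "normal": rng.normal(size=1000),
    "exponential": rng.exponential(size=1000),
}

ALPHA = 0.05  # illustrative significance level

p_values = {}
for name, sample in samples.items():
    result = normaltest(sample)
    p_values[name] = result.pvalue
    verdict = "reject normality" if result.pvalue < ALPHA else "no evidence against normality"
    print(f"{name}: statistic={result.statistic:.2f}, p-value={result.pvalue:.4g} ({verdict})")
```

A small p-value suggests that the sample's skewness and kurtosis are unlikely under a normal distribution. SciPy also provides `skewtest` and `kurtosistest` for testing each property separately.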

Implementation of Skewness and Kurtosis in Python

import numpy as np
from scipy.stats import skew, kurtosis

# Sample data
data = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 100])

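# Calculate skewness and kurtosis.
# Note: scipy's kurtosis() returns excess kurtosis by default (fisher=True),
# so a normal distribution scores 0; pass fisher=False for the scale where it scores 3.
skewness_value = skew(data)
kurtosis_value = kurtosis(data)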

# Display results
print(f'Skewness: {skewness_value}')
print(f'Kurtosis: {kurtosis_value}')

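Output:

Skewness: 1.499634538383203
Kurtosis: 1.769245965282967

The kurtosis printed here is excess kurtosis (SciPy's default). It corresponds to a value of about 4.77 on the scale used earlier in this article, where a normal distribution has a kurtosis of 3.

The following script plots histograms for distributions with different skewness and kurtosis:

import numpy as np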
import matplotlib.pyplot as plt
from scipy.stats import norm, skew, kurtosis

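# Plot a histogram of the data with a fitted normal density (red dashed line) for reference.
# Note: the kurtosis values shown in the plot titles below are excess kurtosis (SciPy's default).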
def plot_distribution(data, title):
    plt.figure(figsize=(6,4))
    plt.hist(data, bins=30, alpha=0.7, color='blue', density=True)
    x = np.linspace(min(data), max(data), 100)
    plt.plot(x, norm.pdf(x, np.mean(data), np.std(data)), 'r--')
    plt.title(title)
    plt.xlabel('Value')
    plt.ylabel('Density')
    plt.grid(True)
    plt.show()

# Generating different distributions

# 1. Symmetric (Normal) Distribution - Zero Skewness
data_symmetric = np.random.normal(loc=0, scale=1, size=1000)
plot_distribution(data_symmetric, f'Symmetric Distribution (Skewness={skew(data_symmetric):.2f}, Kurtosis={kurtosis(data_symmetric):.2f})')

# 2. Right-Skewed Distribution (Positive Skewness)
data_right_skew = np.random.exponential(scale=1, size=1000)
plot_distribution(data_right_skew, f'Right-Skewed Distribution (Skewness={skew(data_right_skew):.2f}, Kurtosis={kurtosis(data_right_skew):.2f})')

# 3. Left-Skewed Distribution (Negative Skewness)
data_left_skew = -np.random.exponential(scale=1, size=1000)
plot_distribution(data_left_skew, f'Left-Skewed Distribution (Skewness={skew(data_left_skew):.2f}, Kurtosis={kurtosis(data_left_skew):.2f})')

# 4. Leptokurtic Distribution (High Kurtosis)
data_leptokurtic = np.random.laplace(loc=0, scale=1, size=1000)
plot_distribution(data_leptokurtic, f'Leptokurtic Distribution (Kurtosis={kurtosis(data_leptokurtic):.2f})')

# 5. Platykurtic Distribution (Low Kurtosis)
data_platykurtic = np.random.uniform(low=-2, high=2, size=1000)
plot_distribution(data_platykurtic, f'Platykurtic Distribution (Kurtosis={kurtosis(data_platykurtic):.2f})')

 


Conclusion

Skewness and kurtosis are essential statistical measures that help describe the shape and characteristics of probability distributions. These measures provide deeper insights into data behavior, aiding in analysis across various fields. Below is a summary of their significance:

Skewness

  1. Evaluating Symmetry
    • Skewness quantifies the asymmetry of a distribution. A value of zero signifies a perfectly symmetrical distribution, while positive and negative values indicate right and left skewness, respectively.
  2. Effect on Central Tendency
    • The degree of skewness influences the relationship between the mean and median. A right-skewed distribution (positive skewness) shifts the mean higher than the median, whereas a left-skewed distribution (negative skewness) pulls the mean below the median.
  3. Risk Analysis
    • In financial contexts, skewness helps assess potential risks. A positively skewed distribution may indicate a greater likelihood of large positive returns, while a negatively skewed one suggests an increased probability of extreme negative returns.
  4. Understanding Distribution Shape
    • Skewness provides valuable insights into the shape of a dataset, helping analysts identify the presence of extreme values. This is particularly useful in fields like finance, economics, and environmental research.

Kurtosis

  1. Measuring Tailedness
    • Kurtosis assesses the prominence of a distribution’s tails. A kurtosis value of 3 represents a normal distribution, values above 3 (leptokurtic) indicate heavier tails with more extreme values, and values below 3 (platykurtic) suggest lighter tails.
  2. Financial Risk Evaluation
    • In investment and risk management, kurtosis helps gauge the probability of extreme fluctuations. Leptokurtic distributions are associated with heavier tails and a greater likelihood of sudden, large market movements.
  3. Quality Control and Manufacturing
    • Understanding kurtosis is useful in quality control as it helps determine whether product measurements follow a stable distribution. This ensures consistency and reliability in manufacturing processes.
  4. Detecting Deviations from Normality
    • Kurtosis helps determine whether a dataset significantly deviates from normality, influencing the choice of statistical models and analytical techniques.

By considering both skewness and kurtosis, analysts gain a more comprehensive understanding of data distribution. These measures play a crucial role in decision-making, risk evaluation, and selecting appropriate statistical methodologies. However, their interpretation should always align with the specific context and objectives of the analysis.