Probability Distributions
A probability distribution describes how likely each possible value of a random variable is. It assigns probabilities to the different outcomes of an experiment or process. Probability distributions are widely used in statistics to model uncertainty and make predictions based on data.
There are two main types of probability distributions: discrete and continuous. Discrete distributions describe variables that take specific, separate values, while continuous distributions describe variables that can take any value within a given range.
| Type | Description | Example |
|---|---|---|
| Discrete Distribution | Represents countable outcomes | Number of heads in coin tosses |
| Continuous Distribution | Represents measurable values within a range | Height, weight, temperature |
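The difference between the two types can be sketched in R. This is a minimal illustration, not a definitive treatment: the coin-toss setup follows the table above, while the height parameters (mean 170 cm, standard deviation 10 cm) are assumed values chosen only for the example.

```r
# Discrete: the number of heads in 10 fair coin tosses can only be 0, 1, ..., 10,
# and each of those outcomes has its own probability.
heads <- 0:10
pmf <- dbinom(heads, size = 10, prob = 0.5)
total_prob <- sum(pmf)  # the probabilities of all outcomes add up to 1

# Continuous: height can take any value in a range, so probabilities are
# areas under a density curve rather than values at single points.
# The mean (170) and sd (10) are illustrative assumptions.
height_density <- function(x) dnorm(x, mean = 170, sd = 10)
p_interval <- integrate(height_density, lower = 160, upper = 180)$value

# The same area obtained from the cumulative distribution function
p_interval_cdf <- pnorm(180, mean = 170, sd = 10) - pnorm(160, mean = 170, sd = 10)
```

For a discrete variable, a probability is read off directly for each outcome. For a continuous variable, a probability is only meaningful for an interval, such as heights between 160 and 180 cm.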
In R, probability distributions are handled using built-in functions. Most distributions in R follow a common naming pattern:
| Prefix | Purpose |
|---|---|
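| d | Probability density function (continuous) or probability mass function (discrete) |
| p | Cumulative distribution function |
| q | Quantile function (inverse of the cumulative distribution function) |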
| r | Random number generation |
For example, the normal distribution in R uses the following functions:
```r
dnorm(x, mean, sd) # Probability density
pnorm(x, mean, sd) # Cumulative probability
qnorm(p, mean, sd) # Quantile value
rnorm(n, mean, sd) # Random numbers
```
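To see how the four functions relate, here is a small sketch that calls each one with concrete arguments. The mean of 50, the standard deviation of 10, and the seed are illustrative assumptions.

```r
set.seed(42)  # arbitrary seed so the random draws are reproducible

mu <- 50     # illustrative mean
sigma <- 10  # illustrative standard deviation

# d: height of the density curve at x = 50 (the peak of the bell curve)
dens_at_mean <- dnorm(50, mean = mu, sd = sigma)

# p: probability that a value is less than or equal to 60, about 0.841
p_below_60 <- pnorm(60, mean = mu, sd = sigma)

# q: the value below which that probability lies; this inverts pnorm and returns 60
x_at_p <- qnorm(p_below_60, mean = mu, sd = sigma)

# r: five random values drawn from the same distribution
draws <- rnorm(5, mean = mu, sd = sigma)
```

The `p` and `q` functions are inverses of each other: `pnorm` maps a value to a cumulative probability, and `qnorm` maps that probability back to the value.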
The normal distribution is one of the most commonly used probability distributions. It has a bell-shaped curve and is defined by its mean and standard deviation.
```r
# Generate 100 random values from a normal distribution
set.seed(1)  # fix the seed so the draws are reproducible
samples <- rnorm(100, mean = 50, sd = 10)
# Plot a histogram of the simulated values
hist(samples)
```
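A quick way to check that simulated values behave like the distribution they came from is to compare sample statistics with the chosen parameters. The sketch below is one possible check; the sample size of 10,000 and the seed are assumptions made for the example.

```r
set.seed(123)  # arbitrary seed for reproducibility

# A larger sample than above, so that sample statistics settle near the parameters
samples <- rnorm(10000, mean = 50, sd = 10)

sample_mean <- mean(samples)  # should be close to 50
sample_sd <- sd(samples)      # should be close to 10

# Share of simulated values within one standard deviation of the mean
within_1sd <- mean(abs(samples - 50) <= 10)

# Theoretical share from the cumulative distribution function, about 0.683
theory_1sd <- pnorm(60, mean = 50, sd = 10) - pnorm(40, mean = 50, sd = 10)
```

The empirical share `within_1sd` should land near `theory_1sd`, which reflects the familiar rule that roughly 68% of normally distributed values fall within one standard deviation of the mean.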
A common discrete distribution is the binomial distribution, which models the number of successes in a fixed number of independent trials, where each trial has two possible outcomes, such as success or failure.
```r
# Probability of exactly 3 successes in 10 trials
dbinom(3, size = 10, prob = 0.5)
# Generate 20 random success counts, each from 10 trials
rbinom(20, size = 10, prob = 0.5)
```
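The binomial probabilities can also be checked by hand. The following sketch compares `dbinom` with the binomial formula and shows how `pbinom` accumulates those probabilities; the variable names are chosen only for this illustration.

```r
n_trials <- 10
p_success <- 0.5

# Exactly 3 successes, from the built-in function
p_exact <- dbinom(3, size = n_trials, prob = p_success)

# The same value from the binomial formula: choose(n, k) * p^k * (1 - p)^(n - k)
p_formula <- choose(n_trials, 3) * p_success^3 * (1 - p_success)^(n_trials - 3)

# At most 3 successes: pbinom adds up the probabilities for 0, 1, 2 and 3
p_at_most <- pbinom(3, size = n_trials, prob = p_success)
p_summed <- sum(dbinom(0:3, size = n_trials, prob = p_success))

p_exact    # 0.1171875
p_at_most  # 0.171875
```

Here `dbinom` answers "exactly k successes", while `pbinom` answers "k successes or fewer".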
Probability distributions help in understanding how data behaves, predicting future outcomes, and forming the foundation for statistical inference and hypothesis testing.
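As one illustration of the link to hypothesis testing, the sketch below asks how surprising 8 heads in 10 tosses would be if the coin were fair. The observed count of 8 is a made-up value for the example, and the one-sided test shown is only one of several reasonable choices.

```r
n_tosses <- 10
observed_heads <- 8  # hypothetical observation, not real data

# One-sided p-value: probability of 8 or more heads under a fair coin,
# computed as the upper tail P(X > 7) of the binomial distribution
p_value <- pbinom(observed_heads - 1, size = n_tosses, prob = 0.5,
                  lower.tail = FALSE)

# The built-in exact binomial test reports the same probability
test_result <- binom.test(observed_heads, n_tosses, p = 0.5,
                          alternative = "greater")

p_value  # 0.0546875
```

A small p-value would suggest the observed result is unlikely under the fair-coin assumption. Here it is about 0.055, so it falls just short of the conventional 0.05 threshold.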
