$$ \newcommand{\defeq}{\stackrel{\small\bullet}{=}} \newcommand{\ra}{\rangle} \newcommand{\la}{\langle} \newcommand{\norm}[1]{\left\|#1\right\|} \newcommand{\abs}[1]{\left\lvert#1\right\rvert} \newcommand{\Abs}[1]{\Bigl\lvert#1\Bigr\rvert} \newcommand{\pr}{{\mathbb P}} \newcommand{\qr}{{\mathbb Q}} \newcommand{\xv}{{\boldsymbol{x}}} \newcommand{\av}{{\boldsymbol{a}}} \newcommand{\bv}{{\boldsymbol{b}}} \newcommand{\cv}{{\boldsymbol{c}}} \newcommand{\dv}{{\boldsymbol{d}}} \newcommand{\ev}{{\boldsymbol{e}}} \newcommand{\fv}{{\boldsymbol{f}}} \newcommand{\gv}{{\boldsymbol{g}}} \newcommand{\hv}{{\boldsymbol{h}}} \newcommand{\nv}{{\boldsymbol{n}}} \newcommand{\sv}{{\boldsymbol{s}}} \newcommand{\tv}{{\boldsymbol{t}}} \newcommand{\uv}{{\boldsymbol{u}}} \newcommand{\vv}{{\boldsymbol{v}}} \newcommand{\wv}{{\boldsymbol{w}}} \newcommand{\zerov}{{\mathbf{0}}} \newcommand{\onev}{{\mathbf{0}}} \newcommand{\phiv}{{\boldsymbol{\phi}}} \newcommand{\cc}{{\check{C}}} \newcommand{\xv}{{\boldsymbol{x}}} \newcommand{\Xv}{{\boldsymbol{X}\!}} \newcommand{\yv}{{\boldsymbol{y}}} \newcommand{\Yv}{{\boldsymbol{Y}}} \newcommand{\zv}{{\boldsymbol{z}}} \newcommand{\Zv}{{\boldsymbol{Z}}} \newcommand{\Iv}{{\boldsymbol{I}}} \newcommand{\Jv}{{\boldsymbol{J}}} \newcommand{\Cv}{{\boldsymbol{C}}} \newcommand{\Ev}{{\boldsymbol{E}}} \newcommand{\Fv}{{\boldsymbol{F}}} \newcommand{\Gv}{{\boldsymbol{G}}} \newcommand{\Hv}{{\boldsymbol{H}}} \newcommand{\alphav}{{\boldsymbol{\alpha}}} \newcommand{\epsilonv}{{\boldsymbol{\epsilon}}} \newcommand{\betav}{{\boldsymbol{\beta}}} \newcommand{\deltav}{{\boldsymbol{\delta}}} \newcommand{\gammav}{{\boldsymbol{\gamma}}} \newcommand{\etav}{{\boldsymbol{\eta}}} \newcommand{\piv}{{\boldsymbol{\pi}}} \newcommand{\thetav}{{\boldsymbol{\theta}}} \newcommand{\tauv}{{\boldsymbol{\tau}}} \newcommand{\muv}{{\boldsymbol{\mu}}} \newcommand{\phiinv}{\Phi^{-1}} \newcommand{\Fiinv}{F^{-1}} \newcommand{\giinv}{g^{-1}} \newcommand{\fhat}{\hat{f}} \newcommand{\ghat}{\hat{g}} 
\newcommand{\ftheta}{f_\theta} \newcommand{\fthetav}{f_{\thetav}} \newcommand{\gtheta}{g_\theta} \newcommand{\gthetav}{g_{\thetav}} \newcommand{\ztheta}{Z_\theta} \newcommand{\xtheta}{\Xv_\theta} \newcommand{\ytheta}{\Yv_\theta} \newcommand{\p}{\partial} \newcommand{\f}{\frac} \newcommand{\cf}{\cfrac} \newcommand{\e}{\epsilon} \newcommand{\indep}{\perp\kern-5pt \perp} \newcommand{\inner}[1]{\langle#1\rangle} \newcommand{\pa}[1]{\left(#1\right)} \newcommand{\pb}[1]{\left\{#1\right\}} \newcommand{\pc}[1]{\left[#1\right]} \newcommand{\pA}[1]{\Big(#1\Big)} \newcommand{\pB}[1]{\Big\{#1\Big\}} \newcommand{\pC}[1]{\Big[#1\Big]} \newcommand{\ty}[1]{\texttt{#1}} \newcommand{\borel}[1]{\mathscr{B}\pa{#1}} \newcommand{\scr}{\mathcal} \newcommand{\scrb}{\mathscr} \newcommand{\argmin}{\mathop{\text{arg}\ \!\text{min}}} \newcommand{\arginf}{\mathop{\text{arg}\ \!\text{inf}}} \newcommand{\argmax}{\mathop{\text{arg}\ \!\text{max}}} \newcommand{\argsup}{\mathop{\text{arg}\ \!\text{sup}}} \newcommand{\bigo}[1]{\mathcal{O}_{p}\!\left(#1\right)} \newcommand{\f}{\frac} \newcommand{\e}{\epsilon} \newcommand{\inv}{^{-1}} \newcommand{\phiinv}{\Phi^{-1}} \newcommand{\Fiinv}{F^{-1}} \newcommand{\giinv}{g^{-1}} \newcommand{\fhat}{\hat{f}} \newcommand{\ghat}{\hat{g}} \newcommand{\ftheta}{f_\theta} \newcommand{\fthetav}{f_{\thetav}} \newcommand{\gtheta}{g_\theta} \newcommand{\gthetav}{g_{\thetav}} \newcommand{\ztheta}{Z_\theta} \newcommand{\xtheta}{\Xv_\theta} \newcommand{\ytheta}{\Yv_\theta} \newcommand{\absdet}[1]{\abs{\det\pa{#1}}} \newcommand{\jac}[1]{\Jv_{#1}} \newcommand{\absdetjx}[1]{\abs{\det\pa{\Jv_{#1}}}} \newcommand{\absdetj}[1]{\norm{\Jv_{#1}}} \newcommand{\sint}{sin(\theta)} \newcommand{\cost}{cos(\theta)} \newcommand{\sor}[1]{S\mathcal{O}(#1)} \newcommand{\ort}[1]{\mathcal{O}(#1)} \newcommand{\A}{{\mathcal A}} \newcommand{\C}{{\mathbb C}} \newcommand{\E}{{\mathbb E}} \newcommand{\F}{{\mathcal{F}}} \newcommand{\N}{{\mathbb N}} \newcommand{\R}{{\mathbb R}} \newcommand{\Q}{{\mathbb 
Q}} \newcommand{\Z}{{\mathbb Z}} \newcommand{\X}{{\mathbb{X}}} \newcommand{\Y}{{\mathbb{Y}}} \newcommand{\G}{{\mathcal{G}}} \newcommand{\M}{{\mathcal{M}}} \newcommand{\betaequivalent}{\beta\text{-equivalent}} \newcommand{\betaequivalence}{\beta\text{-equivalence}} \newcommand{\Mb}{{\boldsymbol{\mathsf{M}}}} \newcommand{\Br}{{\mathbf{\mathsf{Bar}}}} \newcommand{\dgm}{{\mathfrak{Dgm}}} \newcommand{\Db}{{\mathbf{\mathsf{D}}}} \newcommand{\Img}{{\mathbf{\mathsf{Img}}}} \newcommand{\mmd}{{\mathbf{\mathsf{MMD}}}} \newcommand{\Xn}{{\mathbb{X}_n}} \newcommand{\Xm}{{\mathbb{X}_m}} \newcommand{\Yn}{{\mathbb{Y}_n}} \newcommand{\Ym}{Y_1, Y_2, \cdots, Y_m} \newcommand{\Xb}{{\mathbb{X}}} \newcommand{\Yb}{{\mathbb{Y}}} \newcommand{\s}{{{\sigma}}} \newcommand{\fnsbar}{{\bar{f}^n_\s}} \newcommand{\fns}{{f^n_\s}} \newcommand{\fs}{{f_\s}} \newcommand{\fsbar}{{\bar{f}_\s}} \newcommand{\barfn}{{{f}^n_\sigma}} \newcommand{\barfnm}{{{f}^{n+m}_\sigma}} \newcommand{\barfo}{{{f}_\sigma}} \newcommand{\fn}{{f^n_{\rho,\sigma}}} \newcommand{\fnm}{{f^{n+m}_{\rho,\sigma}}} \newcommand{\fo}{{f_{\rho,\sigma}}} \newcommand{\K}{{{K_{\sigma}}}} \newcommand{\barpn}{{\bar{p}^n_\sigma}} \newcommand{\barpo}{{\bar{p}_\sigma}} \newcommand{\pn}{{p^n_\sigma}} \newcommand{\po}{{p_\sigma}} \newcommand{\J}{{\mathcal{J}}} \newcommand{\B}{{\mathcal{B}}} \newcommand{\pt}{{\tilde{\mathbb{P}}}} \newcommand{\Winf}{{W_{\infty}}} \newcommand{\winf}{{W_{\infty}}} \newcommand{\HH}{{{\scr{H}_{\sigma}}}} \newcommand{\D}{{{\scr{D}_{\sigma}}}} \newcommand{\Ts}{{T_{\sigma}}} \newcommand{\Phis}{{\Phi_{\sigma}}} \newcommand{\nus}{{\nu_{\sigma}}} \newcommand{\Qs}{{\mathcal{Q}_{\sigma}}} \newcommand{\ws}{{w_{\sigma}}} \newcommand{\vs}{{v_{\sigma}}} \newcommand{\ds}{{\delta_{\sigma}}} \newcommand{\fp}{{f_{\pr}}} \newcommand{\prs}{{\widetilde{\pr}_{\sigma}}} \newcommand{\qrs}{{\widetilde{\qr}_{\sigma}}} \newcommand{\Inner}[1]{\Bigl\langle#1\Bigr\rangle} \newcommand{\innerh}[1]{\langle#1\rangle_{\HH}} 
\newcommand{\Innerh}[1]{\Bigl\langle#1\Bigr\rangle_{\HH}} \newcommand{\normh}[1]{\norm{#1}_{\HH}} \newcommand{\norminf}[1]{\norm{#1}_{\infty}} \newcommand{\gdelta}{{\G_{\delta}}} \newcommand{\supgdelta}{{\sup\limits_{g\in\gdelta}\abs{\Delta_n(g)}}} \newcommand{\id}{\text{id}} \newcommand{\supp}{\text{supp}} \newcommand{\cech}{\v{C}ech} \newcommand{\Zz}{{\scr{Z}}} \newcommand{\psis}{\psi_\s} \newcommand{\phigox}{\Phis(\xv)-g} \newcommand{\phigoy}{\Phis(\yv)-g} \newcommand{\fox}{{f^{\epsilon,{\xv}}_{\rho,\sigma}}} \newcommand{\prx}{{\pr^{\epsilon}_{\xv}}} \newcommand{\pro}{{\pr_0}} \newcommand{\dotfo}{\dot{f}_{\!\!\rho,\s}} \newcommand{\phifo}{{\Phis(\yv)-\fo}} \newcommand{\phifox}{{\Phis(\xv)-\fo}} \newcommand{\kinf}{{\norm{\K}_{\infty}}} \newcommand{\half}{{{\f{1}{2}}}} \newcommand{\Jx}{\J_{\epsilon,{\xv}}} \newcommand{\dpy}{\text{differential privacy}} \newcommand{\edpy}{$\epsilon$--\text{differential privacy}} \newcommand{\eedpy}{$\epsilon$--edge \text{differential privacy}} \newcommand{\dpe}{\text{differentially private}} \newcommand{\edpe}{$\epsilon$--\text{differentially private}} \newcommand{\eedpe}{$\epsilon$--edge \text{differentially private}} \newcommand{\er}{Erdős-Rényi} \newcommand{\krein}{Kreĭn} % \newcommand{\grdpg}{\mathsf{gRDPG}} % \newcommand{\rdpg}{\mathsf{RDPG}} % \newcommand{\eflip}{{\textsf{edgeFlip}}} % \newcommand{\grdpg}{\text{gRDPG}} % \newcommand{\rdpg}{\text{RDPG}} \newcommand{\grdpg}{\mathsf{gRDPG}} \newcommand{\rdpg}{\mathsf{RDPG}} \newcommand{\eflip}{{\text{edgeFlip}}} \newcommand{\I}{{\mathbb I}} \renewcommand{\pa}[1]{\left(#1\right)} \renewcommand{\pb}[1]{\left\{#1\right\}} \renewcommand{\pc}[1]{\left[#1\right]} \renewcommand{\V}{\mathbb{V}} \renewcommand{\W}{\mathbb{W}} %%%%%%%%%%%%%%%%%%%%%%%%%%% \providecommand{\fd}{\frac 1d} % \renewcommand{\fpp}{{\frac 1p}} \providecommand{\pfac}{\f{p}{p-1}} \providecommand{\ipfac}{\f{p-1}{p}} \providecommand{\dbq}{\Delta b_{n,m,Q}\qty(\qty{\xvo})} \providecommand{\db}{\Delta 
b_{n,m}\qty(\qty{\xvo})} \providecommand{\bbv}{{{\mathbb{V}}}} \providecommand{\bbw}{{{\mathbb{W}}}} \providecommand{\md}{\textsf{MoM Dist}} \providecommand{\bF}{{\mathbf{F}}} \providecommand{\sub}{{\text{Sub}}} \providecommand{\samp}{\text{$\pa{\scr{S}}$}} \providecommand{\tp}{{2^{\f{p-1}{p}}}} %%%%%%%%%%%%%%%%%%%%%%%%%% \providecommand{\Xmn}{{\mathbb{X}_{n+m}}} \newcommand{\Dnmq}{\D[n+m, Q]} \newcommand{\Dnmh}{\D[n+m, \H]} \newcommand{\Dn}{\D[n]} \providecommand{\xvo}{\xv_0} \providecommand{\bn}[1][\null]{b^{#1}_{n}\pa{\pb{\xvo}}} \providecommand{\bnm}[1][\null]{b^{#1}_{n+m}\pa{\pb{\xvo}}} \providecommand{\bnq}[1][\null]{b^{#1}_{n,Q}\pa{\pb{\xvo}}} \providecommand{\bnmq}[1][\null]{b^{#1}_{n+m,Q}\pa{\pb{\xvo}}}\providecommand{\prq}{\pr_q} \providecommand{\dxvo}{{\delta_{\xvo}}} \providecommand{\sq}{S_q} \providecommand{\Sq}{\abs{S_q}} \providecommand{\no}{{n_o}} \providecommand{\mmdn}{\mmd\pa{\pr_n, \delta_{\xvo}}} \newcommand{\rqt}{\xi_{q}(t; n, Q)} \providecommand{\nq}{\f{n}{Q}} \providecommand{\Ot}{\Omega(t, n/Q)} \providecommand{\ut}[1]{U^{#1}} \providecommand{\vt}[1]{V^{#1}} \providecommand{\wt}[1]{W^{#1}} \providecommand{\but}[1]{\mathbb{U}^{#1}} \providecommand{\bvt}[1]{\mathbb{V}^{#1}} \providecommand{\bwt}[1]{\mathbb{W}^{#1}} \providecommand{\ball}[1]{B_{f\!, \rho}\pa{#1}} \newcommand*{\medcap}{\mathbin{\scalebox{0.75}{{\bigcap}}}}% \newcommand*{\medcup}{\mathbin{\scalebox{0.75}{{\bigcup}}}}% \providecommand{\dsf}{\mathsf{d}} \newcommand{\Dnh}{{\mathsf{D}_{n,\scr{H}}}} \newcommand{\Dph}{{\mathsf{D}_{\pr,\scr{H}}}} \newcommand{\D}[1][1={ },usedefault]{{\mathsf{D}_{#1}}} \newcommand{\Dnq}{{\mathsf{D}_{n, Q}}} \newcommand{\dnq}{{\mathsf{d}_{n, Q}}} \newcommand{\dn}{{\mathsf{d}_{n}}} \newcommand{\dnm}{{\mathsf{d}_{n-m}}} \newcommand{\dmn}{{\mathsf{d}_{n+m}}} \newcommand{\dx}{{\mathsf{d}_{\mathbb{X}}}} \providecommand{\med}{\text{median}} \providecommand{\median}{\text{median}} \providecommand{\Xnm}{{\mathbb{X}^*_{n-m}}} $$

Week-3

Math 183 • Statistical Methods • Spring 2026

Siddharth Vishwanath

Learning objectives


  • Conditional Probability
  • Random variables
  • Concept of discrete vs. continuous random variables

Random Variables

(Random) Variable

Random variable

A random variable is a variable whose value is determined by the outcome of a trial of a random phenomenon.

Tip

Think of a random variable as a placeholder for the different outcomes we can witness from a trial.

Notation

Random variables are usually denoted by capital letters (e.g., \(X, Y, Z, V, W\)) from the end of the alphabet.

Support of a random variable

Support

The support of a random variable is the universe of all possible values a random variable can assume.

Types of random variables

Since a random variable is simply a “placeholder” for the values a variable can take, random variables are also divided into the same groups, based on their support.

Recall the main types of variables from Week-1:

  • Qualitative
    • Nominal/Categorical
    • Ordinal
  • Quantitative
    • Discrete
    • Continuous

Coin toss

You flip a coin. The random variable, \(X\), is a placeholder for the outcome of the coin toss.

1) The support of the random variable is \(\text{supp}(X) = \left\{H, T\right\}\), i.e., \[\left\{X=H\right\} \quad\text{or}\quad \left\{X=T\right\}\]
2) This is an example of a nominal categorical random variable

Computer-friendly Convention

Usually, a discrete outcome is “encoded” as {0, 1}, i.e.,
\(X=1\) is understood to be that the outcome is \(H\), and
\(X=0\) is understood to be that the outcome is \(T\).

For categorical variables, this encoding is usually evident from the problem (but not always ⚠️)
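The \(\{0, 1\}\) encoding is easy to play with in code. A minimal sketch (not part of the lecture; the seed and sample size are arbitrary):

```python
import numpy as np

# Simulate 10 fair coin tosses, encoding H as 1 and T as 0
rng = np.random.default_rng(42)
tosses = rng.integers(0, 2, size=10)  # each entry is 0 (T) or 1 (H)

# Decode back to labels to make the convention explicit
labels = ["H" if t == 1 else "T" for t in tosses]
print(tosses)
print(labels)
```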

This class

You’re bored in your Math 183 class and you’ve also just had lunch. To keep yourself entertained you decide to count how many times you yawn. If \(X\) denotes this random variable:

1) The support of the random variable is \(\text{supp}(X) = \left\{0, 1, 2, \dots\right\}\), i.e., \[\left\{X=0\right\}, \quad\text{or}\quad \left\{X=1\right\},\quad\text{or}\dots\quad\text{or}\quad \left\{X=100\right\}, \quad \dots\]
2) This is an example of a discrete quantitative random variable

BMI

We pick a student at random and ask them what their BMI is (weight ÷ height\(^2\)). Let \(X\) denote this random variable:

1) The support of the random variable is \(\text{supp}(X) = (0, \infty)\), i.e., \[\left\{X=x\right\}, \quad\text{for any}\quad x > 0\]
2) This is an example of a continuous quantitative random variable

Anatomy of a random variable

Every random variable has:

  1. A mathematical symbol representing it
    • e.g., \(X, Y, Z\)
  2. A support, \(\text{supp}(X)\)

    • This determines the nature of the random variable
  3. A probability distribution \({\mathbb P}_X\).

    • This determines the probability of the random variable taking a specific set of values in its support
  4. Measures of central tendency and dispersion.

    • The measure of central tendency is called its expectation \({\mathbb E}(X)\)
    • The measure of dispersion is called its variance \({\text{Var}}(X)\)

Part-I

Discrete Random Variables

Probability Distribution

Probability Distribution

A probability distribution, \({\mathbb P}\), associated with a random variable \(X\) describes the probabilities with which \(X\) takes on the possible values in its support.

Tip

A probability distribution is the minimum amount of information you need in order to determine the probabilities of all possible events you can create from a random variable.

Example

Suppose the random variable \(X\) denotes a randomly chosen student’s favorite primary color.

\(\text{supp}(X)\) Red Blue Green
\({\mathbb P}(\left\{X=x\right\})\) 0.20 0.55 0.25

You can use this information to find, for example,

\[ {\mathbb P}\left(\left\{X = \text{Red or Green}\right\}\right) = {\mathbb P}\left(\left\{X = \text{Red}\right\} \cup \left\{X = \text{Green}\right\}\right) \]
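Since \(\left\{X=\text{Red}\right\}\) and \(\left\{X=\text{Green}\right\}\) are disjoint events, their probabilities add: \(0.20 + 0.25 = 0.45\). A minimal sketch in code (the dict name `pmf` is illustrative):

```python
# PMF from the table: favorite primary color
pmf = {"Red": 0.20, "Blue": 0.55, "Green": 0.25}

# {X = Red} and {X = Green} are disjoint, so their probabilities add
p_red_or_green = pmf["Red"] + pmf["Green"]
print(p_red_or_green)  # 0.45
```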

Probability Mass Function

For a discrete random variable \(X\), the assignment \[x \mapsto {\mathbb P}(X=x) =: \color{red}{p(x)}\] for every \(x \in \text{supp}(X)\) is called its probability mass function.

Visualizing PMFs

\(\text{supp}(X)\) Red Blue Green
\({\mathbb P}(\left\{X=x\right\})\) 0.20 0.55 0.25

Properties of PMFs

  1. The PMF is between \(0\) and \(1\), i.e., for any \(x \in \text{supp}(X)\) \[ 0 \le p(x) \le 1 \]

  2. The sum of the PMF over the support is \(1\), i.e., \[ \sum_{x \in \text{supp}(X)} p(x) = 1 \]

  3. The probability that \(X \neq x\) is the complement of the probability that \(X = x\), i.e., \[ {\mathbb P}(\{X \neq x\}) = 1 - p(x) \]

  4. For \(x, y \in \text{supp}(X)\), if \(x \neq y\), then \[ {\mathbb P}(\{X=x\} \cup \{X=y\}) = p(x) + p(y) \]
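These properties can be checked numerically for the color example; a small sketch (the dict `pmf` mirrors the table shown earlier):

```python
pmf = {"Red": 0.20, "Blue": 0.55, "Green": 0.25}

# Property 1: every probability lies in [0, 1]
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1 over the support
total = sum(pmf.values())
print(total)

# Property 3: P(X != Blue) is the complement of P(X = Blue)
p_not_blue = 1 - pmf["Blue"]
print(p_not_blue)

# Property 4: for distinct support points, probabilities of the union add
p_red_or_green = pmf["Red"] + pmf["Green"]
print(p_red_or_green)
```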

Conditional Probability Mass Function

  • Let \(X\) be a random variable with PMF \(p_{X}(x) = {\mathbb P}(X=x) \quad \text{for all } x \in \text{supp}(X)\)
  • Let \(A\) be some event

Conditional Probability Mass Function

The distribution of \(X\) conditional on the event \(A\), denoted \(X|A\), is given by the conditional probability mass function \[ p_{X}(x | A) = {\mathbb P}(X=x | A) = {\mathbb P}(\left\{X=x\right\} | A) \]

More than one Random Variable

  • Let \(X\) be a random variable with PMF \(p_{X}(x) = {\mathbb P}(X=x) \quad \text{for all } x \in \text{supp}(X)\)
  • Let \(Y\) be a random variable with PMF \(p_{Y}(y) = {\mathbb P}(Y=y) \quad \text{for all } y \in \text{supp}(Y)\)

Joint Probability Mass Function

The distribution of \(X\) and \(Y\) is given by the joint PMF \[ p_{X,Y}(x,y) = {\mathbb P}(X=x, Y=y) = {\mathbb P}(\left\{X=x\right\} \cap \left\{Y=y\right\}) \] and conditioned on the event \(\left\{Y = y\right\}\) the conditional PMF of \(X | \left\{Y=y\right\}\) is \[ p_{X|Y}(x | y) = {\mathbb P}(X=x | Y=y) = {\mathbb P}\left(\left\{X=x\right\} | \left\{Y=y\right\}\right) \]

  • \(X\) and \(Y\) are said to be independent, \(X \perp\kern-5pt \perp Y\), if \(p_{X|Y}(x|y) = p_{X}(x)\)

Joint Probability Mass Function

The joint distribution of \(X\) and \(Y\) is given by the joint probability mass function \(p_{X,Y}(x, y)\) \[ p_{X,Y}(x,y) = {\mathbb P}(X=x, Y=y) = {\mathbb P}(\left\{X=x\right\} \cap \left\{Y=y\right\}) \]

More than one random variable (cont’d)

  • If \(X \perp\kern-5pt \perp Y\): \[\begin{aligned} p_{X,Y}(x, y) &= {\mathbb P}(\left\{X=x\right\} \cap \left\{Y=y\right\})\\ &= {\mathbb P}(\left\{X=x\right\}) \times {\mathbb P}(\left\{Y=y\right\})\\ &= p_{X}(x) \times p_{Y}(y) \end{aligned}\]

  • If \(X \not\perp\kern-5pt \perp Y\): \[\begin{aligned} p_{X,Y}(x, y) &= {\mathbb P}(\left\{X=x\right\} \cap \left\{Y=y\right\})\\ &= {\mathbb P}(\left\{X=x\right\} | \left\{Y=y\right\}) {\mathbb P}(\left\{Y=y\right\}) \quad\quad\text{recall week-2}\\ &= p_{X|Y}(x | y) p_{Y}(y)\\ \end{aligned}\]


\[ p_{X,Y}(x, y) = \begin{cases} p_{X}(x) \times p_{Y}(y) & X \perp\kern-5pt \perp Y\\ p_{X|Y}(x | y) \times p_{Y}(y) & X \not\perp\kern-5pt \perp Y \end{cases} \]
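The two cases above can be illustrated numerically. A sketch assuming a biased coin for \(X\) and a fair die for \(Y\) (both marginals are made up for illustration):

```python
from itertools import product

# Marginal PMFs for two independent random variables (values chosen for illustration)
p_X = {0: 0.4, 1: 0.6}                                    # a biased coin
p_Y = {k: 1/6 for k in range(1, 7)}                       # a fair die

# Under independence, the joint PMF factorizes: p_{X,Y}(x, y) = p_X(x) * p_Y(y)
p_XY = {(x, y): p_X[x] * p_Y[y] for x, y in product(p_X, p_Y)}

# The joint PMF still sums to 1 over the joint support
print(sum(p_XY.values()))

# In general p_{X,Y}(x, y) = p_{X|Y}(x | y) * p_Y(y); under independence
# p_{X|Y}(x | y) = p_X(x), which recovers the product formula
p_X_given_Y = {(x, y): p_XY[(x, y)] / p_Y[y] for (x, y) in p_XY}
assert all(abs(p_X_given_Y[(x, y)] - p_X[x]) < 1e-12 for (x, y) in p_XY)
```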

Discrete & Quantitative Random Variables

Cumulative Distribution Function

  • Given a quantitative random variable \(X\) with PMF \(p_{X}(x)\)

  • The Cumulative Distribution Function (CDF) of a discrete random variable \(X\) is defined as:

\[F_X(x) = {\mathbb P}(X \le x)\]

  • Essentially, the CDF at a point \(x\) is the probability that the random variable \(X\) takes a value less than or equal to \(x\).

  • For discrete random variables, the CDF is a step function.

Properties of CDFs

  1. Normalized: \[\lim_{x \rightarrow -\infty}F_X(x) = 0 \quad \text{and}\quad \lim_{x \rightarrow \infty}F_X(x) = 1\]
  2. Non-decreasing function: The function \(F_X(x)\) is non-decreasing, i.e., if

\[a \le b \quad \text{then} \quad F_X(a) \le F_X(b)\]

  3. Relationship with PMFs: The CDF is the sum of the PMF over all support points at or below \(x\), i.e., \[ F_X(x) = \sum_{y \in \text{supp}(X) \text{ such that } y \le x} p_{X}(y) \]
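The relationship with the PMF gives a direct recipe for computing a discrete CDF; a minimal sketch with a made-up PMF:

```python
# Discrete PMF on a numeric support (values chosen for illustration)
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

def cdf(x, pmf=pmf):
    """F_X(x) = sum of p(y) over support points y <= x (a step function)."""
    return sum(p for y, p in pmf.items() if y <= x)

print(cdf(1))    # P(X <= 1) = 0.1 + 0.3
print(cdf(2.5))  # the CDF only steps at support points: same as cdf(2)
print(cdf(10))   # to the right of the support, the CDF is 1
```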

Example

#| standalone: true
#| viewerHeight: 600
#| components: viewer
#| layout: vertical

import numpy as np
import matplotlib.pyplot as plt
from shiny import App, render, ui

# Generate a random sample
np.random.seed(0)
sample_data = np.random.randint(-5, 7, 1000)
sample_data = np.array([*np.random.randint(-3, 2, 1000), *sample_data])
sample_data = np.array([*np.random.randint(3, 5, 1000), *sample_data])

# Define the UI
app_ui = ui.page_fluid(
    ui.layout_sidebar(
        ui.sidebar(
            ui.input_slider(
                "x_val",
                "x",
                min = float(np.floor(sample_data.min())-0.5),
                max = float(np.ceil(sample_data.max())+0.5),
                value = 0.0,
                step = 0.5,
                ticks=True,
                animate=False
            )
        ),
        ui.output_plot("plots", height="500px")
    )
)

# Define the server logic
def server(input, output, session):
    @output
    @render.plot
    def plots():
        x = input.x_val()
        
        # Create figure and axes
        fig, axes = plt.subplots(1, 2, figsize=(12, 5))
        
        # Histogram Plot
        ax_hist = axes[0]
        n_bins = 11
        counts, bins, patches = ax_hist.hist(sample_data, bins=n_bins, color='lightgrey', edgecolor='black', density=True, align='left', width=0.5)
        
        # Color the bars <= x in blue
        for patch, bin_edge in zip(patches, bins):
            if bin_edge <= x:
                patch.set_facecolor('dodgerblue')
        
        # Add vertical line at x
        ax_hist.axvline(x, color='red', linewidth=1, alpha=0.5)
        ax_hist.set_xticks(np.arange(-5, 6, 1))
        ax_hist.set_title("Probability Mass Function (PMF)")
        ax_hist.set_xlabel("Support")
        ax_hist.set_ylabel("p(x)")
        ax_hist.set_ylim(0, 1.0)
        
        # CDF Plot
        ax_cdf = axes[1]
        Xrange = np.linspace(-5.5, 5.5, 200)
        ecdf = lambda x: np.sum(sample_data <= x) / len(sample_data)
        Y = [ecdf(x) for x in Xrange]
        ax_cdf.plot(Xrange, Y, color='dodgerblue')
        ax_cdf.axhline(ecdf(x), color='red', linewidth=1, linestyle='--', alpha=0.25)
        ax_cdf.axvline(x, ymax=ecdf(x), color='red', linewidth=0.25, alpha=0.25)
        # Add vertical line at x
        ax_cdf.axvline(x, color='red', linewidth=2)
        ax_cdf.set_title("Cumulative Distribution Function (CDF)")
        ax_cdf.set_xlabel("Support")
        ax_cdf.set_xticks(np.arange(-5, 6, 1))
        ax_cdf.set_ylabel("F(x)")
        ax_cdf.set_ylim(-0.05, 1.05)
        ax_cdf.set_xlim(-5.5, 5.5)
        plt.tight_layout()
        return fig

# Create the Shiny app
app = App(app_ui, server)
app

Central Tendency of a random variable

You’ve just made an amazing (in your opinion) TikTok short which has a potential for going viral.

  • With probability \(0.1\), the TikTok will really go viral, leading to \(10,000\) additional followers

  • With probability \(0.9\), your TikTok is not as good as you thought it was, and it leads to \(0\) additional followers.

How many additional followers do you expect to have after you post your TikTok?

Expectation

Expectation

The expectation of a (quantitative) random variable \(X\) is a weighted-average of all possible outcomes of \(X\), weighted by the probability of each outcome over its support.

\[\begin{aligned} {\mathbb E}(X) &= \sum_{x \in \text{supp}(X)} x \times {\mathbb P}(X=x)\\ &= \sum_{x \in \text{supp}(X)} x \times p_{X}(x) \end{aligned}\]

Example revisited

Let \(X\) be the random variable representing the number of additional followers you get on TikTok.

\(\text{supp}(X)\) \(10,000\) (viral) \(0\) (not viral)
\(p(x)\) 0.1 0.9

\[\begin{aligned} {\mathbb E}(X) &= \sum_{x \in \text{supp}(X)} x \times {\mathbb P}(X=x)\\ \\ &= (0.1 \times 10,000) + (0.9 \times 0)\\ \\ \therefore {\mathbb E}(X) &= 1,000. \end{aligned}\]
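The same computation, written as a small helper (the function name `expectation` is illustrative):

```python
def expectation(pmf):
    """E(X) = sum over the support of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

# TikTok example: 10,000 followers with probability 0.1, else 0
pmf_tiktok = {10_000: 0.1, 0: 0.9}
print(expectation(pmf_tiktok))  # 1000.0
```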

Linearity

For two random variables \(X\) and \(Y\), and for two constants \(a, b \in {\mathbb R}\), \[ {\mathbb E}(aX + bY) = a {\mathbb E}(X) + b {\mathbb E}(Y) \]

Why?

\[ \begin{aligned} {\mathbb E}(aX + bY) &= \sum_{x \in \text{supp}(X)}\sum_{ y \in \text{supp}(Y)} (ax + by) \cdot p_{X,Y}(x, y)\\ &= a\sum_{x \in \text{supp}(X)}\sum_{ y \in \text{supp}(Y)} x \cdot p_{X,Y}(x,y) + b \sum_{x \in \text{supp}(X)}\sum_{ y \in \text{supp}(Y)} y \cdot p_{X,Y}(x,y)\\ &= a\sum_{x \in \text{supp}(X)} x \left(\sum_{ y \in \text{supp}(Y)} p_{X,Y}(x,y)\right) + b \sum_{y \in \text{supp}(Y)} y \left(\sum_{ x \in \text{supp}(X)} p_{X,Y}(x,y)\right)\\ &= a\sum_{x \in \text{supp}(X)} x \cdot p_{X}(x) + b \sum_{y \in \text{supp}(Y)} y p_{Y}(y)\\ &= a{\mathbb E}(X) + b {\mathbb E}(Y) \end{aligned} \]

Transformations

  • Let \(X\) be a random variable and \(f\) be some function on the support of \(X\)
  • Let \(Y = f(X)\)
  • The transformation \(Y = f(X)\) is also a random variable!
  • The support, \(\text{supp}(f(X))\), is simply the set of all values that \(f\) maps \(\text{supp}(X)\) to

Expected value of \(Y = f(X)\)

\[ {\mathbb E}(f(X)) = \sum_{x \in \text{supp}(X)} {\mathbb P}(X=x) \times f(x) \]
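A sketch of this rule (sometimes called the “law of the unconscious statistician”), with a made-up uniform PMF; note there is no need to work out the PMF of \(Y = f(X)\) itself:

```python
def expectation_of(f, pmf):
    """E(f(X)) = sum over the support of f(x) * p(x)."""
    return sum(f(x) * p for x, p in pmf.items())

# Example (values chosen for illustration): X uniform on {1, 2, 3}, f(x) = x^2
pmf = {1: 1/3, 2: 1/3, 3: 1/3}
e_x2 = expectation_of(lambda x: x**2, pmf)  # (1 + 4 + 9) / 3
print(e_x2)
```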

Measure of uncertainty

Consider the following two gambling scenarios:

Coin Bet


You flip a fair coin.

  • If outcome is \(H\), I pay you \(\$10\)
  • If outcome is \(T\), you pay me \(\$10\)

\[\begin{aligned}{\mathbb E}(X) &= \Big(\frac{1}{2} \times \$10\Big) + \Big(\frac 12 \times -\$10\Big)\\ \\ &= \$0\end{aligned}\]

Dice Bet


You roll a fair die:

  • If outcome is \(\left\{6\right\}\), I pay you \(\$60\)
  • Otherwise, you pay me \(\$12\)

\[\begin{aligned}{\mathbb E}(X) &= \Big(\frac{1}{6} \times \$60\Big) + \Big(\frac 56 \times -\$12\Big)\\ \\ &= \$0\end{aligned}\]

Which of these two scenarios do you prefer? Are they still the same?

Variance

Variance

The variance of a (quantitative) random variable \(X\) is a measure of the spread or dispersion of the random variable \(X\) around its expected value.

\[ {\text{Var}}(X) = \sum_{x \in \text{supp}(X)} {\mathbb P}(X=x) \times (x - {\mathbb E}(X))^2 \]

Equivalent formulation

\[ {\text{Var}}(X) = {\mathbb E}(X^2) - {\mathbb E}(X)^2 \]
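This identity follows by expanding the square inside the definition of variance and applying linearity of expectation (noting that \({\mathbb E}(X)\) is a constant):

\[ \begin{aligned} {\text{Var}}(X) &= {\mathbb E}\left[(X - {\mathbb E}(X))^2\right]\\ &= {\mathbb E}\left[X^2 - 2X\,{\mathbb E}(X) + {\mathbb E}(X)^2\right]\\ &= {\mathbb E}(X^2) - 2\,{\mathbb E}(X)\,{\mathbb E}(X) + {\mathbb E}(X)^2\\ &= {\mathbb E}(X^2) - {\mathbb E}(X)^2 \end{aligned} \]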

Example revisited

Let \(X\) be the random variable representing your net gain.

Coin Bet


\(\text{supp}(X)\) $10 -$10
\({\mathbb P}(X=x)\) \(\frac 12\) \(\frac 12\)

\({\mathbb E}(X) = 0\)


\[\begin{aligned}{\text{Var}}(X) &= \left(\frac 12 \times (10 - 0)^2\right) + \left(\frac 12 \times (-10 - 0)^2\right)\\ \\ &= 100\end{aligned}\]

Dice Bet


\(\text{supp}(X)\) $60 -$12
\({\mathbb P}(X=x)\) \(\frac 16\) \(\frac 56\)

\({\mathbb E}(X) = 0\)


\[\begin{aligned}{\text{Var}}(X) &= \left(\frac 16 \times (60 - 0)^2\right) + \left(\frac 56 \times (-12 - 0)^2\right)\\ \\ &= 720\end{aligned}\]
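Both variance calculations can be verified in a few lines of code; a sketch reusing the PMFs of the two bets (helper names are illustrative):

```python
def expectation(pmf):
    """E(X) = sum over the support of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum over the support of p(x) * (x - E(X))^2."""
    mu = expectation(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

coin_bet = {10: 1/2, -10: 1/2}   # net gain from the coin bet
dice_bet = {60: 1/6, -12: 5/6}   # net gain from the dice bet

print(variance(coin_bet))  # 100
print(variance(dice_bet))  # 720
```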

Question

Which of the two bets above has the larger variance?

Variance of a linear combination

Variance of a linear combination

For two independent random variables \(X\) and \(Y\) with \(X \perp\kern-5pt \perp Y\), and
for two constants \(a, b \in {\mathbb R}\), \[{\text{Var}}(aX + bY) = a^2 {\text{Var}}(X) + b^2 {\text{Var}}(Y).\]
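A Monte Carlo sanity check of this identity, with the distributions of \(X\) and \(Y\) chosen only for illustration:

```python
import numpy as np

# Check Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) for independent X, Y
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.integers(1, 7, size=n)      # a fair die
Y = rng.binomial(1, 0.3, size=n)    # a Bernoulli(0.3) variable

a, b = 2.0, -3.0
lhs = np.var(a * X + b * Y)
rhs = a**2 * np.var(X) + b**2 * np.var(Y)
print(lhs, rhs)  # approximately equal
```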

Conditional Expectation and Variance

Given a random variable \(X\) and an event \(A\)

  • The conditional expectation of \(X\) given the event \(A\) is \[\begin{aligned} {\mathbb E}(X|A) &= \sum_{x \in \text{supp}(X)} x \times {\mathbb P}(X=x | A)\\ &= \sum_{x \in \text{supp}(X)} x \times p_{X}(x|A) \end{aligned}\]

  • The conditional variance of \(X\) given the event \(A\) is \[\begin{aligned} {\text{Var}}(X|A) &= \sum_{x \in \text{supp}(X)} (x - {\mathbb E}(X|A))^2 \times {\mathbb P}(X=x | A)\\ &= \sum_{x \in \text{supp}(X)} (x - {\mathbb E}(X|A))^2 \times p_{X}(x|A) \end{aligned}\]
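Both formulas can be illustrated with a fair die conditioned on the event \(A = \{X \text{ is even}\}\) (example chosen for illustration):

```python
# X: a fair die roll; A: the event {X is even}
pmf = {x: 1/6 for x in range(1, 7)}

A = {2, 4, 6}
p_A = sum(p for x, p in pmf.items() if x in A)

# Conditional PMF: p(x | A) = P({X = x} and A) / P(A), zero outside A
pmf_given_A = {x: (p / p_A if x in A else 0.0) for x, p in pmf.items()}

# E(X | A) and Var(X | A) from the conditional PMF
e_given_A = sum(x * p for x, p in pmf_given_A.items())
var_given_A = sum((x - e_given_A) ** 2 * p for x, p in pmf_given_A.items())
print(e_given_A)    # (2 + 4 + 6) / 3 = 4
print(var_given_A)  # 8/3
```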