Multivariate Distributions Joint Distributions The joint distribution of random variables \(X_1,...,X_n\) is defined as the collection of all possible probabilities, \(P((X_1,...,X_n)\in B)\) where \(B\) is a Borel set in \(\mathbb R^n\) . When, \(n=2\) , we can call it bivariate distribution .
Then, we can define the joint pmf as
\[\text{pmf}_{X,Y}(x,y) = P(X=x,Y=y)\]
Simiarly, the joint pdf is
\[P((X,Y) \in B) = \iint_B f(x,y)dxdy\]
Both of them still satisifies non-negativity and total probability.
Similarly, if we have more variables, the joint pmf is
\[\text{pmf}_{X_1,...,X_n}(x_1,...,x_n) = P(X_1=x_1, ..., X_n=x_n)\]
\[\text{pdf}_{X_1,...,X_n}(x_1,...,x_n) = \int\cdots\int_B f(x,y)dxdy\]
joint cdf is then defined as
\[\text{cdf}_{X,Y}(x,y) = P(X\leq x, Y\leq y)\]
Marginal Distributions Suppose \(X,Y\) are r.v. We can marginalize the density functions for \(X\) or \(Y\) from the joint density.
\[\text{cdf}_X(x) = \lim_{y\rightarrow\infty} \text{cdf}_{X,Y}(x,y)\]
\[\text{pdf}_X(x) = \int \text{pdf}_{X,Y}(x,y)dy\]
\[\text{pmf}_X(x) = \sum_{y} \text{pmf}_{X,Y}(x,y)\]
If we have more variables, we can marginalize one variable by taking its limit / integral / sum just as in bivariate case
random variable \(X,Y\) is independent (independence of r.v.) (\(X\perp Y\) ) IFF \(P(X\in A, Y\in B) = P(X\in A)P(Y\in B)\)
Theorem The following statements are equivalent:
\[X\perp Y\]
\[\text{cdf}_{X,Y}(x,y) = \text{cdf}_X(x) \text{cdf}_Y(y)\]
\[\text{pmf}_{X,Y}(x,y) = \text{pmf}_X(x) \text{pmf}_Y(y)\]
\[\text{pdf}_{X,Y}(x,y) = \text{pdf}_X(x) \text{pdf}_Y(y)\]
proof . (We will only prove for cdf since the idea is very similar using the definition)
\[\text{cdf}_{X,Y}(x,y) = P(X\leq x, Y\leq y) = P(X\in A)P(Y\in B) = \text{cdf}_X(x) \text{cdf}_Y(y)\]
If we have more variables, then they are (mutually) independent IFF
\[[cdf/pmf/pdf]_{X_1,...,X_n}(x_1,...,x_n) = \prod_i [cdf/pmf/pdf]_{X_i}(x_i)\]
Example \(X,Y\) are independent and has the same density \(2x \mathbb I(0\leq x\leq 1)\) , compute \(P(X+Y\leq 1)\)
\[\begin{align*} P(X+Y\leq 1) &= P(0\leq X\leq 1, 0\leq y\leq 1-x)\\ &= \int_0^1\int_0^{1-x}2x2ydydx\\ &= 4\int_0^1\int_0^{1-x}xydydx\\ &= 1/6 \end{align*}\]
Example \(\text{pdf}_{X,Y}(x,y) = kx^2y^2 \mathbb I(x^2+y^2\leq 1)\) , show \(X,Y\) is not independent.
Marginalize \(X\) ,
\[\begin{align*} \text{pdf}_X(x) &= \int_{-\infty}^\infty kx^2y^2 \mathbb I(x^2+y^2\leq 1)dy\\ &= kx^2\int_{-\infty}^\infty y^2 \mathbb I(x^2+y^2\leq 1)dy \\ &= kx^2\int_{-\sqrt{1-x^2}}^{\sqrt(1-x^2)}y^2dy\\ &= \frac{2}{3}kx^2(1-x^2)^\frac{3}{2} \end{align*}\]
Note that \(X,Y\) is symmetric, so that
\[\text{pdf}_X(x)\text{pdf}_Y(y) = \frac{2}{3}kx^2(1-x^2)^\frac{3}{2}\frac{2}{3}ky^2(1-y^2)^\frac{3}{2}\neq \text{pdf}_{X,Y}(x,y)\]
Example \(\text{pdf}_{X,Y}(x,y) = ke^{-(x+2y)}\mathbb I(x\geq 0, y\geq 0)\) , determine \(k\) and whether \(X,Y\) independent.
To determine \(k\) , using the total probability
\[\int_{-\infty}^\infty \int_{-\infty}^\infty \text{pdf}_{X,Y}(x,y) dxdy = \int_{0}^\infty \int_{0}^\infty ke^{-(x+2y)} dxdy = [-ke^{-2y}/2]^\infty_0 = \frac{0-(-k)}{2} = \frac{k}{2}\]
so that \(k/2 = 1\implies k=2\)
Marginalize \(X\) , we have
\[\text{pdf}_Y(y) = \int_{0}^\infty 2e^{-(x+2y)} dx = 2e^{-2y}\]
Marginalize \(Y\) , we have
\[\text{pdf}_X(x) = \int_{0}^\infty 2e^{-(x+2y)} dy = e^{-x}\]
Indeed, they are independent since
\[\text{pdf}_Y(y)\text{pdf}_X(x) = 2e^{-2y}\mathbb I(y\geq 0) e^{-x}\mathbb I(x\geq 0) = 2e^{-(x+2y)}\mathbb I(x\geq 0, y\geq 0) = \text{pdf}_{X,Y}(x,y)\]
Conditional Distributions The conditional pmf is defined as
\[\text{pmf}_{X|Y}(x|y) = P(X=x | Y=y) = \frac{P(X=x, Y=y)}{P(Y=y)} = \frac{\text{pmf}_{X,Y}(x,y)}{\text{pmf}_Y(y)}\]
The conditional pdf is given by
\[\text{pdf}_{X|Y}(x|y) = \frac{\text{pdf}_{X,Y}(x,y)}{\text{pdf}_Y(y)}\]
Since in continuous case, \(P(Y=y) = 0\) in general, we need to define a very small \(\delta\) so that
\[P(X\in A|Y\in B_\delta) =\frac{P(X\in A, Y\in B_\delta)}{P(Y\in B_\delta)} = \frac{\int_{B_\delta}\int_A \text{pdf}_{X,Y}(x,y)dxfy}{\int_{B_\delta}\text{pdf}_Y(y)dy}\]
Note that since \(B_\delta\) is arbitrarily small, we can approximate \(\int_{B_\delta}f(x)dx = 2\delta f(x)\)
January 9, 2023 January 9, 2023