
Autoregressive (AR) and Moving Average (MA) model

A process $\{X_t\}$ is said to be an ARMA(p, q) process if

  • $\{X_t\}$ is stationary
  • $\forall t.\; X_t - \phi_1 X_{t-1} - \dots - \phi_p X_{t-p} = a_t + \theta_1 a_{t-1} + \dots + \theta_q a_{t-q}$

Using the backward shift operator notation $B^h X_t = X_{t-h}$:

$$\Phi(B)X_t = (1-\phi_1 B - \dots - \phi_p B^p)X_t = (1+\theta_1 B + \dots + \theta_q B^q)a_t = \Theta(B)a_t$$

where $a_t \sim NID(0, \sigma^2)$.

$\{X_t\}$ is an ARMA(p, q) process with mean $\mu$ if $\{X_t-\mu\}$ is an ARMA(p, q) process.
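As a quick illustration, here is a minimal sketch (assuming numpy and statsmodels are available) that simulates an ARMA(1,1) process in this notation; note that statsmodels takes the coefficients of the full polynomials $\Phi(B)$ and $\Theta(B)$, so the AR coefficients enter with negated sign.

```python
from statsmodels.tsa.arima_process import ArmaProcess

# Simulate (1 - 0.6B) X_t = (1 + 0.4B) a_t with a_t ~ N(0, 1).
# Phi(B) = 1 - 0.6B  ->  [1, -0.6];  Theta(B) = 1 + 0.4B  ->  [1, 0.4]
arma = ArmaProcess(ar=[1, -0.6], ma=[1, 0.4])
x = arma.generate_sample(nsample=500)
print(arma.isstationary, arma.isinvertible)  # True True
```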

Moving average model (MA(q))

MA($\infty$): If $\{a_t\}\sim NID(0, \sigma^2)$, we say that $\{X_t\}$ is an MA($\infty$) process of $\{a_t\}$ if there exists a sequence $\{\psi_j\}$ with $\sum_{j=0}^\infty |\psi_j|<\infty$ such that $X_t = \sum_{j=0}^\infty \psi_j a_{t-j}$, $t\in\mathbb{Z}$.

We can calculate the ACF of a stochastic process $\{X_t\}$ as long as $\{X_t\}$ can be written in the form of an MA($\infty$) process.

Conversely, by the Wold decomposition (below), any purely non-deterministic stationary process can be written in MA($\infty$) form.

Theorem. The MA($\infty$) process is stationary with mean 0 and autocovariance function $\gamma(k) = \sigma^2 \sum_{j=0}^\infty \psi_j\psi_{j+|k|}$.

MA(q): $X_t = \sum_{i=0}^q \theta_i a_{t-i} = \Theta(B)a_t$, where $\theta_0 = 1$, $B$ is the backward shift operator ($B^h X_t = X_{t-h}$), and $a_t\sim NID(0, \sigma^2)$.

Under the MA(q) model,

$$\begin{align*} \gamma(1) = cov(X_t, X_{t+1})&=cov\Big(\sum_{i=0}^q\theta_i a_{t-i},\; \sum_{i=0}^q \theta_i a_{t+1-i}\Big)\\ &=E\Big(\sum_{i=0}^{q-1}\theta_i\theta_{i+1}a_{t-i}^2\Big) &&a\sim NID,\; cov(a_i,a_j)=0 \text{ for } i\neq j\\ &=\sigma^2\sum_{i=0}^{q-1}\theta_i\theta_{i+1} \end{align*}$$

Similarly,

$$\begin{align*} \gamma(k)=cov(X_t, X_{t+k}) &=cov\Big(\sum_{i=0}^q \theta_i a_{t-i},\; \sum_{i=0}^q \theta_i a_{t+k-i}\Big)\\ &=\sigma^2 \sum_{i=0}^{q-|k|}\theta_i\theta_{i+|k|}\,\mathbb{I}(|k|\leq q) \end{align*}$$

Then, the autocorrelation function (ACF) will be

$$\begin{align*} \rho_k &= \gamma(k)\big/\sqrt{var(X_t)\,var(X_{t+k})}\\ &= \gamma(k) \Big/ \sigma^2\sum_{i=0}^{q} \theta_i^2\\ &= \sigma^2 \sum_{i=0}^{q-|k|}\theta_i\theta_{i+|k|}\,\mathbb{I}(|k|\leq q) \Big/ \sigma^2\sum_{i=0}^{q} \theta_i^2\\ &= \sum_{i=0}^{q-|k|}\theta_i\theta_{i+|k|}\,\mathbb{I}(|k|\leq q) \Big/ \sum_{i=0}^{q} \theta_i^2 \end{align*}$$
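A minimal numpy sketch (the MA(2) coefficients are chosen for illustration) that checks this formula against a sample estimate from simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.0, 0.5, -0.3])      # theta_0 = 1, theta_1, theta_2
a = rng.normal(0, 1, 100_000)
x = np.convolve(a, theta)[: a.size]     # X_t = sum_i theta_i a_{t-i}

def ma_acf(theta, k):
    """Theoretical rho_k for an MA(q); zero beyond lag q."""
    q = len(theta) - 1
    if k > q:
        return 0.0
    return theta[: q - k + 1] @ theta[k:] / (theta @ theta)

for k in range(4):
    sample = np.corrcoef(x[:-k or None], x[k:])[0, 1] if k else 1.0
    print(k, round(ma_acf(theta, k), 3), round(sample, 3))
```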

Autoregressive model of order p (AR(p))

$$X_t - \phi_1 X_{t-1} - \dots - \phi_p X_{t-p} = \Phi(B)X_t = a_t$$
where $a_t\sim NID(0, \sigma^2)$, $B^h X_t = X_{t-h}$, $h\in\mathbb{Z}$, and $\Phi(B)=1-\phi_1 B-\dots-\phi_p B^p$.

AR(1)

Notice that for an AR(1) process, $a_t\sim NID(0, \sigma^2)$ and $a_t$ is uncorrelated with all previous $X_s$, $s<t$.

$$\begin{align*} X_t &= \phi X_{t-1} + a_t\\ &=\phi(\phi X_{t-2}+a_{t-1})+a_t &&\text{replace } X_{t-1}\\ &\;\;\vdots &&\text{repeated replacement}\\ &=\sum_{i=0}^\infty \phi^i a_{t-i} \end{align*}$$

which (for $|\phi|<1$) is an MA($\infty$) process.

$$\begin{align*} \gamma(k) &= cov(X_t, X_{t+k})\\ &=cov\Big(\sum_{i=0}^\infty \phi^i a_{t-i},\; \sum_{i=0}^\infty \phi^i a_{t+k-i}\Big)\\ &=cov\Big(\sum_{i=0}^\infty \phi^i a_{t-i},\; \sum_{i=0}^\infty \phi^{i+k} a_{t-i} + \sum_{i=0}^{k-1} \phi^i a_{t+k-i}\Big)\\ &=\phi^k\sum_{i=0}^\infty \phi^{2i}\,var(a_{t-i})\\ &=\phi^k \gamma(0)=\phi^k\,var(X_t) \end{align*}$$

$$\begin{align*} \gamma(0) &=var(X_t)\\ &=\sum_{i=0}^\infty \phi^{2i}\,var(a_{t-i})\\ &=\sigma^2\sum_{i=0}^\infty (\phi^2)^i &&a\sim NID(0,\sigma^2)\\ &=\sigma^2(1-\phi^2)^{-1} &&\text{when }\phi^2<1\text{, by the geometric series} \end{align*}$$
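To sanity-check this, a minimal numpy sketch that builds $X_t$ from the (truncated) MA($\infty$) form and compares the sample variance with $\sigma^2/(1-\phi^2)$; the parameter values are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, sigma = 0.7, 1.0
a = rng.normal(0, sigma, 200_000)
psi = phi ** np.arange(60)               # phi^i, truncated: phi^60 is negligible
x = np.convolve(a, psi)[: a.size]        # X_t = sum_i phi^i a_{t-i}

print(x.var(), sigma**2 / (1 - phi**2))  # both close to 1.96
```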

The AR(1) process is called causal, or future-independent, when $|\phi|< 1$.

Checking stationarity of AR(p)

$\Phi(B) = 1-\phi_1B-\dots-\phi_pB^p=0$ must have all of its roots lie outside the unit circle.
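A minimal numpy sketch of this root check (the helper `ar_is_stationary` is hypothetical, written for illustration):

```python
import numpy as np

def ar_is_stationary(phis):
    """phis = [phi_1, ..., phi_p]; Phi(z) = 1 - phi_1 z - ... - phi_p z^p."""
    # np.roots expects coefficients from the highest power down.
    coeffs = np.r_[-np.array(phis)[::-1], 1.0]
    roots = np.roots(coeffs)
    return np.all(np.abs(roots) > 1)

print(ar_is_stationary([0.5, 0.3]))  # True
print(ar_is_stationary([0.5, 0.6]))  # False: a root inside the unit circle
```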

ACF

AR(1) Case

$$X_t = \phi X_{t-1} + a_t,\qquad a_t\sim NID(0,\sigma^2)$$

For $k\in\mathbb{Z}^+$, multiply both sides by $X_{t-k}$:

$$X_t X_{t-k} = \phi X_{t-1}X_{t-k} + a_t X_{t-k}$$

Taking expectations, consider $E(a_tX_{t-k})$:

$$\begin{align*} cov(a_t, X_{t-k}) &= E(a_t X_{t-k})-E(a_t)E(X_{t-k})\\ &= E(a_t X_{t-k}) - 0\\ &= cov\Big(a_t, \sum_{i=0}^\infty \phi^i a_{t-k-i}\Big) = 0 \end{align*}$$

since $a_t$ is uncorrelated with previous $a$'s.

Hence

$$E(X_t X_{t-k}) = \phi E(X_{t-1}X_{t-k})$$

and since $cov(X_t,X_{t-k}) = E(X_tX_{t-k})-0$,

$$\gamma(k)=\phi\,\gamma(k-1)$$

By induction, $\gamma(k)=\phi^k\gamma(0)$.

AR(2) Case

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + a_t$$

Multiply both sides by $X_t$:

$$X_t^2 = \phi_1 X_{t-1}X_t + \phi_2 X_{t-2}X_t + X_t a_t$$

Taking expectations, note that $X_t$ is a linear combination of the $a$'s, so $E(X_t a_t)=\sigma^2$:

$$\gamma(0) = \phi_1\gamma(1) + \phi_2\gamma(2) + \sigma^2$$
$$\gamma(0)(1-\phi_1\rho(1)-\phi_2\rho(2)) = \sigma^2 \quad\text{since }\rho(k)=\gamma(k)/\gamma(0)$$

Multiply both sides by $X_{t-1}$ and take expectations:

$$E(X_tX_{t-1}) = \phi_1 E(X_{t-1}X_{t-1}) + \phi_2E(X_{t-2}X_{t-1}) + E(a_t X_{t-1})$$
$$\gamma(1) = \phi_1\gamma(0) + \phi_2\gamma(1)$$
$$\rho(1) = \phi_1 + \phi_2\rho(1) \;\Rightarrow\; \rho(1) = \frac{\phi_1}{1-\phi_2}$$

Multiply both sides by $X_{t-2}$ and take expectations:

$$E(X_tX_{t-2}) = \phi_1 E(X_{t-1}X_{t-2}) + \phi_2E(X_{t-2}X_{t-2}) + E(a_t X_{t-2})$$
$$\gamma(2) = \phi_1\gamma(1) + \phi_2\gamma(0)$$
$$\rho(2) = \phi_1\rho(1) + \phi_2$$

Continuing this pattern,

$$\rho(h) = \phi_1\rho(h-1)+\phi_2\rho(h-2)$$

with base cases

$$\rho(0)=1,\qquad \rho(1) = \frac{\phi_1}{1-\phi_2}$$
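A minimal sketch of this recursion, cross-checked against statsmodels' `arma_acf` (assuming statsmodels is available; the AR(2) values are chosen for illustration):

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_acf

def ar2_acf(phi1, phi2, nlags):
    """ACF of an AR(2) via rho(h) = phi1*rho(h-1) + phi2*rho(h-2)."""
    rho = np.empty(nlags + 1)
    rho[0] = 1.0
    rho[1] = phi1 / (1 - phi2)
    for h in range(2, nlags + 1):
        rho[h] = phi1 * rho[h - 1] + phi2 * rho[h - 2]
    return rho

print(ar2_acf(0.5, 0.3, 5).round(3))
print(arma_acf([1, -0.5, -0.3], [1], lags=6).round(3))  # should match
```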

AR(p) Case

Given $X_t = \sum_{i=1}^p \phi_iX_{t-i} + a_t$, the process is stationary if all $p$ roots of $\Phi(B)=0$ lie outside the unit circle.

Yule-Walker equations
For the first $p$ autocorrelations:

$$\rho(k) = \sum_{i=1}^p \phi_i\,\rho(|k-i|),\qquad k=1,\dots,p$$
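In matrix form this is $R\phi = r$ with $R_{ki} = \rho(|k-i|)$; a minimal numpy sketch (the AR(2) parameter values are assumed for illustration):

```python
import numpy as np

def yule_walker_solve(rho):
    """rho = [rho(0)=1, rho(1), ..., rho(p)] -> [phi_1, ..., phi_p]."""
    rho = np.asarray(rho)
    p = rho.size - 1
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    R = rho[idx]                       # Toeplitz matrix R[k, i] = rho(|k - i|)
    return np.linalg.solve(R, rho[1:])

# AR(2) with phi1 = 0.5, phi2 = 0.3:
rho1 = 0.5 / (1 - 0.3)                 # rho(1) = phi1 / (1 - phi2)
rho2 = 0.5 * rho1 + 0.3                # rho(2) = phi1*rho(1) + phi2
print(yule_walker_solve([1.0, rho1, rho2]).round(3))  # [0.5, 0.3]
```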

Partial Autocorrelation Function (PACF)

$$\phi_{kk} = corr(X_t, X_{t+k}\mid X_{t+1},...,X_{t+k-1})$$
i.e. the correlation between $X_t$ and $X_{t+k}$ after their mutual linear dependency on the intervening variables has been removed.

For a given lag $k$ and for all $j \in \{1,2,...,k\}$:

$$\rho_j = \sum_{i=1}^k\phi_{ki}\,\rho_{|j-i|}$$

We regard the ACFs as given, treat the $\phi_{ki}$ as regression parameters, and solve for $\phi_{kk}$; all together these form the Yule-Walker equations.

Example
For lag 1: $\rho_1 = \phi_{11}\rho_0 \Rightarrow \phi_{11} = \rho_1$

For lag 2,

$$\rho_1 = \phi_{21} + \phi_{22}\rho_1$$
$$\rho_2 = \phi_{21}\rho_1 + \phi_{22}$$
$$\Rightarrow \phi_{22} = \frac{\rho_2 - \rho_1^2}{1-\rho_1^2}$$
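A minimal numpy sketch that computes $\phi_{kk}$ by solving the lag-$k$ Yule-Walker system for each $k$; the ACF values are assumed for illustration:

```python
import numpy as np

def pacf_from_acf(rho, kmax):
    """rho = [1, rho(1), rho(2), ...]; returns [phi_11, ..., phi_{kmax,kmax}]."""
    rho = np.asarray(rho)
    out = []
    for k in range(1, kmax + 1):
        idx = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))
        R = rho[idx]                                      # lag-k Toeplitz system
        out.append(np.linalg.solve(R, rho[1 : k + 1])[-1])  # last coeff = phi_kk
    return np.array(out)

rho = [1.0, 0.6, 0.5, 0.3]
print(pacf_from_acf(rho, 3).round(5))
print((0.5 - 0.6**2) / (1 - 0.6**2))  # 0.21875, matches the lag-2 entry
```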

Causal and invertible

Causal/stationary if $X_t$ can be expressed as an MA($\infty$) process.

Invertible if $X_t$ can be expressed as an AR($\infty$) process.

Duality between AR and MA processes

A finite-order stationary AR(p) process corresponds to an MA($\infty$) process, and a finite-order invertible MA(q) process corresponds to an AR($\infty$) process.

Example

Given the model $X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} = a_t + \theta a_{t-1}$.

Assume the process is causal; then $X_t = \sum_{i=0}^\infty \psi_i a_{t-i} = \big(\sum_{i=0}^\infty \psi_i B^i\big)a_t = \Psi(B)a_t$.
By the ARMA model, $\Phi(B)X_t = \Theta(B) a_t \Rightarrow X_t = \frac{\Theta(B)}{\Phi(B)}a_t$
$\Rightarrow \Psi(B)=\Theta(B)/\Phi(B)$

Substituting back into the model: $1+\theta B = \big(\sum_{i=0}^\infty \psi_iB^i\big)(1-\phi_1B - \phi_2B^2)$

Matching coefficients of $B$: $\theta = \psi_1 - \phi_1 \Rightarrow \psi_1 = \phi_1 + \theta$

Matching coefficients of $B^2$: $0 = \psi_2 - \phi_1\psi_1 - \phi_2 \Rightarrow \psi_2 = \phi_2 + \phi_1(\phi_1+\theta)$

Assume the process is invertible; then
$a_t = \sum_{i=0}^\infty \pi_i X_{t-i} = \big(\sum_{i=0}^\infty \pi_i B^i\big)X_t = \Pi(B)X_t$,
and similarly we get $\Phi(B)=\Theta(B)\Pi(B)$.
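These hand-derived $\psi$ weights can be checked against statsmodels' `arma2ma` (assuming statsmodels; the parameter values are chosen for illustration):

```python
from statsmodels.tsa.arima_process import arma2ma

phi1, phi2, theta = 0.5, 0.2, 0.4
# ar = coefficients of Phi(B), ma = coefficients of Theta(B)
psi = arma2ma(ar=[1, -phi1, -phi2], ma=[1, theta], lags=3)
print(psi)                                          # [1, psi_1, psi_2]
print(phi1 + theta, phi2 + phi1 * (phi1 + theta))   # 0.9, 0.65 -- matches
```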

Wold Decomposition

Any zero-mean stationary process $\{X_t\}$ which is not deterministic can be expressed as a sum $X_t = U_t + V_t$, where

  • $\{U_t\}$ denotes an MA($\infty$) process and $\{V_t\}$ is a deterministic process which is uncorrelated with $\{U_t\}$;
  • a process is deterministic if the values $X_{n+j}$, $j\geq 1$, are perfectly predictable in terms of $\mathcal{M}_n = \overline{sp}\{X_t, t\leq n\}$, i.e. if $X_n$ comes from a deterministic process, it can be predicted (determined) from past observations of the process.

Model identification

| process | ACF | PACF |
|---|---|---|
| AR(p) | tails off | cuts off after lag p |
| MA(q) | cuts off after lag q | tails off |
| ARMA(p,q) | tails off after lag (q-p) | tails off after lag (p-q) |

Model Adequacy

Overall tests that check an entire group of residual autocorrelations at once are called portmanteau tests.

Box and Pierce: $Q = n \sum_{k=1}^m \hat\rho_k^2 \sim \chi^2_{m-(p+q)}$
Ljung and Box: $Q=n(n+2)\sum_{k=1}^m \frac{\hat\rho_k^2}{n-k}\sim \chi^2_{m-(p+q)}$
where $n$ is the number of observations, $m$ is the maximum lag, and $p, q$ are the orders of the fitted model.
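A minimal sketch (assuming statsmodels) of the Ljung-Box test applied to the residuals of a fitted ARMA(1,1), with `model_df = p + q` giving the degrees-of-freedom correction; the data is simulated for illustration:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.stats.diagnostic import acorr_ljungbox

np.random.seed(2)
x = ArmaProcess([1, -0.6], [1, 0.4]).generate_sample(500)

res = ARIMA(x, order=(1, 0, 1)).fit()                    # p = 1, q = 1
print(acorr_ljungbox(res.resid, lags=[10, 20], model_df=2))
```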

Model selection

$$AIC = -2\log ML + 2k$$
$$BIC = -2 \log ML + k \log n$$

where $k$ is the number of parameters and $n$ the sample size. BIC penalizes the number of parameters more heavily (for $n\geq 8$, $\log n > 2$).
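A minimal sketch of order selection by AIC and BIC on a simulated ARMA(1,1) (assuming statsmodels):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(3)
x = ArmaProcess([1, -0.6], [1, 0.4]).generate_sample(500)

for p, q in [(1, 0), (0, 1), (1, 1), (2, 1)]:
    res = ARIMA(x, order=(p, 0, q)).fit()
    print((p, q), round(res.aic, 1), round(res.bic, 1))  # smaller is better
```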

Example: Application of ARMA in Investment

Alternative assets modeling

$y_t$ and $r_t$ denote the observable appraisal return and the latent economic return, respectively.

Goal: infer the unobservable economic returns from the appraisal returns.

Geltner method (commercial real estate)

$$y_t = \phi y_{t-1} + (1-\phi)r_t = \sum_{j=0}^\infty \phi^j(1-\phi)r_{t-j} = \sum_{j=0}^\infty w_jr_{t-j}$$

(by repeatedly substituting $y_{t-1}$) where $\phi\in (0,1)$ and $w_j := \phi^j (1-\phi)$ are the weights.

Fitting $y_t = \hat\phi y_{t-1}+\hat a_t$ gives $\hat r_t = \frac{\hat a_t}{1-\hat\phi}$, with
$$var(\hat r_t)=\frac{\sigma^2}{(1-\hat\phi)^2}$$
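A minimal numpy sketch of this unsmoothing on simulated data (all values assumed for illustration), estimating $\phi$ by the lag-1 sample autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(4)
phi = 0.6
r = rng.normal(0.0, 0.05, 5000)              # latent economic returns
y = np.empty_like(r); y[0] = r[0]
for t in range(1, r.size):                   # appraisal smoothing
    y[t] = phi * y[t - 1] + (1 - phi) * r[t]

phi_hat = np.corrcoef(y[1:], y[:-1])[0, 1]   # AR(1) estimate of phi
a_hat = y[1:] - phi_hat * y[:-1]             # fitted innovations
r_hat = a_hat / (1 - phi_hat)                # unsmoothed return estimates
print(round(phi_hat, 3), round(r.std(), 4), round(r_hat.std(), 4))
```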

Getmansky, Lo, & Makarov

$$y_t = \sum_{i=0}^q w_i r_{t-i}$$

where $w_i\in(0,1)$ and $\sum_i w_i = 1$.
Since $y_t$ is a linear combination of white noise, it can be matched to an MA(q):

$$y_t = \sum_{i=0}^q \theta_i a_{t-i} = \sum_{i=0}^q \frac{\theta_i}{\sum_{j=0}^q \theta_j}\Big(\sum_{j=0}^q \theta_j\Big) a_{t-i} = \sum_{i=0}^q w_i r_{t-i}$$

identifying $w_i = \theta_i\big/\sum_{j=0}^q\theta_j$ and $r_{t-i} = \big(\sum_{j=0}^q \theta_j\big)a_{t-i}$.
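A minimal sketch that recovers the smoothing weights from an MA(2) fit on simulated appraisal returns (assuming statsmodels' parameter ordering `ma.L1, ma.L2, sigma2` for `trend="n"`; all values are chosen for illustration):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(5)
w = np.array([0.5, 0.3, 0.2])            # true weights, sum to 1
r = np.random.normal(0, 0.05, 5000)      # latent returns, white noise here
y = np.convolve(r, w)[: r.size]          # y_t = sum_i w_i r_{t-i}

res = ARIMA(y, order=(0, 0, 2), trend="n").fit()
theta = np.r_[1.0, res.params[:2]]       # [theta_0 = 1, theta_1, theta_2]
print((theta / theta.sum()).round(3))    # approximately [0.5, 0.3, 0.2]
```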

Factor Modeling
The economic returns can be regressed on the market returns:

$$r_t = \alpha + \beta r_{Mt} + e_t$$
$$y_t = \sum_{i=0}^q w_i (\alpha + \beta r_{M,t-i} + e_{t-i}) = \alpha\sum_{i=0}^q w_i + \beta \sum_{i=0}^q w_i r_{M,t-i} + \sum_{i=0}^q w_i e_{t-i} = \alpha + \beta \sum_{i=0}^q w_i r_{M,t-i} + \sum_{i=0}^q w_i e_{t-i}$$

using $\sum_i w_i = 1$ in the last step.