
MANOVA

Write \(\mu_l=\mu+(\mu_l-\mu)\) and define \(\tau_l=\mu_l-\mu\), where \(\mu\) is the grand mean vector and \(\tau_l\) is the \(l\)-th treatment effect. We want to test \(H_0:\tau_1=\tau_2=\cdots=\tau_k=0\) versus \(H_A: \text{at least one} \ \tau_l \neq 0\). Then \(X_{lj}=\mu+(\mu_l-\mu)+(X_{lj}-\mu_l)=\mu+\tau_l+\epsilon_{lj}\), where \(\epsilon_{lj}=X_{lj}-\mu_l\) is the deviation of observation \(j\) in sample \(l\) from the population mean vector \(\mu_l\).

We can estimate the parameters \(\mu\) and \(\mu_l\) under the constraint \(\sum_{l=1}^kn_l\tau_l=\sum_{l=1}^kn_l(\mu_l-\mu)=0\), where \(N=n_1+n_2+\cdots+n_k=\sum_{l=1}^kn_l\). The estimate of the grand mean vector \(\mu\) is \(\hat{\mu}=\bar{X}=\frac1N\sum_{l=1}^k\sum_{i=1}^{n_l}X_{li}=\frac1N\sum_{l=1}^kn_l\bar{X}_l\). The estimate of the \(l\)-th group mean vector is \(\hat{\mu}_l=\bar{X}_l=\frac{1}{n_l}\sum_{i=1}^{n_l}X_{li}\), the mean vector of the observations in group \(l\), and the estimate of the \(l\)-th treatment effect is \(\hat{\tau}_l=\bar{X}_l-\bar{X}\).

The estimate of the error for observation \(j\) in sample \(l\) is the residual \(\hat{\epsilon}_{lj}=X_{lj}-\bar{X}_l\), so each observation decomposes into the estimated mean, treatment effect, and residual: \(X_{lj}=\bar{X}+(\bar{X}_l-\bar{X})+(X_{lj}-\bar{X}_l)\), or equivalently \(X_{lj}=\hat{\mu}+\hat{\tau}_l+\hat{\epsilon}_{lj}\).
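
As a quick illustration, here is a minimal NumPy sketch (with made-up data and hypothetical variable names) of the estimates \(\hat{\mu}\), \(\hat{\mu}_l\), \(\hat{\tau}_l\), and the residuals \(\hat{\epsilon}_{lj}\), including a check of the constraint \(\sum_l n_l\hat{\tau}_l=0\).

```python
import numpy as np

# Hypothetical data: k = 3 groups, p = 2 variables, one (n_l x p) array per group
rng = np.random.default_rng(0)
groups = [rng.normal(size=(10, 2)), rng.normal(size=(12, 2)), rng.normal(size=(8, 2))]

N = sum(len(X) for X in groups)

# Grand mean vector: mu_hat = (1/N) * sum over all observations
grand_mean = sum(X.sum(axis=0) for X in groups) / N

# Group mean vectors mu_l_hat and treatment effects tau_l_hat = Xbar_l - Xbar
group_means = [X.mean(axis=0) for X in groups]
treatment_effects = [xbar_l - grand_mean for xbar_l in group_means]

# Residuals eps_lj_hat = X_lj - Xbar_l for each observation
residuals = [X - xbar_l for X, xbar_l in zip(groups, group_means)]

# The weighted treatment effects sum to zero (the identifiability constraint)
constraint = sum(len(X) * tau for X, tau in zip(groups, treatment_effects))
print(np.allclose(constraint, 0))  # True
```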


The Within Sum of Squares matrix \(W\) is built from the differences between observation \(i\) in sample \(l\) and the sample \(l\) mean: \[ W=\sum_{l=1}^k \sum_{i=1}^{n_{l}}(X_{l i}-\bar{X}_{l})(X_{l i}-\bar{X}_{l})^T \]

The Between Sum of Squares matrix \(B\) is built from the differences between the sample \(l\) mean and the grand mean: \[ B=\sum_{l=1}^k n_{l}(\bar{X}_{l}-\bar{X})(\bar{X}_{l}-\bar{X})^T \]

The Total Sum of Squares matrix \(T\) is built from the differences between observation \(i\) in sample \(l\) and the grand mean: \[ T=\sum_{l=1}^k \sum_{i=1}^{n_{l}}(X_{l i}-\bar{X})(X_{l i}-\bar{X})^T \]

These three sum of squares matrices, \({W}, {B}\), and \({T}\) satisfy: \({T}={B}+{W}\).
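
A sketch of the three matrices on the same kind of hypothetical data as above (the helper name `manova_matrices` is my own), with a numerical check of the identity \(T=B+W\):

```python
import numpy as np

def manova_matrices(groups):
    """Within (W), between (B), and total (T) sum of squares matrices
    for a list of (n_l x p) arrays, one array per group."""
    N = sum(len(X) for X in groups)
    grand_mean = sum(X.sum(axis=0) for X in groups) / N

    W = sum((X - X.mean(axis=0)).T @ (X - X.mean(axis=0)) for X in groups)
    B = sum(len(X) * np.outer(X.mean(axis=0) - grand_mean,
                              X.mean(axis=0) - grand_mean) for X in groups)
    T = sum((X - grand_mean).T @ (X - grand_mean) for X in groups)
    return W, B, T

# Verify T = B + W on hypothetical data
rng = np.random.default_rng(0)
groups = [rng.normal(size=(10, 2)), rng.normal(size=(12, 2)), rng.normal(size=(8, 2))]
W, B, T = manova_matrices(groups)
print(np.allclose(T, B + W))  # True
```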


Under the null hypothesis \(H_0:\mu_1=\mu_2=\cdots=\mu_k\), and assuming equal covariance matrices, the test statistic \(\Lambda^*=\frac{|W|}{|B+W|}=\frac{|W|}{|T|}\) has Wilks’ lambda distribution \(\Lambda(p,N-k,k-1)\).
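
The statistic itself is just a ratio of determinants; a one-line sketch using \(W\) and \(T\) from the hypothetical helper above:

```python
import numpy as np

def wilks_lambda(W, T):
    """Wilks' lambda: Lambda* = |W| / |T|, where T = B + W."""
    return np.linalg.det(W) / np.linalg.det(T)

# e.g., with W, B, T from the manova_matrices sketch above:
# lam = wilks_lambda(W, T)
```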


But there are special cases where a transformation of \(\Lambda^*\) under \(H_0\) has an exact \(F\) distribution (assuming multivariate normal distributions with equal covariance matrices):

| Number of Variables | Number of Groups | Sampling Distribution of \(\Lambda^*\) | Test |
|---|---|---|---|
| \(p=1\) | \(k \geq 2\) | \(\displaystyle \left(\frac{N-k}{k-1}\right)\left(\frac{1-\Lambda^*}{\Lambda^*}\right) \sim F_{(k-1,\, N-k)}\) | One-way ANOVA |
| \(p=2\) | \(k \geq 2\) | \(\displaystyle \left(\frac{N-k-1}{k-1}\right)\left(\frac{1-\sqrt{\Lambda^*}}{\sqrt{\Lambda^*}}\right) \sim F_{(2(k-1),\, 2(N-k-1))}\) | MANOVA (2 variables) |
| \(p \geq 1\) | \(k=2\) | \(\displaystyle \left(\frac{N-p-1}{p}\right)\left(\frac{1-\Lambda^*}{\Lambda^*}\right) \sim F_{(p,\, N-p-1)}\) | Hotelling’s \(T^2\) test |
| \(p \geq 1\) | \(k=3\) | \(\displaystyle \left(\frac{N-p-2}{p}\right)\left(\frac{1-\sqrt{\Lambda^*}}{\sqrt{\Lambda^*}}\right) \sim F_{(2p,\, 2(N-p-2))}\) | MANOVA (3 groups) |
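
For example, the \(k=2\) row (the Hotelling’s \(T^2\) case) can be applied as a sketch like the following; `wilks_f_two_groups` is a hypothetical helper, and SciPy’s `stats.f.sf` supplies the \(F\) tail probability.

```python
from scipy import stats

def wilks_f_two_groups(lam, N, p):
    """Exact F transform of Wilks' lambda for k = 2 groups and p variables
    (the Hotelling's T^2 row of the table above)."""
    F = ((N - p - 1) / p) * (1 - lam) / lam
    df1, df2 = p, N - p - 1
    return F, stats.f.sf(F, df1, df2)  # F statistic and its p-value

# e.g., F, p_value = wilks_f_two_groups(lam=0.7, N=30, p=2)  # made-up numbers
```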

By Wilks’ theorem, for large sample sizes we can approximate the null distribution of \(\Lambda^*\) with a chi-square distribution via the log-likelihood ratio test: \(-(N-1-\frac{p+k}{2})\log\Lambda^*\stackrel{\cdot}{\sim}\chi^2_{p(k-1)}\).
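
A sketch of this large-sample approximation, using SciPy’s chi-square survival function for the p-value (the name `wilks_chi2_approx` is hypothetical):

```python
import numpy as np
from scipy import stats

def wilks_chi2_approx(lam, N, p, k):
    """Large-sample chi-square approximation:
    -(N - 1 - (p + k)/2) * log(Lambda*) ~ chi^2 with p*(k - 1) degrees of freedom."""
    stat = -(N - 1 - (p + k) / 2) * np.log(lam)
    df = p * (k - 1)
    return stat, stats.chi2.sf(stat, df)  # statistic and approximate p-value
```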