For each tissue we have a gene expression matrix \(X\), in which element \(X_{gi}\) measures the expression level of gene \(g\) in individual \(i\). We assume \(\mathbb{E}[X] = UV\) under the null (without genetic effects). Rather than fixing the rank of the matrices \(U\) and \(V\) to a particular value, we aim to identify a suitable rank through Matrix Factorization with Bayesian Group Lasso (MFGL). The group Lasso prior [1] has been used to select groups of features that model response variables in regression problems. Here we use its Bayesian extension [2] to avoid tuning a regularization parameter. We could instead use genetic association to choose an appropriate factorization size, but this can overfit to the given genotype matrix.
We formulate the model likelihood by multivariate Gaussian distributions:
\[\mathcal{L} = \prod_{i=1}^{m} P(\mathbf{y}_{i}|\mathbf{U},\mathbf{v}_{i},\sigma^{2}) = \prod_{i=1}^{m} \mathcal{N}(\mathbf{y}_{i}|U\mathbf{v}_{i},\sigma^{2}I)\]
The prior distribution of column vector \(\mathbf{u}_{j}\) is modeled by \[P(\mathbf{u}_{j}|\lambda) = \mathcal{N}(\mathbf{u}_{j}|\mathbf{0},\sigma^{2}\tau_{u}^{2}I)\] and that of row vector \(\mathbf{v}_{j}\) by \[P(\mathbf{v}_{j}|\lambda) \propto \exp\left( - \|\mathbf{v}_{j}\| \lambda/\sigma \right).\] A more convenient and equivalent hierarchical formulation was proposed in previous work [2]: \[P(\mathbf{v}_{j}|\tau_{v_{j}}^{2}) = \mathcal{N}(\mathbf{v}_{j}|\mathbf{0},\sigma^{2}\tau_{v_{j}}^{2}I_{m\times{m}})\] and \[P(\tau_{v_j}^{2}|\lambda) = \mathsf{Gam}\left(\tau_{v_j}^{2}|\frac{m +1}{2},\frac{\lambda^{2}}{2}\right).\] We can justify this hierarchical model by checking that \(\int d\tau_{v_j}^{2} \, P(\mathbf{v}_{j}|\tau_{v_j}^{2})P(\tau_{v_j}^{2}|\lambda) \propto \exp(-\|\mathbf{v}_{j}\|\lambda/\sigma).\)
Given hyperparameters that regularize the sparsity of the model, we can estimate the columns of \(U\) and rows of \(V\) by sampling from \[P(\mathbf{u}_{j}|\cdot) = \mathcal{N}((\mathbf{v}_{j}\mathbf{v}_{j}^{\top} + \tau_{u}^{-2})^{-1}R_{-j}\mathbf{v}_{j}^\top,\sigma^{2}(\mathbf{v}_{j}\mathbf{v}_{j}^{\top}+\tau_{u}^{-2})^{-1}I)\] and \[P(\mathbf{v}_{j}|\cdot) = \mathcal{N}((\mathbf{u}_{j}^{\top}\mathbf{u}_{j} + \tau_{v_j}^{-2})^{-1} \mathbf{u}_{j}^\top R_{-j},\sigma^{2}(\mathbf{u}_{j}^{\top}\mathbf{u}_{j}+\tau_{v_j}^{-2})^{-1}I),\] where \(R_{-j} = Y - UV + \mathbf{u}_{j}\mathbf{v}_{j}\).
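These conditional draws can be sketched in NumPy; the dimensions, noise level, and single-pass sweep below are illustrative assumptions, not the full sampler.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, J = 50, 30, 5          # genes x individuals x factors (hypothetical sizes)
sigma2, tau_u2 = 1.0, 1.0
U = rng.normal(size=(n, J))
V = rng.normal(size=(J, m))
Y = U @ V + rng.normal(scale=np.sqrt(sigma2), size=(n, m))
tau_v2 = np.ones(J)          # per-factor slab variances

def sample_factor(j):
    """Draw u_j and v_j from their Gaussian conditionals given R_{-j}."""
    R = Y - U @ V + np.outer(U[:, j], V[j])          # residual excluding factor j
    prec_u = V[j] @ V[j] + 1.0 / tau_u2
    U[:, j] = (R @ V[j]) / prec_u + rng.normal(scale=np.sqrt(sigma2 / prec_u), size=n)
    prec_v = U[:, j] @ U[:, j] + 1.0 / tau_v2[j]
    V[j] = (U[:, j] @ R) / prec_v + rng.normal(scale=np.sqrt(sigma2 / prec_v), size=m)

for j in range(J):
    sample_factor(j)
```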
The exact posterior of the unknown precision (inverse variance) can be derived analytically as an inverse Gaussian distribution: \[P(\tau_{v_j}^{-2}|\cdot) = \mathcal{IN}\left(\lambda\sigma/\|\mathbf{v}_{j}\|, \lambda^{2} \right)\] with probability density function \(\mathcal{IN}(x|\mu,\gamma) = \gamma^{1/2}(2\pi x^{3})^{-1/2}\exp\left(-\gamma(x-\mu)^{2} / (2\mu^{2}x)\right)\).
From \(\mathbb{E}[\tau_{v_j}^{2}] = \|\mathbf{v}_j\|/(\lambda\sigma) + 1/\lambda^{2}\) under the posterior (inverse Gaussian) distribution, we derive the following empirical Bayes update of the regularization parameter \(\lambda\): \[\lambda^{-2} \gets \frac{\sum_{j}\mathbb{E}[\tau_{v_j}^{2}]}{mJ+J} = \frac{\sum_{j}\|\mathbf{v}_{j}\|/\sigma}{(m+1)J} \lambda^{-1} + \frac{1}{m + 1} \lambda^{-2}.\] At the stationary point we reach a rather simple solution, \[\hat{\lambda} = \frac{m\sigma}{J^{-1}\sum_{j}\|\mathbf{v}_{j}\|}\,\left(\textrm{or, trivially,}\;\hat{\lambda} = 0\right).\]
We could use Monte Carlo EM, but to simplify calculation we take a rather radical empirical Bayes estimate of \(\lambda\): for each group \(j\) separately, instead of averaging over all \(J\) groups, we set \(\hat{\lambda}_{j} = m\sigma/\|\mathbf{v}_{j}\|.\) With this plug-in estimate, we derive \[\mathbb{E}[\tau_{v_{j}}^{-2}|\cdot] \approx \frac{\hat{\lambda}_{j}\sigma}{\|\mathbf{v}_{j}\|} = \min\{\frac{m\sigma^{2}}{\|\mathbf{v}_{j}\|^{2}}, C\}\] where we bound the estimate by some large constant \(C\) to prevent divergence when \(\|\mathbf{v}_{j}\| = 0\). With sufficiently large \(C\), the inference results were not affected.
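A minimal sketch of the capped per-group estimate; the cap value and helper name are assumptions for illustration.

```python
import numpy as np

def eb_precision(v_j, m, sigma2=1.0, C=1e6):
    """Per-group empirical Bayes estimate E[tau_j^{-2}] = min(m*sigma^2/||v_j||^2, C).

    The cap C guards against division by zero when ||v_j|| = 0.
    """
    norm2 = float(v_j @ v_j)
    if norm2 == 0.0:
        return C
    return min(m * sigma2 / norm2, C)
```

For example, a group with \(\|\mathbf{v}_j\|^2 = 2\) and \(m = 4\) gets precision \(2\), while an all-zero group is clipped at \(C\).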
The gene-specific mean vector is updated by \[\mathbb{E}[\mathbf{m}^{(g)}|\cdot] =\frac{(Y^{(g)}- UV^{(g)})\mathbf{1}}{n_{\textrm{individual}}}\]
\[\mathbb{E}[\mathbf{v}_{j}^{(g)}|\cdot] = \frac{\mathbf{u}_{j}^{\top}Y_{-j}^{(g)}} {\|\mathbf{u}_{j}\|^{2} + \tau_{v_{j}}^{-2}},\,\, \mathbb{V}[\mathbf{v}_{j}^{(g)}|\cdot] = \frac{\sigma^{2}} {\|\mathbf{u}_{j}\|^{2} + \tau_{v_{j}}^{-2}}I\] where \(Y_{-j}^{(g)} = Y^{(g)} - U V^{(g)} - M^{(g)} + \mathbf{u}_{j}\mathbf{v}_{j}^{(g)}\) and \(\mathbb{E}[\tau_{v_{j}}^{-2}] = n_{\textrm{individual}}\sigma^{2}/ \|\mathbf{v}_{j}\|^{2}\) by the empirical Bayes estimate.
\[\mathbb{E}[\mathbf{u}_{j}|\cdot] = \frac{\sum_{g} Y_{-j}^{(g)} {\mathbf{v}_{j}^{(g)}}^\top}{\sum_{g}\|\mathbf{v}_{j}^{(g)}\|^{2} + \tau_{u_j}^{-2}},\quad\textrm{and}\quad \mathbb{V}[\mathbf{u}_{j}|\cdot] = \frac{\sigma^{2}}{\sum_{g}\|\mathbf{v}_{j}^{(g)}\|^{2} + \tau_{u_j}^{-2}}I\] where \(Y_{-j}^{(g)} = Y^{(g)} - M^{(g)} - UV^{(g)} + \mathbf{u}_{j}\mathbf{v}_{j}^{(g)}\).
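The pooled coordinate update of \(\mathbf{u}_j\) across genes can be sketched as follows; the sizes and the zero mean matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tissue, m, J, G = 10, 8, 3, 4   # tissues x individuals x factors x genes
tau_u_inv2 = 1.0
U = rng.normal(size=(n_tissue, J))
Vs = [rng.normal(size=(J, m)) for _ in range(G)]
Ms = [np.zeros((n_tissue, m)) for _ in range(G)]   # gene-specific means, zero here
Ys = [U @ V + 0.1 * rng.normal(size=(n_tissue, m)) for V in Vs]

def update_uj(j):
    """E[u_j | .] pools residuals Y_{-j}^{(g)} over all genes."""
    num, den = np.zeros(n_tissue), tau_u_inv2
    for Y, V, M in zip(Ys, Vs, Ms):
        R = Y - M - U @ V + np.outer(U[:, j], V[j])   # Y_{-j}^{(g)}
        num += R @ V[j]
        den += V[j] @ V[j]
    U[:, j] = num / den

for j in range(J):
    update_uj(j)
```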
Let \(\bar{\mathbf{v}}^{(g)} = V^{(g)}\mathbf{1}\) denote the \(J \times 1\) vector of row sums of \(V^{(g)}\). Each gene contributes a term \[Y^{(g)} {V^{(g)}}^\top - U V^{(g)} {V^{(g)}}^\top - \mathbf{m}^{(g)} {\bar{\mathbf{v}}^{(g)}}^\top\] to the update of \(U\). We therefore accumulate the sufficient statistics \(T(\{V^{(g)}\}) = \sum_{g} \left[ Y^{(g)} {V^{(g)}}^{\top} - \mathbf{m}^{(g)} {\bar{\mathbf{v}}^{(g)}}^\top \right]\) and \(S(\{V^{(g)}\}) = \sum_{g} V^{(g)}{V^{(g)}}^\top\).
Looping through the entire corpus of genes is inefficient, so we use memoized online learning [3].
We now assume a spike-and-slab decomposition \(\mathbf{u}_{j} = \mathbf{z}_{j} u_{j}\), so that \[Y_{ti}^{(g)} \approx \sum_{j} Z_{tj} u_{j} V_{ji}^{(g)}.\]
To compute \(p(Z_{tj} = 1 | \cdot)\), compare the squared residual with \(Z_{tj}=0\), \[(Y_{ti}^{(g)} - \sum_{k\neq j} Z_{tk} u_{k} V_{ki}^{(g)})^{2},\] against that with \(Z_{tj}=1\), \[(Y_{ti}^{(g)} - \sum_{k\neq j} Z_{tk} u_{k} V_{ki}^{(g)} - u_{j} V_{ji}^{(g)})^{2}.\] Letting \(Y_{ti,-j}^{(g)} = Y_{ti}^{(g)} - \sum_{k\neq j} Z_{tk} u_{k} V_{ki}^{(g)}\), the difference reduces to \[-2 Y_{ti,-j}^{(g)} u_{j} V_{ji}^{(g)} + (u_{j} V_{ji}^{(g)})^{2}.\]
Posterior probability of effect size: we first characterize the distribution over \(\mathbf{u}_{j} \equiv \mathbf{z}_{j} u_{j}\).
\[\mathbb{E}[u_{j}|\cdot] = \frac{\mathbf{z}_{j}^\top (Y - \sum_{k\neq j} u_{k}\mathbf{z}_{k} \mathbf{v}_{k}) \mathbf{v}_{j}^\top} {\|\mathbf{v}_{j}\|^{2} \sum_{t} Z_{tj}^{2} + n_{\textrm{gene}} \tau_{u_j}^{-2}}\]
\[\mathbb{E}[u_{j}^{(s)}|\cdot] = \frac{\mathbf{z}_{j}^\top (Y\mathbf{v}_{j}^\top) - \mathbf{z}_{j}^\top U (V\mathbf{v}_{j}^\top) + u^{(s-1)}_{j} \|\mathbf{z}_{j}\|^{2} \|\mathbf{v}_{j}\|^{2}} {\|\mathbf{v}_{j}\|^{2} \|\mathbf{z}_{j}\|^{2} + n_{\textrm{gene}} \tau_{u_j}^{-2}}\] where we are in fact dealing with \(n_{\textrm{gene}}\) observations.
spike: \[\mathbb{E}[\mathbf{z}_{j}|\cdot]= \mathsf{sigmoid}\left( -\frac{1}{2\sigma^{2}} u_{j}^{2} \|\mathbf{v}_{j}\|^{2} + \frac{1}{\sigma^{2}} u_{j} (Y - \sum_{k\neq j} u_{k} \mathbf{z}_{k} \mathbf{v}_{k})\mathbf{v}_{j}^\top + n_{\textrm{gene}} \log \frac{\pi}{1-\pi}\right)\]
\[\mathbb{E}[\mathbf{z}_{j}^{(s)}|\cdot]= \mathsf{sigmoid}\left( -\frac{1}{2\sigma^{2}} u_{j}^{2} \|\mathbf{v}_{j}\|^{2} + \frac{u_{j}}{\sigma^{2}} \left( Y \mathbf{v}_{j}^\top - U (V \mathbf{v}_{j}^\top) + u_{j} \mathbf{z}_{j}^{(s-1)} \|\mathbf{v}_{j}\|^{2} \right) + n_{\textrm{gene}} \log \frac{\pi}{1-\pi} \right)\]
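The spike update vectorizes over tissues; a minimal sketch with a single gene, where the sizes, prior \(\pi\), and logit clipping are assumptions.

```python
import numpy as np

def sigmoid(x):
    # Clip to keep exp() numerically safe.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

rng = np.random.default_rng(3)
n_tissue, m, J = 12, 9, 4
sigma2, pi, n_gene = 1.0, 0.5, 1
Z = (rng.random((n_tissue, J)) < 0.5).astype(float)
u = rng.normal(size=J)
V = rng.normal(size=(J, m))
Y = (Z * u) @ V + 0.1 * rng.normal(size=(n_tissue, m))

def update_zj(j):
    """Mean-field update of z_j using cached products Y v_j^T and U (V v_j^T)."""
    vj2 = V[j] @ V[j]
    Yv = Y @ V[j]                     # Y v_j^T
    UVv = (Z * u) @ (V @ V[j])        # U (V v_j^T), with U = Z diag(u)
    logit = (-0.5 * u[j] ** 2 * vj2 / sigma2
             + u[j] / sigma2 * (Yv - UVv + u[j] * Z[:, j] * vj2)
             + n_gene * np.log(pi / (1.0 - pi)))
    Z[:, j] = sigmoid(logit)

for j in range(J):
    update_zj(j)
```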
\[p(\pi_{tj}|\cdot) \sim \mathsf{beta}\left(a_0/n_{\textrm{tissue}} + \mathbb{E}[z_{tj}], b_0 (n_{\textrm{tissue}}-1)/n_{\textrm{tissue}} + (1- \mathbb{E}[z_{tj}]) \right)\] as shown in [4].
We have sought to identify a global tissue factor matrix \(U\) and locally inferred gene-specific factor loading matrices \(V^{(g)}\) that factorize each gene matrix as \(Y^{(g)} = U V^{(g)}\). However, the factors are not readily interpretable and do not by themselves provide insight into the regulatory mechanisms of tissue-specific gene expression. To ascertain biological mechanisms we link these factors to genetic variants, and then link the genetic variants to regulatory / GWAS SNP annotations identified in reference data sets.
Fixing the factor axis \(j\), we can pool all genes and construct a gene \(\times\) individual matrix \(V^{(j)}\). For simplicity we omit the superscript \((j)\); the algorithm is identical for each factor.
We model the factor activity of gene \(g\) in individual \(i\) by a linear model with genetic effects: \[V_{gi} \sim \mathcal{N}\left(\sum_{s\in S(g)} Z_{s} W_{gs} X_{si},\sigma^{2}\right)\]
We are interested in gene-SNP association matrix \(W_{gs}\).
The genetic variant selection process is a priori biased by epigenetic / GWAS marks \(A_{ks}\) on SNP \(s\): \[Z_{s} \sim \mathsf{sigmoid}\left(\theta_{0} + \sum_{k} A_{ks} \theta_{k} \right)\] This selects a sparse model along the SNP axis (columns) of \(W\), since \(P(W_{gs}|Z_{s} =1)\) has mean zero. At the same time we shrink the gene axis using the Bayesian group Lasso, \[\mathbf{w}_{g}^\top\sim\mathcal{N}\left(\mathbf{0},\sigma^{2}\tau_{w_{g}}^{2}I\right)\] where \(\tau_{w_{g}}^{2}\sim\mathsf{Gam}\left((|S(g)|+1)/2,\lambda^{2}/2\right)\).
With the EB estimate of \(\lambda\) discussed above, \[\hat{\tau}_{w_{g}}^{-2} = \frac{|S(g)|\sigma^{2}}{\sum_{s\in S(g)} W_{gs}^{2}}.\]
\[\mathbb{E}\left[W_{gs}|z_{s} = 1\right] = \frac{\mathbf{v}_{g} \mathbf{x}_{s}^\top - \sum_{t\neq s} z_t W_{gt} \mathbf{x}_{t}\mathbf{x}_{s}^\top }{\|\mathbf{x}_{s}\|^{2} + \tau_{w_{g}}^{-2}}\]
\[\mathbb{V}\left[W_{gs}|z_{s} = 1\right] = \frac{\sigma^{2}}{\|\mathbf{x}_{s}\|^{2} + \tau_{w_{g}}^{-2}}\]
\[\mathbb{V}\left[W_{gs}|z_{s} = 0\right] = \tau_{w_g}^{2}\]
Posterior probability of \(Z_{s}\)
Since for each independent \(g\) we have \[\frac{\mathbb{V}\left[W_{gs}|z_{s} = 0\right]}{\mathbb{V}\left[W_{gs}|z_{s} = 1\right]} =\frac{\|\mathbf{x}_{s}\|^{2} + \tau_{w_g}^{-2}}{\sigma^{2}\tau_{w_g}^{-2}},\] the log ratio of the determinants factorizes over genes: \[\Lambda_{s} \equiv \frac{1}{2} \log \frac{\mathsf{det}\left(\mathbb{V}[\mathbf{w}_{s}|z_{s}=0]\right)}{\mathsf{det}\left(\mathbb{V}[\mathbf{w}_{s}|z_{s}=1]\right)} = \frac{1}{2} \sum_{g:\,s \in S(g)} \left[ \log(\|\mathbf{x}_{s}\|^{2}/\tau_{w_g}^{-2} +1 ) - \log \sigma^{2} \right].\]
\[M_{s} = \sigma^{-2}\left[ \mathbf{x}_{s} \left( V - \sum_{t\neq s} z_{t} \mathbf{w}_{t} \mathbf{x}_{t} \right)^\top \mathbf{w}_{s} - \frac{1}{2} \|\mathbf{x}_{s}\|^{2} \|\mathbf{w}_{s}\|^{2}\right]\]
\[P(z_{s}=1|\cdot) = \mathsf{sigmoid}\left(M_{s} + \theta_{0} + \theta^\top \mathbf{a}_{s} + \Lambda_{s} \right)\]
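The SNP inclusion update combines the fit term \(M_s\), the annotation prior, and the volume correction \(\Lambda_s\); a sketch with illustrative sizes and values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

rng = np.random.default_rng(4)
G, n, S, K = 5, 20, 8, 3           # genes, individuals, SNPs, annotations
sigma2 = 1.0
X = rng.normal(size=(S, n))        # genotype rows x_s
A = rng.normal(size=(K, S))        # annotation columns a_s
W = 0.1 * rng.normal(size=(G, S))
z = (rng.random(S) < 0.3).astype(float)
theta0, theta = -2.0, 0.1 * rng.normal(size=K)
V = (W * z) @ X + 0.1 * rng.normal(size=(G, n))
tau_w_inv2 = np.ones(G)            # per-gene EB precisions

def update_zs(s):
    """P(z_s = 1 | .) = sigmoid(M_s + theta_0 + theta^T a_s + Lambda_s)."""
    xs, ws = X[s], W[:, s]
    R = V - (W * z) @ X + np.outer(ws * z[s], xs)   # residual excluding SNP s
    xs2 = xs @ xs
    M = ((R @ xs) @ ws - 0.5 * xs2 * (ws @ ws)) / sigma2
    Lam = 0.5 * np.sum(np.log(xs2 / tau_w_inv2 + 1.0) - np.log(sigma2))
    z[s] = sigmoid(M + theta0 + theta @ A[:, s] + Lam)

for s in range(S):
    update_zs(s)
```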
\[Y_{gi} \,|\, H_{gk} = 1 \sim \mathcal{N}\left(\sum_{s} Z_{sk} W_{gs}^{k} X_{si},\,\sigma^{2}\right)\]
A simple coordinate step gives \[q(H_{gk} = 1 |\cdot) \propto \mathcal{N}\left(\mathbf{y}_{g} \vert X^\top \hat\beta_{k}, \sigma^{2} I \right)\]
where \(\hat\beta_{k} = \mathbf{w}_{g} \circ \mathbf{z}_{k}\)
\[\propto \exp\left( \sigma^{-2} \hat\beta_{k}^\top(X\mathbf{y}_{g}) - \frac{1}{2} \sigma^{-2} (X^\top \hat\beta_{k})^\top (X^\top \hat\beta_{k}) \right)\]
This does not take the variance into account, but it can be fully vectorized.
\(\sigma^{-2} \hat{B}^\top X Y^\top\) is then a cluster \(\times\) gene matrix, and the second term does not depend on the gene.
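A fully vectorized sketch of the cluster responsibilities, with \(Y\) stored gene \(\times\) individual so the cluster \(\times\) gene term is \(\sigma^{-2}\hat B^\top X Y^\top\); all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_snp, n_ind, n_gene, K = 6, 15, 10, 3
sigma2 = 1.0
X = rng.normal(size=(n_snp, n_ind))
Y = rng.normal(size=(n_gene, n_ind))
B = 0.1 * rng.normal(size=(n_snp, K))        # columns are hat beta_k

lin = (B.T @ X @ Y.T) / sigma2               # cluster x gene linear term
XB = X.T @ B                                 # individual x cluster
quad = -0.5 * np.einsum('ik,ik->k', XB, XB) / sigma2   # depends on cluster only
logits = lin + quad[:, None]
q = np.exp(logits - logits.max(axis=0))      # stabilized softmax over clusters
q /= q.sum(axis=0)                           # q(H_gk = 1 | .) for each gene g
```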
Alternatively, we can use locally collapsed variational inference:
\[q(H_{gk}=1|\cdot) \propto \int d \beta\, \mathcal{N}\left(\mathbf{y}_{g}|X^\top \beta,\sigma^{2}I\right) \mathcal{N}\left(\beta|\hat\beta_{k},\hat\Sigma_{k}\right)\] where \(\hat\beta_{k} = \mathbf{w}_{g} \circ \mathbf{z}_{k}\) and \(\hat\Sigma_{k} = \sigma^{2}D_{0}^{-1}\), and \(D_{0} \equiv \mathsf{diag}\left[\ldots,\|\mathbf{x}_{s}\|^{2} Z_{sk} + \tau_{w_g}^{-2},\ldots\right]\).
Let \[\tilde{\Lambda} = \sigma^{-2} X X^\top + \sigma^{-2}D_{0}\] and \[\tilde{\mathbf{m}} = \tilde{\Lambda}^{-1} \sigma^{-2} (X\mathbf{y} + D_{0} \hat\beta_{k})\]
After integrating out \(\beta\): \[q \propto \vert \hat{\Sigma}_{k} \vert^{-1/2} \vert\tilde{\Lambda}\vert^{-1/2} \exp\left( -\frac{1}{2} \hat\beta_{k}^\top \hat\Sigma_{k}^{-1} \hat\beta_{k} + \frac{1}{2} \tilde{\mathbf{m}}^\top \tilde{\Lambda} \tilde{\mathbf{m}}\right)\]
Effect size: \[\mathbb{E}[W_{gs}|Z_{sk}=1] = \frac{\mathbf{y}_{g} \mathbf{x}_{s}^\top - \sum_{s'\neq s} Z_{s'k} W_{gs'} \mathbf{x}_{s'} \mathbf{x}_{s}^\top} {\|\mathbf{x}_{s}\|^{2} + \tau_{w_{g}}^{-2}}\] and \[\mathbb{V}[W_{gs}|Z_{sk}=1] = \frac{\sigma^{2}}{\|\mathbf{x}_{s}\|^{2} + \tau_{w_g}^{-2}}\]
Let \(\tilde{\mathbf{w}}_{s} = \mathbf{h}_{k} \circ \mathbf{w}_{s}\)
SNP selection: \[M_{sk} = \mathbf{x}_{s} \left(Y - \sum_{s' \neq s} Z_{s'k} \tilde{\mathbf{w}}_{s'} \mathbf{x}_{s'}\right)^\top \tilde{\mathbf{w}}_{s} - \frac{1}{2}\|\mathbf{x}_{s}\|^{2} \mathbb{E}\|\tilde{\mathbf{w}}_{s}\|^{2}\] where \(\mathbb{E}\|\tilde{\mathbf{w}}_{s}\|^{2} = \sum_{g} H_{gk} (W_{gs})^{2}\)
\[Z_{sk} | \cdot \sim \mathsf{sigmoid}\left( M_{sk}\sigma^{-2} + \theta_{0} + \theta^\top \mathbf{a}_{s} + \Lambda_{sk}\right)\]
where
\[\Lambda_{sk} = \frac{1}{2} \sum_{g} H_{gk} \left[\log(\|\mathbf{x}_{s}\|^{2}/\tau_{w_{g}}^{-2} + 1) - \log \sigma^{2}\right]\]
\(Z_{s} \sim \mathsf{sigmoid}\left(\theta_{0} + \theta^\top \mathbf{a}_{s}\right)\)
We assume sparse prior \(P(\theta|\lambda) \propto \exp\left(-\|\theta\|_{1}\lambda/2\right)\)
Bayesian Lasso:
\[\mathbb{E}\left[\tau_{\theta_{k}}^{-2}|\cdot\right] = \frac{\lambda}{|\theta_{k}|}\]
Here \[\mathcal{L} = \sum_{s=1}^{n_{\textrm{SNP}}} \left[ Z_{s} (\theta_{0} + \theta^\top\mathbf{a}_{s}) - \log\left(1+e^{\theta_{0}+\theta^\top\mathbf{a}_{s}}\right) \right] - \frac{1}{2} \sum_{k} \frac{\theta_{k}^{2}}{\tau_{\theta_{k}}^{2}}\]
Approximate log likelihood: \[\mathcal{L}_{Q} = - \frac{1}{2} \sum_{s} \omega_{s}(\kappa_{s} - \theta_{0} - \theta^\top \mathbf{a}_{s})^{2} - \frac{1}{2}\sum_{k} \frac{\theta_{k}^{2}}{\tau_{\theta_{k}}^{2}}\] where \(p_{s} = \mathsf{sigmoid}\left(\theta_0 + \theta^\top\mathbf{a}_{s}\right)\), \(\omega_{s} = p_{s}(1-p_{s})\), and \[\kappa_{s} = \hat{\theta}_{0} + \hat{\theta}^\top \mathbf{a}_{s} + \frac{z_s - p_s}{p_s(1-p_s)}\]
We find coordinate descent update of \(\theta\):
\[\hat\theta_{0} \gets \frac{\sum_{s} \omega_{s}(\kappa_{s} - \theta^\top\mathbf{a}_{s})}{\sum_{s} \omega_{s}}\]
\[\hat{\theta}_{k} \gets \frac{\sum_{s} \omega_{s} \gamma_{s,-k} A_{ks} }{\sum_{s}\omega_{s} A_{ks}^{2} + \tau_{\theta_k}^{-2}}\] where \(\gamma_{s,-k} = \kappa_{s} - \theta_0 - \sum_{k'\neq k} \theta_{k'}A_{k's}\).
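The quadratic approximation and coordinate descent above can be sketched as an IRLS-style loop; the data generation, iteration count, and fixed prior precisions are assumptions for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

rng = np.random.default_rng(6)
S, K = 200, 4
A = rng.normal(size=(K, S))
theta_true = np.array([1.0, -0.5, 0.0, 0.0])
z = (rng.random(S) < sigmoid(0.5 + theta_true @ A)).astype(float)

theta0, theta = 0.0, np.zeros(K)
tau_inv2 = 1e-2 * np.ones(K)      # fixed prior precisions for this sketch

for _ in range(50):
    eta = theta0 + theta @ A
    p = sigmoid(eta)
    w = np.clip(p * (1.0 - p), 1e-6, None)    # omega_s
    kappa = eta + (z - p) / w                 # working response kappa_s
    theta0 = np.sum(w * (kappa - theta @ A)) / np.sum(w)
    for k in range(K):
        gamma = kappa - theta0 - (theta @ A - theta[k] * A[k])   # gamma_{s,-k}
        theta[k] = np.sum(w * gamma * A[k]) / (np.sum(w * A[k] ** 2) + tau_inv2[k])

def loglik(t0, t):
    """Logistic log-likelihood at (t0, t), without the penalty."""
    eta = t0 + t @ A
    return np.sum(z * eta - np.logaddexp(0.0, eta))
```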
Empirical Bayes solution: \[\lambda^{-2} = \frac{1}{2K}\sum_{k} \mathbb{E}\left[\tau_{\theta_{k}}^{2}|\cdot\right] = \frac{\lambda^{-1}}{2K} \sum_{k} |\theta_{k}| + \frac{1}{2}\lambda^{-2}\] At the stationary point: \[\hat{\lambda} = \frac{K}{\sum_{k=1}^{K}|\theta_{k}|}\]
A more radical choice is \(\hat{\lambda}_{k} = 1/\sqrt{\mathbb{E}[\theta_{k}^{2}]}\), which gives \[\mathbb{E}\left[\tau_{\theta_k}^{-2}|\cdot\right] \to 1/\mathbb{E}\left[\theta_{k}^{2}\right] \le \hat{\theta}_{k}^{-2}\]
Let \(\mathbb{V}_{\textrm{MLE}}[\theta_{k}] = (\sum_{s=1}^{n_{\textrm{SNP}}}\omega_{s} A_{ks}^{2})^{-1}\)
We can estimate standard error of \(\theta_{k}\): \[\mathbb{V}\left[\theta_{k}\right]^{-1} = \mathbb{V}_{\textrm{MLE}}^{-1} + \mathbb{E}\left[\tau_{\theta_k}^{-2}|\cdot\right] \le \mathbb{V}_{\textrm{MLE}}^{-1} + \hat{\theta}_k^{-2}\]
Or, taking full account of \(\mathbb{E}\left[\theta_{k}^{2}|\cdot\right] = \hat\theta_{k}^{2} + \mathbb{V}\), we have a stationary-point solution: \[\mathbb{V}_{\textrm{EB}} = - \frac{\hat\theta_k^2}{2} + \hat\theta_k\sqrt{\mathbb{V}_{\textrm{MLE}} + \hat\theta_k^2/4}\]
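The stationary-point variance solves \(\mathbb{V}^{-1} = \mathbb{V}_{\textrm{MLE}}^{-1} + 1/(\hat\theta_k^2 + \mathbb{V})\), which can be checked numerically; the numeric values below are arbitrary.

```python
import numpy as np

def v_eb(theta_hat, v_mle):
    """Stationary point of V^{-1} = V_MLE^{-1} + 1/(theta_hat^2 + V)."""
    t = theta_hat ** 2
    return -0.5 * t + np.sqrt(t * v_mle + 0.25 * t ** 2)

theta_hat, v_mle = 0.8, 0.3
V = v_eb(theta_hat, v_mle)
lhs = 1.0 / V
rhs = 1.0 / v_mle + 1.0 / (theta_hat ** 2 + V)
lower = 1.0 / (1.0 / v_mle + 1.0 / theta_hat ** 2)   # bound from the text
```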
\(\mathbb{V}_{\textrm{EB}}[\theta_{k}] \ge (\mathbb{V}_{\textrm{MLE}}[\theta_{k}]^{-1} + \hat\theta_{k}^{-2})^{-1}\)
If \(\mathbb{V}_{\textrm{MLE}}\to 0\) then \(\mathbb{V}_{\textrm{EB}} \to 0\); likewise, if \(\hat\theta_{k} \to 0\) then \(\mathbb{V}_{\textrm{EB}} \to 0\).
Related methods include PEER [5], Matrix eQTL [6], and kruX [7].
1. Yuan, M. & Lin, Y. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68, 49–67 (2006).
2. Kyung, M., Gill, J., Ghosh, M. & Casella, G. Penalized regression, standard errors, and Bayesian lassos. Bayesian Analysis 5, 369–411 (2010).
3. Hughes, M. C. & Sudderth, E. Memoized online variational inference for Dirichlet process mixture models. In Advances in Neural Information Processing Systems 26 (eds. Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z. & Weinberger, K. Q.) 1133–1141 (Curran Associates, Inc., 2013). <http://papers.nips.cc/paper/4969-memoized-online-variational-inference-for-dirichlet-process-mixture-models.pdf>
4. Paisley, J. & Carin, L. Nonparametric factor analysis with beta process priors. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09) 1–8 (ACM Press, 2009). doi:10.1145/1553374.1553474
5. Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nature Protocols 7, 500–507 (2012).
6. Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).
7. Qi, J., Asl, H. F., Björkegren, J. & Michoel, T. kruX: matrix-based non-parametric eQTL discovery. BMC Bioinformatics 15, 11 (2014).