Let me integrate you out so that no one would notice you

An old fashioned mathematical technique might be to integrate out parameters from equations, and as more the equation is complicated as more it is difficult to get the analytical solution.

Basic idea

Consider a variable exchange x \mapsto z that satisfies \begin{align} \int_{-\infty}^{\infty} f(x) dx \Longrightarrow \int_{-1}^{1} f\Bigl(x(z)\Bigr) x'(z) dz. \nonumber \end{align} An easy solution for that is \begin{align} z = 2 \left(\frac{1}{1+\exp(-x)} \right)-1 \Longleftrightarrow x = \log{\left ( \frac{1+z}{1-z} \right )} \nonumber \end{align} and its first order derivative is \begin{align} x'(z) = \frac{2}{1 - z^{2}} \nonumber \end{align} It is well known that \begin{align} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi}} \exp\left(-\frac{x^2}{2} \right) dx = 1, \nonumber \end{align} and apply the variable exchange and the number we get by approximating 7-point Gauss rule with a 15-point Kronrod rule is: \begin{align} \int_{-1}^{1} \frac{1}{\sqrt{2 \pi}} \exp\left(-\frac{\left(\log{\left ( \frac{1+z}{1-z} \right )}\right)^2}{2} \right) \left|\frac{2}{1 - z^{2}}\right| dz = 0.99999692068548 \approx 1 \nonumber \end{align} Note that 1-0.99999692068548=3.07931451959398 \times 10^{-6}.

Test model

\begin{align} y = a_i x + \epsilon_i \\ a_i \sim {\cal N}(a,\omega) \\ \epsilon_i \sim {\cal N}(0,\sigma) \nonumber \end{align} Likelihood is \begin{align} \prod_{i=1}^n{\cal N}_d(y \mid a_i x, \sigma){\cal N}_d(a_i \mid a x,\omega) \nonumber \end{align}

Analytical solution

Integrate out a_i from the model \begin{align} p_i &= \int_{-\infty}^{\infty}{\cal N}_d(y \mid a_i x, \sigma){\cal N}_d(a_i \mid a x,\omega) d a_i \\ &= \int_{-\infty}^{\infty} \left[ \frac{1}{\sqrt{2 \pi} \sigma}\exp \left(-\frac{y - a_i x}{2 \sigma^2} \right) \frac{1}{\sqrt{2 \pi} \omega}\exp \left(-\frac{a_i - a}{2 \omega^2} \right) \right]d a_i \\ &=\frac{1}{\sqrt{2 \pi (\omega^2 x^2 + \sigma^2)}} \exp \left( -\frac{(y-ax)^2}{2(\omega^2 x^2 + \sigma^2)} \right) \nonumber \end{align}
  1. > summary(fit1)$summary
  2. mean se_mean sd 2.5% 25% 50%
  3. a 0.9888168 0.0006589843 0.03325222 0.9221687 0.9669033 0.9888208
  4. omega 0.4188159 0.0021747104 0.07486459 0.2419583 0.3795231 0.4274141
  5. sigma 0.5302711 0.0005498567 0.01996546 0.4930311 0.5166785 0.5294182
  6. lp__ -530.7041447 0.0428310403 1.40084589 -534.4240874 -531.3030933 -530.2699704
  7. 75% 97.5% n_eff Rhat
  8. a 1.0112437 1.0524001 2546.192 1.002017
  9. omega 0.4695590 0.5430313 1185.086 1.000445
  10. sigma 0.5427861 0.5725930 1318.438 1.002972
  11. lp__ -529.6919616 -529.2135198 1069.704 1.002579

Numerical solution

Compare the results with one using the following numerical integral. \begin{align} \int_{-\infty}^{\infty} f(x) dx \Longrightarrow \int_{-1}^{1} f\left(\log{\left ( \frac{1+z}{1-z} \right )}\right) \left|\frac{2}{1 - z^{2}}\right| dz. \nonumber \end{align}
  1. > summary(fit2)$summary
  2. mean se_mean sd 2.5% 25% 50%
  3. a 0.9895701 0.0007101089 0.03497257 0.9192264 0.9664499 0.9899563
  4. omega 0.4267222 0.0013115691 0.06287198 0.3079255 0.3800736 0.4278155
  5. sigma 0.5293555 0.0004180750 0.01877470 0.4938143 0.5165164 0.5290596
  6. lp__ -877.2053140 0.0278415504 1.21027072 -880.2556886 -877.7230284 -876.8940771
  7. 75% 97.5% n_eff Rhat
  8. a 1.0127247 1.0576070 2425.521 1.000030
  9. omega 0.4714495 0.5469434 2297.904 1.000609
  10. sigma 0.5421843 0.5668702 2016.683 1.000074
  11. lp__ -876.3328600 -875.8352797 1889.636 1.000688

Script

Data generation

  1. set.seed(1234)
  2.  
  3. n = 1000
  4. a = 1
  5. omega = 0.5
  6. sigma = 0.5
  7.  
  8. x = runif(n,-1,1)
  9. a_i = rnorm(n,a,omega)
  10. e_i = rnorm(n,0,sigma)
  11. y = x*a_i + e_i
  12.  
  13. ## 7-point Gauss rule with a 15-point Kronrod rule
  14. X = c(
  15. -0.991455371120813,
  16. -0.949107912342759,
  17. -0.864864423359769,
  18. -0.741531185599394,
  19. -0.586087235467691,
  20. -0.405845151377397,
  21. -0.207784955007898,
  22. 0,
  23. 0.207784955007898,
  24. 0.405845151377397,
  25. 0.586087235467691,
  26. 0.741531185599394,
  27. 0.864864423359769,
  28. 0.949107912342759,
  29. 0.991455371120813
  30. )
  31. C = c(
  32. 0.022935322010529,
  33. 0.063092092629979,
  34. 0.104790010322250,
  35. 0.140653259715525,
  36. 0.169004726639267,
  37. 0.190350578064785,
  38. 0.204432940075298,
  39. 0.209482141084728,
  40. 0.204432940075298,
  41. 0.190350578064785,
  42. 0.169004726639267,
  43. 0.140653259715525,
  44. 0.104790010322250,
  45. 0.063092092629979,
  46. 0.022935322010529
  47. )
  48.  
  49. d = list(
  50. n = n,
  51. x = x,
  52. y = y,
  53. X = X,
  54. C = C
  55. )
  56.  
  57. plot(x,y)

Model for analytical solution

  1. model1 = '
  2. functions{
  3. real ll(real a, real sigma, real omega, real x, real y){
  4. real res;
  5. res = -0.5*log(pi())-0.5*(log(omega^2*x^2+sigma^2)) - 0.5*(y - a*x)^2/(omega^2*x^2+sigma^2);
  6. return res;
  7. }
  8. }
  9.  
  10. data{
  11. int n;
  12. real x[n];
  13. real y[n];
  14. }
  15.  
  16. parameters{
  17. real a;
  18. real omega;
  19. real sigma;
  20. }
  21. model{
  22. for (i in 1:n)
  23. target += ll(a,sigma,omega,x[i],y[i]);
  24. }
  25. '
  26. StanModel1 <- stan_model(model_code = model1)
  27. fit1 <- sampling(StanModel1
  28. ,data = d
  29. )

Model for numerical integral

  1. model2 = '
  2. functions{
  3. real pdf(real exz, real z, real a, real sigma, real omega, real x, real y){
  4. real res;
  5. res = 1/(sqrt(2*pi())*sigma)*exp(-(y-exz*x)^2/(2*sigma^2))*1/(sqrt(2*pi())*omega)*exp(-(exz-a)^2/(2*omega^2))*(2/(1-z^2));
  6. return res;
  7. }
  8.  
  9. real int_pdf(vector X, vector C, real a, real sigma, real omega, real x, real y){
  10. real res;
  11. vector[15] Z;
  12.  
  13. for (i in 1:15)
  14. Z[i] = log((1+X[i])/(1-X[i]));
  15.  
  16. res = 0;
  17. for (i in 1:15)
  18. res += C[i]*pdf(Z[i],X[i],a,sigma,omega,x,y);
  19. return log(res);
  20. }
  21.  
  22. }
  23.  
  24. data{
  25. int n;
  26. real x[n];
  27. real y[n];
  28. vector[15] X;
  29. vector[15] C;
  30. }
  31.  
  32. parameters{
  33. real a;
  34. real omega;
  35. real sigma;
  36. }
  37. model{
  38. for (i in 1:n)
  39. target += int_pdf(X,C,a,sigma,omega,x[i],y[i]);
  40. }
  41. '
  42. StanModel2 <- stan_model(model_code = model2)
  43. fit2 <- sampling(StanModel2
  44. ,data = d
  45. )

Comments