Cubic spline with 6 knots in a regression context.

A cubic spline is a way to approximate a function, and it is typically used to approximate the baseline hazard function in joint modeling. The cubic spline is a powerful weapon, something like the "LR1K sonic cannon". However, there is almost no material on it that comes with a simple worked example, so I would like to provide one here.


True model

Suppose the true function is \begin{align} \hat{y}=2 \sin(x) -0.5\cos(x) +1.5\sin(2x)-3\cos(2x), \nonumber \end{align} and the data follow \begin{align} y \sim {\cal N} \left( \hat{y}, \sigma\right).\nonumber \end{align} A dataset comprising 200 observations was generated on the interval x \in [-\pi,\pi].
    rm(list=ls())

    # basic parameters
    nobs = 200 # number of observations
    minx = -pi # lower limit of x
    maxx = pi  # upper limit of x

    # generate data
    set.seed(1234)

    cwidth = (maxx-minx)/4
    cut = seq(minx,maxx,cwidth)

    x = runif(nobs,minx,maxx)
    x = x[order(x)]

    f1 = function(x){
      haty = 2*sin(x) - 0.5*cos(x) + 1.5*sin(2*x) - 3*cos(2*x)
      return(haty)
    }

    y = f1(x) + rnorm(nobs,0,1)

    miny = min(y)
    maxy = max(y)

    # Visualization
    plot(x,y,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2))
    curve(f1,minx,maxx,add=T,lty=2)
The generated data are shown in the figure below.

Cubic spline with 6 knots

Since a cubic (3rd-order) B-spline is assumed, each boundary knot must appear 3 times, which adds 3 \times 2=6 boundary knots to the 6 interior knots, so a total of 6+6=12 knots is used. In this case, the following knot vector is obtained. \begin{align} \left[ -3.14 ,-3.14 ,-3.14 ,-2.24 ,-1.35 ,-0.45 ,0.45 ,1.35 ,2.24 ,3.14 ,3.14 ,3.14 \right] \nonumber \end{align}
    # basic parameters
    nnots = 6 # number of interior knots
    t = c(rep(minx,2),seq(minx,maxx,(maxx-minx)/(nnots+1)),rep(maxx,2))
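As a quick check (using the same values as above: minx = -pi, maxx = pi, nnots = 6), the knot vector can be rebuilt and printed to compare against the one listed:

```r
# Rebuild the knot vector from the setup above and round for display
minx = -pi; maxx = pi; nnots = 6
t = c(rep(minx, 2), seq(minx, maxx, (maxx - minx)/(nnots + 1)), rep(maxx, 2))
print(round(t, 2))
# 12 knots in total: each boundary knot with multiplicity 3, plus 6 interior knots
```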
The basic idea of a cubic spline is to 1) divide the interval, 2) assume a 3rd-order polynomial on each subinterval, and 3) connect the polynomials smoothly so that the 0th- to 2nd-order derivatives of neighboring pieces agree. However, directly applying those conditions to obtain the best estimates is not efficient; there is an alternative derivation, mathematically equivalent to the original one, as follows:

0-order basis function

\begin{align} b^0_j(x)= \left\{ \begin{array}{ll} 1 & (t_j \le x \le t_{j+1})\\ 0 & \text{otherwise} \end{array} \right.\nonumber \end{align} Let T be the number of knots. The number of these functions is T−1.
    # 0-order basis function
    for (i in 1:(length(t)-1)){
      eval(parse(text=paste(
        "B0",i," = function(x){
          if ((x>=t[",i,"])&(x<=t[",i+1,"])){
            res=1
          } else {
            res=0
          }
          return(res)
        }", sep="") ))
    }
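As a stylistic aside (not part of the original code), the same family of indicator functions can be written without eval(parse(...)) as a single function of the index j; this sketch assumes the knot vector t constructed above:

```r
# knot vector as constructed earlier (6 interior knots, boundaries repeated)
t = c(rep(-pi, 2), seq(-pi, pi, 2*pi/7), rep(pi, 2))

# one 0-order basis helper indexed by j, equivalent to the generated B0i's
b0 = function(j, x) as.numeric(x >= t[j] & x <= t[j + 1])

b0(6, 0)  # x = 0 lies in [t[6], t[7]] = [-0.45, 0.45], so this is 1
b0(3, 0)  # x = 0 is outside [t[3], t[4]], so this is 0
```

Because `>=`, `<=`, and `&` are vectorized in R, this version also accepts a whole vector of x values at once.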

k-order basis function (k>0)

\begin{align} b_j^k(x) = \frac{x - t_j}{t_{j+k} - t_j}b_j^{k-1}(x) + \frac{t_{j+k+1}-x}{t_{j+k+1}-t_{j+1}}b_{j+1}^{k-1}(x) \nonumber \end{align} The number of these functions is T−1−k.
    # 1st-order basis function
    for (i in 1:(length(t)-2)){
      eval(parse(text=paste(
        "B1",i," = function(x){
          if ((t[",i+1,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+1,"]-t[",i,"])*B0",i,"(x)
          } else { res = 0 }
          if ((t[",i+2,"]-t[",i+1,"])!=0){
            res = res + (t[",i+2,"]-x)/(t[",i+2,"]-t[",i+1,"])*B0",i+1,"(x)
          } else { res = res + 0 }
          return(res)
        }", sep="") ))
    }

    # 2nd-order basis function
    for (i in 1:(length(t)-3)){
      eval(parse(text=paste(
        "B2",i," = function(x){
          if ((t[",i+2,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+2,"]-t[",i,"])*B1",i,"(x)
          } else { res = 0 }
          if ((t[",i+3,"]-t[",i+1,"])!=0){
            res = res + (t[",i+3,"]-x)/(t[",i+3,"]-t[",i+1,"])*B1",i+1,"(x)
          } else { res = res + 0 }
          return(res)
        }", sep="") ))
    }

    # 3rd-order basis function
    for (i in 1:(length(t)-4)){
      eval(parse(text=paste(
        "B3",i," = function(x){
          if ((t[",i+3,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+3,"]-t[",i,"])*B2",i,"(x)
          } else { res = 0 }
          if ((t[",i+4,"]-t[",i+1,"])!=0){
            res = res + (t[",i+4,"]-x)/(t[",i+4,"]-t[",i+1,"])*B2",i+1,"(x)
          } else { res = res + 0 }
          return(res)
        }", sep="") ))
    }
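For completeness, the recursion above can also be written as one generic function instead of generating B1/B2/B3 families separately. The sketch below (a hypothetical helper named bjk, not in the original code) implements the same Cox-de Boor recursion, keeping the closed-interval convention of the 0-order case:

```r
# knot vector from above: boundaries with multiplicity 3, 6 interior knots
t = c(rep(-pi, 2), seq(-pi, pi, 2*pi/7), rep(pi, 2))

# Generic Cox-de Boor recursion; j is the basis index, k the order.
# Degenerate (zero-width) denominators are skipped, as in the generated code.
bjk = function(j, k, x, t){
  if (k == 0) return(as.numeric(x >= t[j] & x <= t[j + 1]))
  res = 0
  if (t[j + k] != t[j])
    res = res + (x - t[j]) / (t[j + k] - t[j]) * bjk(j, k - 1, x, t)
  if (t[j + k + 1] != t[j + 1])
    res = res + (t[j + k + 1] - x) / (t[j + k + 1] - t[j + 1]) * bjk(j + 1, k - 1, x, t)
  return(res)
}
```

At any x strictly between the boundary-effect knots t[4] and t[9], the T−4 = 8 cubic basis functions sum to 1 (the partition-of-unity property of B-splines), which is a handy correctness check.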


By using those basis functions, the cubic B-spline function can be obtained as \begin{align} f(x) = \sum_{i=1}^{T-4}b_i^3(x)\theta_i + \theta_{T-3}x+ \theta_{T-2}, \end{align}
    # cubic b-spline curve function
    f = function(theta,x){
      res = 0
      for (i in 1:(length(t)-4)){
        eval(parse(text=paste("res = res + theta[i]*B3",i,"(x)",sep="") ))
      }
      res = res + theta[length(t)-3]*x + theta[length(t)-2]
      return(res)
    }
where, since T = 12, the number of coefficients is (T-4)+2 = 10, that is, \theta_1, \dots, \theta_{10}. The corresponding log-likelihood function is \begin{align} ll(\theta)=\sum_{i=1}^{nobs} \log \left[ {\cal N} \left(y_i \mid f(x_i), \theta_{T-1}\right) \right]. \end{align}
    # Negative log-likelihood function (optim() minimizes, hence the minus sign)
    ll = function(theta){
      haty = c()
      for (i in 1:nobs){
        haty = c(haty,f(theta,x[i]))
      }
      res = -sum(dnorm(y-haty,0,abs(theta[length(t)-1]),log=T))
      return(res)
    }
Then we are ready to optimize; the total number of parameters to be estimated is 10+1=11 (the "+1" is for the error standard deviation \sigma).
    # initial values for θ
    theta = rep(1,(length(t)-1))

    # Minimize the negative log-likelihood w.r.t. θ
    optres = optim(theta,ll,control=list(maxit=10000),method="BFGS")
    theta = optres$par

    # Visualization
    haty = function(theta){
      res = c()
      for (i in 1:nobs){
        res = c(res,f(theta,x[i]))
      }
      return(res)
    }

    hatyy = haty(theta)
    div = seq(minx,maxx,(maxx-minx)/(nnots+1))
    plot(x, y, xlim = c(minx,maxx), ylim = c(miny-2,maxy+2), cex = 0.5, pch = 20)
    curve(f1,minx,maxx,add=T,lty=2)
    par(new=T)
    plot(x[order(x)],hatyy[order(x)],xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lwd = 2)
    par(new=T)
    # abs() because the likelihood only identifies |theta[length(t)-1]|
    plot(x[order(x)],hatyy[order(x)]+abs(theta[length(t)-1])*1.96,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lty=2)
    par(new=T)
    plot(x[order(x)],hatyy[order(x)]-abs(theta[length(t)-1])*1.96,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lty=2)
    for (i in 1:length(div)){
      abline(v=div[i],lty=2)
    }
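As an aside, because f(x) is linear in \theta_1, \dots, \theta_{10}, the coefficient part of the MLE is just ordinary least squares on the basis design matrix, so optim() is not strictly necessary for this model. The following self-contained sketch (regenerating the data with the same seed and rebuilding the basis with a generic recursion instead of the generated B3i functions) recovers the coefficients and the error standard deviation in closed form:

```r
# Generic Cox-de Boor recursion, vectorized in x
bjk = function(j, k, x, t){
  if (k == 0) return(as.numeric(x >= t[j] & x <= t[j + 1]))
  res = 0
  if (t[j + k] != t[j])
    res = res + (x - t[j]) / (t[j + k] - t[j]) * bjk(j, k - 1, x, t)
  if (t[j + k + 1] != t[j + 1])
    res = res + (t[j + k + 1] - x) / (t[j + k + 1] - t[j + 1]) * bjk(j + 1, k - 1, x, t)
  return(res)
}

# regenerate the data exactly as in the post
set.seed(1234)
nobs = 200; minx = -pi; maxx = pi
t = c(rep(minx, 2), seq(minx, maxx, (maxx - minx)/7), rep(maxx, 2))
x = sort(runif(nobs, minx, maxx))
y = 2*sin(x) - 0.5*cos(x) + 1.5*sin(2*x) - 3*cos(2*x) + rnorm(nobs, 0, 1)

# design matrix: 8 cubic basis columns plus the linear and constant terms
X = cbind(sapply(1:8, function(j) bjk(j, 3, x, t)), x, 1)
theta_ls = qr.solve(X, y)                      # OLS coefficients = MLE of theta
sigma_ls = sqrt(mean((y - X %*% theta_ls)^2))  # MLE of sigma (no df correction)
```

The numerical optimizer and the closed-form solution should agree up to optimization tolerance, which makes this a useful cross-check on the optim() result.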
The black dashed line is \hat{y}=2 \sin(x) -0.5\cos(x) +1.5\sin(2x)-3\cos(2x) and the bold red line is our estimated function. The red dashed lines on both sides of the estimated curve are approximate 95% prediction intervals based on the normal error with \hat{\sigma}=0.986 (the raw parameter estimate is -0.986; since only its absolute value enters the likelihood, the sign is not identified).
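Finally, as a sanity check (not in the original post), a similar fit can be obtained with R's built-in splines package. Note that bs() handles boundary knots differently (full multiplicity, no separate linear term), so the individual coefficients are not comparable, but the fitted curve and the residual spread should be close:

```r
library(splines)  # ships with base R

# regenerate the data exactly as in the post
set.seed(1234)
nobs = 200; minx = -pi; maxx = pi
x = sort(runif(nobs, minx, maxx))
y = 2*sin(x) - 0.5*cos(x) + 1.5*sin(2*x) - 3*cos(2*x) + rnorm(nobs, 0, 1)

inner = seq(minx, maxx, (maxx - minx)/7)[2:7]  # the 6 interior knots
fit = lm(y ~ bs(x, knots = inner, degree = 3))
summary(fit)$sigma  # residual standard error, close to the true sigma = 1
```

With 6 interior knots and degree 3, bs() produces 9 basis columns, so the lm fit also estimates 10 coefficients including the intercept, matching the dimension of the hand-rolled model.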

Comments