Cubic Spline with 6 nots in a regression context.

Cubic spline is a way to approximate a function and that is typically used as an approximation for a baseline hazard function in joint modeling. The cubic spline is kind of like a strong weapon like "LR1K sonic cannon". However, there are almost no relevant information with a simple example, then I would like to provide something like that here.


True model

Suppose the true function is \begin{align} \hat{y}=2 \sin(x) -0.5\cos(x) +1.5\sin(2x)-3\cos(2x), \nonumber \end{align} and the data follows \begin{align} y \sim {\cal N} \left( \hat{y}, \sigma\right).\nonumber \end{align} The dataset comprising with \(200\) observations was generated with the interval \(x:[-\pi,\pi]\).
rm(list=ls())

# basic pararmeters
nobs = 200 # number of observations
minx = -pi  # lower limit of x
maxx = pi   # upper limit of x

# generate data
set.seed(1234)

cwidth = (maxx-minx)/4
cut = seq(minx,maxx,cwidth)

x = runif(nobs,minx,maxx)
x = x[order(x)]

f1 = function(x){
  haty = 2*sin(x) - 0.5*cos(x) + 1.5*sin(2*x) - 3*cos(2*x)
  return(haty)
}

y = f1(x) + rnorm(nobs,0,1)

miny = min(y)
maxy = max(y)

# Visualization
plot(x,y,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2))
curve(f1,minx,maxx,add=T,lty=2)
The data generated is shown in the figure below.

Cubic spline with 6 nots

Since cubic(3rd) b-spline is assumed, \(3\) nots for both side, \(3 \times 2=6\), should be added, then total \(6+6=12\) nots is going to be used. In this case, the following nots can be obtained. \begin{align} \left[ -3.14 ,-3.14 ,-3.14 ,-2.24 ,-1.35 ,-0.45 ,0.45 ,1.35 ,2.24 ,3.14 ,3.14 ,3.14 \right] \nonumber \end{align}
# basic pararmeters
nnots = 6   # number of nots
t = c(rep(minx,2),seq(minx,maxx,(maxx-minx)/(nnots+1)),rep(maxx,2))
Basic idea for cubic spline is to 1) divide interval, 2) assume a 3rd order polynomial for each interval, 3) connect them smoothly so that their neighbor 0 to 2nd order derivatives are equal. However, directly applying those condition to get the best estimates is not efficient, there is an alternative derivation which is mathematically equivalent to the original one as follows:

\(0\)-order basis function

\begin{align} b^0_j(x)= \left\{ \begin{array}{ll} 1 & (t_j \le x \le t_{j+1})\\ 0 & otherwise \end{array} \right.\nonumber \end{align} Let \(T\) be the number of nots. Number of functions is \(T−1\).
# 0-order basis function
for (i in 1:((length(t))-1)){
    eval(parse(text=paste(
        "B0",i,"= function(x){
        if ((x>=t[",i,"])&(x<=t[",i+1,"])){
            res=1
        } else {
            res=0
        }
        return(res)
        }", sep="") ))
}

\(k\)-order basis function \((k>0)\)

\begin{align} b_j^n(x) = \frac{x - t_j}{t_{j+n} - t_j}b_j^{n-1}(x) + \frac{t_{j+n+1}-x}{t_{j+n+1}-t_{j+1}}b_{j+1}^{n-1}(x) \nonumber \end{align} Number of functions is \(T−1−k\).
# 1st-order basis function
for (i in 1:((length(t))-2)){
    eval(parse(text=paste(
    "B1",i," = function(x){
        if ((t[",i+1,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+1,"]-t[",i,"])*B0",i,"(x)
        } else { res = 0 }
        if ((t[",i+2,"]-t[",i+1,"])!=0){
            res = res + (t[",i+2,"]-x)/(t[",i+2,"]-t[",i+1,"])*B0",i+1,"(x)
        } else { res = res + 0 }
        return(res)
    }", sep="") ))
}

# 2nd-order basis function
for (i in 1:((length(t))-3)){
    eval(parse(text=paste(
    "B2",i," = function(x){
        if ((t[",i+2,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+2,"]-t[",i,"])*B1",i,"(x)
        } else { res = 0 }
        if ((t[",i+3,"]-t[",i+1,"])!=0){
            res = res + (t[",i+3,"]-x)/(t[",i+3,"]-t[",i+1,"])*B1",i+1,"(x)
        } else { res = res + 0 }
        return(res)
    }", sep="") ))
}


# 3rd-order basis function
for (i in 1:((length(t))-4)){
    eval(parse(text=paste(
    "B3",i," = function(x){
        if ((t[",i+3,"]-t[",i,"])!=0){
            res = (x-t[",i,"])/(t[",i+3,"]-t[",i,"])*B2",i,"(x)
        } else { res = 0 }
        if ((t[",i+4,"]-t[",i+1,"])!=0){
            res = res + (t[",i+4,"]-x)/(t[",i+4,"]-t[",i+1,"])*B2",i+1,"(x)
        } else { res = res + 0 }
        return(res)
    }", sep="") ))
}


By using those basis functions, cubic b-spline function can be obtained as \begin{align} f(x) = \sum_{i=1}^{T-4}b_i^3(x)\theta_i + \theta_{T-3}x+ \theta_{T-2}, \end{align}
# cubic b-spline curve function
f = function(theta,x){
    res = 0
    for (i in 1:((length(t))-4)){
        eval(parse(text=paste("res = res + theta[i]*B3",i,"(x)",sep="") ))
    }
    res = res + theta[(length(t))-3]*x + theta[(length(t))-2]
    return(res)
}
the number of parameters is \(10\), that is \(\theta_1, \dots, \theta_{10}\) And the corresponding log-likelihood function is \begin{align} ll(\theta)=\sum_{i=1}^{nobs} \log \left[ {\cal N}_d \left(y \mid f(x_i), \theta_{T-1}\right) \right]. \end{align}
# Log-likelihood function
ll = function(theta){
    haty = c()
    for (i in 1:nobs){
        haty = c(haty,f(theta,x[i]))
    } 
    res = -sum(dnorm(y-haty,0,abs(theta[length(t)-1]),log=T))
    return(res)
}
Then we are ready to optimize, the total number of parameters to be estimated is \(10+1=11\). ("+1" is for the error standard deviation \(\sigma\))
# initial value for θ
theta = rep(1,(length(t)-1))

# Optimize log-likelihood function w.r.t. θ
optres = optim(theta,ll,control=list(maxit=10000),method="BFGS")
theta = optres$par

# Visualization
haty = function(theta){
    res = c()
    for (i in 1:nobs){
        res = c(res,f(theta,x[i]))
    } 
    return(res)
}

hatyy = haty(theta)
div = seq(minx,maxx,(maxx-minx)/(nnots+1))
plot(x, y, xlim=  c(minx,maxx), ylim = c(miny-2,maxy+2), cex = 0.5, pch = 20)
curve(f1,minx,maxx,add=T,lty=2)
par(new=T)
plot(x[order(x)],hatyy[order(x)],xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lwd = 2)
par(new=T)
plot(x[order(x)],hatyy[order(x)]+theta[length(t)-1]*1.96,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lty=2)
par(new=T)
plot(x[order(x)],hatyy[order(x)]-theta[length(t)-1]*1.96,xlim=c(minx,maxx),ylim=c(miny-2,maxy+2),col="red",xlab="",ylab="",type="l",lty=2)
for (i in 1:length(div)){
    abline(v=div[i],lty=2)
}
The black dash line is \(\hat{y}=2 \sin(x) -0.5\cos(x) +1.5\sin(2x)-3\cos(2x)\) and the red bold line is our estimated function. The red dash lines on the both side of the estimated curve are 95% confidence intervals with the normal error \(\hat{\sigma}=-0.986\).

Comments