Bivariate Kernel Density Estimation Using R Tool: 873071

Bivariate Kernel Density Estimation Using R Tool

The programming tool used for this assignment is R programming tool. The R code has detailed comment on every step taken above the code.

R code

# R code for conducting a CTR analysis using generated data

# Load dependencies

# Code requires package fdrtool & survival to be in library

# install.packages(“fdrtool”)

require (fdrtool);require(survival)

#create data

# Next step is to generate example data “radio tagged distance”, stored as object

x

mean(x)

sd(x)

# set random seed

set.seed(2009)

# generate data from lognormal distribution, 250 tracked seeds

# meanlog=log(15), sdlog=log(1.1)

x=rlnorm(250,log(15),log(1.1))

# truncate data after 20 units to create “tracked distances”

# with 20 m search radius

xtrunc=x[x<20]

# Prepare data for CTR

CTRdata=data.frame(

# all seeds that went beyond 20 meters are treated as censored events

# (distance > 20 meter)

d=c(xtrunc,rep(20,250-length(xtrunc))),

# classify events, found seeds = 1, censored seeds = 0

evnt=c(rep(1,length(xtrunc)),rep(0,250-length(xtrunc))))

# fitsurvival function

CTR_function=survfit(Surv(CTRdata$d, event=CTRdata$evnt) ~ 1)

# return survival probabilties (P) corresponding to distances (D)

P=summary(CTR_function)$surv;D=summary(CTR_function)$time

#Define dispersal kernels

# these kernels are then fit through OLS (Ordinary Least Squares) to objects P &D

# log normal

SSLN=function(param){

Ex=1-plnorm(D,meanlog=param[1],sdlog=param[2])

sum((Ex-P)^2)

}

# Weibull

SSW=function(param){

Ex=1-pweibull(D,shape=param[1],scale=param[2])

sum((Ex-P)^2)

}

# exponential

SSEX=function(param){

Ex=1-pexp(D,rate=param)

sum((Ex-P)^2)

}

# Normal

SSN=function(param){

Ex=1-phalfnorm(D,theta=param[1])

sum((Ex-P)^2)

}

#Obtain kernels with reconstructed tails

#Fit each model to the censored data and store

fitLN=optim(c(1,1),SSLN)

LNpsave=c(fitLN$par[1],fitLN$par[2])

fitW=optim(c(1.2,55),SSW)

WBpsave=c(fitW$par[1],fitW$par[2])

# Note: above the OLS function was optimized with the Nelder-Mead algorithm

# however this algorithm is optimal for optimization problems of 2 Dimensions

# or greater. The quasi-Newton method ‘BFGS’ is better suited for 1 D (or 1

# parameter) problems. Alternatively the function ‘optimize’ can be used

# however result will be the same either way.

fitN=optim(c(0.05),SSN,method=”BFGS”)

Npsave=c(fitN$par[1])

fitEX=optim(c(0.01),SSEX,method=”BFGS”)

EXpsave=c(fitEX$par[1])

# choose best model based on AIC score

OLSscores=c(fitLN$value,fitW$value,

fitN$value,fitEX$value)

# vector with number of parameters for each model

pars=c(2,2,1,1)

# calculating AIC from sum of squares

AICscores=(1500*log(OLSscores/1500) + 2*pars)

#FINAL

#selecting bestfitting model

Bestfit=c(“LN”,”WB”,”T”,”N”,”EX”)[which(AICscores==min(AICscores))]

#checking difference in between estimated and generating kernel

par(cex.axis=0.9,cex.lab=1.1,las=1,mar=c(4,5,1,1),mfrow=c(2,1))

# Density plots

curve(dlnorm(x,log(25),log(3)),0,150,col=’grey’,xlab=”distance”,

ylab=”probabilty density”,lwd=2)

curve(dlnorm(x,fitLN$par[1],fitLN$par[2]),col=’black’,add=T,lty=’dashed’,lwd=2)

legend(100,0.010,legend=c(‘True’, ‘CTR derived’),lty=c(‘solid’,’dashed’),

col=c(‘grey’,’black’),bty=’n’,lwd=2)

# Probability P of dispersal beyond distance D

curve(1-plnorm(x,log(50),log(3)),0,150,col=’grey’,xlab=”D”,

ylab=”P”, lwd=2)

curve(1-plnorm(x,fitLN$par[1],fitLN$par[2]),col=’black’,add=T,lty=’dashed’,lwd=2)

legend (100,0.90,legend=c(‘True’, ‘CTR derived’),lty=c(‘solid’,’dashed’),

col=c(‘grey’,’black’),bty=’n’,lwd=2)

R Output
 

3

References

Gentleman, R. (2009). R programming for bioinformatics. Boca Raton: CRC Press.

Baker, K., & Trietsch, D.(2009) Principles of sequencing and scheduling.

Braun, J., & Murdoch, D.(2012). A first course in statistical programming with R.

Grolemund, G., & Wickham, H. (2006). Hands-on programming with R.