Ensambles de modelos de nicho ecológico y distribución de especies

En esta práctica se va a generar modelos de distribución potencial usando varios algoritmos para un conjunto de datos de un ratón (Neotoma cinerea “bushy-tailed woodrat”) de la familia Cricetidae que se distribuye en Estados Unidos y Cánada. Usaremos el paquete de R “sdm” (Naimi & Aráujo 2016; https://github.com/babaknaimi/sdm) que es bastante versátil.

datos <- read.csv("datos.csv")

plot(datos$decimalLongitude,datos$decimalLatitude)

# install.packages(c("sdm","raster", "ecospat", "usdm", "biomod2"))

# cargar paquetes

library(sdm)

## Loading required package: sp

## sdm 1.0-89 (2020-04-22)

library(raster)
library(ecospat)

## Loading required package: ade4

## 
## Attaching package: 'ade4'

## The following object is masked from 'package:sdm':
## 
##     niche

## Loading required package: ape

## 
## Attaching package: 'ape'

## The following objects are masked from 'package:raster':
## 
##     rotate, zoom

## Loading required package: gbm

## Loaded gbm 2.1.8

## Registered S3 methods overwritten by 'adehabitatMA':
##   method                       from
##   print.SpatialPixelsDataFrame sp  
##   print.SpatialPixels          sp

library(usdm)
library(biomod2)

## biomod2 3.5.1 loaded.
## 
## Type browseVignettes(package='biomod2') to access directly biomod2 vignettes.

Cargar las bios

bios <- stack(list.files(path="./ClimaActual/", pattern = "tif", full.names = T))

# particionamos los datos en dos conjuntos ("calibración" y "validación")
pts <- datos[,c(2,3)]
selected <- sample(1:nrow(pts), nrow(pts) * 0.5)
train <- pts[selected,] # this is the selection to be used for model training
test <- pts[-selected,] # this is the opposite of the selection which will be used for model testing
pts.all <- rbind(train,test)

# convertir el primer raster de los bios en puntos 
bios.pts<-rasterToPoints(bios[[1]],fun=NULL, spatial=TRUE)
bios.all<-raster::extract(bios[[1]],bios.pts)
bios.all<-data.frame(coordinates(bios.pts),bios.all)

# generar pseudo-ausencias para los datos de calibración
presences_train<- train
dim(presences_train)

## [1] 216   2

216*10

## [1] 2160

pseudo_abs_train<- ecospat.rand.pseudoabsences(nbabsences=216,
                   glob=bios.all,colxyglob=1:2, colvar=1:2, presence=presences_train,
                   colxypresence=1:2, mindist=0.1)

pseudo_abs_train<- pseudo_abs_train[,1:2]
presences_train["Rep1"] <- 1
presences_train <- plyr::rename(presences_train, replace=c("decimalLongitude"="x", "decimalLatitude"="y"))
pseudo_abs_train["Rep1"] <- 0


# Combino las presencias y las pseudo-absences del conjunto de calibración
DataSpeciesTrain.spp <-rbind(presences_train, pseudo_abs_train)

# generar pseudo-ausencias para los datos de validación
presences_test <- test
presences_test["Rep1"] <- 1
presences_test <- plyr::rename(presences_test, replace=c("decimalLongitude"="x", "decimalLatitude"="y"))
dim(presences_test)

## [1] 217   3

217*10

## [1] 2170

pseudo_abs_test<- ecospat.rand.pseudoabsences(nbabsences=2170,
                   glob=bios.all,colxyglob=1:2, colvar=1:2, presence=presences_test,
                   colxypresence=1:2, mindist=0.1)

pseudo_abs_test<-pseudo_abs_test[,1:2]
pseudo_abs_test["Rep1"] <- 0

# Combino las presencias y las pseudo-absences del conjunto de validación
DataSpeciesTest.spp <- rbind(presences_test, pseudo_abs_test)


# Convierto los dataframes en objectos espaciales
DataSpecies <- SpatialPointsDataFrame(coords =DataSpeciesTrain.spp[,c(1,2)], data = DataSpeciesTrain.spp,
                                      proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
TestSpecies <- SpatialPointsDataFrame(coords =DataSpeciesTest.spp[,c(1,2)], data = DataSpeciesTest.spp,
                                      proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
DataSpecies <- DataSpecies[,c(-1,-2)]
TestSpecies <- TestSpecies[,c(-1,-2)]

Voy a generar un mapa con cada conjunto de datos

plot(DataSpecies)

plot(TestSpecies)

Voy a examinar la colinealidad entre las variables usando el VIF (https://en.wikipedia.org/wiki/Variance_inflation_factor)

# La función vifcor identifica las variables que deben ser excluidas
v1 <- vifcor(bios, th=0.7)

bios.red <- exclude(bios, v1)

# Ajustar los datos a los requirimientos específicos del paquete sdm
d1 <- sdmData(formula=Rep1~., train=DataSpecies, test=TestSpecies, predictors=bios.red)

## Loading required package: dismo

## 
## Attaching package: 'dismo'

## The following object is masked from 'package:biomod2':
## 
##     evaluate

## Loading required package: tree

## Loading required package: mda

## Loading required package: class

## Loaded mda 0.5-2

## Loading required package: mgcv

## Loading required package: nlme

## 
## Attaching package: 'nlme'

## The following object is masked from 'package:usdm':
## 
##     Variogram

## The following object is masked from 'package:raster':
## 
##     getData

## This is mgcv 1.8-37. For overview type 'help("mgcv-package")'.

## Loading required package: glmnet

## Loading required package: Matrix

## Loaded glmnet 4.1-1

## Loading required package: earth

## Loading required package: Formula

## Loading required package: plotmo

## Loading required package: plotrix

## Loading required package: TeachingDemos

## Loading required package: rJava

## Loading required package: RSNNS

## Loading required package: Rcpp

## Loading required package: randomForest

## randomForest 4.6-14

## Type rfNews() to see new features/changes/bug fixes.

## Loading required package: rpart

## Loading required package: kernlab

## 
## Attaching package: 'kernlab'

## The following objects are masked from 'package:raster':
## 
##     buffer, rotated

## Warning in proj4string(train): CRS object has comment, which is lost in output

## Warning in proj4string(train): CRS object has comment, which is lost in output

d1

## class                                 : sdmdata 
## =========================================================== 
## number of species                     :  1 
## species names                         :  Rep1 
## number of features                    :  3 
## feature names                         :  bio_05, bio_12, bio_15 
## type                                  :  Presence-Absence 
## has independet test data?             :  TRUE 
## number of records                     :  train-> 431; test-> 2384 
## has Coordinates?                      :  TRUE

Estrategia de submuestreo: 70%-30% (calibración-validación) con 2 (haga 10 o más en su casa) replicas de bootstrapping. Vamos a probar maxent con default settings (ya vimos en clase qué consecuencias trae eso), random forest (los mejores en mi opinión), glm y bioclim. Se pueden ajustar más algoritmos. Vaya a la ayuda de la función tecleando ?sdm para ver más opciones.

m1 <- sdm(Rep1~., data=d1, methods=c('maxent','rf','glm','bioclim'),
          replicatin='sub', test.percent=30, n=2)

## Loading required package: parallel

## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1

m1

## class                                 : sdmModels 
## ======================================================== 
## number of species                     :  1 
## number of modelling methods           :  4 
## names of modelling methods            :  maxent, rf, glm, bioclim 
## replicate.methods (data partitioning) :  subsampling 
## number of replicates (each method)    :  2 
## toral number of replicates per model  :  2 (per species) 
## test percentage (in subsampling)      :  30 
## ------------------------------------------
## model run success percentage (per species)  :
## ------------------------------------------
## method          Rep1             
## ---------------------- 
## maxent     :        100   %
## rf         :        100   %
## glm        :        100   %
## bioclim    :        100   %
## 
## ###################################################################
## model Mean performance (per species), using independent test dataset:
## -------------------------------------------------------------------------------
## 
##  ## species   :  Rep1 
## =========================
## 
## methods    :     AUC     |     COR     |     TSS     |     Deviance 
## -------------------------------------------------------------------------
## maxent     :     0.85    |     0.36    |     0.57    |     0.92     
## rf         :     0.87    |     0.43    |     0.64    |     0.96     
## glm        :     0.7     |     0.21    |     0.37    |     1.26     
## bioclim    :     0.75    |     0.28    |     0.4     |     0.96

Vamos a ver las curvas ROC que evaluan el balance entre la especificidad y sensibilidad. Recuerden la clase de la curva ROC.

roc(m1)

Generar un ensamble con los mejores modelos usando el método de promedio ponderado (weighted averaging) con base en la métrica (TSS: True Statistics Skill)

e1 <- ensemble(m1,newdata=bios.red,filename='e1.img', setting=list(method='weighted',stat='TSS'))

## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1

Graficamos el modelo

plot(e1)

Generamos un mapa binario usando un criterio de umbral de mínima área predicha y donde le asignamos un error de 10% a los datos de presencia que no usamos para calibrar.

xy <- train
coordinates(xy) <- ~decimalLongitude+decimalLatitude
mpa.e1 <- ecospat.mpa(e1, xy, perc = .9)
mpa.e1

##  10% 
## 0.45

bin.e1 <- ecospat.binary.model(e1, mpa.e1)
plot(bin.e1)

Vamos a generar unas proyecciones a futuro.

# Subir bioclimáticos de escenarios futuros (RCP8.5 2070)
fut.bios <- stack(list.files(path="./CMIP5", pattern = "tif", full.names = T))

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

# recortar las bios a la extensión
fut.bios.m <- raster::crop(fut.bios, bios.red[[1]])

# generar la proyección futura sobre el ensamble
e1.fut <- ensemble(m1,newdata=fut.bios.m,filename='e1_fut.img', setting=list(method='weighted',stat='TSS'))

## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1

# graficar
plot(e1.fut)

# Generar el binario para el escenario futuro
bin.e1.fut <- ecospat.binary.model(e1.fut, mpa.e1)

# graficar
plot(bin.e1.fut)

Con un escenario de emisiones más leve

# Subir bioclimáticos de escenarios futuros (RCP2.6 2070)
fut.bios <- stack(list.files(path="./CMIP5/", pattern = "tif", full.names = T))

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum unknown in Proj4 definition

# recortar las bios a la extensión
fut.bios.m <- raster::crop(fut.bios, bios.red[[1]])

# generar la proyección futura sobre el ensamble
e1.fut <- ensemble(m1,newdata=fut.bios.m,filename='e1_fut2.img', setting=list(method='weighted',stat='TSS'))

## This is MaxEnt version 3.4.1 
## This is MaxEnt version 3.4.1

# graficar
plot(e1.fut)

Ahora vamos a identificar áreas de estabilidad, ganancia y pérdida de áreas (i.e., número de pixeles). Ya tenemos los mapas binarios para el periodo actual y el futuro (0, 1). Usaremos una función del paquete Biomod2 (BIOMOD_RangeSize).

Primero tenemos que tener la misma resolución espacial en ambos rasters.

crs.geo <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")  # geographical, datum WGS84

bin.e1.r <- projectRaster(from=bin.e1, to=bin.e1.fut, crs=crs.geo, method="bilinear")

## Warning in projectRaster(from = bin.e1, to = bin.e1.fut, crs = crs.geo, : input
## and ouput crs are the same

cambio <- BIOMOD_RangeSize(bin.e1.r, bin.e1.fut, SpChange.Save=NULL)  

cambio

## $Compt.By.Models
##        Loss Stable0 Stable1  Gain PercLoss PercGain SpeciesRangeChange
## e1_fut 6048   60022   16950 16048   26.298    69.78             43.482
##        CurrentRangeSize FutureRangeSize.NoDisp FutureRangeSize.FullDisp
## e1_fut            22998                  16950                    32998
## 
## $Diff.By.Pixel
## class      : RasterStack 
## dimensions : 341, 685, 233585, 1  (nrow, ncol, ncell, nlayers)
## resolution : 0.1666667, 0.1666667  (x, y)
## extent     : -166.8333, -52.66667, 14.5, 71.33333  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## names      : e1_fut 
## min values :     -2 
## max values :      1

Vamos a mostrar el mapa.

Valores de -2 son los pixeles que se proyectan por perderse para la especie en el futuro. Valores de -1 son los pixeles que se proyectan por estar estables (no cambian). Valores de 0 son los pixeles que no ocupados actualmente pero tampoco en el futuro. Valores de 1 son los pixeles no ocupados actualmente pero proyectados por estarlo en el futuro.

plot(cambio$Diff.By.Pixel)

Hay varios paquetes para visualizar rasters.

https://oscarperpinan.github.io/rastervis/

library(rasterVis)

## Loading required package: terra

## terra version 1.3.22

## 
## Attaching package: 'terra'

## The following object is masked from 'package:kernlab':
## 
##     size

## The following object is masked from 'package:TeachingDemos':
## 
##     dots

## The following object is masked from 'package:plotrix':
## 
##     rescale

## The following objects are masked from 'package:dismo':
## 
##     convHull, voronoi

## The following object is masked from 'package:ape':
## 
##     trans

## Loading required package: lattice

## Loading required package: latticeExtra

levelplot(cambio$Diff.By.Pixel, par.settings=GrTheme())

TAREA:

Como tarea les queda hacer las proyecciones futuras con cada modelo invididual y evaluar qué tanta incertidumbre hay en cada algoritmo.

Eso es todo por ahora!

Ensambles de modelos de nicho ecológico y distribución de especies

Julián A. Velasco

10/06/2021