Vamos a buscar datos de especies de la base de datos disponible en línea y pública del Global Biodiversity Information Facility

Podemos usar unas funciones específicas que nos permitirán descargar los datos de las especies de interés.

Primero, cargaremos los paquetes necesarios: dismo, raster, sp

# Es necesario que instalen los paquetes antes: install.packages("dismo")
library(dismo)
## Loading required package: raster
## Loading required package: sp
library(raster)
library(sp)

sp.gbif <- gbif(genus="Phrynosoma", species="modestum", geo=TRUE, removeZeros=TRUE, download=TRUE)
## Loading required namespace: jsonlite
## 3032 records found
## 0-300-600-900-1200-1500-1800-2100-2400-2700-3000-3032 records downloaded
names(sp.gbif) # vamos a ver todos los nombres de las columnas
##   [1] "acceptedNameUsage"                                                   
##   [2] "accessRights"                                                        
##   [3] "adm1"                                                                
##   [4] "adm2"                                                                
##   [5] "associatedReferences"                                                
##   [6] "associatedSequences"                                                 
##   [7] "basisOfRecord"                                                       
##   [8] "bibliographicCitation"                                               
##   [9] "catalogNumber"                                                       
##  [10] "class"                                                               
##  [11] "classKey"                                                            
##  [12] "cloc"                                                                
##  [13] "collectionCode"                                                      
##  [14] "collectionID"                                                        
##  [15] "continent"                                                           
##  [16] "coordinateUncertaintyInMeters"                                       
##  [17] "country"                                                             
##  [18] "crawlId"                                                             
##  [19] "created"                                                             
##  [20] "datasetID"                                                           
##  [21] "datasetKey"                                                          
##  [22] "datasetName"                                                         
##  [23] "dateIdentified"                                                      
##  [24] "day"                                                                 
##  [25] "depth"                                                               
##  [26] "depthAccuracy"                                                       
##  [27] "disposition"                                                         
##  [28] "dynamicProperties"                                                   
##  [29] "elevation"                                                           
##  [30] "elevationAccuracy"                                                   
##  [31] "endDayOfYear"                                                        
##  [32] "establishmentMeans"                                                  
##  [33] "eventDate"                                                           
##  [34] "eventID"                                                             
##  [35] "eventRemarks"                                                        
##  [36] "eventTime"                                                           
##  [37] "family"                                                              
##  [38] "familyKey"                                                           
##  [39] "fieldNotes"                                                          
##  [40] "fieldNumber"                                                         
##  [41] "fullCountry"                                                         
##  [42] "gbifID"                                                              
##  [43] "genericName"                                                         
##  [44] "genus"                                                               
##  [45] "genusKey"                                                            
##  [46] "geodeticDatum"                                                       
##  [47] "georeferencedBy"                                                     
##  [48] "georeferencedDate"                                                   
##  [49] "georeferenceProtocol"                                                
##  [50] "georeferenceRemarks"                                                 
##  [51] "georeferenceSources"                                                 
##  [52] "georeferenceVerificationStatus"                                      
##  [53] "habitat"                                                             
##  [54] "higherClassification"                                                
##  [55] "higherGeography"                                                     
##  [56] "higherGeographyID"                                                   
##  [57] "http://unknown.org/classs"                                           
##  [58] "http://unknown.org/http_//rs.gbif.org/terms/1.0/Identifier"          
##  [59] "http://unknown.org/http_//rs.gbif.org/terms/1.0/Multimedia"          
##  [60] "http://unknown.org/http_//rs.tdwg.org/dwc/terms/Identification"      
##  [61] "http://unknown.org/http_//rs.tdwg.org/dwc/terms/ResourceRelationship"
##  [62] "http://unknown.org/occurrenceDetails"                                
##  [63] "identificationID"                                                    
##  [64] "identificationQualifier"                                             
##  [65] "identificationRemarks"                                               
##  [66] "identificationVerificationStatus"                                    
##  [67] "identifiedBy"                                                        
##  [68] "identifier"                                                          
##  [69] "individualCount"                                                     
##  [70] "informationWithheld"                                                 
##  [71] "installationKey"                                                     
##  [72] "institutionCode"                                                     
##  [73] "institutionID"                                                       
##  [74] "islandGroup"                                                         
##  [75] "ISO2"                                                                
##  [76] "key"                                                                 
##  [77] "kingdom"                                                             
##  [78] "kingdomKey"                                                          
##  [79] "language"                                                            
##  [80] "lastCrawled"                                                         
##  [81] "lastInterpreted"                                                     
##  [82] "lastParsed"                                                          
##  [83] "lat"                                                                 
##  [84] "license"                                                             
##  [85] "lifeStage"                                                           
##  [86] "locality"                                                            
##  [87] "locationAccordingTo"                                                 
##  [88] "locationID"                                                          
##  [89] "locationRemarks"                                                     
##  [90] "lon"                                                                 
##  [91] "modified"                                                            
##  [92] "month"                                                               
##  [93] "municipality"                                                        
##  [94] "nameAccordingTo"                                                     
##  [95] "namePublishedInYear"                                                 
##  [96] "nomenclaturalCode"                                                   
##  [97] "occurrenceID"                                                        
##  [98] "occurrenceRemarks"                                                   
##  [99] "occurrenceStatus"                                                    
## [100] "order"                                                               
## [101] "orderKey"                                                            
## [102] "organismID"                                                          
## [103] "otherCatalogNumbers"                                                 
## [104] "ownerInstitutionCode"                                                
## [105] "parentNameUsage"                                                     
## [106] "phylum"                                                              
## [107] "phylumKey"                                                           
## [108] "preparations"                                                        
## [109] "previousIdentifications"                                             
## [110] "protocol"                                                            
## [111] "publishingCountry"                                                   
## [112] "publishingOrgKey"                                                    
## [113] "recordedBy"                                                          
## [114] "recordNumber"                                                        
## [115] "references"                                                          
## [116] "reproductiveCondition"                                               
## [117] "rights"                                                              
## [118] "rightsHolder"                                                        
## [119] "scientificName"                                                      
## [120] "scientificNameID"                                                    
## [121] "sex"                                                                 
## [122] "species"                                                             
## [123] "speciesKey"                                                          
## [124] "specificEpithet"                                                     
## [125] "startDayOfYear"                                                      
## [126] "taxonID"                                                             
## [127] "taxonKey"                                                            
## [128] "taxonomicStatus"                                                     
## [129] "taxonRank"                                                           
## [130] "taxonRemarks"                                                        
## [131] "type"                                                                
## [132] "typeStatus"                                                          
## [133] "typifiedName"                                                        
## [134] "verbatimCoordinateSystem"                                            
## [135] "verbatimElevation"                                                   
## [136] "verbatimEventDate"                                                   
## [137] "verbatimLocality"                                                    
## [138] "verbatimSRS"                                                         
## [139] "vernacularName"                                                      
## [140] "year"                                                                
## [141] "downloadDate"
sp.coords <- na.omit(sp.gbif[,c(124, 90, 83)]) # indexamos los datos de coordenadas (lon, lat)
head(sp.coords)
##   specificEpithet       lon      lat
## 1        modestum -106.6985 32.29295
## 2        modestum -102.6655 30.25169
## 3        modestum -106.6103 32.70861
## 4        modestum -107.6247 33.30399
## 5        modestum -100.7055 22.49427
## 6        modestum -106.3079 31.63903

Graficamos las coordenadas para explorar posibles sesgos espaciales

plot(sp.coords$lon, sp.coords$lat)

Aquí vemos que hay muchos puntos muy agrupados. Este tipo de datos puede influir negativamente en la calibración del modelo de nicho y la predicción geográfica.

Vamos a tratar de reducir ese efecto adelgazando la muestra de puntos. Lo primero que hacemos es tratar de eliminar puntos a una distancia menor de 25 km. Como tenemos un tamaño de muestra grande (> 3000 registros) podemos hacer esto sin mucho problema. Usaremos el paquete spThin para esto.

# 25 km minimum distance
library(spThin)
## Warning: package 'spThin' was built under R version 3.5.1
## Loading required package: spam
## Loading required package: dotCall64
## Loading required package: grid
## Spam version 2.1-4 (2018-04-12) is loaded.
## Type 'help( Spam)' or 'demo( spam)' for a short introduction 
## and overview of this package.
## Help for individual functions is also obtained by adding the
## suffix '.spam' to the function name, e.g. 'help( chol.spam)'.
## 
## Attaching package: 'spam'
## The following objects are masked from 'package:base':
## 
##     backsolve, forwardsolve
## Loading required package: fields
## Loading required package: maps
## See www.image.ucar.edu/~nychka/Fields for
##  a vignette and other supplements.
## Loading required package: knitr
sp.thin <-thin(sp.coords, lat.col="lat", long.col="lon", spec.col="specificEpithet", 
                 thin.par=25, reps=1, locs.thinned.list.return=TRUE, write.files=TRUE,
                 max.files=1, out.dir=".", out.base = "modestum.train.25km", write.log.file=TRUE, log.file="modestum.log.txt",
                 verbose=TRUE)
## ********************************************** 
##  Beginning Spatial Thinning.
##  Script Started at: Sun Jul 29 13:43:02 2018
## lat.long.thin.count
## 307 
##   1 
## [1] "Maximum number of records after thinning: 307"
## [1] "Number of data.frames with max records: 1"
## [1] "Writing new *.csv files"
## Warning in thin(sp.coords, lat.col = "lat", long.col = "lon", spec.col =
## "specificEpithet", : Writing new *.csv files to output directory: .
## Warning in thin(sp.coords, lat.col = "lat", long.col = "lon", spec.col = "specificEpithet", : ./modestum.train.25km_thin1_new.csv ' already exists. Renaming file 
##                                    to avoid overwriting.
## [1] "Writing file: ./modestum.train.25km_thin1_new.csv"

Ahora vamos primero a identificar que tipo de objeto se acaba de generar:

class(sp.thin)
## [1] "list"
str(sp.thin)
## List of 1
##  $ :'data.frame':    307 obs. of  2 variables:
##   ..$ Longitude: num [1:307] -103 -108 -102 -106 -102 ...
##   ..$ Latitude : num [1:307] 30.3 33.3 30.1 31.5 30.7 ...
head(sp.thin[[1]])
##    Longitude Latitude
## 2  -102.6655 30.25169
## 4  -107.6247 33.30399
## 10 -102.3864 30.10102
## 11 -106.0404 31.45543
## 12 -101.8940 30.65892
## 14 -101.3129 29.68367

Tenemos que tener en cuenta como vienen organizados los datos para luego manipularlos como queramos. Vamos a graficar los puntos adelgazados a una distancia no menor de 25 km.

plot(sp.thin[[1]]$Longitude, sp.thin[[1]]$Latitude)

Si comparamos con el objeto que contenia todos los puntos podemos ver que hemos reducido el sobre-muestreo en algunas regiones y esto nos facilitará la calibración y posterior predicción de nuestro modelo de nicho. Vamos a ver donde caen nuestros puntos dentro de un mapa.

Vamos a convertir nuestros puntos en un objeto espacial.

coords <- sp.thin[[1]]
coordinates(coords)<- ~Longitude+Latitude

plot(coords)

coords
## class       : SpatialPoints 
## features    : 307 
## extent      : -118.1731, -97.43912, 21.20054, 38.03594  (xmin, xmax, ymin, ymax)
## coord. ref. : NA
plot(coords)
map(add = TRUE,  xlim=c(-118.1731,-97.43912), ylim=c(21.20054,38.03594))

Si solo nos interesan los puntos que se distribuyen en México, podríamos excluir los puntos que caen por fuera del país. Vamos a hacerlo. Primero necesitamos cargar el mapa de México.

library(rgdal)
## rgdal: version: 1.3-1, (SVN revision 747)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.2.3, released 2017/11/20
##  Path to GDAL shared files: C:/Users/juvel/Documents/R/win-library/3.5/rgdal/gdal
##  GDAL binary built with GEOS: TRUE 
##  Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
##  Path to PROJ.4 shared files: C:/Users/juvel/Documents/R/win-library/3.5/rgdal/proj
##  Linking to sp version: 1.2-7
mx <- readOGR(dsn="C:/Users/juvel/Dropbox/Cursos_Charlas/Zaragosa_2018/Curso_FES_Zaragosa18/Poligono_Mexico", layer="gadm36_MEX_1")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\juvel\Dropbox\Cursos_Charlas\Zaragosa_2018\Curso_FES_Zaragosa18\Poligono_Mexico", layer: "gadm36_MEX_1"
## with 32 features
## It has 10 fields
plot(mx)

Ahora le ponemos los puntos encima al mapa…

plot(coords)
plot(mx, add=TRUE)

Con un clip vamos a quitar los puntos que caen por fuera. Pero primero definimos el sistema de coordenadas para nuestros puntos

mx
## class       : SpatialPolygonsDataFrame 
## features    : 32 
## extent      : -118.3689, -86.71014, 14.53292, 32.71804  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 10
## names       : GID_0, NAME_0,   GID_1,         NAME_1, VARNAME_1, NL_NAME_1,           TYPE_1,        ENGTYPE_1, CC_1, HASC_1 
## min values  :   MEX, Mexico, MEX.1_1, Aguascalientes,        NA,        NA, Distrito Federal, Federal District,   NA,  MX.AG 
## max values  :   MEX, Mexico, MEX.9_1,      Zacatecas,        NA,        NA,           Estado,            State,   NA,  MX.ZA
crdref <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
projection(coords)<-crdref 
coords
## class       : SpatialPoints 
## features    : 307 
## extent      : -118.1731, -97.43912, 21.20054, 38.03594  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
coords.clip <- coords[mx, ]
coords
## class       : SpatialPoints 
## features    : 307 
## extent      : -118.1731, -97.43912, 21.20054, 38.03594  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
coords.clip
## class       : SpatialPoints 
## features    : 136 
## extent      : -112.6083, -97.70008, 21.20054, 31.775  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

Graficamos

plot(coords.clip)
plot(mx, add=TRUE)

Ahora tenemos nuestros puntos únicamente para México y los vamos a convertir en un objeto tipo data.frame con el fin de guardarlos en formato CSVy explorarlos con excel.

coords.clipDF <- as.data.frame(coords.clip)
coords.clipDF
##       Longitude Latitude
## 19   -101.05979 22.71030
## 20   -101.52213 26.31369
## 26   -105.91661 28.65161
## 65   -104.86634 26.89161
## 66   -100.97070 22.16421
## 71   -101.00324 27.92715
## 72    -99.12439 23.64901
## 74   -100.80920 25.25199
## 76   -100.23816 24.36905
## 79   -101.10850 23.89352
## 107  -101.89556 22.82944
## 116   -99.94752 23.52748
## 131  -105.00584 30.46371
## 144  -101.71500 23.85833
## 227  -102.57730 28.36036
## 255  -102.50253 28.72447
## 257  -100.70377 25.70094
## 258  -102.91481 27.19092
## 262  -102.80069 28.96800
## 280  -100.95000 28.53361
## 285  -101.67297 25.62062
## 306  -100.03678 22.54258
## 318  -103.29833 25.25583
## 328  -101.28340 24.06499
## 332  -100.94028 24.27906
## 333  -100.84439 23.77492
## 352  -101.27942 25.84200
## 359  -100.06598 24.06585
## 367  -107.12955 29.97947
## 381  -103.35092 26.96283
## 383  -103.48183 27.21056
## 384  -103.10339 26.20875
## 386  -103.15372 26.64569
## 415  -105.33147 29.54167
## 416  -103.65389 28.03661
## 418  -104.09783 28.11419
## 420  -104.31719 28.79531
## 429  -104.16469 27.81803
## 430  -104.38147 28.03411
## 434  -105.26272 29.97431
## 435  -106.08233 30.13389
## 448  -107.62167 31.60389
## 459  -100.50542 22.68775
## 469  -107.80564 31.39231
## 473  -106.58603 31.37706
## 480  -106.37528 31.15117
## 485   -99.23429 24.61317
## 489  -101.07000 26.31500
## 517  -100.15166 22.85944
## 520  -100.54528 23.11611
## 535  -104.51544 26.56981
## 549  -104.00000 27.52000
## 585  -101.54458 24.08967
## 597  -100.37972 24.09306
## 602  -100.45800 25.65000
## 606  -105.05944 25.07444
## 612  -108.06704 31.00012
## 635  -103.76877 26.63453
## 656  -101.67056 22.69444
## 657  -101.62278 22.94639
## 664  -101.24900 22.86800
## 669  -105.18814 27.76351
## 674  -105.19697 28.04955
## 702  -101.44018 24.61038
## 703  -100.64444 22.49167
## 736  -106.17268 28.53829
## 766  -102.15024 23.50033
## 771  -100.32028 24.84583
## 840  -100.52800 26.00000
## 862  -100.03200 25.24300
## 865  -103.76691 25.01703
## 872  -101.39018 24.35039
## 914  -105.20618 29.03603
## 930  -106.36032 29.57018
## 945  -106.73624 29.82405
## 951  -100.60167 23.43936
## 956  -106.50767 29.30628
## 1003 -107.30035 30.28348
## 1009 -106.11316 30.59684
## 1013 -105.76698 30.53348
## 1016 -104.86694 27.38359
## 1020 -102.37021 27.28027
## 1021 -102.27021 26.24031
## 1029 -102.55411 29.06306
## 1044 -109.48467 31.32772
## 1061 -100.38333 25.28378
## 1072 -102.84250 26.20719
## 1105 -100.94389 23.34583
## 1107 -101.22475 27.21897
## 1112 -102.06314 26.98422
## 1121 -106.51803 30.43917
## 1124 -101.42240 25.56363
## 1125 -101.05162 25.48513
## 1135 -102.02162 29.15383
## 1138 -102.50020 23.80041
## 1139 -103.81691 26.20031
## 1220 -107.31958 31.15539
## 1227 -101.22000 22.35000
## 1248 -112.60833 31.77500
## 1251 -100.70833 25.47500
## 1258 -103.24167 23.55833
## 1259 -106.41146 29.91441
## 1272 -104.94167 27.10833
## 1277 -103.14167 25.50833
## 1282  -97.70008 21.20054
## 1288 -100.05833 25.79167
## 1297 -102.87022 25.92032
## 1317 -100.85833 27.02500
## 1325 -106.23194 28.09008
## 1328 -100.02500 21.57500
## 1330 -107.47500 28.47500
## 1349 -103.62500 25.89167
## 1354 -107.64167 30.19167
## 1359 -106.17783 28.84009
## 1376  -99.70833 22.99167
## 1377 -108.29167 31.12500
## 1401 -103.15655 27.31756
## 1405 -102.18020 25.44035
## 1407 -102.30833 21.90833
## 1413 -101.44167 26.79167
## 1421 -105.08367 29.46339
## 1429 -102.89167 25.62500
## 1433 -102.80833 25.34167
## 1435 -103.22500 25.99167
## 1440 -104.45917 24.77750
## 1468 -106.10833 29.67500
## 1490 -103.20833 23.97500
## 1494 -103.59167 25.52500
## 1498 -109.67500 29.80833
## 1503 -107.77500 29.25833
## 1507  -99.47500 21.50833
## 1509 -108.06185 30.31081
## 1517 -101.52500 27.89167
## 1524 -101.54167 25.22500
## 1549 -105.49167 28.27500
## 1571 -103.25505 27.82208
write.csv(coords.clipDF, file="Coordenadas_GBIF_depuradas.csv")

Esto es todo por ahora!

Escrito por Julián A. Velasco, 29 de Julio de 2018.