Vamos a buscar datos de especies de la base de datos disponible en línea y pública del Global Biodiversity Information Facility
Podemos usar unas funciones específicas que nos permitirán descargar los datos de las especies de interés.
Primero, cargaremos los paquetes necesarios: dismo
, raster
, sp
# Es necesario que instalen los paquetes antes: install.packages("dismo")
library(dismo)
## Loading required package: raster
## Loading required package: sp
library(raster)
library(sp)
sp.gbif <- gbif(genus="Phrynosoma", species="modestum", geo=TRUE, removeZeros=TRUE, download=TRUE)
## Loading required namespace: jsonlite
## 3032 records found
## 0-300-600-900-1200-1500-1800-2100-2400-2700-3000-3032 records downloaded
names(sp.gbif) # vamos a ver todos los nombres de las columnas
## [1] "acceptedNameUsage"
## [2] "accessRights"
## [3] "adm1"
## [4] "adm2"
## [5] "associatedReferences"
## [6] "associatedSequences"
## [7] "basisOfRecord"
## [8] "bibliographicCitation"
## [9] "catalogNumber"
## [10] "class"
## [11] "classKey"
## [12] "cloc"
## [13] "collectionCode"
## [14] "collectionID"
## [15] "continent"
## [16] "coordinateUncertaintyInMeters"
## [17] "country"
## [18] "crawlId"
## [19] "created"
## [20] "datasetID"
## [21] "datasetKey"
## [22] "datasetName"
## [23] "dateIdentified"
## [24] "day"
## [25] "depth"
## [26] "depthAccuracy"
## [27] "disposition"
## [28] "dynamicProperties"
## [29] "elevation"
## [30] "elevationAccuracy"
## [31] "endDayOfYear"
## [32] "establishmentMeans"
## [33] "eventDate"
## [34] "eventID"
## [35] "eventRemarks"
## [36] "eventTime"
## [37] "family"
## [38] "familyKey"
## [39] "fieldNotes"
## [40] "fieldNumber"
## [41] "fullCountry"
## [42] "gbifID"
## [43] "genericName"
## [44] "genus"
## [45] "genusKey"
## [46] "geodeticDatum"
## [47] "georeferencedBy"
## [48] "georeferencedDate"
## [49] "georeferenceProtocol"
## [50] "georeferenceRemarks"
## [51] "georeferenceSources"
## [52] "georeferenceVerificationStatus"
## [53] "habitat"
## [54] "higherClassification"
## [55] "higherGeography"
## [56] "higherGeographyID"
## [57] "http://unknown.org/classs"
## [58] "http://unknown.org/http_//rs.gbif.org/terms/1.0/Identifier"
## [59] "http://unknown.org/http_//rs.gbif.org/terms/1.0/Multimedia"
## [60] "http://unknown.org/http_//rs.tdwg.org/dwc/terms/Identification"
## [61] "http://unknown.org/http_//rs.tdwg.org/dwc/terms/ResourceRelationship"
## [62] "http://unknown.org/occurrenceDetails"
## [63] "identificationID"
## [64] "identificationQualifier"
## [65] "identificationRemarks"
## [66] "identificationVerificationStatus"
## [67] "identifiedBy"
## [68] "identifier"
## [69] "individualCount"
## [70] "informationWithheld"
## [71] "installationKey"
## [72] "institutionCode"
## [73] "institutionID"
## [74] "islandGroup"
## [75] "ISO2"
## [76] "key"
## [77] "kingdom"
## [78] "kingdomKey"
## [79] "language"
## [80] "lastCrawled"
## [81] "lastInterpreted"
## [82] "lastParsed"
## [83] "lat"
## [84] "license"
## [85] "lifeStage"
## [86] "locality"
## [87] "locationAccordingTo"
## [88] "locationID"
## [89] "locationRemarks"
## [90] "lon"
## [91] "modified"
## [92] "month"
## [93] "municipality"
## [94] "nameAccordingTo"
## [95] "namePublishedInYear"
## [96] "nomenclaturalCode"
## [97] "occurrenceID"
## [98] "occurrenceRemarks"
## [99] "occurrenceStatus"
## [100] "order"
## [101] "orderKey"
## [102] "organismID"
## [103] "otherCatalogNumbers"
## [104] "ownerInstitutionCode"
## [105] "parentNameUsage"
## [106] "phylum"
## [107] "phylumKey"
## [108] "preparations"
## [109] "previousIdentifications"
## [110] "protocol"
## [111] "publishingCountry"
## [112] "publishingOrgKey"
## [113] "recordedBy"
## [114] "recordNumber"
## [115] "references"
## [116] "reproductiveCondition"
## [117] "rights"
## [118] "rightsHolder"
## [119] "scientificName"
## [120] "scientificNameID"
## [121] "sex"
## [122] "species"
## [123] "speciesKey"
## [124] "specificEpithet"
## [125] "startDayOfYear"
## [126] "taxonID"
## [127] "taxonKey"
## [128] "taxonomicStatus"
## [129] "taxonRank"
## [130] "taxonRemarks"
## [131] "type"
## [132] "typeStatus"
## [133] "typifiedName"
## [134] "verbatimCoordinateSystem"
## [135] "verbatimElevation"
## [136] "verbatimEventDate"
## [137] "verbatimLocality"
## [138] "verbatimSRS"
## [139] "vernacularName"
## [140] "year"
## [141] "downloadDate"
sp.coords <- na.omit(sp.gbif[,c(124, 90, 83)]) # indexamos los datos de coordenadas (lon, lat)
head(sp.coords)
## specificEpithet lon lat
## 1 modestum -106.6985 32.29295
## 2 modestum -102.6655 30.25169
## 3 modestum -106.6103 32.70861
## 4 modestum -107.6247 33.30399
## 5 modestum -100.7055 22.49427
## 6 modestum -106.3079 31.63903
Graficamos las coordenadas para explorar posibles sesgos espaciales
plot(sp.coords$lon, sp.coords$lat)
Aquí vemos que hay muchos puntos muy agrupados. Este tipo de datos puede influir negativamente en la calibración del modelo de nicho y la predicción geográfica.
Vamos a tratar de reducir ese efecto adelgazando la muestra de puntos. Lo primero que hacemos es tratar de eliminar puntos a una distancia menor de 25 km. Como tenemos un tamaño de muestra grande (> 3000 registros) podemos hacer esto sin mucho problema. Usaremos el paquete spThin
para esto.
# 25 km minimum distance
library(spThin)
## Warning: package 'spThin' was built under R version 3.5.1
## Loading required package: spam
## Loading required package: dotCall64
## Loading required package: grid
## Spam version 2.1-4 (2018-04-12) is loaded.
## Type 'help( Spam)' or 'demo( spam)' for a short introduction
## and overview of this package.
## Help for individual functions is also obtained by adding the
## suffix '.spam' to the function name, e.g. 'help( chol.spam)'.
##
## Attaching package: 'spam'
## The following objects are masked from 'package:base':
##
## backsolve, forwardsolve
## Loading required package: fields
## Loading required package: maps
## See www.image.ucar.edu/~nychka/Fields for
## a vignette and other supplements.
## Loading required package: knitr
sp.thin <-thin(sp.coords, lat.col="lat", long.col="lon", spec.col="specificEpithet",
thin.par=25, reps=1, locs.thinned.list.return=TRUE, write.files=TRUE,
max.files=1, out.dir=".", out.base = "modestum.train.25km", write.log.file=TRUE, log.file="modestum.log.txt",
verbose=TRUE)
## **********************************************
## Beginning Spatial Thinning.
## Script Started at: Sun Jul 29 13:43:02 2018
## lat.long.thin.count
## 307
## 1
## [1] "Maximum number of records after thinning: 307"
## [1] "Number of data.frames with max records: 1"
## [1] "Writing new *.csv files"
## Warning in thin(sp.coords, lat.col = "lat", long.col = "lon", spec.col =
## "specificEpithet", : Writing new *.csv files to output directory: .
## Warning in thin(sp.coords, lat.col = "lat", long.col = "lon", spec.col = "specificEpithet", : ./modestum.train.25km_thin1_new.csv ' already exists. Renaming file
## to avoid overwriting.
## [1] "Writing file: ./modestum.train.25km_thin1_new.csv"
Ahora vamos primero a identificar que tipo de objeto se acaba de generar:
class(sp.thin)
## [1] "list"
str(sp.thin)
## List of 1
## $ :'data.frame': 307 obs. of 2 variables:
## ..$ Longitude: num [1:307] -103 -108 -102 -106 -102 ...
## ..$ Latitude : num [1:307] 30.3 33.3 30.1 31.5 30.7 ...
head(sp.thin[[1]])
## Longitude Latitude
## 2 -102.6655 30.25169
## 4 -107.6247 33.30399
## 10 -102.3864 30.10102
## 11 -106.0404 31.45543
## 12 -101.8940 30.65892
## 14 -101.3129 29.68367
Tenemos que tener en cuenta como vienen organizados los datos para luego manipularlos como queramos. Vamos a graficar los puntos adelgazados a una distancia no menor de 25 km.
plot(sp.thin[[1]]$Longitude, sp.thin[[1]]$Latitude)
Si comparamos con el objeto que contenia todos los puntos podemos ver que hemos reducido el sobre-muestreo en algunas regiones y esto nos facilitará la calibración y posterior predicción de nuestro modelo de nicho. Vamos a ver donde caen nuestros puntos dentro de un mapa.
Vamos a convertir nuestros puntos en un objeto espacial.
coords <- sp.thin[[1]]
coordinates(coords)<- ~Longitude+Latitude
plot(coords)
coords
## class : SpatialPoints
## features : 307
## extent : -118.1731, -97.43912, 21.20054, 38.03594 (xmin, xmax, ymin, ymax)
## coord. ref. : NA
plot(coords)
map(add = TRUE, xlim=c(-118.1731,-97.43912), ylim=c(21.20054,38.03594))
Si solo nos interesan los puntos que se distribuyen en México, podríamos excluir los puntos que caen por fuera del país. Vamos a hacerlo. Primero necesitamos cargar el mapa de México.
library(rgdal)
## rgdal: version: 1.3-1, (SVN revision 747)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 2.2.3, released 2017/11/20
## Path to GDAL shared files: C:/Users/juvel/Documents/R/win-library/3.5/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
## Path to PROJ.4 shared files: C:/Users/juvel/Documents/R/win-library/3.5/rgdal/proj
## Linking to sp version: 1.2-7
mx <- readOGR(dsn="C:/Users/juvel/Dropbox/Cursos_Charlas/Zaragosa_2018/Curso_FES_Zaragosa18/Poligono_Mexico", layer="gadm36_MEX_1")
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\juvel\Dropbox\Cursos_Charlas\Zaragosa_2018\Curso_FES_Zaragosa18\Poligono_Mexico", layer: "gadm36_MEX_1"
## with 32 features
## It has 10 fields
plot(mx)
Ahora le ponemos los puntos encima al mapa…
plot(coords)
plot(mx, add=TRUE)
Con un clip
vamos a quitar los puntos que caen por fuera. Pero primero definimos el sistema de coordenadas para nuestros puntos
mx
## class : SpatialPolygonsDataFrame
## features : 32
## extent : -118.3689, -86.71014, 14.53292, 32.71804 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## variables : 10
## names : GID_0, NAME_0, GID_1, NAME_1, VARNAME_1, NL_NAME_1, TYPE_1, ENGTYPE_1, CC_1, HASC_1
## min values : MEX, Mexico, MEX.1_1, Aguascalientes, NA, NA, Distrito Federal, Federal District, NA, MX.AG
## max values : MEX, Mexico, MEX.9_1, Zacatecas, NA, NA, Estado, State, NA, MX.ZA
crdref <- CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
projection(coords)<-crdref
coords
## class : SpatialPoints
## features : 307
## extent : -118.1731, -97.43912, 21.20054, 38.03594 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
coords.clip <- coords[mx, ]
coords
## class : SpatialPoints
## features : 307
## extent : -118.1731, -97.43912, 21.20054, 38.03594 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
coords.clip
## class : SpatialPoints
## features : 136
## extent : -112.6083, -97.70008, 21.20054, 31.775 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
Graficamos
plot(coords.clip)
plot(mx, add=TRUE)
Ahora tenemos nuestros puntos únicamente para México y los vamos a convertir en un objeto tipo data.frame
con el fin de guardarlos en formato CSV
y explorarlos con excel.
coords.clipDF <- as.data.frame(coords.clip)
coords.clipDF
## Longitude Latitude
## 19 -101.05979 22.71030
## 20 -101.52213 26.31369
## 26 -105.91661 28.65161
## 65 -104.86634 26.89161
## 66 -100.97070 22.16421
## 71 -101.00324 27.92715
## 72 -99.12439 23.64901
## 74 -100.80920 25.25199
## 76 -100.23816 24.36905
## 79 -101.10850 23.89352
## 107 -101.89556 22.82944
## 116 -99.94752 23.52748
## 131 -105.00584 30.46371
## 144 -101.71500 23.85833
## 227 -102.57730 28.36036
## 255 -102.50253 28.72447
## 257 -100.70377 25.70094
## 258 -102.91481 27.19092
## 262 -102.80069 28.96800
## 280 -100.95000 28.53361
## 285 -101.67297 25.62062
## 306 -100.03678 22.54258
## 318 -103.29833 25.25583
## 328 -101.28340 24.06499
## 332 -100.94028 24.27906
## 333 -100.84439 23.77492
## 352 -101.27942 25.84200
## 359 -100.06598 24.06585
## 367 -107.12955 29.97947
## 381 -103.35092 26.96283
## 383 -103.48183 27.21056
## 384 -103.10339 26.20875
## 386 -103.15372 26.64569
## 415 -105.33147 29.54167
## 416 -103.65389 28.03661
## 418 -104.09783 28.11419
## 420 -104.31719 28.79531
## 429 -104.16469 27.81803
## 430 -104.38147 28.03411
## 434 -105.26272 29.97431
## 435 -106.08233 30.13389
## 448 -107.62167 31.60389
## 459 -100.50542 22.68775
## 469 -107.80564 31.39231
## 473 -106.58603 31.37706
## 480 -106.37528 31.15117
## 485 -99.23429 24.61317
## 489 -101.07000 26.31500
## 517 -100.15166 22.85944
## 520 -100.54528 23.11611
## 535 -104.51544 26.56981
## 549 -104.00000 27.52000
## 585 -101.54458 24.08967
## 597 -100.37972 24.09306
## 602 -100.45800 25.65000
## 606 -105.05944 25.07444
## 612 -108.06704 31.00012
## 635 -103.76877 26.63453
## 656 -101.67056 22.69444
## 657 -101.62278 22.94639
## 664 -101.24900 22.86800
## 669 -105.18814 27.76351
## 674 -105.19697 28.04955
## 702 -101.44018 24.61038
## 703 -100.64444 22.49167
## 736 -106.17268 28.53829
## 766 -102.15024 23.50033
## 771 -100.32028 24.84583
## 840 -100.52800 26.00000
## 862 -100.03200 25.24300
## 865 -103.76691 25.01703
## 872 -101.39018 24.35039
## 914 -105.20618 29.03603
## 930 -106.36032 29.57018
## 945 -106.73624 29.82405
## 951 -100.60167 23.43936
## 956 -106.50767 29.30628
## 1003 -107.30035 30.28348
## 1009 -106.11316 30.59684
## 1013 -105.76698 30.53348
## 1016 -104.86694 27.38359
## 1020 -102.37021 27.28027
## 1021 -102.27021 26.24031
## 1029 -102.55411 29.06306
## 1044 -109.48467 31.32772
## 1061 -100.38333 25.28378
## 1072 -102.84250 26.20719
## 1105 -100.94389 23.34583
## 1107 -101.22475 27.21897
## 1112 -102.06314 26.98422
## 1121 -106.51803 30.43917
## 1124 -101.42240 25.56363
## 1125 -101.05162 25.48513
## 1135 -102.02162 29.15383
## 1138 -102.50020 23.80041
## 1139 -103.81691 26.20031
## 1220 -107.31958 31.15539
## 1227 -101.22000 22.35000
## 1248 -112.60833 31.77500
## 1251 -100.70833 25.47500
## 1258 -103.24167 23.55833
## 1259 -106.41146 29.91441
## 1272 -104.94167 27.10833
## 1277 -103.14167 25.50833
## 1282 -97.70008 21.20054
## 1288 -100.05833 25.79167
## 1297 -102.87022 25.92032
## 1317 -100.85833 27.02500
## 1325 -106.23194 28.09008
## 1328 -100.02500 21.57500
## 1330 -107.47500 28.47500
## 1349 -103.62500 25.89167
## 1354 -107.64167 30.19167
## 1359 -106.17783 28.84009
## 1376 -99.70833 22.99167
## 1377 -108.29167 31.12500
## 1401 -103.15655 27.31756
## 1405 -102.18020 25.44035
## 1407 -102.30833 21.90833
## 1413 -101.44167 26.79167
## 1421 -105.08367 29.46339
## 1429 -102.89167 25.62500
## 1433 -102.80833 25.34167
## 1435 -103.22500 25.99167
## 1440 -104.45917 24.77750
## 1468 -106.10833 29.67500
## 1490 -103.20833 23.97500
## 1494 -103.59167 25.52500
## 1498 -109.67500 29.80833
## 1503 -107.77500 29.25833
## 1507 -99.47500 21.50833
## 1509 -108.06185 30.31081
## 1517 -101.52500 27.89167
## 1524 -101.54167 25.22500
## 1549 -105.49167 28.27500
## 1571 -103.25505 27.82208
write.csv(coords.clipDF, file="Coordenadas_GBIF_depuradas.csv")
Esto es todo por ahora!
Escrito por Julián A. Velasco, 29 de Julio de 2018.