Share

9.3 Combine Data Formats in R

14. Continue with same script loaded after finishing Exercise 9.1 and 9.2

15. Here we need to get a mean for each covariate for each 13 km2 cell of our sampling grid

#Extracts all rasters within each sampling grid cell
ext <- extract(r, grid, weights=TRUE, fun = mean)
head(ext) #view the results


# dem slope aspect mule.deer.HSI nlcd

#[1,] 2730.461 9.801106 227.6434 0.28312888 3.202686
#[2,] 2661.003 11.862190 120.0121 0.05178489 3.195367
#[3,] 2443.063 1.644629 136.0088 0.01859540 3.989930
#[4,] 2450.666 8.212403 199.1163 0.23681863 3.837849
#[5,] 2570.362 8.625199 212.5139 0.26878506 3.333714
#[6,] 2685.348 5.395405 205.6125 0.22692773 3.086528

###################################################NOTE above that for each grid cell in the sampling grid layer (i.e., grid),
# the "extract" function resulted in means for dem, slope, aspect, and HSI
#for all sampling grid cells but "nlcd" (last column of data) resulted in mean cover categories so need #to run nlcd separate with more appropriate code below.
##################################################

16. Mean land cover category is not appropriate here so we need to handle Land Cover layer (i.e., rc) separately
#Code below extracts nlcd by land cover category and determines how many
#cells of each type were in each sampling grid.
ovR = extract(rc,grid, byid=TRUE)
head(ovR)
#Summarize results by sampling grid ID
#Land Cover category and number of cells (30x30m raster cells)
tab <- lapply(ovR,table)
tab[[1]]
# 3
#18
tab[[48]]
# 3 4 5 7
#6618 80 95 167
############################################
#Code here thanks to Tyler Wagner, PA Coop Unit, for creating this loop
#to summarize proportions of habitat within each grid cell
############################################
# Created land use categories
lus <- 1:7
##### Loop through and append missing land use categories to each grid cell
ovRnew <- list()
for(i in 1:length(grid)[1] ){
# Land use cats in a given cell
temp1 <- unique(ovR[[i]])
# Give missing category 999 value
ma1 <- match(lus, temp1, nomatch = 999, incomparables = NULL)
# Get location (category of missing land use type)
miss <- which(ma1%in%999)
ovRnew[[i]] <- c(ovR[[i]], miss)
}

# New summary of land use in a grid cell
tab2 <- lapply(ovRnew,table)
tab2[[1]]
tab[[1]]
# Proportions of all land cover types per grid cell
prop <- list()
for(i in 1:length(grid)[1] ){
prop[[i]] <- round((margin.table(tab2[[i]],1)/margin.table(tab2[[i]])),digits = 6)
}
#Function coredata is from the zoo package to convert the
#the proportions from a list to a matrix
M <- coredata(do.call(cbind, lapply(prop, zoo)))
colnames(M) <- NULL
#Transpose matrix so land cover become separate columns of data
matrix <- t(M)
matrix
#Now convert the matrix to a data frame so it is easier to manipulate
dfland <- as.data.frame(matrix)
#Assigning column names to land cover
colnames(dfland) <-c("water","developed","forest","shrub","grass","crop","wetland")
dfland[1:5,]
17. Now that we have Land Cover in a similar format as the DEM-derived data, we want to convert ext (the combined extracted rasters) into a data frame so it is easier to manipulate as well. The "extract" function in the raster package is supposed to be able to do this but does not work for some reason.
elecov <- as.data.frame(ext)
head(elecov)

18. To bring all of these datasets together for Bayesian hierarchical modeling, we need to assign sampling grid identification numbers to each moose sampled based on where it was harvested. We also need to assign sampling grid cell identification for the covariate summaries so we can join all into one master data set

#Plot and look at sampling grid ID
plot(grid)
text(coordinates(grid), labels=sapply(slot(grid, "polygons"), function(i)
slot(i, "ID")), cex=0.8)
#Now assign actual sampling grid IDs to the covariate summaries in "elecov"
elecov$id <- paste(grid@data$id)#, function(x) slot(x,"ID"))
demdata <- elecov[-c(5)]#rename to cleanup by removing incorrect nlcd column
head(demdata)

#We can now combine dem data with Land Cover data to get our master dataset for each sampling grid cell

data <- cbind(demdata, dfland) # rbind list elements
head(data)
data$GRID <- data$id #This is mainly just to a check on matching grid IDs for later
data[1:10,]

19. We also need to identify the sampling grid cell that each moose was located and match the moose up with its respective covariate data based on grid cell ID

#First let’s assign the grid cell ID to each moose sampled
snowy2 <- over(snowy.spdf,grid)
snowy2[1:5,]
#Now add column to moose demographic data identifying sampling grid cell ID
new <- cbind(snowy.spdf@data,snowy2)
new[1:5,]

#Now we need to "join" the appropriate covariate data to each moose sampled based on the sampling grid cell it occurred in (e.g., g953)
data[1:5,]
data <- data[-c(13)]#remove duplicate id column or program throws an error
mydata <- merge(new, data, by=c("id"))
mydata[1:10,]


#Save data to use in exercise later
write.table(mydata,"SnowyData.txt", sep="\t")

#Save workspace so all data is available
save.image("Epi_dataprep.RData")