The burgeoning field of data science has provided a wealth of techniques for analyzing large and complex datasets, including methods for descriptive, explanatory, and predictive analytics. However, actually applying these methods is typically a small part of the overall data science workflow. Other critical tasks include screening for suspect data, handling missing values, harmonizing data from multiple sources, summarizing variables for analysis, and visualizing data and analysis results. Although there are now many books available on statistical and machine learning methods, there are fewer that address the broader topic of scientific workflows for geospatial data processing and analysis.
The purpose of Geographic Data Science with R (GDSWR) is to fill this gap. GDSWR provides a series of tutorials aimed at teaching good practices for using time series and geospatial data to address topics related to environmental change. It is based on the R language and environment, which currently provides the best option for working with diverse sources of spatial and non-spatial data using a single platform. The book is not intended to provide a comprehensive overview of R. Instead, it uses an example-based approach to present practical techniques for working on diverse problems using a variety of datasets.
The material in GDSWR was originally developed for upper-level undergraduate and graduate courses in geospatial data science. It is also suitable for individual study by students or professionals who want to expand their capabilities for working with geospatial data in R. Although the book is not intended to be a comprehensive reference manual, it can also be useful for readers who are looking for examples of particular methods that can be modified for new applications. The tutorials focus on physical geography and draw upon a variety of data sources, including weather station data, gridded climate data, classified land cover data, and digital elevation models. It is my sincere hope that GDSWR will help readers increase their proficiency with R so that they can implement more sophisticated data science workflows that make effective use of diverse geographic data sources. These skills will allow them to address pressing scientific questions and develop new geospatial applications that can enhance our understanding of the changing world we inhabit.