gjam

Latest release: gjam 2.3.2 on 6-15-20

Generalized Joint Attribute Modeling (GJAM) in R

Ecological attributes include species abundances, traits, and individual condition (e.g., growth or infection status), to name a few. These data are multivariate but rarely of a single type: they can combine presence-absence, ordinal, continuous, discrete, composition, and zero-inflated observations. gjam provides inference on sensitivity to input variables, correlations between responses, model selection, prediction of responses, inverse prediction of predictors, and community classification by response to predictors.

gjam was motivated by species distribution and abundance data, but can provide an attractive alternative to traditional methods wherever observations are multivariate and combine multiple scales and mixtures of continuous and discrete data.

Importantly, analysis is done on the observation scale. That is, coefficients and covariances are interpreted on the same scale as the data. This contrasts with standard generalized linear models (GLMs), where coefficients and covariances are difficult to interpret and cannot be compared across responses that are modeled on different scales and with nonlinear link functions.

gjam accommodates the large numbers of zeros common in multivariate data by avoiding the standard mixtures used in zero-inflated GLMs. Instead, gjam relies on censoring.
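The idea behind censoring can be sketched in a few lines of base R. This is an illustration of the concept for a single continuous-abundance response, not gjam's internal code; the latent mean `mu` and the sample size are assumptions chosen so that most observations are zero.

```r
# Illustrative sketch only (not gjam internals): a zero observation is
# treated as a censored value of a continuous latent variable.
set.seed(1)

n  <- 1000
mu <- -0.5                          # assumed latent mean, below zero
w  <- rnorm(n, mean = mu, sd = 1)   # latent continuous scale
y  <- pmax(w, 0)                    # observed scale: w <= 0 is recorded as 0

mean(y == 0)                        # fraction of zeros in the observed data
all(y[w > 0] == w[w > 0])           # positive values are observed directly
```

Because zeros arise from the same latent variable as the positive values, no separate mixture component is needed to account for them.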

gjam exploits censoring to combine multiple data types in a single model, including mixtures of continuous and discrete data.  For example, the microbial community (composition data) might be tracked together with host condition (continuous, categorical, binary, ordinal, …).
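A mixed-type model might be set up as in the sketch below, which follows the pattern used in the package vignette. The data are simulated with `gjamSimData`, and the particular mix of types, the sample size, and the short chain length (`ng`, `burnin`) are assumptions chosen to keep the example quick; real analyses would use longer chains.

```r
library(gjam)

set.seed(1)

# One type code per response column:
# DA = discrete abundance, OC = ordinal counts, CC = count composition,
# CA = continuous abundance, PA = presence-absence
types <- c('DA','DA','OC','OC','OC','OC','CC','CC','CC','CC','CC',
           'CA','CA','PA','PA')

# Simulate predictors and responses of the requested types
f <- gjamSimData(n = 500, S = length(types), Q = 3, typeNames = types)

# Short chain for illustration only
ml  <- list(ng = 500, burnin = 100, typeNames = f$typeNames)
out <- gjam(f$formula, xdata = f$xdata, ydata = f$ydata, modelList = ml)

summary(out)
```

With real data, `typeNames` would list the type of each column of `ydata` in the same way.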

Model:

Clark, J.S., D. Nemergut, B. Seyednasrollah, P. Turner, and S. Zhang. 2017. Generalized joint attribute modeling for biodiversity analysis: Median-zero, multivariate, multifarious data. Ecological Monographs, 87, 34–56.

Presents the motivation and model; summarizes computation in gjam.  The Supplement file provides additional detail on algorithms.

Taylor-Rodríguez, D., Kaufeld, K., Schliep, E. M., Clark, J. S., Gelfand, A. E. 2017. Joint species distribution modeling: Dimension reduction using Dirichlet processes. Bayesian Analysis, doi: 10.1214/16-BA1031. http://projecteuclid.org/euclid.ba/1478073617

Many applications require large numbers of response variables.  Microbiome studies bring the additional complication of composition data.  And most observed values can still be zero.  This paper describes the Dirichlet process prior implemented in gjam that finds a low-dimensional representation for the covariance between responses.
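In the package, dimension reduction is requested through a `reductList` entry in the model list. The sketch below follows the vignette's pattern on simulated data; the values of `r` and `N`, the number of responses, and the short chain are assumptions for illustration rather than recommendations.

```r
library(gjam)

set.seed(1)

# Many responses (S = 200) relative to observations (n = 100)
f <- gjamSimData(n = 100, S = 200, typeNames = 'CA')

# N: potential number of response groups; r: dimension of each group
rl <- list(r = 5, N = 20)

# Short chain for illustration only
ml  <- list(ng = 500, burnin = 100, typeNames = f$typeNames,
            reductList = rl)
out <- gjam(f$formula, xdata = f$xdata, ydata = f$ydata, modelList = ml)

# Posterior mean covariance between responses
dim(out$parameters$sigMu)
```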

Dynamic species interactions:

Clark, J. S., C. L. Scher, and M. Swift. 2020. The emergent interactions that govern biodiversity change. Proceedings of the National Academy of Sciences, 202003852, https://doi.org/10.1073/pnas.2003852117.

Quantifying species interactions requires dynamic data and the integration of environmental effects. This paper defines and estimates environment-species interactions (ESI) with full uncertainty.

Microbiome applications:

Wang, Z., D. L. Juarez, J.-F. Pan, S. K. Blinebry, J. Gronniger, J. S. Clark, Z. I. Johnson, and D. E. Hunt. 2019. Microbial communities across nearshore to offshore coastal transects are shaped by both distance from shore and seasonality. Environmental Microbiology, 21, 3862-3872.

16S rRNA gene libraries reveal distinct nearshore, continental shelf, and offshore oceanic communities. Water temperature and distance from shore have the strongest influence on community composition. However, at the phylotype level, the distribution of some taxa is linked to temperature, others to distance from shore, and some to both.

Bachelot B., Uriarte M., Muscarella R., Forero-Montana J., Thompson J., McGuire K., Zimmerman J.K., Swenson N.G. and J.S. Clark. 2018. Associations among arbuscular mycorrhizal fungi and seedlings are predicted to change with tree successional status. Ecology 99: 607-620.

Seedlings of early‐successional tree species may not rely as much as mid‐ and late‐successional species on AM fungi, and AM fungi may accelerate forest succession.

Trait analysis:  

Seyednasrollah, B., and Clark, J. S. 2020. Where resource‐acquisitive species are located: The role of habitat heterogeneity. Geophysical Research Letters, 47, e2020GL087626. https://doi.org/10.1029/2020GL087626

Joint inference on traits demonstrates that N- and P-demanding species respond disproportionately to environmental gradients, and their response is largely explained by soil variation. A strong boundary of resource‐acquisitive species occurs near the last glacial limit that separates weathered soils to the south from young soils to the north. Although local soil moisture may reduce drought‐induced stress for moisture‐acquisitive species, nutrient‐acquisitive species remain vulnerable on wet soils in dry climates.

Clark, J.S. 2016. Why species tell us more about traits than traits tell us about species: Predictive models. Ecology, 97, 1979–1993.

The joint distribution of ecological attributes (‘traits’) can be modeled together with species, separately, or predicted from the joint distribution of species.  This paper describes the model and computation implemented in gjam.

Vignette with R code and applications: gjam vignette

Below are cluster plots of the correlation matrix for a presence-absence model (a), continuous abundance model (b), and the response to environmental variables (d).  The cluster analysis in (c) is based on distances in (d).  These plots are obtained by specifying GRIDPLOTS=T in gjamPlot.

fig7a

fig7b
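Plots of this kind can be requested as in the sketch below, which passes `GRIDPLOTS` to `gjamPlot` through its `plotPars` list, following the vignette. The simulated data and the short chain are assumptions for illustration, so the resulting plots will not match the figures above.

```r
library(gjam)

set.seed(1)

# Simulated continuous-abundance data: 10 responses, 4 predictors
f <- gjamSimData(n = 500, S = 10, Q = 4, typeNames = 'CA')

# Short chain for illustration only
ml  <- list(ng = 1000, burnin = 200, typeNames = f$typeNames)
out <- gjam(f$formula, xdata = f$xdata, ydata = f$ydata, modelList = ml)

# GRIDPLOTS = TRUE adds the grid and cluster plots to the standard output
pl <- list(GRIDPLOTS = TRUE)
gjamPlot(output = out, plotPars = pl)
```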

Main contributors:

Jim Clark wrote the GJAM model, the R and C++ code, and the GJAM package.

Alan Gelfand and Daniel Taylor-Rodríguez wrote the Dirichlet process model and algorithms for dimension reduction.

Daniel Taylor-Rodríguez implemented the Dirichlet process in R and C++ in GJAM.

Bene Bachelot, Chase Nuñes, and Brad Tomasek provided extensive testing and feedback through all stages of development.

Many others: Students in the course Bayesian Inference Environm Models (BIO/ENV 665) at Duke University and members of the Multivariate Modeling working group of the SAMSI Ecology program contributed many ideas, recommendations, and feedback.

Installation in R or RStudio:

> install.packages('gjam')
> library('gjam')

Documentation:

> help('gjam')
> browseVignettes('gjam')

Publications using GJAM

Clark, J. S., C. L. Scher, and M. Swift. 2020. The emergent interactions that govern biodiversity change. Proceedings of the National Academy of Sciences, 202003852, https://doi.org/10.1073/pnas.2003852117.

Wang, Z., D. L. Juarez, J.-F. Pan, S. K. Blinebry, J. Gronniger, J. S. Clark, Z. I. Johnson, and D. E. Hunt. 2019. Microbial communities across nearshore to offshore coastal transects are shaped by both distance from shore and seasonality. Environmental Microbiology, 21, 3862-3872.

Bachelot B., Uriarte M., Muscarella R., Forero-Montana J., Thompson J., McGuire K., Zimmerman J.K., Swenson N.G. and J.S. Clark. 2018. Associations among arbuscular mycorrhizal fungi and seedlings are predicted to change with tree successional status. Ecology 99: 607-620.

Clark, J.S. 2016. Why species tell us more about traits than traits tell us about species: Predictive models. Ecology, 97, 1979–1993.

Clark, J.S., D. Nemergut, B. Seyednasrollah, P. Turner, and S. Zhang. 2017.  Generalized joint attribute modeling for biodiversity analysis: Median-zero, multivariate, multifarious data.   Ecological Monographs, 87, 34–56.

Taylor-Rodríguez, D., Kaufeld, K., Schliep, E. M., Clark, J. S., Gelfand, A. E. 2017. Joint species distribution modeling: Dimension reduction using Dirichlet processes. Bayesian Analysis, doi: 10.1214/16-BA1031. http://projecteuclid.org/euclid.ba/1478073617