Reproducibility is a cornerstone of modern science. This, in combination
with a growing demand for the application of the FAIR Guiding Principles
for scientific data management and stewardship, has led to the
development of the ECOTOXr package. The EPA ECOTOX database provides the
means to reuse data for multiple purposes.
However, studies that use curated data from this database often describe
the process of curating data implicitly or not at all. De Vries
(2024) proposes explicitly documenting the data curation process in the
form of an R script, which the ECOTOXr package can help to streamline.
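
As a rough illustration of such a script, the sketch below retrieves records and applies one explicit curation step. The search terms are purely illustrative, and the `endpoint` column name and the LC50 filter are assumptions rather than anything prescribed by De Vries (2024); check `names(results)` against your own database build before relying on them.

```r
# Sketch of a documented curation script. Search terms are illustrative only.
search_terms <- list(
  latin_name    = list(terms = "Daphnia magna", method = "exact"),
  chemical_name = list(terms = "benzene",       method = "exact")
)

results <- NULL
if (requireNamespace("ECOTOXr", quietly = TRUE) &&
    ECOTOXr::check_ecotox_availability()) {
  # Step 1: retrieve raw records from the locally built database.
  results <- ECOTOXr::search_ecotox(search_terms)

  # Step 2: curate explicitly, so every exclusion is visible in the script.
  # The 'endpoint' column name is an assumption; it is only used if present.
  if ("endpoint" %in% names(results)) {
    results <- results[grepl("^LC50", results$endpoint), , drop = FALSE]
  }
} else {
  message("ECOTOXr or a local database copy is not available; nothing searched.")
}
```

Keeping every filter in the script, rather than editing the exported table by hand, is what makes the curation traceable.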
De Vries (2024) provides some rules of thumb for improving the reproducibility of your research when using the ECOTOXr package. These rules are repeated here with some explanation. Quoted text below is from De Vries (2024):
“… transparency and reproducibility is optimised when:”

“… cite_ecotox()).” By using the CRAN release you ensure
that the version you use has passed all checks and balances required for
a CRAN submission. If you need the latest features from the development
version on GitHub or r-universe, you
could refer to it by mentioning its latest commit. You could also send
a request to the maintainer to submit the development version to
CRAN.

“… download_ecotox_data()).” The locally built copy of
the database is a ‘simple’ SQLite database located at the path returned
by get_ecotox_path() (or elsewhere if specified by the user).
The package only writes to this database while building it from the
downloaded files. All other operations are read-only. However, there is
nothing stopping the user from writing to the database. In fact, it can be
considered one of the features of this package: it allows you to add
additional (meta)data to the database. But if you want to ensure
reproducibility, you must make sure that the database is not corrupted
and that no data is overwritten. In other words: be careful.

For more details and some demonstrations using case studies, please read De Vries (2024).
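
The rules of thumb on citing the package version and leaving the database unmodified could be supported by a few lines at the top and bottom of an analysis script. The following is a minimal sketch, not part of ECOTOXr: `describe_package()` and `fingerprint_sqlite()` are hypothetical helpers. It also assumes that the local database is stored as one or more `.sqlite` files in the directory returned by `get_ecotox_path()`, and that a development install recorded its commit in the `RemoteSha` field of the package description, which only some installers do.

```r
# Hypothetical helper: version and (if recorded) source commit of a package.
describe_package <- function(pkg) {
  desc   <- utils::packageDescription(pkg)
  commit <- desc$RemoteSha  # assumption: only set by some installers
  list(
    package = pkg,
    version = as.character(utils::packageVersion(pkg)),
    commit  = if (is.null(commit)) NA_character_ else commit
  )
}

# Hypothetical helper: MD5 fingerprint of every SQLite file in a directory.
fingerprint_sqlite <- function(dir) {
  files <- list.files(dir, pattern = "\\.sqlite$", full.names = TRUE)
  sums  <- tools::md5sum(files)
  names(sums) <- basename(files)
  sums
}

if (requireNamespace("ECOTOXr", quietly = TRUE)) {
  # Record which package version (and commit, if known) produced the results.
  print(describe_package("ECOTOXr"))

  db_dir <- ECOTOXr::get_ecotox_path()
  before <- fingerprint_sqlite(db_dir)
  # ... run the search and curation steps here ...
  after  <- fingerprint_sqlite(db_dir)

  # A changed fingerprint means the database files were written to.
  if (!identical(before, after)) {
    warning("The local database files changed during the analysis.")
  }
}
```

Hashing a large database file can take a while, so it may be enough to do this once per analysis and store the fingerprint alongside the results.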