This repository contains R code and detailed documentation for investigating the impacts of sampling biases on commonly employed methods for analyzing radiocarbon date distributions, particularly in archaeological research examining past human occupation. Specifically, it addresses how different sampling strategies may influence frequency distributions (FDs) and summed probability distributions (SPDs), affecting interpretations of underlying population trends.
The repository is organized into three primary directories:
Contains the input data file:
Austarch_1-3_and_IDASQ_28Nov13-1.csv: Raw archaeological data utilized as input for the simulations and subsampling experiments.
All output data generated by the analyses, structured clearly into subfolders:
- baseline_vs_subsamples: Contains results of comparisons between baseline simulated SPDs and SPDs derived from biased subsamples.
  - plot_data: CSV data files used to generate plots.
  - plots: Graphical outputs (PNG) illustrating baseline versus subsample comparisons.
- subsamples_vs_theoretical_growth_models: Contains results of model fitting analyses, evaluating the ability of different biased subsamples (FDs and SPDs) to accurately reflect known theoretical population growth models.
  - plot_data: CSV files supporting graphical summaries.
  - plots: Visualization of subsamples compared to theoretical growth models (PNG files).
All R scripts, organized into logical workflows, alongside documentation using R Markdown (Rmd) and HTML-rendered reports:
- R scripts (.R): Core analysis workflows for data simulation, sampling biases, calibration, model fitting, and visualization (as described in detail below).
- rmd: R Markdown files that provide comprehensive narrative descriptions of the analysis pipelines, facilitating reproducibility.
- html: HTML-rendered versions of R Markdown files for easy viewing of completed analytical workflows.
The analysis pipeline is explicitly structured into sequential stages, with clearly documented R scripts facilitating full reproducibility:
Files:
sampling_bias_in_radiocarbon_dating-simulation_study-[trend]_population_growth-baseline_SPD_vs_mean_subsample_SPDs.R
Tasks:
- Simulates synthetic baseline radiocarbon datasets under predefined theoretical population trajectories (uniform, linear, exponential, no change, and growth-then-decline).
- Generates biased subsamples from each simulated baseline dataset according to specified strategies (e.g., uniform random subsamples, singleton-biased subsamples targeting recent or ancient dates, and bracketed subsamples).
- Calibrates radiocarbon dates using the IntCal20 calibration curve.
- Computes SPDs for baseline datasets and compares these with SPDs from biased subsamples, quantifying discrepancies via envelope tests and summarizing results as statistical outputs (.csv files).
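The subsampling step listed above can be sketched in base R. This is a minimal illustration, not the repository's code: the site structure, sample sizes, growth rate, and bin width are all hypothetical, and the reading of "singleton" as one date per site and "bracketed" as the youngest and oldest date per site is an assumption. The sketch works in calendar years BP and skips calibration against IntCal20.

```r
set.seed(1)

# Hypothetical design: 50 sites with 10 dated samples each.
n_sites <- 50
dates_per_site <- 10
years_bp <- 1001:10000

# Exponential growth toward the present: younger ages are more probable.
# The rate 0.0005 per year is an arbitrary illustrative value.
baseline <- data.frame(
  site = rep(seq_len(n_sites), each = dates_per_site),
  age_bp = sample(years_bp, n_sites * dates_per_site, replace = TRUE,
                  prob = exp(-0.0005 * years_bp))
)

# Uniform random subsample: every date is equally likely to be kept.
uniform_sub <- baseline[sample(nrow(baseline), n_sites), ]

# Singleton subsamples (assumed meaning): one date per site,
# either the youngest or the oldest.
recent_sub <- aggregate(age_bp ~ site, data = baseline, FUN = min)
ancient_sub <- aggregate(age_bp ~ site, data = baseline, FUN = max)

# Bracketed subsample (assumed meaning): youngest and oldest date per site.
bracketed_sub <- rbind(recent_sub, ancient_sub)

# Frequency distributions in 500-year bins for comparison.
breaks <- seq(1000, 10000, by = 500)
fd_baseline <- hist(baseline$age_bp, breaks = breaks, plot = FALSE)$counts
fd_recent <- hist(recent_sub$age_bp, breaks = breaks, plot = FALSE)$counts
```

Comparing `fd_recent` with `fd_baseline` shows how keeping only the youngest date per site shifts the distribution toward the present, which is the kind of distortion the comparison stage is designed to quantify.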
Files:
sampling_bias_in_radiocarbon_dating-[trend]_population_growth-subsample_FDs_vs_theoretical_growth_models.R
Tasks:
- Fits multiple theoretical population growth models (uniform, linear, exponential, no change, growth-then-decline) to the FDs and SPDs derived from subsampled data.
- Evaluates how often the correct underlying population trajectory is accurately identified from subsampled datasets.
- Outputs model-fitting statistics and accuracy metrics to .csv files, facilitating subsequent visualization and interpretation.
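The model-comparison idea in this stage can be sketched as follows. The fitting method is an assumption: the sketch uses Poisson GLMs ranked by AIC as a stand-in, with a quadratic term on the log scale standing in for growth-then-decline. The sample size, growth rate, and bin width are hypothetical, and the linear model is left out for brevity.

```r
set.seed(7)

# Hypothetical subsample of 500 calendar ages (years BP) drawn under
# exponential growth toward the present.
years_bp <- 1001:10000
ages <- sample(years_bp, 500, replace = TRUE,
               prob = exp(-0.0005 * years_bp))

# Frequency distribution in 500-year bins. Time is expressed as negative
# age in kyr so that a positive slope means growth toward the present.
h <- hist(ages, breaks = seq(1000, 10000, by = 500), plot = FALSE)
fd <- data.frame(count = h$counts, t_kyr = -h$mids / 1000)

# Candidate models (assumed forms, fitted as Poisson GLMs with log link).
models <- list(
  constant = glm(count ~ 1, family = poisson, data = fd),
  exponential = glm(count ~ t_kyr, family = poisson, data = fd),
  growth_then_decline = glm(count ~ t_kyr + I(t_kyr^2),
                            family = poisson, data = fd)
)

# Rank by AIC; the lowest value marks the preferred model.
aic <- sapply(models, AIC)
best <- names(which.min(aic))
```

Repeating this over many subsamples and recording how often `best` matches the trajectory used to simulate the baseline would give an accuracy metric of the kind described above.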
Files:
sampling_bias_in_radiocarbon_dating-simulation_study-create_subsamples_vs_baseline_plots.R
sampling_bias_in_radiocarbon_dating-simulation_study-create_plots.R
Tasks:
- Generates comprehensive graphical summaries (e.g., SPD ribbon plots, frequency distribution comparisons) illustrating differences between baseline SPDs and subsamples, as well as model-fitting results.
- Saves figures as high-resolution PNG files in the respective results/.../plots subfolders.
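A ribbon plot of this kind can be sketched with base R graphics. Everything here is illustrative: the baseline curve, the noisy "subsample" curves, the 95% envelope, the file name, and the image dimensions are assumptions, and the output goes to a temporary directory rather than the repository's results folders.

```r
set.seed(3)

# Hypothetical baseline curve on a 100-year grid, normalised to sum to 1.
years_bp <- seq(10000, 1000, by = -100)
baseline <- exp(-0.0004 * years_bp)
baseline <- baseline / sum(baseline)

# 100 hypothetical subsample curves: the baseline plus proportional noise,
# truncated at zero. Rows are time steps, columns are subsamples.
subs <- replicate(
  100,
  pmax(baseline + rnorm(length(baseline), sd = 0.1 * baseline), 0)
)

# Pointwise 95% envelope across the subsample curves.
lower <- apply(subs, 1, quantile, probs = 0.025)
upper <- apply(subs, 1, quantile, probs = 0.975)

# Write a 300 dpi PNG with the envelope as a grey ribbon and the
# baseline as a line; the x axis is reversed so older ages sit on the left.
out_file <- file.path(tempdir(), "baseline_vs_subsample_example.png")
png(out_file, width = 2400, height = 1600, res = 300)
plot(years_bp, baseline, type = "n",
     xlim = rev(range(years_bp)),
     ylim = range(lower, upper, baseline),
     xlab = "Years BP", ylab = "Summed probability")
polygon(c(years_bp, rev(years_bp)), c(lower, rev(upper)),
        col = "grey80", border = NA)
lines(years_bp, baseline, lwd = 2)
dev.off()
```

Stretches where the baseline line leaves the grey ribbon would indicate periods where the subsamples misrepresent the underlying trend.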
- R Version: All analyses were performed using R v4.3.0.
- Simulated and calibrated data: Complete datasets including raw simulations, calibrated data, frequency distributions, and SPDs are archived and accessible via the UTas Research Data Portal within the project titled "RD-SDBIRDS".
Please cite this repository when referencing methods or results presented here:
Wheatley, R. & Brook, B.W. (2024). Sampling Bias in Radiocarbon Dating: Simulation and Analytical Framework. GitHub Repository.
Questions, suggestions, or contributions to this repository are welcome. For further information, please contact:
- Rebecca Wheatley: [email protected]
- Barry Brook: [email protected]