---
title: "Local F14C Surface - Logic Summary"
author: "Stephen R. Durham, PhD"
format:
  html:
    embed-resources: true
    code-fold: true
    toc: true
editor_options:
  chunk_output_type: console
execute:
  echo: false
  warning: false
  error: false
  cache: false
  message: false
---

```{r}
#| echo: false

library(tidyverse)
library(data.table)
library(brms)
library(sf)
library(patchwork)
library(seacarb)
library(openxlsx)
library(plotly)

# knitr::opts_chunk$set(results = 'hide')

#Spatial setup
groy <- st_read(here::here("Oyster_Beds_in_Florida_05-25-2021/Oyster_Beds_in_Florida.shp"), quiet = T)
groy <- st_make_valid(groy)

#Filter reef polygons to GRMAP
rcp <- st_read(here::here("orcp_all_sites/ORCP_Managed_Areas.shp"), quiet = T)
grmap <- subset(rcp, rcp$LONG_NAME == "Guana River Marsh Aquatic Preserve")
grmap <- st_make_valid(grmap)
grmap <- st_transform(grmap, crs = 4326)

groy_grmap <- groy[grmap, , op = st_intersects]

#Filter just to undammed portion of Guana River
gr_poly <- data.table(loc = "Guana River",
                      lat = c(30.02228, 30.02362, 29.98398, 29.98201, 29.98907, 30.01137, 30.02228),
                      lon = c(-81.33295, -81.32386, -81.31378, -81.33014, -81.32796, -81.32957, -81.33295))
gr_coords <- as.matrix(gr_poly[, .(lon, lat)])

gr_poly2 <- st_polygon(list(gr_coords))

gr_poly_sf <- st_sf(
  id = 1,  # Example attribute
  geometry = st_sfc(gr_poly2, crs = 4326) # WGS84 CRS
)

# groy_gr <- groy[gr_poly_sf, , op = st_intersects]


#Filter out Guana Lake stations
gl_poly <- data.table(loc = "Guana Lake",
                      lat = c(30.02228, 30.02376, 30.15770, 30.15280, 30.02216, 30.02228),
                      lon = c(-81.33295, -81.32022, -81.34976, -81.37588, -81.33641, -81.33295))
gl_coords <- as.matrix(gl_poly[, .(lon, lat)])

gl_poly2 <- st_polygon(list(gl_coords))

gl_poly_sf <- st_sf(
  id = 1,  # Example attribute
  geometry = st_sfc(gl_poly2, crs = 4326) # WGS84 CRS
)

#Create sal and temp tables
#Temperature data
temp_con <- fread(here::here("CalCurve/SEACAR Combined Tables/Combined_WQ_WC_NUT_cont_Water_Temperature_NE-2025-Dec-10.txt"), sep = "|", na.strings = c("NA", "NULL", ""))
temp_dis <- fread(here::here("CalCurve/SEACAR Combined Tables/Combined_WQ_WC_NUT_Water_Temperature-2025-Dec-10.txt"), sep = "|", na.strings = c("NA", "NULL", ""))


#Salinity data
sal_con <- fread(here::here("CalCurve/SEACAR Combined Tables/Combined_WQ_WC_NUT_cont_Salinity_NE-2025-Dec-10.txt"), sep = "|", na.strings = c("NA", "NULL", ""))
sal_dis <- fread(here::here("CalCurve/SEACAR Combined Tables/Combined_WQ_WC_NUT_Salinity-2025-Dec-10.txt"), sep = "|", na.strings = c("NA", "NULL", ""))

#Merge salinity and temperature data sources
temp_con[, `:=` (ValueQualifier = as.character(ValueQualifier),
                 ActivityType = as.character(ActivityType),
                 DetectionUnit = as.character(DetectionUnit),
                 SampleFraction = as.character(SampleFraction))]
temp_gr <- bind_rows(temp_con[str_detect(ManagedAreaName, "Guana River Marsh Aquatic Preserve") & Include == 1, ],
                     temp_dis[str_detect(ManagedAreaName, "Guana River Marsh Aquatic Preserve") & Include == 1, ])
sal_con[, `:=` (ValueQualifier = as.character(ValueQualifier),
                ActivityType = as.character(ActivityType),
                DetectionUnit = as.character(DetectionUnit),
                SampleFraction = as.character(SampleFraction))]
sal_gr <- bind_rows(sal_con[str_detect(ManagedAreaName, "Guana River Marsh Aquatic Preserve") & Include == 1, ],
                    sal_dis[str_detect(ManagedAreaName, "Guana River Marsh Aquatic Preserve") & Include == 1, ])

# #Filter to undammed portion of GR
# ts_sf <- st_as_sf(unique(rbind(temp_gr[, .(ProgramLocationID, OriginalLatitude, OriginalLongitude)],
#                                sal_gr[, .(ProgramLocationID, OriginalLatitude, OriginalLongitude)])),
#                   coords = c("OriginalLongitude", "OriginalLatitude"),
#                   crs = 4326)
#
# ts_sf <- ts_sf[gr_poly_sf, , op = st_intersects]
#
# temp_gr <- temp_gr[ProgramLocationID %in% ts_sf$ProgramLocationID, ]
# sal_gr <- sal_gr[ProgramLocationID %in% ts_sf$ProgramLocationID, ]

#Filter out Guana Lake stations
ts_sf <- st_as_sf(unique(rbind(temp_gr[, .(ProgramLocationID, OriginalLatitude, OriginalLongitude)],
                               sal_gr[, .(ProgramLocationID, OriginalLatitude, OriginalLongitude)])),
                  coords = c("OriginalLongitude", "OriginalLatitude"),
                  crs = 4326)

ts_sf <- ts_sf[gl_poly_sf, , op = st_intersects]

temp_gr <- temp_gr[!(ProgramLocationID %in% ts_sf$ProgramLocationID), ]
sal_gr <- sal_gr[!(ProgramLocationID %in% ts_sf$ProgramLocationID), ]


#Merge salinity and temperature data by daily averages
sal_gr[, `:=` (day = yday(SampleDate),
               year = year(SampleDate))]
sal_d <- copy(sal_gr[, .(sal_med = median(ResultValue),
                         sal_q25 = quantile(ResultValue, 0.25),
                         sal_q75 = quantile(ResultValue, 0.75)), by = list(year, day)])

temp_gr[, `:=` (day = yday(SampleDate),
                year = year(SampleDate))]
temp_d <- copy(temp_gr[, .(temp_med = median(ResultValue),
                           temp_q25 = quantile(ResultValue, 0.25),
                           temp_q75 = quantile(ResultValue, 0.75)), by = list(year, day)])

env_d <- merge(sal_d, temp_d, by = c("year", "day"), all = T)

#Response surface regression models from Lowe et al. (2017), Table 3
#Spat (< 25mm): Growth = -26.18 + 2.91 (Temp) - 0.055 (Temp)^2 + 0.012 (Sal)^2
#Seed (25mm <= x <= 75mm): Growth = -19.49 + 1.97 (Temp) + 0.12 (Sal) - 0.036 (Temp)^2
#Sack (> 75mm): Growth = -2.81 + 0.29 (Temp) + 0.22 (Sal) - 0.0074 (Temp)^2 - 0.0068 (Sal)^2

env_d[!is.na(temp_med) & !is.na(sal_med), `:=` (gr_spat = -26.18 + 2.91 * (temp_med) - 0.055 * (temp_med)^2 + 0.012 * (sal_med)^2,
                                                gr_seed = -19.49 + 1.97 * (temp_med) + 0.12 * (sal_med) - 0.036 * (temp_med)^2,
                                                gr_sack = -2.81 + 0.29 * (temp_med) + 0.22 * (sal_med) - 0.0074 * (temp_med)^2 - 0.0068 * (sal_med)^2)]


#Round median sal/temp values and calculate env_freq
env_d[!is.na(sal_med) & !is.na(temp_med), `:=` (sal_medr = round(sal_med),
                                                temp_medr = round(temp_med))]
env_d[!is.na(sal_med) & !is.na(temp_med) & year %in% names(table(env_d$year))[which(table(env_d$year) >= 329)], env_freq := .N, by = list(sal_medr, temp_medr)]

#Only consider years with data from at least 90% of days
env_d_yr90 <- env_d[!is.na(sal_med) & !is.na(temp_med) & year %in% names(table(env_d$year))[which(table(env_d$year) >= 329)] & sal_med > 0, ]

env_d_yr90[, `:=` (sal_medr2 = factor(sal_medr),
                   temp_medr2 = factor(temp_medr),
                   env_freq2 = env_freq/max(env_freq))]

#Weight the growth surface by the frequency of the relevant environmental conditions
env_d_yr90[, `:=` (gr_spat_s = ((gr_spat - min(gr_spat))/diff(range(gr_spat))),
                   gr_seed_s = ((gr_seed - min(gr_seed))/diff(range(gr_seed))),
                   gr_sack_s = ((gr_sack - min(gr_sack))/diff(range(gr_sack))),
                   gr_spat_swt = ((gr_spat - min(gr_spat))/diff(range(gr_spat))) * env_freq2,
                   gr_seed_swt = ((gr_seed - min(gr_seed))/diff(range(gr_seed))) * env_freq2,
                   gr_sack_swt = ((gr_sack - min(gr_sack))/diff(range(gr_sack))) * env_freq2)]

#Lowe et al. 2017 used 30.4 days/month to standardize values to monthly growth
env_d_yr90[, `:=` (gr_spat_d = gr_spat/30.4,
                   gr_seed_d = gr_seed/30.4,
                   gr_sack_d = gr_sack/30.4)]

env_d_r3 <- unique(env_d_yr90[, .(sal_medr2,
                                  temp_medr2,
                                  gr_spat_dtot = round(sum(gr_spat_d), 3),
                                  gr_seed_dtot = round(sum(gr_seed_d), 3),
                                  gr_sack_dtot = round(sum(gr_sack_d), 3)), by = list(sal_medr, temp_medr)])

env_d_r3[, `:=` (gr_spat_dtots = (gr_spat_dtot - min(gr_spat_dtot))/diff(range(gr_spat_dtot)),
                 gr_seed_dtots = (gr_seed_dtot - min(gr_seed_dtot))/diff(range(gr_seed_dtot)),
                 gr_sack_dtots = (gr_sack_dtot - min(gr_sack_dtot))/diff(range(gr_sack_dtot)),
                 gr_spat_dtotp = (gr_spat_dtot + abs(gr_spat_dtot))/sum(gr_spat_dtot + abs(gr_spat_dtot)),
                 gr_seed_dtotp = (gr_seed_dtot + abs(gr_seed_dtot))/sum(gr_seed_dtot + abs(gr_seed_dtot)),
                 gr_sack_dtotp = (gr_sack_dtot + abs(gr_sack_dtot))/sum(gr_sack_dtot + abs(gr_sack_dtot)))]

#Find median by calculating the cumulative sum and estimating where it reaches 50% of total
n_med <- which(cumsum(sort(env_d_r3$gr_sack_dtotp, decreasing = F)) - 0.5 > 0)[1]
med_val <- sort(env_d_r3$gr_sack_dtotp, decreasing = F)[n_med]
n_med_seed <- which(cumsum(sort(env_d_r3$gr_seed_dtotp, decreasing = F)) - 0.5 > 0)[1]
med_val_seed <- sort(env_d_r3$gr_seed_dtotp, decreasing = F)[n_med_seed]
n_med_spat <- which(cumsum(sort(env_d_r3$gr_spat_dtotp, decreasing = F)) - 0.5 > 0)[1]
med_val_spat <- sort(env_d_r3$gr_spat_dtotp, decreasing = F)[n_med_spat]

env_d_r3[, `:=` (gr_sack_dtotp_med = med_val,
                 gr_seed_dtotp_med = med_val_seed,
                 gr_spat_dtotp_med = med_val_spat)]

#Calculate the growth-weighted median of the upper 50% growth conditions in the same way
#(each subset is rescaled to sum to 1; otherwise the 50% crossing falls at the subset's largest values)
up_sack <- sort(env_d_r3[gr_sack_dtotp >= gr_sack_dtotp_med, gr_sack_dtotp], decreasing = F)
n_med_gr50 <- which(cumsum(up_sack)/sum(up_sack) - 0.5 > 0)[1]
med_val_gr50 <- up_sack[n_med_gr50]
up_seed <- sort(env_d_r3[gr_seed_dtotp >= gr_seed_dtotp_med, gr_seed_dtotp], decreasing = F)
n_med_seed_gr50 <- which(cumsum(up_seed)/sum(up_seed) - 0.5 > 0)[1]
med_val_seed_gr50 <- up_seed[n_med_seed_gr50]
up_spat <- sort(env_d_r3[gr_spat_dtotp >= gr_spat_dtotp_med, gr_spat_dtotp], decreasing = F)
n_med_spat_gr50 <- which(cumsum(up_spat)/sum(up_spat) - 0.5 > 0)[1]
med_val_spat_gr50 <- up_spat[n_med_spat_gr50]

env_d_r3[, `:=` (gr_sack_dtotp_med_gr50 = med_val_gr50,
                 gr_seed_dtotp_med_gr50 = med_val_seed_gr50,
                 gr_spat_dtotp_med_gr50 = med_val_spat_gr50)]

#Add seasons to env_d_yr90
maxd <- unique(env_d_yr90[, .(maxd = max(day)), by = year])
env_d_yr90[year %in% maxd[maxd != 366, year], season := fcase(day >= 334 | day < 60, "W",
                                                              day >= 60 & day < 152, "Sp",
                                                              day >= 152 & day < 244, "Su",
                                                              day >= 244 & day < 334, "F")]
env_d_yr90[year %in% maxd[maxd == 366, year], season := fcase(day >= 335 | day < 61, "W",
                                                              day >= 61 & day < 153, "Sp",
                                                              day >= 153 & day < 245, "Su",
                                                              day >= 245 & day < 335, "F")]
env_d_yr90[, season := factor(season, levels = c("W", "Sp", "Su", "F"))]

#Make a variable to identify the days with the correct conditions
saltemps <- unique(env_d_r3[gr_sack_dtotp >= med_val, paste0(sal_medr, "_", temp_medr)])
saltemps_seed <- unique(env_d_r3[gr_seed_dtotp >= med_val_seed, paste0(sal_medr, "_", temp_medr)])
saltemps_spat <- unique(env_d_r3[gr_spat_dtotp >= med_val_spat, paste0(sal_medr, "_", temp_medr)])

env_d_yr90[, saltemp := paste0(sal_medr, "_", temp_medr)]
env_d_yr90[saltemp %in% saltemps, gr50_inc_sack := T]
env_d_yr90[is.na(gr50_inc_sack), gr50_inc_sack := F]
env_d_yr90[saltemp %in% saltemps_seed, gr50_inc_seed := T]
env_d_yr90[is.na(gr50_inc_seed), gr50_inc_seed := F]
env_d_yr90[saltemp %in% saltemps_spat, gr50_inc_spat := T]
env_d_yr90[is.na(gr50_inc_spat), gr50_inc_spat := F]


#Create data.table of WIN variables to investigate carbon cycle
win_files <- list.files(here::here("CalCurve/WIN_data/"), full.names = T)

windat <- lapply(win_files, function(x){
  dat_x <- fread(x, sep = "|", na.strings = c("", "NA", "NULL"))

  dat_x[, hobs_loc := str_sub(x,
                              max(str_locate_all(x, "\\/")[[1]]) + 1,
                              min(str_locate_all(x, "_")[[1]][which(str_locate_all(x, "_")[[1]] > max(str_locate_all(x, "\\/")[[1]]))]) - 1)]

  return(dat_x)

})

windat <- rbindlist(windat, use.names = T, fill = T)
windat <- janitor::clean_names(windat)

#Filter out activity types and QA flags according to standard SEACAR rules
windat <- windat[!(hobs_loc %in% c("LB", "GR")) & activity_type %in% c("Field", "Field Msr/Obs", "Field Replicate", "Sample", "Sample-Composite", "Sample/Field"), ]
windat <- windat[!(value_qualifier %in% c("H", "Y", "J", "V")), ]

#filter for just relevant analytes
windat_as <- unique(windat[str_detect(dep_analyte_name, "Alkalinity|Salinity|Temperature, Water|pH|Carbon- Organic|^Hardness"), ])

# #remove GR stations that are above the dam
# windat_as <- windat_as[str_detect(monitoring_location_name, "DAM NORTH|above the Dam|Guana Lake South", negate = T), ]

#Filter out stations in Guana Lake
windat_sf <- st_as_sf(unique(windat_as[, .(monitoring_location_id, dep_latitude, dep_longitude)]),
                      coords = c("dep_longitude", "dep_latitude"),
                      crs = 4326)

windat_sf <- windat_sf[gl_poly_sf, , op = st_intersects]

windat_as <- windat_as[!(monitoring_location_id %in% windat_sf$monitoring_location_id), ]


#create simple, wide version for plotting
windat_as2 <- pivot_wider(unique(windat_as[, .(result_mn = mean(dep_result_value_number)), by = list(hobs_loc, monitoring_location_name, dep_latitude, dep_longitude, activity_start_date_time, activity_depth, dep_analyte_name, sample_fraction)]), names_from = "dep_analyte_name", values_from = "result_mn") #, activity_type, dep_result_id
setDT(windat_as2)
windat_as2 <- janitor::clean_names(windat_as2)
setnames(windat_as2, "p_h", "ph")

#need to keep only dissolved organic carbon but only total alkalinity and hardness
windat_as3 <- merge(unique(windat_as2[sample_fraction == "" & (!is.na(ph) | !is.na(temperature_water) | !is.na(salinity)), -c("sample_fraction", "carbon_organic", "alkalinity_ca_co3", "hardness_calculated_ca_co3", "hardness_ca_co3")]),
                    unique(windat_as2[sample_fraction == "Total", -c("sample_fraction", "ph", "temperature_water", "salinity", "carbon_organic")]), by = c("hobs_loc", "monitoring_location_name", "dep_latitude", "dep_longitude", "activity_start_date_time", "activity_depth"), all.x = T)
windat_as3 <- merge(windat_as3,
                    windat_as2[sample_fraction == "Dissolved", -c("sample_fraction", "ph", "temperature_water", "salinity", "alkalinity_ca_co3", "hardness_calculated_ca_co3", "hardness_ca_co3")], by = c("hobs_loc", "monitoring_location_name", "dep_latitude", "dep_longitude", "activity_start_date_time", "activity_depth"), all.x = T)

#properly format time and calculate year for plotting and analysis
windat_as3[, activity_start_date_time2 := as_datetime(activity_start_date_time, format = "%m/%d/%Y %H:%M:%S")]
windat_as3[, year := factor(year(activity_start_date_time2))]
windat_as3[, season := factor(fcase(month(activity_start_date_time2) %in% c(12, 1, 2), "W",
                                    month(activity_start_date_time2) %in% c(3, 4, 5), "Sp",
                                    month(activity_start_date_time2) %in% c(6, 7, 8), "Su",
                                    month(activity_start_date_time2) %in% c(9, 10, 11), "F"), levels = c("Sp", "Su", "F", "W"))]

#fix some analyte names
setnames(windat_as3,
         c("temperature_water", "salinity", "carbon_organic", "alkalinity_ca_co3", "hardness_calculated_ca_co3", "hardness_ca_co3"),
         c("temp", "sal", "doc", "alk", "hard_calc", "hard_meas"))

#Calculate an estimated DIC using the "seacarb" package (suggestion from ChatGPT 5.2)
#convert depths to pressure in bar (d2p returns dbar; 1 dbar = 0.1 bar)
windat_as3[, pressure := d2p(depth = activity_depth, lat = dep_latitude) * 0.1]
#calculate seawater density in kg/L (1 kg/m3 is 0.001 kg/L)
windat_as3[, density := rho(S = sal, T = temp, P = pressure) * 0.001]
#convert alk from mg/L to mol/kg (formula from ChatGPT 5.2; 50 mg CaCO₃ = 1 meq, and 1000 meq = 1 mol of charge)
windat_as3[!is.na(alk), alk_mol := alk/(50 * 1000 * density)]
#plug values into the "carb" function (flag 8 means pH and ALK given)
windat_as3[!is.na(sal) & !is.na(temp) & !is.na(alk) & !is.na(ph), dic := carb(flag = 8,
                                                                              var1 = ph,
                                                                              var2 = alk_mol,
                                                                              S = sal,
                                                                              T = temp,
                                                                              P = pressure,
                                                                              Pt = 0,
                                                                              Sit = 0)$DIC, by = .I]

windat_as3[, coords := paste0(dep_latitude, "_", dep_longitude)]
windat_as3[, day := yday(activity_start_date_time2)]


#Load DIC ~ salinity model object (created in HOBS_Notebook_2025_Geochron.Rmd)
# dicsalmod_skew2 <- readRDS(here::here("dicsalmod_skew2.rds"))
dicsalmod_skew4 <- readRDS(here::here("dicsalmod_skew4.rds"))

#Load calibration curve objects
#Extract Marine20 values for years <= 1950
ncm20_dat <- fread(here::here("Summer2023/NewCurve_wMarine20_vals.csv")) #Where are the original Marine20 data?
ncm20_dat[, `:=` (f14c_calc = rice::C14toF14C(`14Cage`, er = `14Cage_sd`)[,1],
                  f14c_calc_sd = rice::C14toF14C(`14Cage`, er = `14Cage_sd`)[,2])]
m20 <- ncm20_dat[!is.na(f14c_calc), .(year = Year_ce, f14c = f14c_calc, f14c_sd = f14c_calc_sd, d14c = d14C, d14c_sd = d14C_sd)]

#Load regional marine bomb-pulse curve
gamtest7 <- readRDS(here::here("gamtest7b_update20260113.rds"))

#Load Hua et al. (2022) and IntCal20 data and create atm_curve object
#Load and prep original Hua atmospheric curve
hua_22 <- read.xlsx(here::here("CalCurve/Hua_etal_2022_supp/s0033822221000953sup002.xlsx"), sheet = "NH zone 2", na.strings = c("", "NA", "NULL"), startRow = 5)
setDT(hua_22)

#Use weighted average formulas to coarsen to an annual time series
hua_22[, Year2 := floor(Year)]

hua_22_yr <- lapply(unique(hua_22$Year2), function(x){
  maxyr <- hua_22[Year2 == x + 1, min(Year)]
  minyr <- hua_22[Year2 == x - 1, max(Year)]
  dat_x <- hua_22[Year2 == x | Year == maxyr | Year == minyr, ]
  mn_yr <- data.table(yr = x,
                      d14c_mn = dat_x[, sum(d14C_mean/d14C_sd^2)/sum(1/d14C_sd^2)],
                      d14c_sd = dat_x[, max(c(sqrt(1/sum(1/d14C_sd^2)),
                                      sqrt((.N * sum((d14C_mean - sum(d14C_mean/d14C_sd^2)/sum(1/d14C_sd^2))^2/d14C_sd^2))/((.N - 1) * sum(1/d14C_sd^2)))))],
                      f14c_mn = dat_x[, sum(F14C_mean/F14C_sd^2)/sum(1/F14C_sd^2)],
                      f14c_sd = dat_x[, max(c(sqrt(1/sum(1/F14C_sd^2)),
                                      sqrt((.N * sum((F14C_mean - sum(F14C_mean/F14C_sd^2)/sum(1/F14C_sd^2))^2/F14C_sd^2))/((.N - 1) * sum(1/F14C_sd^2)))))])
  return(mn_yr)
})

hua_22_yr2 <- rbindlist(hua_22_yr, use.names = T, fill = T)

#Load IntCal20 data
intcal20 <- fread(here::here("CalCurve/intcal20.txt"), sep = ",", skip = 10, na.strings = c("NA", "", "NULL"))
setnames(intcal20, colnames(intcal20), c("year_bp", "14c_age", "14c_age_sd", "d14c", "d14c_sd"))
intcal20[, yr := 1950 - year_bp]
intcal20[, `:=` (f14c_calc = rice::C14toF14C(`14c_age`, er = `14c_age_sd`)[,1],
                 f14c_calc_sd = rice::C14toF14C(`14c_age`, er = `14c_age_sd`)[,2])]

atm_curve <- rbindlist(list(intcal20[, .(yr, d14c, d14c_sd, f14c = f14c_calc, f14c_sd = f14c_calc_sd)],
                            hua_22_yr2[yr > 1950, .(yr, d14c = d14c_mn, d14c_sd, f14c = f14c_mn, f14c_sd)],
                            data.table(yr = seq(2000, 2025))), use.names = T, fill = T)

#Load extension of Hua et al. (2022) atmospheric bomb pulse curve to 2025 (created in HOBS_Notebook_2025_Geochron.Rmd)
atm_extend3 <- readRDS(here::here("Summer2023/atm_extend3.rds"))


```

## Goal

The goal of this analysis was to refine the radiocarbon calibrations for estuarine oyster samples by creating a calibration surface relating F^14^C, salinity, and time. Estuarine settings are inherently variable, and atmospheric and marine F^14^C trends are offset from one another, so using the regional marine curve with a constant offset based on dating a handful of modern, live-caught shells may not accurately reflect the appropriate local correction over time.

## Setting, Data Sources, and Tools

The study this pilot analysis aims to support is focused on the undammed portion of the Guana River, near St. Augustine, Florida (red outline on map below). It is a relatively small, shallow, tidally influenced river, surrounded by marsh, and is part of the Guana River Marsh Aquatic Preserve (GRMAP) and the Guana Tolomato Matanzas National Estuarine Research Reserve. However, to obtain enough of the necessary water quality data, I included stations from across the GRMAP (see map below).

I used atmospheric F^14^C curves from Hua et al. (2022) and IntCal20, the regional marine curve from Durham et al. (2023) updated to 2025, and local salinity, water temperature, total alkalinity, pH, and dissolved organic carbon data from the Florida Department of Environmental Protection's WIN and SEACAR databases. I also used some of these data to derive estimated DIC using the *seacarb* R package (Gattuso et al. 2024) and to estimate seasonal growth potential for three size classes of oysters according to [Lowe et al. (2017)](https://bioone.org/journals/journal-of-shellfish-research/volume-36/issue-3/035.036.0318/Interactive-Effects-of-Water-Temperature-and-Salinity-on-Growth-and/10.2983/035.036.0318.short).

The analysis was guided by a conversation with ChatGPT, which contributed suggestions for analyses and helped design models, write code, and interpret results. It is therefore a priority to vet this workflow with an expert in coastal carbon cycling and environmental geochemistry.

```{r}
#| echo: false

temp <- unique(copy(temp_gr[year %in% env_d_yr90$year, .(ParameterName, ProgramLocationID, OriginalLatitude, OriginalLongitude)]))
sal <- unique(copy(sal_gr[year %in% env_d_yr90$year, .(ParameterName, ProgramLocationID, OriginalLatitude, OriginalLongitude)]))
windats <- unique(copy(windat_as[hobs_loc == "GRMAP", .(dep_analyte_name, monitoring_location_id, dep_latitude, dep_longitude)]))
windats[, dep_analyte_name := fcase(dep_analyte_name == "Alkalinity (CaCO3)", "Total Alkalinity",
                                    dep_analyte_name == "Carbon- Organic", "Organic Carbon",
                                    dep_analyte_name == "Hardness- Calculated (CaCO3)", "Hardness",
                                    dep_analyte_name == "pH", "pH",
                                    dep_analyte_name == "Salinity", "Salinity",
                                    dep_analyte_name == "Temperature, Water", "Water Temperature"), by = dep_analyte_name]
setnames(windats, c("dep_analyte_name", "monitoring_location_id", "dep_latitude", "dep_longitude"), c("ParameterName", "ProgramLocationID", "OriginalLatitude", "OriginalLongitude"))

wqloc <- rbindlist(list(temp, sal, windats), use.names = T, fill = T)

wqloc_sf <- st_as_sf(wqloc, coords = c("OriginalLongitude", "OriginalLatitude"), crs = 4326, remove = F)

wqloc_sf_gr <- wqloc_sf[gr_poly_sf, , op = st_intersects]
wqloc_sf_gr$coords <- paste0(wqloc_sf_gr$OriginalLatitude, "_", wqloc_sf_gr$OriginalLongitude)

# wqloc_sf_grmap <- wqloc_sf[grmap, , op = st_intersects]

```

```{r}
#| echo: false

parcols <- c("firebrick", colorspace::sequential_hcl(palette = "Viridis", n = 7L))

gr_poly_sf$name <- "Study area"

mapview::mapview(gr_poly_sf, zcol = "name", color = parcols[1], col.regions = parcols[1], layer.name = "Study area", alpha.regions = 0, lwd = 2) +
  mapview::mapview(groy_grmap, col.regions = parcols[2], layer.name = "Oyster reef") +
  mapview::mapview(subset(wqloc_sf, wqloc_sf$ParameterName == "Organic Carbon"), col.regions = parcols[4], layer.name = "Org. carbon") +
  mapview::mapview(subset(wqloc_sf, wqloc_sf$ParameterName == "pH"), col.regions = parcols[5], layer.name = "pH") +
  mapview::mapview(subset(wqloc_sf, wqloc_sf$ParameterName == "Salinity"), col.regions = parcols[6], layer.name = "Salinity") +
  mapview::mapview(subset(wqloc_sf, wqloc_sf$ParameterName == "Total Alkalinity"), col.regions = parcols[7], layer.name = "Total alk.") +
  mapview::mapview(subset(wqloc_sf, wqloc_sf$ParameterName == "Water Temperature"), col.regions = parcols[8], layer.name = "Water temp.")

```

## Logic Summary

The initial concept was to use the atmospheric and marine curves as end-members for a salinity x F^14^C mixing surface, because it would essentially provide local calibration curves that would not require ∆R or dead carbon corrections. I expected that the predominant source of freshwater to Guana River would be precipitation and related sheet flow, given that the river is dammed and that northeast Florida is not dominated by spring flow in the way that many Gulf Coast rivers are. However, a simple linear mixing model was not appropriate, because rainwater contains too little DIC relative to seawater to change estuarine F^14^C~DIC~ by much, except at the lowest salinities.
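
To illustrate why a linear salinity mixing line overstates the freshwater influence, here is a minimal sketch comparing volume-weighted and DIC-weighted two-end-member mixing. The function names (`mix_linear`, `mix_f14c`) and every end-member value (`sal_sw`, `dic_fw`, `dic_sw`, `f14c_fw`, `f14c_sw`) are hypothetical placeholders for illustration, not GRMAP measurements or part of the analysis code.

```r
# DIC-weighted two-end-member mixing of F14C along a salinity gradient.
# All end-member values are hypothetical, for illustration only.
mix_f14c <- function(sal, sal_sw = 35, dic_fw = 50, dic_sw = 2000,
                     f14c_fw = 1.00, f14c_sw = 0.95) {
  f_fw <- 1 - sal / sal_sw        # freshwater fraction (rain salinity taken as 0)
  c_fw <- f_fw * dic_fw           # DIC contributed by freshwater (umol/kg)
  c_sw <- (1 - f_fw) * dic_sw     # DIC contributed by seawater (umol/kg)
  (c_fw * f14c_fw + c_sw * f14c_sw) / (c_fw + c_sw)
}

# Simple linear (volume-weighted) mixing for comparison
mix_linear <- function(sal, sal_sw = 35, f14c_fw = 1.00, f14c_sw = 0.95) {
  f_fw <- 1 - sal / sal_sw
  f_fw * f14c_fw + (1 - f_fw) * f14c_sw
}

sal <- c(0, 1, 5, 15, 25, 35)
mix_tab <- data.frame(sal = sal,
                      linear = mix_linear(sal),
                      dic_weighted = mix_f14c(sal))
print(round(mix_tab, 4))
```

With these placeholder values, the DIC-weighted curve stays close to the marine end-member across most of the salinity range and departs from it only near zero salinity, whereas the linear curve changes steadily with salinity.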

Therefore, I wanted to modify the mixing relationship to make it more realistic and also find principled ways to narrow down the portion of the surface that would need to be considered in a radiocarbon calibration. I pursued two:

1)  Lowe et al. (2017) modeled growth for three size classes of oysters in Louisiana as a function of salinity and temperature and found differences in growth patterns among them, so I applied these models to salinity and temperature data from GRMAP to estimate the conditions most favorable for shell growth (i.e., on the assumption that water conditions during those times are more likely to be recorded in the radiocarbon-dated shells); and

2)  ChatGPT suggested that some local carbon dynamics could be inferred from relationships between variables such as salinity, pH, and total alkalinity (including estimating DIC using *seacarb*), so I decided to pursue those comparisons using available GRMAP water quality data in order to improve the mixing surface.
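
As a rough sketch of the DIC estimate mentioned in item 2, the example below converts total alkalinity from mg/L as CaCO~3~ to mol/kg and, if *seacarb* is installed, passes pH and alkalinity to `carb()` with `flag = 8`. The helper name `alk_mg_to_molkg` and all sample values are hypothetical and chosen only for illustration.

```r
# Convert total alkalinity from mg/L as CaCO3 to mol/kg.
# 50 mg CaCO3 = 1 meq; 1000 meq = 1 mol of charge; density in kg/L.
alk_mg_to_molkg <- function(alk_mg_l, density_kg_l) {
  alk_mg_l / (50 * 1000 * density_kg_l)
}

# Hypothetical sample values, for illustration only
alk_mol <- alk_mg_to_molkg(alk_mg_l = 120, density_kg_l = 1.02)
print(alk_mol)

# Estimate DIC from pH and alkalinity if seacarb is available
# (flag = 8 means pH and ALK are given; P is hydrostatic pressure in bar)
if (requireNamespace("seacarb", quietly = TRUE)) {
  dic_est <- seacarb::carb(flag = 8, var1 = 8.0, var2 = alk_mol,
                           S = 30, T = 25, P = 0, Pt = 0, Sit = 0)$DIC
  print(dic_est)  # DIC in mol/kg
}
```

Because the conversion divides by density, the result is per kilogram of water, which is the unit `carb()` expects for alkalinity.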

### Oyster Growth

I began by calculating the daily median salinity and temperature values so that there was one pair of values for each day in the dataset, then used them to calculate daily growth potential values for spat (< 25 mm), seed (25-75 mm), and sack (> 75 mm) oyster size classes using the models from Lowe et al. (2017). Next, I rounded each median to the nearest ppt or deg. C and excluded any years with data from fewer than 90% of days (at least 329 days required) to minimize seasonal bias in the growth potentials.
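
The growth step can be sketched as follows, using the Lowe et al. (2017) Table 3 coefficients as they appear in the analysis code. The function name `lowe_growth` and the toy `daily` table are hypothetical, not real GRMAP data.

```r
# Response-surface growth models from Lowe et al. (2017), Table 3 (monthly growth)
lowe_growth <- function(temp, sal) {
  data.frame(
    temp = temp,
    sal = sal,
    gr_spat = -26.18 + 2.91 * temp - 0.055 * temp^2 + 0.012 * sal^2,
    gr_seed = -19.49 + 1.97 * temp + 0.12 * sal - 0.036 * temp^2,
    gr_sack = -2.81 + 0.29 * temp + 0.22 * sal - 0.0074 * temp^2 - 0.0068 * sal^2
  )
}

# Hypothetical daily median temperature (deg. C) and salinity (ppt)
daily <- data.frame(temp_med = c(15, 22, 28, 30),
                    sal_med = c(20, 26, 33, 36))
growth <- lowe_growth(daily$temp_med, daily$sal_med)

# Convert monthly growth to a daily rate using 30.4 days per month
growth$gr_sack_d <- growth$gr_sack / 30.4
print(round(growth, 3))
```

The models are quadratic response surfaces, so they can return negative growth for unfavorable combinations. The analysis handles this later, when daily totals are converted to proportions.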

This figure shows the relative frequency of each daily salinity x temperature combination in the dataset:

```{r}
#| echo: false
#| out-width: "50%"
#| fig-align: "center"

ggplot(env_d_yr90) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = env_freq2)) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Rel. freq.")

```

However, the most common conditions are not necessarily those most likely to promote oyster growth. Here is the same plot, but showing the relative frequency values weighted by the Lowe et al.-based growth potentials for each size class:

```{r}
#| echo: false
#| fig-align: "center"

gr_spat_swt <- ggplot(env_d_yr90) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_spat_swt/max(gr_spat_swt))) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Growth-weighted freq.", title = "Spat")

gr_seed_swt <- ggplot(env_d_yr90) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_seed_swt/max(gr_seed_swt))) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Growth-weighted freq.", title = "Seed")

gr_sack_swt <- ggplot(env_d_yr90) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_sack_swt/max(gr_sack_swt))) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Growth-weighted freq.", title = "Sack")

gr_spat_swt + gr_seed_swt + gr_sack_swt + plot_layout(guides = "collect", axes = "collect") &
  scale_x_discrete(breaks = levels(env_d_yr90$sal_medr2)[seq(1, length(levels(env_d_yr90$sal_medr2)), by = 5)]) &
  scale_y_discrete(breaks = levels(env_d_yr90$temp_medr2)[seq(1, length(levels(env_d_yr90$temp_medr2)), by = 5)]) &
  theme(legend.position = "bottom",
        legend.justification = "right",
        legend.title.position = "top",
        legend.text = element_text(angle = -45, hjust = 0.1, vjust = 0))

```

Below is the same type of plot, converted to show the proportion of daily growth across the time series. I masked in red all of the temperature and salinity combinations that collectively contributed the lower 50% of total growth potential, and calculated the salinity value that corresponds with the median of the upper-50% total growth potential for each size class (bright turquoise line). There is only a very subtle shift towards growth at lower temperatures and salinities between the spat and seed size classes, but the shift between the seed and sack size classes is stark, with the majority of daily growth expected at 36 ppt, 36 ppt, and 26 ppt for spat, seed, and sack oysters, respectively.

```{r}
#| echo: false
#| fig-align: "center"
#| fig-width: 8
#| fig-height: 8

gr_spat_dtotp <- ggplot(env_d_r3) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_spat_dtotp)) +
  geom_tile(data = env_d_r3[gr_spat_dtotp < gr_spat_dtotp_med, ], aes(x = sal_medr2, y = temp_medr2), fill = "lightcoral", alpha = 0.6) +
  # geom_vline(xintercept = env_d_r3[gr_spat_dtotp == gr_spat_dtotp_med, mean(sal_medr) - 2], color = "dodgerblue4") +
  geom_vline(xintercept = env_d_r3[gr_spat_dtotp == gr_spat_dtotp_med_gr50, mean(sal_medr) - 2], color = "turquoise1", lwd = 1) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Proportion of\ndaily growth", title = "Spat")

gr_seed_dtotp <- ggplot(env_d_r3) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_seed_dtotp)) +
  geom_tile(data = env_d_r3[gr_seed_dtotp < gr_seed_dtotp_med, ], aes(x = sal_medr2, y = temp_medr2), fill = "lightcoral", alpha = 0.6) +
  # geom_vline(xintercept = env_d_r3[gr_seed_dtotp == gr_seed_dtotp_med, mean(sal_medr) - 2], color = "dodgerblue4") +
  geom_vline(xintercept = env_d_r3[gr_seed_dtotp == gr_seed_dtotp_med_gr50, mean(sal_medr) - 2], color = "turquoise1", lwd = 1) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Proportion of\ndaily growth", title = "Seed")

gr_sack_dtotp <- ggplot(env_d_r3) +
  geom_tile(aes(x = sal_medr2, y = temp_medr2, fill = gr_sack_dtotp)) +
  geom_tile(data = env_d_r3[gr_sack_dtotp < gr_sack_dtotp_med, ], aes(x = sal_medr2, y = temp_medr2), fill = "lightcoral", alpha = 0.6) +
  # geom_vline(xintercept = env_d_r3[gr_sack_dtotp == gr_sack_dtotp_med, mean(sal_medr) - 2], color = "dodgerblue4") +
  geom_vline(xintercept = env_d_r3[gr_sack_dtotp == gr_sack_dtotp_med_gr50, mean(sal_medr) - 2], color = "turquoise1", lwd = 1) +
  theme_bw() +
  labs(x = "Median daily salinity (ppt)", y = "Median daily temp. (deg. C)", fill = "Proportion of\ndaily growth", title = "Sack")


gr_spat_dtotp + gr_seed_dtotp + gr_sack_dtotp + plot_layout(guides = "collect", axes = "collect") &
  scale_x_discrete(breaks = levels(env_d_r3$sal_medr2)[seq(1, length(levels(env_d_r3$sal_medr2)), by = 5)]) &
  scale_y_discrete(breaks = levels(env_d_r3$temp_medr2)[seq(1, length(levels(env_d_r3$temp_medr2)), by = 5)]) &
  theme(legend.position = "bottom",
        legend.justification = "right",
        legend.title.position = "top",
        legend.text = element_text(angle = -45, hjust = 0.1, vjust = 0))

# ggplot() +
#   geom_histogram(data = env_d_r3, aes(x = sal_medr), fill = "dodgerblue") +
#   geom_histogram(data = env_d_r3[gr_spat_dtotp >= gr_spat_dtotp_med, ], aes(x = sal_medr), fill = "firebrick", alpha = 0.6) +
#   geom_vline(xintercept = env_d_r3[, median(sal_medr)], color = "dodgerblue4") +
#   geom_vline(xintercept = env_d_r3[gr_spat_dtotp >= gr_spat_dtotp_med, median(sal_medr)], color = "firebrick4")
#
#
# ggplot() +
#   geom_histogram(data = env_d_r3, aes(x = sal_medr), fill = "dodgerblue") +
#   geom_histogram(data = env_d_r3[gr_spat_dtotp >= gr_spat_dtotp_med, ], aes(x = sal_medr), fill = "firebrick", alpha = 0.6) +
#   geom_vline(xintercept = env_d_r3[gr_spat_dtotp == med_val_spat, mean(sal_medr)], color = "dodgerblue4") +
#   geom_vline(xintercept = env_d_r3[gr_spat_dtotp == gr_spat_dtotp_med_gr50, mean(sal_medr)], color = "firebrick4")

```
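
One way to read the "lower 50% of total growth potential" mask is as a cumulative-share cutoff: rank the salinity x temperature cells by their share of total growth, accumulate from the smallest contributor upward, and mask every cell whose running total stays at or below 50%. The sketch below illustrates that reading in base R. The growth values are hypothetical, and this is an assumption about the logic rather than the calculation behind `gr_spat_dtotp_med`.

```r
# Hypothetical growth-potential totals for five salinity x temperature cells
cells <- data.frame(
  sal    = c(20, 25, 30, 35, 36),
  temp   = c(18, 22, 25, 27, 28),
  growth = c(1, 2, 3, 6, 8)
)

# Share of total growth contributed by each cell
cells$prop <- cells$growth / sum(cells$growth)

# Accumulate shares from the smallest contributor upward
ord <- order(cells$prop)
cells$cum_low <- NA_real_
cells$cum_low[ord] <- cumsum(cells$prop[ord])

# Cells that together make up the lower 50% of total growth are masked
cells$masked <- cells$cum_low <= 0.5

# Equivalent threshold on the per-cell share: smallest share left unmasked
prop_cutoff <- min(cells$prop[!cells$masked])
```

With these toy numbers the three smallest cells are masked and the two largest carry 70% of the total growth.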

The plots below show, for the years in which ≥ 90% of days were represented, the number of observations on each day of the year, split by whether their salinity and temperature fell within the "upper 50% growth potential" conditions I just defined. They suggest that, if oyster growth in Guana River follows the models from Lowe et al. (2017), spat and seed oysters grow most in the spring and early summer, while sack oysters grow most in the spring and fall. Total observations per day are relatively similar across the year, except for the first and last days and day \~200.

```{r}
#| echo: false
#| fig-align: "center"
#| fig-width: 8
#| fig-height: 10


gr50_sack <- ggplot(env_d_yr90) +
  geom_histogram(aes(x = day, group = gr50_inc_sack, fill = gr50_inc_sack)) +
  theme_bw() +
  labs(title = "Sack") #+
  # theme(legend.position = "none")

gr50_seed <- ggplot(env_d_yr90) +
  geom_histogram(aes(x = day, group = gr50_inc_seed, fill = gr50_inc_seed)) +
  theme_bw() +
  labs(title = "Seed") +
  theme(legend.position = "none")

gr50_spat <- ggplot(env_d_yr90) +
  geom_histogram(aes(x = day, group = gr50_inc_spat, fill = gr50_inc_spat)) +
  theme_bw() +
  labs(title = "Spat") +
  theme(legend.position = "none")

gr50_spat + gr50_seed + gr50_sack + plot_layout(guides = "collect", axes = "collect") &
  labs(x = "Day of the year", y = "N observations", fill = "Upper 50%\ngrowth\nconditions")

```

### Carbon Dynamics
ChatGPT suggested the following conceptual mixing model for basic estuarine carbon dynamics. The model defines the fractions of fresh and saltwater as:

$$
f_{\mathrm{fw}}(S) = 1 - \frac{S}{S_{\mathrm{sw}}}
$$

and

$$
f_{\mathrm{sw}}(S) = 1 - f_{\mathrm{fw}}(S)
$$
where $S$ is the estuarine salinity and $S_{\mathrm{sw}}$ is full-marine salinity (i.e., 35 ppt). The estuarine DIC concentration, $C_{\mathrm{mix}}$, is then estimated as a combination of the water fractions and the end-member DIC concentrations:
$$
C_{\mathrm{mix}}(S) =
f_{\mathrm{sw}}(S)\,C_{\mathrm{sw}}
+
f_{\mathrm{fw}}(S)\,C_{\mathrm{fw}}
$$
where $C_{\mathrm{sw}}$ and $C_{\mathrm{fw}}$ are the end-member DIC concentration values for full marine and fresh waters, respectively.

The mixture F^14^C is then a function of the DIC concentrations, the water fractions and the end-member F^14^C values at salinity $S$ and time $t$:
$$
F_{\mathrm{mix}}(S,t)
=
\frac{
f_{\mathrm{sw}}(S)\,C_{\mathrm{sw}}(t)\,F_{\mathrm{sw}}(t)
+
f_{\mathrm{fw}}(S)\,C_{\mathrm{fw}}(t)\,F_{\mathrm{fw}}(t)
}{
C_{\mathrm{mix}}(S)
}
$$

where $F_{\mathrm{sw}}(t)$ and $F_{\mathrm{fw}}(t)$ represent the end-member values from the regional marine and atmospheric calibration curves, respectively. In the absence of additional "non-conservative" carbon dynamics, $F_{\mathrm{mix}}(S,t)$ equals the estuarine F^14^C~DIC~. However, several of the live-caught oyster radiocarbon dates suggested that the F^14^C~DIC~ in Guana River was higher than would be expected from this mixing relationship alone, so a deeper investigation of the carbon dynamics was necessary.
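
The conservative mixing equations above translate directly into a few small R functions. The sketch below is illustrative only: the function names are my own, and the end-member DIC and F^14^C values are hypothetical placeholders rather than values from the calibration curves or the GRMAP data.

```r
# Conservative two-end-member mixing, following the equations above.
# Function names and all numeric end-member values are hypothetical.
f_fw <- function(S, S_sw = 35) 1 - S / S_sw          # freshwater fraction
f_sw <- function(S, S_sw = 35) 1 - f_fw(S, S_sw)     # seawater fraction

# DIC concentration of the mixture
c_mix <- function(S, C_sw, C_fw, S_sw = 35) {
  f_sw(S, S_sw) * C_sw + f_fw(S, S_sw) * C_fw
}

# DIC-weighted F14C of the mixture
f14c_mix <- function(S, C_sw, C_fw, F_sw, F_fw, S_sw = 35) {
  (f_sw(S, S_sw) * C_sw * F_sw + f_fw(S, S_sw) * C_fw * F_fw) /
    c_mix(S, C_sw, C_fw, S_sw)
}

# Hypothetical end-members: DIC in mol/kg, F14C unitless
C_sw <- 0.0021; C_fw <- 0.0005
F_sw <- 1.00;   F_fw <- 1.05

mix_vals <- f14c_mix(S = c(0, 17.5, 35), C_sw, C_fw, F_sw, F_fw)
```

Because the weighting is by DIC concentration, a DIC-poor freshwater end-member pulls the mixture toward the marine F^14^C faster than salinity alone would suggest. Note that the sketch does not clamp salinities above $S_{\mathrm{sw}}$, for which the freshwater fraction becomes negative.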

ChatGPT 5.2 made the point that the relationship between total alkalinity and salinity of a water body can provide insight into the DIC concentration in the freshwater sources. It also summarized several processes that can lead F^14^C~DIC~ in an estuary to be higher than the marine F^14^C~DIC~:

1.  Rapid air–water CO₂ exchange, especially in circumstances where the estuarine water has a long residence time, is shallow and well-mixed, and experiences frequent gas exchange from wind, tides, and warming (it said shifts of +0.01 to +0.03 F^14^C relative to offshore marine DIC are plausible).

2.  Respiration of modern organic carbon (especially marsh & seagrass): It explained that if respiration converts recently fixed organic carbon (marsh plants, phytoplankton, SAV) into CO₂, that CO₂ is modern in F^14^C and adds directly to the DIC pool, which can raise or maintain a high F^14^C. Respiration of old carbon (e.g., from peat, soil OM, carbonate) can likewise lower F^14^C, but it expected marsh primary production to be predominantly rapid and modern. It said shifts in F^14^C could be about +0.005 to +0.02, with stronger shifts in warm months.

3.  Photosynthesis + atmospheric equilibration (indirect effect): It explained that photosynthesis itself doesn’t fractionate radiocarbon in a way that changes F^14^C much, but it lowers CO₂(aq), which enhances CO₂ invasion from the atmosphere, thereby pulling in radiocarbon-modern CO₂. Therefore, it acts as a feedback amplifier of #1 above, especially in high productivity systems.

4.  Runoff picking up soil CO₂ (often modern, not old): Soil CO₂ from root respiration and microbial activity is usually modern and equilibrates with soil water as DIC before entering surface waters. In sandy, well-drained coastal plain soils (like much of NE Florida), soil CO₂ residence times are short and carbon is dominantly post-bomb, so runoff and shallow porewater can carry much higher DIC than rainwater, and still with a modern F^14^C. This process behaves very differently from karst groundwater. Its magnitude is highly variable, but can be potentially comparable to air–sea exchange when runoff pulses are large.

To help interpret these possible dynamics, it suggested I could plot DIC vs. salinity, pH vs. salinity, pH vs. total alkalinity, dissolved organic carbon/total organic carbon (preferably DOC) vs. salinity, pCO~2~ vs. salinity, and/or DIC:TA ratio vs. salinity. It recommended I prioritize pH vs. salinity, DOC vs. salinity, and DIC vs. salinity.

I produced the following plots. Note that I did not produce a pCO~2~ plot because there is a warning in the documentation for seacarb that "pCO~2~ estimates below 100 m are subject to considerable uncertainty". I have also circled the sample points that come from within the study area itself (red polygon on the locality map above) for easy reference.

```{r}
#| echo: false
#| fig-width: 8
#| fig-height: 15

sal_alk <- ggplot(windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(alk)], aes(x = sal, y = alk)) +
            geom_point(aes(color = year, shape = season)) +
            geom_point(data = windat_as3[coords %in% wqloc_sf_gr$coords & !is.na(sal) & !is.na(alk)], shape = 1, color = "black", size = 4) + #These are the points actually within the Guana River below the dam; all points reflect the GRMAP boundary
            geom_smooth(method = "lm") +
            theme_bw() +
            labs(x = "Salinity", y = "Total Alkalinity", color = "Year") +
            facet_wrap(~season, nrow = 1)

ph_sal <- ggplot(windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(ph)], aes(x = ph, y = sal)) +
            geom_point(aes(color = year, shape = season)) +
            geom_point(data = windat_as3[coords %in% wqloc_sf_gr$coords & !is.na(sal) & !is.na(ph)], shape = 1, color = "black", size = 4) + #These are the points actually within the Guana River below the dam; all points reflect the GRMAP boundary
            geom_smooth(method = "lm") +
            theme_bw() +
            labs(x = "pH", y = "Salinity", color = "Year") +
            facet_wrap(~season, nrow = 1)

ph_alk <- ggplot(windat_as3[hobs_loc == "GRMAP" & !is.na(alk) & !is.na(ph)], aes(x = ph, y = alk)) +
            geom_point(aes(color = year, shape = season)) +
            geom_point(data = windat_as3[coords %in% wqloc_sf_gr$coords & !is.na(alk) & !is.na(ph)], shape = 1, color = "black", size = 4) + #These are the points actually within the Guana River below the dam; all points reflect the GRMAP boundary
            geom_smooth(method = "lm") +
            theme_bw() +
            labs(x = "pH", y = "Alkalinity", color = "Year") +
            facet_wrap(~season, nrow = 1)

doc_sal <- ggplot(windat_as3[hobs_loc == "GRMAP" & !is.na(doc) & !is.na(sal)], aes(x = doc, y = sal)) +
            geom_point(aes(color = year, shape = season)) +
            geom_point(data = windat_as3[coords %in% wqloc_sf_gr$coords & !is.na(doc) & !is.na(sal)], shape = 1, color = "black", size = 4) + #These are the points actually within the Guana River below the dam; all points reflect the GRMAP boundary
            geom_smooth(method = "lm") +
            theme_bw() +
            labs(x = "Dissolved Organic Carbon", y = "Salinity", color = "Year") +
            facet_wrap(~season, nrow = 1)

dic_sal <- ggplot(windat_as3[hobs_loc == "GRMAP" & !is.na(dic) & !is.na(sal)], aes(x = dic, y = sal)) +
            geom_point(aes(color = year, shape = season)) +
            geom_point(data = windat_as3[coords %in% wqloc_sf_gr$coords & !is.na(dic) & !is.na(sal)], shape = 1, color = "black", size = 4) + #These are the points actually within the Guana River below the dam; all points reflect the GRMAP boundary
            geom_smooth(method = "lm") +
            theme_bw() +
            labs(x = "Est. Dissolved Inorganic Carbon (mol/kg)", y = "Salinity", color = "Year") +
            facet_wrap(~season, nrow = 1)

sal_alk / ph_sal / ph_alk / doc_sal / dic_sal + plot_layout(guides = 'collect') + plot_annotation(title = "Guana River Marsh AP", subtitle = "Circled data points come from stations within the undammed portion of the Guana River.")

```

```{r}

#Try updating dicsalmod_skew2 model with data from the same area as the salinity/temp data
#ChatGPT recommended some priors to help the model run better, and also recommended ditching the correlation component of the random season effect.
mu_dic <- windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), mean(dic)]
sd_dic <- windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), sd(dic)]

pri <- c(
  # Mean model
  set_prior(paste0("normal(", mu_dic, ", ", 2*sd_dic, ")"), class = "Intercept"),
  set_prior(paste0("normal(0, ", 2*sd_dic, ")"), class = "b"),  # incl scalesal

  # Group-level SDs (half-normal because sd params are constrained >0)
  set_prior(paste0("normal(0, ", sd_dic, ")"), class = "sd", group = "season"),
  set_prior(paste0("normal(0, ", sd_dic, ")"), class = "sd", group = "year"),

  # Sigma model: by default brms models positive distributional params (like sigma) on the log scale,
  # so these priors apply to the linear predictor for log(sigma).
  set_prior(paste0("normal(", log(sd_dic), ", 0.5)"), class = "Intercept", dpar = "sigma"),
  set_prior("normal(0, 0.5)", class = "b", dpar = "sigma"),

  # Skew-normal shape alpha (alpha=0 is Gaussian); regularize to avoid extreme skew.
  set_prior("normal(0, 2)", class = "alpha")
)

# dicsalmod_skew3_ppc <- brm(bf(dic ~ scale(sal) + (1 + scale(sal) || season) + (1 | year),
#                           sigma ~ season),
#                       family = skew_normal(),
#                       prior = pri,
#                       data = droplevels(windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), ]),
#                       cores = 4,
#                       seed = 456,
#                       chains = 4,
#                       iter = 5000,
#                       warmup = 1500,
#                       control = list(adapt_delta = 0.95,
#                                      max_treedepth = 15),
#                       backend = "cmdstanr",
#                       threads = threading(2),
#                       sample_prior = "only")
#
# dicsalmod_skew3 <- brm(bf(dic ~ scale(sal) + (1 + scale(sal) || season) + (1 | year),
#                           sigma ~ season),
#                       family = skew_normal(),
#                       prior = pri,
#                       data = droplevels(windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), ]),
#                       cores = 4,
#                       seed = 456,
#                       chains = 4,
#                       iter = 5000,
#                       warmup = 1500,
#                       control = list(adapt_delta = 0.95,
#                                      max_treedepth = 15),
#                       backend = "cmdstanr",
#                       threads = threading(2),
#                       file = here::here("dicsalmod_skew3.rds"))
# dicsalmod_skew3 <- add_criterion(dicsalmod_skew3, "loo")
#
# #Recheck that skew_normal outperforms gaussian and student distributions
# dicsalmod_gauss <- update(dicsalmod_skew3,
#                           family = gaussian(),
#                           cores = 4,
#                           seed = 456,
#                           chains = 4,
#                           iter = 5000,
#                           warmup = 1500,
#                           control = list(adapt_delta = 0.95,
#                                          max_treedepth = 15),
#                           backend = "cmdstanr",
#                           threads = threading(2),
#                           file = here::here("dicsalmod_gauss.rds"))
# dicsalmod_gauss <- add_criterion(dicsalmod_gauss, "loo", moment_match = T)
#
# dicsalmod_stud <- update(dicsalmod_skew3,
#                           family = student(),
#                           cores = 4,
#                           seed = 456,
#                           chains = 4,
#                           iter = 5000,
#                           warmup = 1500,
#                           control = list(adapt_delta = 0.95,
#                                          max_treedepth = 15),
#                           backend = "cmdstanr",
#                           threads = threading(2),
#                           file = here::here("dicsalmod_stud.rds"))
# dicsalmod_stud <- add_criterion(dicsalmod_stud, "loo")
#
# loo_compare(dicsalmod_skew3, dicsalmod_gauss, dicsalmod_stud)

#Try updating dicsalmod_skew3 model with adding the year component to sigma
dicsalmod_skew4 <- brm(bf(dic ~ scale(sal) + (1 + scale(sal) || season) + (1 | year),
                          sigma ~ season + (1 | year)),
                      family = skew_normal(),
                      prior = pri,
                      data = droplevels(windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), ]),
                      cores = 4,
                      seed = 456,
                      chains = 4,
                      iter = 5000,
                      warmup = 1500,
                      control = list(adapt_delta = 0.95,
                                     max_treedepth = 15),
                      backend = "cmdstanr",
                      threads = threading(2),
                      silent = 2,
                      file = here::here("dicsalmod_skew4.rds"))
# dicsalmod_skew4 <- add_criterion(dicsalmod_skew4, "loo", moment_match = T)

# loo_compare(dicsalmod_skew3, dicsalmod_gauss, dicsalmod_stud, dicsalmod_skew4)

```

The following are the carbon-system implications of these plots, according to ChatGPT:

1.  Total alkalinity vs. salinity shows a tight, linear TA–S relationship across years and seasons, with little scatter orthogonal to salinity and no “high-TA at low-S” tail. This pattern suggests that total alkalinity is behaving quasi-conservatively. There is no persistent, total alkalinity-rich groundwater or carbonate dissolution source dominating the estuary, consistent with precipitation and shallow soil water as the primary freshwater sources, not the Floridan aquifer. This result essentially rules out bulk carbonate dissolution as the primary cause of any F^14^C offset.

2.  Salinity vs pH indicates pH increases modestly with salinity, with considerable vertical scatter at a given salinity, but also some visible seasonal structure. Given that lower-salinity waters tend to be more CO₂-rich (lower pH), this result is consistent with respiration of terrestrial/marsh organic matter and porewater exchange, with limited buffering from alkalinity at lower salinities. The scatter implies that biology and gas exchange matter, not just mixing, which is the type of regime where small additions of respired CO₂ (with little total alkalinity signal) can measurably affect DIC isotopes.

3.  Total alkalinity vs. pH shows that total alkalinity is nearly independent of pH (high and low pH occur across a similar total alkalinity range) especially in spring and fall, but the relationship strengthens a bit in summer and winter (with opposite signs). This result suggests that changes in pH are driven primarily by CO₂ addition/removal, not by alkalinity inputs, which points to respiration, photosynthesis, and air–sea exchange as important dynamics more than carbonate chemistry shifts. This result is important because it is a combination of factors that can alter DIC F^14^C without leaving a strong total alkalinity fingerprint. The seasonal differences suggest that in the summer, strong respiration adds CO~2~, lowering pH and shifting carbonate speciation, but without adding alkalinity, while in winter there is low respiration, meaning photosynthesis and air-sea exchange dominate, so mixing with higher-alkalinity marine water coincides with higher pH (i.e., the system behaves more like a diluted coastal ocean). Overall, the opposite seasonal slopes indicate that total alkalinity is not driving pH; CO₂ processes are, meaning that carbon is cycled in ways that strongly affect DIC isotopes, but without large, obvious shifts in TA or salinity. This situation requires caution with regard to ΔR interpretations because it can lead to small F¹⁴C offsets arising without obvious hydrographic signals.

4.  DOC vs. salinity shows a strong inverse relationship, with DOC increasing sharply at lower salinities, as well as some seasonal structure. This result suggests that precipitation-driven inflow is DOC-rich. That DOC is almost certainly modern (marsh plants, recent soils), meaning its respiration will generate DIC with modern or near-modern F^14^C, but respiration CO₂ adds almost no alkalinity and only modest DIC concentrations at any instant. Therefore, the plot supports a picture where carbon cycling is intense, but bulk DIC remains marine-dominated.

5.  Salinity vs estimated DIC shows that DIC increases with salinity. Even at low salinity (\~15–20), estimated DIC is still \~0.0016–0.0018 mol/kg (1600–1800 µM), while at high salinity, DIC is \~0.0023–0.0024 mol/kg (2300–2400 µM). This result implies that the estuary’s DIC pool is always large and marine-like in magnitude, without large amounts of DIC contribution from freshwater, meaning that conservative freshwater mixing cannot explain large F^14^C shifts. In addition, DIC tracks salinity more tightly in spring and fall than in winter or summer, suggesting that spring and fall are more "mixing-dominated" seasons, meaning the performance of salinity as a proxy for carbon chemistry is better during those times. The reasons could include higher freshwater input, increasing tidal exchange, lower temperatures leading to slower metabolism and lower biological CO~2~ production/removal, and/or stronger winds and flushing, in contrast to summer conditions where temperatures are highest, and strong respiration and photosynthesis (and consequently DOC-rich inflow that fuels microbial CO~2~ production) can lead to DIC becoming decoupled from salinity (i.e., salinity still tracks water, but not carbon chemistry as cleanly). The radiocarbon ΔR implications are that shell carbonate formed in spring and fall is more likely to reflect a “marine-anchored” DIC pool, with smaller and more predictable ΔR offsets tied to mixing, while summer shell growth is where non-conservative processes are likely to dominate, increasing ΔR variability.

Altogether, ChatGPT explained that the plots rule out large carbonate-rich groundwater inputs, freshwater DIC dominating the estuarine DIC pool, and/or salinity-controlled F^14^C mixing as the main driver(s) of ΔR variability. Instead, they support a marine-dominated DIC reservoir (\~2 mM), active organic carbon cycling (high DOC at low salinity, pH variability), non-conservative CO₂ addition/removal with little total alkalinity signal, and the need for only very small (µM-scale) additions of isotopically distinct DIC to shift F^14^C. Seasonally, they suggest that spring and fall are characterized by conservative DIC \~ salinity relationships with a lower biological overprint, meaning ΔR should be smaller, with less variability, and closer to the regional marine reference values. In contrast, summer shows some decoupling of DIC from salinity (e.g., due to high DOC respiration, strong porewater exchange, and/or small additions of isotopically distinct CO~2~), which implies ΔR should be more variable, with little salinity correlation, and potentially biased older. The winter plots suggest lower metabolism and strong physical control, such that total alkalinity and pH move together, leading to a ΔR that is likely more stable but that may also reflect longer residence times.

## Final Conceptual Mixing Model

Because the analyses so far suggested that the Guana River carbon dynamics are a combination of "conservative" freshwater-marine mixing behavior and some additional "non-conservative" process that adds/removes DIC, ChatGPT recommended using a two-component mixing formula that estimates F^14^C for a given salinity and year by combining the F^14^C expected from conservative mixing (i.e., as a function of salinity) with a non-conservative DIC term:

$$
F_{\mathrm{est}}(S,t)
=
\frac{
C_{\mathrm{mix}}(S)\,F_{\mathrm{mix}}(S,t)
+
\Delta C_{+}\,F_{\mathrm{fw}}(t)
}{
C_{\mathrm{mix}}(S) + \Delta C_{+}
}
$$

where the additional term $\Delta C_{+}$ is a non-conservative DIC addition/removal term (µM), varying with season, residence time, DOC, temperature, wind, etc., whose F^14^C is taken to be the freshwater end-member F^14^C, $F_{\mathrm{fw}}(t)$.
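
As a minimal sketch of this formula, the function below applies the non-conservative adjustment to a conservative-mixing result. The function name and the numeric inputs are hypothetical, and $\Delta C_{+}$ must be expressed in the same units as $C_{\mathrm{mix}}$ (here mol/kg) for the weighting to be valid.

```r
# Non-conservative adjustment to a conservative-mixing F14C value.
# C_mix and F_mix come from the conservative mixing step; delta_C is the
# extra DIC (same units as C_mix), assumed to carry the freshwater F14C.
f14c_est <- function(C_mix, F_mix, delta_C, F_fw) {
  (C_mix * F_mix + delta_C * F_fw) / (C_mix + delta_C)
}

# Hypothetical values: a 2 mmol/kg DIC pool with 20 umol/kg of added DIC
est_val <- f14c_est(C_mix = 0.0020, F_mix = 1.00, delta_C = 0.00002, F_fw = 1.05)

# With no non-conservative addition, the estimate reduces to F_mix
est_none <- f14c_est(C_mix = 0.0020, F_mix = 1.00, delta_C = 0, F_fw = 1.05)
```

Setting `delta_C = 0` recovers the conservative-mixing value, which is a quick check that the two formulas are consistent with each other.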

ChatGPT explained that $\Delta C_{+}$ could represent either sediment/porewater CO₂ addition (often low-F¹⁴C) or air–sea CO₂ exchange (often high-F¹⁴C, but sign can flip). In the former case, DIC is added with little or no clear alkalinity signal (especially if mostly CO~2~ from respiration) and it can be slightly old, even without obvious spring discharge, pushing F^14^C down and ∆R up. The latter could push F^14^C higher if there is a net invasion (i.e., modern CO~2~ is added), but net evasion may also occur (i.e., removal of CO~2~), which is also able to change DIC without much change to alkalinity and could push F^14^C up or down depending on which carbon species are exchanged and the direction. The pH/total alkalinity/DOC seasonal plots suggest that both may occur in Guana River, with their importance changing seasonally. It suggested I fit a DIC \~ salinity model and plot the resulting residual ∆DIC against DOC and pH by season to check for correlations. If residuals correlate with DOC (and/or with low pH), it would suggest that $\Delta C_{+}$ is driven by respiration/porewater CO~2~, indicating that the non-conservative term is real and quantifiable.
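
The suggested residual check can be illustrated with an ordinary least-squares fit. This is a simplified stand-in, not the Bayesian model developed below: the data frame, its values, and the assumed DOC-related excess in the toy DIC are all hypothetical.

```r
# Hypothetical observations; in practice these would be the GRMAP records.
wq <- data.frame(
  sal    = c(15, 18, 22, 25, 28, 30, 33, 35),
  doc    = c(9, 4, 8, 3, 6, 2, 5, 1),
  season = rep(c("Sp", "Su"), each = 4)
)

# Toy DIC: linear in salinity plus a DOC-related excess (an assumed structure)
wq$dic <- 0.0012 + 0.00003 * wq$sal + 0.00001 * wq$doc

# Conservative-mixing proxy: DIC as a linear function of salinity
fit <- lm(dic ~ sal, data = wq)
wq$delta_dic <- resid(fit)

# Correlation between residual DIC and DOC within each season
cor_by_season <- sapply(split(wq, wq$season),
                        function(d) cor(d$delta_dic, d$doc))
```

Because the toy DIC was built with a DOC-related excess, the residuals correlate positively with DOC in both seasons. With real data, the sign and strength of that correlation are what would indicate whether respiration-derived CO~2~ is a plausible source of the non-conservative term.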

Therefore, I developed the following model of the relationship between salinity and estimated DIC (with assistance from ChatGPT) using a skew-normal likelihood:

$$
\mathrm{DIC}_i \sim \operatorname{SkewNormal}(\mu_i,\sigma_i,\alpha)
$$

where $\mu_i$ and $\sigma_i$ represent the mean and standard deviation for observation $i$ = 1,...,$n$, and $\alpha$ is the skew parameter. The mean structure of the model was

$$
\mu_i
=
\beta_0
+
\beta_1 z_i
+
u_{0,\mathrm{season}[i]}
+
u_{1,\mathrm{season}[i]} z_i
+
v_{\mathrm{year}[i]}
$$

where $z_i$ is the standardized salinity for observation $i$,

$$
z_i = \frac{\mathrm{sal}_i - \bar{\mathrm{sal}}}{s_{\mathrm{sal}}},
$$

$\beta_0$ and $\beta_1$ are the overall intercept and salinity slope, $u_{0,\mathrm{season}[i]}$ and $u_{1,\mathrm{season}[i]}$ are season-specific deviations in intercept and slope, and $v_{\mathrm{year}[i]}$ is a year-specific intercept deviation. Residual variation was allowed to differ among seasons and years according to

$$
\log(\sigma_i)
=
\gamma_0
+
\gamma_{\mathrm{season}[i]}
+
w_{\mathrm{year}[i]},
$$

where $\gamma_0$ is the baseline log-scale residual standard deviation, $\gamma_{\mathrm{season}[i]}$ is the season-specific deviation, and $w_{\mathrm{year}[i]}$ is a year-specific deviation. The season-specific varying intercepts and slopes were modeled as draws from a common bivariate normal distribution with no correlation between intercepts and slopes (i.e., a diagonal covariance matrix), and the year-specific deviations as draws from common normal distributions:

$$
\begin{pmatrix}
u_{0,s} \\
u_{1,s}
\end{pmatrix}
\sim
\mathcal{N}
\left(
\begin{pmatrix}
0\\
0
\end{pmatrix},
\Sigma_{\mathrm{season}}
\right),
\qquad
\Sigma_{\mathrm{season}}
=
\begin{pmatrix}
\tau^2_{0} & 0 \\
0 & \tau^2_{1}
\end{pmatrix},
$$

$$
v_y \sim \mathcal{N}(0,\tau^2_{\mathrm{year}}),
\qquad
w_y \sim \mathcal{N}(0,\tau^2_{\sigma,\mathrm{year}}).
$$

Next, I calculated the positive residuals based on the expected values of the posterior distribution. The focus on positive residuals is because ∆DIC reflects DIC concentration, and processes which remove DIC from the estuary (i.e., producing negative residuals) cannot bias F^14^C. Only additions of isotopically distinct DIC can result in biases to F^14^C~shell~. Thus, conceptually, this plot represents the amounts of extra DIC the system adds beyond conservative mixing, when it adds any at all.

```{r}
#| echo: false
#| fig-align: "center"
#| fig-width: 8
#| fig-height: 8

#Get draws of the expected values of the posterior
epreddraws <- posterior_epred(dicsalmod_skew4, re_formula = NULL) #dicsalmod_skew2

#Create a vector of original calculated DIC values
yobs <- windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), dic]

#Subtract each expected posterior draw from the corresponding actual observation-based calculated value
residdraws <- sweep(epreddraws, 2, yobs, FUN = "-") * -1 # equivalent to: resid = y_obs - mu

#Make it a data.table and add season variable
residdraws2 <- as.data.table(residdraws)
residdraws2 <- melt(residdraws2, measure.vars = colnames(residdraws2))
residdraws2[, `:=` (rowid = seq(1, nrow(residdraws2)),
                    obs_row = as.integer(str_sub(variable, 2, -1)))]
obs_seasons <- windat_as3[hobs_loc == "GRMAP" & !is.na(sal) & !is.na(dic), .(year, season, doc, alk, alk_mol), by = .I]
obs_seasons[, obs_row := seq(1, nrow(obs_seasons))]
residdraws3 <- merge(residdraws2, obs_seasons, by = "obs_row", all.x = T)

#Extract the positive delta-DIC residual values and summarize
posresid <- copy(residdraws3[value > 0, ])
posresid_sum <- posresid[, .(med = median(value),
                             q90 = quantile(value, 0.9),
                             q95 = quantile(value, 0.95)), by = season]
posresid_sum[, season2 := factor(season, levels = c("Sp", "Su", "F", "W"))]

#Plot them
ggplot(posresid_sum) +
  geom_line(aes(x = season2, y = med, group = 1, color = "med")) +
  geom_point(aes(x = season2, y = med, color = "med")) +
  geom_line(aes(x = season2, y = q90, group = 1, color = "q90")) +
  geom_point(aes(x = season2, y = q90, color = "q90")) +
  geom_line(aes(x = season2, y = q95, group = 1, color = "q95")) +
  geom_point(aes(x = season2, y = q95, color = "q95")) +
  theme_bw() +
  labs(x = "season", y = "∆DIC | ∆DIC > 0", color = "metric")

```

The plot illustrates seasonal differences in how variable the carbon system is in Guana River. ChatGPT contextualized these differences as follows:

1.  Spring: strong conservative mixing, increasing but still moderate biological activity, and shorter residence times lead to a tighter DIC \~ salinity relationship and the lowest residual variance.

2.  Summer: carbon cycling is strong but relatively steady because of consistently high levels of respiration and DOC processing, which leads to a modest increase in variance in the DIC \~ salinity relationship.

3.  Fall: cooling temperatures, episodic storms, shifting residence times, and a breakdown of summer stratification lead to a mixture of conservative and non-conservative regimes in carbon cycling and, consequently, the highest seasonal variability.

4.  Winter: metabolism is lower on average, but large physical variability (e.g., cold fronts, wind-driven exchange, and strong episodic flushing vs. stagnation) keeps variance in the DIC \~ salinity relationship high without consistently high means.

The values on the y-axis are in mol/kg. According to ChatGPT, they are reasonable magnitudes for porewater exchange, sediment respiration, DOC remineralization, and weak carbonate dissolution buffered by rapid re-equilibration. It also noted that the values are within the range that can leave the total alkalinity \~ salinity relationship looking conservative and produce only modest DIC scatter, while still strongly affecting F^14^C.

Next, I plotted the residuals (∆DIC) against total alkalinity and DOC for each season:

```{r}
#| echo: false
#| fig-align: "center"
#| fig-width: 8
#| fig-height: 8

resid_sum2 <- residdraws3[!is.na(value) & !is.na(doc) & !is.na(alk) & !is.na(alk_mol),
                          .(resid_med = median(value),
                            resid_q90 = quantile(value, 0.9),
                            resid_q95 = quantile(value, 0.95),
                            doc_med = median(doc),
                            alk_med = median(alk),
                            alk_mol_med = median(alk_mol)), by = list(year, season)]

resid_sum2[, season2 := factor(season, levels = c("Sp", "Su", "F", "W"))]

dic_alk <- ggplot(resid_sum2) +
              geom_point(aes(x = resid_med, y = alk_med), color = "dodgerblue") +
              geom_smooth(aes(x = resid_med, y = alk_med), method = "lm") +
              theme_bw() +
              labs(x = "Median ∆DIC", y = "Median TA") +
              facet_wrap(~season2, ncol = 1)

dic_doc <- ggplot(resid_sum2) +
              geom_point(aes(x = resid_med, y = doc_med), color = "firebrick") +
              geom_smooth(aes(x = resid_med, y = doc_med), method = "lm") +
              theme_bw() +
              labs(x = "Median ∆DIC", y = "Median DOC") +
              facet_wrap(~season2, ncol = 1)

dic_alk + dic_doc + plot_layout(guides = "collect")


```

ChatGPT suggested that:

1.  Seeing some relationship between ∆DIC and total alkalinity indicates that residual DIC is linked to processes that affect the carbonate system, not just to organic matter supply. Examples include carbonate dissolution/re-equilibration in sediments, porewater exchange involving bicarbonate, alkalinity-generating anaerobic processes like sulfate reduction or denitrification, and shifts in buffering capacity that modulate how CO₂ addition/removal manifests as DIC. Such processes can change DIC and total alkalinity locally while still averaging out to a conservative total alkalinity \~ salinity relationship at the estuary scale.

2.  The lack of a relationship between ∆DIC and DOC is not surprising, given that DOC controls substrate availability for respiration rather than the instantaneous DIC anomaly measured (i.e., DOC is a stock while ∆DIC is a flux outcome). High DOC therefore does not guarantee high respiration at that moment; DOC only sets the potential for DIC production, not the realized residual at any given time. DOC delivered during storms may be respired days or weeks later, and possibly elsewhere, as it can be exported, buried, photodegraded, or taken up by microbes prior to respiration. Thus, the absence of a simple DOC \~ ∆DIC relationship strengthens the evidence that residual DIC reflects integrated carbonate-system dynamics, not just organic loading.
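
One way to put numbers on the visual impressions from the faceted plots would be a per-season rank correlation between the median residual and each covariate. The sketch below is only an illustration: `toy` is a hypothetical stand-in for `resid_sum2`, filled with simulated values in which a TA relationship is deliberately built in and a DOC relationship is not, so its output says nothing about the Guana River data.

```r
library(data.table)

# Hypothetical stand-in for resid_sum2; every value is simulated, not project data
set.seed(1)
toy <- data.table(season = rep(c("Sp", "Su", "F", "W"), each = 8))
toy[, resid_med := rnorm(.N)]
toy[, alk_med := 2 + 0.5 * resid_med + rnorm(.N, sd = 0.3)]  # relationship built in
toy[, doc_med := rnorm(.N)]                                  # no relationship built in

# Spearman rank correlation of median residual with TA and with DOC, by season
toy_cor <- toy[, .(rho_alk = cor(resid_med, alk_med, method = "spearman"),
                   rho_doc = cor(resid_med, doc_med, method = "spearman"),
                   n = .N),
               by = season]
toy_cor
```

With only a handful of year-by-season medians per panel, any such coefficient would be noisy, so it would complement the plots rather than replace them.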

Now that the $\Delta C_{+}$ term is supported, here is the probabilistic surface model based on the framework from earlier.

## Probabilistic Surface Model

A probabilistic framework is necessary because using the curves for calibration requires estimates of uncertainty. I asked ChatGPT 5.2 to suggest a method for propagating error through to create a credible interval surface, beginning with defining the surface grid.

```{r}
#| echo: false

#Create the grid
sal_grid <- seq(0, 40, by = 1)
yr_grid <- c(seq(-50, 1950, by = 10), seq(1951, 2025, by = 1))
seasons <- c("Sp", "Su", "F", "W")
grid <- CJ(sal = sal_grid, yr = yr_grid, season = seasons)

#Format year and season as factors to match data from dic x sal model
grid[, `:=` (year = factor(yr, levels = levels(dicsalmod_skew4$data$year)), #dicsalmod_skew2$data$year
             season2 = factor(season, levels = levels(dicsalmod_skew4$data$season)))] #dicsalmod_skew2$data$season

```
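
To make the grid construction concrete, the sketch below builds a much smaller cross-join with `CJ()` and applies the same factor-matching step. The level set `model_years` is a hypothetical stand-in for the year levels stored in the fitted model's data. It shows that grid years absent from those levels become `NA` in the factor column; how the downstream prediction step treats such rows is not shown in this section.

```r
library(data.table)

# Toy grid: 2 salinities x 3 years x 2 seasons = 12 rows
toy_grid <- CJ(sal = c(0, 35), yr = c(1950, 2019, 2020), season = c("Sp", "F"))
nrow(toy_grid)

# Hypothetical factor levels standing in for levels(model$data$year)
model_years <- c("2019", "2020")
toy_grid[, year := factor(yr, levels = model_years)]

# Grid years that are not among the model's levels are coded as NA
toy_grid[, .(n_rows = .N, n_na = sum(is.na(year))), by = yr]
```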


I used the DIC \~ salinity model to derive both $C_{\mathrm{mix}}$ and $\Delta C_{+}$. $C_{\mathrm{mix}}$ is given by draws from the posterior distribution of the expected (mean) DIC, and $\Delta C_{+}$ by the positive differences between draws from the full DIC posterior distribution and the corresponding $C_{\mathrm{mix}}$ draws for each day (i.e., $\Delta C_{d,+}=\max\!\left(\mathrm{DIC}_{\mathrm{rep},d} - C_d,\ 0\right)$, where $d$ indexes the donor-day observations and $C_d$ is the model-based expected DIC for donor day $d$). Thus, the $F_{\mathrm{est}}$ definition more accurately became

$$
F_{\mathrm{est},d}(S,t)
=
\frac{
C_d\,F_{\mathrm{mix}}(S,t)
+
\Delta C_{d,+}\,F_{\mathrm{fw}}(t)
}{
C_d+\Delta C_{d,+}