
BYOC_ID: 46056 Refactor District Nursing #1327

Open
LucyEmma22 wants to merge 11 commits into development from refactor-dn

Conversation

@LucyEmma22

Collaborator

Data Required:

  1. extract_district_nursing --> get_boxi_extract_path(year = year, type = "dn")
  2. dn_costs --> get_dn_raw_costs_path()
  3. dn_contacts --> fs::path(get_slf_dir(), "Costs", "DN-Contacts-Numbers-for-Costs.csv")
  4. population_lookup --> get_pop_path(type = "hscp")

To Do/Check:

read_extract_district_nursing:

  • Check table name for extract_district_nursing.
  • Check whether data should be filtered by year. If so, check year column name.
  • Check whether to keep Patient Data Zone 2011 (Contact).

get_dn_costs_path:

  • Based on refactoring of get_ch_costs_path.

process_costs_dn:

  • New function.
  • Check if file paths should be included as arguments. If so, these file path functions will need to be refactored.
  • Check it is okay to add dn_costs to log_slf_event.
  • Check table name for dn_costs, dn_contacts and population_lookup.
  • Check column names for dn_costs, dn_contacts and population_lookup.
  • Check whether all cost columns are numeric in Denodo view. If so converting costs to numeric can be removed.

log_slf_event:

  • Added dn_cost to type list.

@LucyEmma22 added the labels "On hold" (Waiting for something / someone outside of our control) and "BYOC dependency missing" (Preprod data not available) on May 13, 2026
Comment thread R/process_costs_dn.R
names_to = "year",
names_pattern = "(\\d{4})_cost",
values_to = "cost"
)

Collaborator Author

This should be removed if we are using pivoted costs.

Collaborator

Agreed! Costs are now pivoted, so we can remove this code. It will need testing again once the data is available.
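For context, here is a minimal sketch of what the flagged pivot does, assuming a wide table with one cost column per year. The column names (`1718_cost`, `1819_cost`), the `hscp` column and the values are invented for illustration; only the `names_to`, `names_pattern` and `values_to` arguments come from the snippet above.

```r
library(tidyr)

# Hypothetical wide costs table: one "<year>_cost" column per year.
wide_costs <- data.frame(
  hscp = c("A", "B"),
  `1718_cost` = c(10, 20),
  `1819_cost` = c(11, 21),
  check.names = FALSE # keep the non-syntactic column names as written
)

# Reshape to one row per hscp and year; the regex capture group
# pulls the four digits out of each column name into `year`.
long_costs <- pivot_longer(
  wide_costs,
  cols = ends_with("_cost"),
  names_to = "year",
  names_pattern = "(\\d{4})_cost",
  values_to = "cost"
)
```

If the source already arrives in this long shape, the step has nothing left to reshape, which is why it can be dropped.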

Comment thread R/process_costs_dn.R
Comment on lines +12 to +14
# dn_raw_costs_path = get_dn_raw_costs_path(), # TODO: Check if needed. If it is function will need to be refactored to include the BYOC_MODE argument.
# dn_raw_contacts_path = fs::path(get_slf_dir(), "Costs", "DN-Contacts-Numbers-for-Costs.csv"), # TODO: Check if needed. If it is function will need to be refactored to include the BYOC_MODE argument.
# pop_path = get_pop_path(type = "hscp"), # TODO: Check if needed. If it is function will need to be refactored to include the BYOC_MODE argument.

Collaborator

Yes, these will be required, and the path functions will need to include the BYOC_MODE argument. We agreed that, going forward, any dependencies should be listed in the function arguments to make them more visible.
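A minimal sketch of that convention, assuming a path helper that switches location on `BYOC_MODE`. The names `example_costs_path` and `process_costs_example`, and the directory names, are hypothetical stand-ins and not the package's real functions.

```r
# Hypothetical path helper: returns a different location depending on BYOC_MODE.
example_costs_path <- function(BYOC_MODE = FALSE) {
  base_dir <- if (isTRUE(BYOC_MODE)) "byoc_output" else "slf_dir"
  file.path(base_dir, "Costs", "example_costs.parquet")
}

# The dependency is an explicit argument, so callers can see and override it.
# R evaluates default arguments lazily, so the default path picks up
# whatever BYOC_MODE the caller passed.
process_costs_example <- function(BYOC_MODE = FALSE,
                                  costs_path = example_costs_path(BYOC_MODE = BYOC_MODE)) {
  costs_path # a real function would read and process the file here
}

byoc_path <- process_costs_example(BYOC_MODE = TRUE)
default_path <- process_costs_example()
custom_path <- process_costs_example(costs_path = "some/other/file.parquet")
```

Exposing the path as an argument also lets a pipeline definition pass it in directly, so the dependency shows up wherever the function is called.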

Comment thread R/process_costs_dn.R
BYOC_MODE = FALSE,
run_id = NA,
run_date_time = NA) {
log_slf_event(stage = "process", status = "start", type = "dn_costs", year = year) # TODO: Check this is necessary.

Collaborator

Good spot! Yes, I think we need this.

Comment thread R/process_costs_dn.R
BYOC_MODE = BYOC_MODE
)

log_slf_event(stage = "process", status = "complete", type = "dn_costs", year = "all") # TODO: Check this is necessary.

@Jennit07 May 20, 2026

Collaborator

Yes, good spot! I added a comment to CH costs as a reminder to do this there too.

denodo_connect,
dbplyr::in_schema("sdl", "sdl_district_nursing_source")
) %>% # TODO: Check table name.
# TODO: Check whether to filter by year.

Collaborator

Yes, I think the filter by year is missing here.

Collaborator Author

Do you know what the year column is/will be?

Collaborator

Oh, this one is interesting actually. This works similarly to SPARRA, in that the extracts are saved separately in hscdiip. The main difference is that DN is an archived dataset, but it will also be a file ingest by NSS. Currently, each extract is split by FY, with the FY in the file name. However, once this is in Denodo, I think NSS will create an FY column for us and the data will be stacked as one view.

For now, I think you could call the column year and add placeholder code with a TODO, so that we remember to check this when we get the data to test.
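A minimal sketch of such a placeholder, assuming the stacked view exposes a column called `year`. The function name `read_extract_example` and the sample data are hypothetical; the real column name is still to be confirmed.

```r
library(dplyr)

# Placeholder year filter on a stacked extract.
read_extract_example <- function(data, year) {
  data %>%
    # TODO: Confirm the year column name once the Denodo view is available.
    # `.data$year` is the column; `!!year` injects the function argument.
    filter(.data$year == !!year)
}

# Invented sample data standing in for the stacked view.
extract <- data.frame(
  year = c("1718", "1819", "1819"),
  contacts = c(1, 2, 3)
)

filtered <- read_extract_example(extract, year = "1819")
```

Separating the column (`.data$year`) from the argument (`!!year`) avoids the ambiguity of writing `year == year`, which would compare the column with itself.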

Comment on lines +138 to +143
process_costs_dn(
denodo_connect = get_denodo_connection(BYOC_MODE = BYOC_MODE),
BYOC_MODE = FALSE,
run_id = NA,
run_date_time = NA
),

Collaborator

I've answered the TODO comments above. The extra arguments will also need to be added as parameters in the targets here.

Collaborator Author

Yes, good point!

Comment thread _targets.R
Comment on lines +240 to +244
process_costs_dn(
denodo_connect = get_denodo_connection(BYOC_MODE = BYOC_MODE),
BYOC_MODE = FALSE,
run_id = NA,
run_date_time = NA

Collaborator

See above: we also need to add extra parameters here for the additional files.

@Jennit07 left a comment

Collaborator

Hi @LucyEmma22, thanks for doing this PR, great work. Everything looks good; I've added some comments answering some of your code questions. A few changes are required here, but overall it looks good and I can test again when the data is ready. Thanks!

Comment thread R/get_costs_paths.R
if (isTRUE(BYOC_MODE)) {
dn_costs_path <- file.path(
denodo_output_path(),
stringr::str_glue("Cost_DN_Lookup.parquet")

Collaborator

We are now using cost_ch_lookup.parquet, with everything changed to lowercase for consistency. NSS's copy_to_s3 will take cost_dn_lookup.parquet, as the lowercase name has already been put into effect on their end.
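One way to keep the lowercase convention from drifting is to build the names in one place. This is only an illustrative sketch: `cost_lookup_file_name` is a hypothetical helper, not something in the package.

```r
# Hypothetical helper: build the lowercase lookup file name for a dataset code.
cost_lookup_file_name <- function(dataset) {
  paste0("cost_", tolower(dataset), "_lookup.parquet")
}

dn_file <- cost_lookup_file_name("DN") # "cost_dn_lookup.parquet"
ch_file <- cost_lookup_file_name("ch") # "cost_ch_lookup.parquet"
```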

Comment thread R/get_costs_paths.R
dn_costs_path <- get_file_path(
directory = fs::path(get_slf_dir(), "Costs"),
file_name = stringr::str_glue(
"Cost_DN_Lookup.parquet"

Collaborator

Same comment as above.

Comment thread R/process_costs_dn.R
BYOC_MODE = FALSE,
run_id = NA,
run_date_time = NA) {
log_slf_event(stage = "process", status = "start", type = "dn_costs", year = year) # TODO: Check this is necessary.

Collaborator

You are correct, the line of code needs to be added. type should be dn_cost_lookup for consistency.

Comment thread R/process_costs_dn.R
BYOC_MODE = BYOC_MODE
)

log_slf_event(stage = "process", status = "complete", type = "dn_costs", year = "all") # TODO: Check this is necessary.

Collaborator

Type should be dn_cost_lookup for consistency.
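To illustrate why a consistent type name matters, here is a sketch of a logger that validates `type` against a fixed list. It assumes nothing about how `log_slf_event` is actually implemented: `log_event_sketch` and its list of allowed types are hypothetical.

```r
# Hypothetical logger that only accepts types from a fixed list.
log_event_sketch <- function(type = c("ch_cost_lookup", "dn_cost_lookup")) {
  type <- match.arg(type) # errors if `type` matches none of the choices
  paste("event logged for", type)
}

accepted <- log_event_sketch("dn_cost_lookup")

# A spelling that is not in the list, such as "dn_costs", is rejected.
rejected <- tryCatch(
  log_event_sketch("dn_costs"),
  error = function(e) "rejected"
)
```

Note that `match.arg` allows partial matching, so a prefix of a listed value would still be accepted; an exact check with `%in%` is stricter if that is a concern.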
