Inconsistent handling of invalid strings in mdy, etc #1204

@salishseaswimmer

Issue:
mdy() (and related functions)

"Sept" is a recognized, though non-canonical, abbreviation for September.
Given this as an input string, mdy() should behave consistently and should not return incorrect dates.
Currently, mdy() handles it inconsistently.

library(tidyverse)
library(lubridate)  # provides mdy(); loaded explicitly for clarity

# Build every "Sept, <day>, <year>" string for days 1-30 and years 2015-2025
df <- tibble(
  prefix = "Sept, ",
  days = 1:30
) %>%
  expand_grid(year_val = 2015:2025) %>%
  mutate(
    target_string = paste0(prefix, days, ", ", year_val),
    target_date = mdy(target_string)
  )

summary(df$target_date)
        Min.      1st Qu.       Median 
"2015-01-20" "2017-10-12" "2020-07-05" 
        Mean      3rd Qu.         Max. 
"2020-07-05" "2023-03-27" "2025-12-20" 
        NA's 
       "198" 

mdy() returns a mix of valid Date values and NA. The Date values it does return are incorrect.

For "Sept, 1" through "Sept, 12" of every year tested, mdy() returns a valid Date object, but the value is incorrect.
e.g. mdy("Sept, 12, 2022") returns 2022-12-20

But from "Sept, 13" through the end of the month, mdy() consistently returns NA.
e.g. mdy("Sept, 13, 2015") returns NA
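
A possible reading of the numbers above, offered as a guess from the output rather than from lubridate's source: the parser may be skipping "Sept, " and then reading the remaining digits "D, 20YY" as month = D, day = 20, year = YY. The base R sketch below rebuilds the dates that this hypothesis predicts and compares them with the summary printed above. The names `grid` and `predicted` are only for illustration.

```r
# Hypothesis (inferred from the reported output, not from lubridate's code):
# "Sept, D, 20YY" is parsed as month = D, day = 20, year = 20YY,
# which can only succeed when D is a valid month number (1-12).
grid <- expand.grid(days = 1:30, year_val = 2015:2025)

predicted <- as.Date(
  ifelse(
    grid$days <= 12,
    sprintf("%d-%02d-20", grid$year_val, grid$days),
    NA_character_
  )
)

sum(is.na(predicted))            # 198
range(predicted, na.rm = TRUE)   # "2015-01-20" "2025-12-20"
median(predicted, na.rm = TRUE)  # "2020-07-05"
```

The NA count, minimum, maximum, and median all match the summary in this report, which would explain why days 13 to 30 fail: 13 and above cannot be a month.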

Desired behavior: mdy() should handle this class of invalid inputs consistently.
It should return NA in all cases.
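
Until the parser does this itself, the desired behavior can be approximated on the caller's side. The following is a minimal sketch, not a proposed fix for lubridate: `strict_mdy()` is a hypothetical helper that rejects any leading alphabetic token that is not an English month name or three-letter abbreviation (base R's `month.name` and `month.abb`), and only then calls `mdy()`. It assumes an English `LC_TIME` locale.

```r
library(lubridate)

# Hypothetical helper, not part of lubridate: return NA when the string
# starts with letters that are not a known English month name/abbreviation.
strict_mdy <- function(x) {
  x <- as.character(x)
  # Leading run of letters, e.g. "sept" from "Sept, 12, 2022"; "" if none
  token <- tolower(sub("^\\s*([A-Za-z]*).*$", "\\1", x))
  known <- tolower(c(month.name, month.abb))
  ok <- !is.na(x) & (token == "" | token %in% known)

  out <- rep(as.Date(NA), length(x))
  if (any(ok)) {
    # quiet = TRUE suppresses the "failed to parse" warning
    out[ok] <- mdy(x[ok], quiet = TRUE)
  }
  out
}

strict_mdy(c("Sept, 12, 2022", "Sept, 13, 2015", "Sep 12, 2022"))
```

With this wrapper both "Sept" inputs come back as NA, while the canonical "Sep 12, 2022" still parses. Purely numeric inputs such as "09/12/2022" have no leading letters and are passed to `mdy()` unchanged.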

lubridate_1.9.3
R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Labels: bug, parser
