Skip to content

Prevent warnings when $parsed_date[EDTFUtils::YEAR_BASE] is not set #139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 25, 2025

Conversation

xurizaemon
Copy link
Contributor

@xurizaemon xurizaemon commented Feb 23, 2025

GitHub Issue: (link)

What does this Pull Request do?

The YEAR_BASE value is set conditionally, only when YEAR_EXPONENT has been populated.

// Expand the year if the Year Exponent exists.
if (array_key_exists(EDTFUtils::YEAR_EXPONENT, $parsed_date) && !empty($parsed_date[EDTFUtils::YEAR_EXPONENT])) {
$parsed_date[EDTFUtils::YEAR_BASE] = EDTFUtils::expandYear($parsed_date[EDTFUtils::YEAR_FULL], $parsed_date[EDTFUtils::YEAR_BASE], $parsed_date[EDTFUtils::YEAR_EXPONENT]);
}

// Expand exponents.
if (!empty($parsed_date[self::YEAR_EXPONENT])) {
$exponent = intval(substr($parsed_date[self::YEAR_EXPONENT], 1));
$parsed_date[self::YEAR_BASE] = strval((10 ** $exponent) * intval($parsed_date[self::YEAR_BASE]));
$parsed_date[self::YEAR_BASE] = self::expandYear($parsed_date[self::YEAR_FULL], $parsed_date[self::YEAR_BASE], $parsed_date[self::YEAR_EXPONENT]);
}

When YEAR_BASE is unset, which happens when YEAR_EXPONENT is not set, formatting some inputs may trigger warnings or errors in Drupal.

Various invalid EDTF values were identified in the site where this was noted. This led to a large number of warnings being logged, particularly when exporting CSV of collection data using Views.

On the affected site, this also correlated with thumbnails not being viewable (but am not confident if that is the root cause of the thumbnail issue).

What's new?

Improve validation of EDTF inputs when being formatted, eliminate some warnings when invalid data supplied.

How should this be tested?

Formatting output for some invalid EDTF values should trigger this behaviour. Example inputs are (empty string), "not-an-EDTF-date", "1900s", etc.

Test coverage is added here, and may want to be expanded for additional inputs.

Additional Notes:

Interested parties

@Islandora/committers, @antbrown

@seth-shaw-asu
Copy link
Member

I'm unable to reproduce the error.

I spun up a current version of site-template, created an item using the value 1998-04 in field_date_issued, and viewed the item. No errors or warnings appeared in the recent log messages. I then cleared caches, viewed the items again, and reloaded the watch dog report. Still no errors or warnings.

Can you provide a more specific example of how to reproduce the error?

xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon pushed a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
@xurizaemon xurizaemon changed the title Edtf year base unset $parsed_date[EDTFUtils::YEAR_BASE] may be unset for some dates Feb 25, 2025
@xurizaemon
Copy link
Contributor Author

xurizaemon commented Feb 25, 2025

You're right, apologies for reporting 1998-04 as an invalid EDTF, my error relaying information. PR details corrected now.

Looks like the warnings we're observing are due to invalid EDTF inputs in the DB - or at least, that's one way to trigger them. While these should ideally be validated for storage, I do think it's an improvement to check $parsed_date[ ... ] keys exist before using.

An alternative approach could be to call EDTFUtils::validate() on the date we're formatting, and return an empty string if the date is not EDTF-valid. That feels more specific, but like a bigger change. Happy to switch to that if preferred.

I've included test coverage, and 9fe716d should pass for valid dates, a18c741 then shows PHPUnit failures due to warnings when some additional inputs are provided, and then 6ad41ef is the fix proposed here initially.

The invalid inputs are any non-EDTF value, for example:

  • `` (empty string)
  • invalid-date (any random input you like, we have many!)
  • 1900s (almost-EDTF?)

Failing test output

At a18c741 we see this output where the undefined array keys are referenced; this is fixed in the subsequent commit.

phpunit web/modules/custom/controlled_access_terms/
PHPUnit 9.6.22 by Sebastian Bergmann and contributors.

Testing /app/web/modules/custom/controlled_access_terms
..........EEE..                                                   15 / 15 (100%)

Time: 00:35.066, Memory: 10.00 MB

There were 3 errors:

1) Drupal\Tests\controlled_access_terms\Kernel\Plugin\Field\FieldFormatter\EDTFFormatterTest::testDateFormatting with data set "Empty input" ('', '')
Undefined array key 3

/app/web/modules/custom/controlled_access_terms/src/Plugin/Field/FieldFormatter/EDTFFormatter.php:244
/app/web/modules/custom/controlled_access_terms/tests/src/Kernel/EDTFFormatterTest.php:144
/app/vendor/phpunit/phpunit/src/Framework/TestResult.php:729

2) Drupal\Tests\controlled_access_terms\Kernel\Plugin\Field\FieldFormatter\EDTFFormatterTest::testDateFormatting with data set "Invalid date format" ('invalid-date', '')
Undefined array key 3

/app/web/modules/custom/controlled_access_terms/src/Plugin/Field/FieldFormatter/EDTFFormatter.php:244
/app/web/modules/custom/controlled_access_terms/tests/src/Kernel/EDTFFormatterTest.php:144
/app/vendor/phpunit/phpunit/src/Framework/TestResult.php:729

3) Drupal\Tests\controlled_access_terms\Kernel\Plugin\Field\FieldFormatter\EDTFFormatterTest::testDateFormatting with data set "Invalid (date-like)" ('1900s', '')
Undefined array key 3

/app/web/modules/custom/controlled_access_terms/src/Plugin/Field/FieldFormatter/EDTFFormatter.php:244
/app/web/modules/custom/controlled_access_terms/tests/src/Kernel/EDTFFormatterTest.php:144
/app/vendor/phpunit/phpunit/src/Framework/TestResult.php:729

ERRORS!
Tests: 15, Assertions: 148, Errors: 3.

xurizaemon pushed a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon pushed a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
@xurizaemon
Copy link
Contributor Author

Would someone please permit the introduced tests to run on this PR?

I see "This workflow requires approval from a maintainer." and I seek your approval 😄

xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
@xurizaemon xurizaemon changed the title $parsed_date[EDTFUtils::YEAR_BASE] may be unset for some dates Prevent warnings when $parsed_date[EDTFUtils::YEAR_BASE] is not set Feb 25, 2025
@seth-shaw-asu
Copy link
Member

The failure are all coding styles. Be sure to use phpcs --standard=Drupal,DrupalPractice to check for errors. Most of these can be fixed with phpcbf --standard=Drupal,DrupalPractice.

xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon pushed a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
@xurizaemon
Copy link
Contributor Author

Doh! Fixed, rebased (squashed PHPCS into the history rather than adding a separate commit), let's try that again.

Thanks @seth-shaw-asu :)

@seth-shaw-asu
Copy link
Member

Yet more style fixes are needed.

xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon pushed a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
xurizaemon added a commit to xurizaemon/controlled_access_terms that referenced this pull request Feb 25, 2025
@xurizaemon
Copy link
Contributor Author

xurizaemon commented Feb 25, 2025

Updated once more, thanks. I had missed a few trailing periods, and am omitting PHPCBF changes which aren't relevant to this PR (eg list() => []). Will include that if it fails again.

(edit: I didn't do that, because it was a different failure.)

@xurizaemon
Copy link
Contributor Author

xurizaemon commented Feb 25, 2025

Ugh, I thought :void would do for that @return value. On it.

Oh, wrong function, doh.

@seth-shaw-asu
Copy link
Member

This is the last bit holding the checks back:

FILE: ...s/controlled_access_terms/build_dir/tests/src/Kernel/EDTFFormatterTest.php
--------------------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
--------------------------------------------------------------------------------
 157 | ERROR | Description for the @return value is missing
--------------------------------------------------------------------------------

@xurizaemon
Copy link
Contributor Author

Thanks for your patience :)

@seth-shaw-asu
Copy link
Member

All is green. I'll give it a test.

@seth-shaw-asu
Copy link
Member

I can confirm that the PR, as it exists, works. The warnings are gone. However, after thinking about it for a bit while testing, we really ought to add a warning when we come across bad data in this field. So I think the idea shared earlier...

An alternative approach could be to call EDTFUtils::validate() on the date we're formatting, and return an empty string if the date is not EDTF-valid. That feels more specific, but like a bigger change. Happy to switch to that if preferred.

... is a better idea. That said, instead of simply returning an empty string (which results in a poor user experience), I would return the original string value as-is and create a Drupal logger message indicating the item at fault along with the validator's messages. That way, site admins can see helpful messages instead of vague ones.

@xurizaemon
Copy link
Contributor Author

xurizaemon commented Feb 25, 2025

For the site affected here, the current issue is that exporting collections to CSV generates a large number of warnings. (In an internal ticket, the ~1000-odd invalid values in EDTF fields have been reported to the customer for addressing.)

The current behaviour for this site is that invalid EDTF values are not presented to the public via node view, and don't seem to be surfaced in the edit interface. Fields are using the "Default EDTF widget" element for edit.

If I understand right, #139 (comment) would be a change in behaviour for this module: where a site currently has an invalid EDTF value, I think we don't expect that to be surfaced? (Maybe I described that poorly with 'return an empty string', I was forgetting that the validate function returns a non-empty array on validation failure.)

For the customer affected, my goal here was to reduce log noise and (separately) allow them to tidy up some invalid data. I have a concern that logging each "bad" value would add load both to CSV exports (already a demanding task) and to regular page display (a small thing, but not insignificant with current levels of web crawling).

Would you consider merging this as-is (introduces test coverage, small change to eliminate a PHP warning) and working towards improved validation and reporting separately? I see there's other validation improvements forthcoming in eg #123 and #129, and a suggestion in 129 to use professional-wiki/edtf.

@xurizaemon
Copy link
Contributor Author

xurizaemon commented Feb 25, 2025

Meant to say this but lost it in edits.

AFAICT the current behaviour of this module is that invalid EDTF inputs:

  • Are not visible in the edit UI via "Default EDTF widget"
  • Are not presented to the end user via the formatter

In the UAT environment, I wasn't able to get an invalid value to show up in the admin UI or the frontend, nor to save it. (The existing values will have been imported from CSV, bypassing form validation.)

In the production environment, I dug through existing invalid values, and the only invalid one I see exposed in the admin UI is 190x (should be 190X?). Attempting to save 190x in the web UI is rejected ("Could not parse the date '190x'), but the value is presented to the editing user, and appears to be revealed in the public display of the item also.

EDIT: A site admin has been hard at work fixing invalid values - so my comment above that they are not exposed in admin UI was incorrect. Invalid values such as 1960%/1970% (trailing space, nid 102291) are presented to the admin in the form, but cannot be saved without correction.

image

image

(for my reference: production node 6367 above)

The UAT and production environments aren't identical here and I don't want to experiment with production.

Anyway, this is mostly to say that I'd want to be careful about revealing invalid date values to the admin or end user, if that's what #139 (comment) proposes.

(I'm now realising that the cleanup may not be possible in the admin interface, if those values aren't presented to the admin at all! That is a different problem we will be facing. 😁 )

Copy link
Member

@seth-shaw-asu seth-shaw-asu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@seth-shaw-asu seth-shaw-asu merged commit eddcbe9 into Islandora:2.x Feb 25, 2025
10 checks passed
@xurizaemon xurizaemon deleted the edtf-YEAR_BASE-unset branch February 25, 2025 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants