Sticky posts and cornerstone content priorities for llms.txt #22316

@leonidasmi leonidasmi commented May 28, 2025

Context

  • Cornerstone content gets priority in the llms.txt file
  • If you have cornerstone posts:
    • If you have fewer than 5 posts set as cornerstone content:
      • [Cornerstone]
      • [Cornerstone]
      • [Latest]
      • [Latest]
      • [Latest]
    • If you have more than 5 posts set as cornerstone content:
      • If posts, or other non-hierarchical post type: Use the 5 latest cornerstone content pieces (and no other posts)
      • If pages, or other hierarchical post type: Use all cornerstone content pieces
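
The rule above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the plugin's code: the function name build_post_list and its parameters are hypothetical, both input lists are assumed to be sorted newest first, and $latest is assumed to contain no cornerstone entries.

```php
<?php
/**
 * Hypothetical helper: merges cornerstone entries with the latest regular entries.
 *
 * @param array $cornerstones    Cornerstone entries, most recently modified first.
 * @param array $latest          Regular (non-cornerstone) entries, most recently modified first.
 * @param bool  $is_hierarchical Whether the post type is hierarchical (e.g. pages).
 * @param int   $limit           The list size that regular entries may fill up to.
 *
 * @return array The entries to list, cornerstones first.
 */
function build_post_list( array $cornerstones, array $latest, bool $is_hierarchical, int $limit = 5 ): array {
	// Non-hierarchical types cap cornerstones at the limit; hierarchical types keep all of them.
	if ( ! $is_hierarchical ) {
		$cornerstones = array_slice( $cornerstones, 0, $limit );
	}

	// Regular entries only fill the slots that cornerstones leave open.
	$open_slots = max( 0, $limit - count( $cornerstones ) );

	return array_merge( $cornerstones, array_slice( $latest, 0, $open_slots ) );
}
```

With two cornerstones the result is those two followed by the three latest regular entries. With six cornerstones, posts yield the five most recent cornerstones, while pages yield all six and nothing else.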

Summary

This PR can be summarized in the following changelog entry:

  • Prioritizes cornerstone content for the posts lists in the llms.txt file.

Relevant technical choices:

  • Avoided using the post__not_in argument in get_posts(), as that would make the query much slower. Instead, we always retrieve the 5 latest posts and manually exclude any that were already fetched as cornerstone content.
  • Hierarchical post types show ALL cornerstone content, not just the 5 most recently updated pieces. The relevant query has no limit either, although that is typically considered bad practice.
    • The reason is that I don't expect the amount of cornerstone content to get high enough to make this an issue, as it should be several orders of magnitude below any number that would cause problems.
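
A minimal sketch of the "fetch anyway, exclude afterwards" approach, assuming entries are identified by post ID. The function name pick_fillers and the $fetch_latest_ids callable are hypothetical; in the plugin the callable would presumably wrap a get_posts() call ordered by modified date. Fetching the full limit is always enough: with c cornerstones already selected (c below the limit), at most c of the fetched posts can be duplicates, so at least limit − c regular posts remain.

```php
<?php
/**
 * Hypothetical helper: picks regular posts to fill the slots left open by cornerstones,
 * without passing an exclusion list to the query itself.
 *
 * @param int[]    $cornerstone_ids  IDs already selected as cornerstone content.
 * @param callable $fetch_latest_ids Receives a limit and returns post IDs, newest first.
 * @param int      $limit            The total list size.
 *
 * @return int[] The IDs of the regular posts to append.
 */
function pick_fillers( array $cornerstone_ids, callable $fetch_latest_ids, int $limit = 5 ): array {
	$open_slots = $limit - count( $cornerstone_ids );
	if ( $open_slots <= 0 ) {
		return [];
	}

	// Query the full limit with no exclusion clause, then drop duplicates in PHP.
	$candidates = $fetch_latest_ids( $limit );
	$fillers    = array_filter(
		$candidates,
		static function ( $id ) use ( $cornerstone_ids ) {
			return ! in_array( $id, $cornerstone_ids, true );
		}
	);

	return array_slice( array_values( $fillers ), 0, $open_slots );
}
```

For example, with cornerstone IDs 9 and 3 and a fetcher that returns 10, 9, 8, 7, 6, the duplicate 9 is dropped and the three open slots are filled with 10, 8 and 7.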

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

  • In general, let's try to test the rule described in the PR's context above. Some examples are:
  • Have 6 posts (or any other non-hierarchical post type) and all of them are cornerstone. Generate the file and confirm you get the 5 cornerstones that are the most recently modified. (that would probably mean that you will get the five that you last set as cornerstone, so you will not see the post that you first set as cornerstone)
  • Have 6 posts and one of them is cornerstone. Generate the file and confirm you get the cornerstone first and the remaining 4 posts sorted by modified date
    • Experiment with both having the cornerstone more recently modified than the other posts and vice versa
  • Have 6 posts and 4 of them are cornerstone. Generate the file and confirm you get the cornerstones first sorted by modified date and then the 1 regular post which is the most recently modified post.
    • Experiment with both having the regular post more recently modified than the cornerstones and vice versa
  • Have only posts that have a creation date older than 12 months. Confirm that there's no post in the file.
    • Make a couple of them cornerstone. Confirm that the cornerstones are now in the file
  • In any of the above tests, make one cornerstone post into one of the following. Confirm that the file is created without acknowledging that non-public post at all:
    • password protected or private
    • draft or pending review or scheduled
    • its "Allow search engines to show this content in search results" setting is set to no
      • in that case, if it's one of the most recently modified posts, it will still go into the llms.txt file unless "Use indexables for retrieving posts for the llms.txt file" (#22327) is merged.
      • we can still test whether the cornerstone query ignores it, which is what we want.
      • to do that, once you have set its "Allow search engines to show this content in search results" setting to no, edit another post and generate the file
      • the cornerstone post should not be at the top of the post list.
  • Repeat some of the tests after adding the add_filter( 'Yoast\WP\SEO\should_index_indexables', '__return_false' ); snippet and resetting indexables. This disables indexable creation, so the cornerstone content can't be retrieved
    • Generate the file and confirm that there's no acknowledgement of the cornerstone content
    • Remove the filter and regenerate indexables
  • Repeat some of the tests, while disabling the cornerstone content feature.
    • Generate the file and confirm that there's no acknowledgement of the cornerstone content

  • Repeat the above tests for pages (or any other hierarchical post type). To be exact, have a couple of cornerstone posts while repeating the tests for pages, so as to confirm that the one type doesn't affect the other.
    • The difference in the above steps for pages is that we expect to have all cornerstone pages returned, not just the 5 most recently modified ones.
    • so let's tweak the above steps:
  • Have 6 pages (or any other hierarchical post type) and all of them are cornerstone. Generate the file and confirm you get all 6 of the cornerstone pages, sorted by the most recently modified.
  • Have 6 pages and one of them is cornerstone. Generate the file and confirm you get the cornerstone first and the remaining 4 pages sorted by modified date
    • Experiment with both having the cornerstone more recently modified than the other pages and vice versa
  • Have 6 pages and 4 of them are cornerstone. Generate the file and confirm you get the cornerstones first sorted by modified date and then the 1 regular page which is the most recently modified page.
    • Experiment with both having the regular page more recently modified than the cornerstones and vice versa
  • Have 8 pages and 6 of them are cornerstone. Generate the file and confirm you get all 6 cornerstones sorted by modified date and no other page.
    • Experiment with both having the regular pages more recently modified than the cornerstones and vice versa.
  • Have only pages that have a creation date older than 12 months. Confirm that there's no page in the file.
    • Make a couple of them cornerstone. Confirm that the cornerstones are now in the file
  • In any of the above tests, make one cornerstone page into one of the following. Confirm that the file is created without acknowledging that non-public page at all:
    • password protected or private
    • draft or pending review or scheduled
    • its "Allow search engines to show this content in search results" setting is set to no
      • in that case, if it's one of the most recently modified pages, it will still go into the llms.txt file unless "Use indexables for retrieving posts for the llms.txt file" (#22327) is merged.
      • we can still test whether the cornerstone query ignores it, which is what we want.
      • to do that, once you have set its "Allow search engines to show this content in search results" setting to no, edit another page and generate the file
      • the cornerstone page should not be at the top of the page list.

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes https://github.com/Yoast/reserved-tasks/issues/578

@leonidasmi leonidasmi added the "changelog: enhancement" label (Needs to be included in the 'Enhancements' category in the changelog) May 28, 2025

A merge conflict has been detected for the proposed code changes in this PR. Please resolve the conflict by either rebasing the PR or merging in changes from the base branch.

@leonidasmi leonidasmi force-pushed the 578-nice-to-have-sticky-posts-and-cornerstone-content-has-priority-for-the-postpagecpt-list branch from 6b0fb76 to 1829cbd Compare June 3, 2025 08:16
@leonidasmi leonidasmi marked this pull request as ready for review June 3, 2025 12:30

coveralls commented Jun 3, 2025

Pull Request Test Coverage Report for Build 9c00443b1f830383e79e52807964591ec0d9aa13

Details

  • 30 of 40 (75.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.5%) to 53.06%

Changes Missing Coverage

  • src/repositories/indexable-repository.php: 0 of 10 changed/added lines covered (0.0%)

Totals

  • Change from base Build 72ad0b38d9f69bfeb206b1138e895f5c62f774c8: -0.5%
  • Covered Lines: 29750
  • Relevant Lines: 57012

💛 - Coveralls

@thijsoo thijsoo left a comment

Couple of suggestions

* Returns the most recently modified cornerstone content of a post type.
*
* @param string $post_type The post type.
* @param int $limit The maximum number of posts to return.

@thijsoo commented:

This can be null.

@leonidasmi replied:

Improved here.

*
* @return Indexable[] array of indexables.
*/
public function get_recent_cornerstone_per_post_type( string $post_type, ?int $limit ) {

@thijsoo commented:

I would use "for" instead of "per" in the method name, since "per" suggests a list of different post types and this is for a single one.

@leonidasmi replied:

Good point, fixed here.

@thijsoo thijsoo left a comment

CR + ACC 👍

@thijsoo thijsoo added this to the 25.4 milestone Jun 4, 2025
@thijsoo thijsoo merged commit db307e5 into trunk Jun 4, 2025
27 checks passed
@thijsoo thijsoo deleted the 578-nice-to-have-sticky-posts-and-cornerstone-content-has-priority-for-the-postpagecpt-list branch June 4, 2025 11:26