Add FactScore-STEM-Geo dataset; Include CodeGenUQ in docs by dylanbouchard · Pull Request #409 · cvs-health/uqlm

dylanbouchard · 2026-05-31T19:15:07Z

Description

Add the FactScore-STEM-Geo dataset (from Bouchard et al., 2026) to the load_example_dataset utility. This uses the wikipedia-api library to create the long-form answer key.
Add the code gen methods to the docs site.
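To make the description concrete, here is a minimal sketch of how a long-form answer key could be assembled from Wikipedia page text. The names `build_answer_key`, `FakePage`, and `FakeWiki` are hypothetical and are not taken from this PR; the `wiki` argument is only assumed to behave like a `wikipediaapi.Wikipedia` client, whose `page(title)` returns an object exposing `exists()` and `text`.

```python
from typing import Any, Dict, List


def build_answer_key(entities: List[str], wiki: Any) -> Dict[str, str]:
    """Map each entity name to its page text, skipping missing pages.

    `wiki` is assumed to expose `page(title)`, returning an object with
    `exists()` and `text` (the shape of a wikipedia-api client).
    """
    answer_key: Dict[str, str] = {}
    for entity in entities:
        page = wiki.page(entity)
        if page.exists():  # skip titles with no matching page
            answer_key[entity] = page.text
    return answer_key


class FakePage:
    """Offline stand-in for a page object (illustration only)."""

    def __init__(self, text: str = "") -> None:
        self.text = text

    def exists(self) -> bool:
        return bool(self.text)


class FakeWiki:
    """Offline stand-in for the client (illustration only)."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self._pages = pages

    def page(self, title: str) -> FakePage:
        return FakePage(self._pages.get(title, ""))


wiki = FakeWiki({"Mount Everest": "Mount Everest is Earth's highest mountain."})
answer_key = build_answer_key(["Mount Everest", "Not A Real Page"], wiki)
print(answer_key)
```

With the real library, the stand-in would presumably be replaced by something like `wikipediaapi.Wikipedia(user_agent="...", language="en")`; each `page.text` access then triggers network requests, which is what the review discussion below is concerned with.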

Type of Change

Checklist

Tests added or updated for all changed behavior
Docstrings updated for any new or modified public API
Type annotations added for any new or modified functions
ruff check and ruff format pass locally

dylanbouchard · 2026-06-01T13:58:43Z

@mohitcek

dskarbrevik · 2026-06-05T19:34:43Z

+            if cols:
+                df = _dataset_processing(df=df, subset_columns=cols)
+            if isinstance(n, int):
+                df = df.iloc[:n]


What is this slicing used for? The reason I'm calling it out is that, because it happens at the very end, you've already gone through the hard part of making all the HTTP calls to Wikipedia, only to throw the results away here if you're doing .iloc[:5] or something, right?

dylanbouchard · 2026-06-05

True, good point. Should we just ignore the n parameter for factscore-stem-geo and let the user call .head() or .sample()? Open to ideas here.

dskarbrevik · 2026-06-05

Yeah, either that or pass n to load_factscore_stem_geo_dataset() so that it can decide what to do with n before fetching pages.

How long does it take to load this dataset with this code? It might be worth putting a progress bar on the for loop that calls Wikipedia so the user understands what's taking so long, unless you find that it's fast because Wikipedia and the wiki lib are fine with the server being hit hard. I originally put FactScore on HF (1) to keep the HF-centric approach and (2) to avoid issues with having to scrape on demand. I'm not arguing that this needs to be static on HF; that's just where all of my motivation for this comment thread is coming from :D
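The alternative raised here, applying n before any pages are fetched and reporting progress during the loop, could look roughly like the sketch below. It is not the approach the PR ended up taking (the merged code prints a note that n is ignored), and `fetch_texts`, `fetch_page`, and `on_progress` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List, Optional


def fetch_texts(
    entities: List[str],
    fetch_page: Callable[[str], str],
    n: Optional[int] = None,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, str]:
    """Fetch text for at most `n` entities, truncating BEFORE any fetch."""
    selected = entities if n is None else entities[:n]
    texts: Dict[str, str] = {}
    for done, entity in enumerate(selected, start=1):
        texts[entity] = fetch_page(entity)  # the slow, network-bound step
        if on_progress is not None:
            on_progress(done, len(selected))
    return texts


# Fake fetcher that records which entities were actually requested.
calls: List[str] = []


def fake_fetch(entity: str) -> str:
    calls.append(entity)
    return f"text for {entity}"


progress_log: List[tuple] = []
texts = fetch_texts(
    ["Paris", "Lima", "Oslo"],
    fake_fetch,
    n=2,
    on_progress=lambda done, total: progress_log.append((done, total)),
)
print(calls)  # only the first two entities are fetched
```

Because the truncation happens before the loop, no request is made for rows that would later be discarded. The progress callback is a stand-in; a progress bar library such as tqdm could wrap the loop instead.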

dylanbouchard · 2026-06-06

Updated as discussed!

virenbajaj

One function that should be private is public, I think.
Two documentation nits. Otherwise looks good!

virenbajaj · 2026-06-08T16:39:18Z

        print(f"Loading dataset - {name}...")
+        if dataset_dict[name]["load_params"].get("loader") == "_load_factscore_stem_geo_dataset":
+            if isinstance(n, int):
+                print("Note: the 'n' parameter is not used for 'factscore-stem-geo' — all available articles will be returned.")


nit: this note says all available articles will be returned, but this is capped at 100 articles per entity type. Can we say something like: "At most 100 longest articles per entity will be returned"?
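For readers unfamiliar with the cap being described, the following sketch shows one way to keep only the longest articles per entity type. The selection logic in the PR is not shown in this thread, so this assumes that "longest" means character count and that the cap applies per entity type; `keep_longest` and the sample data are hypothetical.

```python
import heapq
from typing import Dict


def keep_longest(
    texts_by_type: Dict[str, Dict[str, str]], cap: int = 100
) -> Dict[str, Dict[str, str]]:
    """Keep at most `cap` articles per entity type, longest first."""
    capped: Dict[str, Dict[str, str]] = {}
    for entity_type, texts in texts_by_type.items():
        # nlargest returns (entity, text) pairs sorted by descending length
        longest = heapq.nlargest(cap, texts.items(), key=lambda item: len(item[1]))
        capped[entity_type] = dict(longest)
    return capped


sample = {
    "city": {"Paris": "a" * 30, "Lima": "a" * 10, "Oslo": "a" * 20},
    "element": {"Neon": "a" * 5},
}
result = keep_longest(sample, cap=2)
print({entity_type: list(texts) for entity_type, texts in result.items()})
```

Entity types with fewer articles than the cap are returned unchanged, which is why a message of the form "at most 100" describes the behavior more accurately than "all available articles".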

virenbajaj · 2026-06-08T17:00:47Z

+        "livecodebench", "factscore-stem-geo"

    n : int, optional
        Number of rows to load from the dataset.


nit: change to
"n : int, optional
Number of rows to load from the dataset. Ignored for "factscore-stem-geo",
which always returns all fetched articles."

virenbajaj · 2026-06-08T17:02:42Z

+}
+
+
+def get_wiki_texts_from_entities(entities: List[str]) -> dict:


Should this be a private helper that starts with an underscore _ like _load_factscore_stem_geo_dataset()?
def _get_wiki_texts_from_entities(entities: List[str]) -> dict:
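As background for this suggestion: the leading underscore is a naming convention rather than an access control, but it does have one concrete effect, which is that `from module import *` skips underscore-prefixed names when the module defines no `__all__`. The sketch below demonstrates that with a throwaway in-memory module; the module name `dataloader_sketch` and the function bodies are hypothetical and are not the PR's code.

```python
import sys
import types

# Source for a tiny stand-in module with one public and one private function.
source = '''
def load_example_dataset(name):
    return _get_wiki_texts_from_entities([name])

def _get_wiki_texts_from_entities(entities):
    return {entity: "" for entity in entities}
'''

module = types.ModuleType("dataloader_sketch")
exec(source, module.__dict__)
sys.modules["dataloader_sketch"] = module  # make it importable by name

# Star-import into a fresh namespace and list what came through.
namespace: dict = {}
exec("from dataloader_sketch import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

The private helper remains callable through the module object itself, so the rename mainly signals intent and keeps the helper out of the public surface.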

dylanbouchard added 6 commits May 31, 2026 15:12

add factscore-stem-geo to dataloader

436bcbf

update longform example notebooks

34d39ec

ruff format

c8c9bcb

add full entity lists

6c680f4

add code gen to docs

8c2d619

update readme

075506b

dylanbouchard changed the title ~~Factscore stem geo~~ Add FactScore-STEM-Geo dataset; Include CodeGenUQ in docs Jun 1, 2026

dskarbrevik reviewed Jun 5, 2026


dylanbouchard added 2 commits June 6, 2026 10:13

fix demo link

15bd4da

add print statement for factscore-stem-geo

b348d79

virenbajaj suggested changes Jun 8, 2026


dylanbouchard added 2 commits June 8, 2026 13:35

update docstring and print statements

a6dc990

make helper private

42065e7

dylanbouchard merged commit 0b44b24 into develop Jun 8, 2026
25 checks passed

dylanbouchard merged 10 commits into develop from factscore-stem-geo

3 participants
