Description
I've been updating the rscorecard R package and have run into a couple of issues. Both involve the same call that has worked in the past. Here are the two things I'm seeing:
500 Internal Server Error
When I make this call with rscorecard:
df <- sc_init() %>%
  sc_filter(control == 1, region == 1:2, ccbasic == 1:24) %>%
  sc_select(unitid, instnm, md_earn_wne_p10) %>%
  sc_year(2009) %>%
  sc_get()

which translates to
https://api.data.gov/ed/collegescorecard/v1/schools.json?school.ownership=1&school.region_id__range=1..2&school.carnegie_basic__range=1..24&_fields=id,school.name,2009.earnings.10_yrs_after_entry.median&_page=0&_per_page=100&api_key=<HIDDEN>
I get a page with this message
This might be related to this error reported on the rscorecard GitHub repo.
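For anyone trying to reproduce the request without rscorecard, here is a minimal base R sketch that rebuilds the URL above from its parts. The helpers `to_range()` and `build_url()` are hypothetical names made up for illustration, not rscorecard functions, and the sketch does no URL encoding. The parameter names and values are copied from the URL in this report.

```r
# Hypothetical helper: write a run of consecutive integers in the
# "min..max" range form seen in the URL above.
to_range <- function(x) paste0(min(x), "..", max(x))

# Hypothetical helper: assemble the query string for a given data year.
# The API key stays as the placeholder used in this report.
build_url <- function(year, api_key = "<HIDDEN>", page = 0, per_page = 100) {
  base <- "https://api.data.gov/ed/collegescorecard/v1/schools.json"
  params <- c(
    "school.ownership" = "1",
    "school.region_id__range" = to_range(1:2),
    "school.carnegie_basic__range" = to_range(1:24),
    "_fields" = paste("id", "school.name",
                      paste0(year, ".earnings.10_yrs_after_entry.median"),
                      sep = ","),
    "_page" = as.character(page),
    "_per_page" = as.character(per_page),
    "api_key" = api_key
  )
  # Join name=value pairs with "&" (no URL encoding is applied here).
  paste0(base, "?", paste(names(params), params, sep = "=", collapse = "&"))
}

url_2009 <- build_url(2009)
```

Calling `build_url(2010, page = 2)` gives the same request with the year and page changed.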
Bad JSON elements
When I change the call to use data for 2010 instead of 2009, I get extra elements at the end of the pull. This causes rscorecard to break, which is my issue, but since the code has worked in the past, something new is happening. Here's the API call (notice that I'm calling page=2, which returns the last 83 elements of the 283-element pull):
https://api.data.gov/ed/collegescorecard/v1/schools.json?school.ownership=1&school.region_id__range=1..2&school.carnegie_basic__range=1..24&_fields=id,school.name,2010.earnings.10_yrs_after_entry.median&_page=2&_per_page=100&api_key=<HIDDEN>
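As a quick check of the page arithmetic, assuming `_page` is zero-based (the 2009 call above starts at `_page=0`), the numbers work out as in this small sketch. The variable names are illustrative only.

```r
total    <- 283   # number of matching schools in the pull
per_page <- 100   # value passed as _per_page

n_pages   <- ceiling(total / per_page)       # pages needed to cover the pull
last_page <- n_pages - 1                     # index of the final page (zero-based)
n_on_last <- total - last_page * per_page    # records expected on the final page
```

Under that assumption, pages 0 and 1 hold 100 records each and page 2 holds the remaining 83.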
Here's the result (I've cut the result to the last 10 elements to save space and placed a ... to mark the cuts):
{
  "metadata": {
    "page": 2,
    "total": 283,
    "per_page": 100
  },
  "results": [
    ...
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Pennsylvania College of Technology",
      "id": 366252
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Suffolk County Community College",
      "id": 366395
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Carroll Community College",
      "id": 405872
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Pennsylvania Highlands Community College",
      "id": 414911
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Lancaster County Career and Technology Center",
      "id": 418533
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "York County Community College",
      "id": 420440
    },
    {
      "2010.earnings.10_yrs_after_entry.median": null,
      "school.name": "Community College of Baltimore County",
      "id": 434672
    },
    {
      "UNITID": 475565,
      "id": null,
      "school.name": null,
      "2010.earnings.10_yrs_after_entry.median": null
    },
    {
      "UNITID": 479956,
      "id": null,
      "school.name": null,
      "2010.earnings.10_yrs_after_entry.median": null
    },
    {
      "UNITID": 480064,
      "id": null,
      "school.name": null,
      "2010.earnings.10_yrs_after_entry.median": null
    }
  ]
}

The last three elements have an extra key, UNITID, and null values for everything else. This causes an error in my rscorecard pull. Again, that's my issue, but it hasn't been a problem in the past.
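On the client side, one possible workaround is to drop records like these before binding the results into a data frame. The following is a minimal base R sketch, not what rscorecard actually does. It assumes the JSON has already been parsed into a list of records with JSON null mapped to R `NULL`. The `results` list is a hand-built stand-in using two of the records shown above, and `is_well_formed()` and `to_row()` are hypothetical helper names.

```r
# Stand-in for the parsed "results" array (JSON null -> R NULL).
results <- list(
  list(`2010.earnings.10_yrs_after_entry.median` = NULL,
       school.name = "Community College of Baltimore County",
       id = 434672),
  list(UNITID = 475565, id = NULL, school.name = NULL,
       `2010.earnings.10_yrs_after_entry.median` = NULL)
)

# Fields requested through _fields in the API call.
expected <- c("id", "school.name", "2010.earnings.10_yrs_after_entry.median")

# Treat a record as well formed if it has only the requested keys
# and a non-null id.
is_well_formed <- function(rec) {
  all(names(rec) %in% expected) && !is.null(rec[["id"]])
}

clean <- Filter(is_well_formed, results)

# Convert one record to a one-row data frame, mapping NULL to NA.
to_row <- function(rec) {
  vals <- lapply(expected, function(k) {
    v <- rec[[k]]
    if (is.null(v)) NA else v
  })
  names(vals) <- expected
  data.frame(vals, check.names = FALSE, stringsAsFactors = FALSE)
}

df <- do.call(rbind, lapply(clean, to_row))
```

This keeps the Community College of Baltimore County row, with `NA` for the missing earnings value, and discards the record that carries `UNITID` and a null `id`. Whether silently dropping such records is the right behavior depends on what they represent on the API side.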
Next steps
These issues started only recently, I'm guessing with the big changes to the API in April. Is this something that needs to be addressed on your end, or on my end with better error handling? Either way, thanks for your work on this. I'm also happy to send more info.
