Texas appears to have put its site behind Cloudflare, which blocks the scraper most of the time.
Periodically, however, the scraper starts working again; it's happened twice. I'm backburnering this to work on less-dumb things.
With modifications to a library, I've got a Zyte implementation that can get me the main file to scrape -- but not the Excel files I need. Possibly the Excel downloads can be fixed by adding a Referer header to the requests, but I don't know.
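An untested sketch of the Referer idea, using only the standard library. The URLs and the User-Agent string are placeholders I made up for illustration, and nothing here confirms that a Referer header is what the Texas site actually checks. The function only builds the request; it does not send it.

```python
import urllib.request

# Hypothetical URLs for illustration; the real ones are not recorded in these notes.
MAIN_PAGE_URL = "https://example.com/texas/main-page"
EXCEL_URL = "https://example.com/texas/files/report.xlsx"


def build_excel_request(excel_url, referer_url):
    """Build (but do not send) a GET request that carries a Referer header."""
    headers = {
        # The HTTP header name is spelled "Referer", not "Referrer".
        "Referer": referer_url,
        # Placeholder User-Agent; an assumption, not a known requirement.
        "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
    }
    return urllib.request.Request(excel_url, headers=headers)


request = build_excel_request(EXCEL_URL, MAIN_PAGE_URL)
```

Passing `request` to `urllib.request.urlopen` would attempt the download. Whether that gets past Cloudflare is exactly the open question.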
Most likely this would require a real browser implementation, possibly using Zyte as an HTTP proxy. (And we don't have existing proxy code.) The existing example of a real browser approach is Virginia.
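Since we have no proxy code yet, here is a rough idea of what the plain-HTTP half might look like with the standard library. The proxy host, port, and credential format are placeholders, not Zyte's real values; those would have to come from Zyte's documentation. This does not cover the real browser part at all.

```python
import urllib.request

# Placeholder proxy URL; the real host, port, and credentials are assumptions
# to be replaced with whatever the proxy provider documents.
PROXY_URL = "http://API_KEY:@proxy.example.com:8000"


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


opener = build_proxy_opener(PROXY_URL)
# opener.open("https://example.com/") would send the request via the proxy.
```

Keeping the proxy setup in one small function means the scraper's fetch logic would not need to know whether a proxy is in use.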
If this thing fixes itself twice a week, I'm inclined to worry about other stuff more.