-
Notifications
You must be signed in to change notification settings - Fork 28
BB2-4309 / BB2-4331 upgrading libraries / improving python package management #1442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The unit tests fail. I'm not yet sure why. The localstack will come up, but the unit tests do not configure their DB correctly, and crash out. However, this *seems* to be a working version set for Django 4.2.27 and other required updates.
The primary change is that we're not using the vendored files, but instead using the .txt directly. Therefore, the container build now runs pip-install, and pulls down packages. Apparently, I was not running `makemigrations` as part of the startup. That's an oversight. We should always makemigrations/migrate on every standup.
This doesn't matter, but the local dockerfiles (and, in production, where appropriate) should always use the same versions for the base image. Although it does not matter for MSLS, we already have the image for other builds, so switching from slim to full Debian does not matter.
This should hold us to where we were. However, our build/deploy process has to change for this to be picked up. That said, as a test, this would basically... change the build/deploy process, but no libraries. That would probably be the safest place to start. Then, we can start bumping libraries towards a full Django 5.2 move.
As part of this, we're using the `requirements.txt` to provide our SHAs, and we're going to move towards `--binary :all:` in our build. It will not be a single step, but part of it is that we no longer carry the library binaries in the tree.
This bumps to Django 4.2.26 -> 4.2.27 and sqlparse 0.5.0 -> 0.5.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: the changes in this file are a forward-looking pointer/comment for where we want to go next in terms of build, and a change to how we pip install inside of our local container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: the local build of containers does not need to include the ECR Selenium image. It was commented out regardless; removing this removes noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: this was missing. When standing up our local stack, we can and should always run makemigrations followed by migrate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: there was a point-in-time where some libraries updated, and this configuration variable was declared deprecated. We can leave it commented out, or we can bring it back... but it will have to go away as we upgrade core Django libraries.
Use the same container image/version of Python everywhere in the stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: I'm trying to have us use the same container/version of Python everywhere in our local stack, for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: this is described in the PR. It is the container that is used to compile the requirements.txt file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: this is the script that compiles our requirements files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: a few things going on here.
- Alphabetized. We should keep these files sorted, always. It makes navigating them easier.
- Everything is pinned. This will allow us to very intentionally upgrade libraries/sets of libraries.
This file is used locally, not in production. It subsumes everything in requirements.in.
For review: check that the same number of libraries are present and, if you're feelin' it, make sure every library on the left is present on the right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are all present!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: similarly.
- Alphabetized.
- All requirements are pinned. This includes libraries that are not directly imported.
To check this file, I would look at the requirements.txt in master. Then, check every library, and make sure that the versions match. The intent is that this requirements.in pins all of the versions in the existing requirements.txt.
There should only be two differences:
- Django 4.2.27
- sqlparse 0.5.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to verify functionality locally and tested the requirements generation with a minor change. I changed cryptography == 44.0.3 in requirements.in and re-ran make generate and was able to get everything spun back up. (Not including that upgrade here yet as I'm guessing it might happen with Django 5.2 and it's beyond the scope of the ticket, but this otherwise makes the upgrade process easier.) Great work @jadudm !
improving python package management
The primary focus of 4309 and 4331 were to (respectively):
The primary focus of the work of this ticket was making transparent how Python packages are versioned/managed in our build and deployment process.
change to versioning: everything is pinned
A significant change in this PR is that every requirement in both
requirements.inandrequirements.dev.inare now pinned (e.g.==) to a very specific version of a library. This includes libraries that we don't explicitly require, but are implicitly pulled in. This double-layer of pinning is to make sure nothing moves underneath us (for now).For example, in
requirements/requirements.in, we set constraints on what versions we want for various libraries supporting the application. Previously, we might have expressed things likeThis would "pin"
crypto, allowhtmlto upgrade (as long as it is at least v2.0), and thetacoslibrary could be anything. This can lead to surprising behavior. For us, it did not, because...removal of the
vendordirectoryThe
vendordirectory contained binary versions of our requirements. Literally, the compiled libraries of Python bytecode (and native libraries, I assume, from C). Because of how we do our installs in dev/prod, this is essentially an implicit set of "pins" in our installation. As a result, we did not have to explicitly pin everything inrequirements.in.By removing reliance on a
vendordirectory, we lost those constraints, and almost everything in our stack wanted to upgrade. This led to some breaking behaviors. Hence, the fix was to "back-port" the pins from ourrequirements.txtthat is in production into therequirements.inthat is in this PR.change to how we compile the requirements
The requirements were generated by hand. Now, you should be able to
This will create a container image based on the same version of Python we're running in prod (and the same version we're running locally), and inside of. that container, it will run
pip-compile. This is essentially what we did before, but it automates the work and it makes sure we are all running the compilation of the requirements in the same way every time.When we want to (say) upgrade Python, we will update the container spec (along with our local stack), and re-run
make generateto generate new requirements files.change to how we run the local stack
The last step of
make build(in the app container) is to now runpip installwith the following flags:(
requirements.dev.txtincludesrequirements.txt)--require-hashesmakes sure we are checking hashes on everything we pull in, so that if something changes upstream, and there is a hash conflict between the local.txtand the library we pull, the build will fail--no-depstellspip installto not install needed libraries automatically. Why? Because if we do not specify it in therequirements.txtfile, we do not want it in our build.--prefer-binarysays we prefer binary.whlfiles whenever possible. We have one or two libraries that do not have a.whl, so we cannot (yet) go to--only-binary :all:. This is our goal/target, but not in this PR. Doing so is the most secure posture we can take.pending: change to how we deploy
We need to update the deployment repo to run
pip installin the same way as described above.upgrade Django 4.2.26 -> 4.2.27, sqlparse 0.5.0 -> 0.5.4
It is possible to diff the prior and current version and see that only these two libraries changed. After the update, all tests continue to pass. The local stack/test client continue to behave normally.
next steps
Our next steps are:
urllib == 2.6.0Each of these are larger upgrades, but we can now selectively unpin requirements, and in doing so, control the rate at which we make changes to the infrastructure under the application.
JIRA Ticket:
BB2-4309
BB2-4331
What Does This PR Do?
What Should Reviewers Watch For?
Can you, with a clean environment (remove containers, images, volumes)...
make build-localmake run-local bfd=... auth=...python runtests.pyreturns 322 or so passing tests, no failuresThere should be no migrations in this PR.
Validation
See above.
What Security Implications Does This PR Have?
Please indicate if this PR does any of the following:
security engineer's approval.
This change has been discussed in some depth with our security team. We agree that this does not diminish our posture, and if anything, moves us to a more secure posture. It changes how we are getting our libraries into the stack, but we are still using hash-based versioning/checking; where we cannot use binary
.whlfiles under this approach (meaning, where we must use a source-code library), it is not "more insecure," as we were pulling the.tar.gzfor those libraries previously. We can now begin systematically removing/switching those libraries using our improved pinning approach.We will look for an explicit approving comment from a security engineer on this PR.
Any Migrations?
There should be no migrations. Please verify.