-
Notifications
You must be signed in to change notification settings - Fork 2
Add support for Vault to OpenBao migration #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @seunghun1ee, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust and automated solution for migrating existing HashiCorp Vault deployments to OpenBao using Ansible. It provides a structured approach to handle both single-node and High Availability Vault environments, ensuring that critical data, configurations, and TLS settings are seamlessly transitioned. The changes focus on orchestrating the storage backend migration, deploying OpenBao, and managing its lifecycle, thereby streamlining the process of adopting OpenBao. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a comprehensive Ansible role and playbook for migrating from HashiCorp Vault to OpenBao. The separation of logic for single-node and HA migrations is well-designed. However, the review identified several significant issues that need to be addressed. There is a critical logic flaw in the HA migration playbook concerning how new OpenBao nodes join the cluster, which will cause the migration to fail for followers. Additionally, there are multiple instances of unreliable pauses instead of robust polling, insecure file permissions, and inconsistencies in configuration variables. The provided feedback includes specific code suggestions and recommendations to fix these issues, improve reliability, and enhance the security of the migration process.
playbooks/ha_vault_bao_migration.yml
Outdated
| openbao_raft_leaders: | ||
| - "{{ (new_leader_api_addr_query.json.leader_address if inventory_hostname == vault_raft_leader else vault_leader_api_addr) | urlsplit('hostname') }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic for setting openbao_raft_leaders is incorrect for follower nodes. During the HA migration, followers are migrated first. This logic incorrectly makes them try to join the old Vault leader (vault_leader_api_addr) instead of forming a new OpenBao cluster. The first migrated follower should initialize the new OpenBao cluster, and all subsequent nodes should join it.
A more robust approach is to designate the first host in the ordered_vault_hosts group as the seed for the new OpenBao cluster. All other nodes will then join this seed node.
openbao_raft_leaders: "{{ [groups['ordered_vault_hosts'][0]] if inventory_hostname != groups['ordered_vault_hosts'][0] else [] }}"
playbooks/ha_vault_bao_migration.yml
Outdated
| ansible.builtin.wait_for: | ||
| timeout: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using wait_for with only a timeout creates an unreliable pause. A fixed delay may not be sufficient for a new leader to be elected in all environments. This is especially problematic in the context of this playbook where the step-down is executed on the last Vault node, meaning there are no other nodes to elect as a new leader. This part of the logic seems flawed and should be re-evaluated. If a pause is truly needed, ansible.builtin.pause is more explicit, but a polling mechanism that waits for a specific condition is always more robust.
|
|
||
| openbao_protocol: "{{ 'https' if openbao_tls_key and openbao_tls_cert else 'http' }}" | ||
|
|
||
| openbao_api_addr: "{{ openbao_bind_addr ~ ':' ~ openbao_api_port }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The openbao_api_addr default value is missing the protocol (http:// or https://). The OpenBao api_addr configuration parameter requires a full URL, including the protocol. This is inconsistent with the vault_api_addr definition in this same file and the documentation in the README, which specifies 'Full URL including protocol and port'.
openbao_api_addr: "{{ openbao_protocol }}://{{ openbao_bind_addr }}:{{ openbao_api_port }}"| ansible.builtin.file: | ||
| path: "{{ vault_staging_dir_path }}" | ||
| state: directory | ||
| mode: '0777' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ansible.builtin.wait_for: | ||
| timeout: 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using wait_for with only a timeout creates an unreliable pause. After restoring a snapshot, OpenBao will restart and become sealed. Instead of a fixed pause, you should poll the /v1/sys/seal-status endpoint until the sealed status is true. This reliably indicates that the restore process is complete and the instance is ready to be unsealed with its original keys.
roles/vault_bao_migration/README.md
Outdated
| 1. Print to Ansible log output (`openbao_log_keys`) | ||
| 1. Set a `openbao_keys` fact (`openbao_set_keys_fact`) | ||
| 1. Write to a file (`openbao_write_keys_file`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The numbered list is using 1. for all items. For clarity and correctness in Markdown, it should be an incrementing list (1., 2., 3.).
| 1. Print to Ansible log output (`openbao_log_keys`) | |
| 1. Set a `openbao_keys` fact (`openbao_set_keys_fact`) | |
| 1. Write to a file (`openbao_write_keys_file`) | |
| 1. Print to Ansible log output (`openbao_log_keys`) | |
| 2. Set a `openbao_keys` fact (`openbao_set_keys_fact`) | |
| 3. Write to a file (`openbao_write_keys_file`) |
roles/vault_bao_migration/README.md
Outdated
| tasks: | ||
| - name: Enable bao secrets engines | ||
| hashivault_secret_engine: | ||
| url: "https://vault.example.com:8200" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example URL https://vault.example.com:8200 could be confusing in a playbook for OpenBao post-deployment configuration. To avoid ambiguity, consider changing vault.example.com to something that more clearly represents OpenBao, such as bao.example.com.
| url: "https://vault.example.com:8200" | |
| url: "https://bao.example.com:8200" |
| - name: Check the number of the raft peers | ||
| ansible.builtin.shell: | ||
| cmd: > | ||
| docker exec -e VAULT_TOKEN=$VAULT_TOKEN -e VAULT_ADDR=$VAULT_ADDR | ||
| {% if vault_tls_ca != "" %} -e VAULT_CAPATH=$VAULT_CAPATH {% endif %} | ||
| {{ vault_docker_name }} | ||
| vault operator raft list-peers -format=json | jq -r '.data.config.servers | length' | ||
| register: vault_raft_peer_length | ||
| environment: | ||
| VAULT_ADDR: "{{ vault_api_addr }}" | ||
| VAULT_TOKEN: "{{ vault_root_token }}" | ||
| VAULT_CAPATH: "{{ '/vault/config/' + vault_tls_ca if vault_tls_ca != '' else omit }}" | ||
| changed_when: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using ansible.builtin.shell with jq introduces an external dependency on jq and can be less robust than using Ansible's built-in capabilities. You can achieve the same result by using the community.docker.docker_container_exec module to get the raw JSON output and then processing it with Ansible's from_json and length filters. This approach is more idiomatic to Ansible and removes the external dependency.
| owner_id: 100 | ||
| group_id: 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoding the owner_id and group_id is brittle, as these numeric IDs can vary across systems. If possible, use user and group names (e.g., openbao). If numeric IDs are necessary for this module, consider defining them as variables in defaults/main.yml to make them more manageable and easier to override.
| - name: Unseal OpenBao | ||
| ansible.builtin.import_role: | ||
| name: stackhpc.hashicorp.vault_unseal | ||
| vars: | ||
| vault_api_addr: "{{ openbao_api_addr }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This task to unseal OpenBao implicitly relies on the vault_unseal_keys variable being populated with the original Vault unseal keys. This is not obvious from the playbook and can lead to failures if the variable is not set correctly by the user. It would be good practice to:
- Add a check to ensure
vault_unseal_keysis defined and not empty before attempting to unseal. - Document this requirement clearly in the role's
README.mdfile.
dc56f29 to
a8a4561
Compare
and fix the old variable
Since ha_migration.yml is a separate playbook, migration_* variables from vault_bao_migration role may not be defined when the playbook is being used. Added default values to make the playbook safe to run without optional variables.
b145f17 to
172a03e
Compare
To acurately compare hostname and vault ha-status results
172a03e to
184de94
Compare
00d6d48 to
17f27f2
Compare
This option needs to be explictly set when using integrated storage backend (Raft) from Vault 1.20
17f27f2 to
e5b653f
Compare
As OpenBao's migration guide [1] is only tested against Vault 1.14.1 and OpenBao 2.2.0 [1] https://openbao.org/docs/guides/migration/
This reverts commit e5b653f.
No description provided.