Configs/Secrets Bootstrapping & Management #24
Blocks
Depends on
#11 Generated security.txt
python-support/python-support-infra
#19 Automate Wireguard Key Generation
python-support/python-support-infra
#28 Create Bucket-Limited Tokens for Each S3-Backed Volume Mount
python-support/python-support-infra
#23 Configs/Secrets Rotation
python-support/python-support-infra
Reference: python-support/python-support-infra#24
Both secrets and configs will be referred to as configs. The usage is identical.
We use `password-store` to provide static key-value storage for sensitive / docker-config-bound values. It has a minimal attack surface, access control (by GPG key ID), and enforces encryption for all values.
Manual Configs
We define manual configs as values that the user must type.
- A `run.sh` action, which prompts the user for all unset manual values (ex. provider tokens), and inserts them into the `password-store`.
Generated Configs
We define generated configs as having some combination of the following properties:
- `can_be_pregen: bool`: This config can be entirely and correctly generated before deployment, on `localhost`.
  - `true` examples: Secret key strings, signed security.txt files.
  - `false` examples: One-time and multi-read reusable tokens retrieved by API.
- `can_be_regen: bool`: This config can be regenerated to the exact same value.
  - `true` examples: Multi-read tokens retrieved by API, signed security.txt files.
  - `false` examples: One-time reusable tokens retrieved over APIs, secret key strings.
- `expiry: datetime`: This config should not be considered valid after this date.
- `no_cache: bool`: This config will not / should not live beyond this deploy cycle. It is handled with `set_fact` internally, and does not interact with the secret store.
  - `true` examples: Swarm join tokens, access-JWT during API communication via OAuth.
Implementation
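As a starting point, the four properties defined above could be modeled as a small record type. This is only a sketch under assumptions: the class name `ConfigSpec`, the `is_expired` helper, the entry names, and the expiry date are all hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConfigSpec:
    """Hypothetical description of one generated config."""

    name: str
    can_be_pregen: bool = True          # can be generated on localhost before deployment
    can_be_regen: bool = False          # can be regenerated to the exact same value
    expiry: Optional[datetime] = None   # None means "never expires"
    no_cache: bool = False              # lives only for this deploy cycle

    def is_expired(self, now: datetime) -> bool:
        # A config is no longer valid once `now` is past its expiry date.
        return self.expiry is not None and now > self.expiry


# Illustrative entries modeled on the examples given for each property.
SPECS = [
    ConfigSpec("secret_key", can_be_pregen=True, can_be_regen=False),
    ConfigSpec(
        "security_txt",
        can_be_pregen=True,
        can_be_regen=True,
        expiry=datetime(2030, 1, 1, tzinfo=timezone.utc),  # made-up date
    ),
    ConfigSpec("swarm_join_token", can_be_pregen=False, no_cache=True),
]
```

Keeping the record immutable (`frozen=True`) is a design choice of this sketch: the properties describe a config and are not expected to change during a deploy cycle.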
Thus, generating configs happens in three phases:
- Pregeneration (cached): Before deployment is attempted, configs with `can_be_pregen: true` are generated and inserted into `password-store` following these rules:
  - `can_be_regen: true` will be created, and might be rewritten.
  - `can_be_regen: false` will be created, but will never be rewritten.
  - `expiry < now` will always be rewritten, regardless of the above, even if contents change (rotation).
  - `password-store`'s git features allow for rollback if ex. the local system timezone is wrong.
- Hot-Path Generation (cached): During deployment, any `role` requiring a config marked `can_be_pregen: false` is responsible for generating the config and storing it correctly in the `password-store`.
  - `can_be_regen: true` might be created (as in, writing to `password-store` is optional), and might be rewritten.
    - See `userpass` in the `community.general.passwordstore` lookup.
  - `can_be_regen: false` will be created, but will never be rewritten.
  - `expiry < now` will always be rewritten, regardless of the above, even if contents change (rotation).
- Temporary Generation (uncached): During deployment, any `role` requiring a config with `no_cache: true` may rely on Ansible variables previously set with `set_fact` in another `role`.
  - Roles that use `set_fact` for this purpose should document this.
Some configs can be designed to work as any of these. We use these precedence rules:
Boiled down: Pre-generate everything, and keep hot-path generation to a minimum.
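The phases and write rules described above can be sketched as two small decision functions. This is a minimal illustration, not the project's implementation: the function names, the returned labels, and in particular the order of the checks in `phase()` are assumptions. Expired entries are treated as "always rewritten", following the rotation rule.

```python
from datetime import datetime, timezone
from typing import Optional


def phase(can_be_pregen: bool, no_cache: bool) -> str:
    """Pick a generation phase for a config (check order is an assumption)."""
    if no_cache:
        return "temporary"       # kept in set_fact only, never stored
    if can_be_pregen:
        return "pregeneration"   # generated on localhost before deployment
    return "hot-path"            # generated by a role during deployment


def write_action(exists: bool, can_be_regen: bool,
                 expiry: Optional[datetime], now: datetime) -> str:
    """Decide what to do with a cached config's password-store entry."""
    if not exists:
        return "create"          # missing entries are created
    if expiry is not None and now > expiry:
        return "rotate"          # expired: always rewritten, even if contents change
    if can_be_regen:
        return "may-rewrite"     # same value can be regenerated, so rewriting is safe
    return "keep"                # never rewritten


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(phase(can_be_pregen=True, no_cache=False))
    print(write_action(exists=True, can_be_regen=False, expiry=None, now=now))
```

Separating "which phase" from "what to write" keeps the rotation rule in one place, so both the pregeneration step and any hot-path role could share it.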
Tasks
To make this a reality:
- A `run.sh` action, which leverages root & stack config files to pre-generate all described tokens correctly, and inserts them into the `password-store`.
- A `run.sh` action, which checks all installed tokens for expiry, and errors if any are missing. This can ex. be run before `sync`, so that the user never runs the playbook without all required & unexpired `password-store` entries.
- In the `role` `deploy_config`, scan the config files to determine which `password-store` secrets to look up in addition to file-based configs when installing docker configs. Generated configs are deployed alongside the usual file/templated file configs.
  - Questions remain about how to allow hot-path configs to more easily do their own expiry checking. Do they edit their own config file or something? Solutions should be motivated by real-world cases.
- `password-store`.
With these in place, it should be possible to easily bootstrap all configs/secrets, know when to recreate them (and actually do so) when they expire, and, with the help of #23, redeploy them with minimal downtime.
- After updating a `password-store` entry (ex. with the `run.sh` action) and redeploying, #23 ensures that changed configs are immediately picked up by `lookup()` and that only the services that need to be restarted will be.
Future Work
One might want to:
- A `run.sh` action that removes an individual (ex. former employee) from the `password-store`, re-encrypts it, and rotates all configs/secrets that the individual had access to, before encouraging the user to re-deploy (so that the new, rotated configs/secrets may actually be applied).
- A `run.sh` action that adds an individual (ex. new employee) to the `password-store` and re-encrypts it.