Configs/Secrets Bootstrapping & Management #24
Depends on
#28 Create Bucket-Limited Tokens for Each S3-Backed Volume Mount
python-support/python-support-infra
#23 Configs/Secrets Rotation
python-support/python-support-infra
Reference: python-support/python-support-infra#24
Both secrets and configs will be referred to as configs. The usage is identical.
We use password-store to provide static key-value storage for sensitive / docker-config-bound values. It has a minimal attack surface, access control (by GPG key ID), and enforces encryption for all values.
Manual Configs
We define manual configs as values that the user must type.
- run.sh action, which prompts the user for all unset manual values (e.g. provider tokens), and inserts them into the password-store.
Generated Configs
We define generated configs as having some combination of the following properties:
- can_be_pregen: bool: This config can be entirely and correctly generated before deployment on localhost.
  - true examples: Secret key strings, signed security.txt files.
  - false examples: One-time and multi-read reusable tokens retrieved by API.
- can_be_regen: bool: This config can be regenerated to the exact same value.
  - true examples: Multi-read tokens retrieved by API, signed security.txt files.
  - false examples: One-time reusable tokens retrieved over APIs, secret key strings.
- expiry: datetime: This config should not be considered valid after this date.
- no_cache: bool: This config will not / should not live beyond this deploy cycle. Uses set_fact internally, and does not interact with the secret store.
  - true examples: Swarm join tokens, access JWTs during API communication via OAuth.
Implementation
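The properties of a generated config listed above could be captured in a small record type. This is only a sketch: the class name GeneratedConfig, the is_expired helper, and the example entry names are hypothetical and not part of the repository.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class GeneratedConfig:
    """Hypothetical description of one generated config (illustrative names)."""
    name: str                          # assumed: path of the entry in the password-store
    can_be_pregen: bool                # can be generated on localhost before deployment
    can_be_regen: bool                 # regenerating yields the exact same value
    expiry: Optional[datetime] = None  # not valid after this date; None means no expiry
    no_cache: bool = False             # lives only for this deploy cycle

    def is_expired(self, now: datetime) -> bool:
        # A config without an expiry never expires.
        return self.expiry is not None and now > self.expiry


# Examples matching the property descriptions (entry names are made up).
secret_key = GeneratedConfig("example/secret_key", can_be_pregen=True, can_be_regen=False)
security_txt = GeneratedConfig("example/security_txt", can_be_pregen=True, can_be_regen=True)
api_token = GeneratedConfig(
    "example/one_time_api_token",
    can_be_pregen=False,
    can_be_regen=False,
    expiry=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```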
Thus, generating configs happens in three phases:
- Pregeneration (cached): Before deployment is attempted, configs with can_be_pregen: true are generated and inserted into password-store following these rules:
  - can_be_regen: true will be created, and might be rewritten.
  - can_be_regen: false will be created, but will never be rewritten.
  - expiry < now (i.e. expired) will always be rewritten, regardless of the above, even if contents change (rotation).
  - password-store's git features allow for rollback if, e.g., the local system timezone is wrong.
- Hot-Path Generation (cached): During deployment, any role requiring a config marked can_be_pregen: false is responsible for generating the config and storing it correctly in the password-store.
  - can_be_regen: true might be created (as in, writing to password-store is optional), and might be rewritten. See userpass in the community.general.passwordstore lookup.
  - can_be_regen: false will be created, but will never be rewritten.
  - expiry < now (i.e. expired) will always be rewritten, regardless of the above, even if contents change (rotation).
- Temporary Generation (uncached): During deployment, any role requiring a config with no_cache: true may rely on Ansible variables previously set with set_fact in another role.
  - Roles that use set_fact for this purpose should document this.
Some configs can be designed to work as any of these. We use these precedence rules:
Boiled down: pre-generate everything, and keep hot-path generation to a minimum.
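The create/rewrite rules for the pregeneration phase could be expressed as a single decision function. This is a minimal sketch: the names Write and pregen_write_decision are hypothetical, and it assumes that the expiry rule means "the entry has expired" and that an entry without an expiry never expires.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Write(Enum):
    MUST = "must"    # create or rewrite the entry
    MAY = "may"      # rewriting is allowed but optional
    NEVER = "never"  # keep the stored value untouched


def pregen_write_decision(exists: bool, can_be_regen: bool,
                          expiry: Optional[datetime], now: datetime) -> Write:
    """Decide what pregeneration does with one password-store entry."""
    # An expired entry is always rewritten (rotation), regardless of can_be_regen.
    if expiry is not None and now > expiry:
        return Write.MUST
    # A missing entry is always created during pregeneration.
    if not exists:
        return Write.MUST
    # An existing, unexpired entry may only be rewritten if regeneration
    # yields the exact same value.
    return Write.MAY if can_be_regen else Write.NEVER


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
past = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(pregen_write_decision(True, False, None, now).value)  # existing secret key string
print(pregen_write_decision(True, False, past, now).value)  # same entry, but expired
```

For hot-path generation the same rules apply, except that creating a can_be_regen: true entry is optional there, so a caller would have to treat a missing entry of that kind as "may" rather than "must".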
Tasks
To make this a reality:
- run.sh action, which leverages root & stack config files to pre-generate all described tokens correctly, and inserts them into the password-store.
- run.sh action, which checks all installed tokens for expiry, and errors if any are missing or expired. This can, e.g., be run before sync, so that the user never runs the playbook without all required & unexpired password-store entries.
- In role deploy_config, scan the config files to determine which password-store secrets to look up in addition to file-based configs when installing docker configs. Generated configs are deployed alongside the usual file/templated file configs.
Questions remain about how to allow hot-path configs to more easily do their own expiry checking. Do they edit their own config file or something? Solutions should be motivated by real-world cases.
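The expiry-checking action from the task list could be sketched roughly as follows. It relies on the fact that password-store keeps each entry as a `<name>.gpg` file under the store directory. Everything else is an assumption: the function name check_entries is hypothetical, and so is the idea that the expected expiry dates are read from the config files into a plain mapping.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional


def check_entries(store_dir: Path,
                  required: Dict[str, Optional[datetime]],
                  now: datetime) -> List[str]:
    """Return one problem string per missing or expired required entry.

    `required` maps entry names to their expiry (None means no expiry).
    How expiries are recorded is an assumption of this sketch.
    """
    problems = []
    for name, expiry in sorted(required.items()):
        # password-store keeps each entry as an encrypted <name>.gpg file.
        if not (store_dir / f"{name}.gpg").is_file():
            problems.append(f"missing: {name}")
        elif expiry is not None and now > expiry:
            problems.append(f"expired: {name}")
    return problems
```

A run.sh action built on this could exit non-zero whenever the returned list is non-empty, which would stop a subsequent sync from running.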
[...] password-store.
Now it should be possible to easily bootstrap all configs/secrets, know when to recreate them (and actually do so) when they expire, and, with the help of #23, redeploy them with minimal downtime.
[...] password-store entry (e.g. with the run.sh action), then redeploying: #23 ensures that changed configs will be immediately picked up by lookup() and that only services that need to be restarted will be.
Future Work
One might want to:
- run.sh action that removes an individual (e.g. a former employee) from the password-store, re-encrypts it, and rotates all configs/secrets that the individual had access to, before encouraging the user to redeploy (so that the new, rotated configs/secrets may actually be applied).
- run.sh action that adds an individual (e.g. a new employee) to the password-store and re-encrypts it.
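As a rough illustration of the membership actions above: password-store records its recipients in a .gpg-id file, and running `pass init` with a changed list of key IDs re-encrypts the existing entries. The helper names below are hypothetical, and the sketch only builds the command rather than running it.

```python
from pathlib import Path
from typing import List


def recipients_without(gpg_id_file: Path, removed_key_id: str) -> List[str]:
    """Return the key IDs listed in a .gpg-id file, minus the removed one."""
    ids = [line.strip() for line in gpg_id_file.read_text().splitlines() if line.strip()]
    remaining = [key_id for key_id in ids if key_id != removed_key_id]
    if len(remaining) == len(ids):
        raise ValueError(f"{removed_key_id} is not a recipient")
    if not remaining:
        raise ValueError("refusing to remove the last recipient")
    return remaining


def reencrypt_command(recipients: List[str]) -> List[str]:
    # `pass init <gpg-id>...` re-encrypts existing entries for the given IDs.
    return ["pass", "init", *recipients]
```

Re-encryption alone does not revoke values the removed individual has already read, which is why the removal action also has to rotate the affected configs/secrets.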