# ask-the-community
g
Hey everyone; busy playing with Flyte and currently considering what the process should look like to have workflows progress from development -> staging -> production. Currently what I have in mind is the following:
1. `development` contains workflows currently being worked on, including ad-hoc registrations resulting from users running `pyflyte run --remote …`; these are run against the corresponding development environments of other apps.
2. `staging` contains workflows that are versioned (incl. prerelease versions) for deployment against the corresponding staging environments. Essentially this would be a dedicated environment for QA without all the noise of development, and a place to test rollout.
3. `production` has the same versioned workflows as `staging`, for deployment against the corresponding official production environment.
The above is aspirational and I was really glad to see Flyte’s separation of domains per project; however, I am missing some things to make this happen and would like to better understand how Flyte is intended to be used for this:
1. Development and Staging are fundamentally separate in all ways: they deploy separate code that is not “promoted” from one environment to the next. On the other hand, Production should run code identical to Staging without needing to rebuild/register the workflows after a particular version is cleared to be deployed to production. What is the appropriate way to deal with this? Or is it appropriate to simply reserialize and register the workflow again for the production domain?
2. As each environment is separate, there are separate secrets to be retrieved from an appropriate plugin; however, I haven’t found a way to configure this at runtime, i.e. per domain, without needing to hard-code the path to those secrets. Is there an idiomatic way to do this? Obviously hard-coding the secret path is not desirable, as it would make it difficult to keep the code between staging and production identical (see previous point).
Any advice appreciated.
g
@Lee Ning Jie Leon much appreciated; that is definitely useful for Python environments. Would you happen to know how I could supply that information to Secrets being used as part of something like a `SQLAlchemyTask`? There isn’t an accessible Python runtime for that, so one can’t access `current_context()`.
l
Not sure if I follow, but you can pass in `secret_connect_args`, like this block. I'm not sure if `flytekit.current_context().execution_id.domain` works outside tasks; if it doesn't, there is an env var too, though I'm not sure if this is recommended. https://flyte-org.slack.com/archives/CP2HDHKE1/p1691687747543199?thread_ts=1691687330.613679&cid=CP2HDHKE1
g
The trouble with `secret_connect_args` is that the secrets are passed to the `@task` annotation, which AFAICS is finalized at build-time… I guess there isn’t a way to specify different secrets based on the domain which a workflow is deployed to? Now that I think about it, even getting secrets from `current_context()` still requires specifying those secret requests on the `@task` beforehand anyway. I would need to load secrets for all environments into the task to have a switch based on `flytekit.current_context().execution_id.domain`.
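The switch described in this message can be sketched in plain Python. This is only an illustration of the idea, not flytekit's API: `SecretRequest` is a hypothetical stand-in for a secret request, and the Vault-style group paths are invented. Note that every environment's secret still has to be requested up front, which is the drawback raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecretRequest:
    """Hypothetical stand-in for a secret request (group + key)."""

    group: str
    key: str


# Assumed secret paths, one per Flyte domain (names invented for illustration).
SECRETS_BY_DOMAIN = {
    "development": SecretRequest(group="apps/dev/db", key="password"),
    "staging": SecretRequest(group="apps/staging/db", key="password"),
    "production": SecretRequest(group="apps/prod/db", key="password"),
}

# Everything the task would have to request ahead of time, because the
# requests are fixed when the task is registered.
ALL_SECRET_REQUESTS = list(SECRETS_BY_DOMAIN.values())


def secret_for_domain(domain: str) -> SecretRequest:
    """Pick the request that matches the domain the execution runs in."""
    try:
        return SECRETS_BY_DOMAIN[domain]
    except KeyError:
        raise ValueError(f"no secret configured for domain {domain!r}") from None
```

Inside a task, the `domain` argument would come from `flytekit.current_context().execution_id.domain`, and the selected group and key would then be looked up through the task's secrets.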
l
Probably a few ways that it can be done, but it might not be the cleanest way. Wait for someone who knows better to comment 😅
s
1. You'll need to register the tasks/workflows for them to be available in the production domain (you can use the same version ID). You can also reference your "staging" launch plans in the production domain, but I'm not sure if it's applicable in this case, as it makes more sense to have the tasks/workflows available in production.
2. Does it make sense to create different secrets in all your namespaces under the same group-key? This works if you want to retrieve the same set of secrets in every namespace. If that's not the case, I think you can give pod templates a try: https://discuss.flyte.org/t/9984966/hello-i-need-to-create-secrets-and-access-them-in[…]-am?threads%5Bquery%5D=secrets%20in%20different%20namespaces
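A rough sketch of point 1: registering one already-serialized package under the same version ID in both domains. It assumes the archive was produced once (for example by `pyflyte package`) and that `flytectl register files` accepts the flags shown; please check them against your installed version. The project name and version are made up, and the script only builds and prints the commands.

```python
from typing import List


def register_commands(
    archive: str, project: str, version: str, domains: List[str]
) -> List[List[str]]:
    """Build one `flytectl register files` call per domain.

    The same archive and the same version are reused for every domain,
    so nothing is rebuilt or reserialized between staging and production.
    """
    return [
        [
            "flytectl", "register", "files", archive, "--archive",
            "--project", project,
            "--domain", domain,
            "--version", version,
        ]
        for domain in domains
    ]


# Hypothetical project name and version, for illustration only.
commands = register_commands(
    "flyte-package.tgz", "myproject", "v1.2.0", ["staging", "production"]
)
for cmd in commands:
    print(" ".join(cmd))
    # To actually register, run each command, e.g.:
    # subprocess.run(cmd, check=True)
```

Only the `--domain` value differs between the two commands, which keeps the registered entities identical apart from the domain they live in.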
g
@Samhita Alla that makes sense for `k8s` secrets; however, I am using HashiCorp Vault secrets, which means that either the `group` or the `key` will need to change between environments because they are not automatically separated per namespace. I was just about to experiment with using something like the Vault Operator/Injector to create secrets automatically in each namespace and then reference those secrets from within Flyte as `k8s` secrets… I’m sure that will work, but it is unfortunate that I cannot seem to use the Vault plugin out of the box. I have also had other difficulties with it.
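To make the namespace idea concrete, here is a small sketch of which namespaces a synced secret would need to exist in. It assumes Flyte's default `{project}-{domain}` namespace naming, which a deployment can override, and uses a hypothetical project and secret name. With the same secret name in every namespace, the task's group/key request stays identical across domains.

```python
from typing import Dict, List


def flyte_namespace(project: str, domain: str) -> str:
    """Namespace for a project/domain pair, assuming the default naming."""
    return f"{project}-{domain}"


def secret_targets(project: str, domains: List[str], secret_name: str) -> Dict[str, str]:
    """Map each namespace to the k8s secret that should be created in it.

    The secret name is the same everywhere; only the namespace (and the
    value synced into it) differs per environment.
    """
    return {flyte_namespace(project, d): secret_name for d in domains}


# Hypothetical project and secret names, for illustration only.
targets = secret_targets(
    "myproject", ["development", "staging", "production"], "db-credentials"
)
for namespace, name in targets.items():
    print(f"{namespace}: {name}")
```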
s
I'll wait for input from @Dan Rammer (hamersaw).
d
@Greg Linklater from the comment on the issue it sounds like this is not a priority for you anymore? Unfortunately, with limited resources we need to prioritize our efforts. I think in this case there is a relatively clear solution: update the SQLAlchemy plugin to support vault "db" secrets. However, right now I do not think we have the bandwidth to hack this together, but we will certainly keep it in the backlog. This is the perfect type of issue for a community contribution if anyone else stumbles on it and would be interested.
g
@Dan Rammer (hamersaw) ACK. Right now, using Vault Operator to automatically create and manage k8s secrets for Flyte appears to be a path forward, or at least a way to unblock me and resolve #2 above. However, it is still quite strange that versioned workflows can’t be used across domains or promoted from one domain to another (see #1), and it is very difficult to supply environmental information (credentials, connection strings, etc.) to a workflow at runtime. I am not sure what purpose domains are supposed to serve or how they are intended to be used.