<@U06HBGN1SA0> How do I update to a new flyte vers...
# ask-ai
a
@Glime How do I update to a new flyte version using terraform for flyte-core on GCP
g
To update to a new Flyte version using Terraform for Flyte-core on GCP, you would typically follow these steps:
1. Check for Updates: First, ensure that there is a new version of Flyte available that you wish to upgrade to. You can check the Flyte GitHub releases for the latest versions.
2. Update Terraform Configuration: If the new Flyte version requires changes to your Terraform configuration (for example, new variables, changes in the module versions, etc.), you should update your Terraform files accordingly. This might involve updating the Flyte Helm chart version in your Terraform configuration. The specific changes would depend on the release notes of the Flyte version you are upgrading to.
3. Apply Terraform Changes: Run terraform plan to see the changes that will be applied and then terraform apply to apply the changes. This step will update the infrastructure as needed based on the changes you made in the Terraform configuration.
4. Update Flyte Components: If the upgrade involves changes to the Flyte components themselves (for example, new Docker images for Flyte services), these changes should be reflected in the Helm chart values. When you apply the Terraform changes, it should automatically update the Flyte components if you have configured Terraform to manage your Helm releases.
5. Verify the Upgrade: After applying the changes, verify that the Flyte cluster is running the new version correctly. You can check the version of the deployed Flyte components and ensure that all services are operational.
Please note, the steps above are a general guide. The specific steps for your upgrade might vary based on the Flyte version you are upgrading from and to, as well as your specific Terraform and Flyte configuration. Always refer to the release notes of the Flyte version for any version-specific upgrade instructions. Additionally, it's a good practice to perform the upgrade in a staging environment before applying it to your production environment to ensure that the upgrade process goes smoothly and does not disrupt your workflows.
d
@Andrew I guess in this case you mean upgrading to a new version of the Helm chart?
a
Yes, that’s right
@David Espejo (he/him) related question. I can't seem to get any terraform to run. It's always timing out, with this error:
Error: context deadline exceeded
It's been like this ever since I updated and had that
flyte
flyteadmin
database issue, which I fixed in the code, but I'm not sure if I ended up in a weird state. Whenever I try to run terraform and get that timeout error, some of the pods are on
CrashLoopBackOff
, but without any real errors. Any tips?
d
hey Andrew which Pods are failing? if you run a
kubectl describe
on the failing pods, isn't there anything in the events section?
a
datacatalog
got
Init:Error
as status
flyteadmin
got
Init:CrashLoopBackOff
flytescheduler
got
CrashLoopBackOff
syncresources
got
Error
Back-off restarting failed container run-migrations in pod datacatalog
That's what all of them say, so not really an error, but something with the
run-migrations
step
I'm currently looking into everything I've added to the values file to see if anything is weird there
d
I'm curious if that container emitted any logs?
kubectl logs <flyteadmin-pod-name> -c run-migrations -n flyte
a
[error] failed to initialize database, got error failed to connect to host=<ip> user=flyte database=flyte: failed SASL auth (FATAL: password authentication failed for user "flyte" (SQLSTATE 28P01))
Ah, that's it. It's still trying to use
flyte
instead of
flyteadmin
for some reason
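When the migration log is longer than a single line, a small filter can pull out just the credentials the container tried to use. This is only a sketch: failed_db_identity is a hypothetical helper name, and the pattern assumes the log line has the same shape as the one pasted above.

```shell
# Hypothetical helper (not part of Flyte or kubectl): read log lines on stdin
# and print the user and database from any "failed to connect" line.
# Assumes the "user=<name> database=<name>:" layout seen in the log above.
failed_db_identity() {
  sed -n 's/.*failed to connect to .*user=\([^ ]*\) database=\([^ :]*\).*/user=\1 database=\2/p'
}

# Example against a live pod (placeholders as in the thread):
#   kubectl logs <flyteadmin-pod-name> -c run-migrations -n flyte | failed_db_identity
```

If the printed user or database is not the one that was actually provisioned, the chart values are the next place to look.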
additional_databases = [
  {
    name      = "flyteadmin"
    charset   = ""
    collation = ""
  }
]

additional_users = [
  {
    name            = "flyteadmin"
    password        = ""
    random_password = true
  }
]
I did revert these after it failed when I pulled the new code. Not sure if it's in a weird state or something
d
I see the DB name and user hardcoded in the values file https://github.com/unionai-oss/deploy-flyte/blob/6a6765cd4cb92fad46bb4b6466edf8f5a766bbb4/environments/gcp/flyte-core/values-gcp-core.yaml#L198-L217 So I guess the situation is: you have a
flyteadmin
CloudSQL instance with a
flyteadmin
user but your values file points to a
flyte
DB with a
flyte
username. Changing your values file to reflect that should make it work. I wonder how it has worked before but let's fix this first.
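A quick way to spot this kind of drift is to compare the user and database names in the values file with the ones Terraform provisioned. The sketch below assumes the values file stores them under keys named username and dbname, one per line; db_value_mismatches is a hypothetical helper, not something shipped with the repo.

```shell
# Hypothetical helper: list username/dbname entries in a Helm values file
# that differ from the expected user and database names.
# Usage: db_value_mismatches <values-file> <expected-user> <expected-dbname>
# Assumes keys literally named "username" and "dbname", one per line.
db_value_mismatches() {
  awk -v user="$2" -v db="$3" '
    # Match lines such as "      username: flyte" or "      dbname: flyte".
    /^[ \t]*(username|dbname):/ {
      key = $1
      sub(/:$/, "", key)
      val = $2
      gsub(/"/, "", val)   # drop optional double quotes around the value
      want = (key == "username") ? user : db
      if (val != want)
        printf "%s:%d: %s is %s, expected %s\n", FILENAME, FNR, key, val, want
    }
  ' "$1"
}

# Example (file name from the linked repo; expected names are placeholders):
#   db_value_mismatches values-gcp-core.yaml <db-user> <db-name>
```

No output means every username and dbname entry already matches; each printed line points at a file position to fix.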
a
Ahh, I missed those lines. That would make sense, actually, because it's been doing this since I updated from upstream, which pulled in that database name change. I reverted it in the sql file, but not in the values file
Ok, awesome, that fixed it
And then, as for updating the chart, I ran
helm repo update
and then
terraform apply
, does that seem right? It just listed
update in-place
as the terraform execution
d
that's a workaround, yes. what do you see when you run
helm ls -n flyte
?
a
NAME      	NAMESPACE	REVISION	UPDATED                            	STATUS  	CHART             	APP VERSION
flyte-core	flyte    	46      	2024-04-10 10:43:34.72876 -0600 MDT	deployed	flyte-core-v1.10.0
d
so it's still 1.10. Latest is 1.11.0
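To make that check repeatable after each apply, the CHART column can be read straight from the helm ls output. A minimal sketch, assuming the default tab-separated table layout shown above; release_chart is a hypothetical helper name.

```shell
# Hypothetical helper: print the CHART column for one release, reading the
# default (tab-separated) `helm ls` table on stdin.
# Usage: helm ls -n flyte | release_chart flyte-core
release_chart() {
  awk -F'\t' -v rel="$1" '
    NR > 1 {
      name = $1; chart = $6
      # helm pads columns with trailing spaces; strip them before comparing
      sub(/ +$/, "", name); sub(/ +$/, "", chart)
      if (name == rel) print chart
    }
  '
}

# Example: warn if the release is not yet on the chart mentioned in the thread.
#   [ "$(helm ls -n flyte | release_chart flyte-core)" = "flyte-core-v1.11.0" ] \
#     || echo "flyte-core is not on the target chart yet"
```

The target string in the example follows the chart naming seen in the output above; adjust it to whichever version you are upgrading to.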
a
Yes. I haven't updated it since initially setting it up. Maybe it's not really required, but figured I'd try as part of the troubleshooting. But now that it's working maybe it's not a big deal
d
got it. But still an interesting find. I'll see how to handle it from the modules themselves