# flyte-on-gcp
I’m trying to run the above gcp terraform. I got these three errors. I’m not sure why it didn’t find the flyte namespace, I was able to point to it with
. And I can’t find those two bucket names to know where to change them. Any ideas?
│ Error: namespaces "flyte" not found
│   with kubernetes_secret.flyte-tls-secret,
│   on ingress.tf line 40, in resource "kubernetes_secret" "flyte-tls-secret":
│   40: resource kubernetes_secret "flyte-tls-secret" {
│ Error: googleapi: Error 409: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again., conflict
│   with module.flyte_data.google_storage_bucket.buckets["flyte-gcp-data"],
│   on .terraform/modules/flyte_data/main.tf line 40, in resource "google_storage_bucket" "buckets":
│   40: resource "google_storage_bucket" "buckets" {
│ Error: googleapi: Error 409: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again., conflict
│   with module.flyte_user_data.google_storage_bucket.buckets["flyte-gcp-user-data"],
│   on .terraform/modules/flyte_user_data/main.tf line 40, in resource "google_storage_bucket" "buckets":
│   40: resource "google_storage_bucket" "buckets" {
ok, good findings. Thanks!
1. For the secret issue, I'll try an explicit dependency on the namespace resource.
2. Right, GCS bucket names have to be globally unique, so I'll add a randomized suffix to avoid this situation.
PR to come.
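For readers following along, an explicit dependency of the kind mentioned here might look roughly like the sketch below. It assumes the `flyte` namespace is managed as a `kubernetes_namespace` resource in the same configuration. The resource name `flyte`, the secret's metadata, and the two variables are hypothetical, not taken from the actual module. If the namespace is created by something else (for example a Helm release), `depends_on` would point at that resource instead.

```hcl
# Hypothetical inputs for the TLS material; not part of the original module.
variable "tls_cert" {
  type      = string
  sensitive = true
}

variable "tls_key" {
  type      = string
  sensitive = true
}

# Assumed: the namespace is declared in the same Terraform configuration.
resource "kubernetes_namespace" "flyte" {
  metadata {
    name = "flyte"
  }
}

resource "kubernetes_secret" "flyte-tls-secret" {
  metadata {
    name      = "flyte-tls-secret" # hypothetical secret name
    namespace = "flyte"
  }

  type = "kubernetes.io/tls"

  data = {
    "tls.crt" = var.tls_cert
    "tls.key" = var.tls_key
  }

  # Explicit ordering: create the secret only after the namespace exists.
  depends_on = [kubernetes_namespace.flyte]
}
```

Without the `depends_on` (or a reference to an attribute of the namespace resource), Terraform has no way to know the secret must wait for the namespace, which would explain a `namespaces "flyte" not found` error on a first apply.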
Awesome, thanks!
please fetch. I just pushed a commit. Basically using the PROJECT_NUMBER to build a globally-unique ID for GCS buckets. Updated the instructions too
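A minimal sketch of the idea, assuming the project number is read through the `google_project` data source. The variable name, the local, the resource name, and the bucket location are hypothetical, and the real module creates its buckets through a nested module rather than a bare resource:

```hcl
variable "project_id" {
  type = string
}

# Look up the numeric project number for the given project ID.
data "google_project" "current" {
  project_id = var.project_id
}

locals {
  # Hypothetical naming scheme: base name plus the project number, which is
  # unique across GCP and so keeps the bucket name globally unique.
  data_bucket_name = "flyte-gcp-data-${data.google_project.current.number}"
}

resource "google_storage_bucket" "data" {
  name     = local.data_bucket_name
  project  = var.project_id
  location = "US" # assumption; pick the location you actually need
}
```

Compared with a random suffix, a project-number suffix is deterministic, so repeated applies in the same project resolve to the same bucket name.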
Trying again. What’s the best way to ignore already existing resources? For example service accounts it created on the first apply
Terraform should ignore them by default
Oh, for some reason it errored out on service accounts
One example:
│ Error: Error creating service account: googleapi: Error 409: Service account flyte-gcp-flyteadmin already exists within project projects/dai-ml-pipelines.
│ Details:
│ [
│   {
│     "@type": "type.googleapis.com/google.rpc.ResourceInfo",
│     "resourceName": "projects/<project>/serviceAccounts/flyte-gcp-flyteadmin@<project>.iam.gserviceaccount.com"
│   }
│ ]
│ , alreadyExists
│   with google_service_account.flyteadmin-gsa,
│   on iam.tf line 15, in resource "google_service_account" "flyteadmin-gsa":
│   15: resource "google_service_account" "flyteadmin-gsa" {
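A note for anyone hitting the same 409: Terraform only skips creation for resources it already tracks in its state. If a service account exists in the project but not in the state (for example after a partially failed apply or a lost state file), one option is to import it rather than recreate it. The sketch below assumes Terraform 1.5 or later, which supports `import` blocks, and reuses the placeholder `<project>` from the error output above. It is not what was done in this thread.

```hcl
# Adopt the already-existing service account into Terraform state.
# Replace <project> with the real project ID before running `terraform plan`.
import {
  to = google_service_account.flyteadmin-gsa
  id = "projects/<project>/serviceAccounts/flyte-gcp-flyteadmin@<project>.iam.gserviceaccount.com"
}
```

On the next `terraform apply`, the resource would be brought under management instead of triggering an `alreadyExists` error. The block can be removed once the import has succeeded.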
wow that's weird
let's terraform destroy and try again?
sounds good
destroy didn’t entirely work, and now I think I’m in a weird state. The main issue seems to be the VPC network
ye, the VPC network most likely has dependencies. you can delete it from the UI, then destroy again.
It says
The auto-generated peering route cannot be deleted.
when I try to delete, missed that part
But I don’t see any peering routes
go into the VPC network : VPC peering and there's one
I think I’d already done that. It seems like that line is just an artifact left over, I’m trying an apply again
I think it's up and running now! I’ll have to come back to it next week to test it out and try some workflows. Thanks for all the help
great to hear. any problem that arises, just let me know, I hope to keep improving these modules
one other quick question. since I didn’t use helm directly, where it talks about authentication, and changing the helm values file, how would that work? I know there’s the
but do I helm install with that?
you should use that values file yes, but apply it with
terraform apply
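As a rough illustration of how a values file is typically wired into Terraform, the sketch below uses the Helm provider's `helm_release` resource. The release name, chart choice, and file name are assumptions for illustration. The actual module may name and structure these differently.

```hcl
resource "helm_release" "flyte" {
  name       = "flyte-backend"                     # hypothetical release name
  namespace  = "flyte"
  repository = "https://flyteorg.github.io/flyte"
  chart      = "flyte-core"                        # assumption; the module may use another chart

  # The values file is read from disk and passed to Helm on every apply,
  # so edits to it (for example the auth section) take effect through
  # `terraform apply` rather than a manual `helm install`.
  values = [
    file("${path.module}/values.yaml")             # hypothetical file name
  ]
}
```

With this pattern, running `helm install` by hand would create a release Terraform does not know about, which is why the change should go through Terraform.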
Hey, another follow up to this. The console was working fine for a bit, and then it stopped loading and when I checked the logs for the admin pods (which show
), I got this:
Defaulted container "flyteadmin" out of: flyteadmin, run-migrations (init), seed-projects (init), sync-cluster-resources (init), generate-secrets (init)
time="2023-11-28T00:11:12Z" level=info msg="Using config file: [/etc/flyte/config/cluster_resources.yaml /etc/flyte/config/clusters.yaml /etc/flyte/config/db.yaml /etc/flyte/config/domain.yaml /etc/flyte/config/remoteData.yaml /etc/flyte/config/server.yaml /etc/flyte/config/storage.yaml /etc/flyte/config/task_resource_defaults.yaml]"
Error: [CERTIFICATE_FAILURE] failed to load X509 key pair: , caused by: open : no such file or directory
[CERTIFICATE_FAILURE] failed to load X509 key pair: , caused by: open : no such file or directory
  flyteadmin serve [flags]

  -h, --help   help for serve

Global Flags:
      --admin.audience string                                                      Audience to use when initiating OAuth2 
... < a lot of doc output here>

settings for file-filtered logging

panic: [CERTIFICATE_FAILURE] failed to load X509 key pair: , caused by: open : no such file or directory

goroutine 1 [running]:
        /go/src/github.com/flyteorg/flyteadmin/cmd/main.go:14 +0x9f
sorry, can you revert
? It should be overridden by the
insecure: false
flag in your local config file
perfect, I think that was it, it's up again right now. thanks!
great. I'm curious, does the console show you a valid certificate? It should do so, but just checking 🙂
Looks like it is, it's showing as secured 🙂