cuddly-application-93412
07/14/2023, 8:44 PM

kind-kite-58745
07/16/2023, 1:01 PM
locals {
  service_accounts = {
    flyteadmin = [
      "iam.serviceAccounts.signBlob",
      "storage.buckets.get",
      "storage.objects.create",
      "storage.objects.delete",
      "storage.objects.get",
      "storage.objects.getIamPolicy",
      "storage.objects.update",
    ],
    flytepropeller = [
      "storage.buckets.get",
      "storage.objects.create",
      "storage.objects.delete",
      "storage.objects.get",
      "storage.objects.getIamPolicy",
      "storage.objects.update",
    ],
    flytescheduler = [
      "storage.buckets.get",
      "storage.objects.create",
      "storage.objects.delete",
      "storage.objects.get",
      "storage.objects.getIamPolicy",
      "storage.objects.update",
    ],
    datacatalog = [
      "storage.buckets.get",
      "storage.objects.create",
      "storage.objects.delete",
      "storage.objects.get",
      "storage.objects.update",
    ],
    flyteworkflow = [
      "storage.buckets.get",
      "storage.objects.create",
      "storage.objects.delete",
      "storage.objects.get",
      "storage.objects.list",
      "storage.objects.update",
    ],
  }
}
ingress-nginx: a helm_release from https://kubernetes.github.io/ingress-nginx, chart name ingress-nginx (I use version 4.0.13).
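For reference, a minimal helm_release sketch for that chart; the release name and namespace here are my own choices, adjust as needed:

```hcl
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  version          = "4.0.13"
  namespace        = "ingress-nginx"
  create_namespace = true
}
```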
cert-manager: a helm_release from https://charts.jetstack.io, chart name cert-manager, version v1.12.0. Note that cert-manager here is v1.12.0 instead of the v0.12.0 used in the documentation example; we need it to be compatible with newer versions of Kubernetes.
flyte-core: a Helm chart from https://flyteorg.github.io/flyte, chart name flyte-core, with your preferred Flyte version (I use 1.7.0). Use the values from https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values-gcp.yaml
I recommend passing helm values using templatefile to allow dynamic configuration based on Terraform values. Here's my example:
values = templatefile("../infra-root-modules/helm-values/flyte.yaml", {
  project_id     = var.gcp_project
  db_host        = module.flyte-psql-instance[0].private_ip_address
  db_password    = sensitive(var.flyte_cluster_secrets["${var.environment}/flyte_sql_root_pw"]) # I use the carlpett sops provider for secrets; handle this however you prefer
  storage_bucket = module.flyte-storage[0].name
  host_name      = "flyte.${var.environment}.${var.domain}"
})
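If you use helm_release directly rather than a wrapper module, note that its values input is a list of strings, so the templatefile call goes inside a list. A sketch, where the release name, namespace, and the cert-manager release reference are placeholders:

```hcl
resource "helm_release" "flyte_core" {
  name             = "flyte-core"
  repository       = "https://flyteorg.github.io/flyte"
  chart            = "flyte-core"
  version          = "1.7.0"
  namespace        = "flyte"
  create_namespace = true

  # helm_release expects a list of values documents
  values = [
    templatefile("../infra-root-modules/helm-values/flyte.yaml", {
      project_id = var.gcp_project
      # remaining template variables as above
    })
  ]

  depends_on = [helm_release.cert_manager]
}
```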
From my initial observation, Flyte will automatically create the Certificate resource, but Certificate is a custom resource installed by cert-manager, so make sure you pass the helm value installCRDs: true to your cert-manager release, and have the flyte helm_release depends_on the cert-manager helm_release so it will be able to create the Certificate. You'll also need to set up an Issuer first; in my case I prefer a ClusterIssuer, because it lets you keep cert-manager and Flyte in separate namespaces. Use the kubectl provider, or a kubernetes_manifest resource with the kubernetes provider, to create something like this (in my case I used a templatefile, so there are some placeholder values):
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - selector: {}
        http01:
          ingress:
            class: ${ingress_class}
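One way to apply that template from Terraform with the kubernetes provider; the file path, variable names, and the cert-manager release reference here are placeholders, not the exact ones from my setup:

```hcl
resource "kubernetes_manifest" "letsencrypt_production" {
  manifest = yamldecode(templatefile("../infra-root-modules/k8s-manifests/cluster-issuer.yaml", {
    email         = var.letsencrypt_email
    ingress_class = "nginx"
  }))

  # The ClusterIssuer CRD must exist before this manifest can be applied
  depends_on = [helm_release.cert_manager]
}
```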
If using GKE Autopilot, you'll need to set this in the cert-manager values file as well (replace the placeholder):
global:
  leaderElection:
    namespace: ${certmanager_namespace}
This lets cert-manager create leases in a namespace other than kube-system, because GKE Autopilot restricts access to the kube-system namespace. You will also probably need to configure at least 500m CPU requests in your Flyte helm values.yaml, because Flyte uses pod anti-affinity, which requires a minimum of 500m CPU requests on GKE Autopilot.
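As a sketch, that CPU bump goes in the per-component resources blocks of the flyte-core values; which components you need to raise may vary, flyteadmin is shown here only as an example:

```yaml
flyteadmin:
  resources:
    requests:
      cpu: 500m
```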
Your helm install will probably fail because DNS is not set up yet; it seems necessary for the ingress to work, which Flyte also uses. Now set up DNS however you like; I use Cloudflare for this.
Next there's the Cloud SQL database. Create a google_sql_database_instance, create a google_sql_database named 'flyteadmin', and create a google_sql_user. Pass the user's name and the host IP address outputs from Terraform to Flyte using templatefile on your values.yaml file. It seems we can't use DNS names here (or the connection name), at least not with private IPs, so I am using a static IP address for now.
Next there's the GCS bucket, a simple google_storage_bucket.
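A sketch of those resources; the tier, region, network reference, and bucket name are assumptions, and you likely want private IP configured on the instance:

```hcl
resource "google_sql_database_instance" "flyte" {
  name             = "flyte"
  database_version = "POSTGRES_14"
  region           = var.gcp_region

  settings {
    tier = "db-custom-1-3840"
    ip_configuration {
      ipv4_enabled    = false
      private_network = var.network_id
    }
  }
}

resource "google_sql_database" "flyteadmin" {
  name     = "flyteadmin"
  instance = google_sql_database_instance.flyte.name
}

resource "google_sql_user" "flyte" {
  name     = "flyteadmin"
  instance = google_sql_database_instance.flyte.name
  password = var.flyte_db_password
}

resource "google_storage_bucket" "flyte_storage" {
  name     = "${var.gcp_project}-flyte-storage"
  location = var.gcp_region
}
```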
I currently have everything set up except for the DNS; I'm connecting it at the moment. I'll update here if there's anything else worth mentioning about this process. Hope this helps

kind-kite-58745
07/16/2023, 5:29 PM
resource "cloudflare_record" "flyte" {
  zone_id = var.cloudflare_zone_id
  name    = local.flyte_host
  value   = data.kubernetes_service.nginx-lb.status[0].load_balancer[0].ingress[0].ip
  type    = "A"
  ttl     = 3600
  proxied = false
}
data "kubernetes_service" "nginx-lb" {
  metadata {
    name      = "${module.nginx-ingress[0].release_name}-ingress-nginx-controller"
    namespace = module.nginx-ingress[0].namespace
  }
  depends_on = [module.nginx-ingress]
}
Note that I used modules wrapping the charts instead of helm_release directly. You don't need to do this; you can use helm_release directly. This is something unique to my own use case because of other, unrelated requirements.
The idea is that you add a datasource for the ingress-nginx loadbalancer service, configured with the service name (which is going to have the release name as its prefix) and the namespace where you expect it to be.
Then you can get its IP address using data.kubernetes_service.nginx-lb.status[0].load_balancer[0].ingress[0].ip
If using a ClusterIssuer, make sure you configure the annotations of the ingress resources in the Flyte helm values accordingly. Example from my flyte-core values.yaml templatefile:
common:
  ingress:
    host: "{{ .Values.userSettings.hostName }}"
    tls:
      enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
      cert-manager.io/cluster-issuer: "letsencrypt-production"
      nginx.ingress.kubernetes.io/whitelist-source-range: ${whitelisted_cidrs}
Use either the cert-manager.io/cluster-issuer or the cert-manager.io/issuer annotation, depending on your choice with the kubernetes_manifest; it has to match.

kind-kite-58745
07/20/2023, 1:25 PM
Your google_container_cluster should be configured with a workload_identity_config block like this:
resource "google_container_cluster" "gke" {
  ...
  workload_identity_config {
    workload_pool = "${var.gcp_project}.svc.id.goog"
  }
  ...
}
Then create the service accounts for each Flyte component by looping with for_each over the local map I shared above:
resource "google_service_account" "flyte_sa" {
  for_each     = local.service_accounts
  account_id   = each.key
  display_name = each.key
  project      = var.gcp_project
}
Add the custom role for each:
resource "google_project_iam_custom_role" "flyte_role" {
  for_each    = local.service_accounts
  title       = each.key
  project     = var.gcp_project
  permissions = each.value
  # Roles are not deleted immediately behind the scenes, so the name should be
  # unique; use a random_string resource to generate a suffix.
  role_id     = "${each.key}_${random_string.role_id_suffix.id}"
}
Bind the GCP roles to the GCP service accounts:
resource "google_project_iam_member" "membership" {
  for_each = local.service_accounts
  project  = var.gcp_project
  role     = google_project_iam_custom_role.flyte_role[each.key].name
  member   = "serviceAccount:${google_service_account.flyte_sa[each.key].email}"
}
Create bindings to allow Kubernetes service accounts (kind: ServiceAccount) to use workload identity permissions:
resource "google_service_account_iam_member" "flyteworkflow_sa_binding" {
  for_each           = toset(["development", "staging", "production"])
  service_account_id = google_service_account.flyte_sa["flyteworkflow"].id
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.gcp_project}.svc.id.goog[${each.key}/default]"
}
These all loop over the same map, so it's a good idea to put them in a Terraform module and do the for_each once on the module call. I'm not doing that here to keep the example simple.
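For example, a hypothetical module wrapping the three IAM resources, with the for_each lifted to the module call (the module path and input names are placeholders):

```hcl
module "flyte_iam" {
  source      = "../modules/flyte-iam" # hypothetical module path
  for_each    = local.service_accounts
  name        = each.key
  permissions = each.value
  project     = var.gcp_project
}
```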
With this IAM setup, Flyte works for me from Terraform.