cool-lifeguard-49380
10/19/2022, 8:57 AM
flytectl get projects, am redirected to the browser, and then retrieve the projects. pyflyte register ... works as well. So far so good.
When I manually launch a registered workflow from the console, however, it doesn't start.
k -n development describe flyteworkflows.flyte.lyft.com ah9ghn272q7v22dbsvvd
gives:
Status:
Failed Attempts: 9
Message: Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
Phase: 0
Tasks:
resource_type:TASK project:"3dod" domain:"development" name:"project.workflows.workflow.deploy_model" version:"2"
Details about config in 🧵
A pointer in the right direction would be much appreciated.
cool-lifeguard-49380
10/19/2022, 8:58 AM
E1019 08:50:19.766619 1 workers.go:102] error syncing 'development/arzsbvcrdhd8jdbp2vdd': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
E1019 08:50:19.776908 1 workers.go:102] error syncing 'development/aprk6hb8gxdcj882njpf': Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: first record does not look like a TLS handshake"]
{"json":{"exec_id":"ah9ghn272q7v22dbsvvd","ns":"development","res_ver":"1249185","routine":"worker-37","wf":"3dod:development:project.workflows.workflow.pipeline"},"level":"warning","msg":"Event recording failed. Error [EventSinkError: Error sending event, caused by [rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: first record does not look like a TLS handshake\"]]","ts":"2022-10-19T08:50:49Z"}
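The structured log line above carries the execution id and namespace alongside the error text. A minimal sketch, assuming only the field names visible in that line (`json.exec_id`, `json.ns`, `json.wf`, `msg`), of pulling those out and flagging the TLS symptom; the helper name `summarize` is made up for illustration:

```python
import json

# Structured log line copied from the propeller output above.
LOG_LINE = (
    r'{"json":{"exec_id":"ah9ghn272q7v22dbsvvd","ns":"development",'
    r'"res_ver":"1249185","routine":"worker-37",'
    r'"wf":"3dod:development:project.workflows.workflow.pipeline"},'
    r'"level":"warning","msg":"Event recording failed. Error '
    r'[EventSinkError: Error sending event, caused by [rpc error: '
    r'code = Unavailable desc = connection error: desc = '
    r'\"transport: authentication handshake failed: tls: first record '
    r'does not look like a TLS handshake\"]]","ts":"2022-10-19T08:50:49Z"}'
)

# Text the Go TLS client reports when the peer answers with non-TLS bytes.
TLS_MARKER = "first record does not look like a TLS handshake"


def summarize(line: str) -> dict:
    """Extract the execution context and a TLS-symptom flag from one log line."""
    record = json.loads(line)
    context = record.get("json", {})
    return {
        "exec_id": context.get("exec_id"),
        "namespace": context.get("ns"),
        "workflow": context.get("wf"),
        "level": record.get("level"),
        "tls_symptom": TLS_MARKER in record.get("msg", ""),
    }


if __name__ == "__main__":
    print(summarize(LOG_LINE))
```

Running this over all propeller log lines would show whether every failing execution reports the same handshake error or only some of them.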
cool-lifeguard-49380
10/19/2022, 8:59 AM
apiVersion: v1
data:
  cluster_resources.yaml: |
    cluster_resources:
      customData:
      - production:
        - projectQuotaCpu:
            value: "5"
        - projectQuotaMemory:
            value: 4000Mi
        - gsa:
            value: gsa-production@<project>.iam.gserviceaccount.com
      - staging:
        - projectQuotaCpu:
            value: "2"
        - projectQuotaMemory:
            value: 3000Mi
        - gsa:
            value: gsa-staging@<project>.iam.gserviceaccount.com
      - development:
        - projectQuotaCpu:
            value: "2"
        - projectQuotaMemory:
            value: 3000Mi
        - gsa:
            value: gsa-development@<project>.iam.gserviceaccount.com
      refreshInterval: 5m
      standaloneDeployment: false
      templatePath: /etc/flyte/clusterresource/templates
  db.yaml: |
    database:
      dbname: flyteadmin
      host: '10.22.0.3'
      passwordPath: /etc/db/pass.txt
      port: 5432
      username: flyteadmin
  domain.yaml: |
    domains:
    - id: development
      name: development
    - id: staging
      name: staging
    - id: production
      name: production
  namespace_config.yaml: |
    namespace_mapping:
      template: '{{ domain }}'
  remoteData.yaml: |
    remoteData:
      scheme: gcs
      signedUrls:
        durationMinutes: 3
  server.yaml: |
    auth:
      appAuth:
        thirdPartyConfig:
          flyteClient:
            clientId: flytectl
            redirectUri: http://localhost:53593/callback
            scopes:
            - offline
            - all
      authorizedUris:
      - https://flyte.company.com
      - https://localhost:30081
      - http://flyteadmin:80
      - http://flyteadmin.flyte.svc.cluster.local:80
      userAuth:
        openId:
          baseUrl: https://accounts.google.com
          clientId: <client id>
          scopes:
          - profile
          - openid
    flyteadmin:
      eventVersion: 2
      metadataStoragePrefix:
      - metadata
      - admin
      metricsScope: 'flyte:'
      profilerPort: 10254
      roleNameKey: iam.amazonaws.com/role
      testing:
        host: http://flyteadmin
    server:
      grpcPort: 8089
      httpPort: 8088
      security:
        allowCors: true
        allowedHeaders:
        - Content-Type
        allowedOrigins:
        - '*'
        secure: false
        useAuth: true
  storage.yaml: |
    storage:
      type: stow
      stow:
        kind: google
        config:
          json: ""
          project_id: <project>
          scopes: https://www.googleapis.com/auth/devstorage.read_write
      container: "company-flyte-bucket"
      enable-multicontainer: false
      limits:
        maxDownloadMBs: 10
  task_resource_defaults.yaml: |
    task_resources:
      defaults:
        cpu: 500m
        memory: 500Mi
        storage: 500Mi
      limits:
        cpu: 2
        gpu: 1
        memory: 1Gi
        storage: 2000Mi
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: flyte
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2022-10-18T18:09:45Z"
  labels:
    app.kubernetes.io/instance: flyte
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyteadmin
    helm.sh/chart: flyte-core-v1.2.0
  name: flyte-admin-base-config
  namespace: flyte
  resourceVersion: "780983"
  uid: b5da3e3e-35d2-406b-953e-ad568179a646
boundless-pizza-95864
10/19/2022, 9:55 AM
cool-lifeguard-49380
10/19/2022, 9:55 AM
cool-lifeguard-49380
10/19/2022, 9:59 AM
apiVersion: v1
data:
  admin.yaml: |
    admin:
      clientId: 'flytepropeller'
      clientSecretLocation: /etc/secrets/client_secret
      endpoint: flyteadmin:81
      insecure: false
    event:
      capacity: 1000
      rate: 500
      type: admin
  cache.yaml: |
    cache:
      max_size_mbs: 1024
      target_gc_percent: 70
  catalog.yaml: |
    catalog-cache:
      endpoint: datacatalog:89
      insecure: true
      type: datacatalog
  copilot.yaml: |
    plugins:
      k8s:
        co-pilot:
          image: cr.flyte.org/flyteorg/flytecopilot-release:v1.2.0
          name: flyte-copilot-
          start-timeout: 30s
  core.yaml: |
    manager:
      pod-application: flytepropeller
      pod-template-container-name: flytepropeller
      pod-template-name: flytepropeller-template
    propeller:
      downstream-eval-duration: 30s
      enable-admin-launcher: true
      gc-interval: 12h
      kube-client-config:
        burst: 25
        qps: 100
        timeout: 30s
      leader-election:
        enabled: true
        lease-duration: 15s
        lock-config-map:
          name: propeller-leader
          namespace: flyte
        renew-deadline: 10s
        retry-period: 2s
      limit-namespace: all
      max-workflow-retries: 50
      metadata-prefix: metadata/propeller
      metrics-prefix: flyte
      prof-port: 10254
      queue:
        batch-size: -1
        batching-interval: 2s
        queue:
          base-delay: 5s
          capacity: 1000
          max-delay: 120s
          rate: 100
          type: maxof
        sub-queue:
          capacity: 1000
          rate: 100
          type: bucket
        type: batch
      rawoutput-prefix: gs://company-flyte-bucket/
      workers: 40
      workflow-reeval-duration: 30s
    webhook:
      certDir: /etc/webhook/certs
      serviceName: flyte-pod-webhook
  enabled_plugins.yaml: |
    tasks:
      task-plugins:
        default-for-task-types:
          container: container
          container_array: k8s-array
          sidecar: sidecar
        enabled-plugins:
        - container
        - sidecar
        - k8s-array
  k8s.yaml: |
    plugins:
      k8s:
        default-cpus: 100m
        default-env-vars: []
        default-memory: 100Mi
  resource_manager.yaml: |
    propeller:
      resourcemanager:
        type: noop
  storage.yaml: |
    storage:
      type: stow
      stow:
        kind: google
        config:
          json: ""
          project_id: <project>
          scopes: https://www.googleapis.com/auth/devstorage.read_write
      container: "company-flyte-bucket"
      enable-multicontainer: false
      limits:
        maxDownloadMBs: 10
  task_logs.yaml: |
    plugins:
      k8s-array:
        logs:
          config:
            stackdriver-enabled: true
            stackdriver-logresourcename: k8s_container
      logs:
        cloudwatch-enabled: false
        kubernetes-enabled: false
        stackdriver-enabled: true
        stackdriver-logresourcename: k8s_container
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: flyte
    meta.helm.sh/release-namespace: flyte
  creationTimestamp: "2022-10-18T18:09:47Z"
  labels:
    app.kubernetes.io/instance: flyte
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flyteadmin
    helm.sh/chart: flyte-core-v1.2.0
  name: flyte-propeller-config
  namespace: flyte
  resourceVersion: "1245505"
  uid: 4dd30035-f278-452c-8910-cc9bd4878c6b
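One plausible reading of the two ConfigMaps (an assumption, not something confirmed in the thread): flyteadmin is configured with `security.secure: false`, while propeller's `admin.insecure: false` makes its client open a TLS connection. The Go error "first record does not look like a TLS handshake" is what a TLS client reports when the peer answers in plaintext. A minimal sketch of that consistency check, with the values copied by hand from the configs above and a hypothetical helper name:

```python
from typing import Optional

# Values copied from the flyte-admin-base-config and flyte-propeller-config
# ConfigMaps above; only the keys relevant to transport security are kept.
ADMIN_SERVER_SECURITY = {"secure": False, "useAuth": True}
PROPELLER_ADMIN = {"endpoint": "flyteadmin:81", "insecure": False}


def transport_mismatch(admin_security: dict, propeller_admin: dict) -> Optional[str]:
    """Return a description of a TLS/plaintext mismatch, or None if both sides agree.

    This only compares the two flags. It deliberately ignores anything in
    between (ingress, service mesh) that could terminate or originate TLS.
    """
    server_tls = bool(admin_security.get("secure", False))
    client_tls = not bool(propeller_admin.get("insecure", False))
    if client_tls and not server_tls:
        return "client expects TLS but server serves plaintext"
    if server_tls and not client_tls:
        return "client sends plaintext but server expects TLS"
    return None


if __name__ == "__main__":
    print(transport_mismatch(ADMIN_SERVER_SECURITY, PROPELLER_ADMIN))
```

If the check flags a mismatch, either side could be changed to match the other; which one is appropriate depends on whether in-cluster traffic is meant to be encrypted.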
boundless-pizza-95864
10/19/2022, 10:02 AM
flytepropeller client id using that default secret. I think that was fixed so you might have to set that explicitly now.
boundless-pizza-95864
10/19/2022, 10:06 AM
useAuth: true and don't explicitly configure an external auth server.
cool-lifeguard-49380
10/19/2022, 10:08 AM
boundless-pizza-95864
10/19/2022, 10:10 AM
boundless-pizza-95864
10/19/2022, 10:11 AM
cool-lifeguard-49380
10/19/2022, 10:12 AM
boundless-pizza-95864
10/19/2022, 10:12 AM
foobar as well, hence the security issue).
boundless-pizza-95864
10/19/2022, 10:14 AM
boundless-pizza-95864
10/19/2022, 10:15 AM
boundless-pizza-95864
10/19/2022, 10:18 AM
boundless-pizza-95864
10/19/2022, 10:19 AM
cool-lifeguard-49380
10/19/2022, 10:21 AM
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: <random string>
    clientId: flytepropeller
So this affects the communication between propeller and admin, right?
Do you know how auth works between the other Flyte services?
Console and admin work via the Google login, as I understand it.
boundless-pizza-95864
10/19/2022, 10:27 AM
cool-lifeguard-49380
10/19/2022, 10:29 AM
Status:
Failed Attempts: 3
Message: Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}]
Phase: 0
Tasks:
But I also didn't actively do anything to make admin aware of this secret.
boundless-pizza-95864
10/19/2022, 10:29 AM
configmap:
  adminServer:
    server:
      httpPort: 8088
      grpcPort: 8089
      dataProxy:
        upload:
          storagePrefix: upload
      security:
        secure: false
        useAuth: true
        allowCors: true
        allowedOrigins:
          # Accepting all domains for Sandbox installation
          - '*'
        allowedHeaders:
          - Content-Type
    auth:
      appAuth:
        authServerType: Self
        selfAuthServer:
          accessTokenLifespan: 30m0s
          authorizationCodeLifespan: 5m0s
          claimSymmetricEncryptionKeySecretName: claim_symmetric_key
          issuer: ""
          oldTokenSigningRSAKeySecretName: token_rsa_key_old.pem
          refreshTokenLifespan: 1h0m0s
          staticClients:
            flyte-cli:
              audience: null
              grant_types:
                - refresh_token
                - authorization_code
              id: flyte-cli
              public: true
              redirect_uris:
                - http://localhost:53593/callback
                - http://localhost:12345/callback
              response_types:
                - code
                - token
              scopes:
                - all
                - offline
                - access_token
            flytectl:
              audience: null
              grant_types:
                - refresh_token
                - authorization_code
              id: flytectl
              public: true
              redirect_uris:
                - http://localhost:53593/callback
                - http://localhost:12345/callback
              response_types:
                - code
                - token
              scopes:
                - all
                - offline
                - access_token
            flytepropeller:
              audience: null
              client_secret: <your client secret hashed and base64 encoded>
              grant_types:
                - refresh_token
                - client_credentials
              id: flytepropeller
              public: false
              redirect_uris:
                - http://localhost:3846/callback
              response_types:
                - token
              scopes:
                - all
                - offline
                - access_token
boundless-pizza-95864
10/19/2022, 10:31 AM
cool-lifeguard-49380
10/19/2022, 11:02 AM
echo -n "clientSecret" | base64
which is what I would typically do for a k8s secret.
cool-lifeguard-49380
10/19/2022, 11:04 AM
k -n flyte logs -f flytescheduler-6cbfb47f9f-h76q4 -c flytescheduler-check
--storage.limits.maxDownloadMBs int Maximum allowed download size (in MBs) per call. (default 2)
--storage.stow.config stringToString Configuration for stow backend. Refer to github/flyteorg/stow (default [])
--storage.stow.kind string Kind of Stow backend to use. Refer to github/flyteorg/stow
--storage.type string Sets the type of storage to configure [s3/minio/local/mem/stow]. (default "s3")
panic: rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}
goroutine 1 [running]:
main.main()
/go/src/github.com/flyteorg/flyteadmin/cmd/scheduler/main.go:12 +0x85
And my workflow still has the status:
Status:
Failed Attempts: 4
Message: Workflow[] failed. ErrorRecordingError: failed to publish event, caused by: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: oauth2: cannot fetch token: 401 Unauthorized
Response: {"error":"invalid_client","error_description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)."}]
Phase: 0
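The `invalid_client` response comes from the token endpoint: before sending events, propeller fetches a token with the OAuth2 client credentials grant, and admin rejects the client id/secret pair it is shown. A minimal sketch of what such a token request looks like, using HTTP Basic client authentication as described in RFC 6749. The `/oauth2/token` path is an assumption, the host is taken from the `authorizedUris` list earlier in the thread, `foobar` is the example secret from the thread, and `build_token_request` is a hypothetical helper. The request is only built here, never sent:

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scopes):
    """Build (but do not send) a client_credentials token request."""
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": " ".join(scopes)}
    ).encode("ascii")
    # HTTP Basic: base64("client_id:client_secret"). The secret travels in
    # plaintext form here; the server compares it against its stored hash.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Assumed token URL; adjust to wherever the auth server is exposed.
    req = build_token_request(
        "http://flyteadmin.flyte.svc.cluster.local:80/oauth2/token",
        "flytepropeller",
        "foobar",
        ["all"],
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Seen this way, a 401 with `invalid_client` points at the client id or secret registered on the admin side not matching what propeller presents, rather than at the transport.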
boundless-pizza-95864
10/19/2022, 12:09 PM
freezing-airport-6809
cool-lifeguard-49380
10/19/2022, 3:26 PM
boundless-pizza-95864
10/19/2022, 4:11 PM
pip install bcrypt
python
>>> import bcrypt
>>> bcrypt.hashpw(b"foobar", bcrypt.gensalt(prefix=b"2a"))
The resulting hash should look something like this:
b'$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS'
Then base64-encode it and use that as client_secret in the config.
boundless-pizza-95864
10/19/2022, 4:12 PM
boundless-pizza-95864
10/19/2022, 4:13 PM
boundless-pizza-95864
10/19/2022, 4:15 PM
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: foobar
    clientId: flytepropeller
cool-lifeguard-49380
10/19/2022, 4:17 PM
Excluding the b'$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS'
b'...'
I suppose?
cool-lifeguard-49380
10/19/2022, 4:17 PM
```
secrets:
  adminOauthClientCredentials:
    enabled: true
    clientSecret: foobar
    clientId: flytepropeller
```
Yes, I have this 🙂
boundless-pizza-95864
10/19/2022, 4:18 PM
cool-lifeguard-49380
10/19/2022, 4:18 PM
thankful-minister-83577
python -c 'import bcrypt; import base64; print(base64.b64encode(bcrypt.hashpw("mypassword".encode("utf-8"), bcrypt.gensalt(6))))'
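The value that goes into `staticClients.flytepropeller.client_secret` is therefore base64 over a bcrypt hash, not base64 over the plaintext (which is what `echo -n "clientSecret" | base64` produces). A minimal standard-library sketch for sanity-checking a config value before deploying it: it only checks the shape of the decoded string, it cannot verify the hash against a password, and the helper names are made up for illustration:

```python
import base64
import re

# bcrypt hash of b"foobar" quoted earlier in the thread (example only).
BCRYPT_HASH = b"$2a$12$d3mGDJwq9F5TiQA1YYm0TOVzvEvcBX5VEw2AW0gqrn7Mvh2InuiCS"

# bcrypt modular-crypt layout: version tag, two-digit cost,
# then 53 characters of salt and digest.
BCRYPT_RE = re.compile(rb"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$")


def encode_client_secret(bcrypt_hash: bytes) -> str:
    """Base64-encode a bcrypt hash for use as the client_secret config value."""
    if not BCRYPT_RE.match(bcrypt_hash):
        raise ValueError("input does not look like a bcrypt hash")
    return base64.b64encode(bcrypt_hash).decode("ascii")


def looks_like_hashed_secret(config_value: str) -> bool:
    """True if the value base64-decodes to something shaped like a bcrypt hash."""
    try:
        decoded = base64.b64decode(config_value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return BCRYPT_RE.match(decoded) is not None


if __name__ == "__main__":
    good = encode_client_secret(BCRYPT_HASH)
    plaintext_only = base64.b64encode(b"clientSecret").decode("ascii")
    print(looks_like_hashed_secret(good))
    print(looks_like_hashed_secret(plaintext_only))
```

The plaintext secret still belongs in `secrets.adminOauthClientCredentials.clientSecret`; only the admin-side `staticClients` entry takes the hashed form.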
cool-lifeguard-49380
10/19/2022, 4:23 PM
cool-lifeguard-49380
10/19/2022, 4:23 PM
cool-lifeguard-49380
10/19/2022, 4:23 PM
boundless-pizza-95864
10/19/2022, 4:24 PM
cool-lifeguard-49380
10/19/2022, 4:38 PM
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
appAuth:
  thirdPartyConfig:
    flyteClient:
      clientId: flytectl
      redirectUri: http://localhost:53593/callback
      scopes:
        - offline
        - all
  selfAuthServer:
    staticClients:
      flyte-cli:
        id: "flyte-cli"
        redirect_uris:
          - "http://localhost:53593/callback"
          - "http://localhost:12345/callback"
        grant_types:
          - refresh_token
          - authorization_code
        response_types:
          - code
          - token
        scopes:
          - all
          - offline
          - access_token
        public: true
      flytectl:
        id: flytectl
        redirect_uris:
          - "http://localhost:53593/callback"
          - "http://localhost:12345/callback"
        grant_types:
          - refresh_token
          - authorization_code
        response_types:
          - code
          - token
        scopes:
          - all
          - offline
          - access_token
        public: true
      flytepropeller:
        id: flytepropeller
        client_secret: JDJhJDA2JGd3N0pNUno1OXpCSzFk43DJkYlUxTHV2MGxRMlFWHNlTkczcElyU3V1TzhZai95ODJsQ2dh
        redirect_uris:
          - "http://localhost:3846/callback"
        grant_types:
          - refresh_token
          - client_credentials
        response_types:
          - token
        scopes:
          - all
          - offline
          - access_token
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:44 PM
cool-lifeguard-49380
10/19/2022, 4:44 PM
cool-lifeguard-49380
10/19/2022, 4:45 PM
thankful-minister-83577
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:45 PM
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:49 PM
base64.b64encode(bcrypt.hashpw... is used by propeller to authorize itself to admin?
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:50 PM
cool-lifeguard-49380
10/19/2022, 4:50 PM
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:50 PM
cool-lifeguard-49380
10/19/2022, 4:51 PM
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:51 PM
cool-lifeguard-49380
10/19/2022, 4:52 PM
boundless-pizza-95864
10/19/2022, 4:52 PM
thankful-minister-83577
cool-lifeguard-49380
10/19/2022, 4:53 PM
freezing-airport-6809
high-park-82026
cool-lifeguard-49380
10/20/2022, 4:47 PM
helpful-kilobyte-23008
10/24/2022, 12:52 PM
selfAuthServer.staticClients be passed through automatically? Since it is available in plaintext in adminOauthClientCredentials anyway?
freezing-airport-6809
cool-lifeguard-49380
10/24/2022, 4:05 PM
exec.command liveness probe and the ingress never comes up, I route all traffic from the ingress into an Istio service mesh and via that to flyteconsole/flyteadmin.
• Make flyteadmin's gRPC endpoint available only via an internal load balancer in the VPC, so engineers can register tasks, ... only from their remote development machines or via CI/CD workers in the VPC. That is not too much of a limitation for us since we use remote dev machines all the time...
gifted-raincoat-59712
10/24/2022, 5:18 PM
gifted-raincoat-59712
10/24/2022, 5:19 PM
cool-lifeguard-49380
10/25/2022, 8:05 AM
gifted-raincoat-59712
10/26/2022, 11:00 AM
cool-lifeguard-49380
10/27/2022, 7:51 AM
gifted-raincoat-59712
10/27/2022, 2:37 PM
cold-lock-43986
11/08/2022, 2:24 AM
flytepropeller can perform auth against the K8s API instance, and that this requirement is not documented yet?
freezing-airport-6809
high-park-82026
high-park-82026
freezing-airport-6809
cold-lock-43986
11/08/2022, 12:48 PM
cold-lock-43986
11/09/2022, 10:26 AM
cool-lifeguard-49380
02/20/2023, 5:19 PM
freezing-airport-6809