Hi, I'm running into an issue when trying to set up...
# ask-the-community
g
Hi, I'm running into an issue when trying to set up a dev environment by following this guide: https://github.com/flyteorg/flyte/pull/3811. I have the demo environment with Flyte running (all Pods are Running with
flytectl demo start --dev
), and was able to
make compile
and test all the different components. Now on the final step when running
flyte start --config flyte_local.yaml
it fails, due to wanting to access
/home/{user}/.flyte/k3s/k3s.yaml
which is a broken symlink (as described here as well: https://github.com/flyteorg/flyte/issues/3645)
❯ flyte start --config flyte_local.yaml
INFO[0000] Using config file: [flyte_local.yaml]        
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [plugins] updated. No update handler registered.","ts":"2023-06-30T14:39:46+02:00"}
...
{"json":{"src":"viper.go:398"},"level":"debug","msg":"Config section [webhook] updated. No update handler registered.","ts":"2023-06-30T14:39:46+02:00"}
{"json":{"src":"start.go:63"},"level":"info","msg":"Running Database Migrations...","ts":"2023-06-30T14:39:46+02:00"}
{"json":{"src":"start.go:124"},"level":"error","msg":"Failed to create controller manager. error building Kubernetes Clientset: Error building kubeconfig: stat /home/user/.flyte/k3s/k3s.yaml: no such file or directory","ts":"2023-06-30T14:39:46+02:00"}
{"json":{"src":"start.go:185"},"level":"panic","msg":"Failed to start Propeller, err: error building Kubernetes Clientset: Error building kubeconfig: stat /home/user/.flyte/k3s/k3s.yaml: no such file or directory","ts":"2023-06-30T14:39:46+02:00"}
panic: (*logrus.Entry) 0xc000510380

goroutine 61 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc000510310, 0x0, {0xc0014de0a0, 0x9f})
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1/entry.go:259 +0x487
github.com/sirupsen/logrus.(*Entry).Log(0xc000510310, 0x0, {0xc00105be68?, 0x1?, 0x1?})
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1/entry.go:293 +0x4f
github.com/sirupsen/logrus.(*Entry).Logf(0xc000510310, 0x0, {0x3145be9?, 0x0?}, {0xc000a8e230?, 0x0?, 0x0?})
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1/entry.go:338 +0x85
github.com/sirupsen/logrus.(*Entry).Panicf(0x3fdac40?, {0x3145be9?, 0x416947?}, {0xc000a8e230?, 0x2a74b80?, 0x1?})
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.8.1/entry.go:376 +0x34
github.com/flyteorg/flytestdlib/logger.Panicf({0x3fdac40?, 0xc000c55310?}, {0x3145be9, 0x22}, {0xc000a8e230, 0x1, 0x1})
        /root/go/pkg/mod/github.com/flyteorg/flytestdlib@v1.0.17/logger/logger.go:188 +0x64
github.com/flyteorg/flyte/cmd/single.glob..func4.2()
        /home/user/git/flyte/flyte/cmd/single/start.go:185 +0xbe
golang.org/x/sync/errgroup.(*Group).Go.func1()
        /root/go/pkg/mod/golang.org/x/sync@v0.1.0/errgroup/errgroup.go:75 +0x64
created by golang.org/x/sync/errgroup.(*Group).Go
        /root/go/pkg/mod/golang.org/x/sync@v0.1.0/errgroup/errgroup.go:72 +0xa5
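For anyone hitting the same error: a quick way to confirm that the kubeconfig path is a dangling symlink (rather than simply missing) is a check like the following. This is a minimal sketch using only the Python standard library; the helper name `is_broken_symlink` is made up for illustration, and the path is the one from the log above.

```python
from pathlib import Path


def is_broken_symlink(path: Path) -> bool:
    """Return True if `path` is a symlink whose target does not exist."""
    # is_symlink() inspects the link itself, while exists() follows it,
    # so a dangling link is a symlink for which exists() is False.
    return path.is_symlink() and not path.exists()


if __name__ == "__main__":
    # Path taken from the error message; it resolves under the current user's home.
    kubeconfig = Path.home() / ".flyte" / "k3s" / "k3s.yaml"
    print(kubeconfig, "is a broken symlink:", is_broken_symlink(kubeconfig))
```

If this reports a broken symlink, the link exists but its target was never created or has been removed, which matches the `stat ... no such file or directory` message.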
Hmm, I'm getting a step further when using
flyte start --config flyte-single-binary-local.yaml
Hmm, no, that is not working either. It tries to look for
$HOME/.flyte/cluster-resource-templates
later on which does not exist in my environment
Alright, one step further again: after searching and finding https://github.com/flyteorg/flyte/blob/9d019382427ccc36e6502f13c61da42904236081/rsts/community/contribute.rst#cluster-resources I added a cluster-resource-templates directory. Now I think something similar is needed for
$HOME/.flyte/webhook-certs
I tried extracting the certificates from the cluster (
kubectl get secret -n flyte flyte-pod-webhook -o json | jq -r '.data."ca.crt"' | base64 -d > ~/.flyte/webhook-certs/ca.crt
) but no luck:
{
  "json": {
    "src": "start.go:185"
  },
  "level": "panic",
  "msg": "Failed to start Propeller, err: mkdir $HOME/.flyte/webhook-certs: no such file or directory",
  "ts": "2023-06-30T15:29:15+02:00"
}
Think I got it up and running now after using
./webhook-certs
instead (certs are created in local directory), but it feels very wonky
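The literal `$HOME` in the `mkdir` error above suggests (this is an inference from the message, not something confirmed in this thread) that the configured path was used without shell-style expansion. A minimal Python sketch of what expanding such a path looks like; `expand_config_path` is a hypothetical helper, not part of Flyte:

```python
import os


def expand_config_path(raw: str) -> str:
    """Expand $VARS and a leading ~ in a path read from a config file."""
    # expandvars leaves unknown variables untouched, so a typo stays visible.
    return os.path.expanduser(os.path.expandvars(raw))


if __name__ == "__main__":
    print(expand_config_path("$HOME/.flyte/webhook-certs"))
```

A relative path such as `./webhook-certs` passes through unchanged, which would be consistent with it working where the `$HOME` form did not.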
This leads to more errors later on when trying to run pipelines, any help would be appreciated 🙏
Workflow[flytesnacks:development:example.training_workflow] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [InternalError] failed to create resource, caused by: Internal error occurred: failed calling webhook "flyte-pod-webhook.flyte.org": failed to call webhook: Post "https://flyte-sandbox-local.all.svc:9443/mutate--v1-pod?timeout=10s": service "flyte-sandbox-local" not found
y
We need to use
flyte_local.yaml
from https://github.com/flyteorg/flyte/pull/3808. It has not been merged yet. (It is mentioned in the doc, but I will highlight it further.)
g
Thanks so much! 😄 I missed this comment. Works now!
y
Let me know if you encounter further errors!
g
I am running into another issue: I'm not seeing any environment variables mounted from Secrets:
import os

import flytekit
from flytekit import Secret, task


@task(
    secret_requests=[
        Secret(
            group="user-info",
            key="user_secret",
            mount_requirement=Secret.MountType.ENV_VAR,
        )
    ]
)
def test() -> bool:
    # Dump the environment to check whether the secret was injected.
    print(os.environ)
    context = flytekit.current_context()
    secret_val = context.secrets.get("user-info", "user_secret")
    print(secret_val)
    return True
gives:
Unable to find secret for key user_secret in group user-info in Env Var:_FSEC_USER-INFO_USER_SECRET and FilePath: /etc/secrets/user-info/user_secret
The secret is there in K8s cluster (in the correct namespace). Same code works fine in normal demo environment
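The error message names the two places that were checked. Judging only from that message (an assumption, not a reading of flytekit's source), the names appear to be derived from the group and key roughly as in this sketch. `secret_env_var` and `secret_file_path` are hypothetical helpers for illustration:

```python
import posixpath

ENV_PREFIX = "_FSEC_"         # prefix seen in the error message
DEFAULT_DIR = "/etc/secrets"  # directory seen in the error message


def secret_env_var(group: str, key: str) -> str:
    """Environment variable name in the form shown by the error message."""
    return f"{ENV_PREFIX}{group.upper()}_{key.upper()}"


def secret_file_path(group: str, key: str, base_dir: str = DEFAULT_DIR) -> str:
    """File path in the form shown by the error message."""
    return posixpath.join(base_dir, group, key)


if __name__ == "__main__":
    print(secret_env_var("user-info", "user_secret"))
    print(secret_file_path("user-info", "user_secret"))
```

Printing these inside the task next to `os.environ` makes it easy to see which of the two locations, if either, is populated in the pod.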
y
Following the same doc here, with
flytectl demo start
, the task succeeds. With
flytectl demo start --dev
and
flyte start --config flyte_local.yaml
, the task fails. I'm currently not sure why; looking into it.
The direct reason is that in
flyte_local.yaml
we disable the webhook.
propeller:
  disableWebhook: true
Hi @jeev, I am wondering if you would mind taking a look at this. Thank you in advance! I think the root cause is that the secret webhook does not work in single binary. I found that you addressed this issue here: https://github.com/flyteorg/flyte/pull/3228, but it does not seem to work now. Below is how I set up my dev env; running the code @Geert mentions above then causes a
failed to call webhook
error.
flytectl demo start --dev
git clone https://github.com/flyteorg/flyte.git
cd flyte
go mod tidy
sudo make compile
flyte start --config flyte_local.yaml

# The flyte_local.yaml I used is this one (in which I think I have set the webhook correctly):

# This is a sample configuration file.
# Real configuration when running inside K8s (local or otherwise) lives in a ConfigMap
# Look in the artifacts directory in the flyte repo for what's actually run
# https://github.com/lyft/flyte/blob/b47565c9998cde32b0b5f995981e3f3c990fa7cd/artifacts/flyteadmin.yaml#L72
# Flyte clusters can be run locally with this configuration
# flytectl demo start --dev
# flyte start --config flyte_local.yaml
propeller:
  rawoutput-prefix: "s3://my-s3-bucket/test/"
  kube-config: "$HOME/.flyte/sandbox/kubeconfig"
  create-flyteworkflow-crd: true
webhook:
  certDir: ./webhook-certs
  secretName: flyte-sandbox-webhook-secret
  serviceName: flyte-sandbox-local
  localCert: true
  servicePort: 9443
tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - K8S-ARRAY
    default-for-task-types:
      - container: container
      - container_array: K8S-ARRAY
server:
  kube-config: "$HOME/.flyte/sandbox/kubeconfig"
flyteadmin:
  runScheduler: false
database:
  postgres:
    port: 30001
    username: postgres
    password: postgres
    host: localhost
    dbname: flyteadmin
    options: "sslmode=disable"
storage:
  type: minio
  connection:
    access-key: minio
    auth-type: accesskey
    secret-key: miniostorage
    disable-ssl: true
    endpoint: "http://localhost:30002"
    region: my-region
  cache:
    max_size_mbs: 10
    target_gc_percent: 100
  container: "my-s3-bucket"
Logger:
  show-source: true
  level: 5
admin:
  endpoint: localhost:8089
  insecure: true
plugins:
  # All k8s plugins default configuration
  k8s:
    inject-finalizer: true
    default-env-vars:
      - AWS_METADATA_SERVICE_TIMEOUT: 5
      - AWS_METADATA_SERVICE_NUM_ATTEMPTS: 20
      - FLYTE_AWS_ENDPOINT: "http://flyte-sandbox-minio.flyte:9000"
      - FLYTE_AWS_ACCESS_KEY_ID: minio
      - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
  # Logging configuration
  logs:
    kubernetes-enabled: true
    kubernetes-template-uri: http://localhost:30080/kubernetes-dashboard/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
cluster_resources:
  refreshInterval: 5m
  templatePath: "/etc/flyte/clusterresource/templates"
  # -- Starts the cluster resource manager in standalone mode with requisite auth credentials to call flyteadmin service endpoints
  standaloneDeployment: false
  customData:
  - production:
    - projectQuotaCpu:
        value: "8"
    - projectQuotaMemory:
        value: "16Gi"
  - staging:
    - projectQuotaCpu:
        value: "8"
    - projectQuotaMemory:
        value: "16Gi"
  - development:
    - projectQuotaCpu:
        value: "8"
    - projectQuotaMemory:
        value: "16Gi"
  refresh: 5m
flyte:
  admin:
    disableClusterResourceManager: true
    disableScheduler: true
  propeller:
    disableWebhook: false
task_resources:
  defaults:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2
    memory: 4Gi
    gpu: 5
catalog-cache:
  endpoint: localhost:8081
  insecure: true
  type: datacatalog
j
try:
POD_NAMESPACE=flyte flyte start --config flyte_local.yaml
to get it to work with webhook
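A note on why this may help: the earlier error shows the API server calling `flyte-sandbox-local.all.svc`, and in-cluster service DNS names have the form `<service>.<namespace>.svc`, so the webhook appears to have been registered under a namespace called `all`. That `POD_NAMESPACE` feeds this namespace is an assumption based on the fix working, not something stated in the thread. A small sketch of how such a URL is composed (`webhook_url` is a hypothetical helper):

```python
def webhook_url(service: str, namespace: str, port: int,
                path: str = "/mutate--v1-pod") -> str:
    """Build the in-cluster HTTPS URL for a webhook service."""
    return f"https://{service}.{namespace}.svc:{port}{path}"


if __name__ == "__main__":
    # Namespace seen in the failing call vs. the one set via POD_NAMESPACE.
    print(webhook_url("flyte-sandbox-local", "all", 9443))
    print(webhook_url("flyte-sandbox-local", "flyte", 9443))
```

The first URL matches the host in the failing `Post` call; the second is what one would expect once the namespace is `flyte`.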
y
Thank you! Trying it now.
The webhook works now, thank you!!! But running the sample code here still gives:
Unable to find secret for key user_secret in group user-info in Env Var:_FSEC_USER-INFO_USER_SECRET and FilePath: /etc/secrets/user-info/user_secret
Not sure if this is the same as the issue here: https://github.com/flyteorg/flyte/issues/2260
j
was it missing from the pod spec? try file mount instead of env var too.
y
Yes, I cannot see these env vars in the pod spec
j
can you try file mount
does the secret exist?
y
with file mount, I can see:
- name: ovzwk3rnnfxgm216
  secret:
    defaultMode: 420
    items:
    - key: user_secret
      path: user_secret
    optional: true
    secretName: user-info
But I still get
Unable to find secret for key user_secret in group user-info in Env Var:_FSEC_USER-INFO_USER_SECRET and FilePath: /etc/flyte/secrets/user-info/user_secret
j
do you see a volume mount?
i wonder if the
-
might be throwing it off
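To answer the volume mount question mechanically, one option is to dump the pod with `kubectl get pod <name> -o json` and check whether any container mounts a volume backed by the secret. A minimal sketch over the parsed JSON; the function name is made up for illustration, and the mount path in the sample pod is only an example, not taken from a real pod:

```python
from typing import Any, Dict, List


def secret_mounts(pod: Dict[str, Any], secret_name: str) -> List[str]:
    """Return mount paths of containers that mount a volume backed by `secret_name`."""
    spec = pod.get("spec", {})
    # Names of volumes whose source is the given Kubernetes secret.
    volume_names = {
        v["name"]
        for v in spec.get("volumes", [])
        if v.get("secret", {}).get("secretName") == secret_name
    }
    # Mount paths of every container volumeMount that refers to one of them.
    return [
        m["mountPath"]
        for c in spec.get("containers", [])
        for m in c.get("volumeMounts", [])
        if m["name"] in volume_names
    ]


if __name__ == "__main__":
    sample_pod = {
        "spec": {
            "volumes": [
                {"name": "ovzwk3rnnfxgm216", "secret": {"secretName": "user-info"}}
            ],
            "containers": [
                {
                    "name": "main",
                    "volumeMounts": [
                        # Illustrative mount path only.
                        {"name": "ovzwk3rnnfxgm216", "mountPath": "/etc/flyte/secrets/user-info"}
                    ],
                }
            ],
        }
    }
    print(secret_mounts(sample_pod, "user-info"))
```

An empty list means the volume is declared but no container mounts it, which would explain a missing file even though the volume shows up in the spec.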
y
Seems not
strange
Let me try without "-"
j
definitely seems weird though
y
Without "-", it succeeds
Looks like this is the issue
j
hmm
can you open an issue? would be good to resolve this.
y
But running
flytectl demo start
works just fine
Sure
j
oh wait
user-info
works in
flytectl demo start
but not
flytectl demo start --dev
?
y
Yes!
j
that's surprising
let me try to repro
y
Thank you!!!!!!
j
worked in dev mode for me @Yicheng Lu
try:
make cmd/single/dist
POD_NAMESPACE=flyte go run -tags console cmd/main.go start --config ~/.flyte/flyte-single-binary-local.yaml
y
wondering what
make cmd/single/dist
does. Could I just run
POD_NAMESPACE=flyte flyte start --config ~/.flyte/flyte-single-binary-local.yaml
j
make cmd/single/dist
is to update the flyteconsole dist
since it's bundled
you can ignore
y
Let me try it on a new ec2 instance.
Still get the same error in the new ec2 instance.
I started an empty EC2 instance with only docker, kubectl, and go installed
started sandbox in dev mode
but otherwise the same
y
Oh, wait! It actually works!
I forgot to create the secret 🤣
Thank you for helping!!!!!!
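Since the root cause turned out to be a missing secret, a pre-flight check before launching the workflow can save a round trip. A minimal sketch that shells out to `kubectl get secret`; both helper names are hypothetical, and the namespace in the usage comment (`flytesnacks-development`) is an assumption based on the project and domain in the earlier error, not something confirmed in this thread:

```python
import subprocess
from typing import List


def kubectl_get_secret_cmd(name: str, namespace: str) -> List[str]:
    """Build the kubectl command that looks up a secret by name."""
    return ["kubectl", "get", "secret", name, "--namespace", namespace]


def secret_exists(name: str, namespace: str) -> bool:
    """Return True if kubectl finds the secret (exit code 0)."""
    result = subprocess.run(
        kubectl_get_secret_cmd(name, namespace),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is an expected outcome here
    )
    return result.returncode == 0


# Example usage (requires kubectl on PATH and a reachable cluster;
# the namespace is an assumption):
#   secret_exists("user-info", "flytesnacks-development")
```

This relies on the current kubeconfig context pointing at the sandbox cluster, so it is worth running from the same shell used for `flytectl demo start --dev`.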
g
Amazing!
POD_NAMESPACE=flyte go run -tags console cmd/main.go start --config ~/.flyte/flyte-single-binary-local.yaml
did the trick. Now have a dev environment where I can see my changes 😄 Thank you both so much!
j
🙌