announcements
  • s

    Sandra Youssef

    08/30/2022, 3:33 PM
    One of Flyte's ecosystem projects, UnionML, is the easiest way to build and deploy machine learning models. Join the UnionML crew in the open source planning meeting tomorrow, Aug 30th, 9am PST/12pm EST. Check out the Website and Docs. Calendar Invite & Zoom Link
  • f

    Fabio Grätz

    08/30/2022, 4:24 PM
    Hey everyone 🙂 I have a problem configuring a StructuredDataset to pass a Spark DataFrame around between tasks. I’m following this guide and get this error:
    {
      "asctime": "2022-08-30 16:15:09,048",
      "name": "flytekit",
      "levelname": "ERROR",
      "message": "Failed to convert return value for var o0 with error <class 'ValueError'>: Failed to find a handler for <class 'pyspark.sql.dataframe.DataFrame'>, protocol gs, fmt parquet"
    }
    I added this to my spark config but it doesn’t solve the problem:
    spark-config-default:
              - "spark.jars.packages": "com.google.cloud.bigdataoss:gcs-connector:hadoop3-2.2.2"
              - "spark.hadoop.fs.AbstractFileSystem.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
              - "spark.hadoop.fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
              - "spark.hadoop.google.cloud.auth.service.account.enable": "true"
    Would be great to get some pointers in case somebody has seen this error before, thanks!
  • s

    Sandra Youssef

    08/30/2022, 6:55 PM
    Flyte Monthly Issue # 11 is out! https://lnkd.in/gzqW_E9n Read up on the KubeRay-Flyte Integration, WhyLogs, success stories, latest talks and meet some of Flyte's contributors. Subscribe here: https://lnkd.in/gQyBmi2c
  • h

    Haytham Abuelfutuh

    08/31/2022, 1:31 AM
    @Alex Pozimenko Please meet @Eduardo Apolinario (eapolinario) @Yee they are looking into real world use-cases for what we have been calling Signaling (Human-in-the-loop feature)…
  • z

    Zachary Carrico

    08/31/2022, 1:43 PM
    good morning! Is there a way to disable the use of cache when choosing “Relaunch” from the Flyte console?
  • j

    Justin Tyberg

    08/31/2022, 4:35 PM
    Our flytepropeller is continuously logging the following error (at a rate of ~50/second), and we have no workflows or tasks running. The last workflow was a few days ago.
    {
      "json": {
        "exec_id": "f8114eedda3854878b11",
        "node": "n0/dn102/dn155",
        "ns": "dpp-default",
        "res_ver": "67801339",
        "routine": "worker-6",
        "src": "task_event_recorder.go:27",
        "wf": "dpp:default:msat.level2.workflow.level2_wf"
      },
      "level": "warning",
      "msg": "Failed to record taskEvent, error [EventAlreadyInTerminalStateError: conflicting events; destination: ABORTED, caused by [rpc error: code = FailedPrecondition desc = invalid phase change from SUCCEEDED to ABORTED for task execution {resource_type:TASK project:\"dpp\" domain:\"default\" name:\"msat.level2.proxy.run_splat\" version:\"dpp-b9ef0a90\"  node_id:\"n0-0-dn102-0-dn155\" execution_id:<project:\"dpp\" domain:\"default\" name:\"f8114eedda3854878b11\" >  0 {} [] 0}]]. Trying to record state: ABORTED. Ignoring this error!",
      "ts": "2022-08-31T16:29:14Z"
    }
    Is there a way to “reset” propeller and have it ignore these past errors? Seems the flyte state is in a bad state.
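A quick way to check whether a flood like this comes from one stale execution or from several is to tally the records by execution ID. The following is a minimal sketch, assuming the propeller logs are available as one JSON object per line in the shape shown above; `count_by_execution` is a hypothetical helper name, and the sketch only summarizes the logs, it does not reset anything in propeller.

```python
import json
from collections import Counter


def count_by_execution(log_lines):
    """Count log records per (exec_id, level), skipping lines that are not JSON objects."""
    counts = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON lines mixed into the stream
        if not isinstance(record, dict):
            continue
        exec_id = record.get("json", {}).get("exec_id", "<unknown>")
        counts[(exec_id, record.get("level", "<unknown>"))] += 1
    return counts


# Two shortened copies of the record above plus one stray line.
sample = [
    '{"json": {"exec_id": "f8114eedda3854878b11", "node": "n0/dn102/dn155"}, "level": "warning"}',
    '{"json": {"exec_id": "f8114eedda3854878b11", "node": "n0/dn102/dn155"}, "level": "warning"}',
    "not a json line",
]
counts = count_by_execution(sample)
for (exec_id, level), n in counts.most_common():
    print(exec_id, level, n)
```

If a single execution ID accounts for nearly all of the warnings, that execution is the one to inspect first.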
  • m

    Matheus Moreno

    08/31/2022, 8:38 PM
    Hey, everyone! Is anyone else having problems starting the sandbox? A coworker of mine was trying to start it using FlyteCTL and was getting this error:
    Error: Get "https://127.0.0.1:30086/api/v1/nodes": dial tcp 127.0.0.1:30086: connect: connection refused
    When she tries to execute the Docker image cr.flyte.org/flyteorg/flyte-sandbox directly with docker run, this error happens:
    ...
    Release "flyte-core" does not exist. Installing it now.
    Error: file '/root/.cache/helm/repository/flyte-core-v1.1.0.tgz' does not appear to be a gzipped archive; got 'application/octet-stream'
    I was able to reproduce it on my machine. What could be happening?
  • e

    Eduardo Apolinario (eapolinario)

    08/31/2022, 9:21 PM
    We're aware of an issue with the flyte-core helm chart version 1.1.0 (which is the latest release). This is impacting all scenarios, including flyte deploys and also sandbox. Fix coming up shortly.
  • p

    Python practice

    09/01/2022, 10:32 AM
    Hey Everyone! I'm trying to use Flyte to run the diabetes classification model. I've created an EC2 instance and am trying to run it there, but I'm not able to change the S3 bucket where I would like to store the metadata.
  • p

    Python practice

    09/01/2022, 10:32 AM
    Can anyone suggest where I can modify the s3 bucket name to my requirement ?
  • s

    Sujith Samuel

    09/01/2022, 11:10 AM
    #general When trying to deploy a workflow in Flyte, I am getting messages which seem to indicate that I am exceeding some sort of limit: "error file @[s3://flyte/metadata/propeller/brozzu-smart-ca-pipeline-development-uzxgx5b5or/n3/data/0/error.pb] is too large [17471908] bytes, max allowed [10485760] bytes". Does this mean that there is a max 10 MB size allowed for serialized uploads to Flyte? Please let me know if there is a limit setting for this and, if so, where I can customize it.
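The two byte counts in that message are easier to read once converted. This small sketch only does the arithmetic on the numbers quoted above; `to_mib` is a hypothetical helper. Note that the file named in the message is `error.pb` under the propeller metadata path, so the limit being hit applies to that file.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


max_allowed = 10485760  # "max allowed" value from the message
error_file = 17471908   # reported size of error.pb

print(f"limit: {to_mib(max_allowed):.2f} MiB")
print(f"file:  {to_mib(error_file):.2f} MiB")
print(f"over by {error_file - max_allowed} bytes")
```

The limit works out to exactly 10 MiB, and the error file is roughly two thirds larger than that.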
  • s

    Sandra Youssef

    09/01/2022, 6:48 PM
    Hi Flyers, Listen to the story of the Schibsted group as they establish a machine learning team, define infrastructure requirements and ML workflows, evaluate and adopt Flyte, and share some of their learnings. Many thanks to @Paul Beskow, @Mücahit, @Björn Schiffler, @Yini Gao and @Oleg Ievtushok!

    https://www.youtube.com/watch?v=no26Y4w_S1Q

  • s

    Sujith Samuel

    09/02/2022, 5:02 AM
    #general I am trying to run a flyte workflow from the console and I am getting the below error
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","routine":"worker-2","src":"handler.go:168"},"level":"info","msg":"Processing Workflow.","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:364","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Handling Workflow [an4zmzms9fccqwzzzhk7], id: [project:\"samuel-s3-data\" domain:\"development\" name:\"an4zmzms9fccqwzzzhk7\" ], p [Ready]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:123","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Setting the MetadataDir for StartNode [s3://flytedata/metadata/propeller/samuel-s3-data-development-an4zmzms9fccqwzzzhk7/start-node/data]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:270","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"debug","msg":"Transitioning/Recording event for workflow state transition [Ready] -\u003e [Running]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"admin_eventsink.go:44","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"debug","msg":"AdminEventSink received a new event execution_id:\u003cproject:\"samuel-s3-data\" domain:\"development\" name:\"an4zmzms9fccqwzzzhk7\" \u003e producer_id:\"propeller\" phase:RUNNING occurred_at:\u003cseconds:1662094333 nanos:794696270 \u003e ","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"workflow_event_recorder.go:69","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Failed to record workflow event [execution_id:\u003cproject:\"samuel-s3-data\" domain:\"development\" name:\"an4zmzms9fccqwzzzhk7\" \u003e producer_id:\"propeller\" phase:RUNNING occurred_at:\u003cseconds:1662094333 nanos:794696270 \u003e ] with err: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:351","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"warning","msg":"Event recording failed. Error [EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]]","ts":"2022-09-02T04:52:13Z"}
    I am able to run flytectl commands from console using clientsecret file correctly as well as login to the flyte console properly. However the flytepropeller pod seems to not pick the correct authentication method. Please assist me in setting the correct authentication parameters in the flytepropeller pod configurations.
  • s

    Sathish kumar Venkatesan

    09/02/2022, 7:59 AM
    Team, we configured our own domain name and are able to access the Flyte console. After running a workflow, the jobs succeed, but we are unable to view the pod logs from the console. The link points to http://localhost:30082 as shown below, but our Flyte URL is https://dev-flyte.example.com. Please let me know if anything needs to be updated. Also, we don't see any logs pushed to CloudWatch after the job completes. http://localhost:30082/#!/log/cloudops-max-flyte-demo-development/f82ed701fa1f04c489[…]0-0-driver/pod?namespace=cloudops-max-flyte-demo-development
  • n

    Nimrod Rak

    09/05/2022, 6:59 AM
    Hi everyone! I want to use Flyte for my use case, which is a pipeline of tasks that run on GPU with torch tensors on GPU. Can Flyte pass GPU tensors between tasks in a workflow? Can this be done without copies between the tasks? How can I define concurrency of tasks? I saw this is possible but no documentation on how to do it. Thank you so much in advance!
  • p

    Pontus Wistbacka

    09/05/2022, 3:39 PM
    Hi, I'm sorry if this has been asked before; I've been searching for the answer but couldn't find it. In my current setup, I have a requirement to have Flyte write all temporary files to a non-default location. I was able to make it write some data (directories named like flyte+bunchofletters) to what I specify with the env variable TEMPDIR (used by Python's tempfile library), but Flyte also writes some other stuff to /tmp/flyte. Can I also configure where this other stuff is written? Thanks for the help
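For reference, Python's `tempfile` module consults the environment variables `TMPDIR`, `TEMP`, and `TMP`, in that order, before falling back to platform defaults. The sketch below shows that lookup in isolation; `resolve_tempdir` is a hypothetical helper, and the sketch says nothing about the separate `/tmp/flyte` location mentioned in the message, which `tempfile` does not control.

```python
import os
import tempfile


def resolve_tempdir(env_dir):
    """Return the directory tempfile picks when TMPDIR points at env_dir."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = env_dir
        tempfile.tempdir = None  # drop the cached value so the env var is re-read
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so the rest of the process is unaffected.
        tempfile.tempdir = old_cached
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env


scratch = tempfile.mkdtemp()        # stand-in for the non-default location
chosen = resolve_tempdir(scratch)
print(chosen)
```

Because `tempfile` caches its choice after the first lookup, the variable has to be set before the process first asks for a temporary directory, which in practice means setting it in the container environment rather than in task code.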
  • s

    Sandra Youssef

    09/05/2022, 6:24 PM
    Hi Flyers, Curious about what Flyte has been up to? Want a sneak peek into upcoming improvements and updates? Join the Flyte Community Sync tomorrow 9/6 at 9am PT for Roadmap Updates. Also, hear guest speaker @Fabio Grätz introduce his talk at the upcoming Linux Foundation Open Source Summit in Europe this month, Building Robust ML Production Systems Using OSS Tools for Continuous Delivery for ML (CD4ML). Calendar Invite and Zoom Link
  • h

    Harshit Sharma

    09/06/2022, 12:03 AM
    Hi everyone, I am trying to update the node group instance type through env.yaml while deploying with opta. Currently it's at default t3.medium. However, adding node_instance_type with the desired instance type seems to have no effect. I am adding this under - type: k8s-cluster. Do I need to create a separate node group using opta env to achieve this?
  • s

    Sujith Samuel

    09/06/2022, 8:59 AM
    #general I am trying to pass a TensorFlow h5 model file between tasks but seem to have hit an issue. Is there any example of passing h5 files between tasks?
    [1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes. [adpz8w6l67zrzfglb9m9-n4-0] terminated with exit code (137). Reason [Error]. Message: use python type that flyte support."}
    {"asctime": "2022-09-06 08:39:46,752", "name": "flytekit", "levelname": "WARNING", "message": "Unsupported Type <class 'keras.engine.training.Model'> found, Flyte will default to use PickleFile as the transport. Pickle can only be used to send objects between the exact same version of Python, and we strongly recommend to use python type that flyte support."}
    The message about different versions of Python is also confusing, since all of my tasks are in the same container and use the same versions.
  • e

    Evan Sadler

    09/06/2022, 7:02 PM
    Hey! Has anyone gotten apache-beam to run on Flyte natively? I was going to explore this a bit, but I wanted to see if anyone here has thought through this.
  • e

    Eduardo Apolinario (eapolinario)

    09/07/2022, 7:08 AM
    Github issue tracking this: https://github.com/flyteorg/flyte/issues/2855 We're going to take care of this asap. The expectation is to have another flytekit beta version by tomorrow (9/7).
  • k

    Katrina P

    09/07/2022, 5:26 PM
    For scheduling, it is done in UTC timezone, but is there a way to change just the UI so the console displays a local timezone?
  • s

    Sujith Samuel

    09/07/2022, 6:39 PM
    #general One of my tasks uses MLflow's logging facilities, and when that task runs in the Flyte pod, I get the error below: mlflow.utils.autologging_utils: Encountered unexpected error during autologging: Unable to locate credentials. This is because the Flyte pod does not have the environment variables for the MLflow environment, such as the artefact store and S3 endpoint. My question is how to provide env variables to the task pod that Flyte launches for each task.
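Whichever way the variables end up being injected, a task can fail fast with a readable message when they are absent. This is a minimal sketch of such a check, not a Flyte API: `missing_env` is a hypothetical helper, and the variable list is an assumption about what an MLflow setup with an S3-compatible artefact store typically needs.

```python
import os

# Assumed set of variables for MLflow with an S3-compatible artefact store.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_S3_ENDPOINT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_env(required, environ=None):
    """Return the names in `required` that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real process environment.
example_env = {"MLFLOW_TRACKING_URI": "http://mlflow.example:5000"}
missing = missing_env(REQUIRED_VARS, example_env)
if missing:
    print("missing environment variables:", ", ".join(missing))
```

Calling `missing_env(REQUIRED_VARS)` at the start of the task body would report exactly which variables the pod lacks, instead of the less specific "Unable to locate credentials".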
  • s

    Sandra Youssef

    09/07/2022, 10:41 PM
    Hi Flyers, Tune in to the latest on Flyte's Roadmap, the upcoming Machine Learning Hangout featuring Flyte & Ray, and hear MLOps Lead @Fabio Grätz present an intro on how to build ML systems using open-source tools and a typical software adoption journey, presented in depth at the Open Source Summit in Europe next week. Also check out a Flyte Ecosystem project UnionML review.
  • k

    Kevin Su

    09/08/2022, 2:06 PM
    could you share your docker file?
  • n

    Niels Bantilan

    09/08/2022, 5:48 PM
    hi all 👋, I just wanted to ping the community to ask a quick question 🤔: Say you have a workflow that uses a trained model to generate predictions on a scheduled launchplan. The question is, how do you typically want to get features for that prediction? I.e. when the scheduled workflow kicks off, where are you reading those features from? Do you need the kick-off time as a parameter to fetching data from, say, an s3 bucket or DB?
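One pattern the question hints at is deriving the feature location from the kick-off time. The following is a minimal sketch of that idea only: the bucket name, the `dt=` partition layout, and the one-day lookback are all hypothetical, and `feature_partition` is not part of Flyte.

```python
from datetime import datetime, timedelta, timezone


def feature_partition(kickoff, bucket="example-features", lookback=timedelta(days=1)):
    """Map a timezone-aware kick-off time to a daily feature partition path."""
    if kickoff.tzinfo is None:
        raise ValueError("kickoff must be timezone-aware")
    # Normalize to UTC, then step back to the most recent complete day.
    day = (kickoff.astimezone(timezone.utc) - lookback).date()
    return f"s3://{bucket}/features/dt={day.isoformat()}/"


kickoff = datetime(2022, 9, 8, 17, 48, tzinfo=timezone.utc)
path = feature_partition(kickoff)
print(path)
```

Passing the kick-off time in as an input, rather than calling `datetime.now()` inside the task, keeps a re-run of an old execution pointed at the same partition.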
  • v

    varsha Parthasarathy

    09/08/2022, 7:56 PM
    Question about the BQ plugin: (ref doc https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/gcp/bigquery/index.html) Will the Flyte task also monitor/check whether the query's results are complete? Most of our queries do take a while to execute, so I'm wondering how the pods will behave in this case.
  • t

    Tamis van der Laan

    09/09/2022, 10:55 AM
    Question: does Flyte support continuous training and hyperparameter optimisation?
  • a

    Andrew Korzhuev

    09/09/2022, 3:03 PM
    Hello! We’re running Flyte on AWS EKS (k8s). Due to security concerns, we have a database that resides in a VPC separate from the EKS cluster. We would like to access that DB from one task without giving the whole k8s cluster access to that VPC. Do you know of good strategies to achieve that? So far the ideas we’ve come up with involve using some other AWS service like Fargate or Glue to run the task (as just that task can be run within that specific VPC) and triggering it from Flyte. That feels overly complex, however.
  • s

    Samhita Alla

    09/12/2022, 9:21 AM
    @Prafulla Mahindrakar, could you help?
p

Prafulla Mahindrakar

09/12/2022, 11:52 AM
@Sathish kumar Venkatesan are you able to see other, non-Spark task logs in CloudWatch? Also, for Spark, can you check this, which requires segregating user and system logs: https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/k8s_spark/index.html#step-3-optionally-setup-visibility It would be good to first see whether normal task logs are flowing to CloudWatch correctly with your initial configuration. Also, when you click the log link, you need to be in the same AWS account that the logs are published to.
s

Sathish kumar Venkatesan

09/12/2022, 12:03 PM
@Prafulla Mahindrakar currently I am testing only Spark, and the AWS account is the same between the Flyte config and the account the logs are published to. Is there any command like kubectl logs to check the CloudWatch agent service logs on the Flyte side?
p

Prafulla Mahindrakar

09/12/2022, 12:10 PM
Can you check this thread https://flyte-org.slack.com/archives/C01P3B761A6/p1656366983976689 You might be missing the configuration for the mechanism that pushes the logs, and we seem to be missing docs on it. cc @Samhita Alla
s

Sathish kumar Venkatesan

09/12/2022, 12:26 PM
is fluent-bit mandatory for pushing logs from flyte to cloudwatch?
p

Prafulla Mahindrakar

09/12/2022, 12:33 PM
Amazon recommends using that https://aws.amazon.com/premiumsupport/knowledge-center/cloudwatch-stream-container-logs-eks/ for streaming logs from eks containers https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-EKS-logs.html
s

Sathish kumar Venkatesan

09/12/2022, 12:45 PM
Thanks for that. Since fluent-bit is nowhere mentioned in the Flyte documentation, my assumption was that by default Flyte should be able to stream logs to CloudWatch.
p

Prafulla Mahindrakar

09/12/2022, 12:51 PM
Fair assumption given the missing docs. Would you help by contributing this bit of docs once you have it set up in your environment? It would really benefit the community.
s

Sathish kumar Venkatesan

09/12/2022, 12:53 PM
@Prafulla Mahindrakar sure. once i make it work from my side
k

Ketan (kumare3)

09/12/2022, 1:48 PM
@Sathish kumar Venkatesan Flyte cannot talk about how you push logs. There are multiple options. I do not think it is fair for Flyte to document how to push the logs- this is completely independent
The document should say, if you have logs in cloudwatch then