• Justin Tyberg

    3 weeks ago
    Our flytepropeller is continuously logging the following error (at a rate of ~50/second), and we have no workflows or tasks running. The last workflow was a few days ago.
    {
      "json": {
        "exec_id": "f8114eedda3854878b11",
        "node": "n0/dn102/dn155",
        "ns": "dpp-default",
        "res_ver": "67801339",
        "routine": "worker-6",
        "src": "task_event_recorder.go:27",
        "wf": "dpp:default:msat.level2.workflow.level2_wf"
      },
      "level": "warning",
      "msg": "Failed to record taskEvent, error [EventAlreadyInTerminalStateError: conflicting events; destination: ABORTED, caused by [rpc error: code = FailedPrecondition desc = invalid phase change from SUCCEEDED to ABORTED for task execution {resource_type:TASK project:\"dpp\" domain:\"default\" name:\"msat.level2.proxy.run_splat\" version:\"dpp-b9ef0a90\"  node_id:\"n0-0-dn102-0-dn155\" execution_id:<project:\"dpp\" domain:\"default\" name:\"f8114eedda3854878b11\" >  0 {} [] 0}]]. Trying to record state: ABORTED. Ignoring this error!",
      "ts": "2022-08-31T16:29:14Z"
    }
    Is there a way to "reset" propeller and have it ignore these past errors? It seems the Flyte state is in a bad state.
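    To see whether the warnings all come from one stale execution, the log stream can be tallied per execution. This is a minimal sketch, not a fix: it assumes the logs are available as one JSON object per line (the pretty-printed entry above would need to be compacted first) and uses only the field names visible in that entry (`level`, `json.ns`, `json.exec_id`). The function name and the sample lines are illustrative.

```python
import json
from collections import Counter


def count_warnings_by_execution(lines):
    """Count warning-level log entries per (namespace, exec_id) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict) or entry.get("level") != "warning":
            continue
        ctx = entry.get("json") or {}
        counts[(ctx.get("ns"), ctx.get("exec_id"))] += 1
    return counts


# Illustrative input: two compacted warnings shaped like the entry above,
# plus one info line and one non-JSON line that should both be ignored.
sample = [
    '{"json": {"exec_id": "f8114eedda3854878b11", "ns": "dpp-default"}, "level": "warning", "msg": "Failed to record taskEvent"}',
    '{"json": {"exec_id": "f8114eedda3854878b11", "ns": "dpp-default"}, "level": "warning", "msg": "Failed to record taskEvent"}',
    '{"json": {"exec_id": "f8114eedda3854878b11", "ns": "dpp-default"}, "level": "info", "msg": "Processing Workflow."}',
    "not a json line",
]

counts = count_warnings_by_execution(sample)
for (ns, exec_id), n in counts.most_common():
    print(f"{ns}/{exec_id}: {n} warnings")
```

    If a single exec_id accounts for nearly all of the warnings, that execution is the one to look at first.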
  • Matheus Moreno

    3 weeks ago
    Hey, everyone! Is anyone else having problems starting the sandbox? A coworker of mine was trying to start it using FlyteCTL and was getting this error:
    Error: Get "https://127.0.0.1:30086/api/v1/nodes": dial tcp 127.0.0.1:30086: connect: connection refused
    When she tries to run the Docker image cr.flyte.org/flyteorg/flyte-sandbox directly with docker run, this error happens:
    ...
    Release "flyte-core" does not exist. Installing it now.
    Error: file '/root/.cache/helm/repository/flyte-core-v1.1.0.tgz' does not appear to be a gzipped archive; got 'application/octet-stream'
    I was able to reproduce it on my machine. What could be happening?
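    The Helm message says the cached chart file is not a gzip archive. One way to confirm that independently is to look at the first two bytes of the file, since a gzip stream always starts with 0x1f 0x8b. The sketch below is only an illustration of that check, not a description of what Helm does internally; the helper name and the temporary demo files are hypothetical, and in practice the path from the error message would be passed in.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


# Demo with two throwaway files: a real gzip file and a non-archive payload.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tgz")
    bad = os.path.join(tmp, "bad.tgz")
    with gzip.open(good, "wb") as f:
        f.write(b"chart contents")
    with open(bad, "wb") as f:
        f.write(b"<html>not an archive</html>")
    good_result = looks_like_gzip(good)
    bad_result = looks_like_gzip(bad)

print(good_result, bad_result)
```

    A file that fails this check was most likely published or downloaded in a broken state, so re-running the install against the same cached file would not be expected to help.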
  • Eduardo Apolinario (eapolinario)

    3 weeks ago
    We're aware of an issue with the flyte-core helm chart version 1.1.0 (which is the latest release). This is impacting all scenarios, including flyte deploys and also sandbox. Fix coming up shortly.
  • Python practice

    3 weeks ago
    Hey everyone! I'm trying to use Flyte to run the diabetes classification model. I've created an EC2 instance and am trying to run it there, but I'm not able to change the S3 bucket in which I would like to store the metadata.
  • Python practice

    3 weeks ago
    Can anyone suggest where I can change the S3 bucket name to suit my requirements?
  • Sujith Samuel

    3 weeks ago
    #general When trying to deploy a workflow in Flyte, I am getting messages which seem to indicate that I am exceeding some sort of limit:
    error file @[s3://flyte/metadata/propeller/brozzu-smart-ca-pipeline-development-uzxgx5b5or/n3/data/0/error.pb] is too large [17471908] bytes, max allowed [10485760] bytes
    Does this mean that there is a maximum size of 10 MB for serialized uploads to Flyte? Please let me know whether this limit is configurable and, if so, where I can customize it.
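    For reference, the two byte counts in the error above can be converted to mebibytes with a quick calculation. The numbers are copied from the message; the variable names are only illustrative.

```python
MIB = 1024 * 1024  # bytes per mebibyte

max_allowed_bytes = 10485760  # "max allowed" value from the error message
actual_bytes = 17471908       # reported size of error.pb

max_mib = max_allowed_bytes / MIB            # exactly 10.0
actual_mib = actual_bytes / MIB              # between 16 and 17
over_by = actual_bytes - max_allowed_bytes   # bytes above the limit

print(f"limit: {max_mib:.2f} MiB, file: {actual_mib:.2f} MiB, over by {over_by} bytes")
```

    So the limit in the message is exactly 10 MiB, and the error file is well over it rather than marginally so.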
  • Sandra Youssef

    3 weeks ago
    Hi Flyers, Listen to the story of the Schibsted group as they establish a machine learning team, define infrastructure requirements and ML workflows, evaluate and adopt Flyte, and share some of their learnings. Many thanks to @Paul Beskow, @Mücahit, @Björn Schiffler, @Yini Gao and @Oleg Ievtushok!

    https://www.youtube.com/watch?v=no26Y4w_S1Q

  • Sujith Samuel

    3 weeks ago
    #general I am trying to run a Flyte workflow from the console and I am getting the error below:
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","routine":"worker-2","src":"handler.go:168"},"level":"info","msg":"Processing Workflow.","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:364","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Handling Workflow [an4zmzms9fccqwzzzhk7], id: [project:"samuel-s3-data" domain:"development" name:"an4zmzms9fccqwzzzhk7" ], p [Ready]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:123","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Setting the MetadataDir for StartNode [s3://flytedata/metadata/propeller/samuel-s3-data-development-an4zmzms9fccqwzzzhk7/start-node/data]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:270","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"debug","msg":"Transitioning/Recording event for workflow state transition [Ready] -\u003e [Running]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"admin_eventsink.go:44","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"debug","msg":"AdminEventSink received a new event execution_id:\u003cproject:"samuel-s3-data" domain:"development" name:"an4zmzms9fccqwzzzhk7" \u003e producer_id:"propeller" phase:RUNNING occurred_at:\u003cseconds:1662094333 nanos:794696270 \u003e ","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"workflow_event_recorder.go:69","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"info","msg":"Failed to record workflow event [execution_id:\u003cproject:"samuel-s3-data" domain:"development" name:"an4zmzms9fccqwzzzhk7" \u003e producer_id:"propeller" phase:RUNNING occurred_at:\u003cseconds:1662094333 nanos:794696270 \u003e ] with err: EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]","ts":"2022-09-02T04:52:13Z"}
    {"json":{"exec_id":"an4zmzms9fccqwzzzhk7","ns":"samuel-s3-data-development","res_ver":"39913921","routine":"worker-2","src":"executor.go:351","wf":"samuel-s3-data:development:s3_data.my_wokflow.my_test_workflow"},"level":"warning","msg":"Event recording failed. Error [EventSinkError: Error sending event, caused by [rpc error: code = Unauthenticated desc = token parse error [JWT_VERIFICATION_FAILED] Could not retrieve id token from metadata, caused by: rpc error: code = Unauthenticated desc = Request unauthenticated with IDToken]]","ts":"2022-09-02T04:52:13Z"}
    I am able to run flytectl commands from the console using the clientsecret file, and I can log in to the Flyte console. However, the flytepropeller pod does not seem to pick up the correct authentication method. Please help me set the correct authentication parameters in the flytepropeller pod configuration.
  • Sathish kumar Venkatesan

    3 weeks ago
    Team, when I execute this command:
    pyflyte run --remote -p flytetester --image <AWC_ACCOUNT>.dkr.ecr.us-east-1.amazonaws.com/flyte-pyspark:latest flyte/workflows/spark_example.py my_spark --triggered_date 2022-08-29
    I get the exception below.