Powered by Linen
flyte-github
  • g

    GitHub

    01/10/2023, 9:39 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyteplugins/tree/master|master>
    by hamersaw
    <https://github.com/flyteorg/flyteplugins/commit/109224c2a0e65782fee53336b46cbe4bd0d2d189|109224c2>
    - added raw-container to registered task types (#305) flyteorg/flyteplugins
  • g

    GitHub

    01/10/2023, 9:41 PM
    Release - v1.0.29 New release published by flyte-bot Changelog • 109224c added raw-container to registered task types (#305) flyteorg/flyteplugins
  • g

    GitHub

    01/10/2023, 9:47 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyteidl/tree/master|master>
    by katrogan
    <https://github.com/flyteorg/flyteidl/commit/9fbac98b2d173fe1b30f18ac0487ff35b9627e3b|9fbac98b>
    - Add raw claims to user info response (#357) flyteorg/flyteidl
  • g

    GitHub

    01/10/2023, 9:54 PM
    Release - v1.3.3 New release published by flyte-bot Changelog • 9fbac98 Add raw claims to user info response (#357) flyteorg/flyteidl
  • g

    GitHub

    01/10/2023, 10:08 PM
    #511 Forward all claims in userinfo response Pull request opened by katrogan Signed-off-by: Katrina Rogan katroganGH@gmail.com TL;DR Return all claims from OIDC user info server response as custom metadata headers. Type ☐ Bug Fix ☑︎ Feature ☐ Plugin Are all requirements met? ☑︎ Code completed ☑︎ Smoke tested ☑︎ Unit tests added ☑︎ Code documentation added ☑︎ Any pending items have an associated Issue Complete description NA Tracking Issue fixes flyteorg/flyte#3225 Follow-up issue NA flyteorg/flyteadmin GitHub Actions: Unit Tests / Run Unit Test GitHub Actions: Lint / Run Lint GitHub Actions: Docker Build Images / Build Docker Image GitHub Actions: Check Go Generate / Go Generate ✅ 2 other checks have passed 2/6 successful checks
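The TL;DR of #511 above (return every claim from the OIDC user info response as custom metadata headers) can be pictured with a small Python sketch. This is not flyteadmin's implementation, which is written in Go; the function name `claims_to_metadata`, the `x-user-claim-` prefix, and the JSON encoding of non-string values are all assumptions made for illustration.

```python
import json
from typing import List, Mapping, Tuple


def claims_to_metadata(
    claims: Mapping[str, object], prefix: str = "x-user-claim-"
) -> List[Tuple[str, str]]:
    """Flatten a userinfo claims mapping into (header, value) pairs.

    The prefix and the encoding are hypothetical, chosen only to show
    the idea of forwarding every claim rather than a fixed subset.
    """
    pairs = []
    for name, value in sorted(claims.items()):
        # Header values must be strings, so non-string claims are JSON-encoded.
        encoded = value if isinstance(value, str) else json.dumps(value, sort_keys=True)
        pairs.append((prefix + name.lower(), encoded))
    return pairs


# Made-up claims, as an OIDC userinfo endpoint might return them.
claims = {"sub": "abc123", "groups": ["admins", "ml"], "email_verified": True}
metadata = claims_to_metadata(claims)
```

Sorting the claim names keeps the output deterministic, which makes the forwarding step easier to test.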
  • g

    GitHub

    01/10/2023, 10:09 PM
    #3226 [Core feature] map_task should be able to handle a partitioned StructuredDataset Issue created by cosmicBboy Motivation: Why do you think this is important? As a data practitioner, I should be able to apply a
    map_task
    to a partitioned StructuredDataset automatically so that I can process the partitions in an embarrassingly parallel fashion without too much extra code. Goal: What should the final outcome look like, ideally? Suppose we have a task that produces a
    StructuredDataset
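    import random
    import string
    from typing import List

    import pandas as pd
    from flytekit import map_task, task, workflow
    from flytekit.types.structured import StructuredDataset

    @task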
    def make_df() -> StructuredDataset:
        df = pd.DataFrame.from_records([
            {
                "id": i,
                "partition": (i % 10) + 1,
                "name": "".join(
                    random.choices(string.ascii_uppercase + string.digits, k=10)
                )
            }
            for i in range(1000)
        ])
        return StructuredDataset(dataframe=df, partition_col=["partition"])
    Ideally, I should be able to do something like this:
    @task
    def process_df(dataset: StructuredDataset) -> StructuredDataset:
        df = dataset.open(pd.DataFrame).read_partition()  # read the partition (proposed API)
        ... # do stuff
    
    @task
    def use_processed_df(dataset: List[StructuredDataset]) -> ...:
        ...
    
    @workflow
    def wf() -> StructuredDataset:
        structured_dataset = make_df()
        # where structured_dataset.partitions is a list of unpartitioned StructuredDatasets
        results: List[StructuredDataset] = map_task(process_df)(dataset=structured_dataset.partitions)
        return use_processed_df(dataset=results)
    Note that in this example code a few magical things are happening: 1. we pass in
    structured_dataset.partitions
    into the map task, which indicates that we want to apply
    process_df
    to each of the partitions defined in
    make_df
    2. The fact that
    map_task(process_df)
    returns a
    StructuredDataset
    implies that using map tasks with structured datasets does an implicit reduction, i.e. the outputs of
    map_task(process_df)
    are written to the same blob store prefix. Ideally the solution enables processing of
    StructuredDataset
    without having to manually handle reading in of partitions in the map task, and automatically reduces the results into a
    StructuredDataset
    without having to explicitly write a coalesce/reduction task. Describe alternatives you've considered Users would have to roll their own way of processing partitions of a structured dataset using dynamic tasks. Propose: Link/Inline OR Additional context Slack context: https://flyte-org.slack.com/archives/CP2HDHKE1/p1673380243923279 Related to #3219 Are you sure this issue hasn't been raised already? ☑︎ Yes Have you read the Code of Conduct? ☑︎ Yes flyteorg/flyte
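The split / map / reduce pattern that #3226 asks Flyte to handle automatically can be sketched in plain Python, without flytekit or pandas. This only illustrates the data flow a user would otherwise hand-roll; `split_partitions`, `process_partition`, and `map_reduce` are hypothetical helper names, and the lowercase transform stands in for the "do stuff" step.

```python
import random
import string
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]


def make_records(n: int = 1000, seed: int = 0) -> List[Record]:
    """Build rows shaped like the ones produced by make_df in the issue."""
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    return [
        {"id": i, "partition": (i % 10) + 1, "name": "".join(rng.choices(alphabet, k=10))}
        for i in range(n)
    ]


def split_partitions(records: List[Record], key: str) -> Dict[object, List[Record]]:
    """Group rows by the partition column (the role of `.partitions` in the proposal)."""
    parts: Dict[object, List[Record]] = defaultdict(list)
    for row in records:
        parts[row[key]].append(row)
    return dict(parts)


def process_partition(rows: List[Record]) -> List[Record]:
    """Hypothetical per-partition work: lowercase the name column."""
    return [{**row, "name": str(row["name"]).lower()} for row in rows]


def map_reduce(
    records: List[Record], key: str, fn: Callable[[List[Record]], List[Record]]
) -> List[Record]:
    """Apply fn to each partition independently, then concatenate the results."""
    partitions = split_partitions(records, key)
    mapped = [fn(rows) for _, rows in sorted(partitions.items())]
    return [row for part in mapped for row in part]


records = make_records()
result = map_reduce(records, "partition", process_partition)
```

Each call to `fn` touches only one partition, so the map step is embarrassingly parallel; the final concatenation is the implicit reduction the issue describes.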
  • g

    GitHub

    01/10/2023, 11:00 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyteadmin/tree/master|master>
    by katrogan
    <https://github.com/flyteorg/flyteadmin/commit/1ccd59c249e2305185bd7e5cd9340c4f60c6e8bd|1ccd59c2>
    - Forward all claims in userinfo response (#511) flyteorg/flyteadmin
  • g

    GitHub

    01/10/2023, 11:26 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flytekit/tree/master|master>
    by wild-endeavor
    <https://github.com/flyteorg/flytekit/commit/581a5c66b6dec1d105f8f655918c405665ceebf6|581a5c66>
    - Update default config to work out-of-the-box with flytectl demo (#1384) flyteorg/flytekit
  • g

    GitHub

    01/10/2023, 11:36 PM
    #3116 Update Administrator docs Pull request opened by wild-endeavor Deployment Documentation Restructure and update some deployment docs. Main changes: • Remove all the cloud specific stuff. • Point users to a newly created GH Discussion topic for submitting articles on deployment experiences. • Use the new
    flyte-binary
    Helm chart. • Remove a lot of the sandbox stuff since the new
    flytectl demo
    environment is architected differently. • Change instructions to use
    flytectl demo
    instead of
    flytectl sandbox
    • Remove
    ideal_flow.rst
    - this is incomplete. We should eventually offer some basic gh workflow examples rather than just talking about it. Structure Changes Bundling all Flyte configuration under the deployment moniker felt a bit of a stretch. I think renaming it to a broader administrator's guide makes more sense. • Currently the generated component configs are missing, but this will end up changing with the partial mono-repo work anyways. Deployment Updates One of the things we want to do is change the deployment journey. We want to make sure users are led comfortably through the various stages that one might expect of something as complex as Flyte. Basically the steps of the journey should be 1. flytectl demo sandbox 2. a simple cloud (eks/gke) based deployment with nothing tricky - just helm install. users will have to port-forward to see anything. 3. a production ready deployment (ingress, auth, etc) (think stable enough for most companies). 4. a scalable multicluster setup (lots of deployments will never need this level). The middle steps will replace the aws/gcp guides that we have today. These have been completely deleted. From there we should link to a revamped version of the
    ideal_flow.rst
    file I think. #2993 flyteorg/flyte ✅ All checks have passed 11/11 successful checks
  • g

    GitHub

    01/10/2023, 11:55 PM
    Release - v1.1.70 New release published by flyte-bot Changelog • 1ccd59c Forward all claims in userinfo response (#511) flyteorg/flyteadmin
  • g

    GitHub

    01/11/2023, 4:47 AM
    #3228 Add Kubernetes objects for dev mode with single-binary sandbox Pull request opened by jeevb on 2023-01-11T04:11:04Z This PR completes support for running Flyte locally against a sandbox cluster running on your machine. Specifically, it adds support for Propeller's secret webhook. It does so by creating a headless service pointing to the docker host's IP address. The host IP is injected via the
    %{HOST_GATEWAY_IP}%
    template variable. flyteorg/flyte GitHub Actions: trigger-sandbox-lite-build GitHub Actions: trigger-single-binary-build GitHub Actions: compile GitHub Actions: Functional test ✅ 6 other checks have passed 6/10 successful checks
  • g

    GitHub

    01/11/2023, 12:51 PM
    #164 Change SdkBindingData to be typed and add typed transform outputs #minor Pull request opened by sonjaer TL;DR This PR tries to accomplish two features: Change the SdkBindingData to be typed SdkBindingData<?> This change allows the user to know the inner type of the attributes from INPUT and OUTPUT at development time. Add typed transform outputs This change allows getting the outputs of a SdkNode.getOutputs() as a typed output, using an AutoValue class or a case class. Type ☐ Bug Fix ☑︎ Feature ☐ Plugin Are all requirements met? ☑︎ Code completed ☐ Smoke tested ☑︎ Unit tests added ☐ Code documentation added ☐ Any pending items have an associated Issue Complete description Change the SdkBindingData to be typed SdkBindingData<?> This change allows the user to know the inner type of the attributes from INPUT and OUTPUT at development time. Before this change, users had to go to the input/output class to figure out the type of a given attribute. Now, the SdkBindingData<?> shows the inner type, making the user experience smoother. Add typed transform outputs This change allows getting the outputs of a SdkNode.getOutputs() as a typed output, using an AutoValue class or a case class. These changes force all the input/output class attributes to be SdkBindingData<?> and now you need to call
    SdkBindingData.get()
    to get the inner value in the run task context (java example, scala example), but at the same time these changes allow recovering the attributes by name at the workflow site (java example, scala example). Tracking Issue • flyteorg/flyte#3250 • flyteorg/flyte#3251 Follow-up issue • flyteorg/flyte#3252 flyteorg/flytekit-java ✅ All checks have passed 3/3 successful checks
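The typed-binding idea in #164 can be sketched with a generic wrapper. The PR itself targets Java and Scala, so the code below is a Python stand-in and not the flytekit-java API: `BindingData`, `SumInput`, `SumOutput`, and `run_sum` are hypothetical names used only to show how a typed wrapper exposes the inner type at development time and is unwrapped with `get()` inside the task.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class BindingData(Generic[T]):
    """Hypothetical analogue of a typed binding: wraps a value of type T."""

    value: T

    def get(self) -> T:
        # Unwrap the inner value, as a task body would at run time.
        return self.value


@dataclass(frozen=True)
class SumInput:
    # The annotation tells the reader (and a type checker) that these are ints.
    a: BindingData[int]
    b: BindingData[int]


@dataclass(frozen=True)
class SumOutput:
    total: BindingData[int]


def run_sum(inputs: SumInput) -> SumOutput:
    """Task body: unwrap the inputs with get(), wrap the result again."""
    return SumOutput(total=BindingData(inputs.a.get() + inputs.b.get()))


out = run_sum(SumInput(a=BindingData(2), b=BindingData(3)))
```

Because the output is a typed class rather than an untyped map, attributes such as `out.total` can be referenced by name when wiring a workflow.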
  • g

    GitHub

    01/11/2023, 7:15 PM
    #1408 Return error code on fail Pull request opened by pingsutw Signed-off-by: Kevin Su pingsutw@gmail.com TL;DR AWS Batch gets the status of a job from its return code. Therefore, we should return an error code once we catch the error. Blocked by flyteorg/flyteplugins#306 Type ☑︎ Bug Fix ☐ Feature ☐ Plugin Are all requirements met? ☑︎ Code completed ☑︎ Smoke tested ☑︎ Unit tests added ☐ Code documentation added ☐ Any pending items have an associated Issue Complete description

    Tracking Issue https://flyte-org.slack.com/archives/C01P3B761A6/p1664972219043289 flyteorg/flyte#2979 flyteorg/flytekit ✅ All checks have passed 30/30 successful checks
  • g

    GitHub

    01/11/2023, 9:21 PM
    #517 use different perm Pull request opened by wild-endeavor Signed-off-by: Yee Hing Tong wild-endeavor@users.noreply.github.com Read then delete this section • Make sure to use a concise title for the pull-request. • Use #patch, #minor or #major in the pull-request title to bump the corresponding version. Otherwise, the patch version will be bumped. More details TL;DR Please replace this text with a description of what this PR accomplishes. Type ☐ Bug Fix ☐ Feature ☐ Plugin Are all requirements met? ☐ Code completed ☐ Smoke tested ☐ Unit tests added ☐ Code documentation added ☐ Any pending items have an associated Issue Complete description How did you fix the bug, make the feature etc. Link to any design docs etc Tracking Issue Remove the 'fixes' keyword if there will be multiple PRs to fix the linked issue fixes https://github.com/flyteorg/flyte/issues/ Follow-up issue NA OR https://github.com/flyteorg/flyte/issues/ flyteorg/flytepropeller
  • g

    GitHub

    01/11/2023, 9:50 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flytepropeller/tree/master|master>
    by hamersaw
    <https://github.com/flyteorg/flytepropeller/commit/ce57dbf15274a77c7037f367bda738e6ce33b453|ce57dbf1>
    - use different perm (#517) flyteorg/flytepropeller
  • g

    GitHub

    01/11/2023, 10:11 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyte/tree/master|master>
    by eapolinario
    <https://github.com/flyteorg/flyte/commit/782fe745452a07294db47f6ff782e5c559ac3c92|782fe745>
    - Add dask operator (#3145) flyteorg/flyte
  • g

    GitHub

    01/11/2023, 10:13 PM
    #3209 Changelog for 1.3 Pull request opened by wild-endeavor on 2023-01-04T22:36:48Z Just the changelog, no code changes. flyteorg/flyte GitHub Actions: generate_kustomize GitHub Actions: compile GitHub Actions: docs ✅ 3 other checks have passed 3/6 successful checks
  • g

    GitHub

    01/11/2023, 10:46 PM
    Release - v1.1.62 New release published by flyte-bot Changelog • ce57dbf use different perm (#517) flyteorg/flytepropeller
  • g

    GitHub

    01/11/2023, 10:48 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyte/tree/master|master>
    by jeevb
    <https://github.com/flyteorg/flyte/commit/c674bfa31ef0d262112b083950c4425480a18bc3|c674bfa3>
    - Add Kubernetes objects for dev mode with single-binary sandbox (#3228) flyteorg/flyte
  • g

    GitHub

    01/11/2023, 10:50 PM
    #380 change host Pull request opened by wild-endeavor Signed-off-by: Yee Hing Tong wild-endeavor@users.noreply.github.com Read then delete • Make sure to use a concise title for the pull-request. • Use #patch, #minor, #major or #none in the pull-request title to bump the corresponding version. Otherwise, the patch version will be bumped. More details TL;DR Please replace this text with a description of what this PR accomplishes. Type ☐ Bug Fix ☐ Feature ☐ Plugin Are all requirements met? ☐ Code completed ☐ Smoke tested ☐ Unit tests added ☐ Code documentation added ☐ Any pending items have an associated Issue Complete description How did you fix the bug, make the feature etc. Link to any design docs etc Tracking Issue https://github.com/flyteorg/flyte/issues/ Follow-up issue NA OR https://github.com/flyteorg/flyte/issues/ flyteorg/flytectl
  • g

    GitHub

    01/11/2023, 10:55 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyte/tree/master|master>
    by eapolinario
    <https://github.com/flyteorg/flyte/commit/c19c282df236b4f7917a64f7928eeaa121723b56|c19c282d>
    - Changelog for 1.3 (#3209) flyteorg/flyte
  • g

    GitHub

    01/11/2023, 11:12 PM
    #3230 Update Flyte components Pull request opened by flyte-bot Updated flyte deployment • Updated GCP Flyte kustomize generated manifest file • Updated EKS Flyte kustomize generated manifest file • Updated Sandbox Flyte kustomize generated manifest file • Updated TEST Flyte kustomize generated manifest file • Updated GCP Flyte helm generated manifest file • Updated EKS Flyte helm generated manifest file • Updated Sandbox Flyte helm generated manifest file • Updated TEST Flyte helm generated manifest file • Auto-generated by [flyte-bot] flyteorg/flyte
  • g

    GitHub

    01/11/2023, 11:14 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flytectl/tree/master|master>
    by wild-endeavor
    <https://github.com/flyteorg/flytectl/commit/b0d98931bf8780c193f9fc101242309058138795|b0d98931>
    - Change extra host to host-gateway (#380) flyteorg/flytectl
  • g

    GitHub

    01/11/2023, 11:33 PM
    1 new commit pushed to
    <https://github.com/flyteorg/flyte/tree/master|master>
    by eapolinario
    <https://github.com/flyteorg/flyte/commit/f69fb09ca189e8bf57e1a6a12db168274f640d15|f69fb09c>
    - Update Flyte components (#3230) flyteorg/flyte
  • g

    GitHub

    01/11/2023, 11:37 PM
    Release - v0.6.26 New release published by flyte-bot Changelog • b0d9893 Change extra host to host-gateway (#380) flyteorg/flytectl
  • g

    GitHub

    01/11/2023, 11:37 PM
    1 new commit pushed to
    <https://github.com/flyteorg/homebrew-tap/tree/main|main>
    by flyte-bot
    <https://github.com/flyteorg/homebrew-tap/commit/6d942c75f7f2a62d6180fb5d16c454d6e28c8d54|6d942c75>
    - Brew formula update for flytectl version v0.6.26 flyteorg/homebrew-tap
  • g

    GitHub

    01/11/2023, 11:57 PM
    #151 auto-update release Pull request opened by flyte-bot Automated changes by create-pull-request GitHub action flyteorg/flyteorg.github.io
  • g

    GitHub

    01/11/2023, 11:57 PM
    Release - Flyte v1.3.0 milestone release New release published by flyte-bot Flyte v1.3.0 The main features of this 1.3 release are • Databricks support as part of the Spark plugin • New Helm chart that offers a simpler deployment using just one Flyte service • Signaling/gate node support (human in the loop tasks) • User documentation support (backend and flytekit only, limited types) The latter two are pending some work in Flyte console; they will be piped through fully by the end of Q1. Setting and approving gate nodes is, however, already supported in
    FlyteRemote
    , though only a limited set of types can be passed in. Notes There are a couple of things to point out with this release. Caching on Structured Dataset Please take a look at the flytekit PR notes for more information, but if you haven't bumped Propeller to version v1.1.36 (aka Flyte v1.2) or later, cached tasks that take a dataframe or a structured dataset type as input will trigger a cache miss. If you've upgraded Propeller, they will not. Flytekit Remote Types In the
    FlyteRemote
    experience, fetched tasks and workflows will now be based on their respective "spec" classes in the IDL (task/wf) rather than the template. The spec messages are a superset of the template messages so no information is lost. If you have code that was accessing elements of the templates directly however, these will need to be updated. Usage Overview Databricks Please refer to the documentation for setting up Databricks. Databricks is a subclass of the Spark task configuration so you'll be able to use the new class in place of the more general
    Spark
    configuration.
    from flytekit import task
    from flytekitplugins.spark import Databricks
    @task(
        task_config=Databricks(
            spark_conf={
                "spark.driver.memory": "1000M",
                "spark.executor.memory": "1000M",
                "spark.executor.cores": "1",
                "spark.executor.instances": "2",
                "spark.driver.cores": "1",
            },
            databricks_conf={
                "run_name": "flytekit databricks plugin example",
                "new_cluster": {
                    "spark_version": "11.0.x-scala2.12",
                    "node_type_id": "r3.xlarge",
                    "aws_attributes": {
                        "availability": "ON_DEMAND",
                        "instance_profile_arn": "arn:aws:iam::1237657460:instance-profile/databricks-s3-role",
                    },
                    "num_workers": 4,
                },
                "timeout_seconds": 3600,
                "max_retries": 1,
            }
        ))
    New Deployment Type A couple of releases ago, we introduced a new Flyte executable that combined all the functionality of Flyte's backend into one command. This simplifies the deployment in that only one image needs to run now. This approach is now our recommended way for newcomers to the project to install and administer Flyte, and there is a new Helm chart as well. Documentation has been updated to take this into account. For new installations of Flyte, on clusters that do not already have the
    flyte-core
    or
    flyte
    charts installed, users can
    helm install flyte-server flyteorg/flyte-binary --namespace flyte --values your_values.yaml
    New local demo environment Users may have noticed that the environment provided by
    flytectl demo start
    has also been updated to use this new style of deployment, and internally now installs this new Helm chart. The demo cluster now also exposes an internal docker registry on port
    30000
    . That is, with the new demo cluster up, you can tag and push to
    localhost:30000/yourimage:tag123
    and the image will be accessible to the internal Docker daemon. The web interface is still at
    localhost:30080
    , Postgres has been moved to
    30001
    and the Minio API (not web server) has been moved to
    30002
    . Human-in-the-loop Workflows Users can now insert sleeps, approval, and input requests, in the form of gate nodes. Check out one of our earlier issues for background information.
    from datetime import timedelta

    from flytekit import approve, sleep, wait_for_input, workflow
    
    @workflow
    def mainwf(a: int):
        x = t1(a=a)
        s1 = wait_for_input("signal-name", timeout=timedelta(hours=1), expected_type=bool)
        s2 = wait_for_input("signal name 2", timeout=timedelta(hours=2), expected_type=int)
        z = t1(a=5)
        zzz = sleep(timedelta(seconds=10))
        y = t2(a=s2)
        q = t2(a=approve(y, "approvalfory", timeout=timedelta(hours=2)))
        x >> s1
        s1 >> z
        z >> zzz
        ...
    These also work inside
    @dynamic
    tasks. Interacting with signals from flytekit's remote experience looks like
    from flytekit.remote.remote import FlyteRemote
    from flytekit.configuration import Config
    r = FlyteRemote(
        Config.auto(config_file="/Users/ytong/.flyte/dev.yaml"),
        default_project="flytesnacks",
        default_domain="development",
    )
    r.list_signals("atc526g94gmlg4w65dth")
    r.set_signal("signal-name", "execidabc123", True)
    Overwritten Cached Values on Execution Users can now configure a workflow execution to overwrite the cache. Each task in the workflow execution, regardless of previous cache status, will execute and write cached values, overwriting previous values if necessary. This allows previously corrupted cache values to be corrected without the tedious process of incrementing the
    cache_version
    and re-registering Flyte workflows / tasks. Support for Dask Users will be able to spawn Dask ephemeral clusters as part of their workflows, similar to the support for Ray and Spark. Looking Ahead In the coming release, we are focusing on... 1. Out of core plugin: Make backend plugins scalable and easy to author, with no need for code generation or for tools that MLEs and Data Scientists are not accustomed to using. 2. Performance Observability: We have made great progress on exposing both finer-grained runtime metrics and Flyte's orchestration metrics. This is important to better understand workflow evaluation performance and mitigate inefficiencies thereof. flyteorg/flyte
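The cache-overwrite behaviour described in the v1.3.0 notes above can be pictured with a toy in-memory cache. This is a conceptual sketch only, not Flyte's caching code: `run_cached`, the `(name, cache_version, input)` key layout, and the `overwrite_cache` flag as spelled here are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, int]  # (task name, cache_version, input value)


def run_cached(
    cache: Dict[CacheKey, int],
    name: str,
    cache_version: str,
    fn: Callable[[int], int],
    x: int,
    overwrite_cache: bool = False,
) -> int:
    """Return the cached result unless overwrite_cache forces a re-run."""
    key = (name, cache_version, x)
    if key in cache and not overwrite_cache:
        return cache[key]
    # Cache miss or forced overwrite: execute and (re)write the entry.
    value = fn(x)
    cache[key] = value
    return value


def square(v: int) -> int:
    return v * v


# Simulate a corrupted entry: the stored value for square(3) is wrong.
cache: Dict[CacheKey, int] = {("square", "1.0", 3): -1}

stale = run_cached(cache, "square", "1.0", square, 3)  # served from cache
fixed = run_cached(cache, "square", "1.0", square, 3, overwrite_cache=True)  # re-executed
after = run_cached(cache, "square", "1.0", square, 3)  # served from repaired cache
```

The corrupted value is repaired without changing `cache_version`, which is the point of the feature: the key stays the same and only the stored value is replaced.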
  • g

    GitHub

    01/11/2023, 11:58 PM
    Deployment to github-pages by flyte-bot flyteorg/flyte
  • g

    GitHub

    01/12/2023, 1:12 AM
    #3232 [Core feature] Pass inputs inline with execution event data Issue created by katrogan Motivation: Why do you think this is important? Follow-up to #1327. For a distributed propeller deployment it's useful to pass executions inputs inline rather than as an offloaded URI. Goal: What should the final outcome look like, ideally? Raw inputs should be (configurably) sent inline in execution events Describe alternatives you've considered Current implementation Propose: Link/Inline OR Additional context No response Are you sure this issue hasn't been raised already? ☑︎ Yes Have you read the Code of Conduct? ☑︎ Yes flyteorg/flyte