Powered by Linen
ask-the-community
  • s

    Sanjay Chouhan

    09/28/2022, 9:39 AM
    https://docs.flyte.org/projects/cookbook/en/latest/auto/larger_apps/larger_apps_deploy.html#build-deploy-your-application-to-the-cluster Should the Docker image have the workflow code inside it?
  • h

    Hampus Rosvall

    09/28/2022, 12:41 PM
    Hey, what’s the usage of `{flyte/sandbox}.config`, and what are the best practices for using it as a production config? Referencing this sandbox.config
  • a

    Anthony

    09/28/2022, 12:53 PM
    Hey everyone again 🙌 I'm seeing a workflow get stuck when triggering the next task. My main workflow is shown in the attached picture. The first step, `preproc_and_split`, executed successfully:
    pyflyte-execute
      --inputs s3://my-s3-bucket/metadata/propeller/flyte-anti-fraud-ml-development-a27rchl5z9ndpw297nk8/n0/data/inputs.pb
      --output-prefix s3://my-s3-bucket/metadata/propeller/flyte-anti-fraud-ml-development-a27rchl5z9ndpw297nk8/n0/data/0
      --raw-output-data-prefix s3://my-s3-bucket/vo/a27rchl5z9ndpw297nk8-n0-0
      --checkpoint-path s3://my-s3-bucket/vo/a27rchl5z9ndpw297nk8-n0-0/_flytecheckpoints
      --prev-checkpoint ""
      --resolver flytekit.core.python_auto_container.default_task_resolver
      --
      task-module app.workflow
      task-name preproc_and_split
    The output should be a small training dataset with 50k records. In the node allocation I see sufficient memory available. But after the first task has succeeded, the workflow hangs indefinitely at this step and Flyte doesn’t launch the next executions defined in the workflow.
    The `task_resource_defaults` config is as follows:
    task_resource_defaults.yaml: |
        task_resources:
          defaults:
            cpu: 1
            memory: 3000Mi
            storage: 200Mi
          limits:
            cpu: 5
            gpu: 1
            memory: 8Gi
            storage: 500Mi
    I have one task that returns dataclass instances, and another task that should take these instances as input params:
    @workflow
    def main_flow() -> Forecast:
        """
        Main Flyte WorkFlow consisting of three tasks:
            -  @preproc_and_split
            -  @train_xgboost_clf
            -  @get_predictions
        """
        logger.info(log="#START -- START Raw Preprocessing and Splitting", timestamp=None)
        train_cls, target_cls = preproc_and_split()
    
        logger.info(log="#START -- START Initialize Boosting Params", timestamp=None)
        saved_mpath = train_xgboost_clf(
            feat_cls=train_cls,
            target_cls=target_cls,
            xgb_params=xgb_params,
            cust_metric=BoostingCustMetric,
        )
    where `preproc_and_split` is declared as
    def preproc_and_split() -> Tuple[Fraud_Raw_PostProc_Data_Class, Fraud_Raw_Target_Data_Class]:
    Any advice on why I'm seeing this behaviour?
  • h

    Hampus Rosvall

    09/28/2022, 2:09 PM
    Hey, I am trying to understand how to work with launch plans. Let’s say I have this workflow and I package and deploy it. When I look in the UI, the launch plan is there, but when I inspect the workflow, I can’t select the launch plan. Should I package and deploy the launch plan individually, or what is the preferred way? Code is in the comments.
  • a

    Augie Palacios

    09/28/2022, 7:15 PM
    has anyone put the flyte services behind their own api gateway before? having issues rerouting the js assets for the UI. I am able to update the url for `/console` to `/my/url/console`, but the UI still looks for the js assets in `/console/assets/*`. This is just the beginning though; we will eventually need to prepend a path to `/api/v1/projects` as well so it doesn't interfere with our endpoints
  • f

    Frank Shen

    09/28/2022, 7:24 PM
    Hi, I have a technical question. Some background: I learned that in order to chain two tasks as dependent tasks, one task’s output must be the input of the other. I have a task that depends on another task (say, one that removes data in a folder) that doesn’t produce any output. How do I chain them?
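
One workaround that follows directly from the rule described in the question is to give the cleanup step a small return value and pass it to the dependent step purely as an ordering token. This is a minimal plain-Python sketch: `remove_data`, `process`, `pipeline`, and the `cleaned` parameter are hypothetical names, and in Flyte the first two functions would additionally carry the `@task` decorator. As far as I know, flytekit also documents explicit chaining of entities with the `>>` operator; that is not shown here.

```python
import shutil
import tempfile
from pathlib import Path


def remove_data(folder: str) -> bool:
    """Delete everything inside `folder` and return a token used only for ordering."""
    for child in Path(folder).iterdir():
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()
    return True


def process(folder: str, cleaned: bool) -> int:
    """Depends on `remove_data` only through `cleaned`; returns the remaining file count."""
    if not cleaned:
        raise ValueError("folder was not cleaned first")
    return len(list(Path(folder).iterdir()))


def pipeline(folder: str) -> int:
    # Passing the token creates the data dependency between the two steps.
    cleaned = remove_data(folder=folder)
    return process(folder=folder, cleaned=cleaned)


def demo() -> int:
    """Run the pipeline on a throwaway directory that contains one stale file."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "stale.txt").write_text("old data")
        return pipeline(tmp)
```

The token carries no information of its own; it only exists so that the second step cannot start before the first one has produced it.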
  • d

    Dylan Wilder

    09/28/2022, 9:12 PM
    is there a way to query flyte admin for all workflows in a project, domain, and version? i see there's a filters construct, but can't tell which filters are available
  • k

    Kim Junil

    09/29/2022, 1:36 AM
    Hi! Can I set the default value of a workflow parameter dynamically? For example, for a date-type parameter, set it to date.now(). I want the date parameter's default value to always be today when I run the workflow from the Flyte console UI.
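
A sketch of one common Python pattern for this, not a confirmed Flyte feature: a default written in a function signature is evaluated once, when the function is defined, so "today" is usually resolved at run time from a `None` sentinel instead. `resolve_run_date` is a hypothetical helper, and whether the console lets an optional input be left empty depends on the flytekit and Flyte versions in use, which is an assumption here.

```python
from datetime import date
from typing import Optional


def resolve_run_date(run_date: Optional[date] = None) -> date:
    """Return the given date, or today's date when none was supplied."""
    # The lookup happens on every call, not once at definition time.
    return run_date if run_date is not None else date.today()
```

In a workflow, the first task could call a helper like this and pass the resolved date on to the downstream tasks.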
  • c

    Chandramoulee K V

    09/29/2022, 6:42 AM
    Hi, I was working on registering a task so that I can execute it whenever needed. I was able to register and run the task from the Flyte console. When I tried to create an execution for the same task from the command line, I was able to retrieve the task using the command:
    flytectl get task --project flytesnacks --domain development {taskname} --latest --execFile exec_spec.yaml
    This modified `exec_spec.yaml`, which now looks like this:
    iamRoleARN: ""
    inputs: {}
    kubeServiceAcct: ""
    targetDomain: ""
    targetProject: ""
    task: {taskname}
    version: "1.0"
    I then tried to create an execution with the following command:
    flytectl create execution --execFile exec_spec.yaml -p flytesnacks -d development --targetProject flytesnacks
    and it throws this error:
    Error: rpc error: code = Internal desc = failed to create workflow in propeller json: error calling MarshalJSON for type *v1alpha1.Inputs: Marshal called with nil
  • k

    KS Tarun

    09/29/2022, 6:44 AM
    Hi, I've set up a Flyte sandbox cluster on an EC2 instance and created a custom Docker image with the required dependencies installed in it. But when I try to run any script, I get a `ModuleNotFoundError`, even though that module was installed when building the Docker image. The error goes away when I `pip install` that module on the instance. So, please clarify: when running a workflow, does Flyte Propeller look for the dependencies in the specified Docker container, or does it try to use the dependencies from the server it is running on?
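
To narrow down which environment is actually resolving the import, one option is to run a small check in both places (on the instance and inside the task) and compare the results. This is a standard-library sketch; `describe_module` is a hypothetical helper name and it expects a top-level module name.

```python
import importlib.util
import sys
from typing import Dict, Optional, Union


def describe_module(name: str) -> Dict[str, Union[str, bool, None]]:
    """Report whether a top-level module is importable by the current interpreter."""
    spec = importlib.util.find_spec(name)
    location: Optional[str] = getattr(spec, "origin", None)
    return {
        "module": name,
        "found": spec is not None,
        "location": location,           # file the module would be loaded from
        "interpreter": sys.executable,  # which Python is doing the import
    }
```

If the interpreter path reported when the error occurs is one on the instance rather than one inside the image, the code is not running in the container that was built.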
  • s

    Sanjay Chouhan

    09/29/2022, 8:29 AM
    When registering a workflow with the pyflyte register command, the workflow name becomes script_name.workflow_name in the console. Is there any way to omit the script file name? I want the workflow name to be just workflow_name. For example, for this https://docs.flyte.org/en/latest/getting_started/index.html#create-a-workflow the workflow name will be example.wf instead of wf.
  • h

    Hampus Rosvall

    09/29/2022, 3:19 PM
    Let’s say we have a workflow that should be associated with N different inputs, entailing N different launch plans. How would you structure/manage this in code? We have adopted the idea of managing the launch plans in a separate folder, next to the Flyte workflows, and thus deploying the launch plans and workflows separately (see the example below). Does this make sense for the N:1 relationship between launch plans and workflows? This entails a lot of duplicated boilerplate code in `lp.py`, but maybe it is the cleanest solution to this problem?
    (.venv) ☁  flyte [main] tree .
    .
    ├── in_container.mk
    ├── launchplans
    │   └── lp.py
    └── workflows
        ├── wf.py
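
One way to cut the duplication in `lp.py` is to keep the N input sets in a single mapping and generate the launch plan arguments in a loop. This is a hedged sketch: the plan names, the input keys, and the `launch_plan_specs` helper are hypothetical, and the closing comment assumes flytekit's `LaunchPlan.get_or_create` accepts `workflow`, `name`, and `default_inputs` keyword arguments.

```python
from typing import Any, Dict, List

# Hypothetical input sets, one entry per launch plan.
INPUTS_BY_PLAN: Dict[str, Dict[str, Any]] = {
    "daily_eu": {"region": "eu", "lookback_days": 1},
    "weekly_us": {"region": "us", "lookback_days": 7},
}


def launch_plan_specs(
    workflow_name: str, inputs_by_plan: Dict[str, Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Build one keyword-argument dict per launch plan, in a stable order."""
    return [
        {"name": f"{workflow_name}.{plan}", "default_inputs": dict(inputs)}
        for plan, inputs in sorted(inputs_by_plan.items())
    ]


# Assumption: in lp.py each spec could then be handed to flytekit, e.g.
#   for spec in launch_plan_specs("wf", INPUTS_BY_PLAN):
#       LaunchPlan.get_or_create(workflow=wf, **spec)
```

Adding another launch plan then means adding one entry to the mapping rather than another block of boilerplate.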
  • j

    James Evers

    09/29/2022, 7:33 PM
    hi everybody, i'm having some trouble running `ContainerTask`. Both a small test that I wrote myself and the cookbook example here fail when run locally. The cookbook example runs successfully when it's run remotely. Any help here is appreciated!
  • k

    karthikraj

    09/30/2022, 2:52 AM
    Hi, I have a custom plugin named MyCustomTask written in a Python file (say mycustomtask.py). How can I include it in the Flyte environment so that anyone who uses this plugin can import and use it like the other default plugins? For example, this is how we import the Snowflake task: `from flytekitplugins.snowflake import SnowflakeConfig, SnowflakeTask`. I need to do the same thing with my custom plugin: `from flytekitplugins.mycustomtask import MyCustomTask` or similar.
  • a

    Arshak Ulubabyan

    09/30/2022, 8:34 AM
    Hi, I have a question: Is there a way, from within a task's code at execution time, to find out which domain and project it is running under? E.g. I have Development/Staging/Production domains, and when I run the workflow, I want the task to load the configs for the right environment.
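
A hedged sketch of the config-selection half of this. It assumes the domain name is available to the task, either passed in explicitly or read from an environment variable; the variable name `FLYTE_INTERNAL_EXECUTION_DOMAIN`, the `CONFIGS` table, and the host names are assumptions for illustration. My understanding is that flytekit also exposes the same information through `flytekit.current_context().execution_id.domain` and `.project`, but that is not exercised here.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical per-domain settings.
CONFIGS: Dict[str, Dict[str, str]] = {
    "development": {"db_host": "db.dev.internal"},
    "staging": {"db_host": "db.staging.internal"},
    "production": {"db_host": "db.prod.internal"},
}


def config_for_domain(
    domain: Optional[str] = None, env: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Pick the settings for a domain, falling back to an environment variable."""
    # Assumption: the task container exports the execution domain under this name.
    resolved = (domain or env.get("FLYTE_INTERNAL_EXECUTION_DOMAIN", "development")).lower()
    if resolved not in CONFIGS:
        raise KeyError(f"no config defined for domain {resolved!r}")
    return CONFIGS[resolved]
```

Keeping the lookup in one function means the tasks themselves never need to branch on the environment.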
  • s

    Sanjay Chouhan

    09/30/2022, 11:27 AM
    How do I use multiple images with a config file in the `pyflyte register` command? I am running:
    pyflyte --config config.yaml register test1.py --version 1.0.1
    I wrote the code as described in the docs: https://docs.flyte.org/projects/cookbook/en/latest/auto/core/containerization/multi_images.html# The config.yaml is:
    admin:
      # For GRPC endpoints you might want to use dns:///flyte.myexample.com
      endpoint: dns:///##############-16##26454.us-west-1.elb.amazonaws.com:80
      authType: Pkce
      insecure: true
    logger:
      show-source: true
      level: 0
    images:
      trainer: ghcr.io/flyteorg/flytecookbook:core-latest
      predictor: moulee31/sample:1.0
    The error is,
    Traceback (most recent call last):
      File "/home/sanjaychouhan/.local/bin/pyflyte", line 8, in <module>
        sys.exit(main())
      File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
        return self.main(*args, **kwargs)
      File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
        rv = self.invoke(ctx)
      File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
        return callback(*args, **kwargs)
      File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/clis/sdk_in_container/register.py", line 174, in register
        registerable_entities = load_packages_and_modules(
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/tools/repo.py", line 225, in load_packages_and_modules
        registrable_entities = serialize(pkgs_and_modules, ss, str(project_root), options)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/tools/repo.py", line 54, in serialize
        registrable_entities = get_registrable_entities(ctx, options=options)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/tools/serialize_helpers.py", line 75, in get_registrable_entities
        get_serializable(new_api_serializable_entities, ctx.serialization_settings, entity, options=options)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/tools/translator.py", line 578, in get_serializable
        cp_entity = get_serializable_task(entity_mapping, settings, entity)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/tools/translator.py", line 173, in get_serializable_task
        container = entity.get_container(settings)
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/core/python_auto_container.py", line 164, in get_container
        image=get_registerable_container_image(self.container_image, settings.image_config),
      File "/home/sanjaychouhan/.local/lib/python3.8/site-packages/flytekit/core/python_auto_container.py", line 235, in get_registerable_container_image
        raise AssertionError(f"Image Config with name {name} not found in the configuration")
    AssertionError: Image Config with name trainer not found in the configuration
    It works with the pyflyte run command:
    pyflyte --config config.yaml run --remote test1.py test_workflow
  • s

    Sanjay Chouhan

    09/30/2022, 1:45 PM
    With Flyte on an EKS cluster, can we specify which type of EC2 instance to use for a task, instead of just CPU and RAM?
  • j

    James Evers

    09/30/2022, 3:09 PM
    a few general questions:
    1. with many of the machine-learning tutorials, i notice that the whole workflow is typically wrapped into one task. it seems natural to me that this approach doesn't take advantage of a lot of flyte's features (caching, being able to share tasks across workflows, etc). is there a reason that most of the ML examples are written this way?
    2. i often notice that after i tear down a sandbox cluster, running `docker system prune -a --volumes` frees up a sometimes surprising amount of space (i ran the mnist tutorial a few times and pruning the volumes revealed that it had used ~14GB on my machine). is there a way to reduce/mitigate this or is it just a necessary result of saving workflow executions?
  • n

    Nicholas LoFaso

    09/30/2022, 4:14 PM
    Hi we’re doing some performance testing and when we start a large number of tasks at once it seems that FlytePropeller loses track of some of the running pods. For example a pod will be successful, but FlytePropeller logs the following. Full log in thread
    Failed to find the Resource with name: dpp-default/g20210730154015-yjww-n0-0-dn4-0-dn108-0. Error: pods \"g20210730154015-yjww-n0-0-dn4-0-dn108-0\" not found
    Flyte restarts the task and it succeeds on the 2nd or 3rd try, but this is obviously wasted work. I’m curious if this is FlytePropeller needing more CPU/Memory to accommodate or if we are overwhelming the k8s metadata server. Any thoughts would be appreciated
  • h

    Hank Fanchiu

    09/30/2022, 5:59 PM
    how might i enforce that every task execution always first runs some arbitrary setup script?
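
One generic way to do this in Python is a decorator that runs the setup step before the wrapped function. This is a sketch under assumptions: `run_setup`, `with_setup`, and `SETUP_LOG` are hypothetical names, the setup here only records that it ran (a real one might call a script via `subprocess.run`), and in flytekit I would expect such a decorator to sit beneath `@task`, with `functools.wraps` preserving the signature.

```python
from functools import wraps
from typing import Any, Callable, List

SETUP_LOG: List[str] = []  # stands in for the side effects of a real setup script


def run_setup() -> None:
    """Placeholder for the arbitrary setup step."""
    SETUP_LOG.append("setup")


def with_setup(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap `fn` so that `run_setup` always executes first."""

    @wraps(fn)  # keeps the wrapped function's name and annotations visible
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        run_setup()
        return fn(*args, **kwargs)

    return wrapper


@with_setup
def add_one(x: int) -> int:
    return x + 1
```

This only enforces the setup for functions that carry the decorator, so it is a convention rather than a platform-level guarantee.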
  • h

    Hampus Rosvall

    10/01/2022, 8:07 AM
    Hey, playing around with `pyflyte register` and how to adopt it in our setup. When I package code using `pyflyte package`, I usually provide the dot-delimited Python packages to operate on, i.e.,
    pyflyte --pkgs flyte.workflows package --image $REGISTRY/$REPO:$TAG -o $(PACKAGE_OUTPUT_DIR)/package.tgz
    When I run `pyflyte register`, I would like to do the same in order to fast-register the same workflow, i.e.,
    pyflyte register flyte.workflows \
        --version=$VERSION \
        --image $REGISTRY/$REPO:$TAG \
        --project $PROJECT \
        --domain $DOMAIN
    However running that command yields
    Usage: pyflyte register [OPTIONS] [PACKAGE_OR_MODULE]...
    Try 'pyflyte register --help' for help.
    
    Error: Invalid value for '[PACKAGE_OR_MODULE]...': Path 'flyte.workflows' does not exist.
    gmake: *** [Makefile:42: flyte-fast-register] Error 2
    Do I need to provide the path to the workflow? If I pass `flyte/workflows/wf.py` to `pyflyte register`, my workflow is registered under `wf.wf` instead of `flyte.workflows.wf.wf` as with `pyflyte package`.
    (.venv) tree .
    .
    ├── Dockerfile
    ├── Makefile
    ├── flyte
    │   ├── __init__.py
    │   ├── in_container.mk
    │   ├── launchplans
    │   │   └── lp.py
    │   └── workflows
    │       ├── __init__.py
    │       └── wf.py
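
The two names in the question differ only in which directory the module path is measured from. The sketch below illustrates that relationship with the paths from the tree above; it is not flytekit's actual resolution logic, and `dotted_module_name` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def dotted_module_name(file_path: str, root: str = "") -> str:
    """Turn a .py file path into a dotted module name relative to `root`."""
    parts = PurePosixPath(file_path).with_suffix("").parts
    root_parts = PurePosixPath(root).parts
    if parts[: len(root_parts)] != root_parts:
        raise ValueError(f"{file_path!r} is not under {root!r}")
    return ".".join(parts[len(root_parts):])
```

Measured from the repository root the module is `flyte.workflows.wf`, while measured from `flyte/workflows` it is just `wf`, which matches the `flyte.workflows.wf.wf` versus `wf.wf` difference once the workflow function name is appended.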
  • t

    Taeef Najib

    10/01/2022, 11:04 AM
    Hi team, I’m aware that Flytekit automatically converts Python type hints into Flyte types.
    # Creating the Multiple Linear Regression model
    def build_model(model, X_train: pd.DataFrame, y_train: pd.DataFrame):
        # Use the following values: model = LinearRegression(), X_train = X_train, y_train =    y_train
        reg = model
        reg.fit(X_train, y_train)
        return reg
    I’m using `int`, `float`, `pd.DataFrame`, `np.ndarray`, etc. as my type hints. My question is: what type hint do I use for the argument `model`, and since the function returns the model, what type hint should I use for `reg`?
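
One option is to annotate both the argument and the return value with the concrete estimator class. The sketch below uses a tiny stand-in class, `MeanRegressor` (hypothetical), so that it runs without scikit-learn; with scikit-learn the annotation would be `LinearRegression` instead. How flytekit serializes such an object between tasks is an assumption here: my understanding is that it falls back to pickling for classes it has no dedicated transformer for.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MeanRegressor:
    """Stand-in estimator: always predicts the mean of the training targets."""

    mean_: float = 0.0

    def fit(self, X: List[List[float]], y: List[float]) -> "MeanRegressor":
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X: List[List[float]]) -> List[float]:
        return [self.mean_ for _ in X]


def build_model(
    model: MeanRegressor, X_train: List[List[float]], y_train: List[float]
) -> MeanRegressor:
    """Same shape as the function above, with the model type spelled out."""
    # Assumption: as a Flyte task, the returned object would be pickled between tasks.
    model.fit(X_train, y_train)
    return model
```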
  • h

    Hridya Agrawal

    10/01/2022, 6:41 PM
    Regarding Hacktoberfest, do you get the rewards/prizes for the other numbers too?
  • h

    Hridya Agrawal

    10/01/2022, 6:41 PM
    Like if I did 4 PRs and they get merged
  • h

    Hridya Agrawal

    10/01/2022, 6:42 PM
    so do I get a North Face hoodie, mug, and T-shirt, or just the hoodie?
  • s

    Sanjiv Anand

    10/02/2022, 3:45 AM
    Hey, I want to contribute blog posts to Flyte. Can someone help me get started?
  • y

    Yash Panchwatkar

    10/02/2022, 10:35 AM
    hello guys, I guess I am stuck in the installation process, please help
  • y

    Yash Panchwatkar

    10/02/2022, 10:35 AM
    message has been deleted
  • y

    Yash Panchwatkar

    10/02/2022, 10:35 AM
    this is the msg i am getting here
  • y

    Yash Panchwatkar

    10/02/2022, 10:47 AM
    i have got this error while running flytectl demo start