Anyone has any idea what I’m doing wrong here? Flyte #flyte-support

# flyte-support

sticky-angle-28419

11/23/2022, 2:16 PM

Anyone has any idea what I’m doing wrong here?

high-accountant-32689

11/23/2022, 10:57 PM

@sticky-angle-28419, the support for named outputs is a bit confusing. Essentially, you can think of them as the equivalent of kwargs, but for outputs in the context of Flyte. In other words, a named tuple can't be passed directly as an input to a downstream task, but its members can. An example will clarify:

import typing
from flytekit import task, workflow, dynamic

my_tuple = typing.NamedTuple("A", b=str, c=int)

@task
def t1() -> my_tuple:
    return my_tuple(b="hello world", c=42)

@task
def t3(b: str, c: int):
    print(f"{b} - {c}")

@workflow
def wf_valid():
    res = t1()
    t3(b=res.b, c=res.c)

Notice how we have to access the values separately in the invocation of the downstream t3. In other words, we couldn't have a NamedTuple as an input of t3. In your example, you're probably using the result of calling train_task as an input to another task, right? Can you share how that's happening in your case?

sticky-angle-28419

11/23/2022, 11:01 PM

No actually there’s only one task right now

sticky-angle-28419

11/23/2022, 11:02 PM

_wf_outputs=typing.NamedTuple("WfOutputs",train_task_o0=flytekit.types.file.file.FlyteFile)

@workflow
def mnist(_wf_args:Hyperparameters)->_wf_outputs:
    train_task_o0_=train_task(hp=_wf_args)
    return _wf_outputs(train_task_o0_)

high-accountant-32689

11/23/2022, 11:12 PM

got it, what if you substitute the definition of the workflow with:

@workflow
def mnist(_wf_args:Hyperparameters)->_wf_outputs:
    return train_task(hp=_wf_args)

high-accountant-32689

11/23/2022, 11:57 PM

@sticky-angle-28419 ^

sticky-angle-28419

11/24/2022, 12:05 AM

So are you suggesting workflows cannot output tuples?

sticky-angle-28419

11/24/2022, 12:05 AM

In the tutorial though, that’s exactly what doc is doing I think

high-accountant-32689

11/24/2022, 12:05 AM

no, I'm supposing that train_task already returns a named tuple

sticky-angle-28419

11/24/2022, 12:06 AM

Oh I see - so you’re saying I’m returning a tuple of tuple

sticky-angle-28419

11/24/2022, 12:06 AM

So the inner tuple is violating the validation

high-accountant-32689

11/24/2022, 12:07 AM

it's a little more involved than that. If you really want to do this you'll have to do something like:

_wf_outputs=typing.NamedTuple("WfOutputs",train_task_o0=flytekit.types.file.file.FlyteFile)

@workflow
def mnist(_wf_args:Hyperparameters)->_wf_outputs:
    x = train_task(hp=_wf_args)
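    # The keyword must match the field declared in _wf_outputs (train_task_o0);
    # x.hp stands in for whichever named output train_task actually returns.
    return _wf_outputs(train_task_o0=x.hp)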

high-accountant-32689

11/24/2022, 12:08 AM

essentially, named tuples are treated specially. Their only purpose is to give you a way to refer to returned objects by name

high-accountant-32689

11/24/2022, 12:08 AM

99% of the time you can use a dataclass to achieve what you want

high-accountant-32689

11/24/2022, 12:09 AM

(So instead of returning a named tuple from train_task you return a dataclass)
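The dataclass suggestion above could look roughly like the following sketch. It is plain Python so that it runs without a Flyte installation; in a real project the two functions would carry flytekit's @task decorator. TrainResult, its fields, train_task_sketch, and report_sketch are hypothetical names invented for illustration, and the values are placeholders, not results from the thread. Depending on the flytekit version, the dataclass may also need JSON serialization support (for example via the dataclasses_json package), so treat this as an assumption to check against the installed version.

```python
from dataclasses import dataclass


@dataclass
class TrainResult:
    """Hypothetical container for what a training task might return."""
    model_path: str
    accuracy: float


def train_task_sketch(epochs: int) -> TrainResult:
    # In Flyte this function would be decorated with @task.
    # The values below are placeholders, not real training output.
    return TrainResult(model_path=f"/tmp/model-{epochs}.pt", accuracy=0.0)


def report_sketch(result: TrainResult) -> str:
    # Unlike a named tuple, the dataclass is passed downstream as one object.
    return f"{result.model_path} ({result.accuracy:.2f})"


print(report_sketch(train_task_sketch(epochs=3)))
```

The design point is that the whole TrainResult object travels between functions as a single value, so there is no need to unpack individual fields at the call site.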

sticky-angle-28419

11/24/2022, 12:11 AM

Hmm sorry I’m a bit confused as to the cause of this error - am I allowed to return a named tuple as task output and workflow output?

sticky-angle-28419

11/24/2022, 12:11 AM

I get that you cannot receive tuples as task inputs

high-accountant-32689

11/24/2022, 12:12 AM

yeah, I'll admit, the support for named tuples is really confusing. The tl;dr is that you're allowed to return them, but you need to be careful about how you use them in downstream tasks (as named tuples are not allowed to be passed as inputs)

sticky-angle-28419

11/24/2022, 12:12 AM

Yes that was my understanding - but I only have one task here and it’s still giving me the error which is why I’m a bit confused here

high-accountant-32689

11/24/2022, 12:13 AM

so, going back to your code, did my suggestion of returning the result of calling train_task work?

sticky-angle-28419

11/24/2022, 12:14 AM

I got to run but let me try that and report back

sticky-angle-28419

11/24/2022, 12:14 AM

It sounds like that might do the trick but I’ll confirm and get back to you

👍 1

sticky-angle-28419

11/26/2022, 2:53 AM

Hey sorry it took a while - I was pulled into something else for a couple of days. I just deployed the workflow successfully - the problem was as you suspected - I was returning a named tuple inside a named tuple.
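The "named tuple inside a named tuple" shape can be reproduced in plain Python, without Flyte. The sketch below is only an analogy for what the workflow was doing: TaskOutputs, model_path, and train_task_sketch are hypothetical stand-ins, because the thread never shows what train_task actually returns.

```python
import typing

# Hypothetical stand-in for the task's named output; not from the thread.
TaskOutputs = typing.NamedTuple("TaskOutputs", [("model_path", str)])
# Mirrors the workflow output type from the thread, with str replacing FlyteFile.
WfOutputs = typing.NamedTuple("WfOutputs", [("train_task_o0", str)])


def train_task_sketch() -> TaskOutputs:
    return TaskOutputs(model_path="/tmp/model.pt")


result = train_task_sketch()

# Wrapping the whole result nests one named tuple inside the other.
nested = WfOutputs(result)
# Picking out the member keeps the output flat.
flat = WfOutputs(train_task_o0=result.model_path)

print(type(nested.train_task_o0).__name__)
print(type(flat.train_task_o0).__name__)
```

Python itself does not reject the nested version, since annotations are not enforced at runtime; in this analogy it is Flyte's own validation that would object to the mismatch.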

sticky-angle-28419

11/26/2022, 2:54 AM

Since I have you though - if you don’t mind me asking another question. The model I deployed is a simple mnist model (using Pytorch Lightning). I’m using pip-compile with these requirements:

sticky-angle-28419

11/26/2022, 2:54 AM

torch>=1.13.0
torchvision>=0.14.0
pytorch_lightning>=1.8.1
flytekit>=1.2.3
matplotlib>=3.6.2

sticky-angle-28419

11/26/2022, 2:56 AM

I’m uploading the image to GCR, and it shows an image size of 4GB (!!). This seems exceptionally large for something so simple. It also takes a very long time to build and deploy the workflow (11m+)

sticky-angle-28419

11/26/2022, 2:56 AM

Is this expected or do you think I’m doing something wrong here?

sticky-angle-28419

11/26/2022, 2:59 AM

Here’s my dockerfile and it’s mostly based on flyte docs (except for pip-compile part):

sticky-angle-28419

11/26/2022, 3:00 AM

FROM python:3.8-slim-buster

WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root

RUN apt-get update && apt-get install -y build-essential curl

# Install pip-tools
RUN pip3 install pip-tools

# Install the AWS cli separately to prevent issues with boto being written over
RUN pip3 install awscli

# Install flytectl
RUN curl -sL https://ctl.flyte.org/install | bash
ENV PATH="/root/bin:$PATH"

ENV VENV /opt/venv
# Virtual environment
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"

# Compile source dependencies (i.e. requirements.in) to requirements.txt and then use that to install Python dependencies
COPY ./requirements.in /root
RUN pip-compile --output-file=/root/requirements.txt /root/requirements.in
# --no-cache-dir to prevent OOMKilled
RUN pip install --no-cache-dir -r /root/requirements.txt

# Copy the actual code
COPY . /root

# Init flytectl to use the correct remote host
RUN flytectl config init --host=https://flyte.sidetrek.com

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag

high-accountant-32689

11/29/2022, 4:39 AM

@sticky-angle-28419, sorry for the delay. Can you run docker history <image_tag> to get a sense of which step is taking up space? I have a feeling that all the pip operations you're running while building the image are the culprit.
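One way to read the docker history output is to rank the layers by size. The sketch below assumes the output was produced with a tab-separated format such as docker history --no-trunc --format "{{.Size}}\t{{.CreatedBy}}" <image_tag>, and that sizes use Docker's human-readable decimal units (B, kB, MB, GB). The helper names and the sample text are made up for illustration; the sample is not output from the image discussed in this thread.

```python
import re

# Decimal multipliers for the unit suffixes assumed in the size column.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> float:
    """Convert a human-readable size such as '1.5GB' or '512kB' to bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit.upper()]


def largest_layers(history_output: str, top: int = 3):
    """Return the `top` largest (size_in_bytes, created_by) pairs."""
    layers = []
    for line in history_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        size, created_by = line.split("\t", 1)
        layers.append((parse_size(size), created_by))
    return sorted(layers, key=lambda layer: layer[0], reverse=True)[:top]


# Made-up sample in the assumed "<size>\t<command>" layout.
SAMPLE = (
    "0B\tENV FLYTE_INTERNAL_IMAGE=example\n"
    "1.5GB\tRUN pip install --no-cache-dir -r /root/requirements.txt\n"
    "512kB\tCOPY . /root\n"
)

for size, command in largest_layers(SAMPLE, top=2):
    print(f"{size:>15,.0f} bytes  {command}")
```

If the pip steps are indeed the culprit, they should appear at the top of this ranking.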
