I’m curious about the API choice to have `ShellTask` be a... Flyte #flytekit

brief-nest-91031

12/13/2023, 5:49 AM

I’m curious about the API choice to have `ShellTask` be a class that users instantiate, while Python tasks are functions. Python tasks are presented to users as Python functions; shell tasks are presented as a data type to be instantiated. Is there a reason why these diverge?

freezing-airport-6809

12/13/2023, 6:21 AM

User functions are still converted to classes.

freezing-airport-6809

12/13/2023, 6:21 AM

The functions are there to represent user code written in Python.

freezing-airport-6809

12/13/2023, 6:21 AM

I would have loved to make the shell task a function, but did not know how to.

freezing-airport-6809

12/13/2023, 6:21 AM

do you have suggestions?

freezing-airport-6809

12/13/2023, 6:22 AM

There are many cases where we use classes, though: wherever there is no Python function representation, tasks are classes. Examples are notebooks, SQL tasks, meta tasks like "deploy something", etc.
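To illustrate the point that decorated functions still end up as class instances, here is a minimal sketch. `FunctionTask` and `task` are hypothetical stand-ins written for this note, not flytekit's actual classes or decorator; the sketch only assumes that the decorator reads the function's type hints and wraps it in an object.

```python
from typing import Callable, get_type_hints


class FunctionTask:
    """Hypothetical stand-in for a task object that wraps a user's function."""

    def __init__(self, fn: Callable):
        self.fn = fn
        self.name = fn.__name__
        hints = get_type_hints(fn)
        # The "return" entry is the output type; everything else is an input.
        self.output_type = hints.pop("return", None)
        self.inputs = hints

    def __call__(self, *args, **kwargs):
        # Calling the task simply delegates to the wrapped user function.
        return self.fn(*args, **kwargs)


def task(fn: Callable) -> FunctionTask:
    """Hypothetical decorator: the decorated name is bound to a class instance."""
    return FunctionTask(fn)


@task
def add_one(x: int) -> int:
    return x + 1
```

After decoration, `add_one` is a `FunctionTask` instance rather than a plain function, which is why the functional syntax and the class-based syntax can coexist: the function form only works when there is Python code to wrap.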

brief-nest-91031

12/14/2023, 3:29 PM

Looking at this example from the docs:

from flytekit import kwtypes
from flytekit.extras.tasks.shell import OutputLocation, ShellTask
from flytekit.types.file import FlyteFile

ShellTask(
    name="task_1",
    debug=True,
    script="""
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """,
    inputs=kwtypes(x=FlyteFile),
    output_locs=[OutputLocation(var="i", var_type=FlyteFile, location="{inputs.x}")],
)

What comes to mind is something like this (or similar):

# Proposed API: `shell_task` is a hypothetical decorator.
@shell_task(
    debug=True,
)
def task_1(x: FlyteFile) -> FlyteFile:
    return """
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """

brief-nest-91031

12/14/2023, 3:30 PM

I’m a little fuzzy on my Flyte details, so I’m not 💯 sure I got the inputs and outputs correct. I was thinking something like that.

freezing-airport-6809

12/14/2023, 3:56 PM

It’s not really returning a Flyte file, right? It’s returning a string, so a Python linter will complain.

👆 1

broad-monitor-993

12/15/2023, 4:02 PM

The above `@shell_task` example, I think, highlights why it’s a little confusing to have functions that return templated strings… the function body of `task_1` is returning a string, not a `FlyteFile`… I’ve seen other frameworks do this and always found it odd, personally.
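The mismatch being described can be reproduced without flytekit. In this minimal sketch, `FlyteFile` is a local stand-in class (an assumption made so the snippet runs on its own), and the function is left undecorated so that only the annotation-versus-body conflict is visible.

```python
from typing import get_type_hints


class FlyteFile:
    """Stand-in for flytekit's FlyteFile so the sketch runs without flytekit."""

    def __init__(self, path: str):
        self.path = path


def task_1(x: FlyteFile) -> FlyteFile:
    # The annotation promises a FlyteFile, but the body returns the script
    # template as a str. Python does not enforce this at runtime; a static
    # type checker such as mypy reports an incompatible return value here.
    return 'echo "Showcasing Flyte\'s Shell Task." >> {inputs.x}'


declared = get_type_hints(task_1)["return"]  # the annotated return type
actual = type(task_1(FlyteFile("/tmp/x")))   # what the body really returns
```

Here `declared` is `FlyteFile` while `actual` is `str`, so any decorator built this way would need type checkers to be told to ignore or reinterpret the return annotation.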

broad-monitor-993

12/15/2023, 4:05 PM

Another way of doing this that’s perhaps a little less confusing, but still sort of syntax-sugary, is defining the template string in the function docstring and then using `Annotated` to add output location metadata:

# Proposed API: `shell_task` is a hypothetical decorator.
@shell_task(
    debug=True,
)
def task_1(x: FlyteFile) -> typing.Annotated[
    FlyteFile,
    OutputLocation(var="i", location="{inputs.x}")
]:
    """
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """

broad-monitor-993

12/15/2023, 4:11 PM

In this case, the functional syntax does lend itself to defining the IO types quite nicely, and the template-as-docstring with an empty function body indicates “magic is going on here”
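For readers wondering how such a decorator might collect its pieces, here is a rough sketch of the template-as-docstring idea. Everything in it is an assumption for illustration: `shell_task` and `ShellTaskSpec` are hypothetical, and `FlyteFile` and `OutputLocation` are local stand-ins rather than the flytekit classes. It only shows that the docstring, the parameter types, and the `Annotated` metadata are all recoverable from the function object.

```python
import inspect
from dataclasses import dataclass
from typing import Annotated, Callable, get_args, get_origin, get_type_hints


class FlyteFile:
    """Stand-in for flytekit's FlyteFile so the sketch runs without flytekit."""


@dataclass
class OutputLocation:
    """Stand-in carrying only the fields used in the thread's example."""
    var: str
    location: str


@dataclass
class ShellTaskSpec:
    """Hypothetical container for what the decorator extracts."""
    name: str
    script: str
    inputs: dict
    output_locs: list
    debug: bool = False


def shell_task(debug: bool = False) -> Callable[[Callable], ShellTaskSpec]:
    """Hypothetical decorator factory; not an existing flytekit API."""

    def wrap(fn: Callable) -> ShellTaskSpec:
        # include_extras=True keeps the Annotated[...] metadata intact.
        hints = get_type_hints(fn, include_extras=True)
        ret = hints.pop("return", None)
        output_locs = []
        if get_origin(ret) is Annotated:
            # get_args yields (base_type, *metadata); keep OutputLocation items.
            output_locs = [m for m in get_args(ret)[1:] if isinstance(m, OutputLocation)]
        return ShellTaskSpec(
            name=fn.__name__,
            script=inspect.cleandoc(fn.__doc__ or ""),  # docstring is the script
            inputs=hints,
            output_locs=output_locs,
            debug=debug,
        )

    return wrap


@shell_task(debug=True)
def task_1(x: FlyteFile) -> Annotated[FlyteFile, OutputLocation(var="i", location="{inputs.x}")]:
    """
    set -ex
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    """
```

One caveat with this design: running Python with `-OO` strips docstrings, so `fn.__doc__` would be `None` and the script would silently become empty.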

freezing-airport-6809

12/15/2023, 4:11 PM

But I do not like this either, since you can’t write docs then?

broad-monitor-993

12/15/2023, 4:12 PM

but… you currently can’t write docs for `ShellTask`, right?

freezing-airport-6809

12/15/2023, 4:12 PM

That’s right

freezing-airport-6809

12/15/2023, 4:12 PM

We should expose a docs field

broad-monitor-993

12/15/2023, 4:17 PM

This makes it more natural to express IO types of a shell task as a function, but we’d need to make sure type-linters somehow handle the mismatch between returning a `str` or `None`, and the actual function return type. Not really sure how to assess this trade-off… feels like this is a “taste” thing. @flaky-parrot-42438 any thoughts on this? ^^

flaky-parrot-42438

12/15/2023, 5:12 PM

Throwing an idea out there:

@shell_task(debug=True)
def task_1(
    x: FlyteFile,
    y: FlyteDirectory,
) -> Annotated[FlyteFile, OutputLocation(var="j", location="{inputs.y}.tar.gz")]:
    shell_context = current_context().shell_context
    shell_context.set_script(""" 
    set -ex
    cp {inputs.x} {inputs.y}
    tar -zcvf {outputs.j} {inputs.y}
    """)
    return shell_context.output

🤔 1
