I’m curious about the API choice to have `ShellTas...
# flytekit
j
I’m curious about the API choice to have `ShellTask` be a class that users instantiate while Python tasks are functions. Python tasks are presented to users as Python functions. Shell tasks are presented as a data type to be instantiated. Is there a reason why these diverge?
k
User functions are still converted to classes.
The functions are there to represent user code written in Python.
I would have loved to make the shell task a function, but did not know how to.
Do you have suggestions?
There are many cases where we use classes, though: where there is no Python function representation, the tasks are classes. Examples are notebooks, SQL tasks, meta tasks like deploying something, etc.
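To illustrate the point that decorated user functions still end up as class instances, here is a minimal sketch. `FunctionTaskSketch` and `task_sketch` are hypothetical names made up for this example; they are not flytekit's actual classes or decorator, just a stand-in showing the general pattern.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, get_type_hints


@dataclass
class FunctionTaskSketch:
    """Hypothetical stand-in for a task object; not flytekit's real class."""

    name: str
    fn: Callable[..., Any]
    interface: Dict[str, Any]

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Local execution simply runs the wrapped user code.
        return self.fn(*args, **kwargs)


def task_sketch(fn: Callable[..., Any]) -> FunctionTaskSketch:
    """Hypothetical decorator: wraps a plain function in a task object."""
    return FunctionTaskSketch(
        name=fn.__name__,
        fn=fn,
        # The typed interface is read from the function's annotations.
        interface=get_type_hints(fn),
    )


@task_sketch
def add_one(x: int) -> int:
    return x + 1
```

After decoration, `add_one` is an instance of the class rather than a bare function, yet it can still be called like one.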
j
Looking at this example from the docs:
ShellTask(
    name="task_1",
    debug=True,
    script="""
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """,
    inputs=kwtypes(x=FlyteFile),
    output_locs=[OutputLocation(var="i", var_type=FlyteFile, location="{inputs.x}")],
)
What comes to mind is something like this (or similar):
@shell_task(
    debug=True,
)
def task_1(x: FlyteFile) -> FlyteFile:
    return """
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """
I’m a little fuzzy on my Flyte details, so I’m not 💯 sure I got the inputs and outputs correct. I was thinking something like that.
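A rough sketch of how such a decorator might be wired up, assuming the decorated function returns the script template as a string. Everything here is hypothetical: `shell_task`, `ShellTaskSketch`, and `FileSketch` are stand-ins invented so the example runs without flytekit, and none of them are existing flytekit APIs.

```python
import inspect
import textwrap
from dataclasses import dataclass
from typing import Any, Callable, Dict


class FileSketch:
    """Stand-in for FlyteFile so the example runs without flytekit."""


@dataclass
class ShellTaskSketch:
    """Hypothetical container for what the decorator collects."""

    name: str
    debug: bool
    script: str
    inputs: Dict[str, Any]
    output_type: Any


def shell_task(debug: bool = False) -> Callable[[Callable[..., str]], ShellTaskSketch]:
    def wrap(fn: Callable[..., str]) -> ShellTaskSketch:
        sig = inspect.signature(fn)
        # Input names and types come from the function's parameters.
        inputs = {name: p.annotation for name, p in sig.parameters.items()}
        # Call the function once with placeholder arguments, purely to
        # obtain the script template it returns.
        script = fn(**{name: None for name in inputs})
        return ShellTaskSketch(
            name=fn.__name__,
            debug=debug,
            script=textwrap.dedent(script).strip(),
            inputs=inputs,
            output_type=sig.return_annotation,
        )

    return wrap


@shell_task(debug=True)
def task_1(x: FileSketch) -> FileSketch:
    return """
    set -ex
    echo "Showcasing a shell task." >> {inputs.x}
    """
```

Note that the body returns a `str` while the annotation declares a file type, so a type checker would be expected to flag this function.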
k
It’s not really returning a `FlyteFile`, right? It’s returning a string, so a Python linter will complain.
n
The above `@shell_task` example I think highlights why it’s a little confusing to have functions that return templated strings… the function body of `task_1` is returning a string, not a `FlyteFile`… I’ve seen other frameworks do this and always found it odd, personally.
Another way of doing this that’s perhaps a little less confusing, but still sort of syntax-sugary, is defining the template string in the function docstring and then using `Annotated` to add output location metadata:
@shell_task(
    debug=True,
)
def task_1(x: FlyteFile) -> typing.Annotated[
    FlyteFile,
    OutputLocation(var="i", location="{inputs.x}")
]:
    """
    set -ex
    echo "Hey there! Let's run some bash scripts using Flyte's ShellTask."
    echo "Showcasing Flyte's Shell Task." >> {inputs.x}
    if grep "Flyte" {inputs.x}
    then
        echo "Found it!" >> {inputs.x}
    else
        echo "Not found!"
    fi
    """
In this case, the functional syntax does lend itself to defining the IO types quite nicely, and the template-as-docstring with an empty function body indicates “magic is going on here”.
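For what it’s worth, the docstring-plus-`Annotated` variant can be sketched with standard-library introspection only: `inspect.getdoc` for the template and `typing.get_type_hints(..., include_extras=True)` for the metadata. The names `shell_task`, `ShellTaskSketch`, `OutputLocationSketch`, and `FileSketch` are hypothetical stand-ins for this example and are not flytekit APIs.

```python
import inspect
from dataclasses import dataclass
from typing import Annotated, Any, Callable, Dict, List, get_args, get_origin, get_type_hints


class FileSketch:
    """Stand-in for FlyteFile so the example runs without flytekit."""


@dataclass
class OutputLocationSketch:
    """Stand-in for OutputLocation; holds only what the example needs."""

    var: str
    location: str


@dataclass
class ShellTaskSketch:
    name: str
    debug: bool
    script: str
    inputs: Dict[str, Any]
    output_type: Any
    output_locs: List[OutputLocationSketch]


def shell_task(debug: bool = False) -> Callable[[Callable[..., Any]], ShellTaskSketch]:
    def wrap(fn: Callable[..., Any]) -> ShellTaskSketch:
        # include_extras=True keeps the Annotated[...] metadata intact.
        hints = get_type_hints(fn, include_extras=True)
        ret = hints.pop("return", None)
        output_type, output_locs = ret, []
        if get_origin(ret) is Annotated:
            output_type, *metadata = get_args(ret)
            output_locs = [m for m in metadata if isinstance(m, OutputLocationSketch)]
        # The docstring is treated as the script template; getdoc dedents it.
        script = inspect.getdoc(fn) or ""
        return ShellTaskSketch(fn.__name__, debug, script, hints, output_type, output_locs)

    return wrap


@shell_task(debug=True)
def task_1(x: FileSketch) -> Annotated[
    FileSketch,
    OutputLocationSketch(var="i", location="{inputs.x}"),
]:
    """
    set -ex
    echo "Showcasing a shell task." >> {inputs.x}
    """
```

The trade-off raised in the thread still applies: with this approach the docstring slot is used up by the script, so there is no obvious place left for documentation.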
k
But I do not like this either: how would you write docs then?
n
But… you currently can’t write docs for `ShellTask`, right?
k
That’s right
We should expose a docs field
n
This makes it more natural to express IO types of a shell task as a function, but we’d need to make sure type-linters somehow handle the mismatch between returning a `str` or `None`, and the actual function return type. Not really sure how to assess this trade-off… feel like this is a “taste” thing. @Thomas Fan any thoughts on this? ^^
t
Throwing an idea out there:
@shell_task(debug=True)
def task_1(
    x: FlyteFile,
    y: FlyteDirectory,
) -> Annotated[FlyteFile, OutputLocation(var="j", location="{inputs.y}.tar.gz")]:
    shell_context = current_context().shell_context
    shell_context.set_script("""
    set -ex
    cp {inputs.x} {inputs.y}
    tar -zcvf {outputs.j} {inputs.y}
    """)
    return shell_context.output
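One way this idea could work under the hood, as a rough sketch only: the decorator installs a context object, runs the body once with placeholder inputs, and reads back whatever script was registered. `current_shell_context`, `ShellContextSketch`, `ShellTaskSketch`, and this `shell_task` are all hypothetical names for illustration; flytekit has no `shell_context` today as far as this thread indicates.

```python
import contextvars
import inspect
import textwrap
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ShellContextSketch:
    """Hypothetical per-task context; not part of flytekit."""

    script: Optional[str] = None
    output: Any = None  # placeholder the task body returns

    def set_script(self, script: str) -> None:
        self.script = textwrap.dedent(script).strip()


_shell_context = contextvars.ContextVar("shell_context")


def current_shell_context() -> ShellContextSketch:
    return _shell_context.get()


@dataclass
class ShellTaskSketch:
    name: str
    debug: bool
    script: str


def shell_task(debug: bool = False) -> Callable[[Callable[..., Any]], ShellTaskSketch]:
    def wrap(fn: Callable[..., Any]) -> ShellTaskSketch:
        ctx = ShellContextSketch()
        token = _shell_context.set(ctx)
        try:
            # Run the body once with placeholder inputs so it can register its script.
            fn(**{name: None for name in inspect.signature(fn).parameters})
        finally:
            _shell_context.reset(token)
        if ctx.script is None:
            raise ValueError(f"{fn.__name__} never called set_script()")
        return ShellTaskSketch(name=fn.__name__, debug=debug, script=ctx.script)

    return wrap


@shell_task(debug=True)
def task_1(x, y):
    ctx = current_shell_context()
    ctx.set_script("""
    set -ex
    cp {inputs.x} {inputs.y}
    tar -zcvf {outputs.j} {inputs.y}
    """)
    return ctx.output
```

Because the body returns `ctx.output` rather than a string, the declared return type and the returned value can be made to agree, which is the part of the proposal aimed at the linter concern.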