How would I go about writing a map task using `flyte...
# flytekit-java
b
How would I go about writing a map task using `flytekit-java`?
f
Do you mean, is there support for it?
Or shall we design it?
b
Do we have support for it? But given your question, I assume no.
I assumed it'd be a feature of the platform, not the client.
f
It’s a feature of the platform
But you need to create the array node in the SDK and handle the array logic.
b
So without that feature, there is no parallelization of the compute step? Or do I get that with a regular map task? That said, I don't even know if that's available in the current Java SDK.
f
@bumpy-match-83743 if you write the following code, it will be parallelized automatically:
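from flytekit import task, workflow

@task
def foo(x: int):
    ...  # task body elided in the original message

@workflow
def wf():
    # Each call becomes its own node; the nodes share no data, so they can run in parallel.
    for i in range(10):
        foo(x=i)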
Map tasks are just a good construct to make the parallelism faster and more efficient in the backend.
It's like writing Python `map` or Java `Stream.map` on a collection.
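To make that analogy concrete, here is a minimal plain-Python sketch. It is not Flyte code: `foo` is a hypothetical stand-in for a task body, and the thread pool only illustrates the idea of applying one function to every element of a collection, first sequentially and then concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def foo(x: int) -> int:
    # Hypothetical stand-in for a task body.
    return x * x


inputs = list(range(10))

# Sequential: the built-in map applies foo to one element at a time.
sequential = list(map(foo, inputs))

# Concurrent: the executor fans the same calls out across worker threads
# and yields the results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(foo, inputs))
```

Both variants produce the same list; only the execution strategy differs, which is roughly the relationship between the `for` loop above and a map task.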
So when you write a map task / ArrayNode in Python flytekit, it generates the ArrayNode representation, and I think that is not yet supported in the Java SDK (or I might be wrong), so we will need to add that compilation logic. The reason for doing this is that we are now making ArrayNode work with all entities, for example mapping over other workflows, any task type, etc.
e
map_task is not supported in flytekit-java because of its design. flytekit-java uploads the JAR dependencies at registration time and downloads them inside the container at runtime, so if you use map_task with this design you will download the JARs many times. This is the main reason why flytekit-java doesn't support the map task feature 😞
f
That's OK to download it multiple times, right?
It may cause some load on S3/GCS,
but that's it.
b
It causes massive load in the k8s cluster, especially on disk.
Depending on the number of pods you're running per node, it might cause I/O problems.
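A back-of-the-envelope sketch of why this adds up. The function name and the numbers are hypothetical, and it assumes every pod fetches its own full copy of the JARs with no caching shared across pods on a node:

```python
def jar_download_bytes_per_node(pods_per_node: int, jar_bytes_per_pod: int) -> int:
    # Assumption: each pod downloads its own copy, so cost scales linearly with pod count.
    return pods_per_node * jar_bytes_per_pod


# Hypothetical numbers: 50 pods scheduled on one node, 200 MiB of JARs per pod.
total = jar_download_bytes_per_node(50, 200 * 1024**2)
print(f"{total / 1024**3:.1f} GiB written to disk on that node")
```

With a map task fanning out to hundreds of pods, the per-node figure grows with however many of them land on the same node.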