Rupsha Chaudhuri

08/31/2022, 10:28 PM
Hi team.. for task caching to work, do the tasks need to return something? I’m running a workflow with 3 tasks that don’t return anything, have caching enabled but I don’t see anything being written to or read from the cache and the tasks are rerunning even when inputs are the same

Yee

08/31/2022, 10:33 PM
what are your inputs? what does the ui show? is there a check mark next to it?
next to the node in the execution on the console page
you are using the tasks purely for side-effects?

Rupsha Chaudhuri

08/31/2022, 10:45 PM
I’m using tasks because I need to run them in a specific sequence and they have dependencies
inputs for each task are a bunch of s3 urls
I don’t see the check mark

Yee

08/31/2022, 10:47 PM
can you say a bit more? what do you mean by specific sequence and dependencies?
and you can confirm that the tasks have caching enabled?
like cache=True in the Task JSON tab?

Rupsha Chaudhuri

08/31/2022, 10:49 PM
workflow has 3 python tasks. They all have (cache=True, cache_version="1.0"). I've provided the dependencies, which enforce the ordering
each task takes say a couple of s3 urls as input.. but does not return anything
I don’t see cache in the task details
even though I have this
@task(cache=True, cache_version="1.0")

Yee

08/31/2022, 10:55 PM
are you sure you’re on the version you expect?
how are you registering?

Rupsha Chaudhuri

08/31/2022, 10:56 PM
flytectl register files

Yee

08/31/2022, 10:56 PM
and how are you registering?
package?
package will produce a tgz file. do you think you can unzip it and look at one of the tasks?

Rupsha Chaudhuri

08/31/2022, 10:58 PM
yes.. package
I see the .pb files…

Yee

08/31/2022, 11:31 PM
can you pick a task pb file
and then run this command
flyte-cli parse-proto -f /path/to/task.pb -p flyteidl.admin.tasks_pb2.TaskSpec
that should dump out the task json
and if that is not working…

Rupsha Chaudhuri

08/31/2022, 11:35 PM
Trying to resolve this error
ImportError: cannot import name '_message' from 'google.protobuf.pyext'

Yee

08/31/2022, 11:36 PM
what version of python are you on?

Rupsha Chaudhuri

08/31/2022, 11:36 PM
3.8.9

Yee

08/31/2022, 11:36 PM
that looks like the pip protobuf dependency is missing

Rupsha Chaudhuri

08/31/2022, 11:37 PM
after installing protobuf 3.14 I’m getting this error
'NoneType' object has no attribute 'message_types_by_name'

Yee

08/31/2022, 11:37 PM
this is what i have
googleapis-common-protos==1.56.3
proto-plus==1.20.3
protobuf==3.20.1

Rupsha Chaudhuri

08/31/2022, 11:37 PM
let me install these
Now I’m back to
ImportError: cannot import name '_message' from 'google.protobuf.pyext'
are you on an m1?

Rupsha Chaudhuri

08/31/2022, 11:41 PM
yes

Ketan (kumare3)

09/01/2022, 12:12 AM
@yee why not get the task JSON from the UI?

Rupsha Chaudhuri

09/01/2022, 12:32 AM
I looked at the task details in UI too
not sure where I can find the caching info?
I confirmed my theory with a very basic task.. if the task returns something it is cached, else it is not

Ketan (kumare3)

09/01/2022, 8:21 PM
Yes, output is required
else what are we caching?

Rupsha Chaudhuri

09/01/2022, 8:22 PM
I guess it was just “knowing” that the task has been run already with those inputs.. not a problem… I’m now returning a status string so we are good