wave I m seeing a discrepancy in Flyte datacatalog between Flyte #flyte-deployment

brash-london-45337

04/28/2022, 9:50 PM

👋 I’m seeing a discrepancy in Flyte datacatalog between the docs and my testing:

• In the docs, it says the cache is only invalidated if the `cache_version`, task signature, or inputs change. This aligns with the example provided:

In the above example, calling square(n=2) twice (even if it’s across different executions or different workflows) will only execute the multiplication operation once.

• However, in my testing, caching a task and then making a change to a different task in the workflow causes the “cached” task to be recomputed. The code shows the hashed key contains a `core.Identifier` that has the task version in it. Logs are also showing the same, which would mean each iteration on the workflow would recompute the “cached” task:

"Successfully cached results to catalog - Task [resource_type:TASK project:\"fraud-intelligence\" domain:\"development\" name:\"src.python.flyte.fraud_intelligence.training_set.main.filter_and_label\" version:\"e8758db3b7a1e6fffbb0bb73c742310acd7e774f\" ]"

Are the docs just outdated? If the implementation is correct, how are cached tasks expected to be reused when iterating on a workflow? Is the only way to reuse a cached task through a reference task?

thankful-minister-83577

04/28/2022, 9:57 PM

what’s your task signature? could you copy/paste it here?

thankful-minister-83577

04/28/2022, 9:58 PM

it shouldn’t take into account task version… that should be okay to change frequently

brash-london-45337

04/28/2022, 9:59 PM

@task(cache=True, cache_version="v1")
def filter_and_label(
    user: str,
    snapshot_date: str,
    start_date: str,
    end_date: str,
    label_type: str,
    final_features_path: str,
    output_path: str,
) -> str:

thankful-minister-83577

04/28/2022, 10:04 PM

if you re-run a workflow at the same version, does the cache get read?

brash-london-45337

04/28/2022, 10:05 PM

yep

brash-london-45337

04/28/2022, 10:06 PM

The first one is a rerun of the same workflow, the 2nd is a change in a different task that is causing the first “cached” task to be recomputed

thankful-minister-83577

04/28/2022, 10:07 PM

and if you go to the task json definition, the `discoveryVersion` is what you expect in all cases?

brash-london-45337

04/28/2022, 10:09 PM

$ flytectl get tasks -d development -p fraud-intelligence src.python.flyte.fraud_intelligence.training_set.main.filter_and_label -oyaml
- closure:
    compiledTask:
      template:
...
        metadata:
          discoverable: true
          discoveryVersion: v1

brash-london-45337

04/28/2022, 10:10 PM

yeah that task hasn’t changed — so that metadata should be consistent

thankful-minister-83577

04/28/2022, 10:11 PM

oh… sorry, the json is also on the Task tab in the UI.

brash-london-45337

04/28/2022, 10:11 PM

I traced the logs and the issue is that the cache key is different between the runs. The original run has a key of `flyte_cached-goqzg39XfX_GSwutxjbTzJghG38yCEerd52cCCV6zzA` and running an updated workflow is showing

"DataCatalog failed to get artifact by tag flyte_cached-2zsk_u8ljbfKgGhocfVu4Cmm6aHxU8YfO63yFx92duk"

thankful-minister-83577

04/28/2022, 10:12 PM

where did you see the `core.Identifier` you were talking about?

brash-london-45337

04/28/2022, 10:13 PM

it’s in the github code link I provided above: `catalog.Key`

brash-london-45337

04/28/2022, 10:13 PM

Ah yes. that is much easier to see the task metadata 🙂

"type": "python-task",
  "metadata": {
    "discoverable": true,
    "runtime": {
      "type": 1,
      "version": "0.26.0",
      "flavor": "python"
    },
    "retries": {},
    "discoveryVersion": "v1"
  },

looks consistent between both of them

hallowed-mouse-14616

04/28/2022, 10:40 PM

Hey @brash-london-45337 it looks like the tag where the hash is different is computed as a hash over the task's input values. So it shouldn't be an issue of task version if those are different. Can you show the inputs tabs in the UI from the different task runs?
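
To illustrate the point about the tag being a hash over input values, here is a minimal sketch. It is not Flyte's actual implementation: the JSON serialization, the SHA-256 digest, and the base64 encoding are assumptions chosen for illustration, and `cache_tag` is a hypothetical helper. Only the `flyte_cached-` prefix comes from the logs above, and the second run's `end_date` is a made-up value.

```python
import base64
import hashlib
import json


def cache_tag(inputs: dict) -> str:
    """Hypothetical helper: derive a cache tag from input values only."""
    # Serialize deterministically (sorted keys) so equal inputs give equal bytes.
    payload = json.dumps(inputs, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # URL-safe base64 without padding (an assumption about the encoding).
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "flyte_cached-" + encoded


run_1 = {
    "user": "btang",
    "snapshot_date": "2022-04-26",
    "start_date": "2022-04-25",
    "end_date": "2022-04-26",
}
# Same task, same task version, but one date input has moved on (hypothetical value).
run_2 = dict(run_1, end_date="2022-04-27")

print(cache_tag(run_1) == cache_tag(dict(run_1)))  # identical inputs, identical tag
print(cache_tag(run_1) == cache_tag(run_2))        # one input differs, tag differs
```

Nothing about the task version enters `cache_tag` in this sketch, which mirrors the explanation above: if the tag changes between runs, the first place to look is the input values.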

brash-london-45337

04/28/2022, 10:42 PM

oh crud —

end_date: 2022-04-26
snapshot_date: 2022-04-26
start_date: 2022-04-25
user: btang

brash-london-45337

04/28/2022, 10:42 PM

i forgot one of the inputs was a date that was computed from the current date

brash-london-45337

04/28/2022, 10:42 PM

lemme try it again

🙏 1

brash-london-45337

04/28/2022, 10:54 PM

😬 yep my mistake. The cache key is consistent!
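
The root cause here (a date input derived from the current date) can be reproduced with a small analogy. The sketch below uses Python's `functools.lru_cache` as a stand-in for the catalog; it is not Flyte code, and `dates_for` is a hypothetical helper. It only shows that a cache keyed on arguments misses whenever a "today"-derived argument changes, and hits again once the dates are pinned.

```python
from datetime import date, timedelta
from functools import lru_cache


@lru_cache(maxsize=None)
def filter_and_label(user: str, snapshot_date: str, start_date: str, end_date: str) -> str:
    # Stand-in for the expensive task body.
    return f"{user}:{start_date}..{end_date}@{snapshot_date}"


def dates_for(today: date) -> dict:
    """Hypothetical helper: derive the date inputs from a given day."""
    return {
        "snapshot_date": today.isoformat(),
        "start_date": (today - timedelta(days=1)).isoformat(),
        "end_date": today.isoformat(),
    }


# Dates derived from "today": runs on consecutive days pass different
# arguments, so both calls are cache misses.
filter_and_label("btang", **dates_for(date(2022, 4, 26)))
filter_and_label("btang", **dates_for(date(2022, 4, 27)))
implicit = filter_and_label.cache_info()

# Dates pinned to a fixed day: the arguments match the first call, so it hits.
filter_and_label("btang", **dates_for(date(2022, 4, 26)))
pinned = filter_and_label.cache_info()

print(implicit.hits, implicit.misses)  # 0 2
print(pinned.hits, pinned.misses)      # 1 2
```

When iterating on a workflow, passing the dates explicitly as inputs (rather than computing them at launch time) keeps the arguments stable between runs.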

brash-london-45337

04/28/2022, 10:54 PM

thanks for the help!

hallowed-mouse-14616

04/28/2022, 10:57 PM

No problem! thanks @thankful-minister-83577! Looks like you took a pretty deep dive, hope it wasn't too painful 😅

freezing-airport-6809

04/29/2022, 1:08 AM

@brash-london-45337 help us understand what ux can make this process better?

freezing-airport-6809

04/29/2022, 1:08 AM

cc @jolly-alligator-53212 / @late-eye-50215

👀 1

brash-london-45337

04/29/2022, 4:22 PM

This was mostly user error on my part 😅, but perhaps it would be helpful if the logs also included the input names/types that are part of the cache key. Reading

"DataCatalog failed to get dataset for ID resource_type:TASK project:\"fraud-intelligence\" domain:\"development\" name:\"src.python.flyte.fraud_intelligence.training_set.main.filter_and_label\" version:\"e8758db3b7a1e6fffbb0bb73c742310acd7e774f\"

at first glance made me believe those were all a part of the cache key
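
Until the logs include input names, one way to debug a cache miss like this is to copy the inputs of the two runs from the UI and diff them. The following is a rough sketch under that assumption; `diff_inputs` is a hypothetical helper, not part of Flyte or flytectl, and the second run's `end_date` is a made-up value.

```python
MISSING = "<missing>"  # marker for an input present in only one run


def diff_inputs(first: dict, second: dict) -> dict:
    """Return {input_name: (value_in_first, value_in_second)} for inputs that differ."""
    changed = {}
    for name in sorted(set(first) | set(second)):
        old = first.get(name, MISSING)
        new = second.get(name, MISSING)
        if old != new:
            changed[name] = (old, new)
    return changed


run_1 = {
    "user": "btang",
    "snapshot_date": "2022-04-26",
    "start_date": "2022-04-25",
    "end_date": "2022-04-26",
}
run_2 = dict(run_1, end_date="2022-04-27")  # hypothetical later run

print(diff_inputs(run_1, run_2))  # {'end_date': ('2022-04-26', '2022-04-27')}
```

An empty result means the inputs match, so the cause of a differing key lies elsewhere (for example, the `cache_version` or the task signature).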
