ripe-nest-20732
10/16/2024, 1:23 PM
I'm using a ShellTask to curl + gunzip a file. The input filename is formatted like date.json.gz and I want to set the output location to date.json, that is, to strip off the .gz from the {input.filename}. I see the OutputLocation dataclass allows using a regex: https://github.com/flyteorg/flytekit/blob/f16419136abcf971d30d3398bd7b35a7b6aec904/flytekit/extras/tasks/shell.py#L37 but I can't figure out how that works. Are there any examples? Thanks!
damp-lion-88352
10/16/2024, 2:58 PM
damp-lion-88352
10/16/2024, 2:58 PM
ripe-nest-20732
10/16/2024, 2:58 PM
average-finland-92144
10/18/2024, 9:50 PM
OutputLocation(
    var="output_file",
    var_type=FlyteFile,
    location="{re.sub(r'\.gz$', '', inputs.filename)}"
)
average-finland-92144
10/18/2024, 9:50 PM
ripe-nest-20732
10/18/2024, 10:58 PM
average-finland-92144
10/18/2024, 11:23 PM
ripe-nest-20732
10/23/2024, 1:06 PM
ripe-nest-20732
10/31/2024, 7:23 PM
It looks like the location is interpolated with "".format(). So there doesn't seem to be a way to do what I'm trying to do, but I realized I can just pass the filename without the .gz and then add that suffix in my shell script where needed.
ripe-nest-20732
10/31/2024, 7:25 PM
[0]: Traceback (most recent call last):
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/core/base_task.py", line 745, in dispatch_execute
    native_outputs = self.execute(**native_inputs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/core/array_node_map_task.py", line 270, in execute
    return self.python_function_task.execute(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/extras/tasks/shell.py", line 326, in execute
    outputs[v.var] = self._interpolizer.interpolate(v.location, inputs=kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/flytekit/extras/tasks/shell.py", line 174, in interpolate
    raise ValueError(f"Variable {e} in Query not found in inputs {consolidated_args.keys()}")
Message:
ValueError: Variable 're' in Query not found in inputs dict_keys(['inputs', 'outputs', 'ctx'])
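The error above is consistent with the location template being filled in by Python's string formatting, which resolves names, attributes, and indexes but never evaluates function calls. A minimal sketch using plain `str.format` rather than flytekit itself; the `SimpleNamespace` stand-in for the task inputs and the sample filename are assumptions for illustration:

```python
from types import SimpleNamespace

# Stand-in for the task inputs (assumption; the real object comes from flytekit).
inputs = SimpleNamespace(filename="2024-10-16.json.gz")

# Attribute access inside a replacement field works.
plain = "{inputs.filename}".format(inputs=inputs)

# A function call does not: the formatter splits the field at the first "."
# and treats "re" as a key to look up among the supplied arguments.
template = r"{re.sub(r'\.gz$', '', inputs.filename)}"
try:
    template.format(inputs=inputs)
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(plain)
print(missing_key)
```

The missing key is the same name the traceback complains about, which suggests the template never reaches `re.sub` at all.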
average-finland-92144
10/31/2024, 8:02 PM
"I can just pass the filename without the .gz and then add that suffix in my shell script where needed"
this works for you?
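The workaround quoted above can be sketched with plain string formatting: the input carries the name without `.gz`, the output location uses it unchanged, and the script template appends the suffix. This only imitates the interpolation step and is not flytekit code; the URL, the sample filename, and the `output_file` name are hypothetical, and in a real task the two templates would presumably be the `script` and `location` arguments.

```python
from types import SimpleNamespace

# Hypothetical input: the filename is passed without the .gz suffix.
inputs = SimpleNamespace(filename="2024-10-16.json")

# Output location template: nothing needs stripping any more.
location = "{inputs.filename}".format(inputs=inputs)
outputs = SimpleNamespace(output_file=location)

# Script template: the .gz suffix is added back where the download needs it.
# example.com is a placeholder URL, not a real data source.
script = (
    "curl -sSf https://example.com/{inputs.filename}.gz "
    "| gunzip > {outputs.output_file}"
).format(inputs=inputs, outputs=outputs)

print(location)
print(script)
```

Keeping the suffix in the script means both templates only need attribute access, which plain formatting supports.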
ripe-nest-20732
10/31/2024, 8:39 PM
ripe-nest-20732
10/31/2024, 8:40 PM
average-finland-92144
10/31/2024, 8:47 PM
ripe-nest-20732
10/31/2024, 8:47 PM