# ask-the-community
s
Hi, another probably silly question. I would like to access the outputs of an already executed task programmatically. I was expecting an API that would let me list the executions (which I can do using
flytekit.remote.FlyteRemote.recent_executions
) and then choose an execution, and get/download the output of a particular task in that workflow. What is the recommended way?
k
This should be possible.
You want to look up the execution programmatically too, right?
s
Yes, I want to be able to inspect artifacts from runs, without having to go to the UI and figure out what files to download manually. Which API should I be looking for this functionality?
s
OK, I managed to make some progress. I had to run
remote.sync_execution(execution, sync_nodes = True)
to be able to get the task inputs and outputs. Then I ran into another problem. I get the node execution (
node_execution = execution.node_executions['n3']
) and then, if I get the output using
node_execution.outputs['o0']
the output is
StructuredDataset(uri=None, file_format='parquet')
. To get the actual URI, I have to use
node_execution.outputs.data['o0'].value.structured_dataset.uri
which returns
s3://my-s3-bucket/data/ow/f2d443d3adb604c94869-n3-0/26f65254ebd42579e5f67433d38efc01
. This does not look right; it seems like I am not using the right API. At least I can then load the data from this URI using boto.
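For anyone taking the same boto route, here is a minimal sketch of splitting such a URI into the bucket and key prefix that S3 clients expect. The helper name split_s3_uri is hypothetical (it is not part of flytekit or boto), and it assumes the URI points at a prefix holding the dataset's files rather than at a single object.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # netloc is the bucket; the path (minus its leading slash) is the key prefix
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_uri(
    "s3://my-s3-bucket/data/ow/f2d443d3adb604c94869-n3-0/26f65254ebd42579e5f67433d38efc01"
)
# bucket and prefix can then be handed to an S3 client, for example as the
# Bucket and Prefix arguments of a boto3 list_objects_v2 call.
```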
s
Can you pass
type_hints
? If your structured dataset is a pandas DataFrame, you can pass
pandas.DataFrame
. https://discuss.flyte.org/t/10466387/attributeerror-when-fetching-inputs-from-node-execution-for-#c048724e-aab5-403f-84ac-1b5b842b6b44
s
Oh, it worked by using
node_execution.inputs.get('param_name', as_type=pd.DataFrame)
Nice! Thanks, Samhita, I really appreciate it!
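Putting the thread together, the flow can be sketched roughly as below. This is a sketch rather than an official recipe: fetch_node_output is a hypothetical helper name, and it assumes that outputs accepts the same get(..., as_type=...) call that worked for inputs above. The remote and execution arguments are expected to be a FlyteRemote and one of the executions it returns.

```python
def fetch_node_output(remote, execution, node_id, output_name, as_type=None):
    """Return one output of one node of an already executed workflow.

    Hypothetical helper: `remote` is expected to behave like
    flytekit.remote.FlyteRemote and `execution` like one of its executions.
    """
    # sync_nodes=True is what populates execution.node_executions
    remote.sync_execution(execution, sync_nodes=True)
    node_execution = execution.node_executions[node_id]
    # as_type is the type hint, e.g. pandas.DataFrame for a structured dataset.
    # Assumption: outputs supports the same get() call as inputs in the thread.
    return node_execution.outputs.get(output_name, as_type=as_type)
```

Usage would look something like the following, where the project and domain names are placeholders:

```python
# import pandas as pd
# from flytekit.configuration import Config
# from flytekit.remote import FlyteRemote
#
# remote = FlyteRemote(config=Config.auto(), default_project="my-project", default_domain="development")
# execution = remote.recent_executions()[0]
# df = fetch_node_output(remote, execution, "n3", "o0", as_type=pd.DataFrame)
```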