Thomas Blom
07/06/2023, 7:30 PM
FlyteDirectory after version 1.4.2.
Our tasks/workflows create dataclass-based result objects, and we wrap these with a "FlyteResult" class that contains the result object as well as a FlyteDirectory. Some tasks write files during execution; the FlyteDirectory points at the folder containing these files, and they are automatically downloaded (e.g. from s3) later by FlyteDirectory::download() when they are requested.
However, some tasks don't write any files. The FlyteResult class that generically manages access to these things doesn't know which tasks do or don't write files, and always just calls self.dir.download() to get anything that might have been written (at the very least, a subfolder is written, but it may be empty).
This worked fine in flytekit 1.4.2 and earlier, but starting with 1.5, this raises an Access Denied, because the remote path doesn't exist.
It seems that you actually have to write files during the task to get this remote folder to exist. It is not enough to create a subfolder under current_context().working_directory.
One way to handle this is to check if the remote s3 "path" exists -- but I don't see how to do this with FlyteDirectory.
At present I'm just creating a file in each task to ensure the FlyteDirectory has something in it, so that the remote s3 path is created and the call to FlyteDirectory::download() doesn't result in an Access Denied exception. This feels clunky.
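For readers who want to see the placeholder-file workaround spelled out, here is a minimal sketch. The helper name ensure_not_empty and the marker file name _placeholder.txt are hypothetical, not from the thread or from flytekit. It only guarantees the local folder has at least one file; whether that is enough for the remote path to be created depends on how the directory is uploaded.

```python
from pathlib import Path


def ensure_not_empty(folder, marker_name="_placeholder.txt"):
    """Create `folder` if needed and add a marker file when it is empty.

    Hypothetical helper: the name and marker file are illustrative only.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # Only add the marker when the task wrote nothing of its own.
    if not any(folder.iterdir()):
        (folder / marker_name).write_text("placeholder so the directory is not empty\n")
    return folder
```

A task would call this on its output subfolder just before returning the FlyteDirectory that points at it.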
Better ideas?
Yee
ctx.file_access.get_filesystem_for_path(self.dir.remote_source).exists(self.dir.remote_source)?
Thomas Blom
07/07/2023, 2:04 PM
Using ctx.file_access.get_filesystem_for_path(flytedirectory.remote_source).exists(flytedirectory.remote_source) to check for existence of the remote source works. Thanks!
For any other readers, note that the ctx above is FlyteContextManager::current_context(), not flytekit.current_context() -- the latter returns a context that is a small subset of params compared to the former (and in particular does not include file_access).
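Putting the pieces from this thread together, a guarded download might look roughly like the sketch below. The function name download_if_remote_exists and its optional ctx parameter are hypothetical. It assumes flytekit 1.5 or later, where ctx.file_access.get_filesystem_for_path() returns an fsspec filesystem, and it imports flytekit lazily so the context is only looked up when none is passed in.

```python
def download_if_remote_exists(directory, ctx=None):
    """Call directory.download() only if its remote source exists.

    Hypothetical helper; `directory` is expected to be a FlyteDirectory.
    Returns True if a download was attempted, False otherwise.
    """
    if ctx is None:
        # Use the full Flyte context, not flytekit.current_context(),
        # because only the former exposes file_access.
        from flytekit.core.context_manager import FlyteContextManager

        ctx = FlyteContextManager.current_context()

    remote = directory.remote_source
    if not remote:
        # Nothing was uploaded for this directory, so there is nothing to fetch.
        return False

    fs = ctx.file_access.get_filesystem_for_path(remote)
    if not fs.exists(remote):
        # The task wrote no files, so the remote path was never created.
        return False

    directory.download()
    return True
```

In the FlyteResult wrapper described above, this would replace the unconditional self.dir.download() call.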