shy-holiday-15500
09/20/2022, 4:53 PM
thankful-minister-83577
thankful-minister-83577
thankful-minister-83577
return file_a, file_b
that they end up close to each other somehow? Like possibly in the same prefix in S3?
thankful-minister-83577
shy-holiday-15500
09/20/2022, 8:08 PM
shy-holiday-15500
09/20/2022, 8:09 PM
shy-holiday-15500
09/20/2022, 8:10 PM
shy-holiday-15500
09/20/2022, 8:12 PM
thankful-minister-83577
broad-monitor-993
09/20/2022, 9:08 PM
FlyteDirectory would be appropriate to use here… how are these “file families” distinct from directories?
shy-holiday-15500
09/20/2022, 9:09 PM
shy-holiday-15500
09/20/2022, 9:11 PM
shy-holiday-15500
09/20/2022, 9:11 PM
broad-monitor-993
09/20/2022, 9:16 PM
“Sometimes you want to download just the index so you can check whether something exists in the large file it indexes before you download it”
Interesting… yeah, I think if you can write down requirements like this in an issue, it would help us figure out how to extend FlyteDirectories: https://github.com/flyteorg/flyte/issues/new?assignees=&labels=enhancement%2Cuntriaged&template=feature_request.yaml&title=%5BCore+feature%5D+
rich-garden-69988
09/21/2022, 12:21 AM
class FlyteFileWithIndex(FlyteFile, metaclass=TypeTransformerMeta):
    index_extensions: typing.List[str] = []
    index_requirement: typing.Literal["all", "any"] = "any"
The type transformer looks at the list of index extensions and checks whether “any” or “all” of the index files exist, based on the specified index_requirement (“any” is useful for things like BAM, whose index can have either a “.bai” or a “.bam.bai” suffix; “all” is useful for strict matching). If the required index(es) exist, it downloads both the file and its associated index files into the same directory; otherwise it raises an error.
So for example, this is what a type would look like to handle BAM files, which can be defined inside the workflow or in a library that is used by the workflow:
class BamFile(FlyteFileWithIndex):
    index_extensions = [".bai", ".bam.bai"]
    index_requirement = "any"
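The “any”/“all” check described above can be sketched in plain Python without touching flytekit. This is only an illustration of the matching logic, not how a real type transformer would be wired up: `candidate_index_paths` and `resolve_indexes` are hypothetical helper names, the `exists` callback stands in for whatever remote-storage lookup would actually be used, and the rule that each index extension replaces the data file's final suffix is an assumption made so that both “sample.bai” and “sample.bam.bai” are covered.

```python
import os
import typing


def candidate_index_paths(path: str, index_extensions: typing.List[str]) -> typing.List[str]:
    # Assumption: each extension replaces the data file's final suffix,
    # so "sample.bam" with ".bam.bai" gives "sample.bam.bai".
    stem, _ = os.path.splitext(path)
    return [stem + ext for ext in index_extensions]


def resolve_indexes(
    path: str,
    index_extensions: typing.List[str],
    index_requirement: str = "any",
    exists: typing.Callable[[str], bool] = os.path.exists,
) -> typing.List[str]:
    """Return the index files that exist next to `path`, or raise if the requirement is not met."""
    if index_requirement not in ("any", "all"):
        raise ValueError(f"index_requirement must be 'any' or 'all', got {index_requirement!r}")

    candidates = candidate_index_paths(path, index_extensions)
    found = [p for p in candidates if exists(p)]

    if index_requirement == "all":
        satisfied = len(found) == len(candidates)
    else:
        satisfied = len(found) > 0

    if not satisfied:
        raise FileNotFoundError(
            f"index requirement {index_requirement!r} not met for {path}; looked for {candidates}"
        )
    return found


# Usage with an in-memory stand-in for a bucket listing (hypothetical paths).
bucket = {"s3://bucket/run1/sample.bam", "s3://bucket/run1/sample.bam.bai"}
indexes = resolve_indexes(
    "s3://bucket/run1/sample.bam", [".bai", ".bam.bai"], "any", exists=bucket.__contains__
)
```

With the same stand-in bucket, switching to “all” would raise, since only one of the two candidate index files is present. A transformer built on this idea would then download the data file plus every path in the returned list into one local directory.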
shy-holiday-15500
09/21/2022, 2:42 PM
thankful-minister-83577
shy-holiday-15500
09/21/2022, 4:54 PM
thankful-minister-83577
shy-holiday-15500
09/22/2022, 1:12 AM