# flytekit
#6257 [Core feature] Add `metadata` to FlyteFile

Issue created by davidmirror-ops

### Motivation: Why do you think this is important?

Flyte's type system serializes to `pickle` if the data type doesn't have a registered `TypeTransformer`. This format is known to be insecure because it allows remote code execution during deserialization. If FlyteFile supported a metadata field, we could add a hash to it as an additional control against pickling attacks and other forms of data-at-rest corruption. It would also help position Flyte as the right system for building a robust and secure ML supply chain.

### Goal: What should the final outcome look like, ideally?

If this were available, we could do something like:

```python
import hashlib

from flytekit import task
from flytekit.types.file import FlyteFile


def calculate_file_hash(file_path: str) -> str:
    """Calculate the SHA256 hash of a file."""
    with open(file_path, "rb") as f:
        sha256_hash = hashlib.sha256(f.read())
    return sha256_hash.hexdigest()


@task
def process_file(file_path: str) -> FlyteFile:
    # Calculate the hash of the file
    file_hash = calculate_file_hash(file_path)
    # Create a FlyteFile with hash as metadata
    # (`metadata` is the argument proposed in this issue)
    flyte_file = FlyteFile(path=file_path, metadata={"hash": file_hash})
    return flyte_file
```

### Describe alternatives you've considered

• Create and register a Custom Type like `ExtendedFlyteFile`
• Encode models into a custom data class with a method that calculates and validates hash

### Propose: Link/Inline OR Additional context

No response

### Are you sure this issue hasn't been raised already?

• Yes

### Have you read the Code of Conduct?

• Yes

flyteorg/flyte
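For illustration, the second alternative (a custom data class that calculates and validates a hash) might look roughly like the sketch below. It uses only the Python standard library; `HashedFile`, `sha256_of`, and `verify` are hypothetical names chosen for this example and are not part of flytekit. The sketch assumes the consumer re-hashes the file before loading it.

```python
import hashlib
import hmac
from dataclasses import dataclass


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass(frozen=True)
class HashedFile:
    """Hypothetical container: a file path plus the SHA256 recorded at creation."""

    path: str
    sha256: str

    @classmethod
    def from_path(cls, path: str) -> "HashedFile":
        # Record the hash at the moment the producer hands the file over.
        return cls(path=path, sha256=sha256_of(path))

    def verify(self) -> None:
        """Raise ValueError if the file no longer matches the recorded hash."""
        actual = sha256_of(self.path)
        if not hmac.compare_digest(actual, self.sha256):
            raise ValueError(f"hash mismatch for {self.path}")
```

A producing task would return `HashedFile.from_path(...)`, and a consuming task would call `verify()` before deserializing the file. This covers the same check the proposed `metadata` field would enable, at the cost of maintaining a separate type.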