# ask-ai
@ancient-wolf-19325, how do I download files from S3?
To download files from S3 using Flyte, you can use the `FlyteRemote` class from `flytekit`. Here's a concise example:
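import os
import tempfile

from flytekit.configuration import Config
from flytekit.remote import FlyteRemote
from flytekit.types.file import FlyteFile


def download_flyte_file(uri: str) -> FlyteFile:
    # Config.auto() reads settings from the environment or a local Flyte config file.
    remote = FlyteRemote(config=Config.auto())
    # Download into a fresh temp directory; tempfile.mktemp() is deprecated and race-prone.
    local_path = os.path.join(tempfile.mkdtemp(), os.path.basename(uri))
    remote.file_access.download(remote_path=uri, local_path=local_path)
    return FlyteFile(local_path)


my_flyte_file = download_flyte_file("s3://your-bucket/your-file.txt")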
This code initializes a `FlyteRemote` instance, downloads the file from the specified S3 URI to a temporary local path, and returns a `FlyteFile` object pointing to the downloaded file. Would you like more details?
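One way to choose the temporary local path, sketched with only the standard library: validate that the URI really is an `s3://bucket/key` address, then place the file in a directory created by `tempfile.mkdtemp()` rather than relying on the deprecated `tempfile.mktemp()`. The helper name `local_destination_for` and the `flyte-dl-` prefix are hypothetical, chosen for illustration only; they are not part of flytekit.

```python
import os
import tempfile
from urllib.parse import urlparse


def local_destination_for(uri: str) -> str:
    """Return a path inside a new private temp directory, named after the S3 object."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/key URI, got {uri!r}")
    # The object key is everything after the bucket; keep only its last segment.
    filename = os.path.basename(parsed.path.rstrip("/"))
    if not filename:
        raise ValueError(f"URI does not point at an object: {uri!r}")
    # mkdtemp() creates the directory itself, readable only by the current user.
    return os.path.join(tempfile.mkdtemp(prefix="flyte-dl-"), filename)


destination = local_destination_for("s3://your-bucket/your-file.txt")
print(destination)
```

The returned path keeps the original file name and extension, which helps when a later step infers the format from the suffix.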
@ancient-wolf-19325, show it in a task
To download files from S3 within a Flyte task, you can use the `FlyteFile` type. Here's an example of how to do this:
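import pandas as pd
from flytekit import task
from flytekit.types.file import FlyteFile


@task
def download_and_read_file(csv_file: FlyteFile) -> pd.DataFrame:
    # Typing the input as FlyteFile lets Flyte resolve the S3 URI;
    # download() fetches the object and returns its local path.
    local_path = csv_file.download()
    return pd.read_csv(local_path)


# Example usage (runs the task locally; needs credentials for the bucket)
if __name__ == "__main__":
    df = download_and_read_file(csv_file="s3://your-bucket/your-file.csv")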
This task downloads a CSV file from the specified S3 URI and reads it into a pandas DataFrame. Would you like more details?
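To try the reading step without S3 or a cluster, one option is to keep the parsing in a plain function that takes a local path, which a task could call once the file is on disk. The sketch below is an illustration under assumptions: it uses the standard-library `csv` module instead of pandas purely to stay dependency-free, and `read_rows` and the sample data are hypothetical.

```python
import csv
import os
import tempfile


def read_rows(local_path: str) -> list[dict[str, str]]:
    """Parse a local CSV file into one dict per row, keyed by the header line."""
    with open(local_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Stand-in for a downloaded file: write a small CSV into a temp directory.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", newline="", encoding="utf-8") as handle:
    handle.write("name,size\nalpha,1\nbeta,2\n")

rows = read_rows(sample_path)
print(rows)
```

Because the function only sees a local path, the same check works whether the file came from S3 or from a fixture on disk.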