acoustic-carpenter-78188
05/22/2023, 6:49 PM
Motivation: Why do you think this is important?
There is currently no way to pass library-specific read/write options through StructuredDataset. This can be very limiting in cases where we want to take advantage of specific features from a given library.
Some common use-cases (sketched below with each library's native API):
• Predicate pushdown when reading hive-style partitions
• Writing partitioned tables
• Appending to a BigQuery table instead of overwriting it
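For reference, a minimal sketch of how these features look when calling the libraries directly; pandas with the pyarrow engine plus pandas-gbq are assumed, and the paths and table names are illustrative:
import pandas as pd

# predicate pushdown: only partitions matching the filter are read
df = pd.read_parquet("s3://bucket/events/", filters=[("year", ">=", 2021)])

# write a hive-style partitioned table
df.to_parquet("s3://bucket/events_out/", partition_cols=["year", "month", "day"])

# append to a BigQuery table instead of overwriting it (requires pandas-gbq)
df.to_gbq("my_dataset.events", if_exists="append")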
Goal: What should the final outcome look like, ideally?
I would like to take advantage of StructuredDataset's portability without losing important features of the underlying libraries.
Describe alternatives you've considered
The alternative would be using FlyteDirectory or FlyteFile:
from flytekit import task
from flytekit.types.file import FlyteFile
from flytekit.types.structured import StructuredDataset
import pyspark.sql

@task
def t(sd: StructuredDataset) -> FlyteFile:
    # materialize the incoming dataset as a pyspark DataFrame
    df = sd.open(pyspark.sql.DataFrame).all()
    # write it back out as hive-partitioned parquet under a new remote path
    ff = FlyteFile.new_remote_file()
    df.write.parquet(ff.path, partitionBy=["country"])
    return ff
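This works, but the task now returns an opaque FlyteFile, so downstream tasks lose the dataframe typing and automatic conversion that StructuredDataset provides.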
Propose: Link/Inline OR Additional context
Some examples...
# filter a hive-partitioned parquet dataset with pandas to avoid loading it all into memory
def t_pandas(sd: StructuredDataset):
    df = sd.open(pd.DataFrame).all(filters=[("year", ">=", 2021)])

# write a hive-partitioned table
def t_write_partitioned():
    ...
    return StructuredDataset(dataframe=df, partition_cols=['year', 'month', 'day'])

# append to a table instead of overwriting it
def t_write_bq():
    ...
    return StructuredDataset(df, uri="bq:/", append=True)

# support polars streaming
def t_polars_stream():
    ...
    return StructuredDataset(dataframe=df, streaming=True)

# read with a low-memory setting
def t_polars_low_mem(sd: StructuredDataset):
    df = sd.open(pl.DataFrame).all(low_mem=True)
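One way this could plumb through, as a purely hypothetical sketch: flytekit could forward keyword arguments from open(...).all() to the registered decoder. StructuredDatasetDecoder is flytekit's existing extension point, but the **kwargs forwarding shown here is the proposal, not current behavior:
import pandas as pd
from flytekit.types.structured.structured_dataset import StructuredDatasetDecoder

class PandasParquetDecoder(StructuredDatasetDecoder):
    # hypothetical signature: extra kwargs from all() arrive here
    def decode(self, ctx, flyte_value, current_task_metadata, **kwargs):
        # e.g. kwargs == {"filters": [("year", ">=", 2021)]}, passed
        # straight through to the underlying library
        return pd.read_parquet(flyte_value.uri, **kwargs)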
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes