# flytekit
#2502 [BUG] [flytekit] Improve handling around StructuredDataset and other lossy types in Flyte remote

Issue created by wild-endeavor

Description

If you `remote.fetch` a task, workflow, or launch plan where one of the inputs is a StructuredDataset and then try to execute it, flytekit will try to "guess" the interface for that input, and the type it comes up with is the Python/flytekit `StructuredDataset` class. This is correct, but when we create the execution, we need to translate the input value (a pd.DataFrame or whatever instance was passed) into a StructuredDataset Literal. Since flytekit thinks the type annotation is the Python StructuredDataset class, it looks that class up in its list of formats/encoders and fails, because StructuredDataset is not a real dataframe type. An example stack trace:
Traceback (most recent call last):
  File "/Users/nielsbantilan/miniconda3/envs/unionml/bin/unionml", line 33, in <module>
    sys.exit(load_entry_point('unionml', 'console_scripts', 'unionml')())
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/Users/nielsbantilan/git/unionml/unionml/cli.py", line 99, in predict
    predictions = model.remote_predict(app_version, model_version, wait=True, **prediction_inputs)
  File "/Users/nielsbantilan/git/unionml/unionml/model.py", line 535, in remote_predict
    execution = self._remote.execute(
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/remote/remote.py", line 796, in execute
    return self.execute_remote_wf(
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/remote/remote.py", line 889, in execute_remote_wf
    return self.execute_remote_task_lp(
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/remote/remote.py", line 862, in execute_remote_task_lp
    return self._execute(
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/remote/remote.py", line 658, in _execute
    lit = TypeEngine.to_literal(ctx, v, hint, variable.type)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/core/type_engine.py", line 696, in to_literal
    lv = transformer.to_literal(ctx, python_val, python_type, expected)
  File "/Users/nielsbantilan/miniconda3/envs/unionml/lib/python3.9/site-packages/flytekit/types/structured/structured_dataset.py", line 486, in to_literal
    fmt = self.DEFAULT_FORMATS[python_type]
KeyError: <class 'flytekit.types.structured.structured_dataset.StructuredDataset'>
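
The last frame of the trace can be reproduced in isolation with a toy model. This is only a sketch of the mechanism described above, not flytekit's code: `DataFrame`, `StructuredDataset`, `DEFAULT_FORMATS`, and `lookup_format` below are hypothetical stand-ins that assume the format table is a plain dict keyed by concrete dataframe types.

```python
# Toy model only: these classes are stand-ins, not flytekit's real ones.
class DataFrame:
    """Stand-in for a concrete dataframe type such as pd.DataFrame."""


class StructuredDataset:
    """Stand-in for the wrapper type guessed from the remote interface."""


# Assumption: formats are registered per concrete dataframe type,
# and nothing is registered for the wrapper class itself.
DEFAULT_FORMATS = {DataFrame: "parquet"}


def lookup_format(python_type):
    # Same shape as the failing line in the trace: a dict lookup by type.
    return DEFAULT_FORMATS[python_type]


value = DataFrame()

# Using the guessed annotation as the key fails with KeyError.
try:
    lookup_format(StructuredDataset)
    failed = False
except KeyError:
    failed = True

# Using the concrete type of the value that was passed succeeds.
fmt = lookup_format(type(value))
```

The lookup goes wrong because the key is the guessed annotation rather than the type of the value the user actually supplied.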
We need to improve the erroring/experience around this. Potential things include:
• Asking the user to provide a StructuredDataset instance instead of the raw dataframe?
• Continuing to error, but informing the user to provide the type_hints map.

flyteorg/flyte
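
One way the two options might fit together is sketched below. It is a hypothetical illustration, not a proposed patch: `FakeDataFrame`, `FakeStructuredDataset`, `KNOWN_FORMATS`, and `resolve_format` are invented names, and the sketch assumes a per-input `type_hints` dict can override the type guessed from the fetched interface.

```python
from typing import Any, Dict, Optional


class FakeDataFrame:
    """Hypothetical stand-in for a concrete dataframe type."""


class FakeStructuredDataset:
    """Hypothetical stand-in for the StructuredDataset wrapper."""


# Assumption: encoders/formats are registered per concrete dataframe type.
KNOWN_FORMATS: Dict[type, str] = {FakeDataFrame: "parquet"}


def resolve_format(
    name: str,
    value: Any,
    guessed_type: type,
    type_hints: Optional[Dict[str, type]] = None,
) -> Optional[str]:
    # Option 1: a value that is already wrapped needs no format lookup.
    if isinstance(value, FakeStructuredDataset):
        return None
    # A user-supplied hint overrides the type guessed from the interface.
    python_type = (type_hints or {}).get(name, guessed_type)
    try:
        return KNOWN_FORMATS[python_type]
    except KeyError:
        # Option 2: still fail, but say what the user can do about it.
        raise TypeError(
            f"Cannot encode input '{name}' of type {type(value).__name__}: "
            f"no format is registered for {python_type.__name__}. "
            f"Pass a StructuredDataset instance, or supply "
            f"type_hints={{'{name}': {type(value).__name__}}}."
        ) from None


# With a hint, the concrete type drives the lookup.
hinted = resolve_format(
    "df", FakeDataFrame(), FakeStructuredDataset, {"df": FakeDataFrame}
)
```

Without the hint, the same call raises a `TypeError` whose message names both remedies instead of surfacing a bare `KeyError`.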