I am having some trouble executing a workflow with...
# ask-the-community
s
I am having some trouble using FlyteRemote to execute a workflow whose input is a list of dataclasses. The workflow's input type is list[InputDocument], where InputDocument is defined as follows:
@dataclass_json
@dataclass
class InputDocument:
    document_id: str
    s3_path: str
The workflow looks like this and passes through multiple tasks that return intermediate dataclass types:
def my_workflow(inputs: list[InputDocument]) -> list[ProcessedDocument]:
    ...
When executing the workflow via FlyteRemote.execute, I get the following error:
File "flytekit/core/type_engine.py", line 321, in assert_type
    for f in dataclasses.fields(type(v)):  # type: ignore
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/dataclasses.py", line 1244, in fields
    raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance
This happens with the following input data:
{
  "inputs": [
    {
      "document_id": "test1",
      "s3_path": "s3://inputs/123.txt"
    }
  ]
}
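To make the error message concrete, here is a minimal standard-library sketch of what the failing line in the traceback does. It only shows that dataclasses.fields(type(v)) raises this TypeError whenever v is a plain list or dict rather than a dataclass instance; it does not establish where inside flytekit the value ends up being treated that way. The helper name fields_error is made up for illustration.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class InputDocument:
    document_id: str
    s3_path: str


def fields_error(value):
    """Return the TypeError message from dataclasses.fields(type(value)), or None.

    Hypothetical helper: it mirrors the call shown in the traceback, nothing more.
    """
    try:
        dataclasses.fields(type(value))
    except TypeError as exc:
        return str(exc)
    return None


payload = {"inputs": [{"document_id": "test1", "s3_path": "s3://inputs/123.txt"}]}

# A real dataclass instance passes the check.
ok = fields_error(InputDocument(**payload["inputs"][0]))

# The list itself and the raw dict inside it both fail the same way.
list_error = fields_error(payload["inputs"])
dict_error = fields_error(payload["inputs"][0])

print(ok, list_error, dict_error, sep="\n")
```

So the message alone does not say whether the offending value was the outer list or one of the dicts inside it; both produce the same text.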
Essentially, FlyteRemote thinks that the list should be a dataclass. Any idea what is going on here? It seems like flytekit can't handle a native list of dataclasses. I can get the workflow to run by adding type_hints={"inputs": list[dict[str, str]]}, but I would really rather not constrain the workflow like this: I want to be able to call multiple different workflows with different inputs arbitrarily.
I am wondering to what degree flytekit supports list[Dataclass]. I am also seeing unexpected errors when using a dataclass with a nested list of dataclasses. Note that this happens only in tests, not when run on a Flyte cluster (suggesting the problem is in flytekit). Can anyone give me some indication of the level of support in flytekit for:
1. using a list of dataclasses as the input of a workflow
2. using nested lists of dataclasses within a dataclass field
cc @Alberto Bracci
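For clarity, the second case has roughly the following shape. This is a standard-library sketch only: DocumentBatch and its field names are hypothetical and are not taken from the real code. dataclasses.asdict shows the nested plain-dict structure that any serializer would have to round-trip.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class InputDocument:
    document_id: str
    s3_path: str


@dataclass
class DocumentBatch:  # hypothetical wrapper, for illustration only
    batch_id: str
    # A dataclass field holding a list of other dataclasses.
    documents: list[InputDocument] = field(default_factory=list)


batch = DocumentBatch("batch-1", [InputDocument("test1", "s3://inputs/123.txt")])

# asdict recurses into the list, turning each nested dataclass into a dict.
flat = asdict(batch)
print(flat)
```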
j
We have an extremely similar class and function to yours that works. Our InputDocument class extends DataClassJsonMixin (from dataclasses_json); can you see if that resolves your issue?
Per the package description, the mixin inheritance gives somewhat better support for type checking.
s
Thanks @Joe Kelly - will try!
k
Could you try typing.List instead of list?
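Some background on why the spelling could matter at all, as a standard-library sketch: the two hints carry the same origin and arguments, but they are instances of different alias classes, so code that inspects the alias object itself rather than its origin could in principle treat them differently. Whether flytekit does so is not established here.

```python
import typing
from dataclasses import dataclass


@dataclass
class InputDocument:
    document_id: str
    s3_path: str


builtin_hint = list[InputDocument]        # PEP 585 style, Python 3.9+
typing_hint = typing.List[InputDocument]  # older typing-module style

# Both hints resolve to the same origin type and the same type arguments.
same_origin = typing.get_origin(builtin_hint) is typing.get_origin(typing_hint) is list
same_args = typing.get_args(builtin_hint) == typing.get_args(typing_hint) == (InputDocument,)

# The alias objects themselves are instances of different classes.
same_class = type(builtin_hint) is type(typing_hint)

print(same_origin, same_args, same_class)
```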
s
@Kevin Su @Joe Kelly Thanks for your comments. Unfortunately, neither of these has worked, and we are seeing the same behaviour. An issue has been raised here to help progress this: https://github.com/flyteorg/flyte/issues/4098