<@U06RKAMK0UC> how do you pass struct as an input ...
# ask-ai
b
@ancient-wolf-19325 How do you pass a struct as an input to a workflow?
j
Reasoning Complete! 🤖 🔍 Searched knowledge base: I'll look up the best way to pass a struct as an input to a workflow, so I can give you clear steps or examples. • Searching over 10 data sources from the following: File, Github File, Github Issue, Web • Found 68 documents that could be relevant from the following data sources: Github File, Github Issue, Github Pull Request, Web Content. • Filtered down to the top 30 most relevant documents. ✍️ Generated answer: I have all the details needed to explain how to pass a struct as an input to a workflow in Flyte, including code examples and input formats.
To pass a struct (such as a Python dataclass or Pydantic BaseModel) as an input to a Flyte workflow, define the type and use it as the type annotation for the workflow input. Flytekit handles serialization and deserialization automatically. You can provide the input as a JSON object when launching the workflow via the Flyte Console or CLI. For dataclasses, Flytekit serializes values as either MessagePack or JSON, depending on the Flytekit version, and the input form in the Console is generated from the struct's JSON schema. See the example below:
from dataclasses import dataclass
from flytekit import task, workflow

@dataclass
class MyStruct:
    a: int
    b: str

@task
def my_task(x: MyStruct) -> int:
    return x.a

@workflow
def my_wf(x: MyStruct) -> int:
    return my_task(x=x)
You can launch this workflow and provide the input as a JSON object, e.g., {"a": 1, "b": "hello"}. For more, see the Flyte dataclass documentation and related issue. Would you like more detail or examples for Pydantic models or advanced struct usage?
Sources:
- Flyte dataclass documentation
- Flyte issue: Unable to input dataclass and optional dataclass types in Flyte Console
💡 Hint: Mention @runllm in the thread for followups.
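As a minimal sketch of where that JSON object comes from, the snippet below uses only the standard library to turn a `MyStruct` instance into the JSON shown above and back again. The `MyStruct` definition is copied from the example. Nothing here calls Flyte, so it only illustrates the shape of the input, not how Flytekit serializes it internally.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class MyStruct:
    a: int
    b: str


# Serialize an instance to a JSON object with the same fields as the dataclass.
payload = json.dumps(asdict(MyStruct(a=1, b="hello")))

# Parse the JSON back and rebuild the dataclass to confirm the fields line up.
restored = MyStruct(**json.loads(payload))
```

If `MyStruct(**json.loads(payload))` raises a `TypeError`, the JSON has a missing or unexpected key relative to the dataclass fields.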
b
@ancient-wolf-19325 Using a dataclass input with FlyteRemote results in: status = StatusCode.INVALID_ARGUMENT details = "invalid sink_table input wrong type. Expected simple:STRUCT but got simple:BINARY". How do you send struct inputs when using FlyteRemote to execute a workflow?
j
When using FlyteRemote to execute a workflow with a dataclass (struct) input, you must provide the input as a plain Python dictionary matching the dataclass structure, not as a dataclass instance. FlyteRemote expects the input to be serializable to JSON/struct, not a pickled binary object. For example, if your dataclass is:
from dataclasses import dataclass

@dataclass
class SinkTable:
    name: str
    id: int
You should call FlyteRemote like this:
inputs = {"sink_table": {"name": "my_table", "id": 123}}
execution = remote.execute(workflow, inputs=inputs)
Passing a dataclass instance directly will result in a type mismatch (simple:BINARY instead of simple:STRUCT). For more, see the Flyte dataclass docs and related issue. Would you like more detail or troubleshooting steps?
Sources:
- Flyte dataclass documentation
- Flyte issue: Unable to input dataclass and optional dataclass types in Flyte Console
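If the values already exist as dataclass instances, a small helper can build the plain-dict form described above. This is a sketch using only the standard library. `to_plain_inputs` is a hypothetical name, not part of Flytekit, and whether a dict is accepted by the backend may still depend on the Flytekit and Flyte versions involved.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Dict


@dataclass
class SinkTable:
    name: str
    id: int


def to_plain_inputs(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: replace dataclass instances with plain dicts."""
    return {
        # is_dataclass() is also true for the class itself, so skip types.
        key: asdict(value)
        if is_dataclass(value) and not isinstance(value, type)
        else value
        for key, value in kwargs.items()
    }


inputs = to_plain_inputs(sink_table=SinkTable(name="my_table", id=123))
# The result has the same shape as the dict passed to remote.execute above.
```

`asdict` recurses into nested dataclasses, so nested structs are converted as well.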
b
@ancient-wolf-19325 Even using the dict-like input, I see the same error.
j
Can you provide the versions of Flytekit (on your client) and the Flyte backend you are using, and confirm whether you have set the environment variable FLYTE_USE_OLD_DC_FORMAT to true when running FlyteRemote?
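One way to collect the client-side half of that information is sketched below, using only the standard library. `flyte_client_info` is a hypothetical helper name. The backend version is not visible from the client environment this way and has to be checked against the deployment itself.

```python
import os
from importlib.metadata import PackageNotFoundError, version


def flyte_client_info() -> dict:
    """Hypothetical helper: report the client-side details asked for above."""
    try:
        flytekit_version = version("flytekit")
    except PackageNotFoundError:
        flytekit_version = None  # flytekit is not installed in this environment
    return {
        "flytekit_version": flytekit_version,
        # None means the variable is unset in this process.
        "FLYTE_USE_OLD_DC_FORMAT": os.environ.get("FLYTE_USE_OLD_DC_FORMAT"),
    }


print(flyte_client_info())
```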
b
@ancient-wolf-19325 How do I set FLYTE_USE_OLD_DC_FORMAT in FlyteRemote?
j
Set the FLYTE_USE_OLD_DC_FORMAT environment variable in the Python process before you create or use FlyteRemote. You can do this at the top of your script:
import os
os.environ["FLYTE_USE_OLD_DC_FORMAT"] = "true"

from flytekit.remote import FlyteRemote
# ... rest of your code
This ensures Flytekit serializes dataclasses using the legacy Protobuf struct format, which may be required for compatibility with older Flyte backends. See Flyte dataclass documentation and Flyte PR 5991. Would you like more detail on version compatibility or troubleshooting?
Sources:
- Flyte dataclass documentation
- Flyte PR 5991
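The thread does not say whether Flytekit reads the variable at import time or at serialization time, so setting it before `flytekit` is imported is the conservative choice. The sketch below sets the flag and reports whether that ordering held. `enable_old_dc_format` is a hypothetical helper name, and the import-order concern is an assumption rather than documented behavior.

```python
import os
import sys


def enable_old_dc_format() -> bool:
    """Hypothetical helper: set the flag and report whether it was set early.

    Returns True if flytekit had not been imported yet when the flag was set.
    """
    set_before_import = "flytekit" not in sys.modules
    os.environ["FLYTE_USE_OLD_DC_FORMAT"] = "true"
    return set_before_import


set_early = enable_old_dc_format()
if not set_early:
    print("warning: flytekit was imported before the flag was set")
```

Setting the variable in the shell or process manager that launches the script avoids the ordering question entirely.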
b
@ancient-wolf-19325 How do you pass the dataclass to subworkflows?
j
To pass a dataclass to subworkflows in Flyte, simply use the dataclass as an input/output type in both the parent and subworkflow signatures. Flytekit will handle serialization and deserialization automatically, allowing you to pass the dataclass instance directly between workflows and tasks. See the example below:
from dataclasses import dataclass
from flytekit import task, workflow

@dataclass
class MyData:
    foo: int
    bar: str

@task
def t1(x: MyData) -> MyData:
    return x

@workflow
def subwf(x: MyData) -> MyData:
    return t1(x=x)

@workflow
def main_wf(x: MyData) -> MyData:
    return subwf(x=x)
This pattern is supported and tested in Flyte, including for nested dataclasses. For more, see this integration test example. Would you like more detail or examples with nested dataclasses?
Sources:
- Flyte dataclass_with_optional_wf.py example
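For the nested case mentioned above, a sketch might look like the following. The `Inner` and `Outer` types and the `bump` task are made up for illustration. The `try`/`except` fallback is also an assumption added only so the sketch runs as plain Python where `flytekit` is not installed, and it is not something a real Flyte project needs.

```python
from dataclasses import dataclass

try:
    from flytekit import task, workflow
except ImportError:
    # Assumption for illustration only: no-op stand-ins so the sketch
    # still runs as plain Python where flytekit is not installed.
    def task(fn):
        return fn

    workflow = task


@dataclass
class Inner:
    foo: int


@dataclass
class Outer:
    inner: Inner
    bar: str


@task
def bump(x: Outer) -> Outer:
    # Return a new Outer whose nested field is incremented.
    return Outer(inner=Inner(foo=x.inner.foo + 1), bar=x.bar)


@workflow
def subwf(x: Outer) -> Outer:
    return bump(x=x)


@workflow
def main_wf(x: Outer) -> Outer:
    return subwf(x=x)


result = main_wf(x=Outer(inner=Inner(foo=1), bar="a"))
```

The nested value is only read inside the task. The workflows pass `x` through unchanged, which matches the pattern in the example above.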