< ancient wolf 19325> I am getting error but in my classclas Flyte #ask-ai

microscopic-needle-82602

10/07/2024, 2:12 PM

@ancient-wolf-19325 I am getting this error, but in my class I am not specifying `DataClassJsonMixin` or `dataclass_json`:

"Failed to extract schema for object <class 'ml_pipeline_components.data_processing.dataset.Image'>, error: maximum recursion depth exceeded while calling a Python object\nPlease remove `DataClassJsonMixin` and `dataclass_json` decorator from the dataclass definition"}

jolly-florist-66728

10/07/2024, 2:12 PM

The error you're encountering suggests that Flyte is having trouble extracting the schema for your dataclass, likely due to a recursion issue. According to the Flyte documentation, if you're using Flytekit version >= v1.11.1, you don't need to decorate your dataclass with `@dataclass_json` or inherit from `DataClassJSONMixin`. Would you like more details on how to resolve this issue?

Sources:
- Flyte Dataclass Documentation

microscopic-needle-82602

10/07/2024, 2:13 PM

@ancient-wolf-19325 how do i check my flytekit version?

jolly-florist-66728

10/07/2024, 2:13 PM

To check your Flytekit version, you can query the installed package from the environment where Flytekit is installed, for example with pip:

    pip show flytekit

This prints the package metadata, including the installed version. Would you like more details on this?

Sources:
- Flytekit CLI Documentation
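The installed version can also be read from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of Flytekit, and it returns `None` when the package is not installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "flytekit") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string, or None when flytekit is missing
    print(installed_version())
```

Run it with the same interpreter that executes your tasks, since a different virtual environment may hold a different Flytekit version.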

microscopic-needle-82602

10/07/2024, 2:19 PM

@ancient-wolf-19325 My loguru logs are not showing on the flyte console

jolly-florist-66728

10/07/2024, 2:19 PM

To ensure that your Loguru logs are visible in the Flyte console, you need to configure Flyte to capture and display these logs. According to Flyte's logging documentation, you can set up Flyte to connect to an external logging system like CloudWatch, Stackdriver, etc. Additionally, you need to ensure that Loguru is properly configured to work with Flyte's logging system. Here is an example of how to configure Loguru to integrate with Flyte's logging:

    import os
    import sys

    from flytekit import task
    from loguru import logger

    # Drop Loguru's default sink, then log to stdout at the level
    # taken from Flyte's logging environment variable
    logger.remove()
    logger.add(sys.stdout, level=os.getenv("FLYTE_SDK_LOGGING_LEVEL", "INFO"))

    # Example task
    @task
    def my_task():
        logger.info("This is an info message")
        logger.warning("This is a warning message")
        logger.error("This is an error message")

Make sure to set the `FLYTE_SDK_LOGGING_LEVEL` environment variable to the desired logging level (e.g., "INFO", "DEBUG"). Would you like more details on configuring Flyte for external logging systems?

Sources:
- https://github.com/flyteorg/flyte/issues/4830
- https://github.com/flyteorg/flyte/issues/4829

microscopic-needle-82602

10/07/2024, 2:36 PM

@ancient-wolf-19325 my dataclass has another dataclass nested, does Flyte support this

jolly-florist-66728

10/07/2024, 2:36 PM

Flyte currently has limited support for nested dataclasses. According to this GitHub issue, accessing attributes on nested dataclasses can lead to type mismatches during workflow compilation. This limitation forces users to create numerous wrappers for inputs and outputs, complicating the workflow. Would you like more details on potential workarounds or further explanation of the issue?

Sources:
- BUG: Accessing attributes fails on complex types
- Docs: Explicitly note that nested dataclasses are not supported in the type system

microscopic-needle-82602

10/07/2024, 2:42 PM

@ancient-wolf-19325 If i want to pass around nested dataclass, how can I do that in Flyte?

jolly-florist-66728

10/07/2024, 2:42 PM

To pass around nested dataclasses in Flyte, you can use a workaround by serializing the nested dataclass into a JSON string and then deserializing it back into a dataclass within your tasks. This approach bypasses the current limitations of Flyte's type system regarding nested dataclasses. Here's an example of how you can achieve this:

1. Define your nested dataclasses:

    from dataclasses import dataclass
    from mashumaro.mixins.json import DataClassJSONMixin

    @dataclass
    class InnerDataClass(DataClassJSONMixin):
        a: int
        b: str

    @dataclass
    class OuterDataClass(DataClassJSONMixin):
        x: int
        y: InnerDataClass

2. Serialize and deserialize the dataclasses in your tasks:

    from flytekit import task, workflow

    @task
    def serialize_dataclass(data: OuterDataClass) -> str:
        # to_json() comes from mashumaro's DataClassJSONMixin
        return data.to_json()

    @task
    def deserialize_dataclass(data_str: str) -> OuterDataClass:
        # from_json() rebuilds the nested InnerDataClass as well
        return OuterDataClass.from_json(data_str)

    @workflow
    def my_workflow() -> OuterDataClass:
        inner = InnerDataClass(a=1, b="test")
        outer = OuterDataClass(x=10, y=inner)
        serialized = serialize_dataclass(data=outer)
        deserialized = deserialize_dataclass(data_str=serialized)
        return deserialized

3. Run the workflow:

    if __name__ == "__main__":
        result = my_workflow()
        print(result)

This approach ensures that the nested dataclass is passed around as a JSON string, which Flyte can handle without issues. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- BUG: Accessing attributes fails on complex types
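To see what the JSON round trip involves without any Flyte or mashumaro dependency, here is a minimal standard-library sketch. The class names mirror the example above; `to_json_str` and `from_json_str` are hypothetical helpers for illustration, and the nested object has to be rebuilt by hand because `json.loads` only returns plain dicts.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class InnerDataClass:
    a: int
    b: str


@dataclass
class OuterDataClass:
    x: int
    y: InnerDataClass


def to_json_str(data: OuterDataClass) -> str:
    # asdict() recurses into nested dataclasses and returns plain dicts
    return json.dumps(asdict(data))


def from_json_str(data_str: str) -> OuterDataClass:
    raw = json.loads(data_str)
    # The nested dict must be turned back into InnerDataClass explicitly
    return OuterDataClass(x=raw["x"], y=InnerDataClass(**raw["y"]))
```

A mixin such as mashumaro's generates the equivalent of these two functions for you, which is why the example above only calls `to_json()` and `from_json()`.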

microscopic-needle-82602

10/07/2024, 3:05 PM

@ancient-wolf-19325 I am trying to pass a enum in the workflow params and use the value and pass it to a task. But I am getting error

KeyError: "Error encountered while executing 'prepare_dataset_workflow':\n Promise(node:.dataset_type.['value'])"

jolly-florist-66728

10/07/2024, 3:05 PM

The error you're encountering suggests that there might be an issue with how the enum is being passed and used within the workflow. Flyte supports using enums as inputs and outputs for tasks and workflows, but there are specific ways to define and use them. Here's a step-by-step example of how to correctly pass an enum in the workflow parameters and use it in a task:

1. Define the Enum:

    from enum import Enum
    from flytekit import task, workflow

    class DatasetType(Enum):
        IMAGE = "image"
        TEXT = "text"

2. Define the Task:

    @task
    def process_dataset(dataset_type: DatasetType) -> str:
        # Inside a task the argument is a real enum member, so .value works
        return f"Processing dataset of type: {dataset_type.value}"

3. Define the Workflow:

    @workflow
    def prepare_dataset_workflow(dataset_type: DatasetType) -> str:
        # Pass the enum through unchanged; do not access .value here
        return process_dataset(dataset_type=dataset_type)

4. Run the Workflow:

    if __name__ == "__main__":
        result = prepare_dataset_workflow(dataset_type=DatasetType.IMAGE)
        print(result)

Inside a workflow body, inputs are promises rather than concrete values, so `dataset_type.value` cannot be resolved there, which matches the `Promise(node:.dataset_type.['value'])` in your error. Pass the enum itself to the task and read `.value` inside the task, and ensure that the enum is properly imported and used consistently across tasks and workflows. Would you like more details on this example or any specific part of the implementation?

Sources:
- Flyte Enum Type Documentation
- Flytesnacks Enum Example
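The enum handling itself can be checked in plain Python before involving Flyte. A small sketch, assuming the same `DatasetType` as above; `describe` stands in for a task body and `parse_dataset_type` is a hypothetical helper showing the lookup by string value.

```python
from enum import Enum


class DatasetType(Enum):
    IMAGE = "image"
    TEXT = "text"


def describe(dataset_type: DatasetType) -> str:
    # Stand-in for a task body: the argument is a concrete enum member here
    return f"Processing dataset of type: {dataset_type.value}"


def parse_dataset_type(raw: str) -> DatasetType:
    # Look up a member by its string value; raises ValueError if unknown
    return DatasetType(raw)
```

For example, `describe(parse_dataset_type("image"))` yields the same string as calling `describe(DatasetType.IMAGE)`.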

microscopic-needle-82602

10/07/2024, 3:09 PM

@ancient-wolf-19325 I am manually serializing the dataclass by doing json.dumps but getting error

RecursionError: maximum recursion depth exceeded while calling a Python object

jolly-florist-66728

10/07/2024, 3:09 PM

The `RecursionError: maximum recursion depth exceeded while calling a Python object` error typically occurs when there is a circular reference or deeply nested structure that Python's default `json` module cannot handle. To avoid this, you can use the `mashumaro` library, which Flyte already uses for dataclass serialization and deserialization. Here's how you can use `mashumaro` to serialize and deserialize your dataclass:

1. Define your dataclasses with `mashumaro`:

    from dataclasses import dataclass
    from mashumaro.mixins.json import DataClassJSONMixin

    @dataclass
    class InnerDataClass(DataClassJSONMixin):
        a: int
        b: str

    @dataclass
    class OuterDataClass(DataClassJSONMixin):
        x: int
        y: InnerDataClass

2. Serialize and deserialize using `mashumaro`:

    from flytekit import task, workflow

    @task
    def serialize_dataclass(data: OuterDataClass) -> str:
        return data.to_json()

    @task
    def deserialize_dataclass(data_str: str) -> OuterDataClass:
        return OuterDataClass.from_json(data_str)

    @workflow
    def my_workflow() -> OuterDataClass:
        inner = InnerDataClass(a=1, b="test")
        outer = OuterDataClass(x=10, y=inner)
        serialized = serialize_dataclass(data=outer)
        deserialized = deserialize_dataclass(data_str=serialized)
        return deserialized

3. Run the workflow:

    if __name__ == "__main__":
        result = my_workflow()
        print(result)

By using `mashumaro`, serialization goes through the `to_json()` and `from_json()` methods generated for each dataclass instead of a hand-written `json.dumps` call. If the objects themselves reference each other in a cycle, that cycle still has to be broken before serializing. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- Flyte Dataclass Documentation
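Since a circular reference is one of the causes named above, it can help to check the object graph before serializing. The following is a sketch only: `Node` and `has_cycle` are hypothetical names, and the walk covers dataclass fields, lists, tuples, and dict values, nothing else.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional, Set


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None


def has_cycle(obj: Any, _stack: Optional[Set[int]] = None) -> bool:
    """Return True if following fields or items leads back to an ancestor object."""
    stack = _stack if _stack is not None else set()
    if is_dataclass(obj) and not isinstance(obj, type):
        children = [getattr(obj, f.name) for f in fields(obj)]
    elif isinstance(obj, (list, tuple)):
        children = list(obj)
    elif isinstance(obj, dict):
        children = list(obj.values())
    else:
        return False  # scalars and unknown types cannot contain a cycle here
    if id(obj) in stack:
        return True  # we reached an object that is still being visited
    stack.add(id(obj))
    found = any(has_cycle(child, stack) for child in children)
    stack.discard(id(obj))
    return found
```

If `has_cycle(my_object)` is true, recursive serializers such as `dataclasses.asdict` will not terminate normally on that object, so the back-reference needs to be dropped or replaced by an identifier first.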

microscopic-needle-82602

10/07/2024, 3:39 PM

@ancient-wolf-19325 I am getting this after serializing my dataclass and try to pass it to another task

Workflow[krystal-test:development:workflows.prepare_dataset.prepare_dataset_workflow] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: output file @[<s3://flyte-metadata.dev-cluster.degould.com/metadata/propeller/krystal-test-development-ax2mzg5vrzh2rjvs9mxp/n0/data/0/outputs.pb>] is too large [3282219] bytes, max allowed [2097152] bytes

jolly-florist-66728

10/07/2024, 3:39 PM

The error you're encountering indicates that the serialized output of your dataclass exceeds the maximum allowed size for Flyte's output files. By default, Flyte has a limit on the size of outputs that can be passed between tasks, which is typically set to 2MB. To handle large outputs, you can use Flyte's built-in support for offloading large data to an object store (e.g., S3) using `FlyteFile` or `FlyteDirectory`. This approach allows you to store large data externally and pass references to this data between tasks. Here's how you can modify your workflow to use `FlyteFile` for large outputs:

1. Modify your tasks to use `FlyteFile`:

    from dataclasses import dataclass

    from flytekit import task, workflow
    from flytekit.types.file import FlyteFile
    from mashumaro.mixins.json import DataClassJSONMixin

    @dataclass
    class InnerDataClass(DataClassJSONMixin):
        a: int
        b: str

    @dataclass
    class OuterDataClass(DataClassJSONMixin):
        x: int
        y: InnerDataClass

    @task
    def serialize_dataclass(data: OuterDataClass) -> FlyteFile:
        # Write the JSON to a local file; Flyte uploads it to the object store
        file_path = "/tmp/serialized_data.json"
        with open(file_path, "w") as f:
            f.write(data.to_json())
        return FlyteFile(file_path)

    @task
    def deserialize_dataclass(file: FlyteFile) -> OuterDataClass:
        # download() fetches the file and returns its local path
        with open(file.download(), "r") as f:
            data_str = f.read()
        return OuterDataClass.from_json(data_str)

2. Modify your workflow to use the updated tasks:

    @workflow
    def my_workflow() -> OuterDataClass:
        inner = InnerDataClass(a=1, b="test")
        outer = OuterDataClass(x=10, y=inner)
        serialized_file = serialize_dataclass(data=outer)
        deserialized = deserialize_dataclass(file=serialized_file)
        return deserialized

3. Run the workflow:

    if __name__ == "__main__":
        result = my_workflow()
        print(result)

By using `FlyteFile`, you can offload the large serialized data to an external storage system, avoiding the size limitations of Flyte's internal metadata storage. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- https://discuss.flyte.org/t/13152245/hi-i-just-experienced-this-error-failed-at-node-n4-bindingre/
- https://github.com/flyteorg/flyte/pull/5607
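A quick local check can tell whether a serialized payload is likely to hit the limit before you run the workflow. This sketch assumes the 2097152-byte limit quoted in the error message; `payload_size` and `fits_inline` are hypothetical helpers, and the JSON byte count is only an approximation of the size of the `outputs.pb` file Flyte actually writes.

```python
# 2 MiB, the "max allowed" value quoted in the error message
MAX_INLINE_BYTES = 2 * 1024 * 1024


def payload_size(payload: str) -> int:
    # Measure bytes, not characters: non-ASCII text takes more than one byte each
    return len(payload.encode("utf-8"))


def fits_inline(payload: str, limit: int = MAX_INLINE_BYTES) -> bool:
    """Return True if the serialized payload is within the size limit."""
    return payload_size(payload) <= limit
```

For instance, `fits_inline(outer.to_json())` returning `False` is a hint that the value is better passed as a `FlyteFile` than as a string.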

microscopic-needle-82602

10/07/2024, 3:55 PM

@ancient-wolf-19325 can i not increase the output?

jolly-florist-66728

10/07/2024, 3:55 PM

Yes, you can increase the maximum allowed output size in Flyte by modifying the `max-output-size-bytes` configuration parameter. This can be done by updating the Flyte configuration, either through the Helm values file or directly in the Flyte configuration files. Here are the steps to increase the output size limit:

1. If using Helm:
- Update the `values.yaml` file for your Flyte deployment to include the `max-output-size-bytes` parameter under the `propeller` configuration.

    configmap:
      core:
        propeller:
          max-output-size-bytes: 20971520  # 20MB

- Apply the changes by upgrading your Helm release:

    helm upgrade -f values.yaml <release-name> flyteorg/flyte

2. If not using Helm:
- Edit the Flyte Propeller configuration directly. Locate the `flyte-propeller-config` ConfigMap in your Kubernetes cluster and add the `max-output-size-bytes` parameter.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: flyte-propeller-config
      namespace: <namespace>
    data:
      core.yaml: |
        propeller:
          max-output-size-bytes: 20971520  # 20MB

- Apply the changes and restart the Flyte Propeller pod to pick up the new configuration.

3. Example Configuration:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: flyte-propeller-config
      namespace: flyte
    data:
      core.yaml: |
        propeller:
          max-output-size-bytes: 20971520  # 20MB

- Apply the changes:

    kubectl apply -f flyte-propeller-config.yaml
    kubectl rollout restart deployment/flyte-propeller -n <namespace>

By increasing the `max-output-size-bytes` parameter, you can allow larger outputs to be passed between tasks in your Flyte workflows. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- https://discuss.flyte.org/t/12964146/hi-there-how-to-increase-the-allowed-output-size-of-the-node/
- https://discuss.flyte.org/t/16023116/hi-everyone-is-it-possible-to-change-https-github-com-flyteo/
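The byte values in these configs are binary megabytes (MiB) written out in full. A small sketch of the arithmetic, with `mib_to_bytes` as a hypothetical helper name:

```python
def mib_to_bytes(mib: int) -> int:
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


# Values that appear in this thread
DEFAULT_LIMIT = mib_to_bytes(2)   # the limit reported in the error message
RAISED_LIMIT = mib_to_bytes(20)   # the value used in the config examples
```

Here `DEFAULT_LIMIT` is 2097152 and `RAISED_LIMIT` is 20971520, so the 3282219-byte output from the error falls between the two.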

microscopic-needle-82602

10/07/2024, 3:57 PM

@ancient-wolf-19325 how about in flyte binary helm?

jolly-florist-66728

10/07/2024, 3:57 PM

Sorry, there was an internal error when processing your request. Consider starting a new conversation? Please contact support if this problem persists.

microscopic-needle-82602

10/07/2024, 4:03 PM

@ancient-wolf-19325 how do i change `max-output-size-bytes` in flyte-binary helmchart?

jolly-florist-66728

10/07/2024, 4:03 PM

To change the `max-output-size-bytes` parameter in the Flyte-binary Helm chart, you need to update the `values.yaml` file to include this configuration under the `propeller` section. This can be done by adding the configuration inline. Here are the steps to achieve this:

1. Update the `values.yaml` file:
- Add the `max-output-size-bytes` parameter under the `propeller` configuration in the `values.yaml` file.

    configuration:
      inline:
        propeller:
          max-output-size-bytes: 20971520  # 20MB

2. Apply the changes by upgrading your Helm release:
- Use the following command to upgrade your Helm release with the updated `values.yaml` file:

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

3. Example `values.yaml` file:

    configuration:
      inline:
        propeller:
          max-output-size-bytes: 20971520  # 20MB

4. Apply the changes:
- Run the Helm upgrade command to apply the changes:

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

By following these steps, you can increase the `max-output-size-bytes` parameter in the Flyte-binary Helm chart, allowing larger outputs to be passed between tasks in your Flyte workflows. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- https://discuss.flyte.org/t/16023116/hi-everyone-is-it-possible-to-change-https-github-com-flyteo/
- https://github.com/flyteorg/flyte/issues/4951

microscopic-needle-82602

10/07/2024, 4:08 PM

RuntimeExecutionError: failed during plugin execution, caused by: output file @[<s3://flyte-metadata.dev-cluster.degould.com/metadata/propeller/krystal-test-development-ansg74s78fmwshssndnp/n0/data/0/outputs.pb>] is too large [3282219] bytes, max allowed [2097152] bytes

I am still getting the same error @ancient-wolf-19325

jolly-florist-66728

10/07/2024, 4:08 PM

If you are still encountering the same error after updating the `max-output-size-bytes` parameter, it is possible that the configuration changes have not been applied correctly or the Flyte Propeller pod has not been restarted to pick up the new configuration. Here are the steps to ensure the changes are applied correctly:

1. Verify the ConfigMap:
- Check the `flyte-propeller-config` ConfigMap to ensure that the `max-output-size-bytes` parameter is correctly set.

    kubectl get configmap flyte-propeller-config -n <namespace> -o yaml

- Ensure that the output contains the `max-output-size-bytes` parameter with the correct value.

2. Restart the Flyte Propeller Pod:
- Restart the Flyte Propeller pod to ensure it picks up the new configuration.

    kubectl rollout restart deployment/flyte-propeller -n <namespace>

3. Example `values.yaml` for Flyte-binary:
- Ensure your `values.yaml` file looks like this:

    configuration:
      inline:
        propeller:
          max-output-size-bytes: 20971520  # 20MB

4. Apply the Helm Upgrade:
- Run the Helm upgrade command again to ensure the changes are applied:

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

5. Verify the Changes:
- After restarting the Flyte Propeller pod, verify that the changes have been applied by checking the logs of the Flyte Propeller pod.

    kubectl logs <flyte-propeller-pod-name> -n <namespace>

If the issue persists, it might be helpful to increase the `max-output-size-bytes` value further to ensure it accommodates the size of your outputs. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- https://discuss.flyte.org/t/16023116/hi-everyone-is-it-possible-to-change-https-github-com-flyteo/
- https://github.com/flyteorg/flyte/issues/4951
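One way to tell whether a new limit is in effect is to read the "max allowed" figure straight out of the error text. A minimal sketch, assuming the message format shown in this thread; `parse_size_error` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Matches the tail of the error message quoted in this thread
_SIZE_ERROR = re.compile(r"is too large \[(\d+)\] bytes, max allowed \[(\d+)\] bytes")


def parse_size_error(message: str) -> Optional[Tuple[int, int]]:
    """Return (actual_bytes, max_allowed_bytes), or None if the message does not match."""
    match = _SIZE_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

If the second number is still 2097152 after the upgrade, the running component has not picked up the changed value, and raising the number further will not help until it does.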

microscopic-needle-82602

10/07/2024, 4:15 PM

@ancient-wolf-19325 I have verified the configmap has been updated but still getting the same error when running workflow

jolly-florist-66728

10/07/2024, 4:15 PM

If the `configmap` has been updated and the Flyte Propeller pod has been restarted but you are still encountering the same error, there might be additional configurations or steps that need to be addressed. Here are some additional steps to troubleshoot and resolve the issue:

1. Verify the Configuration in the Pod:
- Ensure that the Flyte Propeller pod is using the updated configuration. You can check the pod's environment variables or configuration files to confirm that the `max-output-size-bytes` parameter is set correctly.

    kubectl exec -it <flyte-propeller-pod-name> -n <namespace> -- cat /etc/flyte/config/*.yaml

2. Check for Other Limits:
- Ensure there are no other limits set in the Flyte configuration that might be causing the issue. For example, check the `flyteadmin` and `flyteconsole` configurations for any size limits.

3. Increase the Limit Further:
- Try increasing the `max-output-size-bytes` value further to ensure it accommodates the size of your outputs. For example, set it to 50MB (52428800 bytes).

    configuration:
      inline:
        propeller:
          max-output-size-bytes: 52428800  # 50MB

4. Check Flyte Propeller Logs:
- Check the logs of the Flyte Propeller pod for any errors or warnings that might provide more insight into why the limit is being exceeded.

    kubectl logs <flyte-propeller-pod-name> -n <namespace>

5. Ensure All Components are Updated:
- Ensure that all Flyte components (Flyte Propeller, Flyte Admin, Flyte Console) are using the updated configuration. Sometimes, changes in one component might not propagate correctly to others.

6. Example `values.yaml` for Flyte-binary:
- Ensure your `values.yaml` file looks like this and reapply the Helm upgrade:

    configuration:
      inline:
        propeller:
          max-output-size-bytes: 52428800  # 50MB

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

7. Restart All Flyte Components:
- Restart all Flyte components to ensure they pick up the new configuration.

    kubectl rollout restart deployment/flyte-propeller -n <namespace>
    kubectl rollout restart deployment/flyte-admin -n <namespace>
    kubectl rollout restart deployment/flyte-console -n <namespace>

If the issue persists after following these steps, it might be helpful to provide more details about your Flyte setup and configuration for further troubleshooting. Would you like more details on any specific part of the implementation or additional troubleshooting steps?

Sources:
- https://github.com/flyteorg/flyte/issues/4951

microscopic-needle-82602

10/08/2024, 8:40 AM

@ancient-wolf-19325 The Flyte console is not showing the logs, but I can see them in my pod in the k8s cluster

jolly-florist-66728

10/08/2024, 8:40 AM

If the Flyte console is not showing the logs but you can see them in your Kubernetes pod, it indicates that the logs are not being correctly forwarded or linked to the Flyte console. Here are steps to ensure that logs are correctly configured and visible in the Flyte console:

1. Configure Logging Links:
- Ensure that the Flyte configuration includes the correct logging links. This configuration allows Flyte to link to the logs stored in your logging system (e.g., CloudWatch, Stackdriver, etc.). Example configuration for CloudWatch:

    task_logs:
      plugins:
        logs:
          templates:
            - displayName: "CloudWatch Logs"
              templateUris:
                - "https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/flyte-production/kubernetes;stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"

2. Update the Flyte Configuration:
- Update the Flyte configuration to include the logging links. This can be done in the `values.yaml` file for the Flyte Helm chart.

    configuration:
      inline:
        task_logs:
          plugins:
            logs:
              templates:
                - displayName: "CloudWatch Logs"
                  templateUris:
                    - "https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/flyte-production/kubernetes;stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"

3. Apply the Helm Upgrade:
- Apply the changes by upgrading your Helm release:

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

4. Restart Flyte Components:
- Restart the Flyte components to ensure they pick up the new configuration.

    kubectl rollout restart deployment/flyte-propeller -n <namespace>
    kubectl rollout restart deployment/flyte-admin -n <namespace>
    kubectl rollout restart deployment/flyte-console -n <namespace>

5. Verify the Configuration:
- Verify that the Flyte console is now showing the logs. You can check the Flyte console UI to see if the logs are linked correctly.

6. Check Flyte Console Logs:
- If the logs are still not showing, check the logs of the Flyte console pod for any errors or warnings that might provide more insight.

    kubectl logs <flyte-console-pod-name> -n <namespace>

7. Example `values.yaml` for Flyte-binary:
- Ensure your `values.yaml` file looks like this and reapply the Helm upgrade:

jolly-florist-66728

10/08/2024, 8:41 AM

    configuration:
      inline:
        task_logs:
          plugins:
            logs:
              templates:
                - displayName: "CloudWatch Logs"
                  templateUris:
                    - "https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/flyte-production/kubernetes;stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"

    helm upgrade -f values.yaml <release-name> flyteorg/flyte-binary

By following these steps, you should be able to configure Flyte to correctly link and display logs in the Flyte console. Would you like more details on this approach or any other specific part of the implementation?

Sources:
- Configuring Logging Links in the UI
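To preview what a `templateUris` entry expands to for a given pod, the placeholders can be substituted by hand. This is an illustration only: `render_log_uri` is a hypothetical helper doing plain string replacement, not Flyte's own template engine, and the pod values in the usage note are made up.

```python
from typing import Dict


def render_log_uri(template: str, values: Dict[str, str]) -> str:
    """Replace each {{.name}} placeholder with the matching entry from `values`."""
    uri = template
    for key, value in values.items():
        uri = uri.replace("{{." + key + "}}", value)
    return uri


# The stream part of the CloudWatch template shown above
STREAM_TEMPLATE = (
    "var.log.containers.{{.podName}}_{{.namespace}}_"
    "{{.containerName}}-{{.containerId}}.log"
)
```

Calling `render_log_uri(STREAM_TEMPLATE, {"podName": "my-pod", "namespace": "development", "containerName": "main", "containerId": "abc123"})` gives a stream name you can compare against what your logging system actually contains; a mismatch there would explain a link that opens but shows no logs.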
