Hello Everyone I am new so please let me know if I am breaki Flyte #flyte-support

bumpy-pager-32541

04/29/2024, 12:00 AM

Hello Everyone! I am new, so please let me know if I am breaking any etiquette. I am trying to build on the following example, modifying it slightly to use the mlflow_autolog() decorator for automatic experiment tracking: https://github.com/flyteorg/flytekit-python-template/tree/main/wine-classification/%7B%7Bcookiecutter.project_name%7D%7D I have it running fine locally.

pyflyte run wine_classification_example.py training_workflow

And I have it running fine in the local flytectl demo cluster if I comment out the mlflow_autolog and mlflow dependencies.

pyflyte run --remote -p my-project-wine-4 -d development wine_classification_example.py training_workflow

I think my only problem at this point is changing the requirements.txt for the demo cluster. When I run in the demo cluster it cannot find the libraries and errors on the following lines.

from flytekitplugins.mlflow import mlflow_autolog

import mlflow

How can I add to the requirements in the demo cluster? Ideally I would like to do it using a requirements.txt and Dockerfile, but I will take anything that works at this point...

freezing-airport-6809

04/29/2024, 12:48 AM

Check out ImageSpec

freezing-airport-6809

04/29/2024, 12:49 AM

https://docs.flyte.org/en/latest/user_guide/customizing_dependencies/index.html

bumpy-pager-32541

04/29/2024, 1:04 AM

Is there a good example or tutorial for this? I was experimenting with this a bit earlier today and struggling. And now I have been looking at the documentation and trying lots of different things with no luck.

freezing-airport-6809

04/29/2024, 3:59 AM

Ohh that is sad

freezing-airport-6809

04/29/2024, 3:59 AM

Yes, there are a few examples

freezing-airport-6809

04/29/2024, 4:00 AM

But we would love to learn what was not working

freezing-airport-6809

04/29/2024, 4:00 AM

Cc @tall-lock-23197 can you share some examples for imagespec here

freezing-airport-6809

04/29/2024, 4:32 AM

also @glamorous-carpet-83516 or @tall-lock-23197 help here

glamorous-carpet-83516

04/29/2024, 4:42 AM

here is an example to use mlflow plugin with ImageSpec. https://github.com/flyteorg/flytesnacks/blob/master/examples/mlflow_plugin/mlflow_plugin/mlflow_example.py#L10-L73

bumpy-pager-32541

04/29/2024, 4:42 AM

I will send a bit more context on my current issue. Here is the workflow I have modified: https://github.com/tchase56/flyte_demo/blob/main/wine-classification/workflows/wine_classification_example.py It works OK locally, although I get some warnings. Then when I run in the demo cluster I get the following error, likely because I am doing the ImageSpec wrong.

glamorous-carpet-83516

04/29/2024, 4:45 AM

could you remove

if sklearn_image_spec.is_container():

and try to run it again

👍 1

glamorous-carpet-83516

04/29/2024, 4:45 AM

for some reason, flytekit doesn’t import mlflow_autolog

glamorous-carpet-83516

04/29/2024, 4:46 AM

btw, could you add sklearn_image_spec to the get_data, training_model_loop, and process_data tasks as well

👍 1

bumpy-pager-32541

04/29/2024, 4:52 AM

It seems to be hanging on this command. When this happens, should I kill the process and start again in general?

glamorous-carpet-83516

04/29/2024, 4:58 AM

odd. yes, kill it and try it again

bumpy-pager-32541

04/29/2024, 5:00 AM

It keeps getting stuck. I guess I will try rebuilding the demo cluster, and if that doesn't work, restart VS Code.

glamorous-carpet-83516

04/29/2024, 5:01 AM

let me try your example. one sec

glamorous-carpet-83516

04/29/2024, 5:04 AM

it works for me

bumpy-pager-32541

04/29/2024, 5:09 AM

mine keeps getting stuck, that is so weird. you think it is an environment issue?

glamorous-carpet-83516

04/29/2024, 5:24 AM

hmm, did you see the same issue before

bumpy-pager-32541

04/29/2024, 5:24 AM

I got this issue after adding the sklearn image spec to all of the tasks

glamorous-carpet-83516

04/29/2024, 5:25 AM

you already install flytekitplugins-envd, right

bumpy-pager-32541

04/29/2024, 5:26 AM

I ran this in my virtual environment:

pip install flytekitplugins-envd

glamorous-carpet-83516

04/29/2024, 5:27 AM

could you send me the example you run

bumpy-pager-32541

04/29/2024, 5:28 AM

yep

import pandas as pd
import numpy as np

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from flytekit import task, workflow
# import seaborn as sns

from typing import List, Tuple, Dict
from flytekit import ImageSpec

import mlflow
from flytekitplugins.mlflow import mlflow_autolog

sklearn_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.8-1.6.2",
    packages=["mlflow", "flytekitplugins-mlflow"],
    registry="localhost:30000"    
)

# if sklearn_image_spec.is_container():


@task(container_image=sklearn_image_spec)
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame

@task(container_image=sklearn_image_spec)
def process_data(data: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Simplify the task from a 3-class to a binary classification problem."""
    data_out = data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    data_out_train, data_out_test = train_test_split(data_out, test_size=0.2, random_state=42)

    train_x = data_out_train.drop("target", axis=1)
    test_x = data_out_test.drop("target", axis=1)
    train_y = data_out_train[["target"]]
    test_y = data_out_test[["target"]]

    return train_x, test_x, train_y, test_y
    
@task(container_image=sklearn_image_spec)
@mlflow_autolog(framework=mlflow.sklearn)
def train_model(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params: Dict[str, float]) -> Tuple[float, float, LogisticRegression]:
    """Train a model on the wine dataset."""

    lr = LogisticRegression(max_iter=3000, **params)
    lr.fit(train_x, train_y.iloc[:, 0])

    pred_y = lr.predict(test_x)
    mse = float(mean_squared_error(test_y, pred_y))
    mae = float(mean_absolute_error(test_y, pred_y))

    return mse, mae, lr


@task(container_image=sklearn_image_spec)
def training_model_loop(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]
) -> None:

    for params_i in params_list:
        print('ahhhh')
        print(params_i)
        rmse_i, mae_i, lr_i = train_model(
            train_x = train_x,
            test_x = test_x,
            train_y = train_y,
            test_y = test_y,
            params=params_i,
        )

@workflow
def training_workflow(params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]) -> None:
    """Put all of the steps together into a single workflow."""
    # raise Exception("This is a test")
    data = get_data()
    train_x, test_x, train_y, test_y = process_data(data=data)

    training_model_loop(
        train_x = train_x,
        test_x = test_x,
        train_y = train_y,
        test_y = test_y,
        params_list=params_list,
    )

if __name__ == "__main__":
    training_workflow(params_list=[{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])

glamorous-carpet-83516

04/29/2024, 5:29 AM

which version of flytekit are you using

bumpy-pager-32541

04/29/2024, 5:30 AM

1.11.0

>>> import flytekit
>>> flytekit.__version__
'1.11.0'

glamorous-carpet-83516

04/29/2024, 5:32 AM

could you show me the output of tree in the current directory?

bumpy-pager-32541

04/29/2024, 5:39 AM

this right?

(flyte_env) (base) tchase@HQ9322OSX wine-classification % pwd
/Users/tchase/Documents/repos/flyte_demo/wine-classification
(flyte_env) (base) tchase@HQ9322OSX wine-classification % tree .

It is spitting out a lot of stuff

glamorous-carpet-83516

04/29/2024, 5:42 AM

flytekit tries to copy your mlflow metadata to remote as well, I guess

glamorous-carpet-83516

04/29/2024, 5:43 AM

could you create a new directory, workflow, and put wine_classification_example.py inside it?

👍 1
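To see what is actually sitting next to the script before it gets packaged, a per-directory file count is easier to read than a full tree listing. This is a minimal sketch using only the standard library; file_counts_by_top_level is a hypothetical helper, and the guess that the bulk comes from a local mlruns directory (MLflow's default local tracking location when no tracking URI is set) is an assumption, not something confirmed in this thread.

```python
import os
from typing import Dict


def file_counts_by_top_level(root: str) -> Dict[str, int]:
    """Count files under each top-level entry of root.

    Files directly inside root are counted under the key ".".
    """
    counts: Dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Everything below e.g. "mlruns/0/..." is attributed to "mlruns".
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] = counts.get(top, 0) + len(filenames)
    return counts


if __name__ == "__main__":
    # Print the largest top-level entries first.
    for name, n in sorted(file_counts_by_top_level(".").items(), key=lambda kv: -kv[1]):
        print(f"{n:8d}  {name}")
```

If one entry dominates the count, moving the workflow file into its own directory, as suggested above, keeps that entry out of what gets uploaded.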

bumpy-pager-32541

04/29/2024, 5:49 AM

still getting stuck 😢

glamorous-carpet-83516

04/29/2024, 5:52 AM

your docker is also running, right

bumpy-pager-32541

04/29/2024, 6:00 AM

my docker is running. I just had to prune a bunch of stuff in order to rebuild my demo cluster. I re-ran and it is hanging again. I thought for sure the pruning would fix it... this is so frustrating

tall-lock-23197

04/29/2024, 6:03 AM

have you tried restarting docker?

bumpy-pager-32541

04/29/2024, 6:03 AM

I closed and re-opened Rancher Desktop before the latest attempt, yeah

bumpy-pager-32541

04/29/2024, 6:04 AM

maybe I should restart my computer... lol

bumpy-pager-32541

04/29/2024, 6:10 AM

alrighty then, restarting the computer stopped the hanging but gave me a new error

Failed to get signed url for script_mode.tar.gz, reason: SYSTEM:Unknown: error=None, cause=<_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:30080: Failed to connect to remote host: Connection refused"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:30080: Failed to connect to remote host: Connection refused", grpc_status:14, created_time:"2024-04-28T23:08:42.912323-07:00"}"

glamorous-carpet-83516

04/29/2024, 6:10 AM

is your sandbox running?

kubectl get pods
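A quick way to tell a stopped sandbox apart from some other networking problem is to check whether its ports accept TCP connections at all. The sketch below uses only the standard library; port_open is a hypothetical helper name, and the ports are the ones that appear in this thread (30080 for the Flyte endpoint in the error above, 30000 for the registry passed to ImageSpec). It complements kubectl get pods rather than replacing it.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Calling port_open("127.0.0.1", 30080) and port_open("127.0.0.1", 30000) after restarting the machine would show right away whether the demo cluster needs to be started again.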

tall-lock-23197

04/29/2024, 6:15 AM

you may also want to add print statements here after you resolve the above issue, in case the command still freezes: https://github.com/flyteorg/flytekit/blob/3966d1a0a1e33137a4bc41d9860d4ed5e264cbdf/flytekit/image_spec/image_spec.py#L218-L244

bumpy-pager-32541

04/29/2024, 6:18 AM

yep, when I restarted the computer it must have stopped the cluster. I rebuilt the demo cluster and now it is hanging again...

tall-lock-23197

04/29/2024, 6:19 AM

can you try this? https://flyte-org.slack.com/archives/CP2HDHKE1/p1714371343897929?thread_ts=1714348808.860049&cid=CP2HDHKE1

bumpy-pager-32541

04/29/2024, 6:32 AM

like this? Now it hangs and says hello world

tall-lock-23197

04/29/2024, 6:33 AM

can you add more prints so that we can find the root cause of this issue?

👍 1
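When narrowing down a hang with print statements, it helps to log both the start and the end of each suspect step, so the last "start" without a matching "end" points at the call that never returned. A minimal sketch, assuming nothing about flytekit internals; step is a hypothetical helper and the label is whatever you choose to call the step.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def step(label: str, log: Callable[[str], None] = print) -> Iterator[None]:
    """Log when a step starts and how long it took, even if it raises."""
    log(f"start: {label}")
    started = time.perf_counter()
    try:
        yield
    finally:
        log(f"end: {label} ({time.perf_counter() - started:.2f}s)")
```

Wrapping a suspect call as "with step("some label"): ..." prints one line before and one after it; a call that hangs leaves only the first line in the terminal.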

bumpy-pager-32541

04/29/2024, 6:40 AM

this should be slightly more helpful

tall-lock-23197

04/29/2024, 6:44 AM

i think image_spec.exist() is the culprit. can you verify please? also if that's the case, can you check if the client.images.get_registry_data(self.image_name()) line in exist() is the actual culprit?

bumpy-pager-32541

04/29/2024, 6:54 AM

Yep, that seems to be it

tall-lock-23197

04/29/2024, 7:02 AM

then it's related to docker. is it possible to switch context from rancher to docker desktop?

👍 1

tall-lock-23197

04/29/2024, 7:03 AM

i encountered a similar issue before when i used orbstack.

👍 1

bumpy-pager-32541

04/29/2024, 7:14 AM

I switched to docker, which fixed the hanging, but now a new error has popped up

failed to run command envd build --path /var/folders/_9/60k4g7zj1wq1kyl_pr7hq3k80000gq/T/flyteu4fgcrnq/control_plane_metadata/local_flytekit/ff10a1df7e677dbac1a0dcb8a118a235  --platform linux/amd64 --output type=image,name=localhost:30000/flytekit:1JrBkuUO0aml5MJRduDMXQ,push=true with error b'time="2024-04-29T00:13:31-07:00" level=fatal msg=exit app=envd error="failed to build the image: failed to build: failed to wait error group: failed to solve LLB: failed to solve: failed to do request: Head \\"http://localhost:30000/v2/flytekit/blobs/sha256:e4141a94de7eb2f73676a7678ff9b1e968f935c4c3390cb75d6427c251b1677a\\": dial tcp [::1]:30000: connect: connection refused" version=v0.3.45\n'

glamorous-carpet-83516

04/29/2024, 7:16 AM

could you create a new envd context?

envd context create --name flyte-sandbox --builder tcp --builder-address localhost:30000 --use

glamorous-carpet-83516

04/29/2024, 7:16 AM

and register it again

bumpy-pager-32541

04/29/2024, 7:19 AM

what do you mean by register?

glamorous-carpet-83516

04/29/2024, 7:19 AM

pyflyte run ..

👍 1

bumpy-pager-32541

04/29/2024, 7:20 AM

I seem to be still getting the same error

glamorous-carpet-83516

04/29/2024, 7:27 AM

sorry, my bad. not 30000. should be 30003

envd context create --name flyte-sandbox --builder tcp --builder-address localhost:30003 --use

bumpy-pager-32541

04/29/2024, 7:28 AM

(flyte_env) (base) tchase@HQ9322OSX workflow % envd context create --name flyte-sandbox --builder tcp --builder-address localhost:30003 --use
FATA[2024-04-29T00:27:44-07:00] exit                                          app=envd error="failed to create context: context \"flyte-sandbox\" already exists" version=v0.3.45

glamorous-carpet-83516

04/29/2024, 7:28 AM

envd context rm --name flyte-sandbox

glamorous-carpet-83516

04/29/2024, 7:28 AM

remove previous one first

bumpy-pager-32541

04/29/2024, 7:29 AM

(flyte_env) (base) tchase@HQ9322OSX workflow % envd context rm --name flyte-sandbox
FATA[2024-04-29T00:28:56-07:00] exit                                          app=envd error="failed to remove context: cannot remove current context \"flyte-sandbox\"" version=v0.3.45

glamorous-carpet-83516

04/29/2024, 7:30 AM

checkout to default

envd context use --name default

👍 1
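Put together, the context reset from the last few messages has to happen in a fixed order: switch away from the context, remove it, then recreate it with the corrected builder address. The sketch below only assembles the three commands quoted in this thread and prints them; envd_context_reset_commands is a hypothetical helper, and nothing is executed unless you pass each list to subprocess.run yourself.

```python
import shlex
from typing import List


def envd_context_reset_commands(
    name: str = "flyte-sandbox",
    builder_address: str = "localhost:30003",
) -> List[List[str]]:
    """Return the envd commands, in order, that recreate a context."""
    return [
        # A context cannot be removed while it is the current one.
        ["envd", "context", "use", "--name", "default"],
        ["envd", "context", "rm", "--name", name],
        ["envd", "context", "create", "--name", name,
         "--builder", "tcp", "--builder-address", builder_address, "--use"],
    ]


if __name__ == "__main__":
    for cmd in envd_context_reset_commands():
        print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting issues if they are later run with subprocess.run(cmd, check=True).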

bumpy-pager-32541

04/29/2024, 7:36 AM

I'm getting so close!

glamorous-carpet-83516

04/29/2024, 7:38 AM

could you remove base_image from your imageSpec

sklearn_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.8-1.6.2",
    packages=["mlflow", "flytekitplugins-mlflow"],
    registry="localhost:30000"
)

👍 1

glamorous-carpet-83516

04/29/2024, 7:38 AM

it will use latest flytekit

bumpy-pager-32541

04/29/2024, 7:42 AM

After removing the line that sets the base_image I am still getting the same error.

glamorous-carpet-83516

04/29/2024, 7:43 AM

could you try this

sklearn_image_spec = ImageSpec(
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow"],
    registry="localhost:30000"
)

bumpy-pager-32541

04/29/2024, 7:47 AM

still the same error

glamorous-carpet-83516

04/29/2024, 7:58 AM

could you send me the example you run again, sorry

glamorous-carpet-83516

04/29/2024, 7:58 AM

want to run it on my side

👍 1

bumpy-pager-32541

04/29/2024, 8:02 AM

import pandas as pd
import numpy as np

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from flytekit import task, workflow
# import seaborn as sns

from typing import List, Tuple, Dict
from flytekit import ImageSpec

import mlflow
from flytekitplugins.mlflow import mlflow_autolog

sklearn_image_spec = ImageSpec(
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow"],
    registry="localhost:30000"    
)

# if sklearn_image_spec.is_container():


@task(container_image=sklearn_image_spec)
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame

@task(container_image=sklearn_image_spec)
def process_data(data: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Simplify the task from a 3-class to a binary classification problem."""
    data_out = data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    data_out_train, data_out_test = train_test_split(data_out, test_size=0.2, random_state=42)

    train_x = data_out_train.drop("target", axis=1)
    test_x = data_out_test.drop("target", axis=1)
    train_y = data_out_train[["target"]]
    test_y = data_out_test[["target"]]

    return train_x, test_x, train_y, test_y
    
@task(container_image=sklearn_image_spec)
@mlflow_autolog(framework=mlflow.sklearn)
def train_model(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params: Dict[str, float]) -> Tuple[float, float, LogisticRegression]:
    """Train a model on the wine dataset."""

    lr = LogisticRegression(max_iter=3000, **params)
    lr.fit(train_x, train_y.iloc[:, 0])

    pred_y = lr.predict(test_x)
    mse = float(mean_squared_error(test_y, pred_y))
    mae = float(mean_absolute_error(test_y, pred_y))

    return mse, mae, lr


@task(container_image=sklearn_image_spec)
def training_model_loop(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]
) -> None:

    for params_i in params_list:
        print('ahhhh')
        print(params_i)
        rmse_i, mae_i, lr_i = train_model(
            train_x = train_x,
            test_x = test_x,
            train_y = train_y,
            test_y = test_y,
            params=params_i,
        )

@workflow
def training_workflow(params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]) -> None:
    """Put all of the steps together into a single workflow."""
    # raise Exception("This is a test")
    data = get_data()
    train_x, test_x, train_y, test_y = process_data(data=data)

    training_model_loop(
        train_x = train_x,
        test_x = test_x,
        train_y = train_y,
        test_y = test_y,
        params_list=params_list,
    )

if __name__ == "__main__":
    training_workflow(params_list=[{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])

glamorous-carpet-83516

04/29/2024, 8:13 AM

I saw this error instead

bumpy-pager-32541

04/29/2024, 8:14 AM

I got that a day or two ago when debugging. I think installing scikit-learn=1.2.2 in my conda environment fixed that error for me when I was running locally

glamorous-carpet-83516

04/29/2024, 8:36 AM

it works

glamorous-carpet-83516

04/29/2024, 8:36 AM

my code

import pandas as pd
import numpy as np

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from flytekit import task, workflow
# import seaborn as sns

from typing import List, Tuple, Dict
from flytekit import ImageSpec

import mlflow
from flytekitplugins.mlflow import mlflow_autolog

sklearn_image_spec = ImageSpec(
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow", "scikit-learn==1.2.2"],
    registry="pingsutw"
)


# if sklearn_image_spec.is_container():


@task(container_image=sklearn_image_spec)
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame


@task(container_image=sklearn_image_spec)
def process_data(data: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Simplify the task from a 3-class to a binary classification problem."""
    data_out = data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    data_out_train, data_out_test = train_test_split(data_out, test_size=0.2, random_state=42)

    train_x = data_out_train.drop("target", axis=1)
    test_x = data_out_test.drop("target", axis=1)
    train_y = data_out_train[["target"]]
    test_y = data_out_test[["target"]]

    return train_x, test_x, train_y, test_y


@task(container_image=sklearn_image_spec)
@mlflow_autolog(framework=mlflow.sklearn)
def train_model(
        train_x: pd.DataFrame,
        test_x: pd.DataFrame,
        train_y: pd.DataFrame,
        test_y: pd.DataFrame,
        params: Dict[str, float]) -> Tuple[float, float, LogisticRegression]:
    """Train a model on the wine dataset."""

    lr = LogisticRegression(max_iter=3000, **params)
    lr.fit(train_x, train_y.iloc[:, 0])

    pred_y = lr.predict(test_x)
    mse = float(mean_squared_error(test_y, pred_y))
    mae = float(mean_absolute_error(test_y, pred_y))

    return mse, mae, lr


@task(container_image=sklearn_image_spec)
def training_model_loop(
        train_x: pd.DataFrame,
        test_x: pd.DataFrame,
        train_y: pd.DataFrame,
        test_y: pd.DataFrame,
        params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]
) -> None:
    for params_i in params_list:
        print('ahhhh')
        print(params_i)
        rmse_i, mae_i, lr_i = train_model(
            train_x=train_x,
            test_x=test_x,
            train_y=train_y,
            test_y=test_y,
            params=params_i,
        )


@workflow
def training_workflow(params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]) -> None:
    """Put all of the steps together into a single workflow."""
    # raise Exception("This is a test")
    data = get_data()
    train_x, test_x, train_y, test_y = process_data(data=data)

    training_model_loop(
        train_x=train_x,
        test_x=test_x,
        train_y=train_y,
        test_y=test_y,
        params_list=params_list,
    )


if __name__ == "__main__":
    training_workflow(params_list=[{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])

bumpy-pager-32541

04/30/2024, 4:28 PM

I am so close. Did you encounter this error at all while debugging?

tall-lock-23197

04/30/2024, 4:31 PM

not sure why python 3.8 is still being used. have you removed base_image in your imagespec?

bumpy-pager-32541

04/30/2024, 4:35 PM

yeah. This is the code I am running. I copied the code from above that worked for Kevin and just changed the registry argument in ImageSpec.

import pandas as pd
import numpy as np

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from flytekit import task, workflow
# import seaborn as sns

from typing import List, Tuple, Dict
from flytekit import ImageSpec

import mlflow
from flytekitplugins.mlflow import mlflow_autolog

sklearn_image_spec = ImageSpec(
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow", "scikit-learn==1.2.2"],
    registry="localhost:30000"    
)


# if sklearn_image_spec.is_container():


@task(container_image=sklearn_image_spec)
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame


@task(container_image=sklearn_image_spec)
def process_data(data: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Simplify the task from a 3-class to a binary classification problem."""
    data_out = data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    data_out_train, data_out_test = train_test_split(data_out, test_size=0.2, random_state=42)

    train_x = data_out_train.drop("target", axis=1)
    test_x = data_out_test.drop("target", axis=1)
    train_y = data_out_train[["target"]]
    test_y = data_out_test[["target"]]

    return train_x, test_x, train_y, test_y


@task(container_image=sklearn_image_spec)
@mlflow_autolog(framework=mlflow.sklearn)
def train_model(
        train_x: pd.DataFrame,
        test_x: pd.DataFrame,
        train_y: pd.DataFrame,
        test_y: pd.DataFrame,
        params: Dict[str, float]) -> Tuple[float, float, LogisticRegression]:
    """Train a model on the wine dataset."""

    lr = LogisticRegression(max_iter=3000, **params)
    lr.fit(train_x, train_y.iloc[:, 0])

    pred_y = lr.predict(test_x)
    mse = float(mean_squared_error(test_y, pred_y))
    mae = float(mean_absolute_error(test_y, pred_y))

    return mse, mae, lr


@task(container_image=sklearn_image_spec)
def training_model_loop(
        train_x: pd.DataFrame,
        test_x: pd.DataFrame,
        train_y: pd.DataFrame,
        test_y: pd.DataFrame,
        params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]
) -> None:
    for params_i in params_list:
        print('ahhhh')
        print(params_i)
        rmse_i, mae_i, lr_i = train_model(
            train_x=train_x,
            test_x=test_x,
            train_y=train_y,
            test_y=test_y,
            params=params_i,
        )


@workflow
def training_workflow(params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]) -> None:
    """Put all of the steps together into a single workflow."""
    # raise Exception("This is a test")
    data = get_data()
    train_x, test_x, train_y, test_y = process_data(data=data)

    training_model_loop(
        train_x=train_x,
        test_x=test_x,
        train_y=train_y,
        test_y=test_y,
        params_list=params_list,
    )


if __name__ == "__main__":
    training_workflow(params_list=[{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])

tall-lock-23197

04/30/2024, 4:36 PM

would you mind sharing the base image that's being pulled while building the image?

bumpy-pager-32541

04/30/2024, 4:38 PM

What's the best way to do that?

tall-lock-23197

04/30/2024, 4:40 PM

you can see the image in the terminal when you run the pyflyte run ... command, or specify the base image as ghcr.io/flyteorg/flytekit:py3.11-1.11.0

bumpy-pager-32541

04/30/2024, 4:43 PM

(flyte_env) (base) tchase@HQ9322OSX workflow % pyflyte run --remote wine_classification_example.py training_workflow
Running Execution on Remote.
one
two
a
b
c
True
Image localhost:30000/flytekit:1JrBkuUO0aml5MJRduDMXQ found. Skip building.
three
[✔️] Go to http://localhost:30080/console/projects/flytesnacks/domains/development/executions/f3b0cab4ad1554cf1911 to see execution in the console.

tall-lock-23197

04/30/2024, 4:44 PM

oh okay. could you check the version of flytekit that got installed in your image?

docker run -it --rm localhost:30000/flytekit:1JrBkuUO0aml5MJRduDMXQ /bin/bash
pip show flytekit
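The same check can be done from Python inside the container, which also makes it easy to compare against the version you expect. A minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); installed_version and release_tuple are hypothetical helper names, and release_tuple deliberately handles only plain dotted numeric versions. Note that comparing the raw strings would be misleading, since "1.6.2" sorts after "1.11.0" as text.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str) -> Tuple[int, ...]:
    """Turn '1.11.0' into (1, 11, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


if __name__ == "__main__":
    found = installed_version("flytekit")
    print("flytekit:", found)
    if found is not None and release_tuple(found) < release_tuple("1.11.0"):
        print("older than the 1.11.0 used on the client side in this thread")
```

Run inside the image, this would flag a container that still carries an older flytekit than the one installed locally.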

bumpy-pager-32541

04/30/2024, 4:48 PM

flytekit@f9c7d7f06429:/root$ pip show flytekit
WARNING: The directory '/home/flytekit/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Name: flytekit
Version: 1.6.2
Summary: Flyte SDK for Python
Home-page: https://github.com/flyteorg/flytekit
Author:
Author-email:
License: apache2
Location: /usr/local/lib/python3.8/site-packages
Requires: adlfs, click, cloudpickle, cookiecutter, croniter, dataclasses-json, deprecated, diskcache, docker, docker-image-py, docstring-parser, flyteidl, fsspec, gcsfs, gitpython, googleapis-common-protos, grpcio, grpcio-status, importlib-metadata, joblib, keyring, kubernetes, marshmallow-jsonschema, natsort, numpy, pandas, pyarrow, pyopenssl, python-dateutil, python-json-logger, pytimeparse, pytz, pyyaml, requests, responses, rich, rich-click, s3fs, sortedcontainers, statsd, typing-extensions, urllib3, wheel, wrapt
Required-by: flytekitplugins-deck-standard, flytekitplugins-envd, flytekitplugins-mlflow, flytekitplugins-pod

tall-lock-23197

04/30/2024, 4:49 PM

this is a really old version. could you specify the base image and try again? the image should get re-built.

👍 1

bumpy-pager-32541

04/30/2024, 4:55 PM

does this base image seem fine? base_image="ghcr.io/flyteorg/flytekit:py3.8-1.6.2"

tall-lock-23197

04/30/2024, 4:56 PM

no, this installs flytekit 1.6.2. we need the latest version, 1.11.0. can you try specifying ghcr.io/flyteorg/flytekit:py3.11-1.11.0

👍 1

bumpy-pager-32541

04/30/2024, 5:01 PM

I specified the base image but it doesn't seem to have rebuilt

sklearn_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.11-1.11.0",
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow", "scikit-learn==1.2.2"],
    registry="localhost:30000"
)

Traceback (most recent call last):

      File "/usr/local/lib/python3.8/site-packages/flytekit/exceptions/scopes.py", line 206, in user_entry_point
        return wrapped(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/flytekitplugins/mlflow/tracking.py", line 113, in wrapper
        if not ctx.execution_state.is_local_execution():

Message:

    'ExecutionState' object has no attribute 'is_local_execution'

User error.

bumpy-pager-32541

04/30/2024, 5:10 PM

ok I tore down the demo cluster and rebuilt it, now the image is rebuilding when I run.

bumpy-pager-32541

04/30/2024, 5:12 PM

and yet I have the exact same error...

tall-lock-23197

04/30/2024, 5:12 PM

is it still the same flytekit version?

bumpy-pager-32541

04/30/2024, 5:13 PM

it is yeah... No idea why

(flyte_env) (base) tchase@HQ9322OSX workflow % pyflyte run --remote wine_classification_example.py training_workflow
Running Execution on Remote.
one
two
a
b
e
False
Image localhost:30000/flytekit:1JrBkuUO0aml5MJRduDMXQ not found. Building...
five
six
Run command: envd build --path /var/folders/_9/60k4g7zj1wq1kyl_pr7hq3k80000gq/T/flyte4gq06u5q/control_plane_metadata/local_flytekit/e4078d1595594356d59956dba925c844  --platform linux/amd64 --output type=image,name=localhost:30000/flytekit:1JrBkuUO0aml5MJRduDMXQ,push=true 
#1 [internal] setting pip cache mount permissions
#1 DONE 0.0s
#2 docker-image://ghcr.io/flyteorg/flytekit:py3.8-1.6.2
#2 resolve ghcr.io/flyteorg/flytekit:py3.8-1.6.2

bumpy-pager-32541

04/30/2024, 5:15 PM

One sec, I may be running the wrong file. I created a copy when I was debugging another problem earlier.

bumpy-pager-32541

04/30/2024, 5:21 PM

Ok it ran!!! The mlflow stuff should be available through the flyte UI as well correct?

bumpy-pager-32541

04/30/2024, 5:31 PM

I was hoping to see the mlflow information in the flyte deck similar to this example, but I cannot seem to find it. https://docs.flyte.org/en/latest/flytesnacks/examples/mlflow_plugin/mlflow_example.html

freezing-airport-6809

04/30/2024, 5:34 PM

do you have enable_deck=True?

bumpy-pager-32541

04/30/2024, 5:41 PM

I just added that but I still cannot seem to find the plots. Where should they be in the UI?

bumpy-pager-32541

04/30/2024, 5:52 PM

Does the task with the tracking need to be directly called by the workflow? Right now training_model_loop is being called by the workflow, and train_model is called by training_model_loop.

import pandas as pd
import numpy as np

from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error
from flytekit import task, workflow
# import seaborn as sns

from typing import List, Tuple, Dict
from flytekit import ImageSpec

import mlflow
from flytekitplugins.mlflow import mlflow_autolog

sklearn_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.11-1.11.0",
    packages=["flytekit==1.11.0", "mlflow", "flytekitplugins-mlflow", "scikit-learn==1.2.2"],
    registry="localhost:30000"    
)

# if sklearn_image_spec.is_container():


@task(container_image=sklearn_image_spec)
def get_data() -> pd.DataFrame:
    """Get the wine dataset."""
    return load_wine(as_frame=True).frame

@task(container_image=sklearn_image_spec)
def process_data(data: pd.DataFrame) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Simplify the task from a 3-class to a binary classification problem."""
    data_out = data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    data_out_train, data_out_test = train_test_split(data_out, test_size=0.2, random_state=42)

    train_x = data_out_train.drop("target", axis=1)
    test_x = data_out_test.drop("target", axis=1)
    train_y = data_out_train[["target"]]
    test_y = data_out_test[["target"]]

    return train_x, test_x, train_y, test_y
    
@task(enable_deck=True, container_image=sklearn_image_spec)
@mlflow_autolog(framework=mlflow.sklearn)
def train_model(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params: Dict[str, float]) -> Tuple[float, float, LogisticRegression]:
    """Train a model on the wine dataset."""

    lr = LogisticRegression(max_iter=3000, **params)
    lr.fit(train_x, train_y.iloc[:, 0])

    pred_y = lr.predict(test_x)
    mse = float(mean_squared_error(test_y, pred_y))
    mae = float(mean_absolute_error(test_y, pred_y))

    return mse, mae, lr


@task(container_image=sklearn_image_spec)
def training_model_loop(
    train_x: pd.DataFrame, 
    test_x: pd.DataFrame, 
    train_y: pd.DataFrame, 
    test_y: pd.DataFrame, 
    params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]
) -> None:

    for params_i in params_list:
        print('ahhhh')
        print(params_i)
        rmse_i, mae_i, lr_i = train_model(
            train_x = train_x,
            test_x = test_x,
            train_y = train_y,
            test_y = test_y,
            params=params_i,
        )

@workflow
def training_workflow(params_list: List[Dict[str, float]] = [{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}]) -> None:
    """Put all of the steps together into a single workflow."""
    # raise Exception("This is a test")
    data = get_data()
    train_x, test_x, train_y, test_y = process_data(data=data)

    training_model_loop(
        train_x = train_x,
        test_x = test_x,
        train_y = train_y,
        test_y = test_y,
        params_list=params_list,
    )

if __name__ == "__main__":
    training_workflow(params_list=[{"C": 0.1}, {"C": 0.2}, {"C": 0.3}, {"C": 0.4}])

bumpy-pager-32541

04/30/2024, 6:45 PM

I have some mlflow configuration specific questions that I will start in a new thread

👍 1
