GitHub
09/11/2023, 11:48 PM
<https://github.com/flyteorg/flytekit/tree/master|master>
by wild-endeavor
<https://github.com/flyteorg/flytekit/commit/173704811fae7cab64b0eb4a27d68f6948b0e9e4|17370481>
- Feat: Add type support for pydantic BaseModels (#1660)
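flyteorg/flytekit
GitHub
09/11/2023, 11:48 PM
from typing import List

from flytekit import task
from flytekit.types.directory import FlyteDirectory
from flytekit.types.file import FlyteFile
from pydantic import BaseModel


class Config(BaseModel):
    lr: float = 1e-3
    batch_size: int = 32
    files: List[FlyteFile]
    directories: List[FlyteDirectory]


@task
def train(cfg: Config):
    ...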
Tracking Issue
flyteorg/flyte#2686
Follow-up issue
NA
flyteorg/flytekit
✅ All checks have passed
30/30 successful checks
GitHub
09/12/2023, 1:04 AM
make generate command to generate API docs in repos such as FlytePropeller, Flyteidl, and so on.
Are you sure this issue hasn't been raised already?
☑︎ Yes
Have you read the Code of Conduct?
☑︎ Yes
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
depends_on.
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
flytectl demo start returned after ~5 min with this error message:
+---------------------------------------------+---------------+-----------+
| SERVICE | STATUS | NAMESPACE |
+---------------------------------------------+---------------+-----------+
| flyte-kubernetes-dashboard-7fd989b99d-hgmqb | Pending | flyte |
+---------------------------------------------+---------------+-----------+
| minio-55b8c8f4bc-mvjz5 | Pending | flyte |
+---------------------------------------------+---------------+-----------+
| postgres-bdb75f779-cngdp | Running | flyte |
+---------------------------------------------+---------------+-----------+
Error: Get "<https://127.0.0.1:30086/api/v1/nodes>": dial tcp 127.0.0.1:30086: connect: connection refused
Running flytectl demo exec -- kubectl describe pod -n flyte shows that the pending pods are still pulling the image when it exits. It also turned out that my internet connection was slow, but increasing $FLYTE_TIMEOUT as recommended in #2197 did not help.
Could it be that it fails while waiting for the deployments to be ready (flyte/docker/sandbox-lite/flyte-entrypoint-dind.sh, line 63 in </flyteorg/flyte/commit/cf24edfbb8c55be5d29c96f7f6ba761ceb44003f|cf24edf>) when it is still loading the image? I guess the timeout --timeout=5m is not affected by changing $FLYTE_TIMEOUT.
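One way to picture the suspected problem: if the readiness wait uses a hard-coded limit, changing $FLYTE_TIMEOUT has no effect on it. The sketch below is a hypothetical Python illustration, not the sandbox's actual shell entrypoint. The helper wait_until_ready, its parameters, and the idea of reading FLYTE_TIMEOUT as a number of seconds are all assumptions made for the example.

```python
import os
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    default_timeout_s: float = 300.0,
    poll_interval_s: float = 1.0,
) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses.

    Hypothetical helper: the timeout is read from FLYTE_TIMEOUT (assumed to be
    seconds) and falls back to default_timeout_s when the variable is unset.
    """
    timeout_s = float(os.environ.get("FLYTE_TIMEOUT", default_timeout_s))
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

With a wait written this way, a slow image pull could be accommodated by raising the environment variable instead of editing the script.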
Expected behavior
The demo cluster starts successfully.
Additional context to reproduce
1. Slow internet connection :)
2. flytectl demo start
Screenshots
No response
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
@task and @workflow.
Example
import mlflow
import wandb
import flytekitplugins.wandb
from typing import List
from flytekit import task, dynamic


@dynamic
@flytekitplugins.wandb.experiment(
    # TBD: figure out what experiment-level configurations can be automatically
    # handled by Flyte, e.g. determining a project name that defaults to "{workflow_name}-{execution_id}"
)
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...


@task
@flytekitplugins.wandb.run(
    # TBD: figure out which wandb.init options would make sense here
    # https://docs.wandb.ai/ref/python/init
    # The project name will default to the parent workflow's project name.
)
def train_model(hyperparameters: dict, data: ...):
    # follow the wandb integrations guides based on ML framework of choice:
    # https://docs.wandb.ai/guides/integrations
    model = MySklearnModel(**hyperparameters)
    ...  # fit
    wandb.log({"key": value})
    return model
API Proposal 2: extend @task and @workflow arguments
Task config plugins don't really make sense for MLFlow experiment tracking/logging, since the task_config argument is typically used for task types that have specific backend resource requirements (e.g. Spark, Ray, MPI tasks) and is orthogonal to configuring experiments and logging metrics.
Therefore, to support similar functionality to proposal 1, we could introduce additional arguments to the @task and @workflow decorators, e.g.
import mlflow
from typing import List
from flytekitplugins.wandb import RunConfig, ExperimentConfig
from flytekit import task, dynamic


@dynamic(..., logging_config=ExperimentConfig(...))
def model_experiment(hyperparameter_grid: List[dict]):
    models = []
    data = ...
    for hyperparameters in hyperparameter_grid:
        models.append(train_model(hyperparameters=hyperparameters, data=data))
    ...

import mlflow
from flytekitplugins.wandb import RunConfig, ExperimentConfig
from flytekit import task, workflow


@task(..., logging_config=RunConfig(...))
def train_model(hyperparameters: dict):
    model = MySklearnModel(**hyperparameters)
    ...  # fit
    return model
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
flytekitplugins-deck-standard contains a few renderers for dataframes, markdown, and box plots.
The purpose of this issue is to add another renderer class for arbitrary plotly graphs.
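A minimal sketch of what such a renderer could look like, assuming it follows the shape of a class exposing a to_html method that returns an HTML string. The class name PlotlyRenderer is hypothetical. The sketch relies only on the figure object having plotly's Figure.to_html method, so plotly itself is not imported here.

```python
from typing import Any


class PlotlyRenderer:
    """Hypothetical renderer: turn a plotly figure into an HTML fragment."""

    def to_html(self, fig: Any) -> str:
        # Duck-typed check so the sketch does not need to import plotly.
        if not callable(getattr(fig, "to_html", None)):
            raise TypeError("expected a plotly figure exposing to_html()")
        # full_html=False returns a <div> fragment rather than a whole page;
        # include_plotlyjs="cdn" loads plotly.js from a CDN instead of
        # inlining the library into the output.
        return fig.to_html(full_html=False, include_plotlyjs="cdn")
```

With plotly installed, passing any plotly figure to PlotlyRenderer().to_html(...) should yield an HTML string that a deck could display.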
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/workflow_repo.go:79 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [703.604ms] [rows:1] SELECT * FROM "workflows" WHERE project = 'flytesnacks' AND domain = 'development' AND name = 'core.control_flow.high_cpu_wf.single_integer_map_task' ORDER BY created_at desc LIMIT 1
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/workflow_repo.go:79 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [720.675ms] [rows:1] SELECT * FROM "workflows" WHERE project = 'flytesnacks' AND domain = 'development' AND name = 'core.control_flow.dynamics.wf' ORDER BY created_at desc LIMIT 1
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/workflow_repo.go:79 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [747.358ms] [rows:1] SELECT * FROM "workflows" WHERE project = 'flytesnacks' AND domain = 'development' AND name = 'core.control_flow.conditions.multiplier_2' ORDER BY created_at desc LIMIT 1
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/launch_plan_repo.go:133 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [483.240ms] [rows:1] SELECT "launch_plans"."id","launch_plans"."created_at","launch_plans"."updated_at","launch_plans"."deleted_at","launch_plans"."project","launch_plans"."domain","launch_plans"."name","launch_plans"."version","launch_plans"."spec","launch_plans"."workflow_id","launch_plans"."closure","launch_plans"."state","launch_plans"."digest","launch_plans"."schedule_type" FROM "launch_plans" inner join workflows on launch_plans.workflow_id = workflows.id WHERE launch_plans.project = 'flytesnacks' AND launch_plans.domain = 'development' AND launch_plans.name = 'core.control_flow.chain_entities.chain_workflows_wf' LIMIT 1
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/launch_plan_repo.go:133 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [485.314ms] [rows:1] SELECT "launch_plans"."id","launch_plans"."created_at","launch_plans"."updated_at","launch_plans"."deleted_at","launch_plans"."project","launch_plans"."domain","launch_plans"."name","launch_plans"."version","launch_plans"."spec","launch_plans"."workflow_id","launch_plans"."closure","launch_plans"."state","launch_plans"."digest","launch_plans"."schedule_type" FROM "launch_plans" inner join workflows on launch_plans.workflow_id = workflows.id WHERE launch_plans.project = 'flytesnacks' AND launch_plans.domain = 'development' AND launch_plans.name = 'core.control_flow.chain_tasks.chain_tasks_wf' LIMIT 1
flyteadmin-794f779986-mqrs7 flyteadmin
flyteadmin-794f779986-mqrs7 flyteadmin 2022/08/24 19:09:37 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/launch_plan_repo.go:133 SLOW SQL >= 200ms
flyteadmin-794f779986-mqrs7 flyteadmin [406.597ms] [rows:1] SELECT "launch_plans"."id","launch_plans"."created_at","launch_plans"."updated_at","launch_plans"."deleted_at","launch_plans"."project","launch_plans"."domain","launch_plans"."name","launch_plans"."version","launch_plans"."spec","launch_plans"."workflow_id","launch_plans"."closure","launch_plans"."state","launch_plans"."digest","launch_plans"."schedule_type" FROM "launch_plans" inner join workflows on launch_plans.workflow_id = workflows.id WHERE launch_plans.project = 'flytesnacks' AND launch_plans.domain = 'development' AND launch_plans.name = 'control_flow.large_fanout_lp.wf' LIMIT 1
We should investigate indices or other optimizations to improve the performance of these queries.
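As a rough illustration of the kind of index worth investigating, the sketch below uses Python's built-in sqlite3 module as a stand-in for the real Postgres database. The simplified workflows table and the index name idx_workflows_lookup are assumptions. Only the column names come from the logged query, and actual planner behaviour on Postgres would need to be checked with EXPLAIN there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE workflows (
        id INTEGER PRIMARY KEY,
        project TEXT,
        domain TEXT,
        name TEXT,
        created_at TEXT
    )
    """
)

QUERY = (
    "SELECT * FROM workflows "
    "WHERE project = ? AND domain = ? AND name = ? "
    "ORDER BY created_at DESC LIMIT 1"
)
PARAMS = ("flytesnacks", "development", "core.control_flow.dynamics.wf")


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's textual plan for the lookup query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn)

# Composite index matching the equality filters, then the sort column.
conn.execute(
    "CREATE INDEX idx_workflows_lookup "
    "ON workflows (project, domain, name, created_at)"
)
plan_with_index = query_plan(conn)

print(plan_without_index)
print(plan_with_index)
```

Comparing the two printed plans shows whether the planner switches from scanning the table to searching the index, which is the same before/after check one would run against the production database.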
Goal: What should the final outcome look like, ideally?
Fast queries to select workflows and launch plans
Describe alternatives you've considered
slower performance
Propose: Link/Inline OR Additional context
No response
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
make generate on a clean flyteadmin repo produces different results between Go versions 1.18 and 1.19. This causes the Check Go Generate checker to fail and prevents using the latest version of Go for development.
Looks like this is related to enumer and alvaroloes/enumer#68.
Expected behavior
Running make generate on a clean flyteadmin repo with the latest version of Go produces the same output as Check Go Generate.
Additional context to reproduce
1. Install the latest version of Go (1.19)
2. From a clean flyteadmin repo, run make generate
3. Run git diff
diff --git a/auth/config/authorizationservertype_enumer.go b/auth/config/authorizationservertype_enumer.go
index a5c7dc2..f6e89a6 100644
--- a/auth/config/authorizationservertype_enumer.go
+++ b/auth/config/authorizationservertype_enumer.go
@@ -1,6 +1,5 @@
// Code generated by "enumer --type=AuthorizationServerType --trimprefix=AuthorizationServerType -json"; DO NOT EDIT.
-//
package config
import (
diff --git a/auth/config/samesite_enumer.go b/auth/config/samesite_enumer.go
index af9bfdf..e42e58f 100644
--- a/auth/config/samesite_enumer.go
+++ b/auth/config/samesite_enumer.go
@@ -1,6 +1,5 @@
// Code generated by "enumer --type=SameSite --trimprefix=SameSite -json"; DO NOT EDIT.
-//
package config
import (
diff --git a/pkg/runtime/interfaces/inlineeventdatapolicy_enumer.go b/pkg/runtime/interfaces/inlineeventdatapolicy_enumer.go
index 63ff94e..7c3895b 100644
--- a/pkg/runtime/interfaces/inlineeventdatapolicy_enumer.go
+++ b/pkg/runtime/interfaces/inlineeventdatapolicy_enumer.go
@@ -1,6 +1,5 @@
// Code generated by "enumer -type=InlineEventDataPolicy -trimprefix=InlineEventDataPolicy"; DO NOT EDIT.
-//
package interfaces
import (
Screenshots
No response
flyteorg/flyte
GitHub
09/12/2023, 1:04 AM
goimports has run, similar to Check Go Generate, and based on guidance in the README.
What if we do not do this?
Anyone contributing to Flyte may start from an unclean goimports state.
Related component(s)
flyteadmin
flyteorg/flyte
GitHub
09/12/2023, 1:05 AM
--pkg in the pyflyte run, and use fast_package to register the entire package (directory) if people use this flag.
Goal: What should the final outcome look like, ideally?
pyflyte run --remote --pkg example.py wf
Use the pkg flag to fast register the entire package. Otherwise, register a single script by default.
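The proposed behaviour can be sketched as a flag that switches between two registration paths. This is a hypothetical illustration using argparse from the standard library rather than pyflyte's actual command-line code. The names build_parser and choose_registration and the returned mode labels are invented for the example, and the real register_script and fast_package functions are not called.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="run-sketch")
    parser.add_argument("--remote", action="store_true")
    # Proposed flag: when set, register the whole package, not one script.
    parser.add_argument("--pkg", action="store_true")
    parser.add_argument("file")
    parser.add_argument("workflow")
    return parser


def choose_registration(argv: List[str]) -> Tuple[str, str, str]:
    """Return (mode, file, workflow) for the given arguments."""
    args = build_parser().parse_args(argv)
    mode = "package" if args.pkg else "single-script"
    return mode, args.file, args.workflow
```

Keeping single-script registration as the default means existing invocations without the flag would behave as before.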
Describe alternatives you've considered
Replace register_script with fast_package
Propose: Link/Inline OR Additional context
No response
flyteorg/flyte
GitHub
09/12/2023, 1:05 AM
pyflyte register, add more examples mentioning pyflyte register
flyteorg/flyte
GitHub
09/12/2023, 1:05 AM