acoustic-carpenter-78188
04/12/2023, 4:07 AM
{"json":{"src":"schedule_executor.go:93"},"level":"error","msg":"failed to catch up on all the schedules. Aborting","ts":"2023-04-12T07:04:15Z"}
{"json":{"src":"schedule_executor.go:94"},"level":"info","msg":"Flyte native scheduler shutdown","ts":"2023-04-12T07:04:15Z"}
A malformed launch plan can also prevent the catch-up procedure from running for valid scheduled launch plans. The following line exits CatchupAll immediately on the first error, which can starve the other launch plans:
https://github.com/flyteorg/flyteadmin/blob/eb695b19dcc6fd53492176586c2ab9d64f0c990d/scheduler/core/gocron_scheduler.go#L190
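To illustrate the alternative to exiting on the first error, here is a minimal Python sketch of a catch-up loop that records failures and keeps going. It is not the flyteadmin implementation (which is written in Go); `catchup_all`, `demo_catchup`, and the schedule names are hypothetical and exist only for this example.

```python
from typing import Callable, Iterable, List, Tuple


def catchup_all(
    schedules: Iterable[str],
    catchup_one: Callable[[str], None],
) -> List[Tuple[str, Exception]]:
    """Run catch-up for every schedule and collect failures instead of aborting."""
    failures: List[Tuple[str, Exception]] = []
    for name in schedules:
        try:
            catchup_one(name)
        except Exception as exc:
            # Record the failure and continue, so one malformed
            # schedule cannot starve the schedules that follow it.
            failures.append((name, exc))
    return failures


# Hypothetical stand-in for the per-schedule catch-up call.
processed: List[str] = []


def demo_catchup(name: str) -> None:
    if name == "malformed-lp":
        raise ValueError("expected_inputs b missing")
    processed.append(name)


failures = catchup_all(["valid-lp-1", "malformed-lp", "valid-lp-2"], demo_catchup)
print(processed)
print([name for name, _ in failures])
```

With this shape, the caller can log or surface the collected failures after the loop, rather than treating the first one as fatal.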
Expected behavior
1. A scheduled launch plan with incomplete inputs should be rejected during registration.
2. Errors that occur when the scheduler creates an execution request should be surfaced to the user so that they are aware of the issue.
3. The scheduler should not restart when a malformed scheduled launch plan fails to execute.
I think fixing 1 is more urgent, since it avoids this issue altogether. However, 2 would still be useful in case any other condition leads to the same failure.
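The registration-time check in item 1 could look roughly like the sketch below. It assumes that a scheduled launch plan can only supply its fixed inputs plus the kickoff-time argument, so any other required workflow input is unsatisfiable. The function name `missing_scheduled_inputs` and its parameters are hypothetical, not part of flytekit or flyteadmin.

```python
from typing import Iterable, List, Optional, Set


def missing_scheduled_inputs(
    required_inputs: Iterable[str],
    fixed_inputs: Iterable[str],
    kickoff_time_input_arg: Optional[str] = None,
) -> List[str]:
    """Return required inputs that a scheduled execution could never receive."""
    provided: Set[str] = set(fixed_inputs)
    if kickoff_time_input_arg is not None:
        # The scheduler fills this argument with the scheduled time.
        provided.add(kickoff_time_input_arg)
    return sorted(set(required_inputs) - provided)


# Input names taken from the reproduction in this issue.
missing = missing_scheduled_inputs(
    required_inputs=["kickoff_time", "a", "b"],
    fixed_inputs=["a"],
    kickoff_time_input_arg="kickoff_time",
)
print(missing)  # ['b']
```

A non-empty result at registration time would be the signal to reject the launch plan, before the scheduler ever tries to execute it.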
Additional context to reproduce
Use the following workflow and launch plan code:
from datetime import datetime

from flytekit import CronSchedule, LaunchPlan, task, workflow

@task
def square(a: int) -> int:
    return a * a

@task
def add(a: int, b: int) -> int:
    return a + b

@workflow
def my_wf(kickoff_time: datetime, a: int, b: int) -> int:
    # a and b are required inputs
    x = square(a=a)
    return add(a=x, b=b)

my_wf_lp = LaunchPlan.get_or_create(
    name="my-schedule",
    workflow=my_wf,
    fixed_inputs={
        "a": 1,
        # b is omitted from fixed_inputs, so the scheduled launch plan
        # only passes in the "kickoff_time" and "a" inputs.
    },
    schedule=CronSchedule(
        schedule="*/5 * * * *",
        kickoff_time_input_arg="kickoff_time",
    ),
)
Log in scheduler
{
"json": {
"routine": "jobfunc-11804557365892249653",
"src": "executor_impl.go:110"
},
"level": "error",
"msg": "failed to create execution create request %+v due to %vproject:\"sample\" domain:\"development\" name:\"f0bd182b3249867ba000\" spec:<launch_plan:<resource_type:LAUNCH_PLAN project:\"sample\" domain:\"development\" name:\"ml_pipeline.launchplan.schedule\" version:\"0.1.5\" > metadata:<mode:SCHEDULED scheduled_at:<seconds:1681271880 > > > inputs:<literals:<key:\"kickoff_time\" value:<scalar:<primitive:<datetime:<seconds:1681271880 > > > > > > rpc error: code = InvalidArgument desc = expected_inputs b missing",
"ts": "2023-04-12T04:04:42Z"
}
flyteorg/flyte