Flyte #announcements

broad-monitor-993

09/08/2022, 5:48 PM

hi all 👋, I just wanted to ping the community to ask a quick question 🤔: Say you have a workflow that uses a trained model to generate predictions on a scheduled launch plan. The question is, how do you typically want to get features for that prediction? I.e. when the scheduled workflow kicks off, where are you reading those features from? Do you need the kick-off time as a parameter for fetching data from, say, an S3 bucket or DB?

freezing-airport-6809

09/08/2022, 6:55 PM

What I have seen in the past is a query against a metastore or a data warehouse

freezing-airport-6809

09/08/2022, 6:56 PM

or if you are using something like Firehose, then data comes partitioned by timestamps
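For the timestamp-partitioned case, the kickoff time maps directly onto the storage prefix to read. A minimal sketch, assuming hourly UTC partitions laid out as YYYY/MM/DD/HH (the default layout Firehose uses for S3 delivery); the `features` base prefix and the function name are hypothetical:

```python
from datetime import datetime, timezone


def hourly_partition_prefix(kickoff_time: datetime, base: str = "features") -> str:
    """Build the key prefix of the hourly partition that contains kickoff_time."""
    if kickoff_time.tzinfo is None:
        # A naive datetime would be silently interpreted as local time, so reject it.
        raise ValueError("kickoff_time must be timezone-aware")
    utc_time = kickoff_time.astimezone(timezone.utc)
    return f"{base}/{utc_time:%Y/%m/%d/%H}/"


prefix = hourly_partition_prefix(datetime(2022, 9, 8, 17, 48, tzinfo=timezone.utc))
print(prefix)  # features/2022/09/08/17/
```

A prediction task could then list and read the objects under that prefix instead of scanning the whole bucket.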

broad-monitor-993

09/08/2022, 7:28 PM

Do you need the kick-off time as a parameter to fetching data from, say, an s3 bucket or DB?

Cool, so that’s a “yes” to this question

astonishing-ocean-79318

09/10/2022, 12:58 PM

Yep agreed that a date parameter is important. Whether we're training a model or batch inference, we'll always do it 'as of' some date. Something like Airflow's logical_date would be great.
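The 'as of' idea can be sketched as a query that only sees rows recorded at or before the supplied date. This is an illustration under stated assumptions, not anyone's production setup: an in-memory SQLite database stands in for the warehouse, and the `features` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3
from datetime import datetime


def features_as_of(conn: sqlite3.Connection, as_of: datetime) -> list:
    """Return feature rows recorded at or before the as-of timestamp."""
    # ISO-8601 strings in one uniform format sort chronologically,
    # so a plain string comparison is enough here.
    cursor = conn.execute(
        "SELECT user_id, value FROM features WHERE event_time <= ? ORDER BY event_time",
        (as_of.isoformat(),),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (user_id TEXT, value REAL, event_time TEXT)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [
        ("u1", 0.2, datetime(2022, 9, 9, 12, 0).isoformat()),
        ("u1", 0.7, datetime(2022, 9, 10, 12, 0).isoformat()),
        ("u2", 0.5, datetime(2022, 9, 11, 12, 0).isoformat()),
    ],
)

rows = features_as_of(conn, as_of=datetime(2022, 9, 10, 12, 58))
print(rows)  # [('u1', 0.2), ('u1', 0.7)]
```

Because the date is a parameter rather than "now", the same query gives the same answer whether it runs on schedule or as a later re-run.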

worried-lighter-79998

11/07/2022, 2:15 PM

Old thread but I want to chime in and say "yes absolutely". Tangential to this is being able to re-run a failed workflow at the originally scheduled time. I have not found a way of doing it but this is a crucial feature for e.g. backfilling jobs.

freezing-airport-6809

11/07/2022, 3:27 PM

@worried-lighter-79998 this is something we are thinking of adding. Today you should be able to run it, as running is simply an ad hoc execution
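For a backfill done through ad hoc executions, the first step is working out which scheduled times were missed. A rough sketch, assuming a fixed-interval schedule (daily at 06:00 here); the function name and dates are hypothetical, and how each execution is actually launched is not shown:

```python
from datetime import datetime, timedelta


def scheduled_times(start: datetime, end: datetime, interval: timedelta = timedelta(days=1)) -> list:
    """List every scheduled time from start to end, both inclusive."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


missed = scheduled_times(datetime(2022, 11, 5, 6, 0), datetime(2022, 11, 7, 6, 0))
for kickoff_time in missed:
    # Each value would be passed as the time input of one ad hoc execution.
    print(kickoff_time.isoformat())
```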

freezing-airport-6809

11/07/2022, 3:28 PM

Also, does the rerun not work? Or recover?

worried-lighter-79998

11/07/2022, 3:31 PM

How would you expose a timestamp and at the same time have it work with scheduling? If you set the timestamp parameter to datetime.now() in your launch plan, it is evaluated at build time, not execution time.

freezing-airport-6809

11/07/2022, 3:32 PM

No, Flyte allows the timestamp to be variable for scheduled workflows, right?

freezing-airport-6809

11/07/2022, 3:33 PM

It’s called kickoff_time_input_arg - you have to explicitly bind it - https://docs.flyte.org/projects/cookbook/en/stable/auto/core/scheduled_workflows/lp_schedules.html

worried-lighter-79998

11/07/2022, 3:33 PM

That has not been my experience

freezing-airport-6809

11/07/2022, 3:34 PM

Would love to understand

worried-lighter-79998

11/07/2022, 3:39 PM

Thank you for the link. So kickoff_time is a special arg which gets supplied by the scheduler? Is it exposed through the web UI as well?

freezing-airport-6809

11/07/2022, 3:39 PM

Yes

freezing-airport-6809

11/07/2022, 3:39 PM

You can call it whatever

freezing-airport-6809

11/07/2022, 3:39 PM

In Flyte everything has inputs. All inputs are exposed in the UI

freezing-airport-6809

11/07/2022, 3:40 PM

To schedule, you can only have one variable input; all others need to be fixed. This is why launch plans exist

freezing-airport-6809

11/07/2022, 3:41 PM

You can fix all other inputs and tell the launch plan which input the time value should be sent in - in this case it is kickoff_time
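The "fix everything except the time input" idea can be pictured with plain Python. This is only an analogy, not Flyte's API: `functools.partial` stands in for a launch plan, and the workflow function, its inputs, and the values are hypothetical. In flytekit itself the fixed values go in the `fixed_inputs` argument of `LaunchPlan.get_or_create`.

```python
from datetime import datetime
from functools import partial


def predict_wf(model_version: str, region: str, kickoff_time: datetime) -> str:
    """Stand-in for a workflow with two ordinary inputs and one time input."""
    return f"{model_version}/{region}/{kickoff_time:%Y-%m-%d}"


# "Launch plan": every input is fixed except kickoff_time.
launch_plan = partial(predict_wf, model_version="v3", region="eu")

# "Scheduler": supplies only the remaining variable input at trigger time.
result = launch_plan(kickoff_time=datetime(2022, 11, 7, 15, 41))
print(result)  # v3/eu/2022-11-07
```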

freezing-airport-6809

11/07/2022, 3:41 PM

Does that help - cc @tall-lock-23197 maybe we have a better doc here?

worried-lighter-79998

11/07/2022, 3:42 PM

Thank you very much. So to clarify the error I made: in the launch plan I had specified something like

default_inputs={"execution_time": datetime.now()}

which is evaluated at build time

freezing-airport-6809

11/07/2022, 3:42 PM

Sorry for confusion @worried-lighter-79998

freezing-airport-6809

11/07/2022, 3:42 PM

Aah ya, that is fixing the time to when you build
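The pitfall is ordinary Python evaluation order and can be reproduced without Flyte. A small sketch (the `run` function is a hypothetical stand-in for an execution): the dictionary is built once, so `datetime.now()` is called once, and every later run receives that same frozen value.

```python
import time
from datetime import datetime

# Evaluated exactly once, when this line runs (the "build" step).
default_inputs = {"execution_time": datetime.now()}


def run(execution_time: datetime) -> datetime:
    """Stand-in for one execution; it just reports the time it was given."""
    return execution_time


time.sleep(0.01)
first = run(**default_inputs)
time.sleep(0.01)
second = run(**default_inputs)

print(first == second)         # True: both runs see the build-time value
print(first < datetime.now())  # True: the value lags the real clock
```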

worried-lighter-79998

11/07/2022, 3:43 PM

But if I understand you correctly, I can remove that line and add

kickoff_time_input_arg="execution_time",

and it should be fine

👍 1

freezing-airport-6809

11/07/2022, 3:43 PM

Sadly this is allowed, as Flyte thinks you want the build time as a constant value for that arg

freezing-airport-6809

11/07/2022, 3:43 PM

Correct

worried-lighter-79998

11/07/2022, 3:43 PM

Yeah that makes sense in hindsight and was an error on my side

worried-lighter-79998

11/07/2022, 3:44 PM

Thank you for your help, this has been very valuable!

❤️ 1

freezing-airport-6809

11/07/2022, 3:44 PM

No, docs should cater to avoiding confusion. Please recommend an edit

freezing-airport-6809

11/07/2022, 3:44 PM

We are always here

worried-lighter-79998

11/07/2022, 3:55 PM

If you think other people will make the same mistake, it could be worth adding something like this to the docs page: "The schedule specification and its arguments are evaluated when the Flyte resources are compiled, so something like

default_inputs={"kickoff_time": datetime.now()}

won't get the scheduled time but the build time. That's why we have

kickoff_time_input_arg="kickoff_time"

..."

freezing-airport-6809

11/07/2022, 3:56 PM

Good idea
