Adedeji Ayinde
10/28/2022, 9:20 PM
img = ImageConfig.from_images(
    "6***********.dkr.ecr.us-east-1.amazonaws.com/feathr-demo:latest"
)
flyte_workflow = remote.register_workflow(
    entity=wf_bb,
    serialization_settings=SerializationSettings(image_config=img),
    version="v1.0",
)
Error
[1/1] currentAttempt done. Last Error: USER::containers with unready status: [ax9m22wmc8xwpmhqmllt-n0-0]|Back-off pulling image "6*********.dkr.ecr.us-east-1.amazonaws.com/feathr-demo:latest"
Mehtab Mehdi
10/30/2022, 9:22 PM
[1/1] currentAttempt done. Last Error: UNKNOWN::Outputs not generated by task execution
Can anyone help me solve this issue?
SeungTaeKim
10/31/2022, 4:25 AM
Promise object of task's return values.
from flytekit import workflow, task, dynamic, conditional
from flytekit.core.promise import Promise

@task
def t1() -> bool:
    return True

@task
def t2() -> bool:
    return False

@workflow
def wf() -> bool:
    test = t1()
    print(test)
    return t2()

if __name__ == "__main__":
    print(wf())
When I run this sample code through pyflyte, a task returns a Promise object, and the literals or objects I expect are wrapped in the Promise.
So, here is my question: how do I unwrap this Promise object? I would like to use the expected value from the function.
Thank you!
Tarmily Wen
10/31/2022, 3:48 PM
@task(
    retries=2,
    cache=True,
    cache_version="1.0",
    requests=Resources(gpu=gpu, mem=mem, storage=storage),
    limits=Resources(gpu=gpu, mem=mem, storage=storage),
    secret_requests=[Secret(group="wandb-secrets", key="API_KEY")],
)
def pytorch_mnist_task(hp: Hyperparameters) -> TrainingOutputs:
    secrets = current_context().secrets
    wandb_api_key = secrets.get(group="wandb-secrets", key="API_KEY")
And I would like to call a remote execution and inject the secret at the same time like this:
current_config = Config.auto()
remote = FlyteRemote(config=current_config)
flyte_workflow = remote.fetch_workflow(name=workflow_name, version="v1", project="flytesnacks", domain="development",)
workflow_execution = remote.execute(entity=flyte_workflow, inputs={"hp": Hyperparameters(epochs=2, batch_size=128)}, project="flytesnacks", domain="development",)
But after checking out the docs, I am confused about where to put the secret and what format it should be in. I noticed SecretsConfig, but it isn't clear to me how to use it here, since it appears to require a file. Can I not specify an execution-time environment variable?
Laura Lin
10/31/2022, 4:11 PM
Union[str, flytekit.types.directory.FlyteDirectory]. Trying to create a task that could take in a str or a FlyteDirectory, but I get
raise ValueError(f"Expected a directory. {source_path} is not a directory")
ValueError: Expected a directory. X is not a directory
AssertionError: Failed to Bind variable input for function
Hadi
11/01/2022, 10:31 AM
Yash Panchwatkar
11/01/2022, 11:38 AM
Zhiyi Li
11/01/2022, 12:57 PM
Laura Lin
11/01/2022, 6:51 PM
Yash Panchwatkar
11/02/2022, 11:40 AM
[1/1] currentAttempt done. Last Error: USER::containers with unready status: [apv828cglh4qs76ftwgf-n0-0]|Back-off pulling image "x.y.z.dkr.ecr.us-west-2.amazonaws.com/flyte_test:adiwala1"
To make sure that the imagePullSecrets are present in the service account, I ran the following command:
kubectl get pod nginx -o=jsonpath='{.spec.imagePullSecrets[0].name}{"\n"}'
The output was
reg-ecr-cred
Can you please help me figure out what I am missing?
Sampath Vaddadi
11/02/2022, 1:37 PM
seunggs
11/02/2022, 6:00 PM
Laura Lin
11/02/2022, 9:36 PM
Dennis O'Brien
11/02/2022, 10:09 PM
@workflow
def cohort_ltv_inference(inference_date: datetime, n_days_horizon: int) -> bool:
    ...
I had created a schedule in my launch plan but failed to provide kickoff_time_input_arg or default_inputs.
launch_plan = LaunchPlan.get_or_create(
    name="pir_cohort_ltv_inference",
    workflow=cohort_ltv_inference,
    schedule=CronSchedule(
        schedule="45 17 * * *",  # At 17:45 daily.
    ),
)
In the workflow web UI, I see the schedule displayed:
Schedules
At 05:45 PM
But when I noticed that no scheduled run was actually happening, I looked through the scheduler logs and saw messages like this:
rpc error: code = InvalidArgument desc = expected_inputs inference_date missing
Ideally the end user would be alerted about this earlier.
I saw a related issue on GitHub here, and a PR here. I haven't worked in Go, but I think the function validateLaunchSpec doesn't handle the case of no default_inputs and no fixed_inputs passed to the launch plan get_or_create.
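A client-side check along the following lines could surface the problem before registration. This is only a sketch in plain Python: missing_schedule_inputs is a hypothetical helper, not part of flytekit or flyteadmin, and the function below is an undecorated stand-in for the workflow, so nothing here reflects how validateLaunchSpec actually works.

```python
import inspect
from datetime import datetime
from typing import Any, Callable, Dict, Optional, Set


def missing_schedule_inputs(
    fn: Callable[..., Any],
    default_inputs: Optional[Dict[str, Any]] = None,
    fixed_inputs: Optional[Dict[str, Any]] = None,
    kickoff_time_input_arg: Optional[str] = None,
) -> Set[str]:
    """Return required parameters of fn that a scheduled run could not fill.

    Hypothetical helper: it only inspects the Python signature.
    """
    provided = set(default_inputs or {}) | set(fixed_inputs or {})
    if kickoff_time_input_arg is not None:
        provided.add(kickoff_time_input_arg)

    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    required = {
        name
        for name, param in inspect.signature(fn).parameters.items()
        if param.default is inspect.Parameter.empty and param.kind not in variadic
    }
    return required - provided


# Undecorated stand-in for the workflow in the message above (assumption).
def cohort_ltv_inference(inference_date: datetime, n_days_horizon: int) -> bool:
    return True


# No inputs supplied: both parameters are reported as missing.
print(sorted(missing_schedule_inputs(cohort_ltv_inference)))

# Kickoff time bound to inference_date and a fixed horizon: nothing is missing.
print(sorted(missing_schedule_inputs(
    cohort_ltv_inference,
    fixed_inputs={"n_days_horizon": 30},
    kickoff_time_input_arg="inference_date",
)))
```

Running a check like this next to the LaunchPlan.get_or_create call would fail fast on the developer's machine instead of in the scheduler logs.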
In general, I'm missing some of the familiar scheduler UI I am used to from my experience with Airflow. I know these are very different projects, but for almost all my use cases, I'll eventually be running the ML pipeline on a schedule.
Kamakshi Muthukrishnan
11/03/2022, 5:18 AM
Kamakshi Muthukrishnan
11/03/2022, 5:18 AM
ERROR worker.py:399 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::ModinXGBoostActor.train() (pid=29796, ip=172.16.54.211, repr=<modin.experimental.xgboost.xgboost_ray.ModinXGBoostActor object at 0x7fbef05837f0>)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.8/site-packages/modin/experimental/xgboost/xgboost_ray.py", line 166, in train
    with RabitContext(self._rank, rabit_args):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.8/site-packages/modin/experimental/xgboost/utils.py", line 101, in __enter__
    xgb.rabit.init(self.args)
AttributeError: module 'xgboost' has no attribute 'rabit'
Kamakshi Muthukrishnan
11/03/2022, 6:15 AM
Willem Gillis
11/03/2022, 12:23 PM
pyflyte register but hitting a wall. The registration passes, and its workflows and launch plans even show up in Flyte Console; however, when executing it we get the error: ModuleNotFoundError: No module named 'flyte'
When looking at the resulting generated file in the S3 MinIO storage, we find that it is completely empty. Our full command is:
pyflyte --config ~/.flyte/cloud-config.yaml register flyte/workflows -p {{.PROJECT}} --domain {{.DOMAIN}} -v {{add .DOCKER_VERSION 1}} -i "{{.CONTAINER_REGISTRY}}/{{.PROJECT}}:{{.DOCKER_VERSION}}"
Folder structure: see image
Note that using pyflyte package and flytectl register files in the same project does work as expected.
Are we missing something?
Felix Ruess
11/03/2022, 2:37 PM
Edgar Trujillo
11/03/2022, 3:37 PM
FlytePropeller & user defined pod executions logs having for the most part static timestamps? We're running EKS with FluentBit deployed that pushes logs to CloudWatch, but seeing the following
Felix Ruess
11/03/2022, 5:26 PM
FLYTE_AWS_SECRET_ACCESS_KEY not as an env var, so that this secret is not visible to anyone who can get the pod details?
varsha Parthasarathy
11/03/2022, 7:10 PM
flytekit.core.type_engine.RestrictedTypeError: Transformer for type <class 'tuple'> is restricted currently
yujinlee
11/04/2022, 4:15 AM
flyte_config.yaml as follows:
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint:
  authType: Pkce
  insecure: false
  timezone: ??
Any suggestions regarding this feature? I also wonder which Git repository in FlyteOrg to start with.
Thank you!
Sanjay Chouhan
11/04/2022, 7:22 AM
(base) sanjaychouhan@prod-ml-jenkins2:~$ helm upgrade -n flyte -f values-eks.yaml --create-namespace flyte flyteorg/flyte-core
Error: failed to download "flyteorg/flyte-core"
(base) sanjaychouhan@prod-ml-jenkins2:~$
(base) sanjaychouhan@prod-ml-jenkins2:~$ helm repo add flyteorg https://flyteorg.github.io/flyte
Error: repository name (flyteorg) already exists, please specify a different name
Felix Ruess
11/04/2022, 11:49 AM
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and thought I could get the correct ones by specifying the group (k8s secret name). But with how it currently works, I would need to use AWS as the group name and hence can have only one k8s secret with that name....
Padma Priya M
11/04/2022, 1:10 PM
Felix Ruess
11/04/2022, 3:51 PM
Requested MEMORY default [2Gi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration
And I already increased the task resource limits but I still get the same error.
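For reference, the message describes a comparison between two Kubernetes-style quantities, the requested 2Gi and the configured 1Gi. The sketch below reproduces that arithmetic in plain Python so the numbers can be checked by hand. The helpers to_bytes and exceeds_limit are hypothetical, cover only a subset of the quantity suffixes, and are not Flyte's own validation code.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes-style quantity suffixes
# (binary forms such as Gi, decimal forms such as G).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '2Gi' or '500M' to a byte count."""
    # Check two-letter suffixes first so '2Mi' is not read as '2M' plus 'i'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(Decimal(quantity))  # no suffix: plain bytes


def exceeds_limit(requested: str, limit: str) -> bool:
    """True when the requested quantity is larger than the limit."""
    return to_bytes(requested) > to_bytes(limit)


# The values from the error message: 2Gi requested against a 1Gi limit.
print(exceeds_limit("2Gi", "1Gi"))
```

If the same 1Gi figure keeps appearing after a configuration change, that suggests the limit being compared is still the old one rather than the newly configured value.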
Any pointers?
Louis DiNatale
11/04/2022, 4:20 PM
Jay Ganbat
11/04/2022, 9:22 PM
vendor-a9fbc36b.js:2 GET https://.../fizlxcwy-n4-0-dn1-0/n1?limit=10000 net::ERR_INSUFFICIENT_RESOURCES
It might have been showing previously as well but never seemed to be noticeable. It takes a solid minute or two to even show what tasks were executed.
The previous versions were flyteconsole 1.1.6, propeller 1.1.15, and admin 1.1.29.
Any idea what has changed?
Felix Ruess
11/05/2022, 5:19 PM
Felix Ruess
11/05/2022, 5:19 PM
Samhita Alla
11/07/2022, 4:27 AM
Felix Ruess
11/07/2022, 8:16 AM
Samhita Alla
11/07/2022, 10:33 AM
Ketan (kumare3)
11/07/2022, 3:29 PM