Frank Shen
10/21/2022, 6:01 PM
Frank Shen
10/21/2022, 9:52 PM
import typing

from feathr.client import FeathrClient
from feathr_client_wrapper.feathr_client import generate_feathr_client
from flytekit import task, workflow

# @task
def get_feathr_client() -> FeathrClient:
    # Builds the Feathr client; called directly from the workflow below.
    client = generate_feathr_client(
        team="customer",
        environment="dev",
        organization="wm",
        cluster_size="small",
        project_name="test_project",
        email="test@warnermedia.com",
        aws_registry=True,
    )
    return client

@task
def list_features(client: FeathrClient, project_name: str) -> typing.List:
    return client.list_registered_features(project_name=project_name)

@workflow
def wf(project_name: str = 'feathr_demo') -> typing.List:
    client = get_feathr_client()
    return list_features(client=client, project_name=project_name)

if __name__ == "__main__":
    print(wf())
Frank Shen
10/21/2022, 9:55 PM
TypeError: can't pickle _thread.lock objects
...
Failed to Bind variable client for function feathr_example.list_features
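The TypeError above is what Python's pickle raises for any object that holds a thread lock, and flytekit appears to fall back to pickling values whose type it has no transformer for. Below is a minimal sketch of the failure and of one common workaround: pass plain configuration across the task boundary and build the client inside the task that uses it. `FakeClient` and `make_client` are hypothetical stand-ins for `FeathrClient` and `generate_feathr_client`; that the real client holds a lock in the same way is an assumption.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a client object that owns a thread lock."""

    def __init__(self, project_name: str):
        self.project_name = project_name
        self._lock = threading.Lock()  # this attribute is what pickle rejects

    def list_registered_features(self) -> list:
        return [f"{self.project_name}.feature_a"]


def can_pickle(obj) -> bool:
    """Return True if obj survives pickle.dumps, False on TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


def make_client(config: dict) -> FakeClient:
    """Hypothetical factory, playing the role of generate_feathr_client."""
    return FakeClient(project_name=config["project_name"])


def list_features(config: dict) -> list:
    # Build the client inside the function that needs it, so only the
    # plain, picklable config dict has to cross the task boundary.
    client = make_client(config)
    return client.list_registered_features()


config = {"project_name": "feathr_demo"}
print(can_pickle(FakeClient("feathr_demo")))  # the client itself
print(can_pickle(config))                     # the plain config
print(list_features(config))
```

In the posted workflow this would correspond to calling the client factory inside `list_features` rather than passing the client in as an input.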
Frank Shen
10/21/2022, 9:59 PM
Alex Pozimenko
10/21/2022, 9:59 PM
flytectl? The documentation says that flytectl delete can do that (here), but when I run the tool, project is not listed in the available commands:
flytectl delete --help
Delete a resource; if an execution:
::
 flytectl delete execution kxd1i72850 -d development -p flytesnacks
Usage:
  flytectl delete [command]
Available Commands:
  cluster-resource-attribute Deletes matchable resources of cluster attributes.
  execution                  Terminates/deletes execution resources.
  execution-cluster-label    Deletes matchable resources of execution cluster label.
  execution-queue-attribute  Deletes matchable resources of execution queue attributes.
  plugin-override            Deletes matchable resources of plugin overrides.
  task-resource-attribute    Deletes matchable resources of task attributes.
  workflow-execution-config  Deletes matchable resources of workflow execution config.
Sampath Vaddadi
10/21/2022, 10:12 PM
Laura Lin
10/21/2022, 10:26 PM
pyflyte register -p project_name -d domain_name --output s3://my-s3-bucket/raw_data
<-- subbing in my own bucket, I get FileNotFoundError: [Errno 2] No such file or directory: '<GIT REPO PATH>/s3:/my-s3-bucket/fast-serialize/fast3e198a8e9dd654e746828c0ae929fce3.tar.gz', and when I don't feed in a --output, it creates a new folder inside my-s3-bucket. When I run it from the UI, I get tar: development/fast3e198a8e9dd654e746828c0ae929fce3.tar.gz: Cannot open: Not a directory
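The `s3:/` with a single slash in the FileNotFoundError is what you would see if the S3 URI were joined onto a local directory and normalized as a filesystem path. The sketch below reproduces only that string transformation with the standard library. It illustrates the symptom and is not a claim about what pyflyte does internally; `/path/to/repo` is a made-up directory and both helper names are hypothetical.

```python
import posixpath

repo_dir = "/path/to/repo"  # hypothetical local working directory
output = "s3://my-s3-bucket/fast-serialize"  # URI mistaken for a relative path


def as_local_path(base: str, target: str) -> str:
    """Join target onto base and normalize, as local filesystem code would."""
    return posixpath.normpath(posixpath.join(base, target))


def looks_like_remote(target: str) -> bool:
    """Crude check for a URI scheme such as s3:// or gs://."""
    return "://" in target


mangled = as_local_path(repo_dir, output)
print(mangled)                    # the double slash after "s3:" is collapsed
print(looks_like_remote(output))  # a check like this would avoid the join
```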
Laura Lin
10/25/2022, 1:39 AM
max-array-job-size but it didn't seem to work.
Panos Strouth
10/25/2022, 9:27 AM
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNKNOWN
    details = "failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials
caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
    status code: 403, request id: 88d09420-d2e3-4772-8767-83cff32d91af"
    debug_error_string = "UNKNOWN:Error received from peer ipv4:xx.xx.xx.xx:443 {grpc_message:"failed to create a signed url. Error: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403
Seems like an error in IRSA (IAM Roles for Service Accounts). The installation guide suggests attaching IAM roles to the whole EC2 node. Personally, I decided to use IRSA because I think it is the correct way to grant permissions to applications. Using EC2-wide roles means that every application running on the instance gets the role's permissions. With IRSA, you allow IAM roles to be assumed only by applications running in specific namespaces, which gives more fine-grained control. But as I said, I am still a K8s beginner, so no strong opinion.
My IAM setup has 2 roles: flyte-user-role and iam-role-flyte.
Both roles have full S3 permissions. The most important part is the trust policy.
Since I use IRSA, both roles have the following trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::xxxxxxxx:oidc-provider/oidc.eks.eu-central-1.amazonaws.com/id/yyyyyy"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-central-1.amazonaws.com/id/yyyyyy:aud": "sts.amazonaws.com",
          "oidc.eks.eu-central-1.amazonaws.com/id/yyyyyy:sub": "system:serviceaccount:flyte:default"
        }
      }
    }
  ]
}
Note the “flyte” namespace in the Condition. My Flyte services run in the “flyte” namespace, and they should be able to assume the above roles.
I think the problem is related to the IAM trust policies, because the Flyte service does not have the required permissions to assume the IAM role.
Has anyone faced a similar issue?
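One thing the trust policy above makes easy to check: its `sub` condition uses `StringEquals`, so only the exact subject `system:serviceaccount:flyte:default` matches, not every service account in the “flyte” namespace. Here is a small sketch of that comparison in plain Python. It deliberately ignores everything except `StringEquals` on `sub`, and the service account name `flyteadmin` in the usage lines is an assumption about how the pods might be configured, not something stated in this thread.

```python
PROVIDER = "oidc.eks.eu-central-1.amazonaws.com/id/yyyyyy"

TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{PROVIDER}:aud": "sts.amazonaws.com",
                    f"{PROVIDER}:sub": "system:serviceaccount:flyte:default",
                }
            },
        }
    ],
}


def allowed_subjects(policy: dict, provider: str) -> set:
    """Collect the exact `sub` values permitted by StringEquals conditions."""
    subjects = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        equals = statement.get("Condition", {}).get("StringEquals", {})
        value = equals.get(f"{provider}:sub")
        if isinstance(value, str):
            subjects.add(value)
        elif isinstance(value, list):
            subjects.update(value)
    return subjects


def may_assume(policy: dict, provider: str, namespace: str, account: str) -> bool:
    """Simplified check: does this service account match an allowed subject?"""
    subject = f"system:serviceaccount:{namespace}:{account}"
    return subject in allowed_subjects(policy, provider)


print(may_assume(TRUST_POLICY, PROVIDER, "flyte", "default"))
print(may_assume(TRUST_POLICY, PROVIDER, "flyte", "flyteadmin"))  # assumed name
```

If the pod that signs URLs runs under a service account other than `default`, a mismatch like the second call would be consistent with the AccessDenied error, so comparing the pod's `serviceAccountName` against the policy is a cheap first check.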
Any help is appreciated!
Adedeji Ayinde
10/25/2022, 5:56 PM
img = ImageConfig.from_images(
    "cr.flyte.org/flyteorg/flyte-sandbox-lite:sha-4f73dc6994dfeafb9eecd9b17d16d7f9275b577a",
)
t1 = remote.register_task(
    entity=generate_normal_df,
    serialization_settings=SerializationSettings(image_config=img),
    version="v1.0",
)
I got the following error message:
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[abdmkkdhj6rvsx6cvkww-n0-0] terminated with exit code (1). Reason [Error]. Message:
Starting Docker daemon...
Terminated
Timed out while waiting for dockerd to start
Tom Melendez
10/25/2022, 8:56 PM
Craig Amundsen
10/25/2022, 10:36 PM
Edgar Trujillo
10/26/2022, 2:26 AM
Failed to convert return value for var o0 for function src.tasks.batch_transform.load_bt_result with error <class 'TypeError'>: unhashable type: 'list'.
SYSTEM ERROR! Contact platform administrators.
We've double checked that this task is actually returning a pandas df. We've bumped the cache version as well.
The flyte propeller logs show:
{"json":{"exec_id":"f6d3b8ca69b2149bbbbb","node":"n3/n3","ns":"forecaster-training-dev","res_ver":"38555329","routine":"worker-19","tasktype":"python-task","wf":"forecaster-training:dev:src.workflow.end_to_end"},"level":"warning","msg":"No plugin found for Handler-type [python-task], defaulting to [container]","ts":"2022-10-26T01:19:57Z"}
{"json":{"exec_id":"f6d3b8ca69b2149bbbbb","node":"n4/n3","ns":"forecaster-training-dev","res_ver":"38555329","routine":"worker-19","wf":"forecaster-training:dev:src.workflow.end_to_end"},"level":"warning","msg":"No plugin found for Handler-type [python-task], defaulting to [container]","ts":"2022-10-26T01:19:57Z"}
{"json":{"exec_id":"f6d3b8ca69b2149bbbbb","node":"n4/n3","ns":"forecaster-training-dev","res_ver":"38555329","routine":"worker-19","wf":"forecaster-training:dev:src.workflow.end_to_end"},"level":"warning","msg":"Failed to record taskEvent, error [EventAlreadyInTerminalStateError: conflicting events; destination: ABORTED, caused by [rpc error: code = FailedPrecondition desc = invalid phase change from FAILED to ABORTED for task execution {resource_type:TASK project:\"forecaster-training\" domain:\"dev\" name:\"src.tasks.batch_transform.load_bt_result\" version:\"KZ5UyzZ2adcfdjkL4edUwg==\" node_id:\"n4-0-n3\" execution_id:\u003cproject:\"forecaster-training\" domain:\"dev\" name:\"f6d3b8ca69b2149bbbbb\" \u003e 0 {} [] 0}]]. Trying to record state: ABORTED. Ignoring this error!","ts":"2022-10-26T01:19:57Z"}
Schleppo
10/26/2022, 6:51 AM
Katrina P
10/26/2022, 1:36 PM
George D. Torres
10/26/2022, 4:39 PM
/api/v1/data/executions/{id}
with an execution id, the JSON response I get can't be unmarshaled into the proper struct (located here). I'm especially having a hard time unmarshaling `core.Literal`s
Augie Palacios
10/26/2022, 10:12 PM
flyteconsole image that could be deployed on an airgapped network? We are trying to give users access to the Flyte UI, but when it is deployed on-site the UI breaks because they are unable to reach the following endpoints:
<script crossorigin="" src="https://unpkg.com/react@16.13.1/umd/react.production.min.js"></script>
<script crossorigin="" src="https://unpkg.com/react-dom@16.13.1/umd/react-dom.production.min.js"></script>
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-0QW4DJWJ20"></script>
We can probably fork and make a quick edit to get this working with no internet connection, but didn't want to duplicate work if y'all had something already.
Carlos Cervantes
10/27/2022, 12:46 AM
configmap:
  ...
  k8s:
    plugins:
      k8s:
        ...
        default-node-selector:
          algorithm-node: "true"
          "cloud.google.com/gke-smt-disabled": "false"
but in my pod task, I add a V1PodSpec with a different node-selector
@task(task_config=Pod(pod_spec=V1PodSpec(
    node_selector={
        "large-ssd-node": "true",
        "cloud.google.com/gke-large-ssd": "true",
    },
    ...)
def mytask():
However, when I look at the YAML that ultimately gets generated, it seems to be a merge of the two different node selectors:
nodeSelector:
  algorithm-node: "true"
  cloud.google.com/gke-large-ssd: "true"
  cloud.google.com/gke-smt-disabled: "false"
  large-ssd-node: "true"
Is this expected behavior? If so, is there a way to replace the default-node-selector, so I only end up with
nodeSelector:
  cloud.google.com/gke-large-ssd: "true"
  large-ssd-node: "true"
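The generated nodeSelector quoted above is consistent with a key-wise merge of the two maps, in which task-level keys are added to the defaults rather than replacing them. The sketch below only models that observed result in plain Python. It is not taken from the Flyte source, `merge_selectors` is a hypothetical helper, and letting the task-level value win on a key clash is an assumption, since the example has no overlapping keys.

```python
default_node_selector = {
    "algorithm-node": "true",
    "cloud.google.com/gke-smt-disabled": "false",
}

task_node_selector = {
    "large-ssd-node": "true",
    "cloud.google.com/gke-large-ssd": "true",
}


def merge_selectors(defaults: dict, overrides: dict) -> dict:
    """Key-wise merge: keep every default, then add or overwrite with overrides."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


merged = merge_selectors(default_node_selector, task_node_selector)
for key in sorted(merged):
    print(f'{key}: "{merged[key]}"')
```

Under a merge like this, a task can add keys or change a default's value, but it cannot remove a default key, which matches the four-entry result in the question.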
Sujith Samuel
10/27/2022, 6:50 AM
nvidia.com/gpu
Is there any way to get MIGs into this mix of things?
@SeungTaeKim I see that you were working on this 4 months back. Did you get a solution to this issue? Please assist.
Sujith Samuel
10/27/2022, 2:16 PM
Tarmily Wen
10/27/2022, 4:35 PM
Jay Ganbat
10/27/2022, 8:18 PM
Traceback (most recent call last):
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/core/base_task.py", line 479, in dispatch_execute
    native_outputs = self.execute(**native_inputs)
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/core/python_function_task.py", line 163, in execute
    return self.dynamic_execute(self._task_function, **kwargs)
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/core/python_function_task.py", line 268, in dynamic_execute
    return self.compile_into_workflow(ctx, task_function, **kwargs)
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/core/python_function_task.py", line 204, in compile_into_workflow
    literals={
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/core/python_function_task.py", line 205, in <dictcomp>
    binding.var: binding.binding.to_literal_model() for binding in workflow_spec.template.outputs
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/models/literals.py", line 467, in to_literal_model
    return Literal(map=LiteralMap(literals={k: binding.to_literal_model() for k, binding in self.map.bindings}))
  File "/fn/lib/venv/lib/python3.10/site-packages/flytekit/models/literals.py", line 467, in <dictcomp>
    return Literal(map=LiteralMap(literals={k: binding.to_literal_model() for k, binding in self.map.bindings}))
ValueError: too many values to unpack (expected 2)
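The last frame of the traceback iterates over `self.map.bindings` directly while unpacking two names. Assuming `bindings` is a plain dict keyed by strings (which the traceback suggests but does not show), iterating it yields only the keys, and unpacking a key longer than two characters raises this ValueError. A minimal sketch of that behaviour with made-up keys and values:

```python
# Hypothetical stand-in for a dict of output-name -> binding.
bindings = {"metric_a": "binding-a", "metric_b": "binding-b"}


def broken_copy(mapping: dict) -> dict:
    # Iterating a dict yields keys only, so each key string gets unpacked
    # character by character into (k, v).
    return {k: v for k, v in mapping}


def fixed_copy(mapping: dict) -> dict:
    # .items() yields (key, value) pairs, which is what the unpacking expects.
    return {k: v for k, v in mapping.items()}


try:
    broken_copy(bindings)
    error = ""
except ValueError as exc:
    error = str(exc)

print(error)                 # message begins with "too many values to unpack"
print(fixed_copy(bindings))
```

If that reading is right, the failure is in how the dict binding is converted, not in how the dynamic task builds its dictionary.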
So we do construct the dictionary in the dynamic task and return it, like this:
return in_fastq1, in_fastq2, {k: get_empty_flyte_file() for k in in_metrics_keys}
Is this expected?
Sathish kumar Venkatesan
10/28/2022, 7:50 AM
FLYTE_SNOWFLAKE_CLIENT_TOKEN: <JWT_TOKEN>
Andrew Korzhuev
10/28/2022, 11:45 AM
s3://.../2y/project-x/development/
s3://.../metadata/propeller/project-x-development-aq9kwx7nbqmxhccp2wgj/
Hampus Rosvall
10/28/2022, 12:07 PM
flyte namespaces, grepping for the task name yields the following logs (in comment).
Rahul Mehta
10/28/2022, 3:52 PM
Adedeji Ayinde
10/28/2022, 4:20 PM
Yash Panchwatkar
10/28/2022, 5:36 PM
Yash Panchwatkar
10/28/2022, 5:39 PM
Adedeji Ayinde
10/28/2022, 9:14 PM
Adedeji Ayinde
10/28/2022, 9:14 PM
Kevin Su
10/28/2022, 10:15 PM
Adedeji Ayinde
11/02/2022, 5:59 PM
Kevin Su
11/02/2022, 6:52 PM
Jason Porter
11/02/2022, 8:31 PM