https://flyte.org

Yash Panchwatkar

almost 3 years ago
Hello community, does Flyte have a dependency on AWS SQS? @anyone

Fredrick

about 3 years ago
Hi team, the default-pod-template does not add volumeMounts to Flyte containers. Can this be supported? Here is my pod template. The `scratch-volume` volumeMount does not get added to the Flyte containers.
apiVersion: v1
kind: PodTemplate
metadata:
  name: flyte-pod-template
  namespace: flyte
template:
  metadata:
    labels:
      app: flyte
  spec:
    containers:
    - name: default
      image: docker.io/rwgrim/docker-noop
      volumeMounts:
      - mountPath: /scratch
        name: scratch-volume
    volumes:
    - name: scratch-volume
      emptyDir: {}

KS Tarun

about 3 years ago
Hi, how do I change the container image name in the task decorator so that it is fetched directly from the sandbox.config file? I've tried the following:
@task(container_image="{{.image.trainer.fqn }}:{{.image.trainer.version}}")
My sandbox.config looks like this:
[images]
trainer = ghcr.io/flyteorg/flytecookbook:core-latest
predictor = ghcr.io/flyteorg/flytecookbook:pima_diabetes-d4838f0f5e39a21f845a93b9e3375a675bd75eaa
I'm getting this error:
raise AssertionError(f"Image Config with name {name} not found in the configuration")
AssertionError: Image Config with name trainer not found in the configuration
Any suggestions about this?
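For reference, the `{{.image.trainer.fqn}}` and `{{.image.trainer.version}}` placeholders stand for the image name and the tag of an `[images]` entry called `trainer`. The sketch below is not flytekit's implementation. Assuming an INI-style config like the one above, it only illustrates how such an entry could be looked up and split, and why a missing entry ends in a "not found" error. The helper names `split_image` and `resolve_image` are hypothetical.

```python
import configparser

# Sample config text mirroring the [images] section quoted above.
SANDBOX_CONFIG = """
[images]
trainer = ghcr.io/flyteorg/flytecookbook:core-latest
"""


def split_image(ref):
    """Split 'registry/repo:tag' into (fqn, version); version is None if untagged."""
    # Only a colon in the last path segment is a tag separator
    # (a colon earlier on would be a registry port).
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return ref, None
    fqn, _, version = ref.rpartition(":")
    return fqn, version


def resolve_image(config_text, name):
    """Look up a named image in the [images] section and split it (hypothetical helper)."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(config_text)
    if not parser.has_option("images", name):
        raise KeyError(f"Image config with name {name} not found")
    return split_image(parser.get("images", name))


print(resolve_image(SANDBOX_CONFIG, "trainer"))
```

If the lookup fails for a name that is present in the file, the config being read is probably not the file that was edited, which is worth ruling out first.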

Open AIMP

about 3 years ago
Hi, I'm trying to register a flow and getting a PyTorch error. Any input is greatly appreciated. @Haytham Abuelfutuh @Ketan (kumare3) @Samhita Alla
(XADM-1068/Code$ pyflyte register --image openaimp/flyte:1
Traceback (most recent call last):
  File "/home/xaimpl/.local/bin/pyflyte", line 5, in <module>
    from flytekit.clis.sdk_in_container.pyflyte import main
  File "/home/xaimpl/.local/lib/python3.10/site-packages/flytekit/__init__.py", line 185, in <module>
    from flytekit.extras import pytorch
  File "/home/xaimpl/.local/lib/python3.10/site-packages/flytekit/extras/pytorch/__init__.py", line 26, in <module>
    from .checkpoint import PyTorchCheckpoint, PyTorchCheckpointTransformer
  File "/home/xaimpl/.local/lib/python3.10/site-packages/flytekit/extras/pytorch/checkpoint.py", line 25, in <module>
    class PyTorchCheckpoint:
  File "/home/xaimpl/.local/lib/python3.10/site-packages/flytekit/extras/pytorch/checkpoint.py", line 30, in PyTorchCheckpoint
    module: Optional[torch.nn.Module] = None
AttributeError: module 'torch' has no attribute 'nn'
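An error like this usually means that the `torch` Python resolved is not a complete PyTorch install. One possible cause (an assumption, since the thread does not confirm it) is a local `torch.py` or a partially removed package. A minimal diagnostic sketch using only the standard library shows where `torch` would be loaded from without importing it. The helper name `describe_module` is hypothetical.

```python
import importlib.util


def describe_module(name):
    """Return where Python would load the top-level module `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not importable from the current sys.path
    return {
        "origin": spec.origin,  # file backing the module; None for namespace packages
        "search_locations": list(spec.submodule_search_locations or []),
    }


# Prints None when torch is not installed in this interpreter.
print(describe_module("torch"))
```

If the reported origin is not under `site-packages`, or is missing altogether, reinstalling PyTorch in the same environment that runs `pyflyte` would be the next thing to try.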

Edgar Trujillo

about 3 years ago
Hi, I'm seeing a weird error with workflow executions. Specifically, a `wf node` was `aborted` due to `node timed out`, but the node status is `UNKNOWN`, so the node duration continues to go up. This node just called a SageMaker training job via the SDK with the following code:
estimator = Estimator(
    dummy_params
)

# Start the job
estimator.fit(inputs=data_channels)
job_name = estimator.latest_training_job.job_name
logging.info(f"Started sagemaker training job {job_name}")

return os.path.join(output_location, job_name, "output", "model.tar.gz")
SageMaker shows the training job completed after 55 hours. A similar node was used in another workflow and ran for 1 hour; that node execution shows success and cached the model artifact S3 URI. This has happened a few times now, and I don't want to keep kicking off training jobs that won't be cached by Flyte. Does anyone have an idea what could be causing this `Node Timeout`?

Fredrick

about 3 years ago
Hi team, if I use RawContainer, flytepropeller fails to spawn the pod with the following error.
flytepropeller-7476486c6c-rtkp7 flytepropeller E1002 21:16:23.580394       1 workers.go:102] error syncing 'maps-flyte/a2lmztn5zv6tt26ddk8k': failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [[deny-capabilities] container <a2lmztn5zv6tt26ddk8k-n0-0> has a denied capability. Denied capabilities are ["DAC_READ_SEARCH", "NET_ADMIN", "SYS_ADMIN", "SYS_MODULE", "SYS_PTRACE", "DAC_OVERRIDE", "FOWNER", "KILL", "MKNOD", "NET_BIND_SERVICE", "NET_RAW", "SETFCAP", "SETGID"]] failed to create resource, caused by: admission webhook "validation.gatekeeper.sh" denied the request: [deny-capabilities] container <a2lmztn5zv6tt26ddk8k-n0-0> has a denied capability. Denied capabilities are ["DAC_READ_SEARCH", "NET_ADMIN", "SYS_ADMIN", "SYS_MODULE", "SYS_PTRACE", "DAC_OVERRIDE", "FOWNER", "KILL", "MKNOD", "NET_BIND_SERVICE", "NET_RAW", "SETFCAP", "SETGID"]

Rupsha Chaudhuri

about 3 years ago
Hi team… I’m seeing this error in a fairly long-running Flyte task, which results in the task getting triggered again and again.
object [my_domainctw5x2vdnkvd-n0-0] terminated in the background, manually

Sebastian

about 3 years ago
Hi all, consider a workflow
launch_plan.LaunchPlan.get_or_create(
    workflow=wf,
    name="my_wf_prod", # prod specific
    schedule=CronSchedule(schedule="1 0 * * *"),
    default_inputs={
        "execution_date": datetime.now(), # aside: this does not work. How do I provide an execution date?
    },
    fixed_inputs={
        "out_path": consts.OUT_S3_PATH_PROD # prod specific vars
    },
)
There is also a copy of this launch plan for the dev env, so that I can plug in the dev env vars. But this is silly and requires two copies. How can I parameterize workflows with something like env vars so that I can vary them between prod and dev?
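One way to avoid keeping two copies is to choose the environment-specific values at registration time, for example from an environment variable. The sketch below assumes a hypothetical `FLYTE_ENV` variable and placeholder S3 paths, none of which come from the original message. It only builds the keyword arguments, which could then be passed to `LaunchPlan.get_or_create`.

```python
import os

# Hypothetical per-environment values; the bucket paths are placeholders.
ENV_SETTINGS = {
    "prod": {"out_path": "s3://example-prod-bucket/out"},
    "dev": {"out_path": "s3://example-dev-bucket/out"},
}


def launch_plan_kwargs(env=None):
    """Build the env-specific launch plan arguments (hypothetical helper)."""
    # Fall back to the assumed FLYTE_ENV variable, then to "dev".
    env = env or os.environ.get("FLYTE_ENV", "dev")
    if env not in ENV_SETTINGS:
        raise ValueError(f"Unknown environment: {env!r}")
    return {
        "name": f"my_wf_{env}",
        "fixed_inputs": dict(ENV_SETTINGS[env]),
    }


# Usage sketch: LaunchPlan.get_or_create(workflow=wf, **launch_plan_kwargs())
print(launch_plan_kwargs("prod"))
```

With this shape, a single module can register either variant depending on the environment it is registered from, at the cost of the launch plan name no longer being a literal in the source.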

Zachary Carrico

about 3 years ago
Hi, anyone have a guess as to why `TypeTransformerFailedError` occurs when using imported variables in workflows, but if the variables are declared in the same Python module as the workflow there is no error? Thank you!

Shahwar Saleem

over 3 years ago
My company has policies for domain names for certificate creation. How can I customize the "Address" generated by Flyte core? Should that go in the ingress configuration?
