Hi, I am curious to what extent the registration p...
# ask-the-community
s
Hi, I am curious to what extent the registration process of Flyte is customizable. Is it possible to define custom rules that validate workflows before registering them? Let's say that I would like to enforce that all workflows added to a project or domain include a task that generates a PyTorchModel or have at least one task that is scheduled on a GPU. Can I also somehow query the registered workflows? Say, for example, I want to find workflows that are tagged with metadata such as "Version XYZ" or based on their structure "has task with PyTorchModel output". Thanks!
Also, can I tag workflows with custom metadata, for example, "object detection workflow"?
Okay, I think I have found an answer to my last question. It seems like I could use [Annotations](https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/productionizing/workflow_labels_annotations.html) for this. Are these annotations stored in the DataCatalog of Flyte? If so, can I then query the DataCatalog for these annotations?
d
@Stefan Werner these should be stored in the flyteadmin DB. I haven't tried it, but maybe you could query with `flytectl get execution -p <project> -d <domain> --filter.fieldSelector=labels.XYZ`
s
Thanks for the response! I'll try it out 🙂 Do you know about the registration process as well? It would be great to have some kind of customizable validation in place there 🙌
k
You can do customizable validation: just create your own CLI. All the comparators are available programmatically.
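To make the "create your own CLI" idea concrete, here is a minimal sketch of a pre-registration check in plain Python. It is not Flyte's API: `TaskInfo`, `WorkflowInfo`, the rule functions, and `validate` are all hypothetical names, and the sketch assumes you have already extracted task names, GPU requests, and output type names from your workflow definitions by whatever means your tooling provides.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TaskInfo:
    """Hypothetical summary of one task (not a Flyte type)."""
    name: str
    gpu_count: int = 0
    output_types: List[str] = field(default_factory=list)


@dataclass
class WorkflowInfo:
    """Hypothetical summary of one workflow (not a Flyte type)."""
    name: str
    tasks: List[TaskInfo]


# A rule returns None when the workflow passes, or an error message otherwise.
Rule = Callable[[WorkflowInfo], Optional[str]]


def requires_gpu_task(wf: WorkflowInfo) -> Optional[str]:
    if any(t.gpu_count > 0 for t in wf.tasks):
        return None
    return f"{wf.name}: no task requests a GPU"


def requires_pytorch_output(wf: WorkflowInfo) -> Optional[str]:
    if any("PyTorchModel" in t.output_types for t in wf.tasks):
        return None
    return f"{wf.name}: no task produces a PyTorchModel"


def validate(wf: WorkflowInfo, rules: List[Rule]) -> List[str]:
    """Run every rule and collect the error messages of the ones that fail."""
    errors = []
    for rule in rules:
        message = rule(wf)
        if message is not None:
            errors.append(message)
    return errors


if __name__ == "__main__":
    wf = WorkflowInfo(
        name="train_detector",
        tasks=[TaskInfo(name="train", gpu_count=1, output_types=["PyTorchModel"])],
    )
    problems = validate(wf, [requires_gpu_task, requires_pytorch_output])
    # Only proceed to the real registration step if no rule failed.
    print("ok" if not problems else "\n".join(problems))
```

A wrapper CLI could run `validate` over every workflow in a package and call the normal registration command only when the returned list is empty.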
s
Okay, thanks for the suggestion!
b
@David Espejo (he/him) I believe that the labels are stored in the DB as proto and there's no clean way to query for these resources. Is that your understanding as well?
d
@Blake Jackson I think so. I haven't been able to query, but I see indeed on the LaunchPlan proto a line for the labels: https://github.com/flyteorg/flyteidl/blob/6363acca3d210eaf886c97d85122c5f26b0411ae/protos/flyteidl/admin/launch_plan.proto#L99
b
Yeah, I ended up digging into this a bit more. The data is stored as `bytea` in Postgres, and those bytes are indeed proto. I created a GH issue that hopefully gets some traction, but I also see that a `tags` concept was introduced, so potentially that can solve our use case once filtering works with tags. Looking forward to seeing how that is implemented.
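Until server-side filtering exists, one possible workaround is to fetch the rows, decode the bytes, and filter on the client. Below is a rough sketch of that idea. Everything in it is an assumption for illustration: `filter_by_label` is a hypothetical helper, the rows are made-up `(name, bytes)` pairs, and `decode_json_labels` is a JSON stand-in for whatever function actually decodes the stored proto bytes into a label dict.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical row shape: (resource name, serialized bytes read from the DB).
Row = Tuple[str, bytes]


def filter_by_label(
    rows: Iterable[Row],
    decode_labels: Callable[[bytes], Dict[str, str]],
    key: str,
    value: str,
) -> List[str]:
    """Return the names of rows whose decoded labels contain key == value."""
    matches = []
    for name, blob in rows:
        labels = decode_labels(blob)
        if labels.get(key) == value:
            matches.append(name)
    return matches


def decode_json_labels(blob: bytes) -> Dict[str, str]:
    """Stand-in decoder for this sketch only; real rows hold proto, not JSON."""
    return json.loads(blob.decode("utf-8"))


if __name__ == "__main__":
    rows = [
        ("lp-a", b'{"team": "vision"}'),
        ("lp-b", b'{"team": "nlp"}'),
    ]
    print(filter_by_label(rows, decode_json_labels, "team", "vision"))
```

Keeping the decoder as a parameter means the filtering logic stays the same if the stand-in is swapped for a real proto decoder.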