Hello hello - I'm trying to understand how `pyflyt...
# flyte-support
p
Hello hello - I'm trying to understand how `pyflyte` works with custom dependencies and a custom docker image. If I install custom dependencies in my venv and then run `pyflyte run --remote --image {myRemoteImage} --project flyte workflows/myworkflow.py myworkflowname`, things work as expected. But if I uninstall the custom dependencies locally, `pyflyte run` (and `register` and `package`) complain about the dependencies not existing. Obviously those dependencies are required when the workflow is run remotely, which is why they are included in `myRemoteImage`. But why are they needed for the `register` command, and why must they be installed locally? The documentation says:
Packages and zips up the directory/file that you specify as the argument to pyflyte register, along with any files in the root directory of your project. The result of this is a tarball that is packaged into a .tar.gz file, which also includes the serialized task (in protobuf format) and workflow specifications defined in your workflow code.
The tgz created from `package` contains the serialized tasks and workflows but does not contain any custom dependencies. So are local dependencies only required as some form of validation during serialization?
Presumably only dependencies either passed in as CLI args or the ones incorporated in the custom docker image are actually used during workflow runs?
h
All your observations are correct! During registration, pyflyte loads all the modules and only pseudo-executes the `@workflow` functions, so any packages/modules that are needed for loading or for running the `@workflow` function (with no inputs) need to exist at registration time.
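To illustrate the "needed for loading" part of the answer above, here is a minimal plain-Python sketch (it does not use flytekit). It loads two tiny generated modules: one imports a dependency at the top of the file, the other imports it inside a function body. The package name `my_custom_dep_that_is_not_installed`, the function `my_task`, and the helper `try_load` are all hypothetical names invented for this illustration. The assumption, following the reply above, is that task bodies are not executed at registration, so an import placed inside one would not run at that point.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical package name, assumed NOT to be installed locally.
MISSING = "my_custom_dep_that_is_not_installed"

# The import statement runs as soon as the module is loaded.
TOP_LEVEL_IMPORT = f"""
import {MISSING}

def my_task():
    return {MISSING}.compute()
"""

# The import statement runs only when the function body is executed.
DEFERRED_IMPORT = f"""
def my_task():
    import {MISSING}
    return {MISSING}.compute()
"""


def try_load(source: str, name: str) -> str:
    """Write the source to a temporary file, load it as a module, report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / f"{name}.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)
        except ModuleNotFoundError as exc:
            return f"load failed: {exc.name}"
        return "loaded"


top_level_result = try_load(TOP_LEVEL_IMPORT, "wf_top_level")
deferred_result = try_load(DEFERRED_IMPORT, "wf_deferred")
print(top_level_result)  # load failed: my_custom_dep_that_is_not_installed
print(deferred_result)   # loaded
```

This only shows the underlying Python behaviour: a top-level import fails as soon as the file is loaded, while a deferred import fails only if the function is actually called. Whether deferring imports is a good fit for a given project is a separate judgement call.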
p
Is the requirement that packages/modules be installed locally a Python/technical necessity, or is it an intentional safety/validation thing for pyflyte?
h
That's an interesting distinction. pyflyte does rely on Python for initial safety/validation. However, you can build the full spec of the tasks (docker image + interface description) and workflows (interface description and node graphs/data flow) completely outside of Python and submit that through `pyflyte register` or `flytectl register files`, and that will bypass the Python checks/validations. That full spec is in protobuf. We do have a product in union.ai that allows you to design the workflow fully in the browser (drag-drop style) and rely on the registration API to surface any compilation errors...
p
OK, that's helpful. Thanks a lot. One (probably) last question on this topic - is it correct to say that one could trigger `pyflyte run -d production --remote --image...` and have a different local dependency version installed than what actually gets used when the workflow execution runs?
h
Correct...
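Since the local and remote versions can differ, one way to see what is installed locally is the standard library's `importlib.metadata`. The sketch below is only a rough aid: the helper `local_version` and the package name `my_custom_dep` are hypothetical, and the script says nothing about what the image contains. That would have to be checked separately, for example by running `pip freeze` inside the container.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def local_version(package: str) -> Optional[str]:
    """Return the locally installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "my_custom_dep" is a placeholder; substitute the dependencies you care about.
for name in ["flytekit", "my_custom_dep"]:
    print(f"{name}: {local_version(name) or 'not installed locally'}")
```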