Hello, I am trying to use jinjasql in flyte task. ...
# ask-the-community
f
Hello, I am trying to use jinjasql in a Flyte task. jinjasql will need to read a templated SQL file (a text file with the extension .sql) from a subdirectory somewhere in the project. I can imagine the Flyte task failing with a path / file not found error, because the SQL file is not copied to the remote Flyte server when pyflyte run or pyflyte register is executed. Am I correct? What would the solution be if I still want to use jinjasql? I am using flytekit 1.2.4.
j
you probably want to build a custom image first that has the .sql file copied in. then you can run pyflyte run --remote --image <CUSTOM_IMAGE>...
I believe ImageSpec can handle this case for you cleanly, but will defer to someone else to confirm.
f
@jeev, thank you for the ideas. I cannot afford to build a custom image every time a new arbitrary text or JSON file is added to the projects, because multiple teams will share the same image (which is managed by a central platform team). It will quickly become inefficient and unmanageable.
j
fast register will work here too, as long as the json files are included in the source root. it will bundle the whole package up and extract it in the container at runtime. but afaik, pyflyte run only works with a single file.
f
@jeev, thanks for the idea. I’ve tried pyflyte register instead of pyflyte run. However, it still complained about the file missing in /root, the same way pyflyte run did.
Is pyflyte register not performing fast register?
Also, the doc for flytectl register (if that’s what you meant by fast register) is very vague: https://docs.flyte.org/projects/flytectl/en/latest/gen/flytectl_register_files.html What is the syntax to bundle the whole project up via fast register? 🙏
j
pyflyte register does do fast-register by default i believe
f
@jeev, I tried pyflyte register and it didn’t work. It threw the same error: ‘/root/[somefile.sql] not found’
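One way to narrow down an error like this is to log which data files are actually present under the container's working directory when the task runs. A minimal sketch using only the standard library; the helper name list_data_files and the default suffixes are hypothetical choices for illustration, not part of flytekit:

```python
from pathlib import Path
from typing import List, Tuple


def list_data_files(root: str = ".",
                    suffixes: Tuple[str, ...] = (".json", ".sql")) -> List[str]:
    """Return data files found under root, as sorted relative POSIX paths."""
    base = Path(root)
    return sorted(
        p.relative_to(base).as_posix()
        for p in base.rglob("*")
        if p.is_file() and p.suffix in suffixes
    )
```

Printing the result of list_data_files() at the start of the task body would show whether the .sql or .json file was bundled and extracted at all, or whether it is simply in a different directory than the code expects.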
j
@Frank Shen it will be hard to assist without more context. i'll try to create a small example for you.
f
Hi @jeev, I have a simple workflow that reads from a json file in a task.
from typing import Any, Dict
import json
from flytekit import task, workflow

@task
def get_config() -> Dict[str, Any]:
    # Open the JSON file
    with open('parameters.json') as f:
        # Load the contents of the file into a dictionary
        params = json.load(f)
        print(type(params))
        print(params)
        return params


@workflow
def wf() -> Dict[str, Any]:
    return get_config()
In parameters.json (which is in the same folder as the workflow file):
{
  "COMMON": {
    "table_name": "ABC",
    "threshold": 0.4
  }
}
Could you try registering them with pyflyte register and see whether you can get it to run on the remote Flyte server? Note: I can run it locally without issue. Remotely, however, it errors with: ‘/root/parameters.json is not found’. This is the command I run to fast register (I only pass the workflow .py file name, not the package). Could this be the root cause of the issue? What is the correct pyflyte register syntax to register the whole package?
pyflyte register sample_workflow.py
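Independently of how the package is registered, the task can be made less sensitive to the container's working directory by resolving the file relative to the module rather than the current directory. A minimal sketch, assuming parameters.json sits next to the workflow file and is bundled with it; load_params and its base_dir argument are hypothetical names, not part of flytekit:

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def load_params(filename: str = "parameters.json",
                base_dir: Optional[Path] = None) -> Dict[str, Any]:
    """Load a JSON file that lives next to this module (hypothetical helper)."""
    if base_dir is None:
        # Directory of this source file, independent of the working directory.
        base_dir = Path(__file__).resolve().parent
    path = base_dir / filename
    if not path.is_file():
        # Report the full path so a missing file is easy to spot in task logs.
        raise FileNotFoundError(f"{path} was not found; was it packaged with the code?")
    with path.open() as f:
        return json.load(f)
```

The body of get_config could then be reduced to returning load_params(). This does not by itself get the file into the container; it only removes the dependence on the working directory being /root.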
j
Screenshot 2023-06-01 at 10.53.33 AM.png
looks very similar to yours
f
Is my command to fast register not correct?
pyflyte register sample_workflow.py
j
i think you need to provide project and domain as well
i have an example command in readme
you should be able to work backwards from there
let me know if it works @Frank Shen
f
@jeev The URL above didn’t work: “404 - page not found. Cannot find a valid ref in jeev/pyflyte-register/001-simple-pyflyte-register”
j
ah sorry i merged
f
Hi @jeev, thanks. Is there a way to exclude a directory when running pyflyte register? I created a venv inside the project, and that subdirectory contains a lot of installed dependencies. This makes the packaging step take a long time during registration.
j
unfortunately no. but you can create a venv outside of the directory
f
True
@jeev, I also verified that pyflyte register works. I don’t know why it didn’t work for me before. Thanks a lot! That should solve the problem with reading from a .sql file.
j
glad it works!