Jake Neyer

almost 4 years ago
Hey, I'm having trouble registering a workflow. It looks like the tasks are registering fine:
flytectl register files --project chariot-sdk-test --domain development --archive out.tar.gz --version v2

 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
| NAME (4)                                                             | STATUS  | ADDITIONAL INFO                                            |
 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/0_pm.nb.new.ipynb_1.pb                        | Success | Successfully registered file                               |
 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/1_new.ipynb_1.pb                              | Success | Successfully registered file                               |
 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/2_workflows.new_workflow.nb_to_python_wf_2.pb | Failed  | Error registering file due to rpc error: code =            |
|                                                                      |         | Internal desc = failed to compile workflow for             |
|                                                                      |         | [resource_type:WORKFLOW project:"chariot-sdk-test"         |
|                                                                      |         | domain:"development"                                       |
|                                                                      |         | name:"workflows.new_workflow.nb_to_python_wf" version:"v2" |
|                                                                      |         | ] with err entry not found                                 |
 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
| /tmp/register024275035/3_workflows.new_workflow.nb_to_python_wf_3.pb | Failed  | Error registering file due to rpc error: code =            |
|                                                                      |         | NotFound desc = entry not found                            |
 ---------------------------------------------------------------------- --------- ------------------------------------------------------------
Any ideas why that might happen?
Frank Shen

over 2 years ago
Hello, I am running a BigQuery task locally. I am using the latest flytekit 1.7.0, with flytekitplugins-bigquery 1.7.0 and its dependencies (google-cloud-bigquery==3.11.3, fsspec==2023.3.0, etc.) installed locally. The bigquery_task succeeded in retrieving the dataset from BigQuery via a select statement. However, the result failed to convert to pd.DataFrame in a subsequent task.
Error: Protocol not known: bq
Stack trace:
/venv/lib/python3.8/site-packages/fsspec/registry.py:209 in get_filesystem_class
    if protocol not in registry:
        if protocol not in known_implementations:
            raise ValueError("Protocol not known: %s" % protocol)
My code:
# from typing import Tuple
try:
    from typing import Annotated
except ImportError:
    from typing_extensions import Annotated


import pandas as pd
from flytekit import task, workflow, StructuredDataset, kwtypes
from flytekitplugins.bigquery import BigQueryConfig, BigQueryTask
import google.cloud.bigquery


bigquery_task = BigQueryTask(
    name="sql.bigquery.test",
    inputs=kwtypes(version=int),
    query_template="SELECT * FROM `bigquery-public-data.crypto_dogecoin.transactions` WHERE version = @version LIMIT 2;",
    task_config=BigQueryConfig(ProjectID=""),
    output_structured_dataset_type=pd.DataFrame,
)

@task
def preproc(df: pd.DataFrame) -> None:
    print(df.head())


@workflow
def wf(version: int = 1) -> None:
    preproc(df=bigquery_task(version=version))
Do you have any idea?
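For readers following the stack trace above, the failing check can be sketched roughly as below. This is a simplified stand-in written for illustration, not fsspec's actual implementation: the two dictionaries, their contents, the `protocol_of` helper, and the example URIs are all hypothetical. It only shows that a URI whose scheme appears in neither lookup table ends in the same `ValueError`.

```python
# Simplified stand-in for the lookup seen in the stack trace.
# NOT fsspec's real code: both tables and their entries are hypothetical.
registry = {"file": "LocalFileSystem"}  # implementations already loaded
known_implementations = {               # implementations that could be loaded on demand
    "gs": "gcsfs.GCSFileSystem",
    "s3": "s3fs.S3FileSystem",
}


def protocol_of(uri: str) -> str:
    """Return the scheme of a URI, e.g. 'bq' for 'bq://project.dataset.table'."""
    return uri.split("://", 1)[0] if "://" in uri else "file"


def get_filesystem_class(protocol: str) -> str:
    """Look a protocol up in both tables, raising as the trace does when it is unknown."""
    if protocol not in registry:
        if protocol not in known_implementations:
            raise ValueError("Protocol not known: %s" % protocol)
        # Cache the implementation once it has been resolved.
        registry[protocol] = known_implementations[protocol]
    return registry[protocol]


if __name__ == "__main__":
    # Hypothetical URIs; the second one has a scheme that no table knows about.
    for uri in ("gs://bucket/data.parquet", "bq://project.dataset.table"):
        try:
            print(uri, "->", get_filesystem_class(protocol_of(uri)))
        except ValueError as err:
            print(uri, "->", err)
```

If this reading is right, the error suggests that something tried to open a `bq://` location as an ordinary file path, so one thing worth checking is which component is expected to handle that scheme in the local run.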