# flyte-support
f
Hi! Currently, I am trying to use the flyteconnector BigQuery plugin. However, I run into an error when trying to open the resulting StructuredDataset using:
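import pandas as pd
from flytekit import StructuredDataset, task, workflow
from flytekitplugins.bigquery import BigQueryConfig, BigQueryTask

bq_template = BigQueryTask(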
    name="<name>",
    inputs={},
    query_template="SELECT * FROM <project_id>.<dataset_id>.<table>",
    output_structured_dataset_type=StructuredDataset,
    task_config=BigQueryConfig(ProjectID="<project_id>"),
)
@task(
    container_image=image_name,
)
def convert_bq_table_to_pandas_dataframe(ds: StructuredDataset) -> pd.DataFrame:
    return ds.open(pd.DataFrame).all()

@workflow
def full_bigquery_wf() -> pd.DataFrame:
    ds = bq_template()
    return convert_bq_table_to_pandas_dataframe(ds=ds)
So, what happens is that the BigQuery task queries the data from BigQuery using the flyteconnector service account, but afterwards, when the Python task tries to extract the pandas DataFrame, it is unable to do so:
google.api_core.exceptions.PermissionDenied: 403 Access Denied: Dataset <project_id>:<job_id>: User does not have permission to access results of another user's job.
I have already deployed flyteconnector and enabled the plugin as the documentation describes. Any help would be greatly appreciated :).
a
This does not appear to be a Flyte issue. I think it’s an issue with the role that you created in GCP for BigQuery. See the process for creating a user with the required role and permissions here: https://cloud.google.com/bigquery/docs/quickstarts/quickstart-client-libraries The identity that created the job and the identity trying to access its results are different. You need to grant proper access to the identity that reads the results, so that it can access jobs created by other users. Once that is done, first run the select query you are trying from a standalone Python script and check whether it works. After that, check that you have followed these steps:
• flyte-binary - check the values
• flyte-core - check the values
• helm upgrade
f
Question: is it possible for my configuration to cause the result of the job to go missing? I tried to access the result via the UI with the BigQuery Admin role, but it still gives me this error.
a
No, your job will not go missing. Try the query
query_template="SELECT * FROM <project_id>.<dataset_id>.<table>"
from a standalone Python script and see whether you get the result with the same identity that you configured for the Flyte connector. https://cloud.google.com/bigquery/docs/quickstarts/quickstart-client-libraries#python_1 If the Python code does not work, there can be two issues:
• The job was not created
• The identity/user_id does not have sufficient privileges
If you are asking whether the job was created or not, you need to share the code where you create the job, as the code you have shared only queries for the result.
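A minimal sketch of the standalone check suggested above, assuming the `google-cloud-bigquery` client library is installed and that Application Default Credentials resolve to the identity you want to test (for example via `GOOGLE_APPLICATION_CREDENTIALS`). The helper names `build_query` and `run_query` are hypothetical, and the backticks and `LIMIT` are additions to the original query to keep the test cheap:

```python
from typing import Dict, List


def build_query(project_id: str, dataset_id: str, table: str, limit: int = 10) -> str:
    """Build the same SELECT used in the BigQueryTask, with a row limit added."""
    # Backticks keep project IDs that contain dashes valid in standard SQL.
    return f"SELECT * FROM `{project_id}.{dataset_id}.{table}` LIMIT {limit}"


def run_query(project_id: str, sql: str) -> List[Dict]:
    """Run the query with whatever identity the default credentials resolve to."""
    # Imported lazily so build_query can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job = client.query(sql)
    # result() waits for the job and raises if this identity may not read it.
    rows = job.result()
    # user_email shows which identity actually created the job.
    print(f"job {job.job_id} was created by {job.user_email}")
    return [dict(row.items()) for row in rows]
```

Calling `run_query("<project_id>", build_query("<project_id>", "<dataset_id>", "<table>"))` once per identity (the connector's and the worker's) should show which of them is allowed to create the job and read its results.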
f
Solution: try binding the flyteconnector Kubernetes service account to the Flyte worker Google service account.
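For reference, a rough sketch of what such a binding consists of, assuming the cluster runs on GKE with Workload Identity enabled (the thread does not say this explicitly). The function name and all example values (project, namespace, account names) are hypothetical; the snippet only assembles the strings involved and does not call any GCP or Kubernetes API:

```python
from typing import Dict


def workload_identity_binding(
    gcp_project: str, namespace: str, ksa_name: str, gsa_email: str
) -> Dict[str, object]:
    """Describe the two pieces that link a Kubernetes SA to a Google SA."""
    return {
        # Annotation to put on the Kubernetes service account.
        "ksa_annotation": {"iam.gke.io/gcp-service-account": gsa_email},
        # IAM role to grant on the Google service account ...
        "iam_role": "roles/iam.workloadIdentityUser",
        # ... to this member, which identifies the Kubernetes service account.
        "iam_member": f"serviceAccount:{gcp_project}.svc.id.goog[{namespace}/{ksa_name}]",
    }


# Hypothetical values, for illustration only.
binding = workload_identity_binding(
    gcp_project="my-project",
    namespace="flyte",
    ksa_name="flyteconnector",
    gsa_email="flyte-worker@my-project.iam.gserviceaccount.com",
)
print(binding)
```

With the connector and the worker both resolving to the same Google service account, the job creator and the result reader are the same identity, which is what the 403 above complains about.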