Hi all, I am trying to run a BigQueryTask and load...
# ask-the-community
j
Hi all, I am trying to run a BigQueryTask and load the output into a pd.DataFrame. I followed the example code and defined the following output type:
Dataset = Annotated[StructuredDataset, SCHEMA]
where SCHEMA uses kwtypes() to define the columns. I’ve set Dataset as the output_structured_dataset_type for the BigQueryTask. When I run a task to convert the output to a pd.DataFrame, I get an error:
403 Access Denied: Dataset my-flyte-project:<job_ib>: User does not have permission to access results of another user's job.
The task is defined as shown below:
@task
def convert_bq_table_to_pandas_dataframe(sd: Dataset) -> pd.DataFrame:
    return sd.open(pd.DataFrame).all()
I’ve given the service accounts for flyteworker and flytepropeller the roles BigQuery Admin and Owner, but I still get this error. Has anyone had this error before, or does anyone know how to resolve it? Alternatively, does anyone know the permissions required to get this to work? Thanks
To add more information, it seems that the BigQueryTask is using the flytepropeller service account, whilst the convert_bq_table_to_pandas_dataframe task is using the flyteworker service account. Is this the expected behaviour?
s
j
Hi Samitha, are there any config changes needed for this?
s
I'm not sure. @Eduardo Apolinario (eapolinario) do you know?
d
@Jake Dodd if you run the query in the PR's example, what User do you see?
j
Hi David, when I run a query using the BigQueryTask, the User is flyte-gcp-flytepropeller@my-project.iam.gserviceaccount.com. If I use the bigquery Client in a task, the User is flyte-gcp-flyteworkers@my-project.iam.gserviceaccount.com.
I’ve given the flyteworker service account the Service Account Token Creator permission on the flytepropeller service account.
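For anyone reproducing the identity check discussed above, here is a minimal sketch of how the "User" could be read from inside a task. It assumes the query is BigQuery's SESSION_USER() function; the thread does not show the exact query from the PR's example, so treat that as a guess. The names WHOAMI_SQL and bigquery_session_user are hypothetical helpers, not part of flytekit or the BigQuery client library.

```python
# Sketch: report which identity BigQuery sees for the current credentials.
# Assumption: SESSION_USER() is the query being run; the exact query from
# the PR's example is not shown in this thread.

WHOAMI_SQL = "SELECT SESSION_USER() AS user"


def bigquery_session_user(client) -> str:
    """Return the email BigQuery reports for the credentials behind `client`.

    `client` is expected to behave like google.cloud.bigquery.Client:
    client.query(sql).result() yields rows that support row["user"].
    """
    rows = client.query(WHOAMI_SQL).result()
    # The query returns exactly one row with a single "user" column.
    first_row = next(iter(rows))
    return first_row["user"]
```

Inside a Flyte task this could be called as bigquery_session_user(bigquery.Client()) after "from google.cloud import bigquery", which should print the worker's service account. Comparing that value with the account that ran the BigQueryTask job shows whether the job and the reader are the same principal, which appears to be what the 403 message is complaining about.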