cuddly-jelly-27016
10/16/2024, 8:49 PM

read_session = client.create_read_session(parent=parent, read_session=requested_session)
stream = read_session.streams[0]
reader = client.read_rows(stream.name)
frames = []
for message in reader.rows().pages:
    frames.append(message.to_dataframe())
return pd.concat(frames)
Expected behavior
The BigQuery decoder should read all of the data for larger datasets as well. Currently it reads only the first stream (read_session.streams[0]), so when the read session is split across multiple streams, the remaining data is silently dropped.
The decoder code needs to be changed to either cap the read session at a single stream (e.g. via max_stream_count) or read from all streams, as in the snippet below:
frames = []
for stream in read_session.streams:
    reader = client.read_rows(stream.name)
    for message in reader.rows().pages:
        frames.append(message.to_dataframe())
if len(frames) > 0:
    df = pd.concat(frames)
else:
    schema = pyarrow.ipc.read_schema(
        pyarrow.py_buffer(read_session.arrow_schema.serialized_schema)
    )
    df = schema.empty_table().to_pandas()
return df
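To illustrate why iterating over every stream matters, here is a minimal, self-contained sketch of the loop-and-concatenate logic. It uses hypothetical in-memory stand-ins (FakePage, FakeReader, read_all_streams) instead of a live BigQuery Storage client, and it simplifies the empty case to an empty DataFrame rather than rebuilding one from the Arrow schema:

```python
import pandas as pd

class FakePage:
    """Stand-in for a row-page message; to_dataframe mirrors the client API."""
    def __init__(self, records):
        self._records = records

    def to_dataframe(self):
        return pd.DataFrame(self._records)

class FakeReader:
    """Stand-in for the object returned by client.read_rows(stream.name)."""
    def __init__(self, pages):
        self._pages = pages

    def rows(self):
        return self

    @property
    def pages(self):
        return iter(self._pages)

def read_all_streams(stream_names, read_rows):
    """Concatenate every page from every stream, mirroring the fix above."""
    frames = []
    for name in stream_names:
        reader = read_rows(name)
        for message in reader.rows().pages:
            frames.append(message.to_dataframe())
    # Simplified empty-session handling for this sketch.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

# Two streams: the old single-stream code would only ever see stream "a".
readers = {
    "a": FakeReader([FakePage([{"x": 1}, {"x": 2}])]),
    "b": FakeReader([FakePage([{"x": 3}])]),
}
df = read_all_streams(["a", "b"], lambda name: readers[name])
print(len(df))  # all three rows survive; reading only streams[0] would yield two
```

The same shape applies to the real client: only the sources of the stream list and the reader change.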
Additional context to reproduce
No response
Screenshots
No response
Are you sure this issue hasn't been raised already?
• Yes
Have you read the Code of Conduct?
• Yes