# ask-the-community
n
Hi, I am running a workflow locally on my machine using pyflyte run or python -m, but I am getting this error -
2023-06-20 17:04:51,354876 ERROR {"asctime": "2023-06-20 17:04:51,354", "name": "flytekit", "levelname": "ERROR", "message": "Failed to convert outputs of task 'read_dataset' at position 0:\n  [Errno 28] Error writing bytes to file. Detail: [errno 28] No space left on device"}    base_task.py:587
Is there a virtual directory or disk flyte creates when running on a local machine which I can purge? The data for training is being pulled from s3.
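For reference, flytekit stages intermediate data for local runs in a local scratch directory rather than a named "virtual disk". Below is a minimal sketch for printing where that is; the local_sandbox_dir and raw_output_prefix attribute names are assumptions based on flytekit's FileAccessProvider and may differ between versions.

```python
# Sketch: locate the scratch locations flytekit uses for a purely local run.
# local_sandbox_dir / raw_output_prefix are assumed attribute names and may
# differ between flytekit versions.
from flytekit import FlyteContextManager

ctx = FlyteContextManager.current_context()
print("local sandbox dir:", ctx.file_access.local_sandbox_dir)  # local scratch space
print("raw output prefix:", ctx.file_access.raw_output_prefix)  # where offloaded outputs land
```

If that location sits on a small or nearly full volume, clearing it out between runs is one place to reclaim space.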
c
I've seen this error when I tried the flytectl demo start command. I purged all the unused docker images, containers, volumes, etc., and then the error disappeared.
n
I am not using any docker image at the moment, I mean I am not passing in an image with --image. I am just running with something like pyflyte run sample.py sample_wf.
k
hmm, are you using an image spec in the workflow?
n
No, no image spec in my workflow.
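For readers unfamiliar with the term: "image spec" refers to flytekit's ImageSpec, which describes a container image to build for a task. A minimal sketch of what that would look like, purely for reference since the thread confirms it is not used here; the package list and registry are illustrative placeholders.

```python
# Reference-only sketch of an ImageSpec-backed task (not used in this thread).
# Packages and registry are illustrative placeholders.
from flytekit import task, ImageSpec

custom_image = ImageSpec(
    packages=["pandas", "pyarrow"],  # illustrative dependencies
    registry="localhost:30000",      # e.g. a local demo-cluster registry
)

@task(container_image=custom_image)
def sample_task() -> int:
    return 1
```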
Looks like this is happening when creating a StructuredDataset -
❱  865 │   lv = transformer.to_literal(ctx, python_val, python_type, expected)
          sd = StructuredDataset(dataframe=python_val, metadata=meta)
   600 │   return self.encode(ctx, sd, python_type, protocol, fmt, sdt)
   627 │   sd_model = handler.encode(ctx, sd, structured_literal_type)
    53 │   df.to_parquet(
    54 │       path,
    55 │       coerce_timestamps="us",
    56 │       allow_truncated_timestamps=False,
  2976 │   return to_parquet(
  2977 │       self,
  2978 │       path,
  2979 │       engine,
   430 │   impl.write(
   431 │       df,
   432 │       path_or_buf,
   433 │       compression=compression,
   204 │   self.api.parquet.write_table(
   205 │       table, path_or_handle, compression=compression, **kwargs
   206 │   )
  2985 │   writer.write_table(table, row_group_size=row_group_size)
  1054 │   self.writer.write_table(table, row_group_size=row_group_size)
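That call chain is flytekit's implicit dataframe handling: when a task returns a dataframe, flytekit wraps it in a StructuredDataset and encodes it to a parquet file on local disk, which is where the [Errno 28] surfaces. A minimal sketch of the kind of task that exercises this path; the task name matches the one in the error log above, but the column is illustrative.

```python
# Sketch: a task whose pandas return value is encoded to parquet via
# StructuredDataset during local execution (the call path in the traceback above).
import pandas as pd
from flytekit import task

@task
def read_dataset() -> pd.DataFrame:
    # On every (uncached) local run, this frame is written out as a parquet file.
    return pd.DataFrame({"feature": range(1_000)})
```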
I think the StructuredDataset was being created multiple times, i.e. the writing to parquet files. So I moved that to a separate task and added cache=True. But now I am getting this error -
*FlyteScopedUserException:* database or disk is full
What local database does flyte use, and is there a way to flush it? I am running on my Mac and I don't think my disk is full 🙂
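A minimal sketch of the refactor described above, assuming the dataframe creation lives in its own cached task; note that cache=True requires a cache_version, and all names here are illustrative.

```python
# Sketch of the refactor: isolate the dataframe creation in a dedicated task
# and cache it, so the parquet encoding runs only once per cache_version.
import pandas as pd
from flytekit import task, workflow

@task(cache=True, cache_version="1.0")
def read_dataset() -> pd.DataFrame:
    return pd.DataFrame({"feature": range(1_000)})

@workflow
def sample_wf() -> pd.DataFrame:
    return read_dataset()
```

Keep in mind that, as far as I know, flytekit persists locally cached results on disk as well (under ~/.flyte by default), so cached outputs also take up space.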
k
are you able to share your workflow code?
we don't run any database when you run a workflow locally