# ask-the-community
n
Hey all! I am trying to run a simple task on a FlyteDirectory of JSON files: load the JSONs and return a list of dicts.
```python
import json
import logging
import os
from typing import Dict, List

from flytekit import Resources, task
from flytekit.types.directory import FlyteDirectory

logger = logging.getLogger(__name__)


@task(requests=Resources(cpu="1", mem="1Gi"))
def read_json_files(json_dir: FlyteDirectory) -> List[Dict]:
    loaded_objs = []
    for json_file in os.listdir(json_dir):
        file_path = os.path.join(json_dir, json_file)
        with open(file_path, 'r') as jf:
            try:
                obj = json.load(jf)
            except json.JSONDecodeError:
                print(f'Invalid json file: {json_file}')
                continue
            loaded_objs.append(obj)
    logger.info('done')
    return loaded_objs
```
I am working on 238 JSON files. The logs show the operation completes in a few seconds and resource usage is negligible, but the task never finishes; it stays stuck on Running. Update: in the propeller logs, I see this error:
```
Failed to check if the output file exists. Error: output file @[gs://flyte-storage/metadata/propeller/proj-development-fcd5d390ce6a443bea7d/n0/data/0/outputs.pb] is too large [41447872] bytes, max allowed [10485760] bytes
```
and
```
failed Execute for node. Error: failed at Node[n0]. RuntimeExecutionError: failed during plugin execution, caused by: output file @[gs://flyte-storage/metadata/propeller/proj-development-fcd5d390ce6a443bea7d/n0/data/0/outputs.pb] is too large [41447872] bytes, max allowed [10485760] bytes
```
How do I change this max value? I also think it should fail the task instead of leaving it stuck on Running.
p
Hey Nizar, I imagine it's possible to change the size cap of the metadata store, but it's there for a reason and changing it would probably be awkward. Perhaps it would be best to refactor this aggregator function to return a FlyteFile, and then consume that downstream instead of a list of dicts.
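A minimal sketch of that refactor, assuming flytekit's FlyteFile/FlyteDirectory APIs; `aggregate_json_files` and the temp-file path are illustrative, not from the thread:
```python
import json
import os
import tempfile
from typing import Dict, List

from flytekit import Resources, task
from flytekit.types.directory import FlyteDirectory
from flytekit.types.file import FlyteFile


@task(requests=Resources(cpu="1", mem="1Gi"))
def aggregate_json_files(json_dir: FlyteDirectory) -> FlyteFile:
    # Materialize the input directory locally before listing it.
    local_dir = json_dir.download()
    loaded_objs: List[Dict] = []
    for json_file in os.listdir(local_dir):
        with open(os.path.join(local_dir, json_file), 'r') as jf:
            try:
                loaded_objs.append(json.load(jf))
            except json.JSONDecodeError:
                print(f'Invalid json file: {json_file}')
    # Write the aggregate to one local file; flytekit offloads it to
    # blob storage and records only a reference in outputs.pb.
    out_path = os.path.join(tempfile.mkdtemp(), 'aggregated.json')
    with open(out_path, 'w') as out:
        json.dump(loaded_objs, out)
    return FlyteFile(path=out_path)
```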
n
You're correct, but I'd suggest it should fail the workflow; currently it stays stuck in the "Running" state.
y
Yeah, what Pryce said. It's possible to change the size, but I think that's too big even for this: protobuf itself has a limit, and I think this is over it. In any case, we generally encourage users to put real data in offloaded data types (like FlyteFile). Typically the permissions and such are different between the bucket that stores primitives (what we call metadata, like ints and strings, or in this case a nested dict of values) and the bucket that stores offloaded types.
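To round out the suggestion, a sketch of the downstream consumer (`consume_objects` is a hypothetical name): only the blob-store reference crosses the task boundary, so the metadata outputs.pb stays well under the cap.
```python
import json
from typing import Dict, List

from flytekit import task
from flytekit.types.file import FlyteFile


@task
def consume_objects(agg_file: FlyteFile) -> int:
    # download() pulls the offloaded file from blob storage to a local path.
    with open(agg_file.download(), 'r') as f:
        objs: List[Dict] = json.load(f)
    return len(objs)
```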
And yeah, it should fail; I think it does fail.
Can you let it run for a few hours? I think this is just an issue of timing/retries.