Thread
#ask-the-community
    Matheus Moreno

    1 week ago
    Hey, everyone. We're dealing with a weird bug here and we have no idea how to fix it. Basically, we have a task that's not finishing. We run a workflow with 8 tasks, and at the very last one it hangs. All of its code is being executed (a lot of prints from the very start all the way to the
    return
    statement confirm that), but it hangs at the end and never actually finishes. We were able to reproduce it in our remote server and locally. On the remote server, none of the prints (or logs) are being shown on Stackdriver. What could be happening?
    We're using
    flytekit~=1.0.0
    , so I believe we're using the latest version if compatible release is to be trusted
    prints logged when using
    pyflyte run
    . This
    7 None
    happens right before the return statement (
    None
    is the value of that statement)
    Ketan (kumare3)

    1 week ago
    so IIUC, you are returning from your code and you are saying that the return completes, but hangs after this?
    cc @Eduardo Apolinario (eapolinario) / @Yee Could this be the upload of the literal?
    Matheus Moreno

    1 week ago
    Yeah, I'm not sure the return completes since I can't execute anything after it 😅 but the code right after it does.
    Ketan (kumare3)

    1 week ago
    and you said you can reproduce this locally?
    Matheus Moreno

    1 week ago
    Weirdly enough, this task is the only one that returns
    None
    , and that does not send its return value to another variable in the workflow spec. The workflow also returns
    None
    .
    Eduardo Apolinario (eapolinario)

    1 week ago
    is there anything unusual about the task? Can you share its overall structure?
    Matheus Moreno

    1 week ago
    yeah, this screenshot I sent is from
    pyflyte run
    Yee

    1 week ago
    sounds like a bug? can you change it to
    return 5
    and make the signature an int?
    what’s the return type now?
    Matheus Moreno

    1 week ago
    This is the overall structure of the task:
    @extended_task(integrations=['gcloud'], requests=Resources(mem='4Gi'))
    def update_bq_table(
        amnt_dataframe: pd.DataFrame,
        gcs_config_path: str
    ) -> None:
        config_dict = read_file(gcs_config_path)
        update_gbq_table(    # Function that calls pandas_gbq.to_gbq()
            amnt_dataframe,
            config_dict['table_schema'],
            config_dict['table_destination']
        )
    @extended_task
    is a special decorator we use that does some pre and post-processing on tasks. Other tasks with it are running fine; the
    7 None
    on my screenshot above is being called on the wrapper, after a
    output = task_func(*args, **kwargs)
    and before a
    return output
    .
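For readers following along, a wrapper of the kind described above might look roughly like the sketch below. It is a hypothetical stand-in: the real `extended_task` is not shown in the thread, so the name `extended_task_sketch`, the placement of the pre-processing step, and the debug print are assumptions, and the flytekit task registration is left out entirely.

```python
import functools


def extended_task_sketch(task_func):
    """Hypothetical wrapper: run a function with pre- and post-processing steps."""

    @functools.wraps(task_func)
    def wrapper(*args, **kwargs):
        # Pre-processing would go here (assumed; not shown in the thread).
        output = task_func(*args, **kwargs)
        # Debug print placed between the call and the return, like the "7 None" line.
        print(7, output)
        return output

    return wrapper


@extended_task_sketch
def returns_nothing() -> None:
    """Stand-in for a task body that returns None."""
    return None


result = returns_nothing()  # prints: 7 None
```

A print at this point only confirms that the wrapped function returned; it says nothing about work done after the wrapper hands the value back.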
    Yee

    1 week ago
    can you try changing it to int?
    just to see if that fixes it?
    Matheus Moreno

    1 week ago
    The workflow looks like this:
    @workflow
    def main_workflow(
            hotel_amnt_sql_path: str,
            config_path: str,
            config_pre_process_path: str,
            model_config_path: str
    ) -> None:
        preview_amnt = ...
    
        # Some other tasks, all with <output> = <function call>
    
        update_bq_table(
            amnt_dataframe = hotel_topics,
            gcs_config_path = config_path
        )
    yeah I'll try it
    Sérgio de Melo Barreto Junior

    1 week ago
    Hey! I work with @Matheus Moreno. We changed it to int, but it didn't work 😕
    Yee

    1 week ago
    @Matheus Moreno @Sérgio de Melo Barreto Junior hop on call?
    Matheus Moreno

    1 week ago
    yeah, send us the link
    Yee

    1 week ago
    can you try something for me?
    in the body of the task that’s hanging
    can you delete everything that’s in that function and just replace it with
    print(amnt_dataframe.describe().to_html())
    and keep all the
    print(7)
    s
    Sérgio de Melo Barreto Junior

    1 week ago
    yeah I'll try it
    it is hanging in this
    print(amnt_dataframe.describe().to_html())
    Yee

    1 week ago
    ah nice
    can you remove the
    .to_html()
    ?
    and see if it still hangs?
    Sérgio de Melo Barreto Junior

    1 week ago
    still hanging
    Eduardo Apolinario (eapolinario)

    1 week ago
    how big is this dataframe? Does it hang if you try with only a few rows ?
    Yee

    1 week ago
    so describe is failing. yeah how big is this
    Sérgio de Melo Barreto Junior

    1 week ago
    237857 rows x 2 columns
    Yee

    1 week ago
    that’s not that big

    Sérgio de Melo Barreto Junior

    1 week ago
    5.7 MB
    Yee

    1 week ago
    are you able to run describe outside of flyte
    just like in jupyter or in ipython or something, create the dataframe manually and try describe on it.
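A minimal sketch of such a check, outside Flyte, could time `describe()` on progressively larger slices to see how the cost grows. The helper name `time_describe`, the slice sizes, and the placeholder data are assumptions made for illustration; the real frame would be loaded from the actual data instead.

```python
import time

import pandas as pd


def time_describe(df: pd.DataFrame, sizes=(10, 100, 1000)) -> dict:
    """Time DataFrame.describe() on the first n rows for each n in sizes."""
    timings = {}
    for n in sizes:
        sample = df.head(n)
        start = time.perf_counter()
        sample.describe()
        timings[n] = time.perf_counter() - start
    return timings


# Placeholder data with plain string and integer columns; swap in the real frame
# (for example one loaded from a parquet file) to check the problematic column.
demo = pd.DataFrame(
    {
        "generic_sku": [f"HT-{i:04d}" for i in range(1000)],
        "n_topics": [i % 9 for i in range(1000)],
    }
)
timings = time_describe(demo)
for n, seconds in timings.items():
    print(f"{n:>5} rows: {seconds:.4f} s")
```

If the time per slice climbs steeply with the row count, the problem sits in pandas and the data rather than in Flyte itself.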
    Sérgio de Melo Barreto Junior

    1 week ago
    one of our columns is "topics", and it holds a list in each row of the dataframe. Could that be the problem?
    Yee

    1 week ago
    the whole dataframe is 5.7 MBs?
    i don’t think any structure that’s that small should be a problem for pandas
    in any case
    can you maybe continue to investigate on the side? and in the meantime, add this to the top of your file
    from flytekit.deck.renderer import TopFrameRenderer
    from typing_extensions import Annotated
    and then make the task like
    @task
    def mytask() -> Annotated[pd.DataFrame, TopFrameRenderer(10)]: ...
    that should make it so that the renderer used just grabs the first 10 rows
    will make it skip the describe call
    but this is something we should continue to investigate. do you think you can send us a parquet file with the smallest set of data that can repro this?
    Sérgio de Melo Barreto Junior

    5 days ago
    @Yee Do you have any update? I am still fighting against this hanging problem 😕
    Yee

    5 days ago
    let me play around with this tonight.
    but did the workaround not work?
    Eduardo Apolinario (eapolinario)

    4 days ago
    just to confirm, I can see the python process get stuck when running this:
    ❯ ipython
    Python 3.8.13 (default, Mar 28 2022, 11:38:47)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 8.5.0 -- An enhanced Interactive Python. Type '?' for help.
    
    In [1]: import pandas as pd
    
    In [2]: pd.read_parquet("/home/eduardo/Downloads/amnt_dataframe.parquet.gzip")
    Out[2]:
                          generic_sku                                           topics
    0       HT-0008-0-0-0-0-0-0-0-0-0                  [ST5, ST6, ST2, ST13, ST4, ST3]
    1       HT-000M-0-0-0-0-0-0-0-0-0             [ST5, ST6, ST2, ST7, ST13, ST4, ST3]
    2       HT-000W-0-0-0-0-0-0-0-0-0            [ST5, ST6, ST12, ST10, ST7, ST4, ST1]
    3       HT-000X-0-0-0-0-0-0-0-0-0                       [ST5, ST6, ST13, ST4, ST1]
    4       HT-000Z-0-0-0-0-0-0-0-0-0                 [ST5, ST10, ST13, ST4, ST1, ST3]
    ...                           ...                                              ...
    237852  HT-ZZY9-0-0-0-0-0-0-0-0-0                                       [ST1, ST4]
    237853  HT-ZZYC-0-0-0-0-0-0-0-0-0  [ST5, ST6, ST2, ST12, ST7, ST13, ST4, ST1, ST3]
    237854  HT-ZZYZ-0-0-0-0-0-0-0-0-0                      [ST5, ST10, ST13, ST4, ST1]
    237855  HT-ZZZ2-0-0-0-0-0-0-0-0-0                  [ST5, ST12, ST7, ST4, ST1, ST3]
    237856  HT-ZZZJ-0-0-0-0-0-0-0-0-0                             [ST4, ST5, ST6, ST2]
    
    [237857 rows x 2 columns]
    
    In [3]: df = pd.read_parquet("/home/eduardo/Downloads/amnt_dataframe.parquet.gzip")
    
    In [4]: df.describe()
    Matheus Moreno

    4 days ago
    Yeah, me too. It takes a few seconds for my PC to describe 1k lines, that's why it's taking hours to describe the entire dataset.
    Does the solution that @Yee proposed, annotating the DataFrame, limit how many rows will be used by
    .describe()
    ?
    Ketan (kumare3)

    4 days ago
    Ya, describing the whole dataframe as HTML does not seem like a good idea
    Eduardo Apolinario (eapolinario)

    4 days ago
    @Matheus Moreno, no, what Yee proposed (using the
    TopFrameRenderer
    ) does not run
    describe
    , instead it turns a fixed number of rows directly into html: https://github.com/flyteorg/flytekit/blob/3cf063955907957de65b035066fe415503a9bd65/flytekit/deck/renderer.py#L17-L27
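As a rough illustration of the difference, the sketch below renders a truncated HTML table with plain pandas instead of calling `describe`. It assumes the linked renderer behaves like a thin wrapper over `DataFrame.to_html` with row and column limits; the function name `render_top_rows` and its default limits are made up for this example, so the linked source remains the reference for the actual behavior.

```python
import pandas as pd


def render_top_rows(df: pd.DataFrame, max_rows: int = 10, max_cols: int = 100) -> str:
    """Render a truncated HTML table without computing summary statistics."""
    # to_html truncates the rendered table to max_rows / max_cols; no describe() call.
    return df.to_html(max_rows=max_rows, max_cols=max_cols)


# Placeholder frame shaped like the one in the thread: an ID column plus a
# column holding a list in every row.
demo = pd.DataFrame(
    {
        "generic_sku": [f"HT-{i:04d}" for i in range(50)],
        "topics": [["ST1", "ST4"] for _ in range(50)],
    }
)
html = render_top_rows(demo, max_rows=10)
print(len(html))
```

Because only a bounded number of rows is formatted, the cost of this rendering should not depend on the size of the full frame the way a summary over every row does.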