# ask-the-community
c
I have a workflow that consists of a few python tasks and an R task. I have been able to run simple R tasks with int I/O as described in the docs. My workflow requires passing either a dataframe or csv from the upstream (python) task into the R task, then writing out a csv (or dataframe) from R for the downstream python task.
py task  --> R task --> py task
What is the recommendation for passing more complex types into R tasks? Will raw containers work for my use case, or is there another way to execute more effectively?
k
Hi @Cody Scandore, welcome to the community. I am assuming you are using raw container tasks to run R scripts. Raw containers do not support dataframes today. For now, you could use rpy2 or reticulate to bridge Python and R
Or pass the dataframe as a Parquet or CSV file
c
Hi Ketan, thank you for the welcome. Correct, I'm using raw containers. Are there any examples of passing parquet/csv/FileTypes in and out of R tasks? The example in the docs uses only integer I/O. Or is the recommended implementation to pass file paths?
k
hmm ya, that is what you will have to do
we do not have an example of using parquet in R
but i agree we should write one 😄
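For anyone landing here later, a minimal sketch of the file-path handoff discussed above, in plain Python. This is not Flyte's API and not an official example: the helper names (`write_csv`, `r_command`, `read_csv`) and the script name `transform.R` are hypothetical, and it assumes the R container simply receives an input path and an output path as command-line arguments.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(rows, path):
    """Upstream Python step: write a list of dicts to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def r_command(script, in_path, out_path):
    """Command line the R container would run (script name is hypothetical).

    Assumption: the R script picks up both paths with
    commandArgs(trailingOnly = TRUE), read.csv()s the first one and
    write.csv(..., row.names = FALSE)s its result to the second.
    """
    return ["Rscript", "--vanilla", str(script), str(in_path), str(out_path)]


def read_csv(path):
    """Downstream Python step: load a CSV (e.g. the one R wrote) as dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    in_path = write_csv(
        [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.9}],
        workdir / "in.csv",
    )
    # The command is only printed here; running it needs R and the script.
    print(r_command("transform.R", in_path, workdir / "out.csv"))
    # Reading back the input file just to show the round trip on the Python side.
    print(read_csv(in_path))
```

Note that CSV values come back as strings on the Python side, so the downstream task has to cast them. Parquet would keep column types, but it needs an Arrow reader on both sides (for example `pyarrow` in Python and the `arrow` package in R).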