# flytekit-java
b
I'm working on a hybrid approach: define tasks, workflows & launch plans via the `org.flyte.flytekitscala` SDK and register them via the `jflyte` tooling. That works well enough, but we know there are significant gaps when it comes to Flyte Admin tooling on the JVM. To mitigate that, I took the protos from the Flyte IDL, generated an API, and now have the full scope of the Flyte service available, specifically the admin service. I am now able to launch workflows via code, get execution status, etc. The next piece of the puzzle is to come up with a strategy to bind task/workflow inputs/outputs, so that I can dynamically:

1. Launch workflow executions with dynamic inputs, and
2. Retrieve type-safe workflow execution output values.

I've put together a small video to show a simulation of what I'm trying to do. While the example shows primitives, our use case will likely rely on JSON payloads throughout (for now), so that may significantly simplify the strategy here. It'd be great to get some thoughts, maybe from the Spotify folks who built the SDK, or anyone else trying to adopt the platform from the JVM.
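For context, a minimal sketch of the "launch via code" piece. Every package, class, host, and identifier below is an assumption about what ScalaPB/gRPC stubs generated from the Flyte IDL protos might look like, not the actual generated API:

```scala
import io.grpc.ManagedChannelBuilder

// Assumed ScalaPB-generated packages; adjust to your own generated API.
import flyteidl.service.admin.AdminServiceGrpc
import flyteidl.admin.execution.{ExecutionCreateRequest, ExecutionSpec}
import flyteidl.core.identifier.{Identifier, ResourceType}
import flyteidl.core.literals.LiteralMap

object LaunchSketch extends App {
  private val channel = ManagedChannelBuilder
    .forAddress("flyteadmin.example.internal", 81) // hypothetical endpoint
    .usePlaintext()
    .build()

  private val admin = AdminServiceGrpc.blockingStub(channel)

  // CreateExecution with a (still empty) LiteralMap of inputs -- the input
  // binding strategy discussed below is what would populate `inputs`.
  private val request = ExecutionCreateRequest(
    project = "my-project", // hypothetical project/domain/launch plan
    domain = "development",
    name = "exec-from-jvm-001",
    spec = Some(
      ExecutionSpec(
        launchPlan = Some(
          Identifier(
            resourceType = ResourceType.LAUNCH_PLAN,
            project = "my-project",
            domain = "development",
            name = "MyWorkflow",
            version = "v1"
          )
        )
      )
    ),
    inputs = Some(LiteralMap())
  )

  private val response = admin.createExecution(request)
  println(response.id)
}
```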
a
This is amazing, thank you Andy
b
What data type are people using to support a JSON payload?
I'm assuming it's a `Struct` from the Google protos.
Looks like these are my choices:
```java
public enum Kind {
  PRIMITIVE,
  GENERIC,
  BLOB,
  BINARY
}
```
The GENERIC kind maps to the `Struct` type, so I'll investigate some examples that leverage that type and see where it goes...
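For reference, a minimal sketch of what the GENERIC payload boils down to, using the plain `com.google.protobuf` builders from Scala (the `Person` shape is just the running example):

```scala
import com.google.protobuf.{Struct, Value}

// A JSON object like {"name": "Andy"} expressed as a google.protobuf.Struct,
// which is the payload type behind the GENERIC scalar kind.
val personStruct: Struct = Struct.newBuilder()
  .putFields("name", Value.newBuilder().setStringValue("Andy").build())
  .build()
```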
OK, I think I've roughly got the code to generate a literal map from a case class...
```scala
case class Person(name: String)
case class Input(person: SdkBindingData[Person])

val input = Input(
  person = SdkBindingDataFactory.of(
    SdkLiteralTypes.generics(),
    Person("Andy")
  )
)

val literalMap = SdkScalaType[Input].toLiteralMap(input)
```
That creates the Java version... I just need to get it working for the Scala version of the literal map.
So ⬆️ is the hard part. I have generated my own service binding, and in order to leverage the data binding of the SDK, I have to come up with a transformer from the SDK's `org.flyte.api.v1.Literal` (which appears to be hand-written) to the generated `flyteidl.core.literals.Literal`, which comes from the protos. The alternative would be to update the current SDK with all the latest protos and generate AdminService stubs; then there would be no extra transformation, because the generated service stubs would expect the right data type. There appear to be no collaborators on the Java SDK, no GitHub issues, etc. Who would we talk to about this?
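For what it's worth, a rough sketch of the transformer direction for one narrow case (a scalar string primitive). The `org.flyte.api.v1` accessor and enum names, and the ScalaPB-style `flyteidl.core.literals` constructors, are assumptions about the two class hierarchies rather than verified signatures:

```scala
import org.flyte.api.v1.{Literal => SdkLiteral, Scalar => SdkScalar}
import flyteidl.core.literals.{Literal => ProtoLiteral, Scalar => ProtoScalar, Primitive => ProtoPrimitive}

// Convert the hand-written SDK literal into the proto-generated literal so it
// can be handed to generated AdminService stubs. Only the string-primitive
// branch is sketched; collections, maps, GENERIC, BLOB and BINARY scalars
// would each need their own branch.
def toProto(sdk: SdkLiteral): ProtoLiteral = sdk.kind() match {
  case SdkLiteral.Kind.SCALAR if sdk.scalar().kind() == SdkScalar.Kind.PRIMITIVE =>
    val str = sdk.scalar().primitive().stringValue() // assumed accessor name
    ProtoLiteral(value =
      ProtoLiteral.Value.Scalar(
        ProtoScalar(value =
          ProtoScalar.Value.Primitive(
            ProtoPrimitive(value = ProtoPrimitive.Value.StringValue(str))))))
  case other =>
    throw new UnsupportedOperationException(s"not handled in this sketch: $other")
}
```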
e
> That creates the Java version... I just need to get it working for the Scala version of the literal map.

What do you mean by this?
> What data type are people using to support a JSON payload?

The generic sounds good to me; it is the same one that flytekit-java uses for case classes or AutoValues.
b
Unfortunately I learned the hard way that it doesn't support different collection types or integers. I had a `Vector` of `Int` and my workflow blew up at runtime. Generic (`Struct`) is not viable to support JSON.
So it’s either a blob or a binary.
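If it helps, the integer issue looks consistent with `google.protobuf.Value` itself: its only numeric field is `number_value`, which is a double. A sketch of what a `Vector[Int]` becomes inside a Struct, using the plain protobuf Java builders from Scala:

```scala
import com.google.protobuf.{ListValue, Value}

// Vector(1, 2, 3) as a Struct-compatible ListValue: each element can only be
// carried as number_value (a double), so the Int-ness is lost on the way in.
val asListValue: ListValue = Vector(1, 2, 3)
  .foldLeft(ListValue.newBuilder()) { (builder, i) =>
    builder.addValues(Value.newBuilder().setNumberValue(i.toDouble).build())
  }
  .build()
```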
e
Oh... do you mean serializing a JSON payload? In that case, yeah, maybe blob or binary. The generic one is just if you translate the JSON into a dict. I guess `Vector[Int]` could be a collection of longs in the generic case, right?
b
Correct.
I’m hoping someone can explain the use case for using blob versus binary
e
TBH the results are pretty similar, because the Flyte inputs/outputs are written to proto files. But I guess the main difference is that if you use binary, you are writing the data into the proto itself, whereas if you use blob, you are writing a reference into the proto that points to the real data file.
Probably this is something the Flyte folks could also weigh in on, because blob vs binary comes from the Flyte IDL, not just flytekit-java.
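To make the distinction concrete, a rough sketch of the two scalar shapes from flyteidl's `literals.proto` (the ScalaPB-style class and package names are assumptions; the bucket path is made up):

```scala
import com.google.protobuf.ByteString
import flyteidl.core.literals.{Binary, Blob, BlobMetadata}
import flyteidl.core.types.BlobType

val json = """{"name":"Andy"}"""

// BINARY: the bytes are embedded directly in the proto.
val asBinary = Binary(value = ByteString.copyFromUtf8(json), tag = "json")

// BLOB: the proto carries only a URI plus metadata; the actual file lives in
// blob storage and Flyte dereferences it when the data is needed.
val asBlob = Blob(
  metadata = Some(BlobMetadata(`type` = Some(BlobType(format = "json")))),
  uri = "s3://some-bucket/outputs/person.json" // hypothetical location
)
```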
a
> if you use binary, you are writing the data into the proto itself, whereas if you use blob, you are writing a reference into the proto that points to the real data file
I think you're right, Andres, but I'm hoping @high-accountant-32689 knows better.