# flyte-connectors
b
Hi community, we are using the agent framework to manage our custom agent in a decoupled way, and we'd like to understand whether propeller can talk to multiple agent endpoints or allows only one agent: https://github.com/flyteorg/flyte/blob/28f65b30ed745af683e80f9c95400fda411d9c07/flyteplugins/go/tasks/plugins/webapi/agent/config.go#L42 For example, in the propeller config map, people usually define one `defaultAgent`:
```yaml
agent_service.yaml: |
    plugins:
      agent-service:
        defaultAgent:
          endpoint: customagent:8000
          insecure: true
```
Or does the `agents` field (https://github.com/flyteorg/flyte/blob/28f65b30ed745af683e80f9c95400fda411d9c07/flyteplugins/go/tasks/plugins/webapi/agent/config.go#L68) work?
1. It can be used in production, not just for canary deployment, correct? https://github.com/flyteorg/flyte/blob/master/docs/flyte_agents/developing_agents.md#5-canary-deployment
2. In this example:
```yaml
agentForTaskTypes:
  # It will override the default agent for custom_task, which means propeller will send the request to this agent.
  - custom_task: custom_agent
```
this just means that
• for custom_task, propeller will send the request to custom_agent, not the default agent
• for any other task type, the request will still be sent to the default agent
Am I understanding this correctly?
d
```yaml
agent-service:
  defaultAgent:
    endpoint: "localhost:8000"
    insecure: true
    timeouts:
      CreateTask: 100s
      GetTask: 100s
    defaultTimeout: 100s
  agents:
    custom_agent:
      endpoint: "localhost:8001"
      insecure: true
      timeouts:
        ExecuteTaskSync: 300s
        GetTask: 100s
      defaultTimeout: 300s
  agentForTaskTypes:
    - chatgpt: custom_agent
    - airflow: custom_agent
```
> It can be used in production, not just for canary deployment, correct?
yes, there's a company using it that way
> for custom_task, propeller will send the request to custom_agent, not the default agent
yes
> for any other task type, the request will still be sent to the default agent
yes
b
One last checkpoint: what about the sensor agent? The default task type name is `sensor`: https://github.com/flyteorg/flytekit/blob/master/flytekit/sensor/base_sensor.py#L54 If I have a custom sensor and package it into custom_agent (endpoint on port 8001), so that custom_agent supports the custom_sensor type as well, will the configuration look like the following?
```yaml
agent-service:
  defaultAgent:
    endpoint: "localhost:8000"
    insecure: true
    timeouts:
      CreateTask: 100s
      GetTask: 100s
    defaultTimeout: 100s
    supportedTaskTypes:
      - sensor  # <== added
  agents:
    custom_agent:
      endpoint: "localhost:8001"
      insecure: true
      timeouts:
        ExecuteTaskSync: 300s
        GetTask: 100s
      defaultTimeout: 300s
  agentForTaskTypes:
    - chatgpt: custom_agent
    - airflow: custom_agent
    - custom_sensor: custom_agent  # <== added
```
d
no it will not
yes it will
sorry
right now you don't need `supportedTaskTypes`; propeller will auto-update the latest supported task types if you are using the latest version of propeller
b
I mean, what if another team has configured the defaultAgent with the default sensor name…
d
```yaml
supportedTaskTypes:
    - sensor
    - custom_sensor
```
I think `supportedTaskTypes` will look like this. Is this what you are asking?
b
Wait, but `supportedTaskTypes` is in the `defaultAgent` section. The defaultAgent would not have custom_sensor, so we cannot configure custom_sensor in the defaultAgent.
d
no
```yaml
agent-service:
    supportedTaskTypes:
      - chatgpt
      - sensor
      - spark
      - default_task
      - custom_task
      - sensor
      - airflow
    # By default, all requests will be sent to the default agent.
    defaultAgent:
      endpoint: "localhost:8000"
      insecure: true
      timeouts:
        CreateTask: 100s
        GetTask: 100s
      defaultTimeout: 100s
    agents:
      custom_agent:
        endpoint: "localhost:8001"
        insecure: true
        # defaultServiceConfig: '{"loadBalancingConfig": [{"round_robin":{}}]}'
        timeouts:
          ExecuteTaskSync: 300s
          GetTask: 100s
        defaultTimeout: 300s
```
`supportedTaskTypes` is in the `agent-service` section
b
> `supportedTaskTypes` is in the `agent-service` section
Really?! Has this changed recently, or are we on an old version?
d
yes, but the supported task types are not used by `defaultAgent`; they are used by `agent-service`
b
got it!
d
For example, if you have 2 agent deployments, `agent-service` needs to know all supported task types, so that propeller can check whether the task type coming from flytekit is one of the `supportedTaskTypes`.
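As a rough illustration of why the list lives at the `agent-service` level rather than under one agent, the sketch below builds the combined set of task types from two agent deployments and checks a task type against it. The helper name `all_supported_task_types` and the per-agent task type lists are made up for the example; this is not how propeller stores or discovers them.

```python
# Hypothetical illustration only; the names and data here are not part of Flyte's API.

def all_supported_task_types(per_agent):
    """Union of the task types served by every agent deployment."""
    supported = set()
    for task_types in per_agent.values():
        supported.update(task_types)
    return supported


# Assumed split of task types between the two deployments (for illustration).
per_agent = {
    "defaultAgent": ["sensor", "spark", "default_task"],
    "custom_agent": ["custom_sensor", "custom_task"],
}

supported = all_supported_task_types(per_agent)
print("custom_sensor" in supported)  # served by custom_agent
print("unknown_task" in supported)   # served by nobody
```

The membership check only answers "is this task type handled by some agent"; which agent receives the request is a separate routing decision.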
b
Thanks a lot for the confirmation. This resolves some unknowns and confirms there is no feature gap before I package and deploy my custom_agent (with custom_sensor and custom_task) separately!
d
thank you so much
b
Will try it out and keep you updated if there are any other issues. 🙂
d
> Really?! Has this changed recently, or are we on an old version?
I just misunderstood this. `supportedTaskTypes` is in the `agent-service` section, and this has never changed. The order of the keys in the YAML can differ, but the structure is always the same.
b
@damp-lion-88352 We have observed an overriding problem:
```yaml
agent_service.yaml: |
    plugins:
      agent-service:
        defaultAgent:
          endpoint: flyteagent:8000
          insecure: true
        agents:
          aip-agent:
            endpoint: aipflyteagent.dlc-system.svc.kube.grid.linkedin.com:8000
            insecure: true
        supportedTaskTypes:
        - sensor
        - dataservicetask
```
It looks like requests are always sent to the `aip-agent` endpoint, and we cannot remove `sensor` because it is needed for the defaultAgent. What's the best way to handle this?
d
are you using the latest flytepropeller?
b
I think it's probably not the latest propeller. cc @mysterious-knife-69764
d
use the latest one
we had some small bugs before
b
The upgrade at LinkedIn will be slow and needs more testing.
Currently the bug is blocking us from rolling out the custom agent. cc @glamorous-carpet-83516, as we mentioned it briefly.
d
upgrade to flyte 1.13.0
in my opinion it's ok to upgrade propeller only
but it may cause some errors (if so, just roll back)
we added an agent watcher and fixed a bug in the routing mechanism
b
Say we are on the old version and cannot upgrade. Does `agentForTaskTypes` work?
d
YES
USE THAT
LET ME GIVE YOU AN EXAMPLE NOW
```yaml
agent-service:
    # supportedTaskTypes:
    #   - chatgpt
    #   - sensor
    #   - spark
    #   - default_task
    #   - custom_task
    #   - sensor
    #   - airflow
    # By default, all requests will be sent to the default agent.
    defaultAgent:
      endpoint: "localhost:8000"
      insecure: true
      timeouts:
        CreateTask: 100s
        GetTask: 100s
      defaultTimeout: 100s
    agents:
      custom_agent:
        endpoint: "localhost:8001"
        insecure: true
        # defaultServiceConfig: '{"loadBalancingConfig": [{"round_robin":{}}]}'
        timeouts:
          ExecuteTaskSync: 300s
          GetTask: 100s
        defaultTimeout: 300s
    agentForTaskTypes:
      # Overrides the default agent for the listed task types, which means propeller will send those requests to this agent.
      - chatgpt: custom_agent
      - airflow: custom_agent
```
make sure each key is at the right nesting level (its relative position in the YAML)
b
What if `sensor` is for the defaultAgent? How do I specify that? Like `sensor: defaultAgent`?
d
`- sensor: defaultAgent`
wait
hi
give me 1 min
let me check the code now
100% correct answer for you
YES
`sensor: defaultAgent`
verified by myself
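Before reloading a configmap like this, it can help to check that every agent name referenced in `agentForTaskTypes` is either defined under `agents` or is the literal `defaultAgent`. The sketch below is a hypothetical pre-deployment check written for this thread; the helper name `unknown_agent_references` is invented, and nothing here claims that propeller performs the same validation.

```python
# Hypothetical pre-deployment check; not part of Flyte or propeller.

DEFAULT_AGENT_NAME = "defaultAgent"  # the literal name used in this thread


def unknown_agent_references(cfg):
    """Return (task_type, agent_name) pairs whose agent name is not defined."""
    known = set(cfg.get("agents", {})) | {DEFAULT_AGENT_NAME}
    unknown = []
    for entry in cfg.get("agentForTaskTypes", []):
        for task_type, agent_name in entry.items():
            if agent_name not in known:
                unknown.append((task_type, agent_name))
    return unknown


cfg = {
    "agents": {"custom_agent": {"endpoint": "localhost:8001"}},
    "agentForTaskTypes": [
        {"sensor": "defaultAgent"},
        {"custom_sensor": "custom_agent"},
        {"airflow": "custom_agnet"},  # deliberate typo to show what gets flagged
    ],
}

print(unknown_agent_references(cfg))
```

For the sample config, only the misspelled `custom_agnet` entry is reported; the `sensor: defaultAgent` and `custom_sensor: custom_agent` entries pass.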
b
@mysterious-knife-69764 as you are on call, do you want to try that? I can quickly try it in grid2 and have propeller read the new configmap…
d
and I checked the code myself
please let me know if the new agent works at LinkedIn
b
Thanks A TON for staying up to help resolve the outage! ❤️ Will ping here once the new data sensor works as well 🙂
d
yes, anytime, very excited for the journey ahead
b
btw, @damp-lion-88352 can you please forward the metadata service PR that fixes the default agent being overridden? When the Flyte team at LinkedIn upgrades the Flyte SDK, they can pay special attention to it.
d
I've fixed it already
flyte 1.13
this is the version
tell the person or team in charge of upgrading flytekit and flyte to contact me
I'll help them make the agent upgrade a success
b
cc @mysterious-knife-69764 @careful-analyst-27506 ^^
Is this the fix you made: https://github.com/flyteorg/flytekit/pull/2012 ?
d
and flyte
agent watcher
flytepropeller: 1.13, flytekit: https://github.com/flyteorg/flytekit/pull/2012
please update both
b
The custom sensor hits the error below; please take a look whenever you get the chance:
```json
{
  "asctime": "2024-10-10 08:05:36,444",
  "name": "flytekit",
  "levelname": "ERROR",
  "message": "failed to create dssensor task with error `np.string_` was removed in the NumPy 2.0 release. Use `np.bytes_` instead.."
}
```
I will check more tomorrow.
d
Yes, I know this error
This was fixed by me and Ketan
It's related to my PR about flyteschema
Check this one
m
thanks folks, will try it
b
@damp-lion-88352 does https://github.com/flyteorg/flytekit/pull/2619 have other dependencies? Can it be upgraded standalone?
d
no, it doesn't
do it
don't copy the click_types.py
it's irrelevant