Happy new year We do some research on ssh connection through Flyte #slurm-flyte-wg

creamy-shampoo-53278

01/04/2025, 2:41 PM

Happy new year! We did some research on SSH connections through `asyncssh` and have a question to discuss with you. As far as we know, connecting with client keys is preferred over passwords for security, convenience, and scalability. Before using a key pair to connect, we must first complete the following steps:

1. Generate an SSH key pair (taking RSA as an example)

ssh-keygen -t rsa -b 4096

2. Place the public key on the remote server (slurm cluster in this case)

ssh-copy-id <username>@<hostname>
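For reference, a minimal sketch of what key-based login could look like from `asyncssh` once the two steps above are done. The helper names `build_connect_kwargs` and `run_remote` are made up for illustration and are not part of Flyte or the Slurm plugin; `asyncssh.connect` and `conn.run` are the actual asyncssh calls. asyncssh is imported lazily so the first helper works without it installed.

```python
from typing import Any, Dict, List, Optional


def build_connect_kwargs(
    host: str,
    username: str,
    port: int = 22,
    client_keys: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for asyncssh.connect (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"host": host, "port": port, "username": username}
    if client_keys:
        # Only pass client_keys when given, so asyncssh can otherwise
        # apply its own default key lookup.
        kwargs["client_keys"] = list(client_keys)
    return kwargs


async def run_remote(command: str, **connect_kwargs: Any) -> str:
    """Run one command over SSH and return its stdout."""
    import asyncssh  # third-party: pip install asyncssh

    async with asyncssh.connect(**connect_kwargs) as conn:
        # check=True raises if the remote command exits non-zero.
        result = await conn.run(command, check=True)
        return str(result.stdout)


# Usage against a real cluster (placeholders must be filled in):
#   import asyncio
#   kwargs = build_connect_kwargs("<hostname>", "<username>",
#                                 client_keys=["~/.ssh/id_rsa"])
#   print(asyncio.run(run_remote("sinfo", **kwargs)))
```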

creamy-shampoo-53278

01/04/2025, 2:43 PM

We now need to make a trade-off between the cost of the initial setup on the local system (i.e., the client side) and the flexibility of the SSH config setup in the `task_config` of `SlurmTask`. To be concrete, we consider the following two cases:

1. Ask users to set up the complete connection information in `~/.ssh/config` on the client side and retain only the `ssh_host` option in `task_config`. An example setup:

# ~/.ssh/config
Host slurm
  HostName <remote_server_ip>
  Port <ssh_port>
  User <username>
  IdentityFile <private_key_path>  # Commonly ~/.ssh/id_rsa

Then, we only need to use the following information for connection:

@task(
    task_config=Slurm(
        ssh_host="<hostname>",
        srun_conf={
            "partition": "debug",
            "job-name": "tiny-slurm",
            "chdir": "/home/workdir"
        }
    ),
)
def plus_one(x: int) -> int: 
    return x + 1
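To make the first case concrete, here is a rough sketch of how a `Host` alias such as `slurm` resolves to connection details. It is a deliberately simplified parser written for illustration (no wildcards, `Match`, or `Include`, and it strips trailing `#` comments), not how asyncssh or OpenSSH actually read the file. The function name and the sample address, port, and user are assumptions.

```python
from typing import Dict, List


def parse_ssh_config(text: str) -> Dict[str, Dict[str, str]]:
    """Map each Host alias to its options (simplified, illustrative only)."""
    hosts: Dict[str, Dict[str, str]] = {}
    current: List[str] = []  # aliases named by the most recent Host line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank line or keyword without a value
        key, value = parts[0].lower(), parts[1].strip()
        if key == "host":
            current = value.split()
            for alias in current:
                hosts.setdefault(alias, {})
        else:
            for alias in current:
                # Keep the first occurrence of each option.
                hosts[alias].setdefault(key, value)
    return hosts


# Sample values below are placeholders, not a real cluster.
EXAMPLE = """\
Host slurm
  HostName 10.0.0.5
  Port 2222
  User alice
  IdentityFile ~/.ssh/id_rsa  # Commonly ~/.ssh/id_rsa
"""

resolved = parse_ssh_config(EXAMPLE)["slurm"]
print(resolved)
```

With this in place, `ssh_host` in `task_config` only has to match the alias on the `Host` line.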

2. Let users pass all required connection info in `task_config`, without needing to set up `~/.ssh/config`, e.g.,

@task(
    task_config=Slurm(
        ssh_conf={
            "host": <hostname>,
            "port": int(<port>),
            "username": <username>,
            "client_keys": ["~/.ssh/id_rsa"]
        },
        srun_conf={
            "partition": "debug",
            "job-name": "tiny-slurm",
            "chdir": "/home/workdir"
        }
    ),
)
def plus_one(x: int) -> int: 
    return x + 1

creamy-shampoo-53278

01/04/2025, 2:46 PM

The first approach incurs higher initial setup overhead, but only once. The second lets users set up the SSH connection with more flexibility.

creamy-shampoo-53278

01/04/2025, 2:53 PM

Hence, we would like to ask which method would lead to better UX. Or is there a better method altogether? P.S. The SSH-related fields in `task_config` will be implemented with `Secret`.

glamorous-carpet-83516

01/06/2025, 8:23 PM

I think it should look like this

@task(
    task_config=Slurm(
        ssh_conf={
            "host": <hostname>,
            "port": int(<port>),
            "username": <username>,
        },
        srun_conf={
            "partition": "debug",
            "job-name": "tiny-slurm",
            "chdir": "/home/workdir"
        }
    ),
)

Users should not need to specify `client_keys`; we can mount the key file to the agent deployment.
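One way to read this suggestion: the agent, not the user, supplies `client_keys`. A rough sketch, assuming the private key is mounted into the agent deployment and its path is exposed through an environment variable. The variable name `SLURM_SSH_KEY_PATH`, the default mount path, and the function name are all hypothetical, not something the thread or the plugin defines.

```python
import os
from typing import Any, Dict, Mapping

# Hypothetical location where the key would be mounted in the agent pod.
DEFAULT_KEY_PATH = "/etc/slurm-agent/ssh/id_rsa"


def with_agent_key(
    ssh_conf: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Return a copy of the user's ssh_conf with the agent's key added."""
    if "client_keys" in ssh_conf:
        # Users only choose host, port, and username.
        raise ValueError("client_keys is set by the agent, not in task_config")
    merged = dict(ssh_conf)
    merged["client_keys"] = [env.get("SLURM_SSH_KEY_PATH", DEFAULT_KEY_PATH)]
    return merged
```

Keeping the key path on the agent side means private key locations never appear in user task code.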

creamy-shampoo-53278

01/07/2025, 4:40 PM

Hey Kevin, we now retain only `slurm_host` for users to decide which Slurm cluster to run on, as discussed this morning! A simple example follows:

echo = SlurmTask(
    name="echo",
    script="""#!/bin/bash
# We can define sbatch options here
echo "Demo Slurm agent with ShellTask...\n"

# Run a demo python script on Slurm
echo ">>> Run a Demo Python Script <<<"
python3 demo.py
""",
    task_config=Slurm(
        slurm_host="slurm",  # Only need to designate Slurm hostname
        sbatch_conf={
            "partition": "debug",
            "job-name": "tiny-slurm",
        }
    )
)
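As an illustration of what `sbatch_conf` might turn into on the cluster side, here is a small sketch that renders the dict as long-form `sbatch` options. The function names and the script path are hypothetical; the thread does not say how the agent actually builds the command.

```python
import shlex
from typing import List, Mapping


def sbatch_args(sbatch_conf: Mapping[str, str]) -> List[str]:
    """Render each key/value pair as a long-form option, e.g. --partition=debug."""
    return [f"--{key}={value}" for key, value in sbatch_conf.items()]


def sbatch_command(sbatch_conf: Mapping[str, str], script_path: str) -> str:
    """Build a shell-safe sbatch command line for a script already on the cluster."""
    return shlex.join(["sbatch", *sbatch_args(sbatch_conf), script_path])


# /tmp/echo.sh is a placeholder for wherever the script would be uploaded.
command = sbatch_command(
    {"partition": "debug", "job-name": "tiny-slurm"},
    "/tmp/echo.sh",
)
print(command)
```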
