curved-whale-1505
04/22/2025, 8:06 PM
For a `pytorch` job, the CloudWatch log link is broken because it is missing some variables, as you can see in the screenshot.
Currently I'm defining the template like this:
`https://console.aws.amazon.com/cloudwatch/home?region=${props.cluster.env.region}#logEventViewer:group=/aws/containerinsights/${props.cluster.clusterName}/application;stream={{ .nodeName }}-application.var.log.containers.{{ .podName }}_{{ .namespace }}_{{ .containerName }}-{{ .containerId }}.log`
this works fine for other regular python tasks
Is there a way I can get the logs to work properly for pytorch?
curved-whale-1505
04/22/2025, 8:09 PM
`nodeName`, `containerName`, and `containerId` are missing for this job type
thankful-minister-83577
cool-lifeguard-49380
04/25/2025, 7:13 PM
cool-lifeguard-49380
04/25/2025, 7:14 PM
cool-lifeguard-49380
04/25/2025, 7:15 PM
if taskType == PytorchTaskType && hasMaster {
	masterTaskLog, masterErr := logPlugin.GetTaskLogs(
		tasklog.Input{
			PodName:              name + "-master-0",
			Namespace:            namespace,
			LogName:              "master",
			PodRFC3339StartTime:  RFC3999StartTime,
			PodRFC3339FinishTime: RFC3999FinishTime,
			PodUnixStartTime:     startTime,
			PodUnixFinishTime:    finishTime,
			TaskExecutionID:      taskExecID,
			TaskTemplate:         taskTemplate,
		},
	)
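The input in the snippet above sets the pod name and namespace but no node name, container name, or container ID, which seems to match the variables reported missing. Here is a minimal sketch of how that would break the stream name. `render` and `missing` are hypothetical helpers, not part of the Flyte plugin, and the sketch assumes an unset variable renders as an empty string; the real plugin may leave the placeholder untouched instead. The pod name and namespace values are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches template variables of the form {{ .name }}.
var placeholder = regexp.MustCompile(`\{\{\s*\.(\w+)\s*\}\}`)

// render is a hypothetical stand-in for the substitution step: each
// placeholder is replaced by its value, or by "" when none is supplied
// (an assumption, not confirmed plugin behavior).
func render(tmpl string, vars map[string]string) string {
	return placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		return vars[name]
	})
}

// missing lists the placeholders in tmpl that have no entry in vars.
func missing(tmpl string, vars map[string]string) []string {
	var out []string
	for _, m := range placeholder.FindAllStringSubmatch(tmpl, -1) {
		if _, ok := vars[m[1]]; !ok {
			out = append(out, m[1])
		}
	}
	return out
}

func main() {
	// Stream portion of the template from the question.
	stream := "{{ .nodeName }}-application.var.log.containers." +
		"{{ .podName }}_{{ .namespace }}_{{ .containerName }}-{{ .containerId }}.log"

	// Only the values that the quoted input provides (illustrative values).
	vars := map[string]string{"podName": "job-master-0", "namespace": "dev"}

	fmt.Println(render(stream, vars))
	fmt.Println(missing(stream, vars))
}
```

Under these assumptions the rendered stream name has gaps where the node and container fields should be, so it cannot match a real CloudWatch stream.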
cool-lifeguard-49380
04/25/2025, 7:15 PM
curved-whale-1505
04/25/2025, 7:16 PM
cool-lifeguard-49380
04/25/2025, 7:16 PM
cool-lifeguard-49380
04/25/2025, 7:17 PM