Example OpenPBS Python Scripts
When interfacing with a full Queue System, Cyclone calls external scripts located on the Queue System itself. This approach allows Cyclone to remain dynamic and adaptable to different queue environments, rather than being hard-coded for a specific system. Cyclone requires the files listed in Table 8.
Table 8: List of Files Necessary to Interface with a Full Queue System
| Name | Description |
|---|---|
| Get Queues | Retrieves the names of all available queues and returns them in JSON format. |
| Get Jobs | Returns a list of active jobs (running or waiting in the queue). |
| Get Jobs Full | Returns a complete list of jobs, including finished, running, and queued jobs. |
| Get Queue Machines | Lists all machines (nodes) available to the queueing system with their details. |
| Get Queue Info | Provides detailed information about each queue, including state, resources, and job counts. |
| Rerun Job | Restarts one or more jobs by their job IDs. |
| Suspend Job | Pauses execution of one or more jobs without deleting them. |
| Resume Job | Resumes execution of previously suspended jobs. |
| Hold Job | Places one or more jobs on hold, preventing them from being scheduled. |
| Release Job | Releases held jobs, making them eligible for scheduling again. |
| Delete Job | Cancels and removes one or more jobs from the queue. |
| Move Job | Moves jobs from their current queue into another specified queue. |
| Hold Machine | Marks one or more machines as unavailable, preventing new jobs from running there. |
| Clear Machine | Returns previously held machines back to service, allowing jobs to run on them. |
The listed files may be named anything and can be written in any language; the only requirement is that they are defined in the Settings, as shown in Section 20.6.
All examples provided are written in Python for an OpenPBS Queue System.
Get Queues
Functional Requirements
- The script must retrieve all available queue names from the job scheduling system in use.
  - If OpenPBS is used, the provided Python script demonstrates how this can be done with qstat.
  - If a different scheduler is used (e.g., Slurm, LSF, SGE), the logic may be implemented in any language or method appropriate for the environment.
- The script must output valid JSON in the following structure:
```json
{
    "data": [
        "queue1",
        "queue2",
        "queue3"
    ]
}
```
- The top-level object must contain a single key: "data".
- The value of "data" must be an array of queue names (strings).
- No additional fields or metadata are allowed.
- The output must be written to standard output (stdout) so it can be captured by the software.
Notes:
- The provided Python script works with OpenPBS and can be used as a reference for implementation.
- If Python is unavailable, or if a different scheduler is used, the JSON output structure must be replicated using a different tool or language of choice.
- The implementation details (e.g., which command is run, which language is used) do not matter, as long as the JSON output matches the required format.
Example
```python
#!/usr/bin/env python3
import subprocess
import json

# Run the qstat command and request JSON output
result = subprocess.run(['qstat', '-Qf', '-F', 'json'],
                        capture_output=True, text=True, check=True)
# Parse JSON
data = json.loads(result.stdout)

# Extract only the queue names
queues = []
for queue_name in data.get("Queue", {}):
    queues.append(queue_name)

# Print the required JSON structure to stdout
output = {"data": queues}
print(json.dumps(output, indent=4))
```
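For comparison, the same JSON contract can be met on a Slurm system by treating partitions as queues. The following is a minimal sketch, not part of the reference set; it assumes the standard sinfo client is available on the PATH.

```python
#!/usr/bin/env python3
# Minimal sketch for Slurm (assumption: sinfo is installed and on the PATH).
import subprocess
import json

# -h suppresses the header; %P prints partition names
result = subprocess.run(['sinfo', '-h', '-o', '%P'],
                        capture_output=True, text=True, check=True)
# Slurm marks the default partition with a trailing '*'; strip it and
# de-duplicate, since a partition can appear on several lines
queues = sorted({line.rstrip('*') for line in result.stdout.splitlines() if line})
print(json.dumps({"data": queues}, indent=4))
```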
Get Jobs
To interface correctly with the software, a script or program must be provided that lists all jobs from the scheduling system and returns them in the required JSON format.
Functional Requirements
- The script must retrieve job information from the job scheduling system in use.
  - If OpenPBS is used, the provided Python script demonstrates how this can be done with qstat -f -F json.
  - If a different scheduler is used (e.g., Slurm, LSF, SGE), the logic may be implemented in any language or method appropriate for the environment.
- The script must normalize job fields to a standardized set of names using the reference column mapping. Examples:
  - "Job_Name" → "Job Name"
  - "Job_Owner" → "Job Owner" (without the @hostname suffix)
  - "job_state" → "Job State"
  - "ncpus" → "Number CPUs"
- Each job entry must include an "id" field.
  - The "id" value must be the numeric portion of the job identifier. Example: if the scheduler reports 12345.server, then "id" must be "12345".
  - "name" must also be set to the same value for consistency.
- The script must output valid JSON with the following structure:

```json
{
    "data": {
        "jobs": [
            {
                "id": "12345",
                "name": "12345",
                "Job Name": "myjob",
                "Job Owner": "alice",
                "Job State": "R",
                "Queue": "short",
                "Number CPUs": "4",
                "CPU Time": "00:10:32",
                ...
            }
        ],
        "formatting": { ... },
        "defaultColumns": [ "id", "Job State", "Queue", "Number CPUs", "Job Name", "Job Owner", "CPU Time" ],
        "fullColumns": ...
    }
}
```
- The top-level object must contain a single key: "data".
- "data" must contain:
  - "jobs": an array of job objects, each with at least "id" and "name".
  - "formatting": column definitions (id, label, width, etc.).
  - "defaultColumns": minimal columns shown by default.
  - "fullColumns": complete list of available columns.
- The output must be written to standard output (stdout) so it can be captured by the software.
Notes
- The provided Python script works with OpenPBS and is the reference implementation.
- If a different scheduler is used, an identical JSON structure must still be returned.
- If some fields are unavailable in the environment, they must still be included in the output with an empty string ("").
- The "id" field is mandatory for every job, regardless of scheduler or implementation.
Example
```python
#!/usr/bin/env python3
import subprocess
import json

# Run the qstat command and request JSON output
result = subprocess.run(['qstat', '-f', '-F', 'json'],
                        capture_output=True, text=True, check=True)
data = json.loads(result.stdout)

# Job attributes found at the top level of each qstat record
topLevelColumns = (
    "Job_Name", "Job_Owner", "job_state", "queue", "server", "Checkpoint",
    "ctime", "Error_Path", "exec_host", "exec_vnode", "Hold_Types",
    "Join_Path", "Keep_Files", "Mail_Points", "mtime", "Output_Path",
    "Priority", "qtime", "Rerunable", "stime", "session_id", "jobdir",
    "substate", "comment", "etime", "run_count", "Submit_arguments",
    "project", "Submit_Host", "Exit_status", "history_timestamp",
    "Stageout_Status", "Stop_Time")
# Attributes nested under "resources_used" and "Resource_List"
resourceColumns = ("cpupercent", "cput", "mem", "ncpus", "vmem", "walltime")
resourceListColumns = ("ncpus", "nodect", "nodes", "place", "select")

# Column mapping to make the output easier to read
columnMapping = {
    "Job_Name": "Job Name",
    "Job_Owner": "Job Owner",
    "job_state": "Job State",
    "queue": "Queue",
    "server": "Server",
    "Checkpoint": "Checkpoint",
    "ctime": "Created",
    "Error_Path": "Error",
    "exec_host": "Host",
    "exec_vnode": "Virtual Node",
    "Hold_Types": "Hold",
    "Join_Path": "Join",
    "Keep_Files": "Keep",
    "Mail_Points": "Mail Points",
    "mtime": "Modified",
    "Output_Path": "Output",
    "Priority": "Priority",
    "qtime": "Queue Time",
    "Rerunable": "Rerunable",
    "stime": "Started",
    "session_id": "Session ID",
    "jobdir": "Job Directory",
    "substate": "Substate",
    "comment": "Comment",
    "etime": "Ended",
    "run_count": "Run Count",
    "Submit_Host": "Submit Host",
    "Submit_arguments": "Submit Arguments",
    "project": "Project",
    "cpupercent": "CPU Percentage",
    "cput": "CPU Time",
    "mem": "Memory",
    "ncpus": "Number CPUs",
    "vmem": "Virtual Memory",
    "walltime": "Wall Time",
    "select": "Select",
    "place": "Place",
    "nodes": "Nodes",
    "nodect": "Node Count",
    "Exit_status": "Exit Status",
    "history_timestamp": "History Timestamp",
    "Stageout_Status": "Stageout Status",
    "Stop_Time": "Stop Time",
    "name": "name",
}

def column_def(column_id, label, numeric=False, width='150px'):
    """Build one column definition for the formatting table."""
    return {
        "id": column_id,
        "align": 'left',
        "disablePadding": False,
        "label": label,
        "numeric": numeric,
        "filter": True,
        "width": width,
        "formatting": None,
    }

# Define the columns we want to show
formatting = {
    "Checkpoint": column_def('Checkpoint', 'Checkpoint'),
    "Comment": column_def('Comment', 'Comment'),
    "CPU Percentage": column_def('CPU Percentage', 'CPU Percentage'),
    "CPU Time": column_def('CPU Time', 'CPU Time'),
    "Created": column_def('Created', 'Created'),
    "Error": column_def('Error', 'Error File'),
    "Ended": column_def('Ended', 'End Time'),
    "Host": column_def('Host', 'Execution Host'),
    "Virtual Node": column_def('Virtual Node', 'Virtual Node'),
    "Exit Status": column_def('Exit Status', 'Exit Status'),
    "History Timestamp": column_def('History Timestamp', 'History Timestamp'),
    "Hold": column_def('Hold', 'Hold Types'),
    "Job Name": column_def('Job Name', 'Job Name', width='400px'),
    "Job Owner": column_def('Job Owner', 'Job Owner'),
    "Job State": column_def('Job State', 'Job State'),
    "Job Directory": column_def('Job Directory', 'Job Directory'),
    "Join": column_def('Join', 'Join Path'),
    "Keep": column_def('Keep', 'Keep Files'),
    "Mail Points": column_def('Mail Points', 'Mail Points'),
    "Memory": column_def('Memory', 'Memory'),
    "Modified": column_def('Modified', 'Modified Time'),
    "Number CPUs": column_def('Number CPUs', 'Number CPUs', numeric=True),
    "Node Count": column_def('Node Count', 'Nodes Requested', numeric=True),
    "Output": column_def('Output', 'Output', width='400px'),
    "Place": column_def('Place', 'Placement'),
    "Priority": column_def('Priority', 'Priority', numeric=True),
    "Project": column_def('Project', 'Project'),
    "Queue Time": column_def('Queue Time', 'Queue Time'),
    "Queue": column_def('Queue', 'Queue'),
    "Rerunable": column_def('Rerunable', 'Rerunable'),
    "Run Count": column_def('Run Count', 'Run Count'),
    "Select": column_def('Select', 'Select'),
    "Server": column_def('Server', 'Server'),
    "Session ID": column_def('Session ID', 'Session ID'),
    "Stageout Status": column_def('Stageout Status', 'Stageout Status'),
    "Stop Time": column_def('Stop Time', 'Stop Time'),
    "Submit Arguments": column_def('Submit Arguments', 'Submit Arguments'),
    "Submit Host": column_def('Submit Host', 'Submit Host'),
    "Substate": column_def('Substate', 'Substate'),
    "Virtual Memory": column_def('Virtual Memory', 'Virtual Memory'),
    "Wall Time": column_def('Wall Time', 'Wall Time'),
    "id": column_def('id', 'Job Identification', numeric=True, width='200px'),
    "name": column_def('name', 'Job Name', numeric=True, width='200px'),
}

# We don't show all columns by default, just the essentials
defaultColumns = ('id', 'Job State', 'Queue', 'Number CPUs',
                  'Job Name', 'Job Owner', 'CPU Time')
fullColumns = ('id', 'Job State', 'Queue', 'Number CPUs', 'Job Name',
               'Job Owner', 'CPU Time', 'Job Directory', 'Checkpoint',
               'Comment', 'CPU Percentage', 'Created', 'Error', 'Ended',
               'Host', 'Virtual Node', 'Exit Status', 'History Timestamp',
               'Hold', 'Join', 'Keep', 'Mail Points', 'Memory', 'Modified',
               'Node Count', 'Output', 'Place', 'Priority', 'Project',
               'Queue Time', 'Rerunable', 'Run Count', 'Select', 'Server',
               'Session ID', 'Stageout Status', 'Stop Time',
               'Submit Arguments', 'Submit Host', 'Substate',
               'Virtual Memory', 'Wall Time', 'name')

jobData = data.get("Jobs", {})
data = {"jobs": [], "formatting": formatting,
        "defaultColumns": defaultColumns, "fullColumns": fullColumns}

for job_name in jobData:
    job = {}
    # Copy the top-level attributes, falling back to "" when missing
    for column in topLevelColumns:
        if column in jobData[job_name]:
            job[columnMapping[column]] = jobData[job_name][column]
        else:
            job[columnMapping[column]] = ""
    if "resources_used" in jobData[job_name]:
        for column in resourceColumns:
            if column in jobData[job_name]["resources_used"]:
                job[columnMapping[column]] = jobData[job_name]["resources_used"][column]
            else:
                job[columnMapping[column]] = ""
    if "Resource_List" in jobData[job_name]:
        for column in resourceListColumns:
            if column in jobData[job_name]["Resource_List"]:
                job[columnMapping[column]] = jobData[job_name]["Resource_List"][column]
            else:
                job[columnMapping[column]] = ""
    # Drop the @hostname suffix from the job owner
    job["Job Owner"] = job["Job Owner"].split("@")[0]
    # Required fields for the table: the numeric portion of the job identifier
    job["id"] = job_name.split(".")[0]
    job["name"] = job_name.split(".")[0]
    data["jobs"].append(job)

output = {"data": data}
print(json.dumps(output, indent=4))
```
Get Jobs Full
To interface correctly with the software, the user must provide a script or program that lists all jobs (including completed jobs) from their scheduling system and returns them in the required JSON format.
Functional Requirements
- The script must retrieve job information from the job scheduling system.
  - If OpenPBS is used, the provided Python script demonstrates how this can be done with qstat -xf -F json.
  - If a different scheduler is used (e.g., Slurm's sacct), the logic may be implemented in any language or method appropriate for the environment.
- The script must normalize job fields to a standardized set of names using the reference column mapping. Examples:
  - "Job_Name" → "Job Name"
  - "Job_Owner" → "Job Owner" (without the @hostname suffix)
  - "job_state" → "Job State"
  - "ncpus" → "Number CPUs"
  - "cput" → "CPU Time"
  - "Exit_status" → "Exit Status"
- Each job entry must include an "id" field.
  - "id" must be the numeric portion of the job identifier.
  - "name" must also be present and set to the same value.
- If fields are missing from the scheduler, they must still be present in the JSON output with an empty string ("").
- The script must output valid JSON with the following structure:
  - The top-level object must contain a single key: "data".
  - "data" must contain:
    - "jobs": an array of job objects, each with at least "id" and "name".
    - "formatting": column definitions (id, label, width, etc.).
    - "defaultColumns": minimal columns shown by default.
    - "fullColumns": complete list of available columns.
- The output must be written to standard output (stdout) so it can be captured by the software.
Notes
- The provided Python script works with OpenPBS and is the reference implementation.
- If a different scheduler is used, the same JSON structure must still be returned.
- This script is distinct from the regular job-listing script because it also includes finished jobs, not just active ones.
- The "id" field is mandatory for every job, regardless of scheduler or implementation.
Example
The reference script is identical to the Get Jobs example above; the only difference is that qstat is invoked with the -xf flags so that finished (historical) jobs are included in the output:

```python
# Only this line changes relative to the Get Jobs script:
result = subprocess.run(['qstat', '-xf', '-F', 'json'],
                        capture_output=True, text=True, check=True)
```
Get Queue Machines
To interface correctly with the software, a script or program that lists all machines (nodes) known to the scheduler and returns them in the required JSON format must be provided.
Functional Requirements
- The script must retrieve machine (node) information from the job scheduling system in use.
  - If OpenPBS is used, the provided Python script demonstrates how this can be done with pbsnodes -a -F json.
  - If a different scheduler is used (e.g., Slurm's scontrol show nodes), the logic may be implemented in any language or method appropriate for the environment.
- The script must normalize machine fields to a standardized set of names using the reference column mapping. Examples:
  - "pcpus" → "Number CPUs"
  - "ncpus" (from assigned resources) → "Used CPUs"
  - "arch" → "Operating System"
  - "mem" → "Available Memory"
  - "Mom" → "Head Node"
  - "pbs_version" → "PBS Version"
- Each machine entry must include an "id" field.
  - "id" must be set to the machine's name (e.g., BigRig).
  - "name" must also be present and set to the same value.
- The script must calculate the "Available CPUs" field as: Available CPUs = Number CPUs − Used CPUs (e.g., a 24-CPU node with 4 assigned CPUs reports 20 Available CPUs).
  - If values are missing, they must be set to 0.
- The script must output valid JSON with the following structure:

```json
{
    "data": {
        "machines": [
            {
                "id": "BigRig",
                "name": "BigRig",
                "Head Node": "bigrig",
                "Port Number": 15002,
                "PBS Version": "23.06.06",
                "State": "free",
                "Number CPUs": 24,
                "Used CPUs": 0,
                "Available CPUs": 24,
                "Operating System": "linux",
                "Host Name": "bigrig",
                "Available Memory": "49405656kb",
                "Run Type": "mcnp,development,scale,BigRig",
                "Virtual Node": "BigRig",
                "Reservable": "True",
                "Sharing": "default_shared",
                "License": "l",
                "Last State Change": 1747056010,
                "Last Used Time": 1747056010
            }
        ],
        "formatting": { ... },
        "defaultColumns": [ "id", "Number CPUs", "Used CPUs", "Available CPUs", "State" ],
        "fullColumns": ...
    }
}
```
- The top-level object must contain a single key: "data".
- "data" must contain:
  - "machines": an array of machine objects, each with at least "id" and "name".
  - "formatting": column definitions (id, label, width, etc.).
  - "defaultColumns": minimal columns shown by default.
  - "fullColumns": complete list of available columns.
- The output must be written to standard output (stdout) so it can be captured by the software.
Notes
- The provided Python script works with OpenPBS and is the reference implementation.
- If a different scheduler is used, the same JSON structure must still be returned.
- If some fields are unavailable in the environment, they must still be included in the output with null (or 0 for numeric fields).
- The "id" field is mandatory for every machine, regardless of scheduler or implementation.
Example
```python
#!/usr/bin/env python3
import subprocess
import json

# Run the pbsnodes command and request JSON output
result = subprocess.run(['pbsnodes', '-a', '-F', 'json'],
                        capture_output=True, text=True, check=True)
# Parse JSON
data = json.loads(result.stdout)
nodes = data.get("nodes", {})
# A node record looks like:
# {'Mom': 'bigrig', 'Port': 15002, 'pbs_version': '23.06.06', 'ntype': 'PBS',
#  'state': 'free', 'pcpus': 24,
#  'resources_available': {'arch': 'linux', 'host': 'bigrig',
#                          'mem': '49405656kb', 'ncpus': 24,
#                          'run_type': 'mcnp,development,scale,BigRig',
#                          'vnode': 'BigRig'},
#  'resources_assigned': {}, 'resv_enable': 'True',
#  'sharing': 'default_shared', 'license': 'l',
#  'last_state_change_time': 1747056010, 'last_used_time': 1747056010}

# Node attributes found at the top level of each record
topLevelColumns = ("Mom", "Port", "pbs_version", "state", "pcpus",
                   "resv_enable", "sharing", "license",
                   "last_state_change_time", "last_used_time")
# Attributes nested under "resources_available"
resourceAvailableColumns = ('arch', 'host', 'mem', 'ncpus',
                            'run_type', 'vnode')
# Column mapping to make the output easier to read
columnMapping = {
    "id": "Machine",
    "Mom": "Head Node",
    "Port": "Port Number",
    "pbs_version": "PBS Version",
    "state": "State",
    "pcpus": "Number CPUs",
    "arch": "Operating System",
    "host": "Host Name",
    "mem": "Available Memory",
    "ncpus": "Available CPUs",
    "run_type": "Run Type",
    "vnode": "Virtual Node",
    "ucpus": "Used CPUs",
    "resv_enable": "Reservable",
    "sharing": "Sharing",
    "last_state_change_time": "Last State Change",
    "last_used_time": "Last Used Time",
    "license": "License",
    "name": "name",
}

def column_def(column_id, label, numeric=False, width='150px'):
    """Build one column definition for the formatting table."""
    return {
        "id": column_id,
        "align": 'left',
        "disablePadding": False,
        "label": label,
        "numeric": numeric,
        "filter": True,
        "width": width,
        "formatting": None,
    }

formatting = {
    "id": column_def('id', 'Machine', width='250px'),
    "Head Node": column_def('Head Node', 'Head Node'),
    "Port Number": column_def('Port Number', 'Port Number', numeric=True),
    "PBS Version": column_def('PBS Version', 'PBS Version'),
    "State": column_def('State', 'State', width='200px'),
    "Number CPUs": column_def('Number CPUs', 'Number CPUs', numeric=True),
    "Operating System": column_def('Operating System', 'Operating System'),
    "Host Name": column_def('Host Name', 'Host Name'),
    "Available Memory": column_def('Available Memory', 'Available Memory', numeric=True),
    "Available CPUs": column_def('Available CPUs', 'Available CPUs', numeric=True),
    "Run Type": column_def('Run Type', 'Run Type'),
    "Virtual Node": column_def('Virtual Node', 'Virtual Node', numeric=True),
    "Used CPUs": column_def('Used CPUs', 'Used CPUs', numeric=True),
    "Reservable": column_def('Reservable', 'Reservable'),
    "Sharing": column_def('Sharing', 'Sharing'),
    "Last State Change": column_def('Last State Change', 'Last State Change'),
    "Last Used Time": column_def('Last Used Time', 'Last Used Time'),
    "name": column_def('name', 'Machine Name', numeric=True, width='200px'),
}

defaultColumns = ('id', 'Number CPUs', 'Used CPUs', 'Available CPUs', 'State')
fullColumns = ('id', 'Number CPUs', 'Used CPUs', 'Available CPUs', 'State',
               'Head Node', 'Port Number', 'PBS Version', 'Operating System',
               'Host Name', 'Available Memory', 'Run Type', 'Virtual Node',
               'Reservable', 'Sharing', 'Last State Change', 'Last Used Time')

data = {"machines": [], "formatting": formatting,
        "defaultColumns": defaultColumns, "fullColumns": fullColumns}

for machine_name in nodes:
    machine = {}
    for column in topLevelColumns:
        if column in nodes[machine_name]:
            machine[columnMapping[column]] = nodes[machine_name][column]
        else:
            machine[columnMapping[column]] = None
    if "resources_available" in nodes[machine_name]:
        for column in resourceAvailableColumns:
            if column in nodes[machine_name]["resources_available"]:
                machine[columnMapping[column]] = nodes[machine_name]["resources_available"][column]
            else:
                machine[columnMapping[column]] = None
    # Missing CPU counts must default to 0 so the subtraction below works
    if not machine["Number CPUs"]:
        machine["Number CPUs"] = 0
    if 'resources_assigned' in nodes[machine_name]:
        if 'ncpus' in nodes[machine_name]['resources_assigned']:
            machine['Used CPUs'] = nodes[machine_name]['resources_assigned']['ncpus']
        else:
            machine['Used CPUs'] = 0
    else:
        machine['Used CPUs'] = 0
    if "Number CPUs" in machine:
        machine["Available CPUs"] = int(
            machine["Number CPUs"]) - int(machine["Used CPUs"])
    # Required fields for the table
    machine["id"] = machine_name
    machine["name"] = machine_name
    data["machines"].append(machine)

output = {"data": data}
print(json.dumps(output, indent=4))
```
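On a Slurm system, equivalent node data can be derived from sinfo. The following is a minimal sketch under that assumption, not the reference implementation; it uses the %C specifier, which reports CPU counts as allocated/idle/other/total.

```python
#!/usr/bin/env python3
# Illustrative sketch: Slurm node listing (assumption: sinfo is available).
import subprocess
import json

# -N lists one line per node, -h drops the header;
# %n hostname, %t node state, %C CPUs as allocated/idle/other/total
result = subprocess.run(['sinfo', '-N', '-h', '-o', '%n|%t|%C'],
                        capture_output=True, text=True, check=True)

machines = []
for line in result.stdout.splitlines():
    name, state, cpu_counts = line.split('|')
    allocated, idle, other, total = (int(x) for x in cpu_counts.split('/'))
    machines.append({
        "id": name,  # mandatory, per the requirements
        "name": name,
        "State": state,
        "Number CPUs": total,
        "Used CPUs": allocated,
        "Available CPUs": total - allocated,
    })
```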
Get Queue Info
To interface correctly with the software, a script or program that retrieves relevant queue information from the scheduling system and returns it in the required JSON format must be provided.
Functional Requirements
- The script must query the job scheduler for queue information.
  - If using OpenPBS, the provided Python script demonstrates this with qstat -Q -f -F json.
  - If using a different scheduler (e.g., Slurm, LSF, SGE), the logic may be implemented in any language appropriate for the environment.
- The script must return only fields that are relevant to the user, not the full raw scheduler data. Required fields include queue metadata such as:
  - Queue name and type
  - Priority
  - Whether the queue is enabled/started
  - Total jobs
  - Run types (if applicable)
  - Current usage (e.g., used CPUs, node counts)
  - Job state breakdown (Queued, Running, Held, etc.)
- Each queue entry must include an "id" field.
  - "id" must be the queue name.
  - "name" must also be present and set to the same value.
- If a required field is unavailable from the scheduler, it must still be included in the JSON output with an empty string ("") or zero (0) as appropriate.
- The script must output valid JSON with the following structure:

```json
{
    "data": {
        "queues": [
            {
                "Queue Type": "Execution",
                "Priority": 2,
                "Total Jobs": 0,
                "Enabled": "True",
                "Started": "True",
                "Run Types": "scale",
                "Used CPUs": 0,
                "Node Count": 0,
                "Queued": "0",
                "Running": "0",
                "Held": "0",
                "Transit": "0",
                "Exiting": "0",
                "Begun": "0",
                "id": "scale",
                "name": "scale"
            }
        ],
        "formatting": { ... },
        "defaultColumns": [ "id", "Enabled", "Started", "Queue Type", "Priority", "Total Jobs", "Queued", "Running" ],
        "fullColumns": ...
    }
}
```
- The top-level object must contain a single key: "data".
- "data" must contain:
  - "queues": an array of queue objects with the required user-relevant fields.
  - "formatting": column definitions (id, label, width, etc.).
  - "defaultColumns": minimal columns shown by default.
  - "fullColumns": complete list of available columns.
- The output must be written to standard output (stdout) so it can be captured by the software.
Notes:
- The provided Python script is the reference implementation for OpenPBS.
- If a different scheduler is used, the same JSON structure must still be replicated.
- Only fields that are relevant to the user should be returned. Internal scheduler fields (e.g., raw resource limits, system-level bookkeeping) must not be exposed.
- The "id" field is mandatory for every queue, regardless of scheduler or implementation.
Example
```python
#!/usr/bin/env python3
import subprocess
import json

# Run the qstat command and request JSON output
result = subprocess.run(['qstat', '-Q', '-f', '-F', 'json'],
                        capture_output=True, text=True, check=True)
# Parse JSON
data = json.loads(result.stdout)

# Queue attributes found at the top level of each record; raw resource
# limits are deliberately omitted, per the notes above, so internal
# scheduler bookkeeping is not exposed
topLevelColumns = ("queue_type", "Priority", "total_jobs",
                   "enabled", "started")
# Column mapping to make the output easier to read
columnMapping = {
    "queue_type": "Queue Type",
    "Priority": "Priority",
    "total_jobs": "Total Jobs",
    "run_type": "Run Types",
    "ncpus": "Used CPUs",
    "enabled": "Enabled",
    "started": "Started",
    "Transit": "Transit",
    "Queued": "Queued",
    "Held": "Held",
    "Waiting": "Waiting",
    "Running": "Running",
    "Exiting": "Exiting",
    "Begun": "Begun",
    "name": "Queue",
    "id": "Queue",
}

def column_def(column_id, label, width='150px'):
    """Build one column definition for the formatting table."""
    return {
        "id": column_id,
        "align": 'left',
        "disablePadding": False,
        "label": label,
        "numeric": False,
        "filter": True,
        "width": width,
        "formatting": None,
    }

formatting = {
    "id": column_def('id', 'Queue', width='250px'),
    "Queue Type": column_def('Queue Type', 'Queue Type'),
    "Priority": column_def('Priority', 'Priority'),
    "Total Jobs": column_def('Total Jobs', 'Total Jobs'),
    "Run Types": column_def('Run Types', 'Run Types'),
    "Used CPUs": column_def('Used CPUs', 'Used CPUs'),
    "Enabled": column_def('Enabled', 'Enabled'),
    "Started": column_def('Started', 'Started'),
    "Transit": column_def('Transit', 'Transit'),
    "Queued": column_def('Queued', 'Queued'),
    "Held": column_def('Held', 'Held'),
    "Waiting": column_def('Waiting', 'Waiting'),
    "Running": column_def('Running', 'Running'),
    "Exiting": column_def('Exiting', 'Exiting'),
    "Begun": column_def('Begun', 'Begun'),
    "name": column_def('name', 'Queue', width='250px'),
}

defaultColumns = ('id', 'Enabled', 'Started', 'Queue Type',
                  'Priority', 'Total Jobs', 'Queued', 'Running')
fullColumns = ('id', 'Queue Type', 'Priority', 'Total Jobs', 'Run Types',
               'Used CPUs', 'Enabled', 'Started', 'Transit', 'Queued',
               'Held', 'Waiting', 'Running', 'Exiting', 'Begun')

queueData = data.get("Queue", {})
data = {"queues": [], "formatting": formatting,
        "defaultColumns": defaultColumns, "fullColumns": fullColumns}

for queueName in queueData:
    queue = {}
    for column in topLevelColumns:
        if column in queueData[queueName]:
            queue[columnMapping[column]] = queueData[queueName][column]
        else:
            queue[columnMapping[column]] = ""
    # The run types come from the queue's default chunk, if one is set
    if 'default_chunk' in queueData[queueName]:
        if 'run_type' in queueData[queueName]['default_chunk']:
            queue['Run Types'] = queueData[queueName]['default_chunk']['run_type']
        else:
            queue['Run Types'] = 'ALL'
    else:
        queue['Run Types'] = 'ALL'
    # Current usage comes from the assigned resources
    if 'resources_assigned' in queueData[queueName]:
        if 'ncpus' in queueData[queueName]['resources_assigned']:
            queue['Used CPUs'] = queueData[queueName]['resources_assigned']['ncpus']
        else:
            queue['Used CPUs'] = ''
        if 'nodect' in queueData[queueName]['resources_assigned']:
            queue['Node Count'] = queueData[queueName]['resources_assigned']['nodect']
        else:
            queue['Node Count'] = ''
    else:
        queue['Used CPUs'] = ''
        queue['Node Count'] = ''
    # state_count is a string such as "Transit:0 Queued:5 Held:1 ..."
    if 'state_count' in queueData[queueName]:
        stateData = queueData[queueName]['state_count'].split(' ')
        queue['Transit'] = stateData[0].split(':')[1]
        queue['Queued'] = stateData[1].split(':')[1]
        queue['Held'] = stateData[2].split(':')[1]
        queue['Waiting'] = stateData[3].split(':')[1]
        queue['Running'] = stateData[4].split(':')[1]
        queue['Exiting'] = stateData[5].split(':')[1]
        queue['Begun'] = stateData[6].split(':')[1]
    # Required fields for the table
    queue['id'] = queueName
    queue['name'] = queueName
    data['queues'].append(queue)

output = {"data": data}
print(json.dumps(output, indent=4))
```
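The reference script parses PBS's state_count string (e.g. "Transit:0 Queued:5 Held:1 Waiting:0 Running:3 Exiting:0 Begun:0") by fixed position. A slightly more defensive alternative, shown here only as a sketch, is to split each token on the colon so the order of states does not matter:

```python
def parse_state_count(state_count):
    """Turn PBS's 'Transit:0 Queued:5 ...' string into a dict of counts."""
    counts = {}
    for token in state_count.split():
        state, _, value = token.partition(':')
        counts[state] = value
    return counts

# e.g. parse_state_count("Transit:0 Queued:5") == {"Transit": "0", "Queued": "5"}
```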
Rerun Job
To interface correctly with the software, a script or program that reruns one or more jobs in the queue and returns the results in the required JSON format must be provided.
Functional Requirements
- The script must accept one or more job IDs as input arguments.
  - The job IDs will be passed on the command line. Example: ./rerunJob.py 12345 67890
- For each job ID provided, the script must attempt to rerun the job.
  - If using OpenPBS, the provided Python script demonstrates this with qrerun -f <jobid>.
  - If using another scheduler, an equivalent rerun/resubmit command must be substituted.
- The script must not throw an exception if the scheduler command fails. Instead, it must capture both stdout and stderr and record success/failure for each job.
- The script must output valid JSON with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job rerun successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 not found"
        }
    }
}
```
"data" must be a dictionary keyed by job ID.
-
Each job ID must map to:
-
-
"status": either "ok" or "error".
-
"output": scheduler output (if successful).
-
"error": scheduler error message (if failed).
-
The output must be written to standard output (stdout) so it can be captured by the software.
-
Notes:
- The provided Python script is the reference implementation for OpenPBS.
- If a different scheduler is used, the same JSON structure must be replicated even if the rerun command is different.
- Multiple job IDs may be passed in a single call, and the result for each job must be included separately in the "data" object.
- This script differs from the listing scripts: it performs an action (rerunning jobs) and reports back the outcome.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Rerun a job from the queue')
    # ids is a list of job ids
    parser.add_argument('ids', nargs='+', help='The job ID to rerun')
    args = parser.parse_args()
    if len(args.ids) == 0:
        return
    results = {}
    for id in args.ids:
        # Run the qrerun command;
        # don't use check=True so no exception is thrown on failure
        result = subprocess.run(
            ['qrerun', '-f', id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {"data": results}
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
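On Slurm, the closest equivalent action is scontrol requeue; only the command in the loop changes. A hedged one-line substitution, shown as a sketch:

```python
# Slurm variant (assumption): requeue the job instead of calling qrerun
result = subprocess.run(['scontrol', 'requeue', id],
                        capture_output=True, text=True)
```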
Suspend Job
To interface correctly with the software, a script or program that suspends one or more jobs in the queue and returns the results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more job IDs as input arguments.
  - The job IDs will be passed on the command line.
  - Example: `./suspendJob.py 12345 67890`
- For each job ID provided, the script must attempt to suspend the job.
  - If using OpenPBS, the reference implementation uses `qsig -s suspend <jobid>`.
  - If using another scheduler, substitute the equivalent suspend/hold command.
- The script must not throw an exception if the scheduler command fails.
  - It must capture both stdout and stderr from the scheduler.
- The script must output valid JSON with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job suspended successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 not found"
        }
    }
}
```

- "data" must be a dictionary keyed by job ID.
- Each job ID must include:
  - "status": "ok" if suspend succeeded, "error" if not.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print JSON to stdout.

Notes:

- This is an action script (like rerun, cancel, resume, etc.): it does not list or query jobs, it executes a command.
- If OpenPBS is not used, the custom implementation must still produce the same JSON response.
- Multiple jobs may be passed in a single call, and results must be reported individually under "data".
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(
        description='Suspend a job from the queue')
    # ids is a list of job IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The job IDs to suspend')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # Run the qsig command.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qsig', '-s', 'suspend', job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
Resume Job
To interface correctly with the software, a script or program that resumes one or more suspended jobs and returns the results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more job IDs as input arguments.
  - Example usage: `./resumeJob.py 12345 67890`
- For each job ID provided, the script must attempt to resume the job.
  - If using OpenPBS, the reference implementation resumes with `qsig -s resume <jobid>` followed by `qrun <jobid>`.
  - If using another scheduler, substitute the equivalent resume/unhold command.
- The script must not raise an exception if the scheduler command fails.
  - It must capture both stdout and stderr output.
- The script must return results in JSON format with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job resumed successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 not found"
        }
    }
}
```

- "data" is a dictionary keyed by job ID.
- Each job result must include:
  - "status": "ok" if resume succeeded, "error" if not.
  - "output": scheduler output if successful.
  - "error": scheduler error if failed.
- The script must always write JSON to stdout.

Notes:

- This is an action script, like rerun or suspend.
- Multiple job IDs may be given at once, and each must have its own result in the JSON.
- If the scheduler does not require both a resume and a run step (as OpenPBS does with qsig + qrun), an appropriate equivalent may be used, but the JSON format must stay consistent.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Resume a job from the queue')
    # ids is a list of job IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The job IDs to resume')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # OpenPBS needs two steps: resume the suspended job, then ask the
        # server to run it. The commands run sequentially; chaining with
        # '&&' inside an argument list does not work.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qsig', '-s', 'resume', job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            result = subprocess.run(
                ['qrun', job_id],
                capture_output=True,
                text=True
            )
        if result.returncode == 0:
            # Both commands succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
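If the two commands must run as a single chained shell invocation instead of two `subprocess.run` calls, `shell=True` requires a command string rather than an argument list. A sketch (the helper name is hypothetical; `shlex.quote` guards the interpolated job ID):

```python
import shlex
import subprocess

def resume_via_shell(job_id):
    # Hypothetical helper: chain qsig and qrun in one shell command.
    # With shell=True, pass a single string, not a list of arguments.
    quoted = shlex.quote(job_id)
    return subprocess.run(
        f'qsig -s resume {quoted} && qrun {quoted}',
        shell=True,
        capture_output=True,
        text=True
    )
```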
Hold Job
To integrate correctly with the software, a script or program that places one or more jobs on hold and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more job IDs as input arguments.
  - Example usage: `./holdJob.py 12345 67890`
- For each job ID provided, the script must attempt to place the job on hold.
  - With OpenPBS, the reference implementation uses `qhold <jobid>`.
  - With another scheduler, use the equivalent hold/pause command.
- The script must not throw exceptions on failure.
  - Both stdout and stderr must be captured.
- The script must return results in JSON format, with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job held successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 not found"
        }
    }
}
```

- "data" must be an object keyed by job ID.
- Each job entry must include:
  - "status": "ok" if successful, "error" otherwise.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print valid JSON to stdout.

Notes:

- This is another action script in the same family as rerun, suspend, and resume.
- The JSON contract is identical across all of them; only the scheduler command changes.
- If multiple job IDs are provided, each one must be processed independently and reported in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Hold a job from the queue')
    # ids is a list of job IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The job IDs to hold')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # Run the qhold command.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qhold', job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
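Since the job action scripts differ only in the scheduler command they run, the shared per-job loop can be factored into one helper. The sketch below is illustrative rather than one of the reference scripts; the `run_action` name is hypothetical:

```python
import subprocess
import json

def run_action(command_prefix, ids):
    # Run `command_prefix + [item_id]` for each ID and build the
    # common JSON contract used by all of the action scripts.
    results = {}
    for item_id in ids:
        result = subprocess.run(
            command_prefix + [item_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            results[item_id] = {'status': 'ok', 'output': result.stdout}
        else:
            results[item_id] = {'status': 'error', 'error': result.stderr}
    print(json.dumps({"data": results}, indent=4))

# The hold script, for example, reduces to:
# run_action(['qhold'], ['12345', '67890'])
```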
Release Job
To integrate correctly with the software, a script or program that releases one or more jobs from hold and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more job IDs as input arguments.
  - Example usage: `./releaseJob.py 12345 67890`
- For each job ID provided, the script must attempt to release the job.
  - With OpenPBS, the reference implementation uses `qrls <jobid>`.
  - If using another scheduler, substitute the equivalent release/unhold command.
- The script must not throw exceptions when a command fails.
  - Both stdout and stderr from the scheduler must be captured.
- The script must return results in JSON format with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job released successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 not found"
        }
    }
}
```

- "data" must be an object keyed by job ID.
- Each entry must include:
  - "status": "ok" if release succeeded, "error" if not.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print valid JSON to stdout.

Notes:

- This is part of the same action script family as hold, suspend, resume, and rerun.
- The JSON contract is identical across all of them; only the underlying scheduler command changes.
- Multiple job IDs may be processed in one call, and each must have its own result in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(
        description='Release a job from the queue')
    # ids is a list of job IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The job IDs to release')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # Run the qrls command.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qrls', job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
Delete Job
To integrate correctly with the software, a script or program that deletes one or more jobs from the queue and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more job IDs as input arguments.
  - Example usage: `./deleteJob.py 12345 67890`
- For each job ID provided, the script must attempt to delete the job.
  - With OpenPBS, the reference implementation uses `qdel -W force <jobid>`.
  - With another scheduler, use the equivalent delete/terminate command.
- The script must not raise exceptions when a command fails.
  - It must capture both stdout and stderr from the scheduler.
- The script must output results in JSON format with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job deleted successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 is already finished"
        }
    }
}
```

- "data" must be an object keyed by job ID.
- Each entry must include:
  - "status": "ok" if delete succeeded, "error" if not.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always write valid JSON to stdout.

Notes:

- This is an action script in the same family as hold, release, suspend, resume, and rerun.
- The JSON contract is identical across all of them; the only difference is which scheduler command is executed.
- Multiple job IDs may be processed in one call, and each job's result must appear in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Delete a job from the queue')
    # ids is a list of job IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The job IDs to delete')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # Run the qdel command; -W force removes the job unconditionally.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qdel', '-W', 'force', job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
Move Job
To integrate correctly with the software, a script or program that moves one or more jobs to a different queue and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept a target queue name and one or more job IDs as input arguments.
  - Example usage: `./moveJob.py shortq 12345 67890`
  - Here, jobs 12345 and 67890 will be moved to the shortq queue.
- For each job ID provided, the script must attempt to move the job to the specified queue.
  - With OpenPBS, the reference implementation uses `qmove <queue> <jobid>`.
  - If using another scheduler, substitute the equivalent move/transfer command.
- The script must not raise exceptions on command failure.
  - Both stdout and stderr from the scheduler must be captured.
- The script must return results in JSON format with the following structure:

```json
{
    "data": {
        "12345": {
            "status": "ok",
            "output": "Job moved successfully"
        },
        "67890": {
            "status": "error",
            "error": "Job 67890 cannot be moved"
        }
    }
}
```

- "data" must be an object keyed by job ID.
- Each entry must include:
  - "status": "ok" if move succeeded, "error" if not.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print valid JSON to stdout.

Notes:

- This is another action script in the same family as delete, hold, release, suspend, resume, and rerun.
- The JSON contract is identical across all of them; only the underlying scheduler command and arguments change.
- Multiple jobs can be moved in one call, and each job must have its own result in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(
        description='Move a job to a different queue')
    # The target queue comes first, followed by one or more job IDs.
    parser.add_argument('queue', help='The queue to move the job to')
    parser.add_argument('ids', nargs='+', help='The job IDs to move')
    args = parser.parse_args()
    results = {}

    for job_id in args.ids:
        # Run the qmove command.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['qmove', args.queue, job_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[job_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[job_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
Hold Machine
To integrate correctly with the software, a script or program that places one or more machines (nodes) on hold and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more machine IDs as input arguments.
  - Example usage: `./holdMachine.py bigrig bigmac`
- For each machine ID provided, the script must attempt to hold the machine.
  - With OpenPBS, the reference implementation uses `pbsnodes -o <machineid>`.
  - With another scheduler, substitute the equivalent "disable / offline / hold" command for nodes.
- The script must not raise exceptions on command failure.
  - Both stdout and stderr must be captured.
- The script must return results in JSON format with the following structure:

```json
{
    "data": {
        "bigrig": {
            "status": "ok",
            "output": "Machine bigrig held successfully"
        },
        "bigmac": {
            "status": "error",
            "error": "Machine bigmac not found"
        }
    }
}
```

- "data" must be an object keyed by machine ID.
- Each entry must include:
  - "status": "ok" if hold succeeded, "error" otherwise.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print valid JSON to stdout.

Notes:

- This is a machine action script rather than a job action script.
- It follows the same JSON contract as the job actions (delete, hold, release, etc.), so the consuming software does not care whether the operation is on jobs or machines.
- Multiple machine IDs can be processed in a single call, and each must have its own result in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Hold a machine')
    # ids is a list of machine IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The machine IDs to hold')
    args = parser.parse_args()
    results = {}

    for machine_id in args.ids:
        # Run the pbsnodes command; -o marks the node offline.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['pbsnodes', '-o', machine_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[machine_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[machine_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```
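To confirm a hold took effect, the node can be queried afterwards: running `pbsnodes <machineid>` with no options prints the node's attributes, including a `state` line that should contain `offline` once the hold has been applied. A minimal sketch that parses that plain-text output (the exact layout may vary by PBS version, so treat this as an assumption to verify locally):

```python
import subprocess

def is_offline(machine_id):
    # Query the node and look for 'offline' in its reported state.
    result = subprocess.run(
        ['pbsnodes', machine_id],
        capture_output=True,
        text=True
    )
    for line in result.stdout.splitlines():
        stripped = line.strip()
        if stripped.startswith('state ='):
            return 'offline' in stripped
    return False
```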
Clear Machine
To integrate correctly with the software, a script or program that clears one or more machines (nodes) and returns results in the required JSON format must be provided.
Functional Requirements:

- The script must accept one or more machine IDs as input arguments.
  - Example usage: `./clearMachine.py bigrig bigmac`
- For each machine ID provided, the script must attempt to clear the machine, making it available for job scheduling.
  - With OpenPBS, the reference implementation uses `pbsnodes -r <machineid>`.
  - With another scheduler, substitute the equivalent "release / enable / bring online" command.
- The script must not raise exceptions on command failure.
  - Both stdout and stderr from the scheduler must be captured.
- The script must return results in JSON format with the following structure:

```json
{
    "data": {
        "bigrig": {
            "status": "ok",
            "output": "Machine bigrig cleared successfully"
        },
        "bigmac": {
            "status": "error",
            "error": "Machine bigmac is not held"
        }
    }
}
```

- "data" must be an object keyed by machine ID.
- Each entry must include:
  - "status": "ok" if clear succeeded, "error" otherwise.
  - "output": scheduler output if successful.
  - "error": scheduler error message if failed.
- The script must always print valid JSON to stdout.

Notes:

- This is a machine control script, like the hold machine script.
- It follows the same JSON contract as both the machine and job scripts, so the software can process all actions uniformly.
- Multiple machine IDs can be processed in one run, each with its own result in the "data" object.
Example
```python
#!/usr/bin/env python3
import subprocess
import json
import argparse

def main():
    parser = argparse.ArgumentParser(description='Clear a machine')
    # ids is a list of machine IDs; nargs='+' requires at least one
    parser.add_argument('ids', nargs='+', help='The machine IDs to clear')
    args = parser.parse_args()
    results = {}

    for machine_id in args.ids:
        # Run the pbsnodes command; -r clears the offline state.
        # check=True is deliberately omitted so a failure does not raise.
        result = subprocess.run(
            ['pbsnodes', '-r', machine_id],
            capture_output=True,
            text=True
        )
        if result.returncode == 0:
            # Command succeeded
            results[machine_id] = {
                'status': 'ok',
                'output': result.stdout
            }
        else:
            # Non-zero return code means failure
            results[machine_id] = {
                'status': 'error',
                'error': result.stderr
            }
    output = {}
    output["data"] = results
    print(json.dumps(output, indent=4))

if __name__ == "__main__":
    main()
```