You could do something like the following:
- Create a new topic like "WORKSPACE_FAILURE_RESUBMIT"
- When starting the server jobs, make sure each one is configured to notify this topic if it fails.
- Create a Workspace subscription on this topic that triggers a new workspace, e.g. ResubmitJob.fmw
The workspace ResubmitJob.fmw could go something like this:
- Read the timestamp of the last startup from a small JSON or text file
- Use the FME Server REST API to get all the jobs that have failed since the last startup time
- For each failed job, request the detailed job information, which includes the published parameter values
- Use the FME Server REST API to resubmit the job with the same parameter values
- Save the startup time of this workspace run to the JSON or text file from the first step
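
If you would rather prototype the bookkeeping outside of FME first, here is a minimal Python sketch of the timestamp handling and the "failed since last startup" filter. The file name last_startup.json, the key last_startup and the field name timeFinished are assumptions for illustration, and it assumes the job timestamps are ISO 8601 strings with a UTC offset. Check what your FME Server version actually returns.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("last_startup.json")  # hypothetical file name


def read_last_startup(path=STATE_FILE):
    """Return the saved startup time, or the Unix epoch if no file exists yet."""
    if not path.exists():
        return datetime.fromtimestamp(0, tz=timezone.utc)
    with path.open() as f:
        return datetime.fromisoformat(json.load(f)["last_startup"])


def save_startup(when, path=STATE_FILE):
    """Store the startup time of the current run as an ISO 8601 string."""
    with path.open("w") as f:
        json.dump({"last_startup": when.isoformat()}, f)


def failed_since(jobs, since, time_field="timeFinished"):
    """Keep only the jobs that finished after `since`.

    `time_field` is an assumed field name; both sides of the comparison
    must be timezone-aware datetimes.
    """
    return [j for j in jobs if datetime.fromisoformat(j[time_field]) > since]
```

Take the current time at the very start of the run and call save_startup with it only after all resubmissions went through, so a crash halfway does not hide failed jobs from the next run.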
Sounds like something that could make for a very nice addition to FME Hub!
API call to get a list of all the failed jobs:
/fmerest/v3/transformations/jobs/completed?completedState=failed
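
As a rough sketch, the call above could be prepared in Python like this. The host name, the token value and the limit/offset paging parameters are assumptions on my part, as is the fmetoken Authorization header format. Verify them against the REST API documentation of your FME Server version.

```python
from urllib.parse import urlencode
from urllib.request import Request


def failed_jobs_request(base_url, token, limit=100, offset=0):
    """Build (but do not send) a GET request for the failed jobs."""
    # limit and offset are assumed paging parameters
    query = urlencode({"completedState": "failed", "limit": limit, "offset": offset})
    url = f"{base_url.rstrip('/')}/fmerest/v3/transformations/jobs/completed?{query}"
    return Request(url, headers={
        "Authorization": f"fmetoken token={token}",  # assumed token header format
        "Accept": "application/json",
    })
```

You would then pass the result to urllib.request.urlopen and parse the response body with json.load.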
API call to get the job details, including published parameters, for job ID 123:
/fmerest/v3/transformations/jobs/id/123/request
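
To resubmit, you could turn the job details into a new submit request, roughly as sketched below. This assumes the details contain a publishedParameters list of name/value pairs and that jobs are submitted by POSTing JSON to /fmerest/v3/transformations/submit/<repository>/<workspace>. Both are assumptions to check against your server's REST API documentation, and the function names are made up for illustration.

```python
from urllib.parse import quote


def resubmit_payload(job_request):
    """Build a submit body that reuses the original published parameters."""
    # "publishedParameters" is the assumed key in the job details response
    params = job_request.get("publishedParameters", [])
    return {
        "publishedParameters": [
            {"name": p["name"], "value": p["value"]} for p in params
        ]
    }


def submit_path(repository, workspace):
    """Return the assumed submit endpoint path for a repository/workspace pair."""
    return (
        "/fmerest/v3/transformations/submit/"
        f"{quote(repository, safe='')}/{quote(workspace, safe='')}"
    )
```

The repository and workspace names are passed in explicitly here, since how they appear in the job details may differ between versions.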