Can I prevent multi-client jobs from failing if a single client backup fails?


I am running the SBAdmin Network Admin. I have 12 Red Hat Linux clients defined. I have added all of the Linux clients to a single backup job so that each client will back up using the same profile at the same time (I have it scheduled weekly). I noticed, though, that if one of the clients has a fatal error during the backup, the entire job fails and none of the other clients are backed up. I’ve seen this happen twice. The first time, one of the clients experienced a fatal error because the /storix filesystem filled up. The second time, the client could not communicate with the server/network admin due to a network issue. I realize that I could create individual backup jobs for each of the clients to work around this problem, but I think it would be better if SBAdmin would simply skip the problematic client and move on to the next one in order to complete the job. Is this possible?

ANSWER



Job processing performs numerous checks to ensure that the client is available, the server is available, and the client can contact the server before the backup process is actually started on the client. These checks are performed from the administrator system. If any of them fails, that client is simply skipped, the other client backups continue, and the job ends with a warning message indicating that not all clients in the job were processed.

In addition, once the backup process has started on the client, a failure causes it to exit with an error that indicates whether the job should continue. As long as no attempt has been made to write to the backup media or reposition a tape at that point, the job continues with the next client. The exception is when the user has configured a pre-backup program for the client and that program exits with a fatal return code.

The problem you had was most likely caused by the /storix filesystem filling up, or by the network error occurring, at some point after the client had already written backup data to the media. The job cannot continue at that point, because doing so would likely leave an incomplete backup in the middle of the backup label (which contains multiple sequential backups). The next backup could then not be appended to the same media, because it would be in the wrong position (image number) on the media.
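
To make the rules above easier to follow, here is a minimal Python sketch of the decision logic as described in this answer. It is only a simplified model, not SBAdmin code: the Client fields, the Outcome values, and the run_job function are all hypothetical names invented for illustration. It also assumes that a fatal pre-backup program stops the job.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    SKIPPED = "skipped"  # client skipped, job continues with the next client
    FATAL = "fatal"      # job cannot continue past this client


@dataclass
class Client:
    # Hypothetical flags standing in for the conditions described in the answer.
    name: str
    reachable: bool = True            # checks run from the admin system pass
    prebackup_fatal: bool = False     # pre-backup program exits with a fatal code
    fails_before_write: bool = False  # error before any media write or tape reposition
    fails_after_write: bool = False   # error after backup data reached the media


def classify(client: Client) -> Outcome:
    if not client.reachable:
        return Outcome.SKIPPED
    if client.prebackup_fatal:
        return Outcome.FATAL  # assumption: this exception stops the job
    if client.fails_before_write:
        return Outcome.SKIPPED
    if client.fails_after_write:
        return Outcome.FATAL
    return Outcome.OK


def run_job(clients):
    """Process clients in order and stop at the first fatal outcome."""
    results = {}
    for client in clients:
        outcome = classify(client)
        results[client.name] = outcome
        if outcome is Outcome.FATAL:
            break  # later clients in the job are not backed up
    outcomes = list(results.values())
    if Outcome.FATAL in outcomes:
        status = "failed"
    elif Outcome.SKIPPED in outcomes:
        status = "warning"
    else:
        status = "ok"
    return status, results
```

In this model, a client that is unreachable before the backup starts produces a "warning" job in which the other clients are still processed, while a client that fails after writing to the media produces a "failed" job and leaves the later clients unprocessed.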

Also, if you were to create multiple jobs instead of including all clients in one job, the failed backup would still shut down the queue, preventing other jobs from running. This is for the same reason: multiple jobs are often written to the same media. The advantage, however, is that you can fix the problem and restart the queue, which will continue the backups with the client that had failed. When all clients are in a single job, there is no way to restart the job in the middle.
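
The difference in restart behavior can be illustrated with a small Python toy model. This is a sketch under assumptions, not SBAdmin's implementation: the BackupQueue class, its attributes, and the client names are hypothetical. It models a queue of one-client jobs that stops at a fatal failure and, once the problem is fixed, resumes with the client that failed.

```python
class BackupQueue:
    """Toy model of a queue of single-client jobs (hypothetical, not SBAdmin's API)."""

    def __init__(self, client_names):
        self.pending = list(client_names)  # jobs not yet completed, in order
        self.completed = []
        self.stopped = False

    def run(self, broken):
        """Run pending jobs in order; `broken` is the set of clients that fail fatally."""
        self.stopped = False
        while self.pending:
            name = self.pending[0]
            if name in broken:
                self.stopped = True  # queue shuts down; later jobs do not run
                return
            self.completed.append(self.pending.pop(0))


queue = BackupQueue(["client1", "client2", "client3"])

# First run: client2 fails fatally, so the queue stops before client3.
queue.run(broken={"client2"})
print(queue.completed, queue.pending)  # ['client1'] ['client2', 'client3']

# After fixing the problem, restarting the queue resumes with client2.
queue.run(broken=set())
print(queue.completed, queue.pending)  # ['client1', 'client2', 'client3'] []
```

A single job containing all three clients has no equivalent of the second run, which is the trade-off described above.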