Failure mode’s a godsend for keeping your print activity humming along when pesky connection errors hit.
Which is why I wanted to dive into how it works using the PaperCut Print Provider installed as part of the Application Server, a Secondary (or Site) Server, or our Direct Print Monitor.
It all leads to oodles of juicy questions: what’s the impact when the Application Server’s unavailable and failure mode kicks in? Is there a time delay? What’s the difference if you use a Site Server? And so on.
It could be tough, but let’s give answering them a go. After all, failure’s the key to success. Or something. I swear I heard that somewhere.
A failure mode story
Let’s start by describing a possible scenario: You’ve got a customer with a PaperCut MF or NG Application Server installed in their data center.
They have a remote office where a print server isn’t practical, and they’ve installed our Direct Print Monitor tool on each user’s computer. Happy days – everything’s working.
When the customer returns from a weekend break, they find the local water company’s dug up the road outside their office, and drilled through the link to their data center. What a way to start a Monday! 😞
But what about that urgent meeting the customer has with their client today? How are they going to print the handouts? No link to the data center = no printing, right?
Well, not quite. We’ve got a way to handle this by using our failure mode feature for each printer. Excellent.
There are three different options for failure mode, and an administrator can choose the best one on a per-printer basis:
- Allow new print jobs to print but do not log (default)
- Allow new print jobs to print and log after reconnection
- Do not allow new print jobs to print but hold and wait for reconnection
Each of these options offers different compromises, and the best option depends on the needs and priorities of a particular customer.
For example, if it’s important to never interrupt printing, choose either the first or the second option.
If it’s important to strictly enforce quotas (i.e. allow the job to be canceled if the user doesn’t have enough quota), and it’s acceptable to delay printing until the connection is re-established, choose the third option.
To dig into the details of what each option does, check out the manual.
What’s happening behind the scenes
Let’s look at what goes on under the covers when the Application Server becomes unavailable and causes these failure modes to be triggered.
We’ll start with the user sending their print job to a direct queue (i.e. it’ll print straight out).
While it’s spooling to the print queue, the Print Provider will ping off a pre-notification to the Application Server to let it know a job’s being submitted. We don’t need any feedback from this.
It’s a simple heads-up (“we’re going to be chatting to you about this job shortly”), which you’ll be able to see in our debug logs:
2018-08-22 12:38:17,400 DEBUG: xmlrpc.c:282 - Calling print-provider.processPrintJob6 (details of the print job here) on server: 10.10.10.2:9191
2018-08-22 12:38:27,415 DEBUG: os.c:1174 - Timeout connecting to 10.11.183.111:9191
2018-08-22 12:38:27,415 DEBUG: PrintMonitor.cpp:2406 - Unable to contact server for pre-notification. JobId: 7
Fun fact! It also triggers the desktop pop-up at this point for users who are required to choose a shared account, or charge to their personal account via pop-up confirmation.
This is why you sometimes see the ‘…’ ellipses for pages and cost, because it’s waiting for the full details of the job to arrive from the Print Provider.
Once the job has fully arrived at the print queue, the Print Provider will analyze the job for all the details – whether it’s color or grayscale, who the user is, etc.
Next up, with the analysis complete, it’s time to ask the Application Server what to do with this job. We already know our connection’s down, but the Print Provider doesn’t… So what actually happens?
1. With the job paused in the queue, the Print Provider asks the Application Server what to do with it.
2018-08-22 12:38:27,415 DEBUG: xmlrpc.c:282 - Calling print-provider.processPrintJob6 (details of the print job here) on server: 10.10.10.2:9191 
2. The connection to the Application Server fails.
2018-08-22 12:38:37,423 DEBUG: os.c:1174 - Timeout connecting to 10.10.10.2:9191 
3. We retry calling the Application Server two more times (here’s hoping).
2018-08-22 12:38:37,423 DEBUG: http.c:63 - http call to 10.10.10.2:9191 failed, retrying soon (retry 1 of 2)
2018-08-22 12:38:50,159 DEBUG: os.c:1174 - Timeout connecting to 10.10.10.2:9191
2018-08-22 12:38:50,159 DEBUG: http.c:63 - http call to 10.10.10.2:9191 failed, retrying soon (retry 2 of 2)
2018-08-22 12:39:03,786 DEBUG: os.c:1174 - Timeout connecting to 10.10.10.2:9191
2018-08-22 12:39:03,786 ERROR: PrintMonitor.cpp:3061 - Unable to contact server with rpc_process_print_job: 10.10.10.2:9191. JobId: 7
4. At this point, we give up trying to connect, and print the job according to the failure mode.
2018-08-22 12:39:03,786 DEBUG: Printers.cpp:1217 - Resume print job. JobId: 7
2018-08-22 12:39:03,786 DEBUG: PrintMonitor.cpp:3287 - Finished processing job 7 on printer ‘executives printer’. Job Threads: 1, Total Jobs: 1
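If it helps to see those four steps as code, here’s a rough Python sketch of the flow. To be clear, this isn’t the Print Provider’s actual implementation: `process_job`, `call_server` and the event strings are made-up names for illustration, and the “queue log entry for later” step is only my reading of what “log after reconnection” implies. The one initial call plus two retries, and the three failure mode options, come straight from the walkthrough above.

```python
from enum import Enum
from typing import Callable, List


class FailureMode(Enum):
    # The three per-printer options described earlier in the article.
    PRINT_NO_LOG = "Allow new print jobs to print but do not log"
    PRINT_LOG_LATER = "Allow new print jobs to print and log after reconnection"
    HOLD = "Do not allow new print jobs to print but hold and wait for reconnection"


def process_job(call_server: Callable[[], bool],
                mode: FailureMode,
                retries: int = 2) -> List[str]:
    """Return the sequence of events for one paused job.

    call_server is a hypothetical stand-in for the RPC to the Application
    Server: it returns True if a reply arrived and False on a timeout.
    """
    events: List[str] = []
    for attempt in range(retries + 1):  # one initial call plus the retries
        if call_server():
            events.append("server replied")
            return events
        events.append("timeout")
        if attempt < retries:
            events.append(f"retry {attempt + 1} of {retries}")

    # Every attempt timed out, so fall back to the printer's failure mode.
    events.append("unable to contact server")
    if mode is FailureMode.HOLD:
        events.append("hold job")
    else:
        events.append("resume job")
        if mode is FailureMode.PRINT_LOG_LATER:
            # Assumption: the job details are kept so they can be logged later.
            events.append("queue log entry for later")
    return events


if __name__ == "__main__":
    # Simulate the outage from the story: every call times out.
    for event in process_job(lambda: False, FailureMode.PRINT_NO_LOG):
        print(event)
```

With a server that never answers, the events come out in the same order as the log lines: timeout, retry 1 of 2, timeout, retry 2 of 2, timeout, unable to contact server, then the job is resumed.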
Now, why was it worth running through this in so much detail?
Your eagle eyes may have noticed this doesn’t all happen within a couple of seconds – not everyone’s network is lightning fast, so we allow up to 10 seconds for a reply to come through.
If we add up this whole process (the pre-notification plus three attempts at up to 10 seconds each, with a short gap between retries), it can take up to 50 seconds for failure mode to kick in and the job to be printed.
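You don’t have to take my word for the maths. Here’s a small Python sketch that adds up the waits using the timestamps from the debug log excerpts above. The labels in `stamps` are my own descriptions rather than anything the Print Provider prints, and the figures only describe this one capture.

```python
from datetime import datetime

# Timestamps copied from the debug log excerpts; the labels are descriptive only.
FMT = "%Y-%m-%d %H:%M:%S,%f"
stamps = {
    "pre-notification sent": "2018-08-22 12:38:17,400",
    "pre-notification timed out": "2018-08-22 12:38:27,415",
    "first call timed out": "2018-08-22 12:38:37,423",
    "retry 1 timed out": "2018-08-22 12:38:50,159",
    "retry 2 timed out, job resumed": "2018-08-22 12:39:03,786",
}

times = [datetime.strptime(value, FMT) for value in stamps.values()]

# Seconds spent between each pair of consecutive log entries.
steps = [(later - earlier).total_seconds()
         for earlier, later in zip(times, times[1:])]
total = (times[-1] - times[0]).total_seconds()

for label, seconds in zip(list(stamps)[1:], steps):
    print(f"{label}: {seconds:.3f}s")
print(f"total: {total:.3f}s")
```

In this capture the first two waits come out at almost exactly 10 seconds each, the two retries at roughly 12.7 and 13.6 seconds (the timeout plus the gap before retrying), and the whole thing at about 46.4 seconds. That sits comfortably inside the “up to 50 seconds” figure.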
To add to this, we don’t have any way of knowing when this connection comes back online, so we’re going to need to run through this process for every job.
Let’s loop back to the goal here, though:
- Does the user get their print job? Yes, they do.
- Does it take a little longer to print? Sure, it does – but how often are we expecting this to happen? Which brings us to…
- If the network’s unstable, maybe a Site Server’s needed.
Life with a Site Server
So, what’s different when the customer has a Site Server?
First, an interesting fact: did you know when you install a Site Server, the Print Provider is automatically configured to talk to the Site Server?
Under normal operation, when the Print Provider makes its initial connection, the Site Server will reply, telling it to talk directly to the Application Server.
When disaster strikes and the Print Provider is no longer able to talk to the Application Server, then the same process from earlier kicks off.
The pre-notification, the three attempts to connect to the Application Server – this all happens in the same way, in the same amount of time.
It’s what happens next that highlights the benefit of the Site Server. The Print Provider will know there’s a Site Server available, and – having failed to connect to the Application Server – will now switch to using the Site Server.
This successful connection will allow the job to be processed and printed (or in the case of a secure release scenario, held and released by a user at an MFD).
To make it even better, the Site Server will also adjust the configuration of the Print Provider so that all further jobs are sent directly to it, avoiding the wait while we attempt to connect.
Once the Application Server is back up, we’ll silently switch it back and everything’s back to normal. Simple.
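For the curious, the Site Server behavior can be pictured as a tiny state machine. The Python below is a toy model under a few assumptions: `ProviderRouting`, `submit` and `app_server_restored` are hypothetical names, and since I haven’t described how the switch back is detected, it’s modeled here as an explicit event.

```python
class ProviderRouting:
    """Toy model of which server a Print Provider sends its jobs to."""

    def __init__(self) -> None:
        # Normal operation: jobs go straight to the Application Server.
        self.target = "application"

    def submit(self, app_server_up: bool) -> str:
        """Return where the job was handled, updating the routing as we go."""
        if self.target == "application":
            if app_server_up:
                return "application server"
            # All attempts timed out: use the Site Server and remember it,
            # so later jobs skip the wait.
            self.target = "site"
            return "site server (after timeouts)"
        return "site server (direct)"

    def app_server_restored(self) -> None:
        # Assumption: recovery arrives as an explicit event in this model.
        self.target = "application"


if __name__ == "__main__":
    routing = ProviderRouting()
    print(routing.submit(app_server_up=True))   # normal day
    print(routing.submit(app_server_up=False))  # first job after the outage
    print(routing.submit(app_server_up=False))  # later jobs avoid the wait
    routing.app_server_restored()
    print(routing.submit(app_server_up=True))   # back to normal
```

Only the first job after the outage pays the full timeout penalty. Every job after that goes straight to the Site Server until the Application Server returns.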
So that’s a look behind the curtains of failure mode within PaperCut MF and NG. I’m off to brew a coffee and keep my fingers crossed the local water company isn’t planning any work.