Building resilient and highly available printing
Here at PaperCut we understand that no two networks are the same and no two customer requirements are ever the same. With different needs, budgets, infrastructures, and implementations there is not a one-size-fits-all solution to building resilient and highly available print environments.
We have been in the print industry for over 20 years and we have worked hard to always ensure that we have developed our solutions with flexibility, simplicity, and choice in mind — to make sure that we can fit into all of these different environments and give you the right solution for your needs.
PaperCut has been implemented in sites ranging from one user through to over one million users, one printer through to thousands of printers, one office to multi-country enterprises, on-prem and cloud servers, the list goes on! During this time we have learnt a lot about what is important when designing the right print implementation for you and your end users and we want to share our knowledge, tips, and ideas with you to help you plan the right solution to meet your needs.
We will assume that you, too, have been asked many questions about providing a resilient or Highly Available (HA) printing solution, and that you want to understand the options available to you along with their benefits. Well, you are in luck. We are going to dig into the different HA options and resiliency solutions that you can employ with your PaperCut installation to give you confidence that your systems are protected to the appropriate level.
To make it easier for those looking for specific information, feel free to jump to the relevant section for your needs. Otherwise we will work through the various options and discuss them, going from the simplest through to the most complex, to give some thoughts around the different solutions available. At the end we will run through the 5 top tips we have around building a resilient and highly available print network.
Now, let's dig into the fun stuff.
Business continuity planning is really a fact-finding mission for you, and the information you find can be applied not just to your print environment, but to every aspect of your end users' requirements.
Forming a good Business Continuity Plan (BCP) starts with communicating with each of the department heads within your organization to understand their needs. From a print perspective, the questions should revolve around their print, copy, scan, and fax needs. You should also ask what impact these services being unavailable would have on their daily work, what alternatives they could use during an outage, and how long they could cope without these services.
The information should be fully documented and provided to each department, detailing the steps to take if these services become unavailable.
This then feeds directly into your IT Disaster Recovery Plan (IT DR Plan) — the Systems Administrator playbook. The IT department uses it as the recovery manual for a step-by-step process to restore services to end users.
It's important that the IT DR Plan is written and developed based on the Business Continuity Plan, as the acceptable downtime and data loss will determine the appropriate level of data/system protection needed.
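To make this concrete, here is a minimal sketch of how the answers gathered from each department could be recorded and boiled down to the targets your IT DR Plan must meet. The class name, field names, and example figures are all hypothetical; the only idea taken from the text above is that the strictest acceptable downtime and data loss drive the level of protection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepartmentNeed:
    """One department's answers from the BCP interviews (hypothetical fields)."""
    department: str
    max_downtime_minutes: int    # how long they can cope without print services
    max_data_loss_minutes: int   # how much recent print data they can afford to lose


def dr_targets(needs):
    """Return the strictest (downtime, data loss) targets across all departments."""
    if not needs:
        raise ValueError("no department needs recorded")
    return (
        min(n.max_downtime_minutes for n in needs),
        min(n.max_data_loss_minutes for n in needs),
    )


# Illustrative figures only.
needs = [
    DepartmentNeed("Finance", max_downtime_minutes=60, max_data_loss_minutes=15),
    DepartmentNeed("Marketing", max_downtime_minutes=480, max_data_loss_minutes=240),
]
downtime_target, data_loss_target = dr_targets(needs)
```

In this example the plan would need to restore printing within 60 minutes and lose no more than 15 minutes of data, because Finance has the tightest requirements.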
Say hello to a Systems Administrator's best friend: backups. This is truly the entry level into building a resilient print environment. A good backup strategy can go a long way. As I'm sure you all know, there is a lot of flexibility in designing the right backup strategy for you. Options range from fast Differential and Incremental backups, which in turn give a longer restore time, through to a slower Full backup, which gives a faster restore time.
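The restore-time trade-off is easier to see when you count how many backup sets a restore has to replay. The sketch below is illustrative only and is not tied to any particular backup product. It assumes a differential holds every change since the last full backup, and an incremental holds the changes since the previous backup of any kind.

```python
def restore_chain(backups):
    """Return the backup labels needed to restore to the most recent backup.

    `backups` is a chronological list of (label, kind) tuples, where kind is
    "full", "differential", or "incremental".
    """
    kinds = [kind for _, kind in backups]
    if "full" not in kinds:
        raise ValueError("a restore needs at least one full backup")

    # Start from the most recent full backup.
    last_full = len(kinds) - 1 - kinds[::-1].index("full")
    chain = [backups[last_full][0]]
    later = backups[last_full + 1:]

    # The newest differential supersedes everything between it and the full.
    later_kinds = [kind for _, kind in later]
    if "differential" in later_kinds:
        last_diff = len(later_kinds) - 1 - later_kinds[::-1].index("differential")
        chain.append(later[last_diff][0])
        later = later[last_diff + 1:]

    # Every remaining incremental must be replayed in order.
    chain.extend(label for label, kind in later if kind == "incremental")
    return chain


incremental_week = [("sun", "full"), ("mon", "incremental"),
                    ("tue", "incremental"), ("wed", "incremental")]
differential_week = [("sun", "full"), ("mon", "differential"),
                     ("tue", "differential")]
```

With the incremental schedule, a restore on Wednesday replays four backup sets. With the differential schedule, it only ever needs two: the full backup and the latest differential.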
PaperCut NG/MF also has built-in backups that can be run weekly. This backup will create a backup file of the database, which can be used to restore to a new server in the event of a catastrophic failure or corruption. If you choose to use this backup, ensure that you store this weekly backup in a safe location and not on the PaperCut Application server.
This backup will give you some protection for the database to prevent total loss of data, but you should use it in conjunction with a server backup strategy, which gives you several layers of protection and a more frequent backup process to reduce the amount of data loss.
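As a rough sketch of moving that weekly backup off the Application server, the helper below copies the newest backup file from one folder to another. Both folder paths and the `*.zip` file pattern are assumptions for illustration, so check where your installation writes its backups and what the files are called before relying on anything like this.

```python
import shutil
from pathlib import Path


def copy_latest_backup(source_dir, dest_dir, pattern="*.zip"):
    """Copy the most recently modified backup file to another location.

    source_dir, dest_dir, and pattern are placeholders; they are not taken
    from PaperCut documentation.
    """
    candidates = [p for p in Path(source_dir).glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no files matching {pattern} in {source_dir}")

    latest = max(candidates, key=lambda p: p.stat().st_mtime)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # copy2 preserves timestamps, which helps when auditing backup age later.
    return Path(shutil.copy2(latest, dest / latest.name))


# Example call (hypothetical paths):
# copy_latest_backup("/path/to/papercut/backups", "/mnt/offsite/papercut")
```

A script like this would normally be scheduled to run shortly after the weekly backup completes, with the destination on separate hardware.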
Advantages:
- Easy for System Administrators to implement and maintain.
- Fits easily into standard IT DR plans for all servers.
- Low learning curve.
- Potentially cheap solution, with the option to use a NAS as the backup location.
Disadvantages:
- Needs regularly scheduled restore testing to validate that the solution works.
- Relies on backup hardware to work.
- Can be time consuming to restore.
- Can be subject to data loss, depending on the backup schedule.
- If corruption occurs during a backup, it can go unnoticed until the time of restore.
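One inexpensive way to catch part of that last problem is to record a checksum when the backup is written and compare it before you rely on the file. This is a generic sketch using Python's standard library, not a PaperCut feature. It only detects damage that happens after the checksum was recorded, so it complements regular test restores rather than replacing them.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(path, expected_hex):
    """True if the file still matches the checksum recorded at backup time."""
    return sha256_of(path) == expected_hex.lower()
```

Store the recorded checksum somewhere other than alongside the backup file, so that a single storage failure cannot damage both.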
The introduction of VMs onto physical servers changed the way of working for most organizations. Prior to VMs, a single physical server was generally used for one, maybe two, purposes. This was a waste of resources in most cases. VMs changed the landscape by allowing multiple servers to co-exist on the same physical machine, but be segregated, so they acted as if they were the only machine on the physical server.
This meant greater flexibility, better resource usage, and more options for Systems Administrators. It also gave us VM snapshots: point-in-time backups of the system that can be taken quickly and easily, and that provide a simple rollback procedure if an upgrade or change goes wrong.
Advantages:
- Quick and easy to take snapshots and roll back to previous versions.
- Built in as an option for most VM solutions.
- Allows multiple snapshots to be taken each day.
- Great to capture prior to a change to a production system.
Disadvantages:
- Storage for the snapshots may become costly, depending on how much space is required to store them.
- Restoring a VM is still a manual process.
- As this is a backup of the system, it can still allow for data loss, especially if used on the Application Server or Database server.
- Often reliant upon external storage such as a NAS/SAN if keeping a large number of snapshots.
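Because snapshot storage can grow quickly, most sites settle on a retention rule. The sketch below shows one possible rule: keep the newest few snapshots plus any you have explicitly protected, such as a pre-upgrade snapshot. The function and its parameters are hypothetical and do not call any hypervisor API; it only decides which snapshot names are candidates for deletion.

```python
from datetime import datetime


def snapshots_to_delete(snapshots, keep_latest=3, protected=()):
    """Return snapshot names that fall outside the retention rule, oldest first.

    `snapshots` maps a snapshot name to the datetime it was taken.
    """
    newest_first = sorted(snapshots, key=snapshots.get, reverse=True)
    keep = set(newest_first[:keep_latest]) | set(protected)
    return [name for name in reversed(newest_first) if name not in keep]


# Illustrative data only.
snaps = {
    "pre-upgrade": datetime(2024, 1, 1),
    "monday": datetime(2024, 1, 2),
    "tuesday": datetime(2024, 1, 3),
    "wednesday": datetime(2024, 1, 4),
}
expired = snapshots_to_delete(snaps, keep_latest=2, protected=["pre-upgrade"])
```

Here only "monday" is flagged: the two newest snapshots are kept by the rule, and "pre-upgrade" is kept because it is protected.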
Spinning up a stand-by system is another simple way to provide some basic redundancy. This is a simple case of having a machine ready to be plugged in and started up to restore print services. This is not a personal favourite, but it can provide some confidence that you can get the system operational again quite quickly. This solution would usually be suggested for more static systems, but could be used for a secondary print server that only processes print jobs and does not manage or store any critical data. A site server could also be configured in this manner, but be aware that when you power up your stand-by system, a site server will start a database synchronization before it is available for use.
Advantages:
- Can allow a quick restoration of print queues.
- Easy to implement and easy to instruct anyone on how to replace a failed system.
- Machine is always in a ready state and just needs to be turned on.
Disadvantages:
- A physical machine is required to be on standby. This is a waste of resources for most sites.
- Will only be appropriate for secondary print servers or potentially a site server.
- Will need to be updated and maintained whenever changes are made to the active print server.
- Can be costly, depending on the specs of the print server it is standing by for.
Up until this point, for the most part, we've been talking about HA solutions that aren't offered with PaperCut NG/MF out of the box.
Enter Site Servers. Introduced in PaperCut version 15.0, Site Server offers a way to protect users from losing access to their MFDs' off-the-glass functions when a network disruption cuts a site off from the PaperCut Application server hosted in another location.
A Site Server also helps to keep print jobs local to the site they were printed at, and reduces the bandwidth load on the WAN links.
Advantages:
- Allows printing and photocopying to continue in the event of a WAN outage.
- Helps keep jobs local to the site in which they were printed.
- Keeps a record of the end user's credit, to help ensure that they do not exceed their limit if their account is restricted.
Disadvantages:
- Still reliant upon a physical server to run. This means that the server will still need some protection to prevent a hardware failure stopping the server.
- Is a resiliency option rather than a full HA option, so services such as Integrated Scanning or print job archiving will not work during an outage.
Network Load Balancers (NLBs) have been around for a number of years and are becoming more popular as their versatility and cost make them more accessible to smaller organizations. Essentially, an NLB routes incoming traffic to multiple backend servers. This allows for scalability and some level of redundancy within the network.
From a print perspective this becomes a fantastic tool to allow System Administrators to take control of the incoming traffic and easily scale up and down the backend servers to meet the print management needs.
Taking this to the next level with PaperCut, it allows for multiple print servers to be monitored and tracked while providing a full range of print services to the end users. Each backend server would be configured as a secondary print server, with the same print queues advertised. If running as a VM, an image could be taken of one of the print servers, to be used as a template to spin up additional servers as required.
You can also configure each server to run as a Mobility Print server to cater for your BYOD users. The biggest benefit comes from a single print queue being advertised via the load balancer while, in actual fact, the print jobs are spread across a range of servers. The failure of one print server will therefore not affect the end users' ability to print. In most cases they will not even be aware that the failure has occurred.
For more information on how PaperCut can work in an NLB environment, please check out our Network Load Balancing Knowledge Base article.
Advantages:
- A single print queue is presented to end users, while the load is actually spread across all print servers, reducing server wastage.
- Scalable, allowing print servers to be added or removed without any user downtime.
- The failure of one print server will, in most cases, be unknown to the end users.
- Can help to segregate the printer VLAN from the end user VLAN.
Disadvantages:
- Can be difficult to configure and troubleshoot, depending on implementation.
- Reliant upon the Network Load Balancer to work. If the NLB fails, this could take out all printing.
- Can be expensive, depending on the NLB purchased.
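To illustrate why end users rarely notice a failed print server behind a load balancer, here is a toy round-robin balancer that skips servers marked as down. It is a teaching sketch only: the class, method names, and server names are hypothetical, and a real NLB performs its own health checks and connection handling.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend print servers in turn, skipping any marked unhealthy."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one backend server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each backend at most once per call before giving up.
        for _ in range(len(self._servers)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy print servers available")


balancer = RoundRobinBalancer(["print-a", "print-b", "print-c"])
```

If "print-b" is marked down, jobs simply alternate between "print-a" and "print-c". The final `RuntimeError` case mirrors the last two disadvantages above: once nothing healthy sits behind the balancer, or the balancer itself fails, all printing stops.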
Server clustering is an industry standard for helping to protect the important servers in your environment, by allowing multiple physical machines to work together to ensure server uptime.
By putting your PaperCut Application server, PaperCut Site Server, and/or PaperCut Secondary server into a clustered environment, you protect against a physical server failure or even a data centre failure (if you are running your servers from multiple data centres).
Clustering can be configured so that multiple physical servers run the PaperCut Application server, Site Servers, and print servers directly on the physical machines, or so that the physical servers host them in a clustered VM environment.
If you are looking at setting up a clustered environment you will need to determine which servers need to be protected and how you want to protect them.
Most customers will run an external clustered database as well as a clustered PaperCut Application server to ensure that the core aspects of the PaperCut software are protected (the database is where the tracking, charging, reporting, and configuration data is saved). But a cluster configuration can be as complex as you want it to be, with the ability to cluster all print servers in your environment to ensure maximum uptime. It really comes down to understanding the business needs and the trade-offs associated with maintaining extremely complex systems.
Advantages:
- Can provide high levels of uptime.
- Can provide automatic failover and failback solutions.
- Can cater for whole site outages.
- The failure of one print server will, in most cases, be unknown to the end users.
- Can protect against data loss.
Disadvantages:
- Can be complex to configure correctly.
- Can be time consuming to maintain, document, and upgrade.
- Requires a higher level of systems administration knowledge.
- Can be expensive.
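A quick back-of-the-envelope calculation shows where the uptime gain comes from, and helps when weighing it against the cost and complexity. The sketch assumes node failures are independent and that failover is instant, which real clusters only approximate, so treat the result as an upper bound. The availability figures are illustrative, not measured values.

```python
def combined_availability(node_availability, nodes):
    """Availability of a cluster that stays up while at least one node is up."""
    if not 0 <= node_availability <= 1 or nodes < 1:
        raise ValueError("availability must be in [0, 1] and nodes >= 1")
    return 1 - (1 - node_availability) ** nodes


def downtime_hours_per_year(availability):
    """Expected downtime over a 365-day year for a given availability."""
    return (1 - availability) * 365 * 24


single = downtime_hours_per_year(0.99)                               # about 87.6 hours
clustered = downtime_hours_per_year(combined_availability(0.99, 2))  # under 1 hour
```

Under these assumptions, two 99% servers in a cluster reduce expected downtime from roughly 88 hours a year to under one hour.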
Public cloud hosting takes away some of the headaches related to clustering. This is because when you use a public cloud, such as Azure, Google Cloud, or AWS, their cloud infrastructure already works in a cluster configuration. Their platforms are already duplicating your servers across their server environment to ensure that one failed data centre will not take your servers offline.
So the considerations for a cloud-hosted solution are not really about protecting the server, so much as protecting the underlying network environment.
For cloud hosted solutions we see a couple of different implementations across most of our customers to protect their print environment.
Option 1: All print servers are hosted in the cloud
For this type of scenario, as discussed, you don't need to worry too much about making the server highly available. But you will need to consider all other aspects related to your print environment.
To ensure the security of your print jobs and to allow the PaperCut server to resolve your internal printers you should look at running a VPN from site to cloud. This will ensure all transmissions are encrypted and can allow for some compression to be applied.
You will need to consider protecting your Internet connection. No Internet means no access to your cloud printers. A backup Internet connection can help here for redundancy. If you choose to do this, use a different provider for each of the two connections. That way, if one provider has an issue, the other will keep you online.
Sending print jobs up and down across an Internet connection can create some job latency, due to the size of the spool files. As discussed above, using a VPN can allow for some job compression to improve this, but there will likely still be some overhead.
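To get a feel for the latency involved, you can estimate the transfer time from the spool file size and the link speed. The numbers below are purely illustrative, and the calculation ignores protocol overhead and other traffic on the link, so real transfers will be slower.

```python
def transfer_seconds(spool_megabytes, link_megabits_per_second, compression_ratio=1.0):
    """Estimate one-way transfer time for a spool file.

    compression_ratio is original size divided by compressed size,
    so 2.0 means the data is halved in transit.
    """
    if link_megabits_per_second <= 0 or compression_ratio <= 0:
        raise ValueError("link speed and compression ratio must be positive")
    megabits = spool_megabytes * 8 / compression_ratio
    return megabits / link_megabits_per_second


one_way = transfer_seconds(50, 20)                            # 20.0 seconds
compressed = transfer_seconds(50, 20, compression_ratio=2.0)  # 10.0 seconds
```

Remember that a job printed via a cloud-hosted print server travels up to the cloud and back down to the printer, so the user waits for both legs.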
Option 2: Host a Site server / Print server on premises
If you choose to host a site server or print server on premises then you gain some advantages over having everything in the cloud.
Print jobs can be kept local to your site. This removes the need for large spool files to be sent across the Internet and back to site again, which helps to remove the latency issues.
Running a print server onsite will allow you to configure the system to allow printing to continue in the event of an Internet outage. If you choose to run a Site Server, you will also be able to keep your MFDs working and track photocopying on top of the printing.
Advantages:
- For public cloud, the servers will already be in a Highly Available state.
- Less cost for hardware purchasing and maintenance.
- Via VPN, can make the server available to end users anywhere in the world.
- With Site/Print servers on premises, can still allow printing even during an Internet outage.
Disadvantages:
- May require increased bandwidth to manage the print jobs in the network if all servers are in the cloud.
- Greater reliance upon the Internet connection, which may require a second connection into the business.
- Privacy needs to be considered.
We have run through a number of different options that are available to allow you to protect your print environment from failures. The options that we have discussed do not have to be used as standalone solutions. In fact, you would usually use them in conjunction with each other to give your network the greatest resilience possible.
A good IT Disaster Recovery Plan will allow you to recover from a failure.
A great IT Disaster Recovery Plan will be well thought out, with a solid understanding of the business needs and well-documented steps that can be easily followed to restore services after a failure.
Many of our customers choose to implement a multi-tiered level of protection. A basic solution will use VM snapshots in conjunction with an effective backup strategy. A more complex solution may implement a clustered environment for the PaperCut Application server and external database in conjunction with a load balanced print server environment, which provides redundancy, server protection, AND scalability.
Which leads nicely into Top 5 Tips...
1. Understand the business needs
Before you decide on an IT Disaster Recovery Plan it's important to understand the business needs. There is no point in deciding that a simple backup/restore procedure is the right DR solution if your business decides that they cannot be without print services for longer than five minutes. There is also not much point in setting up clustering if your business decides that eight hours is acceptable downtime.
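As a rough illustration of matching the solution to the business need, the sketch below maps an acceptable downtime to one of the approaches discussed earlier. The thresholds are hypothetical placeholders chosen only to be consistent with the two examples above (five minutes and eight hours). Your own cut-off points should come from your Business Continuity Plan and from restore times you have actually measured.

```python
def suggest_protection(acceptable_downtime_minutes):
    """Suggest a starting point for discussion; thresholds are illustrative only."""
    if acceptable_downtime_minutes < 0:
        raise ValueError("acceptable downtime cannot be negative")
    if acceptable_downtime_minutes <= 15:
        # Too short for any manual restore, so failover must be automatic.
        return "clustering and/or load balancing"
    if acceptable_downtime_minutes <= 240:
        # Enough time for a quick roll back, not for a full rebuild.
        return "VM snapshots plus backups"
    return "backup and restore"


five_minutes = suggest_protection(5)
eight_hours = suggest_protection(8 * 60)
```

With these placeholder thresholds, a five-minute tolerance points towards clustering or load balancing, while an eight-hour tolerance is comfortably served by a plain backup and restore procedure.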
2. Do not over engineer the solution
As IT professionals, we are often prone to suggesting complex solutions because of our desire to learn something new and keep developing our skills. A fully redundant print network is often desirable, with heaps of learning and skills development to be gained. However, what you need to remember is to choose and implement the right solution for the business, because you must consider not only the implementation but also the cost and ongoing support required to maintain that system, including training others to troubleshoot and maintain it.
3. Document the solution
This is a step that is often forgotten about or left as a to-do for later. Whatever solution you decide to implement in your environment, ensure that you have great documentation, including diagrams and troubleshooting/recovery steps. Creating this documentation as you implement gives you the best chance of capturing the important information.
4. Share the knowledge
This is the second step that is usually overlooked. Once you have a working solution in place, ensure that you share this knowledge with your team. Shared knowledge will ensure that no one person can be a single point of failure. We once saw a highly complex HA solution implemented at a customer site that was bulletproof with no single point of failure, or so they thought. The Systems Administrator who implemented the solution left the company and took all of the knowledge of how it was set up with him. The result: an issue arose with one of the clusters that no one knew how to identify and resolve, and massive downtime was incurred. The customer had to remove this HA solution and implement something simpler that they could actually support and maintain.
5. TEST, TEST, TEST
Whatever solution is implemented to protect your print environment, ensure that you schedule time on a regular basis to validate that it works and still meets business needs. Too many times a solution is put into place and then forgotten about until something goes wrong. If the business needs have changed or there has been a change in the infrastructure, finding out after a problem has occurred that the current solution does not meet those needs is too late. Depending on your environment, you should be reviewing this at least annually and testing at least twice a year.