Forty Two

42 Printer Languages and Counting

Is it a coincidence that PaperCut currently supports 42 printer description languages, that I am turning 42 years old this year and, as we all know, 42 is the Answer to Life, the Universe and Everything? Yeah, it probably is :-)

Since joining the PaperCut team, I have mainly been working on the area of PaperCut responsible for the analysis of print jobs. I come from a background of programming in C, doing file system programming in the Linux kernel. I worked for SGI on the XFS file system team. The job analysis code in PaperCut is written entirely in C, which has given me a chance to sink my teeth into something quickly. Luckily for me it is also well-written C code and has been designed to be easily extended for new print languages and print drivers. Every few weeks printer manufacturers release new print drivers and these need to be tested with PaperCut. Changes are always occurring. Sometimes these changes are minor and just a little bit of tweaking is required, while others are a lot more complex. In some cases we need to support whole new print languages.

Many of the lower cost printers these days use GDI drivers. These are drivers that don’t support known popular standards like PCL or PostScript and implement their own protocols – often undocumented and not following any existing standard. Handling a new GDI printer language is a bit like playing detective. We reverse engineer the format by printing various sample documents, changing one page attribute at a time, and working out how the driver encodes each attribute (e.g. duplex, gray-scale and paper-size). We have a set of pattern tools and a few techniques that help us perform this decoding process. After a bit of teeth pulling we develop an algorithm to support the new driver. The techniques are in some ways similar to those employed in reverse engineering file system formats!
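To give a feel for the detective work, here is a minimal Python sketch of the "change one attribute and compare" idea. It is not PaperCut's tooling (which is written in C), and the two byte strings are made up for illustration; they do not come from any real driver.

```python
def diff_offsets(a: bytes, b: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, byte_in_a, byte_in_b) wherever the two captures differ.

    zip() stops at the shorter capture, so any trailing bytes are ignored.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical job headers: the same document printed simplex, then duplex.
simplex = bytes.fromhex("1b25ff0001000400")
duplex = bytes.fromhex("1b25ff0002000400")

for offset, old, new in diff_offsets(simplex, duplex):
    print(f"offset {offset}: {old:#04x} -> {new:#04x}")
```

A byte that changes only when one setting changes is a good candidate for where the driver stores that setting. Real spool files are rarely this tidy, which is where the teeth pulling comes in.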

After we’ve made changes to our print analysis code, we run our analyzer tool over a corpus of thousands of real-world documents to ensure we haven’t introduced any regressions (steps backward). The tools also check for memory leaks and compare performance against previous test runs to make sure the software is not getting slower.

I hope you’ve enjoyed the behind the scenes look into some of the technical workings here at PaperCut. For a less technical look, make sure you check out what print management has to do with Coffee?

42 Image from Answer To Life / CC-BY-SA

Posted in General | Leave a comment

PaperCut Version 10.5 Released

The new Paper-Less Desktop Widget to track and compare paper usage.

It’s been two months since our last release, one of the longest gaps we’ve had between releases for a long while. This is however to be expected as this is our largest release yet! It’s also one of our most innovative, pushing new ideas and concepts. This release contains many big ticket items voted for in the last few rounds of voting:

  • Watermarking and job attribution
  • Document digital signatures
  • Print policy popups
  • Multiple personal accounts
  • New desktop widget
  • … and much much more.

New & noteworthy in this release:

Watermarking, Job Attribution and Digital Signatures
Adding text such as a user name to the bottom of a page in a print job was one of our most voted for features through 2009 and 2010. We’ve taken this request and added some of our own innovative ideas to create the new watermarking and job attribution feature. It is now possible to add dynamically constructed text to the bottom of each page (e.g. username), set different font sizes, gray-level and position on page.

We’ve also extended the watermark to include support for digital signatures using a cryptographic HMAC based on SHA1 or MD5. Every document may have a unique signature which can be used to verify the origin and author of any print job. We’ve gathered feedback from a number of our larger corporate and government customers to design this feature and are very excited about the new document tracking possibilities it opens. Our view is that print management software should be more than just tracking & reporting and we’re working hard to innovate in all areas.

Watermarking is currently listed as an experimental feature and currently only supports PostScript printers. Peter is working on PCL support and this is targeted for a subsequent release.
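For readers unfamiliar with HMACs, here is a minimal Python sketch of how an HMAC-based signature like the one mentioned above can be produced and checked. The key, the job-details string and the function names are assumptions for illustration; this is not PaperCut's actual signature format.

```python
import hashlib
import hmac

# Supported digest algorithms, matching the two named in the text.
_DIGESTS = {"sha1": hashlib.sha1, "md5": hashlib.md5}


def sign_job(secret_key: bytes, job_details: str, algorithm: str = "sha1") -> str:
    """Return a hex HMAC over the job details (hypothetical field layout)."""
    return hmac.new(secret_key, job_details.encode("utf-8"), _DIGESTS[algorithm]).hexdigest()


def verify_job(secret_key: bytes, job_details: str, signature: str, algorithm: str = "sha1") -> bool:
    """Recompute the HMAC and compare it in constant time."""
    expected = sign_job(secret_key, job_details, algorithm)
    return hmac.compare_digest(expected, signature)


key = b"example-secret-key"  # made-up key for the demo
details = "user=jsmith;printer=library;time=2010-08-01T09:30"  # made-up job details
signature = sign_job(key, details)
print(verify_job(key, details, signature))                             # True
print(verify_job(key, details.replace("jsmith", "other"), signature))  # False
```

Because the signature depends on a secret key, someone who only sees the printed page cannot forge a matching signature for different job details.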

Print Policy
Print scripting now includes a standard corporate print policy recipe. This allows organizations to implement a print policy where:

  • users are reminded via a popup to print duplex (and must opt-in to print simplex)
  • printing emails is discouraged
  • printing web pages in color is discouraged

Multiple Personal Accounts
Users can now have more than one personal account. At a simple level, this can be used in education environments to separate free print quotas from cash payments, for example, allowing simpler management and reporting. At a more advanced level, multiple personal accounts can be combined with print scripting to allow different departments to manage their own pot of funds and determine on which devices this pot can be used. This feature has been developed in conjunction with Cambridge University in the UK with the aim of satisfying their complex inter-college and inter-department environment.

Multiple Personal Accounts - ideal for higher education

Ad hoc bulk user actions
Ad hoc bulk user actions has been one of the top voted for features for the past few months. Priyanka has done a great job and she’s worked hard to get this into this release.

A new environmental impact desktop widget
We’ve worked with Do Something, the non-profit organization supporting the Paper-Less Alliance, to bring this innovative desktop widget to PaperCut (see screenshot above). The aim of this widget is to help organizations reduce paper by arming users with information. Users can also benchmark their use against the organization average. You can download the widget here.

The widget is also used as a fund-raiser. Organizations looking at deploying this widget are encouraged to make a donation of $0.99 per user with all proceeds going through to Do Something to help implement paper saving and environmental initiatives.

Re-sending data after connection failure
We’ve added new code to handle exceptional cases such as network connections failing between servers – for example when PaperCut is used over a WAN. If the connection temporarily fails, PaperCut can now be configured to locally record transactions and re-send them across when the connection comes back up. Read more here.
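The general technique here is often called store-and-forward. A minimal Python sketch of the idea follows; it is not PaperCut's implementation, the class and method names are hypothetical, and it keeps the queue in memory where a real system would record transactions on disk.

```python
from collections import deque


class StoreAndForward:
    """Queue transactions locally and re-send them, oldest first, when the link is up."""

    def __init__(self, send):
        self._send = send        # callable that raises ConnectionError on failure
        self._pending = deque()  # transactions not yet delivered

    def record(self, transaction) -> bool:
        """Record a transaction, then try to deliver everything that is waiting."""
        self._pending.append(transaction)
        return self.flush()

    def flush(self) -> bool:
        """Deliver queued transactions in order; stop at the first failure."""
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False         # link still down: keep the rest queued
            self._pending.popleft()  # only drop a transaction once it was sent
        return True

    def pending_count(self) -> int:
        return len(self._pending)
```

Calling `flush()` periodically (or whenever the connection is detected again) drains the backlog. A transaction is only removed after a successful send, so nothing is lost if the link drops mid-way.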

We hope you enjoy the bag full of new features. We love hearing your feedback so if you have any comments or suggestions please do let us know. For the full list of changes see the release history and get your downloads here. We’ll keep you posted about features for the next release on our blog and twitter feeds.

Posted in General, Releases | 2 Comments

Five Secret Power Features

Hit the power button to turn on advanced print management features.

Power Features!

One of my roles at PaperCut is providing technical support by working directly with you to find the specific features required to resolve your print management problems. As each individual site deploys PaperCut to address its unique issues, I receive suggestions for new product features and enhancements. Many of the ideas that are sent to us have been developed into features and are available in PaperCut. In the past year over 100 new features and enhancements have been added in 14 version releases, and we have more on the way with version 10.5! The ever-growing list is chronicled in our Release History, news feeds, blogs and twitter.

It is difficult to predict which of the many features will become the most popular, but I would like to share with you my personal list of secret power features. These are features that are off the beaten track, but are received with great enthusiasm when I explain them to customers. Many of the features are not new, but provide critical functionality for a site once they are discovered and implemented.

  1. PaperCut can stay synchronized with the Office and Department fields in Active Directory or LDAP allowing you to create reports to compare printing within or between offices and departments such as the Department printing – job type summary report.
  2. Administrators can receive automatic email alerts on printer error conditions that contain information including the error type (e.g. paper jam, toner low), time that the error was first reported, location and number of jobs in the queue. Here is a link to the manual section that covers System Notifications.
  3. Print Scripting was introduced earlier this year and has had many “Recipes” and “Snippets” added over the last few months. Scripts can be used to provide precise control of print job handling including configuring PaperCut to perform least-cost-routing, print job redirection, and environmental warnings based on Group, job size, time of day and many other criteria. Maybe you wish to remind people to print duplex/double-sided or even stop printing of emails. This level of print management (print policy control) is all possible! From the Admin Console select the Printer tab then select a test printer or the Template Printer. Select the Scripting tab then the “Import Recipe” or “Import Snippet…” button to get a list of pre-built scripts and segments. There is also a summary of pre-built recipes at the bottom of this page.
  4. Web Print allows unauthenticated laptop computers that do not have drivers for your network printers to upload PDF and Microsoft Office documents to the PaperCut server where they can be printed and tracked to the user’s PaperCut account. This feature has extended the campus print infrastructure to include student laptops in dorms and other wireless access areas. You can even allow students to select the destination printer from a map, floor plan or site plan.
  5. There is a version of PaperCut that is available from resellers and Authorized Solution Centers (approved resellers) that can track off the glass copy, fax and scan images using embedded software that connects the multi-function device to the PaperCut Admin Console for consolidated control and reporting with the network printers. In addition, PaperCut MF can use multi-function devices as Release Stations for network printing in a secure printing or find-me printing configuration. Other hardware devices such as Pay Stations can also be integrated with PaperCut MF.

Feel free to comment with any of your own favorite power features.

Image courtesy of schani on flickr

Posted in General, PaperCut Tips | 3 Comments

Quick update on pending 10.5 release

Ticking watch

Just a minute!

We’ve recently had a few people contact us via Twitter and email asking about the 10.5 release. We’re running a little behind schedule, however this is for good reason. This is likely to be our largest release yet in terms of features. Hence testing and feature finalization are taking a little longer than expected. Some of the highlights in the release include:

Watermarking:
The ability to add some text to the bottom of every page printed. This text is configurable in terms of content, font-size and color. Typical uses include:

  • adding student names or student numbers to the bottom of their print jobs
  • writing job metadata in the footer such as print time, printer, document name, etc.
  • adding a digital signature (SHA1 or MD5 HMAC) to all pages allowing you to track documents and verify authenticity/originality/source.

Multiple Personal Accounts:
It will now be possible for users to have more than one personal account. This feature has been developed in conjunction with Cambridge University in the UK. A typical use would be splitting student cash payments and free print quotas into separate buckets to make refund management easier. However in large organizations such as Cambridge it can be used, in conjunction with print scripting, to allow different departments/groups/colleges to manage their own print credits on their own printers.

Re-sending data after connection failure:
We’ve added new code to handle exceptional cases such as network connections failing between servers. For example say you have PaperCut installed on a business WAN with print servers spread across geographic regions. If the connection temporarily fails between offices, PaperCut can now be configured to locally record transactions and re-send them across when the connection comes back up.

All three features listed above have been at the top of the vote list for many months. It will be great to have them released. And don’t forget that we always include many, many minor improvements and bug fixes.

We’re working hard to get the release out next week and will keep you posted on progress via our twitter feed.

Posted in PaperCut Updates | 5 Comments

25 Years of Digital Printing

A Printing Press

Printing is a gigantic industry. It employs about 1 million people in the USA, in contrast to the approximately 800,000 working in the automobile industry. It is a slow-growing, traditional industry.

When counted by number of pages printed, most printing still takes place on printing presses like the one on the right.

The following table and pie chart, based on numbers from an American Printer article, show the state of the US printing industry in 2004.

Printing Industry by Segment

Segment                 Size ($US billion)
General commercial      53
Package printing        38
Specialty printing      10.5
Catalogs/directories    10
Forms/labels            10.5
Trade services          12
Newspaper               15
Direct mail             8
Inserts/coupons         7
Financial               5

Within this slow-moving industry there have been dynamic pockets of rapid growth and the adoption of modern technologies for the last 25 years. One of these has been digital printing.

Chester Carlson with his early printer

Chester Carlson

I started working in digital printing in the early 1990s. Like many people my involvement in printing can be traced back to Chester Carlson. My first printing job was at a company that connected computers with high quality Xerographic (Carlson’s invention) digital printers and printed using PostScript (a printing language that was invented by John Warnock and Charles Geschke based on their work at Xerox’s PARC).

My employer was one of many small companies thriving in the market opened by the cheap Apple LaserWriter PostScript laser printer in 1985 and the cheap desktop publishing applications created by Apple, Adobe and other development companies. This market remains vibrant as work that was previously done in other parts of the printing industry moves to digital color printing and more corporate printing moves in house as office printers and office software become more capable. (There was a parallel and related trend of growth built around home computer printing and drop on demand (dod) inkjet technology but this post will focus on graphics and office printing.) A brief history of digital printing follows.
Continue reading

Posted in General | 5 Comments

A quick tip to keeping a DHCP network organized

I was recently helping a customer with a couple of reporting questions they had and they mentioned that they were going to be rolling out 20 new printers in the near future. I commented that this would require quite a bit of work, lugging hardware around, changing printer drivers etc. They agreed and then said that the worst of it was setting the static IP addresses of all of the new devices.

Having had to manage rollouts of small to large sizes I’ve come across this problem before and luckily found two features of a Microsoft DHCP server that can save a lot of time.

DHCP Reservations

Ultimately there is no difference between a static IP address and a DHCP allocated IP address that doesn’t change. The results are the same: you connect to the IP address and the same device is there each and every time. This is essential for servers, switches, routers, printers and more. In many cases setting the device to use a static IP address is suitable and possibly even best practice (routers in large networks for example).

However, if you have a large number of devices that require non-changing IP addresses, why not set your DHCP server to always give out the same IP address each time the device turns on or refreshes its IP address? Simply open your DHCP management interface (I’ll assume Windows here) and navigate to your Scope and then Reservations. From here you can create new reservations and all you will need is the MAC address from the printer. For example: 00:1D:09:FE:64:04 is converted to 001D09FE6404 and given the IP address of 10.1.1.12. Set the printer to use a DHCP assigned address and give it a moment.

A couple of seconds later you should be able to communicate with the printer on the new IP address.
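If you have a list of MAC addresses to convert, the conversion shown above (dropping the separators) is easy to script. A minimal Python sketch, with a function name of my own choosing:

```python
import re


def mac_for_dhcp(mac: str) -> str:
    """Strip separators from a MAC address and upper-case it, e.g. for a reservation."""
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return hex_digits.upper()


print(mac_for_dhcp("00:1D:09:FE:64:04"))  # 001D09FE6404
```

The length check catches mistyped addresses before they end up in a reservation.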

The benefits of this are quite easy to see. Imagine you had to send a printer to a remote office where there are no tech-savvy users. Grab the MAC address of the printer, create a new DHCP reservation, create the new print queue and share it out. When the printer gets out there, simply get someone to plug it in to power and network and you’re done.

If you replace the device (hardware does fail unfortunately), simply change the MAC address in the reservation then replace the old hardware. No fiddling with the interface panel on the printer required.

DHCP import/export

“That’s nice.” I hear you say, “But it doesn’t help with these 20 devices, I’m at the printer anyway, might as well do the time there.”

This is quite valid, except you can export/import DHCP settings via the command line enabling you to use your favourite spreadsheet program to manipulate the data.

Running the following command will give you a simple dump of the existing DHCP setup.

netsh dhcp server 192.168.0.1 dump > C:\Dhcp\Dhcpcfg.dmp

Inside you will see commands like:

dhcp server \\10.1.1.1 scope 10.1.1.0 add reservedip 10.1.1.12 001D09FE6404 "printer-library" "" "BOTH"

By adding “netsh ” to the start of those lines, you have a preformatted script to add DHCP configuration settings.

Congratulations! You’ve now set a reserved IP.

If you wanted to get creative, you could export your existing DHCP reservations, load the file into Microsoft Excel, modify the MAC addresses for the relevant devices and re-import it to your DHCP server. 30 seconds per printer by this method, or several minutes standing at the print queue.
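If you would rather skip the hand editing, the spreadsheet step can be scripted too. The Python sketch below turns CSV rows into reservation commands in the same shape as the dump line shown earlier, with “netsh ” already prepended. The column names (name, ip, mac) and the second printer are made up for illustration; check the output against your own dump before running it.

```python
import csv
import io


def reservation_commands(csv_text: str, server: str, scope: str) -> list[str]:
    """Build one 'netsh dhcp server ... add reservedip' line per CSV row."""
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Reservations take the MAC without separators, e.g. 001D09FE6404.
        mac = row["mac"].replace(":", "").replace("-", "").upper()
        commands.append(
            f"netsh dhcp server \\\\{server} scope {scope} "
            f'add reservedip {row["ip"]} {mac} "{row["name"]}" "" "BOTH"'
        )
    return commands


# Hypothetical spreadsheet export (saved as CSV).
SAMPLE = """name,ip,mac
printer-library,10.1.1.12,00:1D:09:FE:64:04
printer-office,10.1.1.13,00:1D:09:FE:64:05
"""

for line in reservation_commands(SAMPLE, "10.1.1.1", "10.1.1.0"):
    print(line)
```

Redirect the output to a .cmd file and you have the whole rollout in one script.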

Remember, this isn’t just for printing. There is no reason you can’t use this for new servers, switches, access points, specialized desktops and more.

For more netsh DHCP commands see Microsoft’s TechNet website:
http://technet.microsoft.com/en-us/library/cc787375(WS.10).aspx

For more information in general about netsh commands see Microsoft’s Help and Support website: http://support.microsoft.com/kb/242468

Hope this post is useful. Best-practice printer management on large sites is not just about PaperCut. It’s also about your network management practices!

Posted in General | Leave a comment

Scale up, but don’t skimp

CPU

I recently helped out one of our biggest corporate customers to resolve issues with their print server. During the last week of the financial year (when printing load is the highest) their print server became overloaded and stopped working. This sounds bad, but we quickly got things working smoothly again and learned that …

PaperCut scales incredibly well if you allocate appropriate system resources!

This customer had been running PaperCut for about 6 months without issue. Over this period they were gradually transitioning 100s of print queues from legacy print servers to the server hosting PaperCut. This single print server was hosting all queues for their offices country-wide. The extra load of these additional print queues combined with the end-of-year printing load pushed the server to the limit.

When analyzing the problem I noticed that this server was handling a huge print load. In the 30 day period prior the following printing occurred:

  • 477,287 print jobs
  • 2,021,454 pages printed
  • Between 22,000 and 25,000 print jobs each week day

Wow! That’s a lot of printing!

They were also using hold/release queues and Find-Me printing (aka follow-me printing) to provide secure print release and to reduce paper wastage. The result was an average of around 500-600 print jobs waiting in the queue to be released.

The cause of the problem was under resourcing. Their setup was:

  • A single server hosting both the print queues and the PaperCut application server
  • The server was a virtual machine assigned only a single processor
  • Allocated 3GB of RAM
  • Running on a 32-bit Windows Server operating system

My recommendation was to leave the print queues on the existing server, but move the PaperCut Application Server service to a server with 4GB of RAM, 2 or more processors, and running a 64-bit operating system with the 64-bit add-on pack. This configuration:

  • Spreads the load between 2 servers
  • Allows the PaperCut Application Server to take advantage of more memory (64-bit)
  • Makes more processors available for efficient processing of simultaneous print jobs

Since making these changes, their system has been running very smoothly. Their servers are now handling more load than ever, and without overloading the servers.

If you’re managing a large PaperCut installation, and in particular leveraging some of PaperCut’s advanced print management features such as secure print release, then there are a few lessons to take from this:

  • Don’t skimp on RAM or CPU resources
  • Monitor your servers. Particularly if you’re adding print queues and increasing print load
  • Consider running a 64-bit OS to allow for future expansion (e.g. more memory)
  • Run PaperCut on an external database like SQL Server, Oracle, PostgreSQL or MySQL

CC image courtesy of Emilian Robert Vicol on flickr

Posted in General | 3 Comments

What does print management have to do with Coffee?

Priyanka transitions from Java code to Java drink!

The regular readers of our blog will have noticed a few off-topic posts slipping in from time to time. The common theme is coffee and beer. As a group of passionate computer programmers and tech geeks it’s no surprise that we have developed a strong corporate coffee culture. Coffee is our secret weapon! Over the past 10 years we’ve changed programming languages, compilers, and development practices, but one factor has remained constant: Coffee. It must be the pillar for PaperCut’s success.

Coffee is very much part of our culture. The company funds a continuous flow of lattes, cappuccinos and macchiatos (Hendrik’s favorite) all arriving from the coffee shop directly opposite the office. Most of us have espresso machines at home (e.g. Rancilio Silvia) and discussions on brewing techniques seem to pop up in developer meeting agendas unannounced.

Recently management decided that attending a formal coffee barista course would be a good idea. Traditional businesses would have called this a “corporate team building exercise”, however for us it’s “core competency training” :-) The whole Melbourne development team (minus Tom) spent a day at a coffee training academy learning the finer points of coffee production.

Lessons included:

  • The art of wasting lots of milk perfecting the perfect froth.
  • The amount of coffee one must waste to calibrate the ideal 25 second espresso pour.
  • Latte art: The art of convincing someone that the shape on the top of their coffee was deliberate.
  • How to make beverages unknown to computer programmers (chai lattes, and hot chocolates)

The day finished off with a competition. We paired up into teams and had to make 8 coffee variants in 8 minutes. Congratulations to Matt and Jason who took out the title.

To take a slight deviation, my favorite pieces of coffee trivia:

Overall it was a very fun day. We even got to walk away with a formal certificate – we’re now qualified Baristas! If we all get sick of writing print management software we now at least have a fall back option – open a Cafe!

Thanks to Jason for the great images!

Posted in General | 6 Comments

Highest vote takes it all

Pink Mouse

I am one of the female developers here at PaperCut… well at the current time the only female developer! For the past few months most of my development has been driven by your votes in the feature survey. We are constantly analyzing the votes and using them to prioritize our development. I have already implemented various highly voted features such as:

  • the ability to edit scheduled reports.
  • the ability to save scheduled reports to disk.
  • the option to create and manage printer groups (using a tagging paradigm).

The next feature I have my sights on will be “Adhoc bulk user updates” which is one of the most highly requested features at the time of writing. An MSI packaged client (and secondary/local print server installer) is another one on the horizon, but I suspect one of the “boys” might beat me to that one…

I will post updates on my progress as I move along. Voting will re-open soon. In the meantime here’s how voting stands for some of the top requests:
Results of voting

CC image courtesy of osde8info on flickr

Posted in PaperCut Updates | 5 Comments

Trailing Slashes On Your URLs: To Be, or Not to Be, is it a Question?

A few weeks ago we launched our new website design. Working on the website took up the majority of my time in the few weeks prior. As part of the redesign we added a number of new pages, removed some and moved others from one place to another. During this process a question was raised, “Should our URLs have trailing slashes?”. What we were talking about is this: http://example.org/page/ versus this: http://example.org/page . The former has a slash on the end, the latter doesn’t. Does it matter? Which one is better?

Does it matter?

slash

It certainly matters if you are allowing both URLs to access the same content. Google’s SEO Starter Guide says

“Provide one version of a URL to reach a document”

or risk splitting the reputation of the page between all the URLs used to access it. This means that at best it won’t matter for you. At worst you’ll have your page’s reputation diminished, and you might also be penalised for duplicate content.

The best option is to pick one scheme, use that scheme in all your links, and redirect users who access your pages using the other scheme.
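As a rough illustration of "pick one scheme and redirect the other", here is a minimal Python sketch that canonicalises to the slashless form. The choice of slashless is arbitrary for the example, and the function name is hypothetical; a real site would normally do this in the web server or framework.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url: str):
    """Return (301, canonical_url) if the URL needs redirecting, else None."""
    parts = urlsplit(url)
    path = parts.path
    # Leave the site root ("/") alone; only strip slashes from deeper paths.
    if len(path) > 1 and path.endswith("/"):
        new_path = path.rstrip("/") or "/"
        return 301, urlunsplit(parts._replace(path=new_path))
    return None


print(canonical_redirect("http://example.org/page/"))  # (301, 'http://example.org/page')
print(canonical_redirect("http://example.org/page"))   # None
```

Whichever form you choose, the important part is that only one of the two URLs ever answers with the content itself.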

So it probably matters. Now do we slash or not?

I lean heavily towards slashless URLs, but at first I couldn’t put a finger on why. “They just look right”. So I thought about it for a while, made a list of the factors I could think of, and wrote it up as a blog post.

Semantics

What does it mean for a URL to contain a trailing slash? For me, it traditionally means “this is a directory listing”. When a request is made for a URL that maps to an actual directory in the file system, and there is no index document, most web servers serve up a listing of the files in that directory. This is similar to doing an ls or dir from a terminal. Does the same apply when accessing http://www.papercut.com/blog/? That page is serving up the X most recent blog posts, and the view might depend on whether or not you’re logged in. In my opinion it is not a canonical list of sub-items being served up, so the directory analogy doesn’t fully hold.

The more interesting part of how slashes affect semantics is the format of the response. What’s actually happening when a web browser requests /blog is the web server recognising that no specific format was requested in the URL (i.e. there was no .html extension), but because it’s a web browser making the request it makes an assumption that it wants the response in HTML. On the modern web that’s no longer a given. We have .json, .atom, .rss, .xml and others. If a browser wants a view of something in a format other than HTML, it makes sense to allow that simply by adding the requested extension. E.g. http://example.org/blog.atom . If the URL had a trailing slash that would become http://example.org/blog/.atom, which looks horrible.

Reddit provides at least one example of where both a trailing slash and format extension are used together!
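To make the format-extension point concrete, here is a small Python sketch of a router helper that splits the requested format off the path. The function name and the "html" default are assumptions for illustration. Note how the standard library's splitext treats a basename that starts with a dot as having no extension at all, so the slash-plus-extension form needs special handling.

```python
import posixpath


def split_format(path: str, default: str = "html") -> tuple[str, str]:
    """Split a URL path into (resource, format), falling back to a default format."""
    base, ext = posixpath.splitext(path)
    if ext:
        return base, ext[1:]  # drop the leading dot from ".atom"
    return path, default


print(split_format("/blog"))        # ('/blog', 'html')
print(split_format("/blog.atom"))   # ('/blog', 'atom')
print(split_format("/blog/.atom"))  # ('/blog/.atom', 'html')
```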

Role Models

What are the big boys doing? Google sometimes add slashes and sometimes don’t, but they pick one and use redirects to enforce it. Stack Overflow allows both (!), but their own links omit them. Wikipedia omits slashes and doesn’t know what you mean if you add one.

Ruby on Rails put a lot of thought into their URL routing functionality. The result is, in my opinion, the most intuitive and complete definition for URL schemes anywhere. It should serve as a model to other frameworks and to those creating URL schemes by hand. Oh, and trailing slashes aren’t used (unless you go through some extra work to add them back in).

Legacy

For us the main factor was legacy. Most of our website URLs would remain the same (we were just introducing some new ones), and they already had trailing slashes. Would it hurt to redirect all the slashful URLs across to slashless ones? The best we could come up with is “maybe”, but that was enough to can the idea. Redirecting one URL to another via a 301 redirect (“moved permanently”) is rumoured to result in the page’s reputation flowing to the new URL. In practice, and confirmed at least once by Google, some of that reputation will be lost.

We’ve done a similar thing once in the past when we switched our main domain from papercut.biz to papercut.com. We used 301 redirects for all our URLs but several pages lost some reputation (e.g. from PageRank 6 to 5). Any pages that lost reputation gained it back after a month or two, however.

Implementation

Ask four web devs how to implement pretty URLs or remove/add your slashes and you’ll get seventeen answers. We use Apache with PHP, and implement the redirection in our root .htaccess file via Apache’s mod_rewrite.

Firstly we provide access to .php pages using “directory naming”:

RewriteCond %{PATH_INFO} ^/$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule . %{REQUEST_FILENAME}.php [L]

(the above reads “if the requested filename doesn’t exist as a real file or directory, but adding .php on the end results in a real file, serve up that .php file (but don’t change the URL)”).

Then we add a trailing slash if none was present in the request:

RewriteCond %{PATH_INFO} ^/?$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f [OR]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule (.*[^\/])$ $1/ [R=301,NS]

(the above reads “if the requested filename doesn’t exist as a real file or directory, but adding .php or .html on the end results in a real file, if they didn’t add a trailing slash then send the browser a permanent redirect to add the slash”).
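If mod_rewrite syntax makes your eyes glaze over, the two plain-English readings above can be modelled in a few lines of Python. This is a simplified model of those descriptions only, not of how Apache actually evaluates the rules; the function name and the sets of files and directories are hypothetical, and serving of .html pages is left out because the rules shown don’t cover it.

```python
def route(path: str, files: set, dirs: frozenset = frozenset()):
    """Decide what to do with a request path, given the files/dirs that really exist."""
    bare = path.rstrip("/")
    if path in files or bare in dirs:
        return ("passthrough", path)          # real file or directory: rules don't apply
    has_page = bare + ".php" in files or bare + ".html" in files
    if not has_page:
        return ("passthrough", path)          # nothing to map to: let the server handle it
    if not path.endswith("/"):
        return ("redirect", 301, path + "/")  # second rule set: add the trailing slash
    if bare + ".php" in files:
        return ("serve", bare + ".php")       # first rule set: internal rewrite, URL unchanged
    return ("passthrough", path)


site = {"/products.php", "/about.html", "/logo.png"}
print(route("/products", site))   # ('redirect', 301, '/products/')
print(route("/products/", site))  # ('serve', '/products.php')
print(route("/logo.png", site))   # ('passthrough', '/logo.png')
```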

Search on this topic and you’ll find hundreds of ways to do similar things, many of which have subtle problems in certain situations. Ours probably isn’t perfect, but it’s been working for us so far.

Summary

  • Picking slashful versus slashless URLs probably matters. Pick one or the other, don’t allow both to return the same content.
  • Accessing a URL with a slash on the end is not fully analogous with performing a directory listing.
  • Slashless URLs look much better when you want to support multiple formats (/page.json not /page/.json).
  • In the wild, people do it both ways.
  • Ruby on Rails has a very nice and complete system for dealing with URLs, and they don’t use trailing slashes.
  • If you already do it one way, your pages will probably lose some reputation if you redirect them to the other way.
  • Implementation is a black art. Allow yourself time to understand the details.
Posted in General | Leave a comment