The PDFBox project is a well known and widely used Java library for reading and writing documents in the Portable Document Format (PDF). Here’s my perspective on the recent developments of the project.
The project was quite dormant when it entered the Apache Incubator about a year ago after we had discussed the idea first at the ApacheCon US 2007 and then on the Incubator mailing list. For a while it looked like project would remain quiet, but in the past few months we’ve seen a clear increase in project activity. Thanks for that goes especially to the contributions of the two new committers, Andreas Lehmkühler and Brian Carrier.
My main focus in Apache PDFBox has recently been the thorough license review that I’ve been conducting. Before entering the Incubator, the PDFBox library was liberally licensed under a BSD License. However, the copyright or licensing status of many external components included in PDFBox was neither well documented nor well understood by downstream projects. For example, PDFBox used to contain parts of the Java Advanced Imaging (JAI) library that is only available under the Sun Binary Code License, a license that is not compatible with Apache policies.
The license review has taken me through a number of legal issues, put me in contact with the Adobe legal team, and made me solve some followup issues. And we also took care of proper export control notifications needed for the PDF encryption support in PDFBox. Luckily the end is finally in sight, and I’m optimistic about having all the remaining open issues closed within a month or so. Altogether it’s been a very interesting and educational process.
With the license review nearing completion and lots of unreleased fixes and improvements accumulating in the project trunk, it is time to start preparing for the first incubating PDFBox release. This release will be called Apache PDFBox 0.8.0-incubating, and will be a major improvement over the 0.7.3 release from over two years ago. All downstream projects should seriously consider upgrading as soon as the release becomes available. It would be really great if the release was out by the ApacheCon Europe at the end of March.
As a mentor and champion of the project I am really happy with the current status. It seems reasonable to expect PDFBox to graduate from the Incubator sometime later this year.