2009-06-04

Commits per weekday and hour

The punchcard graphs at GitHub are a nice way to quickly detect the rough geographical distribution (or nighttime coding habits) of the key contributors of an open source project. Here are a few selected examples from the ASF.
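The idea behind such a graph is simple enough to sketch in a few lines of Python. This is only a rough illustration and not how GitHub computes its graphs: the function names (`punchcard`, `mark`, `render`) are made up for this example, and it assumes the commit timestamps have already been collected and normalized to a single time zone (UTC here).

```python
from collections import Counter
from datetime import datetime, timezone


def punchcard(timestamps):
    """Count commits per (weekday, hour) bucket; weekday 0 is Monday."""
    counts = Counter()
    for ts in timestamps:
        counts[(ts.weekday(), ts.hour)] += 1
    return counts


def mark(count):
    """One character per cell: '.' for no commits, otherwise a digit capped at 9."""
    if count == 0:
        return "."
    return str(min(count, 9))


def render(counts):
    """Render the counts as seven text rows (Mon..Sun) of 24 hourly cells."""
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    lines = []
    for day_index, day in enumerate(days):
        cells = "".join(mark(counts[(day_index, hour)]) for hour in range(24))
        lines.append(f"{day} {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical commit times, purely for illustration.
    sample = [
        datetime(2009, 6, 1, 9, 30, tzinfo=timezone.utc),
        datetime(2009, 6, 1, 9, 45, tzinfo=timezone.utc),
        datetime(2009, 6, 4, 22, 5, tzinfo=timezone.utc),
    ]
    print(render(punchcard(sample)))
```

A cluster of marks in the same few columns hints at contributors sharing a time zone, while marks spread across all hours suggest a more global (or more nocturnal) group.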

Apache HTTP Server

Apache Maven (core)

Apache Jackrabbit

2009-05-17

Would you trust a pirate?

Apparently they’re now setting up a Pirate Party also in Finland. I guess it’s good to have a political force that questions the appropriateness of traditional copyright in the digital world. However, as a knowledge worker I’m not that excited about drastic changes in the protection of immaterial rights.

Anyway, my appreciation for the movement in Finland went down considerably when I saw their spokesman in the news today. When asked about the main goals of the new party he only mentioned freedom of speech and protection of privacy. Did he just forget the massive overhaul of copyright and patent laws that they’re primarily after?

2009-05-10

Midgard: Where it all began

On Friday we celebrated the tenth anniversary of the Midgard project. The celebration took the form of a very nice gala evening with good food and drinks, live music, a show and of course some speeches. I was asked to deliver a few words about how it all began for Midgard.

Here’s my speech, reconstructed from my draft notes and edited for the web audience:

We were a group of teenagers and young adults doing historical re-enactment and live action role playing games. One evening in early -97 we were sitting in a bus, returning from the woods with all our viking gear on. Bergie said to me: “Hey Yaro”, as I was known as Yaroslav at the time. “Hey Yaro”, he said, “you’re over 18 and you have a driver’s license. Would you like to take a dozen teenagers on a trip to Norway and back?” Even back then Bergie was the one with big dreams and the power to inspire people. I had the skills required to make those dreams happen but not yet enough experience to tell that we perhaps should think twice. So I just answered: “Sounds cool, let’s do it!” That’s pretty much what happened also with Midgard.

The trip to Norway went well for us and was followed by a number of other adventures. One of them was our quest to build a better web site for our group. It was -97 and the web was booming. The de facto web publishing technology was FTP, which people used to push static HTML to a web server. Geocities was a major cool thing as it allowed you to publish your static HTML for free. We however had bigger plans and our own server running in the closet of a friendly internet company. And we were publishing lots of stuff: news, photos, articles, etc. Quite a few people were actively contributing new content to the web site.

Our first serious attempt at better managing the site was based on technologies called SGML and DSSSL. For the technically minded: nowadays you’d use XML and XSLT for similar tasks. We used this system to “cook” our content into nicely formatted HTML that was then served to the world. It worked pretty well, but was hopelessly complex for almost all of our contributors. This was a time when people were only just discovering the Internet. Most of our contributors were teenagers who were using the net from libraries or schools. Internet connections with modems were only just finding their way into normal households. Even FTP was often out of the question, so there was little hope of making the heavy SGML tooling work as well as we’d like.

We wanted a system that could be managed entirely through the browser. Not just the content you saw on the web site, but the layout templates and even the functional code used to list pages or to handle the forms for adding or modifying content. The system should allow you to build an entire web site, including all the administration interfaces, without any other tooling than a web browser. Such systems simply didn’t exist at the time and in fact they’re pretty rare even today.

So we had to build our own system. We looked at a number of potential platforms for something like this, and the LAMP stack seemed like a good fit. Our server already ran Linux and, like pretty much everyone, we used the Apache web server. We hadn’t used PHP or MySQL before, but they were getting some good press and were easy enough to get started with. In fact we hadn’t done much of anything when we started: we hadn’t done Apache modules, we hadn’t extended (or even written!) PHP, and at the time I had only read about relational databases. As we used to say: “How hard can it be?” We didn’t know, and so we just did it.

The result of our efforts was called Midgard. We had used it to power our web site for about a year when Bergie was hired to build a new web site for a Finnish tech company. Midgard seemed like a good fit for that need, and we figured that also other people might find the system useful. Open source was cool and we wanted to join the movement so we decided to publish Midgard as open source. After nights spent researching licensing options, writing press releases, creating the project web site and setting up mailing lists and public CVS access we were finally ready to publish Midgard 1.0 to the world. That happened exactly ten years ago.

The 1.0 release was like the Land Rover it was named for. The magnificent car from -62, which we used on many of our trips, was really cool and when it worked, it did so very well. However, every now and then it required some “manual help” to get it started or to keep it going. This was also the case for Midgard 1.0. The first external installation that I know of was done on a Solaris platform and required a few days’ worth of help and patches delivered over the mailing list before it was up and running. Much of that early feedback and experience was reflected in Midgard 1.1, which was the first release that people actually managed to install and run without direct assistance. That started the growth of the Midgard community.

Meanwhile I had also been hired by the same company where Bergie worked, and much of our work there resulted in improvements to Midgard. Together with the feedback and early contributions we were getting from the mailing lists, this made Midgard 1.2 already a pretty solid piece of software. It was fairly straightforward to install (by the standards of the time), it performed well and it had most of the functionality that you’d need to run a moderately complex web site.

And the results were showing. We were getting increasing traffic on the mailing lists, some companies started offering Midgard support and the number of Midgard-based sites around the world was growing. One of my earliest concrete rewards for doing open source was a bottle of quality whiskey that some Midgard user from Germany sent me with a note saying: “Thanks for Midgard!” The whiskey is long gone, but I still treasure the memory. A few years later Bergie and a few other friends and Midgard developers went on to start their own company based on Midgard. I was tempted to join them, but at the time my life was taking a different route and I gradually left Midgard to pursue other things.

Seeing the Midgard project take off and build a life of its own has been a very inspiring process for me. Having your first open source project become so successful is pretty amazing and also quite humbling. Looking at all the things Midgard is today fills me with pride, not in what I’ve done, but in what you, the Midgard community, have accomplished. Thank you for that. Especially I’d like to thank my long time friend and co-conspirator in starting the Midgard project. Bergie, without your dreams and refusal to take “no” for an answer we wouldn’t be here today. Thank you.

2009-04-27

Content Technology at the ApacheCon US 2009

I’m putting together a plan for a Content Technology track at the ApacheCon US 2009 in Oakland later this year. The original plan for the track was focused on JCR and related stuff, but there’s some interest in expanding the scope to cover a wider range of things related to content management and web publishing.

The track proposal has been discussed on the Jackrabbit and Sling mailing lists, and people from POI and Lenya have chimed in with interest. I also contacted Wicket, Cocoon, JSPWiki and Roller about their interest, and the initial feedback seems good. Any other projects I should be contacting?

I’m not sure how this works for the conference planners, who are probably facing some real deadlines in terms of fixing the conference schedule and contacting selected speakers. Let’s see how it all plays out.

Update: Added JSPWiki and Roller.

2009-04-04

One month, five languages

The past month was probably the first time in about 20 years when the number of natural languages I used was greater than the number of programming languages I wrote code in. I’ve never thought of myself as much of a language person, but here I am actively using five different languages! Here’s a list of the languages in order of my fluency.

Finnish

Of course. I was in Finland twice in the past month and every other day or so I spend a lot of time on Skype talking Finnish with Kikka. I read Finnish news every day, and keep in contact with my Finnish friends mostly through various Internet channels.

My main concern with my Finnish is that nowadays I don’t do much serious writing in Finnish. Of course I write letters, postcards and email to friends and family, but that’s about it. I used to be a fairly good writer (grammatically, etc., not so much artistically), but now I think my skills are rapidly eroding.

English

English is currently the language I use most actively. I speak it daily at work and elsewhere. I read and write piles of email in English every day. All the code and documentation that I read and write is in English, just like the various tech and world affairs sites and blogs I follow.

Even though I understand English well and can get myself understood with little trouble, I still don’t think I’m particularly good with the language. As they say: The universal language of the world is not English; the truly universal language is bad English. The last time I actually studied English was in high school 15 years ago, so I believe I would really benefit from taking some more advanced courses on the finer points of the language.

Swedish

Learning Swedish is mandatory in Finland, so I spent ten years studying the language at school. Thus I have a reasonably strong theoretical background in the language, but since I very rarely use it anywhere my practical skills aren’t that great. Prodded by Kikka to do something about that, I recently bought and started reading Conn Iggulden’s book Stäppens Krigare (Wolf of the Plains) in Swedish. The first 20 or so pages were a struggle, but then it all came back to me and now I’m going strong at around page 200 and can barely set the book aside.

The funny thing about the Swedish I’ve learned is that it’s not really what they speak in Sweden, but rather a dialect spoken only by a small Swedish-speaking minority in Finland. I have a feeling that I’m going to end up with something similar, just on a larger scale, also for German…

French

I’ve never been too enthusiastic about learning languages, so in high school I dropped French (which I had studied for two years by then) in favor of more math and physics. I did some more French courses at the university to fill up the mandatory language studies, but I’ve never really mastered the language. However, I have relatives in France and Morocco, so I do have a “live” connection to the language that I’ve lately tried to keep up through occasional visits.

My latest visit was a few weeks ago when I took the TGV train from Basel for a quick weekend visit to Paris. During the visit I tried to speak as much French as I could, and was able to keep up reasonably well when people around me were speaking French.

German

Last but not least. I started actively learning German when I moved to Switzerland about half a year ago. First I used an online course, and after finishing it I’ve now been taking an evening course with a real teacher and a group of seven students. It’s hard work, especially since the Swiss German I hear around me every day is quite different from the Standard German I’m learning in the course.

I’m increasingly able to manage simple shopping and restaurant interactions in German, and I try to read (or at least browse) the local newspapers every day. I’ve also started using the German Wikipedia as my first source of any non-technical trivia. I go there a few times a week and only switch to the English counterpart when I can’t figure out some specific details.

I guess my studies are starting to take effect, as my first Germanism already found its way into a tweet I posted yesterday. Earlier this week I also had my first dream in German! In my dream I continued doing the German exercises that I had been doing when I fell asleep…

What’s missing?

All the languages I’m using are (originally) European. I’d really love a chance to brush up my Japanese (I studied it for a while at the university) or learn the basics of Mandarin (and Arabic would be cool too), but I guess that for the next few years I’ll be too busy getting up to speed with German to even consider doing something new.

2009-03-26

Maven meetup report

A few days late, here’s a quick report on what I managed to do this Monday here at the ApacheCon EU. As mentioned earlier, I arrived at the conference hotel on Monday evening and headed straight for the Maven meetup.

The meetup was already in progress when I arrived, but I managed to catch a part of a presentation about the Eclipse integration that just keeps getting better. Nowadays it’s so easy to import and manage Maven projects in Eclipse that I get really annoyed every time I need to manually set things up for projects with Ant builds.

Other interesting topics covered were Maven archetypes and the release plugin. I’ve long been thinking about doing some archetypes to help set up new JCR client applications. We should probably also do something similar for setting up new Sling bundles.

The release plugin demo was interesting, though I’m not so sure if I agree with all the conventions and assumptions that the plugin makes. On a related note, we should configure the GPG plugin for the Maven build in Jackrabbit.

We talked a bit about Maven 2.1.0 and the upcoming 3.0 release. I’m already pretty happy with the recent Maven 2.0.x releases, so we’ll probably wait a while before upgrading, but it’s good to hear that things are progressing on multiple fronts. We also briefly touched on the differences between the Maven and OSGi dependency models and the ways to better bridge the two worlds.

In summary, the meetup was really interesting and served well in giving me a better idea of what’s up in the Maven land. Thanks to everyone involved!

Chops, ribs and beer

After the meetup a few of us headed out to Amsterdam city center for some food and drinks. Monday evening wasn’t perhaps the best time to go out as we needed to wander around looking for places that would be open long enough. Anyway, we found some “interesting” places to visit before returning to the hotel in the early hours. Good times.

2009-03-23

ApacheCon plans

It’s ApacheCon time again. I’ll be flying to Amsterdam later today, and will probably be pretty busy for the entire week. Some highlights:

Monday

  • Maven meetup. I’ll probably arrive at the conference hotel just in time for the Maven meetup, where I’m hoping to catch up with the latest news from the Maven land.

Tuesday

  • Git hacking. During the Hackathon on Tuesday I hope to get together with Grzegorz and anyone else interested in setting up git.apache.org.
  • Commons Compress. There’s some useful code in the Commons Compress component that I hope to use in Apache Tika. If I have time during the Hackathon I want to help push the component towards its first release.
  • CMIS / Chemistry update. I’ve been meaning to check out the CMIS code that Florent Guillaume has been working on recently. I’d love to get the effort better integrated into Jackrabbit.
  • Commons XML. I’ve been gathering some JAXP utility code to a new XML library in the Commons sandbox. I hope to spend some time pushing more code there and perhaps discussing the concept with some interested people.
  • Juuso lab. I have lots of new ideas about RDF processing and Prolog. Hoping to turn those into working code.
  • Lucene meetup. Catching up with the latest in Lucene and telling people about Tika and the Lucene integration we have in Jackrabbit. Unfortunately I only have one hour to spend here before the JCR meetup starts.
  • JCR meetup. Starting at 8pm, the JCR meetup is one of the key highlights of the conference for me. We’ll be covering stuff related to the Jackrabbit and Sling projects. You’re welcome to join us (sign up here) if you’re interested in the latest news from the content repository world.

Wednesday

And lots of other stuff, too much to keep track of…

2009-02-14

When all you have is a hammer

Helsingin Sanomat, a newspaper in Finland, has an article (in Finnish) where some experts give advice on what Helsinki should do to succeed among the metropolises of the world. I’m paraphrasing:

  • Humanist: “Multiculturalism”
  • Environmentalist: “Save energy”
  • Architect: “More buildings”

Who’s got the big picture?

2009-02-14

First flowers of the year

After recovering from the flu, I went walking around in Basel and found a nice park about 1.5 kilometers from where I live. It’s been a bit rainy here lately, but the sun is already pretty warm when it peeks from behind the clouds.

Flowers in a sunny spot

There was this sunny spot beside a tree where all these small yellow flowers were pushing up from the ground. I just had to take a closer look.

Closeup of the flowers

I took some other pictures as well. It was a nice afternoon.

2009-02-10

Comparing Midgard and JCR

Midgard is the open source content management framework that we originally created with Henri Bergius more than ten years ago. In the past few years I have been more involved with Java content repositories like Apache Jackrabbit, but I’m still following what goes on in Midgard, and Henri’s recent comparison of Midgard and JCR prompted me to write up some of my thoughts on these two technologies. My experiences with Midgard and other content management systems that I’ve implemented go a long way toward explaining why I find the content repository concept so powerful.

In Midgard everything is content that is stored and managed inside a central content repository. The Midgard repository is an organized collection of MgdSchema objects stored in a specifically structured MySQL database. The repository contains site templates, user preferences, content hierarchies and much more. All these content objects are accessed and managed through the Midgard core API and the language bindings that have been built on top of the API.

As Henri mentions, the Midgard repository clearly resembles the JCR content repository model. The similarity is strong enough that I find it very interesting to look deeper at where the repository models differ and see which features I like better. Here’s a quick overview:

  • JCR typing is more flexible. The MgdSchema model makes it very easy to extend the repository with custom object types and the parameter feature allows even further runtime extensibility, but all objects are still clearly associated with a defined type. Midgard does not have unstructured nodes or mixin types that you find in JCR.
  • Midgard is less constrained by the hierarchy. In Midgard hierarchies are just a well supported special case of a more generic object linking mechanism. JCR references or even the shareable nodes in JCR 2.0 are not as powerful as the many-to-many relationships that you can easily handle in Midgard.
  • JCR is more addressable. As a downside of the above point, Midgard does not support as powerful path-based addressing of content objects as JCR does. The Midgard repository is only partially addressable by paths while in JCR everything has a path. On the other hand all Midgard objects are addressable by their identifiers, whereas only referenceable nodes in JCR can be accessed by identifier.
  • Midgard queries are more powerful. The JCR 1.0 query model restricts search criteria to only refer to properties of a single node. Repository implementations like Jackrabbit extend the query model somewhat, but the Query Builder feature in Midgard allows more flexible search criteria to be used.
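
The addressing difference in the list above can be made a bit more concrete with a toy model in Python. This is only a sketch of the idea: it uses neither the Midgard API nor the JCR API, and every name in it (`Item`, `ToyStore`, `jcr_like`, `midgard_like`) is made up for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """A content item that may be reachable by path, by identifier, or both."""
    title: str
    path: Optional[str] = None
    identifier: Optional[str] = None


class ToyStore:
    """Toy store with two lookup tables; not a real repository implementation."""

    def __init__(self):
        self._by_path = {}
        self._by_id = {}

    def add(self, item):
        if item.path is None and item.identifier is None:
            raise ValueError("item needs a path or an identifier")
        if item.path is not None:
            self._by_path[item.path] = item
        if item.identifier is not None:
            self._by_id[item.identifier] = item
        return item

    def get_by_path(self, path):
        return self._by_path.get(path)

    def get_by_id(self, identifier):
        return self._by_id.get(identifier)


def jcr_like(title, path, referenceable=False):
    """JCR-style item: always has a path, an identifier only if referenceable."""
    identifier = str(uuid.uuid4()) if referenceable else None
    return Item(title, path=path, identifier=identifier)


def midgard_like(title, path=None):
    """Midgard-style item: always has an identifier, a path only sometimes."""
    return Item(title, path=path, identifier=str(uuid.uuid4()))


if __name__ == "__main__":
    store = ToyStore()
    news = store.add(jcr_like("News", "/site/news"))
    comment = store.add(midgard_like("Comment"))
    print(store.get_by_path("/site/news") is news)   # reachable by path only
    print(store.get_by_id(comment.identifier) is comment)  # reachable by id only
```

In this toy model the trade-off is visible in the constructors: the JCR-style item cannot exist without a path, and the Midgard-style item cannot exist without an identifier.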

 
In summary, I think both JCR and the Midgard repository are good examples of the kind of infrastructure that provides a strong base for building modern content management systems. And Midgard’s relationship with the desktop world is an interesting example of how content repository technology isn’t really limited to just traditional content management systems.