Friday, June 7, 2013

Bolsena #4 - More JAXB fun

Adding to the JAXB fun I had yesterday, I've decided to fix a related problem with binding our schemas at compile time, where the maven-jaxb2-plugin sometimes tried to load schemas from the net, although an XML catalog file was provided and configured.

The first thing I noticed was that only some of the related modules had that problem. Very strange, so I ran Maven in debug mode (-X) for one module where it worked properly and one module where it failed.

The plugin was configured identically in both cases (we manage plugins from our parent pom.xml). The only difference I noticed was that in the working module all project dependencies were passed to the xjc call, whereas in the failing module only the plugin's own dependencies were added. So obviously the schemas could not be loaded from the classpath there.

I was not able to find out the reason for this, but simply adding the project dependencies as plugin dependencies fortunately fixed the problem. This means that, in addition to running the unit tests, deegree can now also be compiled offline without problems.
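
In the pom.xml the fix looks roughly like the sketch below. The plugin coordinates are the usual ones for the maven-jaxb2-plugin, but the dependency shown is just a placeholder, not one of our real schema modules:

    <plugin>
      <groupId>org.jvnet.jaxb2.maven2</groupId>
      <artifactId>maven-jaxb2-plugin</artifactId>
      <!-- version and xjc configuration are managed in the parent pom -->
      <dependencies>
        <!-- placeholder: the project module(s) containing the schemas the
             catalog points to, so xjc can resolve them from the classpath -->
        <dependency>
          <groupId>org.example</groupId>
          <artifactId>example-schemas</artifactId>
          <version>${project.version}</version>
        </dependency>
      </dependencies>
    </plugin>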

If you want to know the details on xjc + catalogs, read this guide. It might be a little out of date, but most of it is still valid.

Thursday, June 6, 2013

Bolsena #3 - JAXB fun

In deegree we make heavy use of JAXB for unmarshalling configuration files. That works quite well, but it had a drawback when making use of schema inclusion: the included schemas were always loaded from the internet.

Using the JAXB SchemaFactory, I thought it would be pretty easy to work around loading schemas from the net and to use the ones included in our .jars instead. But somehow that didn't work out: the base schemas were still loaded from the internet.

Smart people found out that the order in which the schema URLs are given to the SchemaFactory plays a role, and it turns out to be true! I've just opened a pull request that fixes the problem in deegree.

For it to work, the schemas need to be given in reverse order of inclusion: put the schema without dependencies first, then the schemas that include it, and so on.
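
A minimal sketch of the idea (the schema file names and classpath locations here are invented for illustration, not deegree's actual resources):

    import javax.xml.XMLConstants;
    import javax.xml.transform.Source;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;

    public class ClasspathSchemaLoader {

        // Builds a Schema from resources on the classpath instead of letting the
        // parser fetch the included schemas from the internet.
        public static Schema loadSchema() throws org.xml.sax.SAXException {
            SchemaFactory factory = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI );
            // Order matters: the base schema (the one without any includes) goes
            // first, followed by the schema(s) that include it.
            Source[] schemas = new Source[] {
                new StreamSource( ClasspathSchemaLoader.class.getResourceAsStream( "/META-INF/schemas/base.xsd" ) ),
                new StreamSource( ClasspathSchemaLoader.class.getResourceAsStream( "/META-INF/schemas/datasource.xsd" ) )
            };
            return factory.newSchema( schemas );
        }
    }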

Bolsena #2 - coordinate system ramblings

Time for an update, even though there's not much actual progress to discuss.

First, some good news: the resource dependencies pull request is finally here! It would be great if many of you would test it out. With 189 commits it's the biggest change since our move to GitHub yet, and although the unit and integration tests are passing, some details might still need tuning.

Since my last post, I've been thinking about and experimenting with the coordinate subsystem in deegree. The API is currently heavily based on the GML representation of coordinate systems, with references everywhere: for axes, datums, ellipsoids and so on.

While I'm fond of models that are 'complete', I'm not sure that's the best approach for coordinate systems. There are two major use cases for the CRS package: one is to keep track of which system is being used under which identifier, the other is to transform coordinates from one system to another. Further use cases would be importing/exporting coordinate system definitions from/to GML, WKT, proj4 etc.

A review of the code revealed that all identifiers are stored in lower case. So exporting to GML or finding out exactly which identifiers exist for a given system is impossible, because the properly cased identifiers are no longer available. The convenient use case of, e.g., setting up your layer configuration with the correct CRS identifier taken from the data store also becomes impossible in some cases, namely when the underlying data store is not explicitly configured with a CRS identifier.

So to summarize, the current package has several shortcomings. First is the identifier mess, which is not easily fixed short of a complete reimport of the CRS definitions (which would overwrite some manually modified definitions). Second is the model, which is just too complex and makes it hard to compare two definitions. Third, transformations are slow and sometimes not thread safe; where they are synchronized instead, they don't scale on multicore machines. Fourth, the whole system is statically initialized and makes heavy use of global static state.

I've tried unsuccessfully to fix some of these issues over the past few days, but I fear a complete rewrite is the only thing that will do the trick.

Monday, June 3, 2013

Bolsena 2013 #1

It's that time of year again, when OSGeo hackers from around the globe meet in a former monastery to collaborate and code under the Italian sun.

Things are a little different this year. Compared with past years, we've got a record number of people attending this time! Also, sadly, the Italian sun is missing. I hope people are right when they tell me it's going to get better...

So, back to business. I'm in the process of completing the resource dependencies branch; I can probably create the pull request today or tomorrow. Things are looking good: the web console is already adapted, no tests are failing and it works.

The Mapbender people have installed deegree on their computers and are working on integrating a simple workflow that creates a new WMS/layer from a shapefile in a remote deegree instance using our REST API. That's already working, and we're currently trying to make it more user-friendly. Not a bad start!

On a related note, I've created a new pull request that adds some more features to the REST API, such as querying all supported coordinate systems, checking whether a coordinate system is supported by deegree, and an experimental call to retrieve all known identifiers for a WKT-encoded coordinate system.

In theory, equality is defined on coordinate systems (not taking identifiers into account, obviously), but in practice I was not able to successfully compare the Utah UTM zone (EPSG:26912) to a WKT-encoded variant. I guess that's another reason why a rewrite/cleanup of the CRS package is needed.
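
For illustration, the check boils down to something like the sketch below. The CRSManager.lookup call and the package names are written from memory, and parseWkt is a purely hypothetical placeholder for the actual WKT parsing entry point, so take the exact classes and signatures as assumptions rather than a recipe:

    import org.deegree.cs.coordinatesystems.ICRS;
    import org.deegree.cs.exceptions.UnknownCRSException;
    import org.deegree.cs.persistence.CRSManager;

    public class CrsEqualityCheck {

        public static void main( String[] args ) throws UnknownCRSException {
            // resolve the Utah UTM zone via its EPSG code (API written from memory)
            ICRS fromCode = CRSManager.lookup( "EPSG:26912" );

            // the same CRS parsed from a (truncated) WKT string; parseWkt is a
            // hypothetical stand-in for the actual WKT parser
            ICRS fromWkt = parseWkt( "PROJCS[\"NAD83 / UTM zone 12N\", ...]" );

            // ideally this prints true, but in practice it currently doesn't
            System.out.println( fromCode.equals( fromWkt ) );
        }

        // hypothetical placeholder, not a real deegree method
        private static ICRS parseWkt( String wkt ) {
            throw new UnsupportedOperationException( "stand-in for the real WKT parser" );
        }
    }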

Stay tuned for more!