Before I begin this new project in earnest, I thought I’d take some time to introduce my basic setup. I am of course using Eclipse as my development environment, along with the Google App Engine plugin. The project itself will use Google’s High Replication Datastore (HRD), but I have switched off the DataNucleus JDO/JPA access to the Datastore. More about this in a moment.
When we first start out on a new project we have a lot of choices to make. Some of them will be good ones, while others we will regret, hopefully sooner rather than later. As you may have noted in some recent posts, I am not much of a fan of monolithic frameworks, and prefer to use smaller, more tightly focussed libraries. This may not be your approach and I don’t pretend to present it as an optimal solution, but I have not regretted it in the past. So let us introduce two of the stars of our show.
It will come as no surprise to some of you that I am using the Stripes Framework as my web framework. I’ve used it successfully on both Google App Engine and Amazon Web Services with great results. It is dangerous to suggest that one framework is better than another, but I can say that this little framework has saved me a significant amount of time in creating the web-facing interface on a number of projects. I’m confident it will do the same again here. Despite its importance, it will not play a prominent part in the upcoming series of posts. We are, after all, more interested in what Google App Engine is all about. Having said that, I will of course be keeping an eye on the performance of the framework and making sure it is indeed fit for purpose.
As a side note, I intend to play with the Play Framework at some point in the future. It looks promising, but I will leave it for another day.
The Google Datastore
When I first started developing software, many many years ago now, my first programming language was C. My first commercial project was as a member of the Advanced Communications team at IBM. I met many talented developers there and was fortunate enough to be assigned a great mentor from within the team. His starting advice to me was to learn assembler. I of course balked at this advice. I was, after all, a C developer (for those interested, we’re talking K&R). His response was a wry smile, followed by “learn assembler, and then you’ll understand C”. It was a while before I understood what he meant, but his advice has stayed with me for the last 18 years.
Abstractions, whether it be the compiler or a framework, are dangerous. They have the benefit of insulating you from the pain and overhead of dealing with the lower levels. However, they also hide the workings of the lower levels, and if you misunderstand the internal workings, you will end up hitting your head against a brick wall.
I’ve seen a lot of traffic related to the Google App Engine Datastore for this very reason. I believe that the introduction of JDO/JPA for Java was a bad move by the GAE development team. I can sympathise with their reasoning, in that it may have drawn more developers to the platform than would otherwise have joined. However, the assumptions that these developers brought with them from the relational world caused consternation for them and, I don’t doubt, for the development team. The GAE datastore is a different kind of beast, and to get the most from it we have to understand how it works. Treating it like a distributed SQL DB without the joins is going to make for expensive code.
To this end, I started to use the Datastore API early on in my exploration of GAE. I’ve learned a great deal about how to organise data and how to get the best out of it. I know there are libraries that provide ORM specifically for the GAE datastore and I’ve played with them too, but for the purposes of this project I will be using the Datastore API. It is in fact relatively simple and straightforward.
I am not encouraging anyone else to take this approach and would strongly advise that you seek out one of the persistence libraries for the GAE datastore. Whichever library you choose, whether it be Twig, Objectify or another, I would advise you to learn as much as you can about the datastore and its related costs. Failure to do so is likely to cost you money.
In the upcoming series, I’m going to be walking through my design decisions, the key reasons behind them and the economics of the choices I’ve made. Because the Datastore is a key contributor to your GAE hosting costs, you can be confident we will be spending a lot of time in this area. Any code I post will use the native Datastore API. Converting it to your favourite persistence library is left as an exercise for the reader.
One last thing. If you don’t use Appstats, start now. Appstats is a filter that you can install to allow a forensic analysis of GAE API costs within your app. Seriously useful.
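For anyone who hasn’t installed it before, the fragment below is a minimal sketch of how Appstats is typically wired into a Java app’s web.xml. The filter and servlet class names come from the App Engine Java SDK. The URL patterns are my own assumption: /* records every request, and /appstats/* is simply a conventional place to mount the console.

```xml
<!-- Record RPC statistics for every request (the /* pattern is an assumption; narrow it if you prefer) -->
<filter>
  <filter-name>appstats</filter-name>
  <filter-class>com.google.appengine.tools.appstats.AppstatsFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>appstats</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

<!-- Web console for browsing the recorded statistics (the mount point is an assumption) -->
<servlet>
  <servlet-name>appstats</servlet-name>
  <servlet-class>com.google.appengine.tools.appstats.AppstatsServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>appstats</servlet-name>
  <url-pattern>/appstats/*</url-pattern>
</servlet-mapping>
```

With something like this in place, the console should be reachable under /appstats/ on your app. You will probably want to restrict that path to administrators with a security constraint before deploying.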
Next week I’ll be starting the first series in a continued exploration of GAE, using the secretary project as a test bed. I’ll try to fit in one series per week of between 3 and 5 posts. The first series will be on the datastore itself. I had intended to cover authentication in the first part of the series, but found myself referring to the datastore in such detail that it became apparent I should really cover this first. Obvious really. I’m looking forward to seeing you there.