Sunday, December 28, 2008

JSF: Frameworks and Security

One thing that has been conspicuously missing from every J2EE project I've worked on - ever - is a clean way of handling authorization. That is no less true for applications that use JSF. And I'd go out on a limb and say that authentication isn't all that clean either...actual implementations tend to range from being inextricably tied in code to a specific J2EE server, to being effectively tied to a security implementation (albeit not a complete app server) by the sheer amount of configuration (and provision of custom security classes) that one has to do.

What a developer would like is to choose an authentication method (let's assume here that it'll be FORM-based over HTTPS, because this is quite common), and have the container do the authentication in a standard way (j_security_check and all that), with login and error pages configured in the web.xml. The same goes for logout. In addition, the login/logout code should not be inextricably tied to a server or security provider.

Furthermore, once authenticated, authorization should be easy: coming up on the end of 2008 a developer might expect a standardized way of rendering (or not rendering) JSF page elements based on authenticated role, and a standardized way of marking managed bean methods with required roles.
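
For the method-level half, EJB 3.0 session beans actually do have something close already, via the standard security annotations - a minimal sketch (bean and role names invented) of the kind of thing one would like on managed beans too:

import javax.annotation.security.RolesAllowed;
import javax.ejb.Stateless;

@Stateless
public class AccountService {

    // only callers in these roles may invoke this method
    @RolesAllowed({"admin", "teller"})
    public void adjustBalance(long accountId, double amount) {
        // business logic
    }

    @RolesAllowed("admin")
    public void closeAccount(long accountId) {
        // business logic
    }
}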

Note the word "standardized" - there are ways of doing all of this, even if it comes to hand-rolling it, but it ends up being a horrible mishmash. Let's bear in mind, too, that the customer usually dictates the J2EE server, and you, the developer, are possibly going to choose a framework or frameworks based on what you know they can do for you. I say possibly because that decision may not be yours either...for example, if you're working with Oracle SOA you may use Oracle across the board: JDeveloper to work with Oracle ADF etc. So, for example, you probably don't have any choice as to whether it's JBoss or OAS or Glassfish or Websphere, and at best you'll have some leeway in deciding, say, on Facelets and ICEFaces, or perhaps SEAM (maybe with ICEFaces too). But all of these decisions will be made before you start considering security.

You may think that you'll do requirements and design first, then look at how you want your security to work, and then pick servers and frameworks...I wish you luck with that.

Login Forms

Anyway, let's start with that login form. Try this little experiment...once you've set up security realms, groups and users in your application server, create a barebones web application with FORM based login. You may as well add your JSF framework(s) - it proves nothing to leave that stuff out. First off, create a target page which is JSF (say, a facelet using ICEFaces components). Perhaps that could be your welcome page in the web.xml.

Next, create two pages for your login and login error pages. This may not hold true for every server or framework, but you ought to be able to use JSF pages (say .jspx or .xhtml)...the main thing is to use only HTML tags for the login form. I know that for ICEFaces/Facelets on Glassfish V2 I can use .xhtml pages processed by the ICEFaces servlet no problem.

For the login HTML or JSP, use plain HTML tags throughout, as in the snippet below:

<form action="j_security_check" method="post">
    <table>
        <tr>
            <td>Username:</td>
            <td><input type="text" name="j_username"/></td>
        </tr>
        <tr>
            <td>Password:</td>
            <td><input type="password" name="j_password"/></td>
        </tr>
    </table>
    <input type="submit" value="Login"/>
</form>

Set up your error page similarly, except perhaps with an informative message. Make sure your web.xml has a section that looks something like:

<security-constraint>
    <display-name>All Pages</display-name>
    <web-resource-collection>
        <web-resource-name>All</web-resource-name>
        <description/>
        <url-pattern>*.iface</url-pattern>
        <http-method>GET</http-method>
        <http-method>POST</http-method>
    </web-resource-collection>
    <auth-constraint>
        <description/>
        <role-name>all</role-name>
    </auth-constraint>
</security-constraint>
<login-config>
    <auth-method>FORM</auth-method>
    <realm-name>test</realm-name>
    <form-login-config>
        <form-login-page>/login.jspx</form-login-page>
        <form-error-page>/login_error.jspx</form-error-page>
    </form-login-config>
</login-config>
<security-role>
    <description/>
    <role-name>all</role-name>
</security-role>

You may also have to configure an application server-specific web.xml (like sun-web.xml) to map groups to roles.
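
On Glassfish, for example, the sun-web.xml mapping looks roughly like this (the group name is whatever you created in your realm; 'allusers' here is invented):

<sun-web-app>
    <security-role-mapping>
        <role-name>all</role-name>
        <group-name>allusers</group-name>
    </security-role-mapping>
</sun-web-app>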

If you build, deploy and run, and navigate to your JSF welcome page, you should first see the login page come up. Try putting in the username and password you set up in your security realm incorrectly first, to make sure the error page comes up. Then, if you enter the credentials correctly, you'll be redirected to your welcome page, replete with JSF effects.

What Exactly Is The Problem?

Between you and me, that's actually a really good question. Because, as you may have found out already, if you modify our example so that the login page uses JSF components for the form, input elements, command button and so on, so-called JSF name-mangling (really ID mangling) comes into play. For example, if your form has the ID 'mainForm', then the ID for the username input (h:inputText) becomes 'mainForm:j_username'. And this is not what the container expects to see.
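
Concretely, here is a JSF version of the username field, and roughly what it renders to ('loginBean' is an invented backing bean; exact markup varies by implementation):

<h:form id="mainForm">
    <h:inputText id="j_username" value="#{loginBean.username}"/>
</h:form>

<!-- renders as something like: -->
<form id="mainForm" method="post" action="...">
    <input type="text" id="mainForm:j_username" name="mainForm:j_username"/>
</form>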

What people are essentially complaining about is that they can't use JSF tags for the login form itself (unless they use MyFaces Tomahawk with the 'forceId' attribute, or use a regular backing bean that resubmits to the container behind the scenes, or suchlike). Apparently the main problem with this is that they can't validate in the login form. Now, maybe it's just me, but it seems to me that if the username and/or password are wrong (including empty fields), the only answer you need is that the login failed...being able to pop up a nice message in red that a password is required is really sweet, but I imagine most users can work that one out for themselves. If necessary the login error page can provide a few tips.

In any case, a very common model in web applications is to dispense with FORM-based login, in the standard sense, entirely, and have a JSF form (possibly complex) that does have the username/password fields, and a JSF login action on a backing bean. The login method eventually ends up calling the container authentication programmatically, plus a whack of other things like redirecting based on role.
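
Sketched out, it looks something like the below - LoginBean and ContainerAuth are invented names, and the ContainerAuth call stands in for whatever server-specific mechanism you end up using (Glassfish, for instance, ships a ProgrammaticLogin class):

import javax.faces.context.ExternalContext;
import javax.faces.context.FacesContext;

public class LoginBean {
    private String username;
    private String password;

    // action method bound to the commandButton on the JSF login form
    public String login() {
        ExternalContext ext =
                FacesContext.getCurrentInstance().getExternalContext();
        // hypothetical wrapper around server-specific programmatic login
        if (!ContainerAuth.login(ext, username, password)) {
            return "loginFailed";   // navigation case for the error page
        }
        // ...plus the whack of other things, e.g. redirecting by role:
        return ext.isUserInRole("admin") ? "adminHome" : "userHome";
    }

    // getters and setters for username and password omitted
}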

It strikes me as a bit silly to want a rich JSF login screen based on the 'j_security_check' model, but that's what a lot of people want.

Authorization

One of the things that a developer typically wants to do is to display or not display a UI component based on the security role of the authenticated user. For example, one user may only be able to view certain information on a page, but another might be able to edit certain items. Being able to control 'rendered' and 'disabled' attributes based on role is pretty handy.

This is actually not so difficult to do. It's merely that there are no security-specific attributes on standard JSF tags, although some JSF extensions do have them (MyFaces, ICEFaces, ACEGI). The JSF Security Project is looking (looked?) at making this kind of information available through EL.

Until such a thing is completely standardized it's just as easy to roll your own, unless you're already using MyFaces or ICEFaces or ACEGI. I certainly wouldn't pull in any of those for this purpose alone...for example, adopting ACEGI means adopting Spring. Note also that there can be many quirks: if authenticating with JAAS you may have issues using the ICEFaces authorization attributes, and if mixing frameworks...well, sometimes they mix well, and sometimes they don't (Tomahawk and ICEFaces don't cooperate on a single page).
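
In practice, rolling your own can be as simple as a managed bean that exposes the container's role check to EL - a minimal sketch (bean and role names are arbitrary):

import javax.faces.context.FacesContext;

// register in faces-config.xml as a session-scoped bean named, say, "security"
public class SecurityBean {

    public boolean isAdmin() {
        return inRole("admin");
    }

    public boolean isEditor() {
        return inRole("editor");
    }

    private boolean inRole(String role) {
        return FacesContext.getCurrentInstance()
                .getExternalContext().isUserInRole(role);
    }
}

On a page that becomes rendered="#{security.editor}" or disabled="#{!security.editor}" on any standard component.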

Summary

In the J2EE world right now, we still spend too much time on implementing security. The requirements of 95%+ of developers in this area are simple and constant, and yet we tend to re-invent the wheel. As a development shop you can certainly reinvent once, and then re-use many times, but it would be nice if you never had to re-invent at all. What's still missing is a simple, off-the-shelf J2EE/JSF security solution.

Sunday, November 16, 2008

Page Controllers and Front Controllers

There is a variety of opinion as to whether JSF uses a page controller pattern or a front controller pattern. I've personally managed to get by using the technology without knowing the answer to this burning question, but I confess to having got interested despite myself. After failing to locate a site that analyzed the matter to my satisfaction I decided to figure it out for myself.

Note that both patterns are specific controller implementation strategies within MVC. To refresh your architectural understanding, Model View Controller (MVC) is a way of managing interaction between a user of a system and the data and behaviour of the system. The Model is responsible for the data and behaviour; the Controller interprets user inputs; the View manages display of information. Both the Controller and the View depend on the Model.

For starters, here are some definitions of the Page Controller pattern:

Martin Fowler - An object that handles a request for a specific page or action on a Web site. Either the page itself or an object corresponding to it;

Microsoft - each dynamic Web page is handled by a specific controller. Possible use of controller base class.

Sun - doesn't apparently admit that there is such a thing in the J2EE world.

And here are some definitions of the Front Controller pattern:

Martin Fowler - A controller that handles all requests for a Web site;

Microsoft - single controller coordinates all of the requests that are made to the Web application. The controller itself is usually implemented in two parts: a handler and a hierarchy of commands.

Sun - The controller manages the handling of the request, including invoking security services such as authentication and authorization, delegating business processing, managing the choice of an appropriate view, handling errors, and managing the selection of content creation strategies.
Note: Sun also mentions typical ways of decomposing a complicated front controller.

Fowler mentions Application Controller in the context of front controllers: A centralized point for handling screen navigation and the flow of an application. Sun Java architects (for example, Alur, Crupi & Malks) mention the use of an application controller in conjunction with a front controller as well.

Now, what does JSF do? Well, no question but that the FacesServlet intercepts all requests. However, if you peruse the source code for a typical FacesServlet implementation class, it doesn't do much - create the Lifecycle and FacesContextFactory instances in init(), and then call execute() and render() on that Lifecycle instance in the servlet service() method, obtaining a new FacesContext instance for each request to pass to execute() and render().
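
Stripped of error handling, the request processing boils down to something like this (paraphrasing the reference implementation):

// inside FacesServlet, greatly simplified;
// facesContextFactory and lifecycle were created in init()
public void service(ServletRequest request, ServletResponse response)
        throws IOException, ServletException {
    FacesContext context = facesContextFactory.getFacesContext(
            servletConfig.getServletContext(), request, response, lifecycle);
    try {
        lifecycle.execute(context);  // restore view through invoke application
        lifecycle.render(context);   // render response
    } finally {
        context.release();
    }
}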

What actually does the navigation - assuming you haven't customized this - is the default NavigationHandler. Although even this is just acting on the result of an action and the rules defined in faces-config.xml. So some of your controller logic is embodied in the navigation rules and cases...the rest of it resides in the code you write in your managed beans, because that's what actually generates the outcomes acted upon by the navigation handler.
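
For reference, a rule and case look like this in faces-config.xml (the view IDs and outcome are invented for the example); the handler simply matches the outcome string returned by your action method:

<navigation-rule>
    <from-view-id>/login.xhtml</from-view-id>
    <navigation-case>
        <from-outcome>success</from-outcome>
        <to-view-id>/welcome.xhtml</to-view-id>
        <redirect/>
    </navigation-case>
</navigation-rule>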

Now, is a managed bean - specifically one that hosts methods referenced in page component action attributes - a page controller, or is it a decomposed chunk of a conceptual application controller supporting the FacesServlet front controller? There is no shortage of authors who argue that JSF follows a page controller pattern, precisely because so much of the dynamic navigation is contained in objects that often have nearly a one-to-one correspondence with pages.

Nevertheless, we're faced with one inescapable fact - each request to a JSF page goes to FacesServlet, and the flow of execution is through a stack of other JSF classes. In particular, there is not a page controller for each page that is run by the server (this is the default ASP.NET model). So I would classify JSF as definitely using a Front Controller - it just so happens that identifying FacesServlet as the controller is misleading, albeit technically accurate. It's really FacesServlet plus a whack of other stuff.

As an aside, the ASP.NET MVC framework uses the Front Controller pattern. The routing subsystem is sort of like FacesServlet, and then there are a bunch of user controller classes that act on requests that are routed to them. I consider this relatively new offering from Microsoft well worth checking out.

Sunday, November 9, 2008

Ubuntu, Subversion, Apache and Blogs

The title says it all...you Googled because you've got Ubuntu, and you're trying to get Subversion working with Apache, likely Apache2. This probably also isn't the first blog or article that you stumbled across. You may already be in some state of frustration by now, particularly if - like me - it's the first time you encountered the non-standard way that Apache2 is configured on Debian and Ubuntu.

I have no intention of spelling everything out in detail; I am just going to indicate what worked for me. Also, my intention was merely to get Subversion working over HTTP, with basic password authentication...nothing fancy. If something I did conflicts with other blogs, I'll point it out and explain why. My primary objective was to get a working setup - I can tweak it later. Once you know that something works, more or less, you're 90% there.

To cut to the chase: before I discovered that Ubuntu configures Apache 2.x in a non-standard way, I messed up the installation - both in /etc/apache2 and also in /usr/local/apache2 - because I'd started out by building Apache from a tarball, which is what I'm used to on other platforms. If you end up in the same fix, ruthlessly prune the above directories, and use Synaptic or another apt GUI to zap every package related to Apache2. Maybe even completely remove all Subversion-related packages (the "Mark for complete removal" approach). Start with a clean slate, in other words.

Here's what Debian/Ubuntu does for the Apache2 setup. The /etc/apache2 directory is where stuff happens. The central file is apache2.conf - this looks like the httpd.conf you are familiar with. In fact there is a /etc/apache2/httpd.conf...it starts out empty, and for the purposes of this discussion you won't need it.

apache2.conf includes, among other things, all *.conf and *.load files in /etc/apache2/mods-enabled. The former have configuration directives associated with the LoadModule directives found in the corresponding *.load files. apache2.conf also pulls in VirtualHost definitions in /etc/apache2/sites-enabled. The ports.conf file defines port numbers. The user/group for Apache2 is set in the envvars file.
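
Incidentally, mods-enabled and sites-enabled just hold symlinks into mods-available and sites-available, and Debian supplies helper scripts to manage them - for example (the site name here is whatever you've defined):

sudo a2enmod dav_svn
sudo a2ensite mysite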

If starting from scratch, just use apt-get (with sudo if necessary) to install apache2, then subversion, then libapache2-svn. If - as happened to me - the libapache2-svn package install complains that dav_svn cannot be found, you may discover that you have no dav_svn.conf and dav_svn.load in /etc/apache2/mods-available. A simple fix is just to create them yourself, as follows:

dav_svn.conf:

<Location /test>
    DAV svn
    SVNPath /var/svn/test
    AuthType Basic
    AuthName "Subversion Repository"
    AuthUserFile /etc/apache2/dav_svn.passwd
    <LimitExcept GET PROPFIND OPTIONS REPORT>
        Require valid-user
    </LimitExcept>
</Location>


dav_svn.load:

LoadModule dav_svn_module /usr/lib/apache2/modules/mod_dav_svn.so
LoadModule authz_svn_module /usr/lib/apache2/modules/mod_authz_svn.so

Note that I've set this up for a single Subversion repository located at /var/svn/test, to be accessed with

http://<hostname>:<port>/test/

Re-install libapache2-svn if you had to create these files for the aforementioned reason. Hopefully it will succeed.

Start or restart your server with

sudo /etc/init.d/apache2 start/restart

At this stage you ought to see your "It Works!" page at http://localhost/. If so, create your Subversion repository if necessary...in any case edit the /etc/apache2/mods-available/dav_svn.conf to reflect the actual location and the path you want to use to access it. Also, create a user with htpasswd, as in

sudo htpasswd -cm /etc/apache2/dav_svn.passwd myuser

Ensure that your repository is owned (chown -R) by www-data - the user set up in envvars - and that the permissions are appropriate.
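
For example, for the repository used above:

sudo chown -R www-data:www-data /var/svn/test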

Note: I didn't even touch anything to do with virtual hosts. My successful setup is working with the 000-default virtual host...no others are defined, and I didn't edit the configuration file associated with the default host. Some blogs suggest that you must mess around with virtual hosts to get Subversion running with Apache2 on Ubuntu/Debian - that's not the case. I'm not suggesting that you shouldn't set up a virtual host to access your repo with - fill your boots. But at this point you probably just want to do Subversion things over HTTP and worry about details later.

Another note: if you do create virtual hosts, don't forget /etc/hosts.

I also didn't mess around with SSL. Once you get this far - in my experience - it's no big deal. In any case I don't need it...my repository is a private one for testing. The ones I really manage require users to be on a private network before they ever contemplate accessing a Subversion server.

One main point I'd like to make: don't trust blogs, including this one. Blog writers aren't usually trying to make your life miserable, either. But we forget that critical step (or all three of them) that really made the evolution succeed. We also forget that something often works in more than one way. Finally, we omit mention of things that are obvious to us, but not to others. Was the fact that Ubuntu configured Apache 2.x differently obvious to me? Hell no. And a lot of blog writers didn't say anything about that at all.

Saturday, November 8, 2008

Linux Software Installation - Either Easy or Painful

I'm certainly not the first to have an opinion about Linux packaging and software installation. Ian Murdock, among others, has had something to say about it, and his points are well taken.

I ended up installing Ubuntu Intrepid today, tossing out Debian Etch, just so I could get a recent Anjuta through the apt packaging system. Ubuntu had version 2.24, and Debian was still at 1.x in Etch (Lenny does have the latest, but considering how many other packages get installed, I didn't want to use a distribution in a testing state).

I spent most of the waking hours of one weekend day trying to build Anjuta 2.24 from source on Debian Etch. I'm not too bad at the process...I've done it a lot, and a decade or more ago that was basically what you had to do on Linux anyhow. However, this particular evolution was bad - once you're in a tarball situation, you'll find that many of your prerequisites also become tarball builds, mainly because if your distro's package for application/library A is too old, then most of the packages A depends on are probably also too old to build the latest source for A.

I did get a good ways into it, installing gtk+, Cairo, Pango, ATK, the latest GLib, and about fifteen other prereqs - sometimes down 3 or 4 levels in the dependency tree. I had to keep very careful track on paper to remember where I was in the game of what needs what. I also had to do some patching, lots of Googling, and even some source code fixes of my own.

I gave up when two things happened. One, in one particular area of my dependency tree, it sure looked like tarball X needed an installed Y, and tarball Y needed an installed X. Two, after patching the source of ORBit2 once, it was still not compiling, and I was already mentally rejecting the entire idea of building a CORBA ORB just so I could have a nice IDE.

Some research had already indicated that moving over to Ubuntu 8.10 would allow me to apt-get install Anjuta, and so it was. The actual time spent in downloading and burning the Ubuntu ISO, installing Ubuntu, setting up my WUSB600 wireless adapter with ndiswrapper, re-installing Java 6, NetBeans 6.1, GHC 6.8.2, J 6.0.2, ntfs-3g, and installing Anjuta 2.24 was perhaps 3 hours, and could have been reduced to two if the package repository hadn't been so slow.

Point being, things are pretty bad when a body is willing to change their Linux distro just to avoid building from a tarball with umpteen dependencies.

Murdock states that when you can locate a recent version of your desired application or library in your distro, you're laughing. Well, no argument from me.


A little tip for novices wanting to pass an environment variable when using sudo...you'll often want to sudo when installing from a .bin or .sh, so that the script installs to /usr/local rather than your $HOME:

sudo JAVA_HOME=/usr/local/jdk1.6.0_10 ./netbeans-6.1-ml-linux.sh

Another little tip. To get the .bin script for JDK on Linux to install where you want it to go, cd to the target directory before running the script. This is actually in the Sun installation documents, which all of us read of course. :-)
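
For example (the filename depends on the exact JDK release you downloaded):

cd /usr/local
sudo ./jdk-6u10-linux-i586.bin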

The obvious problem - the central problem - is dependencies. This is a problem for package systems as well as tarballs. There are two main issues here, as I see it. First off, while I'm cool with modular development, I'm not cool with Swiss Army apps that require 15 or 20 other programs and libraries. Some dependencies are clearly acceptable...others should be optional (many more than is currently the case). I'm specifically talking about absolute dependencies. And from time to time, maybe re-inventing the wheel in one or two source files is preferable to requiring someone else's library, which in turn requires four more. If you simply must use someone else's code, choose something that fits the bill but minimizes the agony for end-users.

What would be really sweet is if developers cooperated on more public APIs. Let's say that every C/C++ implementation of XSL-FO provided support for Version 1.3 of the C/C++ XSL-FO Processor API, perhaps by providing adapter classes or functions. That way a programmer who needed XSL-FO support in his application could just code to the API header files, knowing that any compliant XSL-FO processor could be chosen by the end-user.

Second, developers tend to use the latest version of everything. What Java developer do you know who develops new stuff using JDK 1.4? Not if they have a choice about it. Similarly, C/C++ developers use the latest version of every library they need. An end-user may have version 1.8.2 of libfoo, but be forced to install version 1.9.0, for no good reason at all...the older version may have been fine.

I recognize that asking developers to examine, build with, and test with, earlier versions of required libraries is extra work, but it cuts down on the aggro for end users. In a number of cases it may get someone to actually use the program rather than having them abandon the build process in disgust, or stick with an earlier version they can install through a packaging system.

In a future related post I'll have more to say about public APIs.

Thursday, May 22, 2008

Professional Development in IT: Part I

As a professional software developer it's easy to get stuck using small subsets of development technologies. One two-year job may primarily require skill with PHP, CSS, JavaScript and SQL. Another may demand almost nothing except systems-level C++ expertise on Linux, and let's say that lasts 3 years. Yet another may see you spend 3 or 4 years working with Ruby on Rails. Or maybe you'll knock out 5 years doing scientific programs in Fortran.

To a greater or lesser degree each job you have will improve your knowledge of certain specific areas. At the conclusion of each, though, how marketable are you in general? If you just spent three years working with J2EE 1.4 (and let's say that the application was EJB 2.x SLSB/SFSB/BMP entity beans, servlets, JSP, XSLT/XML, and Oracle as the database), at the end of that you can't even claim any particular leg up in approaching a Java EE 5 project that uses annotations to the hilt, EJB 3.0, the Java Persistence API, dependency injection, JAX-WS 2.0, JSF and MySQL. You'd have to learn nearly as much new stuff as if you were suddenly asked to tackle a project using .NET 3.5, C# 3.0, Linq to SQL, WPF, and SQL Server Express.

Point being, every job you have is a tradeoff between general and specific marketability. In the worst case you finish a job only to find that every technology-related skill you have is obsolete (unless you find a new job maintaining the kind of project you just finished). In the best case you'll be current in a subset of technologies that are widely in use. For most developers the reality is somewhere in between those two extrema.

Think of this analogy. You're a glider pilot, and you're looking for sources of lift: thermals, ridge lift, mountain waves, and so forth. Each such source is of a given extent and strength...and duration. Your object, of course, is to stay aloft as long as you can. If you extend the analogy by saying that one horizontal axis is time, and the other is specific technologies, and the vertical axis is market share, the slowly expanding 3D image of "thermals" (expanding because the number of technologies increases, and time advances) provides a view of what you, as the glider pilot, have to do. Maybe you'll end up in a powerful thermal of just one major technology but extended duration, and ride that for ten years, just to get dumped out as that thermal suddenly collapses, and the nearest thermals on the technology axis are now a good distance away. Or perhaps you'll keep your glider in a wide technology area with many weak thermals, closely spaced, each lasting for a few years...as one wanes and stops providing lift it's a short hop over to the next.

It's a way of thinking of the constant juggling act involved in professional software development. There are a lot of coders out there who have felt like fish cast up on a beach by an especially big wave at high water, and now they're left gasping on wet sand as the tide recedes. "Hmmmm, maybe I shouldn't have specialized in CORBA", or, "becoming a guru of TeX seemed like a good idea at the time, but..."

How do you prepare yourself as best you can for keeping yourself aloft? A lot of jobs out there keep you stuck in one spot. You learn a fair bit at first just to get up to speed, then over the years you slowly become a true expert in...well, not much. For example, you may become a Zen Master of the latest servlet API and HTTP 1.1, but unless you're writing web containers that really doesn't give you any practical advantage over a competent programmer who spends a few weeks immersed in either spec.

In other words, most jobs don't really give you time during working hours to do much, if any, professional development. You just get better at a few things, and you fervently hope they stay popular (and hence marketable) in the outside world. You certainly don't have much time of your own to put into learning new skills - not unless you're single with no significant other, no kids, no hobbies, no recreational interests, and no desire to veg out watching TV or listening to music. You do have some time, and the question at hand is, how do you maximize the use of it?

This series examines just that. We'll look at professional reading, formal education and certifications, and in the later posts, we'll discuss "bootstrap" projects in more detail. I'll have a bit to say about each in this first column of the series.

Professional Reading

Let's look at professional reading first. Some of this is naturally dictated by immediate requirements, whether job-related or influenced by your educational plans. This is not what I mean here. What I refer to is keeping up with the IT industry through industry news, short articles, mailing lists and newsgroups...and blogs. For example, I have a constant inflow of mailing list digests...among them The Code Project Insider, TheServerSide.com, the Artima Developer Newsletter and so forth. Each can be profitably digested in 10-30 minutes, depending on my interest level, and may generate a further hour or two of extra reading at another time, depending on what interesting articles are linked to.

Of all the newsgroups I am subscribed to, currently six of them are ones I use to see what other developers are talking about. These include comp.lang.c++.moderated, comp.lang.functional, comp.lang.java.programmer, comp.lang.ruby, comp.text.xml, and microsoft.public.dotnet.languages.csharp. Just perusing thread subjects, and scanning a few posts in likely threads, provides a great deal of information for relatively small expenditure of time.

Blogs can be very useful too. I won't really recommend a list of blogs - I find that each developer should seek out and choose their own list, just as with mailing lists and newsgroups.

Stuff like this is excellent for productively consuming 5 minutes, 15 minutes, 30 minutes, or an hour. I've found that for me an hour is sort of the dividing line between passive professional development and active professional development. What I mean by passive is, essentially, reading. And active means hands-on: it means coding. If I don't have at least an hour of uninterrupted time set aside I likely won't fire up the development environment and dive into a current project...I'll read instead. That's not to say that you can't fire up an interpretive shell or an IDE and dash out a quick script or tiny program in 10 minutes, just to test out a code snippet; it's just that doing so is probably in support of something you're reading, rather than being part of a larger coding project.

And read industry news. Mailing lists and blogs will often be your best starting point. Bear in mind, the people that write this stuff are doing a lot of research on your behalf - why waste that resource? You may quite frankly not care much that Microsoft is due to launch three Oslo CTPs in October, but in fact it's a useful thing to sort of know what Oslo is...a few years down the road you may have to know what it is.

Formal Education

What about formal professional development? This may include going back to school full or part time, or attending night classes, or doing home study. You may be angling for a degree, a certificate, or a certification, or just have credentials for having attended a week-long course.

I've had my moments of cynicism about everything ranging from 4-year CS degrees through 2-year IT programs to certifications offered by companies. Fact of the matter is, every one of these things is as useful as the effort you put into it. An MCTS in MS Office Project is as valuable as the amount of dedication the person has to truly learning Project and software project management. So perhaps very few credentials can be dismissed out of hand.

Coding Projects

Finally, what do I mean by bootstrap projects? I may think of a better name for these - suggestions are welcome - but essentially I refer to a small set of generic projects, none of them particularly large, that are technology-agnostic. The purpose of such a project is to provide a focus for learning a new group of technologies.

For example, let's consider a personal information manager - a PIM. The generic requirement is simply that the PIM will allow a user to view, retrieve, enter and edit data about contacts, the latter being people and organisations. The conceptual data model is simple - persons/organisations, addresses, phone numbers, email addresses, web addresses, and the relationships between them (1:1, 1:N, and M:N). We don't stipulate at this stage what technology the PIM will use.
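
In code - Java here, though the whole point is that it could be anything - the conceptual model starts out as little more than this (names are placeholders):

import java.util.List;

// a contact is a person or an organisation
public class Contact {
    private String name;
    private boolean organisation;            // person vs organisation
    private List<Address> addresses;         // 1:N
    private List<PhoneNumber> phoneNumbers;  // 1:N
    private List<String> emailAddresses;     // 1:N
    private List<String> webAddresses;       // 1:N
    private List<Contact> related;           // M:N, e.g. people in an organisation
    // getters and setters omitted
}

class Address {
    String street, city, region, postalCode, country;
}

class PhoneNumber {
    String type;     // home, work, mobile...
    String number;
}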

The bootstrap PIM consists of the requirement document and the design documents. Once complete, it takes relatively little effort to keep the requirement document up to date. It should also be possible to keep a small number of sets of high-level design docs up to speed, focusing on perhaps PIM as desktop app, PIM as web app, PIM as web service, PIM clients, and so forth (note that design methodologies evolve - every so often your objective will be to learn something new in this regard; then the production of new HLD docs will be your project).

The main idea is this: every so often it'll become clear to you that maybe, just maybe it wouldn't hurt to learn something about Technology Family X. Maybe that's .NET 3.5 with C# or VB, plus Linq, plus Windows Forms or WPF. Maybe you want to come up to speed with Java EE 5 and the latest stuff involved in that. Maybe it's RoR that you'd like to investigate. Or perhaps you want an excuse to improve your C++ so you decide to code up a Linux desktop PIM using C++ and libglade. Just for giggles use a different database for each one.

A well-chosen bootstrap project has longevity. People wanted PIMs in the 1980's, they want them now, and they'll want them in 2050. What they want in a PIM won't even change that much - your requirements and high-level design address those concerns. Actually implementing the PIM in a certain way to keep your skills with specific technologies fresh is the real purpose of the bootstrap. Ideally you'll spend a minimum of time to learn a maximum amount.

In Part II of this series I'll take a look at refreshing of core skills.

Friday, April 25, 2008

A Cornucopia of Frameworks

Software developers these days are faced with a bewildering array of choices. Although I refer to frameworks in the subject line, by that meaning primarily Web frameworks, we also have a stupendous amount to choose from when it comes to programming languages and libraries. The number of application programming interfaces (APIs) for many languages is now so large that one can only scratch the surface.

Matt Raible posted a very good comment by David Sachdev about the proliferation of Web frameworks. And not infrequently I have found myself agreeing with Angsuman Chakraborty, who writes about abandoning frameworks and going back to Java, JSP and servlets (and plain old JDBC). Having started out with J2EE when this was all that was available on the Web side, I'd recommend this experiment for any developer who has never gotten down to the bare metal. Best practices, of course... The same applies to frameworks on other platforms. The fact of the matter is that these frameworks are often overkill.

The Python Wiki has something to say about API proliferation, as do innumerable other sources referring to other languages:

A persuasive argument once upon a time was the simplicity of the Python standard library's layout in comparison to the "aggressively hierarchical" layout of the standard Java APIs, for example. But with a large number of overlapping modules and packages within the Python standard library reducing its relative coherency to Java's API proliferation (see java.sun.com for details), it seems appropriate to perform a reorganisation of the library's layout in order to promote a more memorable and intuitive structure that can be more coherently documented.

Java has come under criticism for excessive API bloat, as has .NET. This November 2006 article on Java Web technologies even says "There are too many Java technologies to list in one article." When one considers that a Java enterprise programmer will have to be fluent in Java, conversant with a large subset of J2EE APIs, familiar with any other Java APIs relevant to the problem domain, and very likely will use a framework besides, there is a very real obstacle to rapid delivery of reliable and scalable applications. Note that no small number of enterprise apps actually have fairly simple requirements, and really don't need a massive API stack to solve the problem. .NET has essentially the same problems. [1]

Couple this with the influence of advanced IDEs - regardless of the very real benefits that such environments offer, the argument can be made that they dumb programmers down. In 2005 Charles Petzold delivered a presentation to the NYC .NET Developers' Group that discusses the effect of Visual Studio on programming habits...this is applicable to other comparable IDEs as well (say, Eclipse, NetBeans or IntelliJ). To quote from Petzold's article:

It’s not that IntelliSense is teaching us to program like a machine; it’s just that IntelliSense would be much happier if we did.

And I think it’s making us dumber. Instead of finding out exactly the method I need, or instead of trying to remember an elusive property name, I find myself scrolling through the possibilities that IntelliSense provides, looking for a familiar name, or at least something that seems like it might do the job.

I don’t need to remember anything any more. IntelliSense will remember it for me. Besides, I justify to myself, I may not want those 60,000 methods and properties cluttering up my mind. My overall mental health will undoubtedly be better without them, but at the same time I’m prevented from ever achieving a fluid coding style because the coding is not coming entirely from my head. My coding has become a constant dialog with IntelliSense.

So I don’t think IntelliSense is helping us become better programmers. The real objective is for us to become faster programmers, which also means that it’s cheapening our labor.

Anyone who has used Java IDEs can identify with these comments. Despite the undeniable usefulness of these programming environments, one wonders if any novice developer should ever be exposed to them. The combination of a good programmer's text editor and API documentation, and running SDK tools from the command line, should be required first exposure to a new language (and libraries) for anyone...you have to think about what you are doing when you have to hunt for the class and/or method in an API document, and then type out the code.

There used to be a guideline that a good programmer only turned out a few dozen lines of tested, efficient code per day. Modern IDEs and frameworks have now made it possible for developers to churn out hundreds of lines of code during a shift, but there's no guarantee that it's good code...although your IDE will tell you if it will compile.

What we need is a movement back towards the basics. In other words, a much more careful use of "labour-saving" technologies, less reliance on frameworks, and strict guidelines accompanying the use of IDEs.

1. See Microsoft worried about .NET fragmentation