Programming White-Papers

This section presents a series of white papers on various aspects of programming and computing. We hope you find them interesting, in that they provoke lively debate; radical, in the sense that they revisit fundamental ideas and conceptions about how programming is, or should be, undertaken; and maybe even practical, although this is not directly the concern here.

There are many 'practical' texts regarding programming and computing available on the web, some of which are available through our links section.

We hope the white-papers made available here will stimulate a reflective approach to programming which allows space for creativity, innovation and enjoyment. We believe programming can be approached in part as an 'art', and that this approach transforms the experience of programming and the quality of the software produced.

Persistent OO Patterns

This paper, given at the ICOODB 2009 conference, discusses a series of patterns, anti-patterns and meta-patterns concerned with object persistence, encountered over the last decade of working with the ObjectStore OODBMS. They are presented in the same style as the original book, and although all code examples are in C++ using the ObjectStore database, they extend to other object-oriented databases and programming languages.

Maximizing Performance with Bespoke Programming

This paper, recently published in the Java Developers Journal, discusses the importance of Bespoke Programming when trying to maximize performance and scalability. The thesis can be summarized as: "It doesn't matter how good automatic optimization gets, there will always be a place for bespoke programming using a general purpose programming language, because context specific knowledge of a particular use-case, unavailable to the writers of the automatic optimization, is available to the programmer directly implementing the particular use-case." This is demonstrated empirically by presenting novel integer sorting routines that outperform the standard Java sort routines by a factor of between 2 and 20.
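As a rough illustration of the thesis, and not the routines presented in the paper, here is a minimal sketch of one way context-specific knowledge can be exploited when sorting integers. It is written in C++ rather than Java, and it assumes the programmer knows that every key lies in a small range [0, max_value], which a general purpose comparison sort cannot assume. The names counting_sort and max_value are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Sorts values in place, ASSUMING every element lies in [0, max_value].
// That assumption is the "context specific knowledge": it allows an
// O(n + max_value) counting sort in place of an O(n log n) comparison sort.
void counting_sort(std::vector<std::uint32_t>& values, std::uint32_t max_value) {
    // counts[k] will hold the number of times key k occurs in the input.
    std::vector<std::size_t> counts(static_cast<std::size_t>(max_value) + 1, 0);
    for (std::uint32_t v : values) {
        if (v > max_value) {
            // The assumption was violated; the input has not been modified yet.
            throw std::out_of_range("value exceeds max_value");
        }
        ++counts[v];
    }
    // Rewrite the input in ascending key order, repeating each key
    // as many times as it was counted.
    std::size_t out = 0;
    for (std::size_t key = 0; key < counts.size(); ++key) {
        for (std::size_t n = 0; n < counts[key]; ++n) {
            values[out++] = static_cast<std::uint32_t>(key);
        }
    }
}
```

Whether this beats a library sort depends on the size of the input relative to max_value; any speed-up has to be measured for the use-case at hand.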

Innovation in Programming Languages and Systems

This paper discusses innovation within the IT industry today, and highlights the crisis in software evaluation throughout industry. It supports the view that new is not necessarily better with an historical analysis of how C became C++ and went on to become Java, what motivated these changes, and what problems with the existing technology were solved by introducing new technology. It goes on to draw a comparison with how scientific theories evolve, and makes the important point that a new technology must not only solve new problems more easily or efficiently than existing technology, but must also remain as successful at solving the existing problems as the old technology; it must not introduce more problems than it solves. This point is often overlooked during the technology evaluation process. The paper urges 'innovators' to release new functional components as Java and/or C++ libraries, as these are technically the best way to distribute software for flexible and efficient integration with other systems. The aim is to control ever-increasing technological complexity and converge IT onto a much smaller set of common languages, for everyone's benefit.

Persistent STL

This paper describes how to store C++ objects, and particularly STL containers and iterators, persistently on disc without object-relational mapping. De-constructing C++ objects into table rows and columns takes CPU cycles and therefore has runtime costs. Using the ObjectStore database as described here enables direct, transactional access to C++ object structures already stored exactly as required when they are used in memory.

Combining this approach, called 'memory-blasting', with the power of the STL has the potential to revolutionize the way we conceptualize and implement data storage and data access in C++.

Scalable GIS: a Billion C++ Objects at Ordnance Survey in the UK

The purpose of this paper is to describe the Geospatial Object Server (GOS) that lies at the heart of the OS Master Map project deployed at Ordnance Survey in the UK. The GOS holds the largest seamless spatial dataset in the world, covering the whole of the UK at 1:10000 scale. Its primary task is to supply this data in various XML formats in response to queries in near real-time. The aim of OS Master Map is to underpin all commercial and government activity that involves spatial data into the next millennium. Here we describe what the GOS does, why it does these things and how it is implemented to scale to billions of objects. We also look at project risks and some of the business benefits.

The 'D' Programming Language

Just when you thought that C, C++, Java and C# were more than enough for your average programmer to learn, this paper presents academic research that introduces another incremental improvement to this family of languages - the 'D' Language. It is object-oriented, fast, type-safe, non-proprietary, and has an easier syntax than C++. A baseline compiler/linker is available. This paper is definitely worth a read for language mavens.

C++ Metaprogramming

This paper introduces the idea of metaprogramming, discusses why it is useful and starts with a very simple example of a C++ template that calculates factorials at compile time. It then goes on to describe the Boost C++ template metaprogramming library (MPL), an extensible compile-time framework of algorithms, sequences and metafunction classes. The library brings together important abstractions from the generic and functional programming worlds to build a powerful and easy-to-use toolset which makes template metaprogramming practical enough for real-world environments.
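For readers unfamiliar with the technique, a compile-time factorial template of the kind the paper opens with might look like the following. This is a minimal sketch of the usual recursive-template idiom, assuming a C++11 compiler; it is not necessarily the paper's own code, and the name Factorial is chosen here for illustration.

```cpp
#include <cstdint>

// Primary template: Factorial<N>::value is N * Factorial<N - 1>::value,
// so the compiler instantiates Factorial<N>, Factorial<N - 1>, ... down to 0.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};

// Explicit specialisation for 0 terminates the recursion.
template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

// The results are compile-time constants, so they can be checked
// by the compiler itself with no code executed at run time.
static_assert(Factorial<0>::value == 1, "0! is 1");
static_assert(Factorial<5>::value == 120, "5! is 120");
static_assert(Factorial<10>::value == 3628800, "10! is 3628800");
```

Because Factorial<5>::value is a constant expression, it can be used anywhere the language requires one, for example as an array bound or as another template argument.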

EJB Performance & Scalability

This paper investigates EJB scalability from the perspectives of bean type, container design and communication layer efficiency. Five different versions of an eBay-like auction site are tested: stateless session beans, entity beans with CMP, entity beans with BMP, session beans that access entity beans, and session beans that access entity beans using the EJB 2.0 local interfaces. Performance is calibrated against a servlet-only implementation. These tests are run on two different open source J2EE application servers: JBoss and JOnAS. The backend database is MySQL using transactionless MyISAM tables.

The results are predictable but very useful. The bean type used determines the efficiency and scalability of the system more than any other factor. Stateless session beans are as fast as the servlet-only implementation and an order of magnitude faster than any entity bean implementation, regardless of the efficiency of the container.

Comparison of Java Persistence Mechanisms

This paper presents research that experimentally measures the relative performance of six different approaches to Java persistence: Java Object Serialisation (JOS), JavaBeans Persistence (JBP), Orthogonal Persistence (OPJ), Java Database Connectivity (JDBC), Java Data Objects (JDO) and Enterprise JavaBeans (EJB).

The results show OPJ to be the fastest, albeit a totally non-transactional persistence mechanism and not one where the data can be shared between multiple users. EJB proves to be quite dreadful. JDBC fares well, particularly on cold queries, but the paper concludes that JDO is the overall best approach, largely because of its hot query performance. This is interesting research, but the relevance of these conclusions to the technology choices for a particular project depends heavily on the actual details of that project, except perhaps for EJB - it really is pants!

Website Design and Performance

This paper presents research that empirically measures user tolerance of slow and unresponsive web sites. It shows how a user's acceptance of delay varies with the task in which they are engaged, the length of time they have been interacting with a site, and the method used to load pages. Subjective issues, such as the conceptual models users have of how the system works, also influence the results. The implications for web page design and server implementation are explored. The conclusion that poor web site performance leads to poor corporate image is unsurprising but important; the fact that this appears to compromise user conceptions of the security of the site is perhaps more surprising.