varlena
General Bits
By A. Elein Mustain

7-June-2004 Issue: 74


General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

The Netherlands Unix User Group
Views of a Conference 31-May-2004

I have spent almost two weeks in the lovely country of The Netherlands. The primary purpose of my visit was to speak at The Netherlands Unix Users' Group's (NLUUG) conference on Open Source in Business. Needless to say, this was a challenge in languages as few talks were actually in English, my native language. (My second language, Spanish, did not help much except in the restaurant, but that is another story. :-) ) However, some of the talks in Dutch had papers in English, and I was glad of that. Most people do speak English in The Netherlands and all I met were very positive and friendly. I had the chance to shake hands with and thank Bram Moolenaar for his fine work with vim. JC, Piet, Jeroen and Johan were particularly nice and helpful. Find us in the Pictures.

Here I will summarize a few particularly interesting points from a couple of the talks that I heard or could read. My own talk was on PostgreSQL and if you've been reading my column here, you know what my talk was like.

Keynote

Ron Tolido, Capgemini
The keynote address had no paper and was not in English, but the slides were. Two very interesting lists on the slides covered how to choose an open source product and how to evaluate whether the project was "mature" and therefore reliable. I am very happy to report that PostgreSQL, as an open source project, measured up very well against these lists. This idea is covered in the next article in this issue and is based on work from www.seriouslyopen.org.

Is Open Source Better?

Rix Groenboom, Reasoning, The Netherlands
In this paper, Rix reviews several projects which objectively measured the number of defects in commercial and open source software at various times in the development cycle. The software used to evaluate the process is Reasoning's automated software inspection (ASI) service which reports, among other items, memory leaks, NULL pointer dereferences, bad deallocations, out of bounds array accesses and uninitialized variables.

Defects are measured per thousand lines of code (KLOC). The average defect density across Reasoning's 200 most recent projects, totaling over 35 million lines of code, is 0.57 defects per KLOC. The analysis of MySQL 4.0.16 (the Open Source product) found a density of 0.09. Across the commercial database projects, the average density was 0.58, close to the overall average.
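To make the unit concrete, here is a minimal sketch of the defects-per-KLOC arithmetic. The densities are the ones reported above; the function names and the 100,000-line code base are my own assumptions for illustration and do not come from Rix's paper or from Reasoning's ASI service.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return the number of defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def implied_defects(density: float, lines_of_code: int) -> float:
    """Return the defect count a given density implies for a code base."""
    return density * lines_of_code / 1000


# Densities reported in the paper, in defects per KLOC.
REPORTED_DENSITIES = {
    "MySQL 4.0.16": 0.09,
    "commercial databases (average)": 0.58,
}

# Hypothetical code base size, chosen only for illustration.
EXAMPLE_LOC = 100_000

for name, density in REPORTED_DENSITIES.items():
    count = implied_defects(density, EXAMPLE_LOC)
    print(f"{name}: about {count:.0f} defects in {EXAMPLE_LOC:,} lines")
```

At those densities, a code base of that size would carry roughly 9 defects versus roughly 58, which is the gap the paper is pointing at.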

Other comparisons were Linux 2.4.19 TCP/IP against a commercial TCP/IP implementation, and Apache 2.1 against a commercial web server. In these cases the number of defects fell quickly during the development cycle for the open source code, while the number of defects in the commercial code fell noticeably more slowly, with the majority of defects still uncorrected at the end of the release cycle.

The reasons that Rix states for open source success are that the open source model encourages several activities that are not common in the development of commercial code, including:

  • Many users don't just report bugs, as they would do with commercial software, but actually track them down to their root causes and fix them.
  • Many developers are reviewing each other's code, if only because it is important to understand the code before it can be changed or extended. It has long been known that this peer reviewing is an effective way to find defects.
  • In the typical Open Source model, programmers organize themselves around a project based on their contributions. The most effective programmers write the most crucial code, review the contributions of others and decide which of these contributions make it into the next release.

Open Source Software in Globally Embargoed Countries

Andy Haxby, Competa IT BV
US technology is forbidden to be exported to places that the US has designated Globally Embargoed Countries, or GECs. Apparently this designation is ignored locally, but for multi-national companies it is a serious legal affair. Ignoring it could cause the US to forbid the company from trading in the US.

This talk covered two projects for a multi-national oil company setting up offices in Libya and Iran. In both cases, the use of Open Source Software was key in creating the infrastructure needed for the offices. The legal problems were complex, ambiguous and different for each country. For example,

"Indications of how much of a US technology or product can be exported. For example a product may be capable of export if it includes less that 10% of an otherwise non-exportable US technology. Does a PC from Taiwan contain less than 10% US technology?"
Also, encryption is forbidden in Libya by Libya itself, and other encryption technologies are handled on a case-by-case basis.

Two clear points were most pertinent from my point of view. First, that office infrastructure could be and was constructed from Open Source. Second, that the first project failed in the long run while the second succeeded primarily due to time spent on design and the effort accorded to project management.

A more minor but interesting point was that the standard packages available, through Red Hat for example, were difficult to deconstruct in order to remove the software that was not GEC compliant. A new packaging was needed. The first project used a cut-up version of an existing package and the second project, more successfully, created a new package based on Debian.

The first project was created on the fly without a clear idea of what the users wanted and needed. There were also cultural concerns regarding calling support. No one called support or reported bugs because they equated it with an admission of failure. The second project gained insight from the first and spent considerably more time on "what the user sees," and the centralized project management coordinated the engineering, legal, design and other parts of the project as a whole.

Other Talks

Other talks or papers in English included The FreeBSD Core Network Server Concept by Patrick Schoo, The Future of CIFS (Samba) by Jelmer Vernooij, and Open Source Package Management by Jos Vos of X/OS Experts in Open Source. Piet de Visser also did half of his talk on databases in Open Source in English (on my behalf), but notes do not do him justice except to say he invoked MySQL, Codd's Rules, PostgreSQL (with a charging elephant) and Ernie Ball.

Open Source in Business is clearly a hot topic. It is inspiring to see that there is visible effort and success in using Open Source products either alone or with proprietary products in business. More information on the NLUUG conference and NLUUG can be found at www.nluug.nl.

Contributors: elein at varlena.com
Open Source Maturity Model
PostgreSQL as a Mature Open Source Project 05-Jun-2004

The Open Source Maturity Model is a white paper published on the web site www.seriouslyopen.org. The basics of this topic were covered at The Netherlands Unix Users Group Conference on Open Source in Business in the keynote address by Ron Tolido. The paper covers a specific technique for evaluating open source projects. This is critical information for people in business opening up to the concept of Open Source products. Two key areas of interest are the measurement of maturity of an open source project as well as the product indicators that tell whether the product is applicable for use. In this article I will review the measurements of maturity of PostgreSQL as a project.

Tables and descriptions in italics are quoted directly from The Open Source Maturity Model 1.5.3. You are encouraged to read the entire article.

DIFFERENCES BETWEEN OPEN SOURCE AND COMMERCIAL PRODUCTS
Open Source products are freely available and naturally that is not the case for commercial products. Generally the users of a commercial product do not receive the source code of the product, but users of an Open Source product expect to receive the source code. Some of other differences between these types are presented in the table below.
| | Commercial | Open Source |
| Supplier | A company | A community |
| Product development | Driven by corporate economics | Driven by product functionality |
| Developers | Limited numbers with product knowledge, all paid for the supplier | Varies from a small to very large group of developers. Often permanently employed, sponsored or volunteers. |
| Stability | New trends are incorporated quickly if there is a commercial incentive. | New ICT developments are incorporated into the product if this benefits the users. |
| Users | Commonly not organized, every user maintains contact with the supplier independently. | Users participate in virtual communities and discuss among themselves and with the developers about the product and future developments. |

This comparison of a company and an Open Source project should look very familiar to open source proponents.

  • In Product Development PostgreSQL is also led by 1) Standards and 2) People and Time. Functionality for PostgreSQL is driven by the people showing needs for features, moderated by technical appropriateness, standards and resources.
  • The most active PostgreSQL Developers are sponsored or employed by various companies: Red Hat, SRA, Fujitsu and Afilias, to name a few. However, there is still a large number of knowledgeable volunteers and a contingent of past developers whose knowledge can be drawn on.
  • PostgreSQL Users are organized organically into mailing lists. There are a large number of active contributors to every mailing list--both questioners and answerers. From my experience with commercial customer support, I would much rather have many eyes considering my questions than have them passed from tier to tier before finding someone who maybe knows something about my question.

This table describes in general terms how the maturity of an open source product is measured. The PostgreSQL rows are my comments. Everything else is quoted directly from The Open Source Maturity Model 1.5.3.
Product
| Indicator | Immature | Mature |
| Age | The project has just started. The stability of the developers group and need for the product are unclear. | The project is been active for some time. The project stability and need for the project are no longer issues. |
PostgreSQL was begun in 1986 as postgres and continues today in 2004. The product is very stable, with most bugs relating to usability and features. The industry has shown that the database market, even for the "commodity" database, is worth billions of dollars. Clearly there is a need for a stable, open source, relational database system such as PostgreSQL, particularly now with the market for open source products opening quickly.
| Licensing | Not fully described or clearly unsuitable for the product. | One of the standard licenses (www.opensource.org/licenses/) Offers clear motives for choosing the license type, which is supported by the user community. Often allows both commercial and Open Source variants to co-exist. |
PostgreSQL has always carried the clear and simple (modified) BSD license. See General Bits Issue #69 for a more complete description of the BSD license.
| Human hierarchies | Original founder is lead developer and solely responsible. Development depends on a single person. | Large community, multiple leaders who coordinate. Separation of development and maintenance. |
PostgreSQL is led by a small team of core members of the PostgreSQL Global Development Group and development includes a large international community. There is some division of labor in the core team with Josh Berkus addressing advocacy, Bruce Momjian leading on the project management and the win-32 port. Tom Lane is the most visible and general technical lead and Jan Wieck is concentrating at this time on the Slony-I project. Marc Fournier coordinates the infrastructure and is the team lead. There is not a clear group of people who maintain and do not develop. Most people who develop also maintain their own code area. It is my experience that this is a better model than having a distinct set of bug fixers.
| Selling points | Only enthusiasm. | Commercial issues like security or maintainability. |
PostgreSQL is clearly past the fundamental building blocks of a large system. Code maintainability has been addressed in many areas with macros and several re-writes, but no product has fully solved this issue. Security and standards, as well as data integrity, are clear motivations for prompt attention.
| Developer community | Small tight knit group. | Very active developers community, several hand-offs have taken place. Documented procedures to becoming a member. |
PostgreSQL is very active. If you are on any of the mailing lists, primarily general and hackers, then you know this. On the mailing lists and on IRC you can ask questions of knowledgeable people or listen and participate in discussions of feature development.

There have been two key handoffs for PostgreSQL. In 1992 there was a handoff from the university code to Miro/Illustra for the commercialization of postgres. In 1996, the upgraded university code, then called postgres95, was handed over to the PostgreSQL Global Development Group, where it is now. Several early core members of the PGDG have 'retired' and a couple of people have been added over the years.

The only clear requirement for becoming a member of the PostgreSQL community is that you are interested in using PostgreSQL. In order to become a developer, you must show that you've learned the code well enough by showing your work. There are no hard and fast rules and no need of them. However, to be a member of the core development group you must be invited.

Integration
| Indicator | Immature | Mature |
| Modularity | No modules, still one single product. Functionality is offered on a take all or nothing basis. | Product has been separated into smaller pieces of functionality. Users can select which parts are required. Allows tailoring of the product to a particular situation. |
PostgreSQL was initially rather well designed and the code tree has remained very stable. Functionality in the parts of the code that I am familiar with is fairly modularized, enabling rewrites of various systems over the years. A database is really one very, very large program, with some supporting programs, like psql and pg_dump.

But remember PostgreSQL was designed with extensibility in mind. Languages, indexes, and data types have been added to the core server. PostgreSQL has a variety of client interfaces and administration GUIs that are written independently of the main server. The interfaces and languages tend to be supported on the gborg server while the GUI interfaces are often developed commercially outside of the postgresql.org domain. Extensions generally are put into the contrib area or moved off onto gborg, the development project server. The contrib area seems to be falling out of popularity, although it remains a treasure chest of resources.

| Collaboration with other products | Not in focus yet. Product development is still firmly centred on core functionality. | Product is close to completion. Attention is shifting to linking the product to other products. |
PostgreSQL is still firmly centered on core functionality. This kind of focus maintained over time makes the product more robust. Attention to interfacing with other programs, however, has always been important. With PostgreSQL you have the ability to write your client in the language(s) of your choice and also do the same with server procedures. We could, however, as a human group, be closer to communities like the Perl and Python groups and the various Linux groups. Spreading the word on our compatibility will help us gain desirable talent to do things such as extending pl/perl to enable triggers.
Use
| Indicator | Immature | Mature |
| Standards | Uses propriety protocols or uses dead end technologies. | Uses current accepted protocols and models. Deals with issues surrounding standards, integration etc. |
PostgreSQL was the first object relational database management system. Object Relational technology is now being recognized by the standards groups, in SQL:2003, for example. After 18 years, the features that PostgreSQL was designed with have become leading edge technology for other database systems. Procedural languages, user defined data types and other extensibility features can now be found in Oracle. Some other databases are just catching up to simple implementations of relational database systems.
| Support | Just within the own community and then only provided by a small minority within that community. | Besides community support, professional support can be purchased. A (SLA) single license agreement can be negotiated. The community itself is active and questions draw responses from a wide section of the community. |
PostgreSQL is supported through its mailing lists by one of the largest and most diverse groups of people I've ever seen on a project. Although deference is given to core and active developers, anyone who can help answer questions does. And the answers are checked by even more people. In general the tone is friendly and comradely. IRC help is also available.

Professional support and consulting are also available from several companies and consultants worldwide, including (shameless advertising) my own Varlena, LLC. There is a perceived lack of commercial support for PostgreSQL that must be overcome by these companies in order to make the business world feel more secure about PostgreSQL. Because of our (modified) BSD license, no Single License Agreement is needed.

Acceptance
| Indicator | Immature | Mature |
| Ease of deployment | Little to no training facilities or courses. Documentation is poor, particularly with regard to maintenance. | Training or courses available. In addition to well written documentation lots of HowTos of users detailing particular situations. Within the group knowledge about maintaining the product is readily available. |
PostgreSQL documentation has improved considerably in the last several years. The translation of the documentation into many languages is key to our worldwide presence. (More shameless advertising) GeneralBits does its share of adding how-to and informational guidelines, and there is lots of great stuff on techdocs, if you can find it.

Maintenance in this context is the administration of the database system. This topic is given full coverage in the official documentation. More how-to guidelines are always welcome.

| User Community | Small group, possible with a high proportion of lurkers. | Large group that often has divided itself into sub-groups. Each group has a specific focus. Traffic in general is best described as high-volume. |
| Market penetration | Few references, just local promotion. Low exposure is the reason the product isn't wildly known. | Often mentioned by others (for example Gartner, ZDNet, Netcraft, IDC). Multiple cases of successful implementation across a range of companies. Well known. |
PostgreSQL has divided itself organically into several mailing lists, general, hackers, SQL, interfaces, etc. and it has divided out the client and interface projects over to gborg. We have a very active advocacy group as well as groups of people for different packages and platforms.

Like Ingres and Illustra, PostgreSQL does suffer from "great technology, lousy marketing." The advocacy group is working hard to promote PostgreSQL's visibility; however, our strongest assets are our broad grassroots communities. I encourage the formation of more PostgreSQL users groups. There are many of us willing to speak at other user groups, such as Linux and Perl groups, when asked. (Contact me for more details.)

The Open Source Maturity Model guidelines are excellent ways for us to measure the success of our PostgreSQL project. There are things we can be doing better and perhaps a little bit differently. However, we meet and surpass most of the guidelines easily. All who have participated in the project, whether you simply asked a question or rewrote an entire subsystem, should be proud of the PostgreSQL product.

For more information about the Open Source Maturity Model see www.seriouslyopen.org

Contributors: elein at varlena.com


Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009
