The Netherlands Unix User Group
Views of a Conference
31-May-2004
I have spent almost two weeks in the lovely country of The Netherlands.
The primary purpose of my visit was to speak at
The Netherlands Unix Users' Group's (NLUUG) conference on
Open Source in Business. Needless
to say, this was a challenge in languages as few talks were actually
in English, my native language. (My second language, Spanish, did not
help much except in the restaurant but that is another story :-)
However, some of the talks in Dutch had papers in English--I was glad
of that. Most people do speak English in The Netherlands and all I
met were very positive and friendly. I had the chance to shake
hands with and thank Bram Moolenaar for his fine work with
vim.
And, particularly nice and helpful were JC, Piet, Jeroen and Johan.
Find us in the
Pictures.
Here I will summarize a few particularly interesting points from
a couple of the talks that I heard or could read. My own talk was on
PostgreSQL and if you've been reading my column here, you know
what my talk was like.
Keynote
Ron Tolido, Capgemini
The keynote address had no paper and was not in English, but
the slides were. Two very interesting lists on the slides
included how to choose an open source product and how to
evaluate whether the project was "mature" and therefore
reliable. I am very happy to report that PostgreSQL, as
an open source project, measured up very well against these
lists. This idea is covered in the next article in this issue
and is based on work from
www.seriouslyopen.org.
Is Open Source Better?
Rix Groenboom, Reasoning, The Netherlands
In this paper, Rix reviews several projects which objectively
measured the number of defects in commercial and open source
software at various times in the development cycle. The software
used to evaluate the process is Reasoning's automated software
inspection (ASI) service which reports, among other items,
memory leaks, NULL pointer dereferences, bad deallocations,
out of bounds array accesses and uninitialized variables.
Defects are measured per thousand lines of code (KLOCs).
The average defect density across Reasoning's 200 most recent
projects, totaling over 35 million lines of code, is
0.57. In the analysis of MySQL 4.0.16 (as the Open Source product)
a density of 0.09 was found. In an average of commercial
database projects, a density of 0.58 was found, similar to
the total expected average.
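The densities quoted above are simple ratios: defects found divided by thousands of lines of code inspected. As a minimal sketch of that arithmetic, the Python below uses a hypothetical helper, `defect_density`, with made-up defect and line counts (they are not figures from Rix's paper). Only the 0.58 and 0.09 densities in the last step come from the numbers reported above.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# Hypothetical counts, chosen only to illustrate the calculation.
sample_defects = 120
sample_lines = 200_000
print(f"{defect_density(sample_defects, sample_lines):.2f} defects per KLOC")

# Densities as reported in the talk summary above.
commercial_db_density = 0.58
mysql_density = 0.09
ratio = commercial_db_density / mysql_density
print(f"commercial average is about {ratio:.1f}x the MySQL density")
```

Put this way, the reported commercial database average is roughly six times the density found in MySQL 4.0.16.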
Other projects analyzed were the Linux 2.4.19 TCP/IP implementation versus a commercial
TCP/IP implementation, and Apache 2.1 versus a commercial web server. In these
cases the number of defects was shown to fall quickly during the
development cycle for open source, whereas the number of
defects in the commercial code fell noticeably more slowly,
with the majority of defects still uncorrected at the end
of the release cycle.
The reasons that Rix states for open source success are that the
open source model encourages several activities that are not
common in the development of commercial code, including:
- Many users don't just report bugs, as they would do with
commercial software, but actually track them down to their root
causes and fix them.
- Many developers are reviewing each other's code, if only
because it is important to understand the code before it can be
changed or extended. It has
long been known that this peer reviewing is an effective
way to find defects.
- In the typical Open Source model, programmers organize themselves
around a project based on their contributions. The most effective
programmers write the most crucial code, review the contributions
of others and decide which of these contributions make it into the next
release.
Open Source Software in Globally Embargoed Countries
Andy Haxby, Competa IT BV
US technology is forbidden to be exported to
places that the US has designated Globally Embargoed Countries,
or GECs. Apparently this designation is ignored locally, but
for multi-national companies this is a serious legal affair. If
ignored, it could cause the US to forbid the company from trading in the US.
This talk covered two projects for a multi-national oil company
setting up offices in Libya and Iran. In both cases, the use
of Open Source Software was key in creating the infrastructure
needed for the offices. The legal problems were complex, ambiguous
and different for each country. For example,
"Indications of
how much of a US technology or product can be exported.
For example a product may be capable of export if it
includes less that 10% of an otherwise non-exportable US technology.
Does a PC from Taiwan contain less than 10% US technology?"
Also, encryption is forbidden in Libya by Libya itself, and other
encryption technologies are handled on a case-by-case basis.
Two clear points were most pertinent from my point of view.
First, that office infrastructure could be and was constructed
from Open Source. Second, that the first project failed
in the long run while the second succeeded primarily
due to time spent on design and the effort accorded to project management.
A minor but interesting point was that the standard
packages available, through Red Hat, for example, were
difficult to deconstruct to remove the software that
was not GEC compliant. New packaging was needed.
The first project used a cut-up version of an existing
package and the second project, more successfully,
created a new package based on Debian.
The first project was created on the fly without a clear idea
of what the users wanted and needed. There were also cultural concerns
regarding calling support. No one called support or reported
bugs because they equated it with an admission of failure.
The second project gained insight from the first and spent
considerably more time on "what the user sees" and the
centralized project management coordinated the engineering,
legal, design and other parts of the project as a whole.
Other Talks
Other talks or papers in English included The FreeBSD Core
Network Server Concept by Patrick Schoo; The Future
of CIFS (Samba) by Jelmer Vernooij; and Open Source
Package Management by Jos Vos (X/OS Experts in Open Source).
And Piet de Visser did half of his talk on databases
in Open Source in English (on my behalf), but notes do not do him
justice except to say he invoked MySQL, Codd's Rules,
PostgreSQL (with a charging elephant) and Ernie Ball.
Open Source in Business is clearly a hot topic. It is inspiring
to see that there is visible effort and success in using
Open Source products either alone or with proprietary products
in business. More information on the NLUUG conference
and NLUUG can be found at www.nluug.nl.
Contributors:
elein at varlena.com
Open Source Maturity Model
PostgreSQL as a Mature Open Source Project
05-Jun-2004
The Open Source Maturity Model is a white paper published
on the web site www.seriouslyopen.org.
The basics of this topic were covered at The Netherlands Unix Users Group
Conference on Open Source in Business in
the keynote address by Ron Tolido. The paper covers a specific technique
for evaluating open source projects. This is critical information for
people in business opening up to the concept of Open Source products.
Two key areas of interest are the measurement of maturity of an open source
project as well as the product indicators that tell whether the product is
applicable for use. In this article I will review the measurements of
maturity of PostgreSQL as a project.
Tables and descriptions in italics are quoted directly from
The Open Source Maturity Model 1.5.3.
You are encouraged to read the entire article.
DIFFERENCES BETWEEN OPEN SOURCE AND COMMERCIAL PRODUCTS
Open Source products are freely available and naturally that is not
the case for commercial products. Generally the users of a commercial product do
not receive the source code of the product, but users of an Open Source product
expect to receive the source code. Some other differences between these types
are presented in the table below.
| Commercial | Open Source |
Supplier | A company | A community |
Product development | Driven by corporate economics | Driven by product functionality |
Developers | Limited numbers with product knowledge, all paid for the supplier | Varies from a small to very large group of developers. Often permanently employed, sponsored or volunteers. |
Stability | New trends are incorporated quickly if there is a commercial incentive. | New ICT developments are incorporated into the product if this benefits the users. |
Users | Commonly not organized, every user maintains contact with the supplier independently. | Users participate in virtual communities and discuss among themselves and with the developers about the product and future developments. |
This comparison of a company and an Open Source project should look very familiar
to open source proponents.
- In Product Development PostgreSQL is
also led by 1) Standards and 2) People and Time. Functionality for PostgreSQL
is driven by the people showing needs for features, moderated by technical
appropriateness, standards and resources.
- The most active PostgreSQL Developers
are sponsored or employed by various companies: Red Hat, SRA, Fujitsu, Afilias
to name a few. However, there is still a large number of knowledgeable volunteers
and a contingent of past developers whose knowledge can be drawn on.
- PostgreSQL
Users are organized organically into mailing lists. There are a large
number of active contributors to every mailing list--both questioners and
answerers. From my experience with commercial customer support, I would much
rather have many eyes considering my questions than have them passed from
tier to tier before finding someone who might know something about my question.
This is the table describing generally the measurement of maturity of
an open source product. The PostgreSQL rows are my comments. Everything
else is quoted directly from
The Open Source Maturity Model 1.5.3
Product |
Indicator | Immature | Mature |
Age | The project has just started. The stability of the developers group and need for the product are unclear. |
The project is been active for some time. The project stability and need for the project are no longer issues. |
PostgreSQL was begun in 1986 as postgres and continues today in 2004.
The product is very stable with most bugs relating to usability and features.
The industry has shown that the database market, even the "commodity" database
market, is a multi-billion dollar industry. Clearly there is a need for a stable, open source,
relational database system such as PostgreSQL, particularly now with the
market for open source products opening quickly.
|
Licensing |
Not fully described or clearly unsuitable for the product. |
One of the standard licenses (www.opensource.org/licenses/) Offers clear motives for choosing the license type, which is supported by the user community. Often allows both commercial and Open Source variants to co-exist. |
PostgreSQL has always carried the clear and simple (modified) BSD license. See
General Bits Issue #69
for a more complete description of the BSD license.
|
Human hierarchies |
Original founder is lead developer and solely responsible. Development depends on a single person. |
Large community, multiple leaders who coordinate. Separation of development and maintenance. |
PostgreSQL is led by a small team of core members of the PostgreSQL Global
Development Group and development includes a large international community.
There is some division of labor in the core team, with Josh Berkus addressing advocacy
and Bruce Momjian leading project management and the Win32 port. Tom Lane is the
most visible and general technical lead and Jan Wieck is concentrating at this
time on the Slony-I project. Marc Fournier coordinates the infrastructure and
is the team lead. There is not a clear group of people who maintain and do not
develop. Most people who develop also maintain their own code area. It is my
experience that this is a better model than having a distinct set of bug fixers.
|
Selling points |
Only enthusiasm. |
Commercial issues like security or maintainability. |
PostgreSQL is clearly past the fundamental building blocks of a
large system. Code maintainability has been addressed in many areas
with macros and several re-writes, but this is not a solved issue for
any product. Security and standards, as well as data integrity, are clear
motivations for prompt attention.
|
Developer community |
Small tight knit group. |
Very active developers community, several hand-offs have taken place. Documented procedures to becoming a member. |
PostgreSQL is very active. If you are on any of the mailing lists,
primarily general and hackers, then you know this. On the mailing lists
and on IRC you can ask questions of knowledgeable people or listen and
participate in discussions of feature development.
PostgreSQL has had two key handoffs. In 1992 the university
code was handed off to Miro/Illustra for the
commercialization of postgres. In 1996, the upgraded university code,
then called postgres95, was handed over to the PostgreSQL Global Development
Group, where it is now. Several early core members of the PGDG have 'retired'
and a couple of people have been added over the years.
The only clear requirement for becoming a member of the PostgreSQL community
is that you are interested in using PostgreSQL. In order to become a developer,
you must show that you've learned the code well enough by showing your work.
There are no hard and fast rules and no need of them. However, to be a member
of the core development group you must be invited.
|
Integration |
Indicator | Immature | Mature |
Modularity |
No modules, still one single product. Functionality is offered on a take all or nothing basis. |
Product has been separated into smaller pieces of functionality. Users can select which parts are required. Allows tailoring of the product to a particular situation. |
PostgreSQL was initially rather well designed and the code tree has
remained very stable. Functionality in the parts of the code that I am familiar
with is fairly modularized, enabling rewrites of various systems over the years.
A database is really one very, very large program,
with some supporting programs, like psql and pg_dump.
But remember PostgreSQL was designed with extensibility in mind.
Languages, indexes, and data types have been added to the core server.
PostgreSQL has a variety of client interfaces and administration GUIs
that are written independently of the main server. The interfaces
and languages tend to be supported on the gborg server while the
GUI interfaces are often developed commercially outside of the postgresql.org
domain.
Extensions generally are put into the contrib area or moved
off onto gborg, the development project server.
The contrib area seems to be falling out of popularity, although it
remains a treasure chest of resources.
|
Collaboration with other products |
Not in focus yet. Product development is still firmly centred on core functionality. |
Product is close to completion. Attention is shifting to linking the product to other products. |
PostgreSQL is still firmly centered on core functionality. This kind of focus
maintained over time makes the product more robust. Attention to interfacing with
other programs, however, has always been important. With PostgreSQL you have the
ability to write your client in the language(s) of your choice and also do the
same with the server procedure. We could, however, as a human group, be closer
to communities like the Perl and Python groups and the various Linux groups.
Spreading the word on our compatibility will help us gain desirable talent to
do things such as extending pl/perl to enable triggers.
|
Use |
Indicator | Immature | Mature |
Standards |
Uses propriety protocols or uses dead end technologies. |
Uses current accepted protocols and models. Deals with issues surrounding standards, integration etc. |
PostgreSQL was the first object relational database management system.
Object Relational technology is now being recognized by the standards
groups, in SQL2003, for example. After 18 years, the features that PostgreSQL
was designed with have become leading edge technology for other database systems.
Procedural languages, user defined data types and other extensibility features can
now be found in Oracle. Some other databases are just catching up to simple
implementations of relational database systems.
|
Support |
Just within the own community and then only provided by a small minority within that community. |
Besides community support, professional support can be purchased. A (SLA) single license agreement
can be negotiated. The community itself is active and questions draw responses
from a wide section of the community. |
PostgreSQL is supported through its mailing lists by one of the largest and
most diverse groups of people I've ever seen on a project. Although deference is given
to core and active developers, anyone that can help answer questions does. And
the answers are checked by even more people. In general the tone is friendly and
comradely. IRC help is also available.
Professional support and consulting are also available from several companies and
consultants worldwide, including (shameless advertising) my own
Varlena, LLC. There is a perceived lack
of commercial support for PostgreSQL that must be overcome by these companies
in order to make the business world feel more secure about PostgreSQL.
Because of our (modified) BSD license, no Single License Agreement is needed.
|
Acceptance |
Indicator | Immature | Mature |
Ease of deployment |
Little to no training facilities or courses. Documentation is poor, particularly with regard to maintenance. |
Training or courses available. In addition to well written documentation lots of HowTos of users detailing particular situations. Within the group knowledge about maintaining the product is readily available. |
PostgreSQL documentation has improved considerably in the last several years.
The translation of the documentation into many languages is key to our worldwide presence.
(More shameless advertising) GeneralBits
does its share of adding how-to and informational guidelines, and there is lots of
great stuff on techdocs, if you can find it.
Maintenance in this context is the administration of the database system. This topic
is given full coverage in the official documentation. More how-to guidelines are
always welcome.
|
User Community |
Small group, possible with a high proportion of lurkers. |
Large group that often has divided itself into sub-groups. Each group has a specific focus. Traffic in general is best described as high-volume. |
Market penetration |
Few references, just local promotion. Low exposure is the reason the product isn't wildly known. |
Often mentioned by others (for example Gartner, ZDNet, Netcraft, IDC). Multiple cases of successful implementation across a range of companies. Well known.
|
PostgreSQL has divided itself organically into several mailing lists, general,
hackers, SQL, interfaces, etc. and it has divided out the client and interface
projects over to gborg. We have a very
active advocacy group as well as groups of people for different packages and platforms.
Like Ingres and Illustra, PostgreSQL does suffer from "great technology, lousy marketing."
The advocacy group is working hard to promote PostgreSQL's visibility; however,
our strongest assets are our broad grassroots communities. I encourage the
formation of more PostgreSQL users groups. There are many of us willing to speak
at other user groups, such as Linux and Perl groups, when asked.
(Contact me for more details.)
|
The Open Source Maturity Model guidelines are excellent ways for us to measure
the success of our PostgreSQL project. There are things we can be doing better
and perhaps a little bit differently. However, we meet and surpass most of the
guidelines easily. All who have participated in the project, whether you
simply asked a question or rewrote an entire subsystem, should be proud
of the PostgreSQL product.
For more information about the Open Source Maturity Model see
www.seriouslyopen.org
Contributors:
elein at varlena.com