Web Site

For our purposes, the web site means web pages devoted to helping people participate in the project as developers, documenters, etc. Note that this may be different from the main user-facing web site. In many projects, users have different needs and often (statistically speaking) a different mentality from the developers. The kinds of web pages most helpful to users are not always the same as those helpful for developers. Don't try to make a "one size fits all" web site just to save some writing and maintenance effort: you'll end up with a site that is not quite right for either audience.

The two types of sites should cross-link, of course, and in particular it's important that the user-oriented site have, tucked a way in a corner somewhere, a clear link to the developers' site, since most new developers will start out at the user-facing pages and look for a path from there to the developers' area.

An example may make this clearer. As of this writing in January 2017, the office suite LibreOffice has its main user-oriented web site at https://www.libreoffice.org/, as you'd expect. If you were a user wanting to download and install LibreOffice, you'd start there, go straight to the "Download" link, and so on. But if you were a developer looking to (say) fix a bug in LibreOffice, you might start at https://www.libreoffice.org/, but you'd be looking for a link that says something like "Developers", or "Development", or "Get Involved" — in other words, you'd be looking for the gateway to the development area.

In the case of LibreOffice, as with some other large projects, they have a few different gateways to developer-land. There's a prominent link that says "Get Involved", and at the top there's also a dropdown menu named "Community" that offers both "Get Involved" and another link entitled "Developers" .

The "Get Involved" page is aimed at the broadest possible range of potential contributors: developers, yes, but also documenters, quality-assurance testers, marketing helpers, web infrastructure experts, financial or in-kind donors, interface designers, support forum helpers, etc. This frees up the "Developers" page to target the rather narrower audience of programmers who want to get involved in improving the LibreOffice code. The set of links and short descriptions provided on both pages is admirably clear and concise: you can tell immediately from looking whether you're in the right place for what you want do, and if so what the next thing to click on is. The "Development" page gives some information about where to find the code, how to contact the other developers, how to file bugs, and things like that, but most importantly it points to what most seasoned open source contributors would instantly recognize as the real gateway to actively-maintained development information: the development wiki at https://wiki.documentfoundation.org/Development.

This division into two contributor-facing gateways, one for all kinds of contributions and another for coders specifically, is probably right for a large, multi-faceted project like LibreOffice. You'll have to use your judgement as to whether that kind of subdivision is appropriate for your project; at least at the beginning, it probably isn't. It's better to start with one unified contributor gateway, aimed at all the types of contributors you expect, and if that page ever gets large enough or complex enough to feel unwieldy — listen carefully for complaints about it, since you and other long-time participants will be naturally desensitized to weaknesses in introductory pages! — then you can divide it up however seems best.

From a technical point of view there is not much to say about setting up the project web site. Web hosting is a fairly easily solved problem, and most of the important things to say about layout and arrangement were covered in the previous chapter. The web site's main function is to present a clear and welcoming overview of the project, and to bind together the various collaboration tools (the version control system, bug tracker, etc.). To save time and effort, many projects just use one of the canned hosting services, as described below.

Canned Hosting

A canned hosting site is an online service that offers some or all of the online collaboration tools needed to run a free software project. At a minimal, a canned hosting site offers public version control repositories and bug tracking; most also offer wiki space, many offer mailing list hosting too, and some offer continuous integration testing and other services[28]. For many projects, canned hosting provides a perfectly adequate developer-oriented entry point to the project, and there is no need to set up a separate web site.

There are two main advantages to using a canned site. The first is server capacity and bandwidth: their servers are hefty boxes sitting on really fat pipes. No matter how successful your project gets, you're not going to run out of disk space or swamp the network connection. The second advantage is simplicity. They have already chosen a bug tracker, a version control system, perhaps discussion forum software, and everything else you need to run a project. They've configured the tools, arranged single-sign-on authentication where appropriate, are taking care of backups for all the data stored in the tools, etc. You don't need to make many decisions. All you have to do is fill in a registration form, press a button, and suddenly you've got a project development web site.

These are pretty significant benefits. The disadvantage, of course, is that you must accept their choices and configurations, even if something different would be better for your project. Usually canned sites are adjustable within certain narrow parameters, but you will never get the fine-grained control you would have if you set up the site yourself and had full administrative access to the server.

A perfect example of this is the handling of generated files. Certain project web pages may be generated files — for example, there are systems for keeping FAQ data in an easy-to-edit master format, from which HTML, PDF, and other presentation formats can be generated. As explained in the section called “Version Everything”, you wouldn't want to version the generated formats, only the master file. But when your web site is hosted on someone else's server, it may be difficult to set up a custom hook to regenerate the online HTML version of the FAQ whenever the master file is changed.

If you choose a canned site, try to leave open the option of switching to a different site later, by using a custom domain name as the project's development home address. You can forward that URL to the canned site, or have a fully customized development home page at the main URL and link to the canned site for specific functionality. Just try to arrange things such that if you later decide to use a different hosting solution, the project's main address doesn't need to change.

And if you're not sure whether to use canned hosting, then you should probably use canned hosting. These sites have integrated their services in myriad ways (just one example: if a commit mentions a bug ticket number using a certain format, then people browsing that commit later will find that it automatically links to that ticket), ways that would be laborious for you to reproduce, especially if it's your first time running an open source project. The universe of possible configurations of collaboration tools is vast and complex, but the same set of choices has faced everyone running an open source project and there are some settled solutions now. Each of the canned hosting sites implements a reasonable subset of that solution space, and unless you have reason to believe you can do better, your project will probably run best just using one of those sites.

Choosing a Canned Hosting Site

There are now so many sites providing free-of-charge canned hosting for projects released under open source licenses that there is not space here to review the field.

So I'll make this easy:

If you don't know what to choose, then choose GitHub (https://github.com/). It's by far the most popular and appears set to stay that way for some years to come. It has a good set of features and integrations. Many developers are already familiar with GitHub and have an account there. It offers https://develop.github.com/ for interacting programmatically with project resources, and while it does not currently offer mailing lists, there are plenty of other places you can provision those, such as https://discourse.org/ and https://groups.google.com/.

If you're not convinced by GitHub (for example because your project uses, say, Mercurial instead of Git for version control), but you aren't sure where to host, take a look at Wikipedia's thorough comparison at https://en.wikipedia.org/wiki/Comparison_of_open_source_software_hosting_facilities; it's the first place to look for up-to-date, comprehensive information on open source project hosting options.

Hosting on Fully Open Source Infrastructure

Although all the canned hosting sites use plenty of free software in their stack, most of them also wrote some proprietary code to glue it all together. In these cases the hosting environment itself is not fully open source, and thus cannot be easily reproduced by others. For example, while Git itself is free software, GitHub is a hosted service running partly with proprietary software — if you leave GitHub, you can't take a copy of their infrastructure with you, at least not all of it.

Some projects would prefer a canned hosting site that runs an entirely free software infrastructure and that could, in theory, be reproduced independently were that ever to become necessary. Fortunately, there are a few such sites. As of mid-2015, https://phacility.com/ offers low-cost project hosting in http://phabricator.org/, a fully open source web platform for open-source style collaboration.

There is also GitLab (https://gitlab.com/), which until recently was fully open source. However, they now seem to offer an open source edition for self-hosting and a so-called "enterprise edition"[29] that they host and that has a few features the open source edition doesn't have. Unfortunately, GitLab.com doesn't currently seem to offer a hosting package based on the strictly open source edition, which is too bad, because that would be the best choice for an open source project that wants to outsource hosting while preserving its commitment, both philosophical and utilitarian, to software freedom.

Furthermore, any service that offers hosting of the Redmine (http://www.redmine.org/) or Trac (https://trac.edgewall.org/) code collaboration platforms effectively offers fully freedom-preserving project hosting, because those platforms include most of the features needed to run an open source project. Some companies offer that kind of commercial platform hosting with a zero-cost or very cheap rate for open source projects.

Should you host your project on fully open source infrastructure? While it would be ideal, from the free software philosophical perspective, to have access to all the code that runs the site, I cannot say I've seen any evidence that hosting or not hosting on that kind of site has any effect on a project's success. The vast majority of developers who work on free software projects are willing to participate through a non-free hosting platform when that's what the project is using.

In my opinion, the crucial thing is to be able to interact with project data in automatable ways, and to have a way to export the data. A site that meets these criteria can never truly lock you in, and will even be somewhat extensible, via its programmatic interface. While there is some value in having all the code that runs a hosting site available under open source terms, in practice the demands of actually deploying that code in a production environment are difficult and sometimes prohibitive for most projects anyway. The largest of these sites need multiple servers, customized networks, and full-time staffs to keep them running; merely having the code would not be sufficient to duplicate or "fork" them anyway. The main thing is just to make sure your data isn't trapped.

Of course, all the above applies only to the servers of the hosting site. Your project itself should never require participants to run proprietary collaboration software on their own machines.[30]

Anonymity and Involvement

A problem that is not strictly limited to the canned sites, but is most often found there, is the over-requirement of user registration to participate in various aspects of the project. The proper degree of requirement is a bit of a judgement call. User registration helps prevent spam, for one thing, and even if every commit gets reviewed you still probably don't want anonymous strangers pushing changes into your repository, for example.

But sometimes user registration ends up being required for tasks that ought to be permitted to unregistered visitors, especially the ability to file tickets in the bug tracker, and to comment on existing tickets. By requiring a logged-in username for such actions, the project raises the involvement bar for what should be quick, convenient tasks. It also changes the demographics of who files bugs, since those who take the trouble to set up a user account at the project site are hardly a random sample even from among users who are willing to file bugs (who in turn are already a biased subset of all the project's users). Of course, one wants to be able to contact someone who's entered data into the ticket tracker, but having a field where she can enter her email address (if she wants to) is sufficient. If a new user spots a bug and wants to report it, she'll only be annoyed at having to fill out an account creation form before she can enter the bug into the tracker. She may simply decide not to file the bug at all.

If you have control over which actions can be done anonymously, make sure that at least all read-only actions are permitted to non-logged-in visitors, and if possible that data entry portals, such as the bug tracker, that tend to bring information from users to developers, can also be used anonymously, although of course anti-spam techniques, such as captchas, may still be necessary.



[28] Note that for successful free software projects, interested commercial entities will eventually often step up to fund many of these services anyway; see the section called “Providing Build Farms and Development Servers” for further discussion of this.

[29] See the section called “"Commercial" vs "Proprietary"” for why I put scare quotes around "enterprise edition".

[30] The exception to this is proprietary Javascript code that is received from the hosting site and run confined or "sandboxed" in one tab in the user's browser. The question of whether such code is conceptually an extension of the server, or should be thought of as running on the client machine even though in some senses it has more access to server resources than to client resources, is a deep and ongoing debate. We won't settle it here, but it's at least more complex than just which CPU is executing the instructions.