Review report of Samizdat (copy)

Person(s) reviewing it: angdraug, boud

More information about it: homepage | devel wiki | ruby 1.6 book | ruby 1.8 news

Software version(s): samizdat 0.6.0 | debian packages

Demo URL:

Demo users and passwords:

Real-world sites: Indymedia Belarus | Indymedia Ukraine

Status of the report: completed 2006-11-28

What We Have

The original feature list is available at CMSWhatWeHave.

Feature | Score (0-3) | Remarks
Anonymous open publishing | 3 | Available out of the box. Guest users can be allowed or forbidden to post messages without registration. IP addresses are not logged in the database, and the default Apache configuration disables IP logging in access.log.
Syndication out/in | 2 | Export: RSS 1.0 feeds for features, recent updates (newswire), focus updates (see "Categories"), and RDF search queries. Import: only available via the import_feeds-0.2 patch.
Comments | 3 | Comments have all the properties of ordinary messages, except that they cannot be promoted to features. Like any other message, a comment can be edited by the registered user who posted it (or by any registered user if it was posted as "open for all") and can have nested comments posted by anyone.
Categories | 3 | Samizdat's system of "focuses" is an RDF-based folksonomy. Any message (or any other resource on the site, such as a registered user) can become a focus if a user with "vote" access rights (at least a registered user; can be limited to moderators only) publishes an RDF statement declaring a unidirectional relation between it and any other resource. Such statements can be voted up and down by other users, which determines the weights of the statements and gets messages displayed on focus pages, promoted to features on the front page, or treated as translations of other messages. In a typical Indymedia site, "themes" or "categories" are subsets of the set of all articles on that site. In Samizdat, you can think of "focuses" as nodes rather than sets: a theme is a node directly linked to a set of messages, but may itself be linked to other focuses, and so on, creating a network of messages rather than a hierarchical structure. The UI for this categorization system is simple and doesn't require any special knowledge.
Search | 3 | The search page provides a UI for composing RDF queries with arbitrary criteria (e.g. all images published between given dates by a given user); a basic understanding of RDF or SQL is required to compose anything more than primitive queries. Queries can be published as messages or syndicated; an example query selecting all recent comments is linked from the front page. Primitive substring search is also provided on top of the same functionality.
Features | 3 | See "Categories". Any message is automatically promoted to the front page features wire if the weight of its relation to any focus (theme) is above a configurable threshold. Textile and sanitized HTML markup allow posting articles of any complexity.
Calendar | 0 | No. Exists only in the development plan.
Multimedia handling | 2 | Supports uploads of images, audio, video, and other files (including BitTorrent trackers). The list of supported MIME types is stored in the config file, grouped into inline (rendered by the engine), images (rendered with an <img/> tag), and others (rendered as download links). No image galleries, no streaming.
The ability to create multiple instances | 3 | Available out of the box. No code is shared between instances; each instance only needs its own config file with non-default options and a piece of Apache configuration. It may also need its own uploads directory (when uploads are enabled) and static files such as an icon, a logo, and SSI-included fragments for static parts of the front page.
Easy mirroring capability | 2 | An active read-only mirror is easy to set up: a wget cronjob to mirror uploads (known to work), PostgreSQL replication (e.g. Slony; never tried but should work), a config file ACL to make the mirror site read-only (trivial), and mod_rewrite to redirect POST requests to the master site (trivial).
Good performance on affordable hardware | 2 | No static-only sites as in Mir, but the code is lightweight and makes extensive use of multi-layer caching. Recent benchmarks show 50-1400 pages per minute with both scripts and database running on the same 1.5 GHz single-processor machine.
Customisability | 3 | All layout is done via CSS; the only layout table is on the front page. Four ready-made CSS themes are included and can be selected by users. The logo section appearing on all pages and the static header and footer on the front page are configured in site-specific config files.
Internationalisation | 3 | Full Unicode and GNU gettext (.mo files) support. The UI is translated into 7 languages: ru, be, fr, eo, ua, en, pl.
Translation | 3 | Message translations can be published like any other messages, and can be labeled as translations by users with "vote" access rights. Translations are matched against the HTTP Accept-Language header, which reflects browser settings, or against a language cookie for an explicitly selected language; the best-matching translation is used to represent the message and its title to the given user throughout the site.
Easy moderation | 3 | Moderators can hide and unhide messages, move them (reparent as a comment to a different message), edit them, displace them (edit without leaving a trace of the previous version in the database), and change their "open for all" status. The moderation log (who did what, when, to which message) is linked from the front page to keep moderators accountable.
Anti-abuse measures | 0 | No captcha, no spam filter, no in-memory IP logging or rate control.
Documentation | 2 | Included in the package: a verbose installation manual, an introduction to Samizdat concepts, and an extensive review of the Samizdat RDF storage implementation, all in English. User documentation is limited to the "How to use the site" section of Indymedia Belarus, mostly in Russian with sparse translations into English, Polish and Esperanto.
Scaling | 3 | Easy to cluster. Uploads are stored in a directory tree grouped by user login and can be served from different machines with some mod_rewrite magic. A Distributed Ruby based synchronized memory cache serves as a dispatch semaphore for multiple CGI servers accessing the same database.

Legend: 0, feature not present at all - 1, the feature exists but is poor and doesn't suit our goals - 2, the feature isn't as complete as we want but we can use it - 3, the feature is fully functional
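
The promotion rule described under "Categories" and "Features" can be sketched roughly as follows. This is a minimal illustration in Python, not Samizdat's own Ruby code: the Statement class, the +1/-1 vote values, and the use of the mean vote as a statement's weight are all assumptions made for the example, since this report does not say how weights are actually computed.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Statement:
    """Hypothetical model of a 'message is related to focus' statement."""
    message: str
    focus: str
    votes: list = field(default_factory=list)  # assumed: +1 or -1 per voter

    def weight(self) -> float:
        # Assumption: the weight is the mean of all votes, 0.0 with no votes.
        return mean(self.votes) if self.votes else 0.0


def featured_messages(statements, threshold):
    """Return messages whose relation to ANY focus weighs more than threshold.

    The order of first qualifying appearance is preserved and each message
    is listed only once, even if several of its statements qualify.
    """
    featured = []
    for statement in statements:
        if statement.weight() > threshold and statement.message not in featured:
            featured.append(statement.message)
    return featured
```

With a threshold of 0.5, a message related to one focus with votes +1, +1, -1 (weight about 0.33) is not promoted by that statement alone, but is promoted if another of its statements carries a single +1 vote.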

What We Want

The original feature list is available at CMSWhatWeWant

Feature | Score (0-3) | Remarks
User logins | 3 | Yes. Registration can be configured to require a valid email address. Registered users ("members") can be allowed via role-based ACL to vote on elements of site structure, such as categories, the front page, and translations (see "Categories"). Open publishing for guest users can be disabled via the config file on a per-site level.
User logins (network wide) | 0 | No. A registered user is part of the site structure. A decentralized Jabber-style (user@site) id from another site could be internalized using the existing identity verification code, replacing the confirmation email with access to the user's primary site.
Access controls | 2 | Site-wide role-based ACLs only, not per resource or per individual user.
Notify moderator button | 0 | No. Trivial to add with the existing sendmail and moderation capabilities.
Anti-bot systems | 0 | No.
User moderation - Open editing | 3 | Registered users can edit their own messages; a user can set the "open for all" option at publication time to allow all other registered users to edit the message. Moderators can change the open editing flag of any message. Textile (wiki-style) and HTML markup are supported; the HTML rendering of any message is filtered through the Samizdat::Sanitize library class to remove all non-whitelisted tags, attributes and CSS constructs. Also see "Categories" on open editing of site structure.
User profiles | 2 | Shows the list of messages by a user; users can also become focuses (see "Categories"). Allows re-editing messages using a range of markups. Declaring friendship relations between users is designed in, but not implemented.
Version control | 2 | Yes. Change history with colorized paragraph-level diffs is available for any edited message; the diffs are not yet as good as MediaWiki's.
Podcasting/Vodcasting | 0 | No. Can be added as smarter audio/video syndication.
User notifications | 0 | No. Easy to add using the existing sendmail-based user mailing implementation.
Customizable skins by users | 3 | Yes. See "Customisability": users can select from the themes configured for a site, or override them with their own CSS.
Accessibility | 2 | Yes. Renders well-structured XHTML suitable for alternative presentations. Has not been verified against Section 508 and WCAG.
Photo galleries | 1 | Not there yet. Only possible by doing an RDF search by the "image" message class.
Image Manipulation | 0 | Not yet.
P2P integration | 2 | Allows publishing BitTorrent trackers (but not generating them automatically); P2P hooks are ready.
Social networking - Filtering systems | 1 | Limited social networking can be done based on the existing focuses system; more advanced features are designed, but not implemented. See "User profiles" and "Categories".
WYSIWYG editor | 1 | No. Easy to plug into the existing message publishing code. Textile markup is available for wiki-style formatting.
Tagging | 3 | See "Categories".
Cross Site Search | 0 | No.
Licencing options | 0 | No.
Redundancy (DB content storage) | 2 | Yes. See "Easy mirroring capability".
Easy installation | 2 | The package in Debian experimental by default prepares a demo site that can be activated by running the database generation script and dropping the example Apache config in place. The plan is to have a one-line "apt-get install samizdat" installation.
Documented APIs | 3 | Relies heavily on the RDF standard along with Dublin Core metadata and the Squish query language. The RDF storage library is well documented (RDoc comments, whitepapers and slides); the rest of the internal APIs also include RDoc documentation. Uses existing Ruby libraries wherever possible (Ruby/DBI, YAML, REXML, DRb, Ruby/Gettext, Algorithm::Diff, Redcloth, HTMLTidy).
Software modularity | 3 | Code is modular and well documented: 3600 lines of code and 900 lines of comments (mainly RDoc documentation of classes and methods), 6 core, 7 helper, and 4 library classes, 8 database tables, 11 CGI scripts, unit and regression tests. Ruby is an object-oriented language universally praised for encouraging clear and maintainable code.
Healthy community behind | 2 | The Samizdat community is tiny, consisting mostly of Indymedia activists from Central and Eastern Europe. The Ruby community is a healthy international community that includes both old LISP and Smalltalk academics and fresh blood fascinated by the Rails framework.

Legend: 0, feature not present at all - 1, the feature exists but is poor and doesn't suit our goals - 2, the feature isn't as complete as we want but we can use it - 3, the feature is fully functional
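
To illustrate the whitelist filtering mentioned under "User moderation - Open editing", here is a minimal sketch in Python using only the standard library. It is not the Samizdat::Sanitize class, whose actual rules are not described in this report: the ALLOWED table and the function names are hypothetical, and the sketch covers only tags and attributes. It does not filter CSS constructs or URL schemes such as javascript:, which a real sanitizer must also handle.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of attribute names allowed on it.
ALLOWED = {"p": set(), "em": set(), "strong": set(), "a": {"href"}}


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself, its text content is still emitted
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in ALLOWED[tag]
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always escaped, so stray '<' or '&' cannot form new markup.
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text with non-whitelisted tags and attributes removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

For example, sanitize('<p onclick="x()">Hi <b>there</b></p>') keeps the paragraph but drops both the onclick attribute and the non-whitelisted b tag, while preserving the text inside it.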

What We Could Also Have

Other interesting things this software provides that we can use.

"Rapid Application Development" support | 1 | A switch to the Nitro/Og framework is envisioned. Ruby on Rails is considered, but its deployment model appears to be incompatible with the requirement of multiple-instance deployment.


Major drawbacks that allow no excuses and need to be worked on as a first priority: RSS import (a patch is available and will be integrated soon), anti-abuse/anti-bot measures (including a notify moderator button), and a calendar (at least in a basic form). Once these features are implemented, there will be no other reason not to choose Samizdat over other CMSes for Indymedia purposes.

Other weak spots (bells and whistles, but of the kind that users increasingly take for granted): multimedia handling (including photo galleries, image manipulation, podcasting/vodcasting), WYSIWYG editor, user profiles and social networking.

Areas for improvement (things that can't be called drawbacks, but are still worth working on): performance and mirroring, P2P integration, OpenID support, user notifications, GIS, developer community.

Strong points. First of all, complete openness and transparency in all aspects of operation: publishing and editing, categorization and feature articles, moderation. No other CMS pays special attention to having everything possible decided by all users and accountable to all users.

Next in order of importance is support for multilingual communities. Samizdat is not that far ahead here, but no other CMS seems to go beyond switching between language-specific site versions. Samizdat goes the extra mile to make sure every user gets the content that matches the list of languages they can understand.
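
The kind of language matching described here (and in the "Translation" row) can be sketched as below. This is an illustrative Python sketch, not Samizdat's implementation: the function names, the precedence of the cookie over the header, and the fallback to a default language are assumptions.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # HTTP default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            prefs.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(prefs)]


def pick_translation(available, accept_language, cookie_language=None, default="en"):
    """Pick the language code of the best-matching translation.

    available maps language codes to translated titles. Assumption: an
    explicitly selected cookie language takes precedence over the header.
    """
    wanted = parse_accept_language(accept_language)
    if cookie_language:
        wanted.insert(0, cookie_language.lower())
    for tag in wanted:
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # e.g. "en-gb" falls back to "en"
        if primary in available:
            return primary
    if default in available:
        return default
    return next(iter(available), None)
```

For a message available in Russian and English, a browser sending "be, ru;q=0.8, en;q=0.5" would be served the Russian version, unless a language cookie selects English explicitly.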

Using RDF as the underlying data model has a lot of potential. The current search capabilities of Samizdat show where this can go, even though the current implementation is not even 10% of what can be done in this area. The Semantic Web is still far from widespread adoption, but once it's there, we won't understand how we coped without it.

Last but not least is maintainability. For the number of features it already has, the Samizdat code is really tiny, and it grows slower than linearly as new features are added. It is well documented from a developer's point of view; no one has ever complained about it being hard to understand. The ratio of developer time spent to useful functionality is also in Samizdat's favor, showing how easy it is to add new capabilities and to keep it secure.
