Thalassa CMS

User comments

Introduction

Letting users comment on the site's pages is actually the main thing Thalassa CGI was created for, so many configuration aspects are directly related to comments. There are also things that look generic enough but in reality are only used for comments, such as request arguments (BTW, if you didn't read that section or don't remember it, now is the time to refresh it).

To have the comment support on your site to the full extent possible with Thalassa CGI, it is necessary to configure:

The last step can be omitted in case you decide not to premoderate comments on the site.

Anyway, without the comments facility, Thalassa CGI in its present version may only be useful for its contact form, but then you don't need all the user account-related stuff either.

Comments storage and format

Before proceeding any further, please get back to Thalassa static generator description and re-read the section “How comments are stored”.

It is important to keep in mind that a separate comment collection (effectively, a directory with headed text files, one file per comment) corresponds to every generated page (in terms of the generator) that has a comment section.

For the generator, it doesn't matter what particular format and encoding the comment files have; what matters is that the format-related information is correctly reflected in each file's header. It must, however, know the target encoding and the list of allowed HTML tags to generate everything correctly.

As for the CGI program, it always stores comments in the (presumably) same encoding in which everything works, so it just doesn't emit the encoding: header field. However, in case a comment file was not created by the Thalassa CGI program, and the user requests to view or edit the comment, the file can have its own encoding, so the CGI program needs to know which encoding to convert it to, that is, what the main working encoding is. Furthermore, such a file can contain HTML tags, and the program might need to filter them, hence it must know which tags are allowed.

To provide all the format-related information, a [format] ini section should be added to the configuration file. It has the same parameters as the [format] ini section in the static content generator configuration files; actually, the section should be copied from there as a whole. The only difference is that, unlike the generator, the CGI program configuration doesn't have the [options ] section group, so [format] section parameters here are not macroprocessed at all.

Premoderation queue implementation

Within the session database, a subdirectory named _premod_queue is created to hold your premoderation queue; if the subdirectory isn't there, it means no comments were ever added to the queue on your site.

The directory contains symbolic links to the comment files that reside within the content database. However, most of the information needed by Thalassa CGI resides within the names of these symlinks. Each name consists of three parts: realm ID, page ID (within the realm) and comment ID. The term realm here corresponds to something within which there are pages with IDs; in the present version of Thalassa, it is either a page set, or a list that has item pages. It is important to note that the realm ID may match the list or set ID, but is not obliged to. The actual mapping from the realm IDs to lists and sets is done by various configuration file parameters, and is never done explicitly, as Thalassa CGI has no direct support for it. Consider the realm ID as something that you use to tell one page ID space from another, in case you have more than one of them.

The three components of the symlink name are joined with the “=” char, like pages=pg37=233. Here “pages” is the realm ID, pg37 is the page ID within the realm, and 233 is the comment ID.
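The naming scheme can be illustrated with a short Python sketch. This is not Thalassa's own code; the function and type names are made up for the illustration, and it assumes that none of the three IDs contains the “=” char itself.

```python
from typing import NamedTuple


class QueueItem(NamedTuple):
    """One premoderation queue entry (illustrative type, not a Thalassa API)."""
    realm: str
    page: str
    comment: str


def parse_queue_name(name):
    """Split a _premod_queue symlink name into realm, page and comment IDs."""
    parts = name.split("=")
    # Assumption: the IDs never contain "=", so exactly three
    # non-empty parts are expected.
    if len(parts) != 3 or not all(parts):
        raise ValueError("not a realm=page=comment triple: %r" % name)
    return QueueItem(*parts)


def make_queue_name(realm, page, comment):
    """Join the three IDs back into a symlink name."""
    return "=".join((realm, page, comment))
```

With these helpers, parse_queue_name("pages=pg37=233") yields the realm pages, the page pg37 and the comment 233, and make_queue_name reverses the operation.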

When a user leaves a comment on your site and, according to the configuration, the comment is to go through the premoderation procedure, Thalassa CGI saves the comment with the premod and hidden flags set, and makes the appropriate symlink. The premod flag is not used by the present version of Thalassa, but it can still be used by third-party software; Thalassa itself only uses the contents of the _premod_queue directory.

The [comments] ini section

Comment database path

To be able to work with comments, the CGI program needs to know where they (that is, the “database” files that store comments) are located. Thalassa CGI assumes by design there's a common directory under which all comments are found, and these “comment tree” directories, which correspond to every page with comments, are subdirectories of the common one.

The [comments] ini section recognizes two parameters to deal with directories:

The dir and subdir values are concatenated as path parts (rather than as plain strings), so if neither the dir ends with “/”, nor the subdir begins with one, then a “/” is forcibly inserted between them. Hence, the subdir is always used as relative to the dir.

If you really want it to be an absolute path, then set the dir to “/” (root directory). Since the %[cmtinfo:topics] function appeared, this practice can no longer be recommended.
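A minimal Python sketch of this joining rule, as it is described above, may look as follows. The function name and the sample paths are hypothetical; the real implementation inside Thalassa CGI may differ in details such as handling of doubled slashes.

```python
def join_comment_path(base_dir, subdir):
    """Join dir and subdir as path parts, so subdir stays relative to dir."""
    # A "/" is inserted only when neither side already supplies one.
    if base_dir.endswith("/") or subdir.startswith("/"):
        return base_dir + subdir
    return base_dir + "/" + subdir
```

For instance, joining the hypothetical values /var/site/comments and pages/pg37 gives /var/site/comments/pages/pg37, while setting the dir to “/” turns any subdir into an absolute path.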

Access control

Various aspects of creating and managing comments are subject to access control. The authorization scheme is based on a fixed list of permissions, for each of which a list of authorized user roles may be specified.

Thalassa CGI knows the following permissions related to comments:

It is very important to understand the difference between permissions and roles. There is a fixed set of permissions, hardcoded into the Thalassa CGI program; in the present version, comments is the only facility that uses the permission system, so the permission identifiers listed above are the only existing ones.

Roles, in contrast, are intended to be introduced by the site administrator. There are three “special” roles (the all role is always there, even if there's no user, the anon role is there for sessions that aren't logged in, and the auth role belongs to every logged in user), but that's all; you decide which other roles to use, how to name them etc.

To assign permissions to roles, the access parameter (within the [comments] section) is used. Its value must consist of stanzas, separated with the semicolon “;”. For each stanza, any leading and trailing whitespace is stripped, then the first word (delimited by whitespace) is considered the permission name, and the rest is a comma-separated list of roles. For example:

  access = post            all;
           post_visible    superposter, moderator, admin;
           see_hidden      moderator, admin;
           see_own_hidden  auth;
           moderation      moderator, admin;
           edit            admin;
           edit_own        superposter, admin;
           edit_own_recent auth;

In this example, three “custom” roles are used: admin, moderator and superposter; users with the admin role effectively can do everything Thalassa CGI is capable of, while moderators can do, well, moderation, and to do so, they are permitted to see hidden comments (otherwise it would be hard to moderate). Besides that, for moderators the premoderation queue is bypassed, which is logical enough, as they can go and let their comments pass anyway. The last custom role in the example, superposter, can also post comments without anyone's approval, and besides that, superposters are granted the permission to edit (and delete) their own comments. All registered (authenticated) users are permitted to see their own comments, even in hidden state, and to edit their own recent comments; anonymous users can only post comments, but once posted, they are unable to view their hidden comments or to do anything else with them (which is obvious: there's no way to tell which comments are “owned” by a particular anonymous user).
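The stanza syntax of the access parameter and the role-based check can be sketched in Python as follows. This is only an illustration of the rules given above, not the code Thalassa CGI uses; all function names are invented for the example, and merging repeated stanzas for the same permission is an assumption.

```python
def parse_access(value):
    """Parse an access parameter value into {permission: set of roles}."""
    table = {}
    for stanza in value.split(";"):
        stanza = stanza.strip()
        if not stanza:
            continue  # e.g. the empty piece after a trailing semicolon
        parts = stanza.split(None, 1)  # first word is the permission name
        roles_text = parts[1] if len(parts) > 1 else ""
        roles = {r.strip() for r in roles_text.split(",") if r.strip()}
        # Assumption: repeated stanzas for one permission are merged.
        table.setdefault(parts[0], set()).update(roles)
    return table


def effective_roles(logged_in, custom_roles=()):
    """Roles of a session: 'all' always, then 'anon' or 'auth' plus custom ones."""
    roles = {"all", "auth" if logged_in else "anon"}
    if logged_in:
        roles.update(custom_roles)
    return roles


def has_permission(table, permission, roles):
    """True if at least one of the given roles is authorized."""
    return bool(table.get(permission, set()) & set(roles))
```

Applied to the example above, an anonymous session gets the post permission through the all role, while moderation is only granted to sessions whose user has the moderator or admin role.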

Certainly your permission system can be much simpler, e.g.:

  access = post            all;
           edit_own_recent auth;
           post_visible    admin;
           see_hidden      admin;
           moderation      admin;
           edit            admin;

In this example, only one custom role (admin) is used. Everyone can post, authenticated (registered) users enjoy the possibility to edit/delete a recently posted comment, and everything else is only available to admins. If you replace the “all” at the top line with “auth”, users will have to sign up in order to make comments.

One can even go with something like this:

  access = post_visible auth

Here, all authenticated users are able to comment without premoderation approval, and nothing else can be done through the web interface; all administration tasks and everything else is done manually.

Thalassa CGI decides whether a particular comment is “recently posted” or not based on the value of the recent_timeout parameter. The unit here is a minute, so

  recent_timeout = 30

means half an hour.
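The check itself is simple arithmetic; a hedged Python sketch is given below. The function name is hypothetical, timestamps are assumed to be in seconds, and whether the boundary moment itself still counts as “recent” is an assumption as well.

```python
import time


def is_recent(posted_at, recent_timeout, now=None):
    """Tell whether a comment posted at posted_at (seconds) is still recent.

    recent_timeout is given in minutes, as in the ini file.
    """
    if now is None:
        now = time.time()
    # Assumption: the comparison is inclusive at the boundary.
    return now - posted_at <= recent_timeout * 60
```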

In the present version there's no web interface for manipulating user roles, so if you need to grant a user some custom roles, let the user sign up for an account, then go to your database directory, change to _users/NNNNN subdirectory (where NNNNN is the user login name), open the file _data in your favourite editor and add a line like this:

  roles = moderator, superposter

On pages created for adding and editing comments, it is useful to have links back to the page to which the comment belongs. To make such links possible, two parameters must be properly filled:

Both URLs must either be full URLs, or local URIs, as used in the a href attribute. It is recommended to make local URIs absolute, that is, starting with “/”. Both parameters are usually created with request arguments.

The values set here are available through the respective functions of the %[discuss: ] macro.

Original page text display

Users generally prefer to see the text they are replying to. When the text being commented on is another comment, this poses no problem, as Thalassa CGI is fully capable of working with the comment part of the content database, so it can just get the text of the comment in question by means it has to have anyway. However, things aren't that easy when the comment being composed is supposed to be a “top-level” one, so the text being commented on doesn't belong to any other comment: it is the text of the page.

Things remain relatively easy if the page comes from a page set, so it is represented with a headed text file. The CGI program is linked with the modules supporting this format anyway, so it is not too complicated to extract the necessary information. The only problem is the file's location. It could be deduced from the static generator's configuration, but to achieve that, the CGI program would need to have all the code the generator uses to access its database, which is a bit too much. So instead of this, Thalassa CGI simply uses another parameter within the [comments] section, named page_source. Its value, if defined and not empty, must be the filesystem path (either absolute, or relative to the CGI's working directory) of the headed text file which is the source of the page being discussed. Certainly, the value should in most cases be created with request arguments; typically it is something like

  page_source = %[reqarg:origpgpath]

and the origpgpath is set completely within the respective [page ] section.

In case the page actually comes from a list which has item pages, not from a page set, there's actually no source file, and, furthermore, although it is still possible to deduce the page text from the content database, a significant part of the static generator's code (if not all) would be necessary for that. So, for this case the text is instead taken right from the generated page, that is, the HTML file.

To enable the CGI program to do this, first of all, two marks must be inserted into every such page: the text begin mark and the text end mark. Both marks are whole lines and perhaps should be HTML comments; you can pick any strings for this purpose, for example, <!--THALCGI-BEGIN-MARK--> and <!--THALCGI-END-MARK-->. The lines containing the marks may also contain any amount of leading and trailing whitespace, but no other chars besides the marks themselves. The static generator has no special means for this; from its point of view these marks are just a part of the content being generated. For the CGI program, these marks must be set explicitly with the page_html_marks parameter. The parameter's value must consist of exactly two strings, the first for the begin mark and the second for the end mark, like this:

  page_html_marks = <!--THALCGI-BEGIN-MARK-->
                    <!--THALCGI-END-MARK-->

All leading and trailing whitespace is trimmed off in both lines.
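To make the role of the marks clearer, here is a rough Python sketch of how the text between the two mark lines could be extracted from a generated HTML file. It only illustrates the rules stated above (marks occupy whole lines, surrounding whitespace is ignored); the function name is invented, and returning None when a mark is missing is an assumption, as the behaviour of Thalassa CGI in that situation is not described here.

```python
def extract_between_marks(html_text, begin_mark, end_mark):
    """Return the lines found between the begin and end mark lines."""
    begin_mark = begin_mark.strip()
    end_mark = end_mark.strip()
    collected = []
    inside = False
    for line in html_text.splitlines():
        stripped = line.strip()  # whitespace around a mark is allowed
        if not inside:
            if stripped == begin_mark:
                inside = True
        elif stripped == end_mark:
            return "\n".join(collected)
        else:
            collected.append(line)
    return None  # assumption: a missing mark means there is no text
```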

The page_source parameter must be left empty for this case, which means its value may only contain whitespace (but any amount of it).

The path to the HTML file is set with the page_html_file parameter; it may be either an absolute or a relative path, but as the HTML file typically resides within the web site content tree, just like the CGI program itself, in most cases this path should be relative. As in all similar cases, this parameter's content is almost always created using request arguments.

In both cases, the text extracted from either source will be available with the body function of the %[discuss: ] macro. If the page comes from a page set and hence is extracted using the page_source parameter from its source (headed text) file, the page's title will also be available, with the title function. Thalassa CGI is unable to extract the title separately from the body in case the HTML file is used as the source (page_html_file and page_html_marks parameters), so if the HTML file is used, the title function will return an empty string.

Rebuilding commented pages

Whenever a comment is added, deleted or changes its visibility status, the page containing the comment needs to be rebuilt. This is typically done by running the thalassa program (that is, the static content generator) with command line parameters that tell it to rebuild a particular page using the spool facility (the spool is necessary because several copies of the CGI program may run simultaneously and launch several copies of the generator).

The page_regen_command parameter sets the command line (name and arguments) for the external program (presumably thalassa) to launch to regenerate the current page, that is, the page whose properties are set by request arguments. As usual for command lines set by ini file parameters, arguments are split down to words, using the apostrophe “'” and the doublequote “"” as grouping symbols (both an apostrophe within doublequotes and a doublequote within apostrophes are considered as plain chars).
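The word splitting rule can be sketched in Python as shown below. This is an illustration of the description above rather than the actual parser: the function name is invented, no backslash escapes are handled because none are mentioned, and the treatment of an unterminated quote and of empty quoted strings is an assumption.

```python
def split_command(line):
    """Split a command line into words, grouping with ' and \" quotes."""
    words = []
    current = []
    in_word = False
    quote = None  # the quote char currently open, if any
    for ch in line:
        if quote is not None:
            if ch == quote:
                quote = None
            else:
                current.append(ch)  # the other quote char is plain here
        elif ch in ("'", '"'):
            quote = ch
            in_word = True  # assumption: '' produces an empty word
        elif ch.isspace():
            if in_word:
                words.append("".join(current))
                current = []
                in_word = False
        else:
            current.append(ch)
            in_word = True
    if quote is not None:
        raise ValueError("unterminated quote")  # assumption
    if in_word:
        words.append("".join(current))
    return words
```

For a made-up command line such as /opt/site/bin/thalassa -g 'two words' "it's", the sketch yields four words, the last two being two words and it's.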

In most cases, there should be a dedicated request argument that sets the string to be passed as the -g parameter to thalassa. A name like gentarget might be a good choice for this request argument.

Displaying premoderation queue

Every time a comment is added, removed, hidden or revealed, the HTML page that contains the comment must be regenerated. To do so, the CGI program runs the static generator; however, the CGI program itself doesn't know how to run the generator (is this surprising?), so you need to tell it. This is done with the page_regen_command parameter. The parameter's value is a command line (name and arguments) for the external program to regenerate the “current” page; arguments are split down to words, using the apostrophe “'” and the doublequote “"” as grouping symbols (both an apostrophe within doublequotes and a doublequote within apostrophes are considered as plain chars). Obviously, this parameter's content is almost always created using request arguments.

The command name, as usual for the execvp(3) argument, may be an absolute path, a relative path or a short name; if it contains no slashes “/”, the PATH environment variable will be searched for the binary to run. However, it is strongly recommended not to use short names here, even if the thalassa binary is “installed” in your system (in a directory like /usr/local/bin) and is available through the PATH.

New comment form

The comment_add action lets the user add a new comment on your site. The action accepts one argument, the ID of the comment being replied to. In case the argument is not present or empty, it is assumed the user adds a new “top-level” comment, that is, replies to the text of the page, not to another comment.

This action expects two or three input field values. The subject input (containing the title for the new comment) and cmtbody input (containing the body) are expected always, and the name input is only expected if the user is not logged in, otherwise the input is not extracted (and ignored), and the user's visible name is used instead.

The permissions are checked; the user must have the post permission in order to post comments. In case the user is not permitted to bypass the premoderation (doesn't have the post_visible permission), the comment is added with flags hidden and premod, and the comment is added to the premoderation queue.

Comment text and status modification form

The comment_edit action lets the user (provided that the user has the appropriate permissions) edit a comment, delete a comment, hide and unhide a comment, and remove comments from the premoderation queue. The action accepts one mandatory argument, which is the ID of the comment to be altered.

This webform is multifunctional, so it is assumed to have several submit buttons. Thalassa CGI extracts and handles input field values according to the following procedure.

First of all, the moderation input field is checked. If it is present, its value must be one of “regenerate” (just regenerate the page), “hide” (hide the comment), “unhide” (make the comment visible) or “dequeue” (remove the comment from the premoderation queue, not changing its visibility). For all these values except “dequeue”, the page is regenerated after the comment file is saved with new flags. The form handling is stopped after this, no more input fields are checked.

In case the moderation input field is absent, the delete input field is checked. If it is present, its value must be “yes”, otherwise an error will be reported to the user. Furthermore, there must also be the really input field, and its value must be “really” (the form should let the user type “really” into the form field to confirm the action is intended). If the checks are successful, the comment is deleted (which means its source file is literally deleted from the content database), and the page is regenerated. The form handling is stopped after this.

The last possibility is that neither moderation nor delete is present. In this case the subject input (containing the comment's title) and the cmtbody input (containing the body) are expected; the comment is saved with the new content, and in case it is visible, the page is regenerated. Please note the present version of Thalassa doesn't allow changing the “visible” user name for a comment through the web interface; this can only be done by manually editing the comment source files in the database.
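The order in which the input fields are examined can be summarized with a Python sketch. It merely restates the procedure described in this section; the function, the exception class and the returned action names are invented for the illustration, and permission checks are left out.

```python
MODERATION_VALUES = ("regenerate", "hide", "unhide", "dequeue")


class FormError(Exception):
    """The submitted fields do not form a valid request (illustrative)."""


def classify_edit_request(fields, comment_visible):
    """Return (action, regenerate_page) for a comment_edit submission.

    fields maps input names to values; comment_visible only matters
    when the comment's text is being changed.
    """
    if "moderation" in fields:  # checked first; nothing else is examined
        value = fields["moderation"]
        if value not in MODERATION_VALUES:
            raise FormError("unknown moderation request: %r" % value)
        return value, value != "dequeue"
    if "delete" in fields:
        if fields["delete"] != "yes":
            raise FormError("delete must be 'yes'")
        if fields.get("really") != "really":
            raise FormError("deletion was not confirmed")
        return "delete", True
    if "subject" not in fields or "cmtbody" not in fields:
        raise FormError("subject and cmtbody are expected")
    return "save", comment_visible
```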

The %[discuss: ] macro

The %[discuss: ] macro provides access to properties of the text being replied to (either a comment or a page) and the comment being edited. Actually, it is possible to access any of the comments on the current page with this macro, as the comment ID is passed to the macro as one of its arguments.

It is important to understand that the notion of the current page is implemented by parameters of the [comments] ini section, primarily the subdir parameter — having its value and the comment ID, the macro can access all the comment's properties. For the case when the macro is requested to access the page's properties (in contrast with properties of a comment on the page), it uses either the page_source parameter, or the page_html_file and page_html_marks parameters (see the Original page text display section for details).

Furthermore, some functions of the %[discuss: ] macro provide direct access to the [comments] section parameters' values.

The macro accepts at least two arguments; the first argument is the name of the desired function, and the second argument (for all functions) is the comment ID, or an empty string in case the properties of the page, not a comment on it, are requested. Conditional checkers accept two additional arguments, the then and the else values, and all the other functions need only the two arguments. Some functions ignore the second argument, but nevertheless accept it. The list of the macro's functions follows:

The %[cmtinfo: ] macro

The %[cmtinfo: ] macro is primarily intended for handling the moderation queue, where it is often necessary to access properties of comments not from the current page, and, furthermore, the current page (the one being commented on) may be undefined just because the user neither tried to leave a new comment nor to modify an existing one.

The macro accepts arguments, and the first of them, as usual, must be a supported function name.

The %[cmtinfo:tags] function takes no arguments; it returns a space-separated list of allowed HTML tags as configured in the [format] section.

The %[cmtinfo:topics] function takes no arguments; it returns a space-separated list of existing comment topics, which means all currently existing values for the subdir parameter. Technically, in the present version the macro recursively traverses the comment base directory (the one specified as the dir parameter), picks directories that have no subdirectories, and returns the list of their paths relative to the base directory.
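The traversal can be approximated in Python with os.walk, as in the sketch below. It is not the macro's real implementation: the function name is made up, excluding the base directory itself and sorting the result are assumptions, and the sketch returns a list rather than a space-separated string.

```python
import os


def list_topics(base_dir):
    """Return paths (relative to base_dir) of directories without subdirectories."""
    topics = []
    for dirpath, dirnames, _filenames in os.walk(base_dir):
        # Assumption: the base directory itself is never a topic.
        if not dirnames and dirpath != base_dir:
            topics.append(os.path.relpath(dirpath, base_dir))
    return sorted(topics)  # assumption: the order is not specified
```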

The %[cmtinfo:iftopic:PageID:THEN:ELSE] function accepts three arguments: PageID, which is effectively used instead of the subdir parameter value (a.k.a. topic id), THEN and ELSE; it checks if the topic for the given page ID exists and returns THEN or ELSE accordingly.

The %[cmtinfo:list:PageID] function needs only one argument, PageID. The function returns the list of comment IDs for the given page (comment topic), space-separated, sorted, leading zeroes stripped.

The rest of the functions accept at least two additional arguments, first being the PageID just like for the iftopic and list functions, and the other being the comment ID. Conditional checkers accept two additional arguments, for the then and the else values. The list of the functions follows:

In the present version of Thalassa, the cmtinfo macro doesn't provide access to the comment's body, because no need for that ever arose. This can be more or less easily changed, so in case you really need it, contact the authors.

The %[ifperm: ] macro

The %[ifperm: ] macro is used to check if the current user has a certain permission.

%[ifperm:post:THEN:ELSE] checks whether the current user (or the anonymous, in case there's no logged in user) is permitted to post comments.

%[ifperm:seehidden:USER:THEN:ELSE] checks if the current user can see hidden comments left by the given USER (or if the current user can see all hidden comments, in case the USER argument is left empty).

%[ifperm:moderate:THEN:ELSE] checks if the current user is allowed to perform moderation, that is, to change comments' visibility status and to remove comments from the moderation queue. The “moderation” function name is a synonym for “moderate”.

%[ifperm:edit:USER:TIME:THEN:ELSE] checks if the current user can edit a comment left by the given USER at the given TIME. If the TIME argument is empty, the function checks if the current user can edit comments left by the given USER regardless of when they were left. If both USER and TIME are empty, the function checks whether the current user has the right to edit all existing comments.

It is important to understand that the respective POST action implementations perform (on their own) the permission checks needed for each action. The ifperm macro is only used to decide what content to display to the user: it makes it possible to display different things depending on what permissions the user has.

The %[justposted: ] macro

Whenever a new comment is successfully posted, it is a good idea to send the user (as a response to the successful POST request) a page that provides some information regarding the comment. The %[justposted: ] macro provides such information.

%[justposted:if:THEN:ELSE] checks whether we really have the “just successfully posted a new comment” situation. The “ifhave” function name is a synonym for “if”.

%[justposted:ifhidden:THEN:ELSE] checks if the new comment has the hidden flag (which basically means it is placed in the premoderation queue).

%[justposted:comment] returns the comment ID for the new comment.

The %[commentmap: ] macro

As it was discussed earlier, the thalassa program generates comment maps for “pages” that end up generated as several HTML files because of the number of comments. The %[commentmap: ] macro is used to access the data from these map files.

The macro is called as follows:

  %[commentmap:PATH:CmtID:DEFAULT]

where PATH is the map file path, which should be relative to the CGI location (well, it can be an absolute path as well, but in a sane environment you won't need to use an absolute path); CmtID is the comment ID, without leading zeroes; and DEFAULT is the value to return in case either the map file or the comment ID within the file aren't found.

The value returned by the macro is the local part of the URL (well, URI) of the HTML file, always starting with the “/”. It can be used both as a part of the href attribute and as a path to a file, relative to your web tree root (despite the leading “/”).

Premoderation-related functions of the %[sess: ] macro

We already explained most of the %[sess: ] macro's functions, related to sessions and user accounts. The last group of its functions, related to handling of the premoderation queue, follows.

%[sess:premodq] returns the premoderation queue content. The result is a space-separated list of elements, each of them being a triple consisting of a realm ID, page ID (within the realm) and comment ID, joined by the “=” char, as it was explained in the section on premoderation queue; actually, these triples are just file names of the symlinks found within the premoderation queue directory. See also the splitpremodq macro description to get an idea what to do with these triples.

%[sess:pqprev:RealmID=PageID:CmtID] and %[sess:pqnext:RealmID=PageID:CmtID] return, respectively, the previous and the next item of the premoderation queue, or an empty string if either the current (given) item isn't found or there's no previous/next item for it.

It might look weird that one has to pass the first two elements of the triple (RealmID and PageID) as a single argument, while splitting off the third element of the same triple and passing it separately. However, this is for a reason; it is possible that in a future version of Thalassa, these premoderation queue elements will no longer have to be triples.
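The behaviour of pqprev and pqnext can be modelled with a small Python sketch operating on a list of queue items. The function names are invented, and the order of the queue is simply taken as given, since the way Thalassa CGI orders the queue is not described here.

```python
def _pq_neighbour(queue, realm_page, cmt_id, step):
    """Find the item 'RealmID=PageID=CmtID' and return its neighbour or ''."""
    item = realm_page + "=" + cmt_id
    try:
        index = queue.index(item)
    except ValueError:
        return ""  # the given item is not in the queue
    index += step
    if 0 <= index < len(queue):
        return queue[index]
    return ""  # no previous/next item


def pq_prev(queue, realm_page, cmt_id):
    """Model of %[sess:pqprev:RealmID=PageID:CmtID]."""
    return _pq_neighbour(queue, realm_page, cmt_id, -1)


def pq_next(queue, realm_page, cmt_id):
    """Model of %[sess:pqnext:RealmID=PageID:CmtID]."""
    return _pq_neighbour(queue, realm_page, cmt_id, 1)
```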

The %[splitpremodq: ] macro

Once the premoderation queue is retrieved with %[sess:premodq], it needs to be handled somehow, and, in particular, these triples must be split down to values of the realm ID, page ID and the comment ID. This is done with the %[splitpremodq: ] macro.

The macro accepts a variable number of arguments. The first argument is considered the name of a macro; typically it is html. That macro will be called to produce the result. Then an arbitrary number of arguments follow; these are passed to the macro as they are. The last argument must be a premoderation queue triple; it will get split, and each of its elements will be passed to the macro as an additional argument.

For example, %[splitpremodq:html:foo:bar:buzz:pages=pg37=233] will call the html macro as if the following expression were expanded: %[html:foo:bar:buzz:pages:pg37:233].
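The argument rearrangement can be expressed in a few lines of Python. The sketch below only builds the text of the equivalent macro call; the real macro calls the target macro directly, the function name is hypothetical, and escaping of special chars inside arguments is ignored.

```python
def expand_splitpremodq(args):
    """Build the equivalent macro call from the arguments of splitpremodq."""
    if len(args) < 2:
        raise ValueError("a macro name and a queue triple are required")
    *head, triple = args  # head: macro name plus pass-through arguments
    return "%[" + ":".join(head + triple.split("=")) + "]"
```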

© Andrey V. Stolyarov, 2023, 2024