thalassa program invocation and command line options

Contents:

Available commands and common options
Printing built-in texts

The help command
The show command

Content generation (the gen command)

Choosing the generation mode
Target specification
Spool directory

Listing configured objects
Content database manipulation

The inspect command
The update command
File specification parameter

Available commands and common options

Generally, the thalassa program invocation has the following form:

  thalassa [common_options] <command> [options]

where command is one of:

help — get help;
show — show the program's properties;
gen — generate content;
list — list configured/existing objects (PARTIALLY IMPLEMENTED);
update — update a page or comment file;
inspect — show a page or comment file's content.

The following common options are recognized:

-c dir — change to the given directory before start;
-o selector — force option selector (overrides the opt_selector parameter);
-i ini_file — load the given ini file; the “-i” option may be given multiple times; if none is given, the default “thalassa.ini” is used.

Command-specific options are described together with each command, below.

Being run with no arguments, thalassa prints the same message as thalassa help does, but to stderr rather than stdout.
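When thalassa is launched from a script, it can help to assemble the argument list in one place. The following is only a minimal sketch of the general invocation form shown above; the function name and its parameters are made up for this illustration, and the order in which the common options are emitted is an arbitrary choice, not something the program is known to require.

```python
from typing import List, Optional, Sequence


def thalassa_argv(command: str,
                  command_args: Sequence[str] = (),
                  ini_files: Sequence[str] = (),
                  workdir: Optional[str] = None,
                  selector: Optional[str] = None) -> List[str]:
    """Compose: thalassa [common_options] <command> [options]."""
    argv = ["thalassa"]
    if workdir is not None:
        argv += ["-c", workdir]    # change to this directory before start
    if selector is not None:
        argv += ["-o", selector]   # force the option selector
    for ini in ini_files:
        argv += ["-i", ini]        # -i may be given several times
    argv.append(command)
    argv += list(command_args)     # command-specific options come last
    return argv
```

The resulting list can be passed to subprocess.run as is, for example thalassa_argv("gen", ["-g", "set=blog"], ini_files=["thalassa.ini"]).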

Printing built-in texts

The help and show commands don't actually do anything; they only print text messages built into the thalassa binary. Common options are still accepted but completely ignored by these commands; when run with one of these commands, thalassa doesn't attempt to load any ini files (nor any other files).

The `help` command

Being invoked without additional arguments, thalassa help prints the program name, version and the general help message, which includes information on available commands and common options.

The command accepts at most one additional argument, a command name; when run with such an argument, as in thalassa help gen, it displays the command-specific help text.

The `show` command

Being run with no additional arguments, the command displays the same help message as thalassa help show does. Two subcommands are recognized:

thalassa show version prints version information;
thalassa show encodings (or just enc) displays supported encoding names and aliases.

Content generation (the `gen` command)

thalassa gen does what thalassa is made for — generates the site's content. Command-specific options are required for the gen command; being run with no additional arguments, thalassa gen prints an error message, displays the command-specific help text (the same as for thalassa help gen, but to stderr instead of stdout) and exits.

Choosing the generation mode

One and only one of the following options must be given:

-a (for all) — generate everything;
-r — regenerate the site;
-g target_list — generate the given targets; see the Target specification section below.

For both the -a and -g modes, generated content is placed into the target directory as configured, and is written over existing files if any, without any warnings; however, as existing files within the target tree are not checked in any way, all files not (re)written by thalassa remain unchanged. This means, in particular, that if you change your ini files and/or the content database so that some of previously-existing content is no longer to be generated, it will remain in the target directory (from previous generator runs). If this is not what you want, use the -r mode instead.

For the -r mode, a temporary directory is created in the same upper-level directory where the configured target directory resides, and all the content is generated into this temporary directory. After that, the target directory is renamed to have a suffix .1; if a directory with this suffix already exists, it is in turn renamed to have the .2 suffix, and so on. Finally, the temporary directory is renamed to the name configured as the target.
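The renaming scheme described above can be illustrated with a short sketch. It is not thalassa's actual code; the function names, the temporary directory prefix and the generate callback are hypothetical, and only the renaming order follows the description.

```python
import os
import tempfile


def rotate_into_place(target: str, fresh: str) -> None:
    """Move `fresh` to `target`, shifting old copies to target.1, target.2, ..."""
    # Find the first unused numeric suffix.
    n = 1
    while os.path.exists(f"{target}.{n}"):
        n += 1
    # Shift from the oldest copy backwards so that no rename hits an
    # existing name: target.(n-1) -> target.n, ..., target.1 -> target.2
    for i in range(n - 1, 0, -1):
        os.rename(f"{target}.{i}", f"{target}.{i + 1}")
    if os.path.exists(target):
        os.rename(target, f"{target}.1")
    os.rename(fresh, target)


def regenerate(target: str, generate) -> None:
    """Generate into a sibling temporary directory, then swap it in."""
    parent = os.path.dirname(os.path.abspath(target))
    # The temporary directory lives next to the target, so the final
    # rename stays within one file system.
    fresh = tempfile.mkdtemp(prefix="_gen_", dir=parent)
    generate(fresh)   # hypothetical callback that writes the whole site
    rotate_into_place(target, fresh)
```

Note that tempfile.mkdtemp creates the directory readable by its owner only, so a real implementation would have to adjust the permissions before an HTTP server could serve the result.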

Using the -r mode is strongly recommended for running sites, as it reduces the window of possible unavailability and content inconsistency to the minimum possible (the time between two subsequent rename syscalls).

In the present version, both the -a and -g modes write all the files they generate in the most straightforward way: the files are opened with the O_WRONLY|O_CREAT|O_TRUNC flags and then written with the write syscall. In theory, this leaves the possibility for the HTTP server to see either an empty file or a partially generated version of it. This shouldn't really be a problem for low-load sites, but the higher the load, the higher the probability of such inconsistencies. This problem will likely be fixed in future versions of thalassa by generating every file under a temporary name and then renaming it to the desired name.
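The write-then-rename technique mentioned as a possible future fix looks roughly like the sketch below. This is a generic illustration of the technique, not code from thalassa; the function name and the temporary file prefix are hypothetical.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write `data` under a temporary name, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the same directory, so the rename
    # below does not cross a file system boundary.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp_", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Readers see either the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this approach a reader can never observe an empty or half-written file under the final name. As in the previous case, mkstemp creates the file with owner-only permissions, so a real generator would also need to set the desired mode.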

The following two additional options are recognized by the gen command:

-t target_dir overrides the target directory configured in the ini files;
-s turns on the spool directory use and the locking (see the spool directory section below).

Target specification

For the -g generation mode, a space- and/or comma-separated list of individual targets is accepted. Be sure to use quotes to make it a single argument if you use spaces.

Each individual target specification consists of a target type, target ID and (optionally) the item spec, separated by “=”. Some examples follow:

set=nodes=n35 means to generate the page n35 defined in the pageset named nodes;
set=blog means to generate all pages of the blog pageset;
list=videos — generate the whole list named videos (both list pages and item pages, if/as configured);
list=videos=v1 means to generate only the item page v1 of the videos list (list pages are not generated at all).

The following target types are supported: list, set, page, collection, genfile, binary, aliases. Also pageset is recognized as an alias for set, and bin may be used instead of binary.

Target type alone means to generate all targets of the type (e.g. all lists, or all collections).

For the page, genfile and binary types, no item specification is allowed because they don't contain any items; each aliases section is only generated as a whole, too. In the present version of thalassa, an item specification for a collection is allowed, but is completely ignored; this is subject to change in future versions.
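To make the syntax concrete, here is a small parser for a target list, written only from the rules given in this section. It is a sketch, not thalassa's own parser: the names Target and parse_targets are invented, and rejecting specifications with more than three “=”-separated parts is an assumption.

```python
import re
from typing import List, NamedTuple, Optional

TYPES = {"list", "set", "page", "collection", "genfile", "binary", "aliases"}
TYPE_ALIASES = {"pageset": "set", "bin": "binary"}
# Types for which an item specification is not allowed.
NO_ITEM = {"page", "genfile", "binary", "aliases"}


class Target(NamedTuple):
    kind: str
    target_id: Optional[str]   # None: all targets of this type
    item: Optional[str]        # None: the whole target


def parse_targets(target_list: str) -> List[Target]:
    """Split a space- and/or comma-separated list into Target tuples."""
    targets = []
    for spec in re.split(r"[ ,]+", target_list.strip(" ,")):
        if not spec:
            continue
        parts = spec.split("=")
        kind = TYPE_ALIASES.get(parts[0], parts[0])
        if kind not in TYPES:
            raise ValueError(f"unknown target type in {spec!r}")
        if len(parts) > 3:
            raise ValueError(f"too many '=' in {spec!r}")
        target_id = parts[1] if len(parts) > 1 else None
        item = parts[2] if len(parts) > 2 else None
        if item is not None and kind in NO_ITEM:
            raise ValueError(f"no item spec is allowed for {kind!r}")
        targets.append(Target(kind, target_id, item))
    return targets
```

For instance, parse_targets("set=nodes=n35, list=videos") yields one target for the page n35 of the nodes pageset and one for the whole videos list.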

Spool directory

Whenever the Thalassa CGI program needs to regenerate any of the site's pages, it launches the thalassa program, usually with the -g key, so that only the pages that have really changed actually get regenerated. The problem is that several users may change the same page at about the same time (e.g., by adding comments to it), so that two copies of thalassa try to regenerate it simultaneously.

In order to avoid conflicts, locking can be used. This is what the -s flag does.

If this flag is used together with the -g, thalassa acts as follows. First, it adds all targets from the -g's parameter to the spool directory; this doesn't require any mutual exclusion, because every target is added as a separate file, and the file creation operation is atomic. After that, thalassa tries to lock the spool directory by creating a lock file in it. In case the lock can't be set, the program simply exits; the targets should then get processed either by the instance that locked the directory, or by another instance that will run later.

In case locking is successful, thalassa does its best to process all the targets found in the spool directory; even if any new targets are added by other instances during this process, chances are (although there's no guarantee) they'll get processed as well.
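The protocol for -g together with -s can be sketched as follows. Everything specific here is an assumption made for the sake of illustration: the lock file name, the use of the target specification itself as the spool file name (which presumes it contains no “/”), the generate callback, and the choice to remove a spool entry before processing it. The real program may differ in all of these details.

```python
import os
from typing import Callable, Iterable

LOCK_NAME = ".lock"   # hypothetical lock file name


def enqueue(spool: str, targets: Iterable[str]) -> None:
    """Add every target to the spool as a separate (empty) file."""
    for target in targets:
        fd = os.open(os.path.join(spool, target),
                     os.O_WRONLY | os.O_CREAT, 0o644)
        os.close(fd)


def try_lock(spool: str) -> bool:
    """Try to create the lock file; O_EXCL makes the creation exclusive."""
    try:
        fd = os.open(os.path.join(spool, LOCK_NAME),
                     os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_with_spool(spool: str, targets: Iterable[str],
                   generate: Callable[[str], None]) -> bool:
    """Return False if another instance holds the lock, True otherwise."""
    enqueue(spool, targets)
    if not try_lock(spool):
        return False          # whoever holds the lock should process them
    try:
        while True:
            pending = [n for n in os.listdir(spool) if n != LOCK_NAME]
            if not pending:
                break
            for name in pending:
                # Remove the entry first, so a re-add made by another
                # instance during generation is noticed on the next pass.
                os.unlink(os.path.join(spool, name))
                generate(name)
    finally:
        os.unlink(os.path.join(spool, LOCK_NAME))
    return True
```

A target added between the last directory scan and the removal of the lock file stays in the spool until the next run, which matches the "no guarantee" remark above.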

The -s key may also be used together with either -a or -r. For -a, things are relatively simple: thalassa tries to lock the spool directory, and exits with an error message in case of failure. If the lock is successfully set, the program performs the requested full content regeneration, then also processes the targets from the spool directory, if any, and only after that releases the lock.

It may look a bit stupid to process any targets when the site has just been fully regenerated; however, the problem is that other instances of thalassa could add some targets to the spool directory after the corresponding pages were processed in the course of the full regeneration.

For the -r mode, the process is a bit more complicated. First, thalassa creates the temporary directory and generates all the content, placing it there. This doesn't require any locking. The program tries to establish the lock only right before it starts with directory renaming, and in case acquiring the lock fails, the program quits with the appropriate error message. Please note the temporary directory is not cleared in this case; it remains where the program created it, so you might want to remove it manually.

In case the locking is successful, thalassa performs all the directory renaming as necessary, then also processes the targets from the spool directory (just as it does in the -a case) and releases the lock.

Listing configured objects

The thalassa list command is not yet fully implemented, so don't expect much from it.

We intentionally don't document it here as it is obviously not ready for practical use. This situation is to be fixed in the near future.

Content database manipulation

Headed text files used in the content database can always be edited manually, with your favorite text editor (well, we like vim, what about you?). However, it is convenient to have a tool to fill in fields like id:, unixtime: and especially teaser_len: for you; also, it is often necessary to set or remove a tag, a label or a flag, individually or in bulk, and sometimes things like this are to be performed from a script.

This is what the inspect and update commands are for.

The `inspect` command

The thalassa inspect command prints the header of a headed text file and, optionally, some lines of its body. The command is used like this:

    thalassa [...] inspect <filespec> [<options>]

The <filespec> argument is mandatory and may be either a file path or a Thalassa object specification; see the File specification parameter section below for the parameter's description.

The command recognizes the following options:

-b N — show the first N lines of the body;
-B — show the whole body;
-v — output verbose messages.

The `update` command

The thalassa update command lets you change header field values of a headed text file. The command is used like this:

    thalassa [...] update <filespec> [<options>]

The <filespec> argument is mandatory and may be either a file path or a Thalassa object specification; see the File specification parameter section below for the parameter's description.

If no options are given, the command does the following:

sets the id: field value according to the <filespec> argument; this is skipped only when the program can't guess the ID value from the path you gave, which is a rare situation;
checks for the unixtime: field and if it is not there or has the zero value, sets it to the current time.

The following options are recognized:

-n — dry run: show what would be done but don't do it;
+T <tag> — add the tag;
-T <tag> — remove the tag;
+F <flag> — add the flag;
-F <flag> — remove the flag;
+L <label> — add the label;
-L <label> — remove the label;
-D <unixtime> — use the <unixtime> instead of the current time;
-d — update the “unixtime:” field even if it is already there;
-t — update the “teaser_len:” field (see below);
-i — suppress updating of the “id:” field;
-b <suffix> — set the backup file suffix (default: “~”);
-B — don't create backup files;
-v — print verbose messages.

The “teaser_len:” update works as follows. If the “descr:” field is set and is not empty, the “teaser_len:” header field is removed if it was there, and nothing else is done. If there's no “descr:” field or it is empty, the body of the file is searched for the “” string (literally that; no extra spaces or anything else are allowed) and its position is used as the new value for the “teaser_len:” field. In case the string isn't found within the body, the field is set to the length of the body.
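The rule can be restated as a small function. This is a sketch only: the function name is invented, the marker string is passed in as a parameter rather than hard-coded, and positions are counted in characters of a Python string, whereas the real tool may well count bytes.

```python
from typing import Optional


def new_teaser_len(descr: Optional[str], body: str,
                   marker: str) -> Optional[int]:
    """Return the new teaser_len value, or None if the field is to be removed."""
    if not marker:
        raise ValueError("the marker must not be empty")
    if descr:
        # descr: is set and not empty, so teaser_len: is dropped.
        return None
    pos = body.find(marker)
    # Position of the marker if present, else the length of the body.
    return pos if pos >= 0 else len(body)
```

The None result stands for "remove the field", so a caller has to treat it differently from the value 0.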

File specification parameter

Both inspect and update commands require the file to be specified either by its path, or as a Thalassa object.

If the argument contains at least one “/”, it is taken as a path to an existing file; the Thalassa database is not even loaded in this case. If the argument contains at least one “=”, it is considered to be a Thalassa object specification, which must specify either a pageset page ID or a comment ID (for either a list item page or a pageset page). If there's neither a “/” nor a “=” in the argument (which is not recommended; it is better to use ./name for files and setID=pgID for pageset pages), Thalassa first tries a file in the current working directory, and if there's no such file or it is not a regular file, it tries to load the database and use the given name as a pageset page ID (which only works if you've got exactly one pageset).
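The decision procedure above can be summarized in a few lines. The sketch assumes the checks are applied in the order in which the text lists them (so an argument containing both “/” and “=” counts as a path); the function name and the returned labels are invented for this illustration.

```python
import os


def classify_filespec(arg: str, cwd: str = ".") -> str:
    """Tell how a <filespec> argument would be interpreted."""
    if "/" in arg:
        return "path"      # taken as a file path, the database is not loaded
    if "=" in arg:
        return "object"    # a Thalassa object specification
    # Neither "/" nor "=": try a regular file in the working directory first.
    if os.path.isfile(os.path.join(cwd, arg)):
        return "path"
    return "page-id"       # fall back to a pageset page ID
```

Writing ./name or setID=pgID, as recommended, avoids the last two branches altogether.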

Thalassa object specifications have the form

[<type>=[<realmID>=]]<pageID>[=<commentID>]

where <type> is one of “list”, “set” or “pageset” (the latter two are equivalent), but is in most cases simply omitted; <realmID> is an ID of either a list or a pageset; <pageID> is either a pageset page ID, or an item ID within a list with item pages; <commentID> is, well, the comment ID. Please note that if the <commentID> is omitted, the type must not be “list” and the realmID must not be a list ID, because list item pages don't have source files, and both update and inspect only work with headed-text source files. The <realmID> may only be omitted if you have exactly one realm: either exactly one pageset and no lists, or no pagesets and exactly one list. If you specify a pair like XXX=YYY, it is considered to be pageID=commentID if and only if you have exactly one realm; otherwise it is interpreted as setID=pageID.
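The rule for two-part specifications can be written down as follows. The sketch covers only the XXX=YYY case stated in the last sentence and does not try to recognize a leading type keyword; the function name, the realm_count parameter and the dictionary keys are hypothetical.

```python
from typing import Dict


def interpret_pair(spec: str, realm_count: int) -> Dict[str, str]:
    """Interpret XXX=YYY given the number of configured realms."""
    left, sep, right = spec.partition("=")
    if not sep or not left or not right or "=" in right:
        raise ValueError(f"not a two-part specification: {spec!r}")
    if realm_count == 1:
        # Exactly one realm: the pair is pageID=commentID.
        return {"page": left, "comment": right}
    # Otherwise the pair is setID=pageID.
    return {"realm": left, "page": right}
```

For example, with two pagesets configured, blog=p12 refers to the page p12 of the blog pageset, while with a single realm the same string names the comment p12 of the page blog.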
