Generally, the thalassa program invocation has the following form:

thalassa [common_options] <command> [options]

where command is one of:
help: get help;
show: show the program's properties;
gen: generate content;
list: list configured/existing objects (PARTIALLY IMPLEMENTED);
update: update a page or comment file;
inspect: show a page or comment file's content.

The following common options are recognized:
-c dir: change to the given directory before start;
-o selector: force option selector (overrides the opt_selector parameter);
-i ini_file: load the given ini file; the “-i” option may be given multiple
times; if none is given, the default “thalassa.ini” is used.

Command-specific options are described together with each command, below.
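The ordering of arguments matters: common options come before the command name, command-specific options after it. As a minimal sketch for scripts that launch the program, the helper below assembles such an argument list. The function name build_argv and all paths are hypothetical; only the -c, -o and -i options come from the description above.

```python
def build_argv(command, command_args=(), chdir=None, selector=None, ini_files=()):
    """Assemble an argument list: common options first, then the command."""
    argv = ["thalassa"]  # assumes the binary is reachable through PATH
    if chdir is not None:
        argv += ["-c", chdir]
    if selector is not None:
        argv += ["-o", selector]
    for ini in ini_files:  # "-i" may be repeated, once per ini file
        argv += ["-i", ini]
    argv.append(command)
    argv.extend(command_args)
    return argv


if __name__ == "__main__":
    # Hypothetical site directory and ini file names.
    print(build_argv("gen", ["-g", "set=blog"],
                     chdir="/srv/site", ini_files=["main.ini", "extra.ini"]))
```

The resulting list can be passed as is to subprocess.run, which avoids shell quoting problems with target lists that contain spaces.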
When run with no arguments, thalassa prints the same message as
thalassa help does, but to stderr rather than stdout.
The help and show commands don't actually do anything; they only print text
messages built into the thalassa binary. Common options are still accepted
but fully ignored by these commands; when run with one of these commands,
thalassa doesn't attempt to load any ini (or any other) files.
help command

When invoked without additional arguments, thalassa help prints the program
name, version and the general help message, which includes information on
available commands and common options.

The command accepts a single optional argument, a command name; when run
with such an argument, as in thalassa help gen, it displays the
command-specific help text.
show command

When run with no additional arguments, the command displays the same help
message as thalassa help show does. Two subcommands are recognized:

thalassa show version: prints version information;
thalassa show encodings (or just enc): displays supported encoding names
and aliases.

gen command

thalassa gen
does what thalassa is made for: it generates the site's content.
Command-specific options are required for the gen command; when run with no
additional arguments, thalassa gen prints an error message, displays the
command-specific help text (the same as for thalassa help gen, but to
stderr instead of stdout) and exits.
One and only one of the following options must be given:

-a (for all): generate everything;
-r: regenerate the site;
-g target_list: generate the given targets; see the Target specification
section below.

For both the -a and -g modes, generated content is placed into the target
directory as configured, and overwrites existing files, if any, without
warning; however, since existing files within the target tree are not
checked in any way, all files not (re)written by thalassa remain unchanged.
This means, in particular, that if you change your ini files and/or the
content database so that some previously existing content is no longer
generated, it will remain in the target directory (from previous generator
runs). If this is not what you want, use the -r mode instead.
For the -r mode, a temporary directory is created in the same upper-level
directory where the configured target directory resides, and all the
content is generated into this temporary directory. After that, the target
directory is renamed to have the suffix .1; if a directory with this suffix
already exists, it is in turn renamed to have the .2 suffix, and so on.
Finally, the temporary directory is renamed to the name configured as the
target.
Using the -r mode is strongly recommended for running sites, as it reduces
the time of possible unavailability and content inconsistency to the
absolute minimum (the time between two subsequent rename syscalls).
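The renaming scheme of the -r mode can be sketched as follows. This is only an illustration of the description above, not the program's actual code: the function name rotate_into_place is hypothetical, and the order in which older copies are shifted (highest suffix first, so that nothing gets overwritten) is an assumption.

```python
import os


def rotate_into_place(target, fresh):
    """Make `fresh` the new `target`, keeping old copies as target.1, target.2, ..."""
    if os.path.exists(target):
        # Find the first unused numeric suffix.
        n = 1
        while os.path.exists(f"{target}.{n}"):
            n += 1
        # Shift existing backups upwards, starting from the highest one.
        for i in range(n - 1, 0, -1):
            os.rename(f"{target}.{i}", f"{target}.{i + 1}")
        # The site is unavailable only between the next two rename calls.
        os.rename(target, f"{target}.1")
    os.rename(fresh, target)
```

Because the temporary directory is created next to the target, both names are on the same filesystem, and each rename is a single atomic operation.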
In the present version, both the -a and -g modes write all files they
generate in the most straightforward way: the files are opened with the
O_WRONLY|O_CREAT|O_TRUNC flags and then written with the write syscall. In
theory, this leaves the possibility for the HTTP server to see either an
empty file or a partially generated version of it. This shouldn't really be
a problem for low-load sites, but the higher the load, the higher the
probability of such inconsistencies. This problem will likely be fixed in
future versions of thalassa by generating every file under a temporary name
and then renaming it to the desired name.
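The "temporary name, then rename" technique mentioned above can be sketched like this. It is not what thalassa currently does; the helper name write_atomically, the temporary file prefix and the 0644 file mode are all assumptions made for the example.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` (bytes) to `path` so that readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename would not be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # mkstemp creates the file with mode 0600; 0644 is an assumed mode
        # that lets a web server read the result.
        os.chmod(tmp, 0o644)
        os.replace(tmp, path)  # atomically replaces an existing file
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With this approach a concurrent reader gets either the complete old file or the complete new one, never an empty or half-written version.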
The following two additional options are recognized by the gen command:

-t target_dir: overrides the target directory configured in the ini files;
-s: turns on the spool directory use and the locking (see the spool
directory section below).

Target specification

For the -g generation mode, a space- and/or comma-separated list of
individual targets is accepted. Be sure to use quotes to make it a single
argument if you use spaces.
Each individual target specification consists of a target type, a target ID
and (optionally) an item spec, separated by “=”. Some examples follow:

set=nodes=n35 means to generate the page n35 defined in the pageset named
nodes;
set=blog means to generate all pages of the blog pageset;
list=videos means to generate the whole list named videos (both list pages
and item pages, if/as configured);
list=videos=v1 means, for the videos list, to generate the item page v1
only (list pages are not generated at all).

The following target types are supported: list, set, page, collection,
genfile, binary, aliases. Also, pageset is recognized as an alias for set,
and bin may be used instead of binary.
Target type alone means to generate all targets of the type (e.g. all lists, or all collections).
For the page, genfile and binary types, no item specification is allowed
because they don't contain any items; each aliases section is only
generated as a whole, too. In the present version of thalassa, an item
specification for a collection is allowed, but is completely ignored; this
is subject to change in future versions.
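To make the target syntax concrete, here is a minimal sketch of a parser for a -g target list. It follows the rules described above, but it is not the program's own parser: the function name parse_targets and the exact error handling are assumptions.

```python
import re

TYPE_ALIASES = {"pageset": "set", "bin": "binary"}
TARGET_TYPES = {"list", "set", "page", "collection", "genfile", "binary", "aliases"}
# Types for which the text above says an item spec is not allowed.
NO_ITEM_TYPES = {"page", "genfile", "binary", "aliases"}


def parse_targets(target_list):
    """Split a space- and/or comma-separated list into (type, id, item) tuples."""
    targets = []
    for spec in re.split(r"[,\s]+", target_list.strip()):
        if not spec:
            continue
        parts = spec.split("=")
        if len(parts) > 3:
            raise ValueError(f"too many '=' in target: {spec}")
        ttype = TYPE_ALIASES.get(parts[0], parts[0])
        if ttype not in TARGET_TYPES:
            raise ValueError(f"unknown target type: {parts[0]}")
        target_id = parts[1] if len(parts) > 1 else None  # None: all of the type
        item = parts[2] if len(parts) > 2 else None
        if item is not None and ttype in NO_ITEM_TYPES:
            raise ValueError(f"item spec not allowed for type {ttype}: {spec}")
        targets.append((ttype, target_id, item))
    return targets


if __name__ == "__main__":
    print(parse_targets("set=nodes=n35, list=videos bin"))
```

A type given alone yields a tuple whose ID and item are both None, which corresponds to "generate all targets of the type".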
Spool directory

Whenever the Thalassa CGI program needs to regenerate any of the site's
pages, it launches the thalassa program, usually with the -g key, so that
only the pages that have really changed actually get regenerated. The
problem is that users may change the same page at the same time (e.g., by
adding comments to it), so that two copies of thalassa will try
regenerating it.

In order to avoid conflicts, locking can be used. This is what the -s flag
does.
If this flag is used together with -g, thalassa acts as follows. First, it
adds all targets from the -g parameter to the spool directory; this doesn't
require any mutual exclusion, because every target is added as a separate
file, and the file creation operation is atomic. After that, thalassa tries
to lock the spool directory by creating a lock file in it. In case the lock
can't be set, the program simply exits; the targets should then get
processed either by the instance that locked the directory, or by another
instance that will run later.

In case locking is successful, thalassa does its best to process all the
targets found in the spool directory; even if any new targets are added by
other instances during this process, chances are (although there's no
guarantee) they'll get processed as well.
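The spool-and-lock protocol described above might look roughly like the sketch below. Everything specific in it is an assumption: the lock file name, the use of the target string as the spool file name, the exclusive-create call used for locking, and the function names. It only illustrates the idea of "add targets without locking, then let a single instance drain the spool".

```python
import os

LOCK_NAME = ".lock"  # hypothetical lock file name


def add_to_spool(spool_dir, target):
    """Register a target as a separate file; no mutual exclusion is needed."""
    fd = os.open(os.path.join(spool_dir, target), os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)


def try_lock(spool_dir):
    """Try to create the lock file; O_EXCL makes creation fail if it exists."""
    try:
        fd = os.open(os.path.join(spool_dir, LOCK_NAME),
                     os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def process_spool(spool_dir, handle):
    """Drain the spool under the lock; return False if another instance holds it."""
    if not try_lock(spool_dir):
        return False
    try:
        while True:
            pending = sorted(n for n in os.listdir(spool_dir) if n != LOCK_NAME)
            if not pending:
                break
            for name in pending:
                # Remove the entry first, so that a target re-added while it is
                # being handled is picked up again on the next pass.
                os.unlink(os.path.join(spool_dir, name))
                handle(name)
    finally:
        os.unlink(os.path.join(spool_dir, LOCK_NAME))
    return True
```

Re-scanning the directory until it is empty is what gives late additions a good chance, though no guarantee, of being processed by the instance that holds the lock.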
The -s key may also be used together with both -a and -r. For -a, things
are relatively simple: thalassa tries to lock the spool directory, and
exits with an error message in case of failure. If the lock is successfully
set, the program performs the requested full content regeneration, then
also processes the targets from the spool directory, if any, and only after
that releases the lock.

It may look a bit stupid to process any targets when the site has just been
fully regenerated; however, the problem is that other instances of thalassa
could add some targets to the spool directory after the corresponding pages
were processed in the course of the full regeneration.
For the -r mode, the process is a bit more complicated. First, thalassa
creates the temporary directory and generates all the content, placing it
there. This doesn't require any locking. The program tries to establish the
lock only right before it starts the directory renaming, and in case
acquiring the lock fails, the program quits with an appropriate error
message. Please note that the temporary directory is not cleared in this
case; it remains where the program created it, so you might want to remove
it manually.

In case the locking is successful, thalassa performs all the directory
renaming as necessary, then also processes the targets from the spool
directory (just like it does in the -a case) and releases the lock.
The thalassa list command is not yet fully implemented, so don't expect
much from it. We intentionally don't document it here, as it is obviously
not ready for practical use. This situation is to be fixed in the near
future.
Headed text files used in the content database can always be edited
manually, with your favorite text editor (well, we like vim, what about
you?). However, it is convenient to have a tool that fills in fields like
id:, unixtime: and especially teaser_len: for you; also, it is often
necessary to set or remove a tag, a label or a flag, individually or in
bulk, and sometimes things like this have to be performed from a script.
This is what the inspect and update commands are for.
inspect command

The thalassa inspect command prints the header of a headed text file and,
optionally, some lines of its body. The command is used like this:

thalassa [...] inspect <filespec> [<options>]

The <filespec> argument is mandatory and may be either a file path or a
Thalassa object specification; see the File specification parameter section
below for the parameter's description.

The command recognizes the following options:

-b N: show the first N lines of the body;
-B: show the whole body;
-v: output verbose messages.

update command

The thalassa update
command makes changes to the header field values of a headed text file. The
command is used like this:

thalassa [...] update <filespec> [<options>]

The <filespec> argument is mandatory and may be either a file path or a
Thalassa object specification; see the File specification parameter section
below for the parameter's description.
If no options are given, the command does the following:

updates the id: field value according to the <filespec> argument; this is
skipped only if the program can't guess the ID value from the path you
gave, which is a rare situation;
checks the unixtime: field and, if it is not there or has the zero value,
sets it to the current time.

The following options are recognized:
-n: dry run (show what would be done, but don't do it);
+T <tag>: add the tag;
-T <tag>: remove the tag;
+F <flag>: add the flag;
-F <flag>: remove the flag;
+L <label>: add the label;
-L <label>: remove the label;
-D <unixtime>: use <unixtime> instead of the current time;
-d: update the “unixtime:” field even if it is already there;
-t: update the “teaser_len:” field (see below);
-i: suppress updating of the “id:” field;
-b <suffix>: set the backup file suffix (default: “~”);
-B: don't create backup files;
-v: print verbose messages.

The “teaser_len:” update works as follows. If the
“descr:” field is set and is not empty, the “teaser_len:” header field is
removed if it was there, and nothing else is done. If there's no “descr:”
field or it is empty, the body of the file is searched for the
“<!--break-->” string (literally that; no extra spaces or anything else is
allowed) and its position is used as the new value for the “teaser_len:”
field. In case the string isn't found within the body, the field is set to
the length of the body.
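The “teaser_len:” rule can be expressed in a few lines. The sketch below models the header as a plain dictionary without the trailing colons in field names, and counts positions in characters of a Python string; both are simplifying assumptions (the real program may well count bytes), and the function name update_teaser_len is hypothetical.

```python
BREAK_MARK = "<!--break-->"


def update_teaser_len(header, body):
    """Return a copy of `header` with teaser_len set or removed per the rule above."""
    result = dict(header)
    if result.get("descr"):
        # A non-empty description is present: the teaser length is not needed.
        result.pop("teaser_len", None)
        return result
    pos = body.find(BREAK_MARK)
    # Marker position if found, otherwise the length of the whole body.
    result["teaser_len"] = str(pos if pos >= 0 else len(body))
    return result
```

For a body of "intro<!--break-->rest" and no description, the field becomes 5, the offset at which the marker starts.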
File specification parameter

Both the inspect and update commands require the file to be specified
either by its path, or as a Thalassa object.

The <filespec> may be either a file path or a Thalassa object
specification. In case the argument contains at least one “/”, it is taken
as a path to an existing file; the Thalassa database is not even loaded in
this case. If the argument contains at least one “=”, it is considered to
be a Thalassa object specification, which must specify either a pageset
page ID, or a comment ID for either a list item page or a pageset page. In
case there's neither a “/” nor a “=” in the argument (which is not
recommended; it is better to use ./name for files and setID=pgID for
pageset pages), Thalassa will first try a file in the current working
directory, and if there's no such file or it is not a regular file, will
try to load the database and use the given name as a pageset page ID (which
only works if you've got exactly one pageset).
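The decision procedure for <filespec> can be summarized by the sketch below. The function name classify_filespec and its parameters are hypothetical, the list of pageset IDs stands in for the loaded database, and giving “/” precedence over “=” when both are present is an assumption based on the order of the rules above.

```python
import os


def classify_filespec(arg, cwd=".", pageset_ids=()):
    """Return ("path", p) or ("object", spec) for a <filespec> argument."""
    if "/" in arg:
        return ("path", arg)    # taken as a file path; no database needed
    if "=" in arg:
        return ("object", arg)  # taken as a Thalassa object specification
    # Neither "/" nor "=": try a regular file in the working directory first.
    candidate = os.path.join(cwd, arg)
    if os.path.isfile(candidate):
        return ("path", candidate)
    # Fall back to a pageset page ID, which needs exactly one pageset.
    if len(pageset_ids) == 1:
        return ("object", f"{pageset_ids[0]}={arg}")
    raise ValueError(f"cannot resolve file specification: {arg}")
```

Writing ./name or setID=pgID explicitly makes the first two branches apply, so the outcome never depends on which files happen to exist in the current directory.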
Thalassa object specifications have the form

[<type>=[<realmID>=]]<pageID>[=<commentID>]

where <type> is one of “list”, “set” or “pageset” (the latter two are
equivalent), but is in most cases simply omitted; <realmID> is an ID of
either a list or a pageset; <pageID> is either a pageset page ID, or an
item ID within a list with item pages; <commentID> is, well, the comment
ID. Please note that in case the <commentID> is omitted, the type must not
be “list” and the realmID must not be a list ID, because list item pages
don't have source files, and both update and inspect only work with
headed-text source files. The <realmID> may only be omitted in case you
have only one realm: either exactly one pageset and no lists, or no
pagesets and exactly one list. If you specify a pair like XXX=YYY, it is
considered to be pageID=commentID if and only if you have exactly one
realm; otherwise it is interpreted as setID=pageID.
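The rule for two-part specifications is easy to get wrong, so here is a small sketch of it. The function name interpret_pair and the dictionary layout are hypothetical; realm_ids stands for the IDs of all configured lists and pagesets.

```python
def interpret_pair(spec, realm_ids):
    """Interpret a two-part XXX=YYY specification as described above."""
    parts = spec.split("=")
    if len(parts) != 2:
        raise ValueError(f"expected exactly one '=' in: {spec}")
    first, second = parts
    if len(realm_ids) == 1:
        # Exactly one realm: the pair means pageID=commentID.
        return {"realm": realm_ids[0], "page": first, "comment": second}
    # Otherwise the pair means setID=pageID, with no comment.
    return {"realm": first, "page": second, "comment": None}
```

So the same string, say blog=p1, names a comment on a site with a single realm and a page on a site with several; spelling out the realm ID avoids the ambiguity.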