Generally speaking, a macro is a kind of rule for replacing one text with another; a macro call (which is often confused with the macro itself) is a (usually small) portion of text which gets automatically replaced with some other text according to the rule. This replacement process is called macro expansion, and the piece of program code which performs the expansion is called a macroprocessor.
Macros are used heavily in Thalassa CMS configuration files. This is no surprise, because what Thalassa actually does is turn one set of text files (your sources) into another set of text files (your site content).
From the very start it is important to keep in mind that macro expansion is only done in the ini files, and even there not everywhere: only within the values of some (honestly, most) parameters. The documentation for every ini file section explicitly states whether macro expansion is done in all of its parameters, only some of them, or none.
Strictly speaking, all macros in Thalassa are built-in, in the sense that you can't add new macros without hacking the source code of Thalassa itself. However, you can add various snippets, templates, options and other things accessible via the existing macros, and some of them accept parameters, so you can effectively achieve the same goals as if you could define your own macros.
There are basic macros, which are available both for thalassa and thalcgi.cgi; there are also macros specific to each of them; and there are even some macros local to a particular configuration file section or a particular parameter.
Wherever macro expansion is performed, it uses the percent char “%” as the “escape” character. This means that once the macroprocessor sees the percent char, it expects a macro call right after it. So, if you just need the percent char itself, you must double it, like this: %% (but only when you write text which will be passed through the macroprocessor). Remember, not all parameter values go through the macroprocessor, so if in any doubt, be sure to take a look at the documentation for the particular configuration section.
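When literal text is generated by a script before it lands in a macro-expanded parameter, the doubling rule can be applied mechanically. The following is a minimal Python sketch; the helper name escape_percent is our own invention and is not part of Thalassa.

```python
def escape_percent(text: str) -> str:
    """Double every percent sign so that a macroprocessor using '%'
    as its escape character emits it literally."""
    return text.replace("%", "%%")


print(escape_percent("width: 100%"))  # prints: width: 100%%
```

Such a helper should only be applied to text destined for parameters that really are macro-expanded; in any other parameter the doubled percent signs would stay doubled.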
Thalassa uses the macroprocessor implemented by the ScriptPlusPlus library (see http://www.croco.net/software/scriptpp/). This library provides three flavors of macro calls: simple, nesting and lazy. For a macro named foobar, the simple form of the call is written as %foobar%, the nesting form is %[foobar], and the lazy form is %{foobar}.
Before we discuss the difference between them, we need to introduce macro arguments. It is relatively rare for a macro to be called just by its name without any additional information. Sometimes this does happen; for example, the macro named now returns the current time as a Unix datetime value (a decimal integer equal to the number of seconds passed since Jan 01, 1970). We can call it with either %now% or %[now]; in this (very simple) case it makes no difference. Another example of such a “parameter-less” macro is message, which is only available in thalcgi.ini and represents a short message describing the result of the action the user requested and the server has just performed (or at least tried to). Again, we can write either %message% or %[message]; there's no difference.
It is also possible to write %{now} or %{message}, and both will (very likely) work, but this is strongly discouraged and may lead to serious security-related problems. Please don't use the “lazy” flavor of macro calls unless you're absolutely sure you know what you are doing.
In the whole Thalassa CMS there are no other macros that take no arguments besides these two (strictly speaking, there are also some such macros specific to particular configuration parameters, but let's ignore them for now, as they aren't even listed in the macro references; they are only mentioned in the descriptions of the respective parameters). All the other macros accept one or more arguments. For example, the macro named ltgt accepts exactly one argument, an arbitrary string, and returns the same string, but with the characters “<”, “>” and “&” replaced with, respectively, “&lt;”, “&gt;” and “&amp;”; so, the macro call %[ltgt:3 < pi < 4] will be replaced with “3 &lt; pi &lt; 4”. Exactly the same will happen if we write the call in the simple form: %ltgt:3 < pi < 4%
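To make the effect of ltgt concrete, here is a small Python sketch of the same kind of replacement. It assumes that ltgt performs the usual HTML escaping of the three characters; the order of replacements and the function body are our illustration, not Thalassa's actual code.

```python
def ltgt(text: str) -> str:
    """Toy model of the ltgt macro: escape '&', '<' and '>' for HTML."""
    # '&' must be handled first; otherwise the ampersands introduced by
    # "&lt;" and "&gt;" would themselves be escaped a second time.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


print(ltgt("3 < pi < 4"))  # prints: 3 &lt; pi &lt; 4
```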
In this example, a colon “:” is used as the delimiter between the name of the macro and its argument, but this is not necessarily so. In the macro system we discuss here, a macro name can only consist of alphanumeric chars ('a'..'z', 'A'..'Z', '0'..'9'), the underscore “_” and the asterisk “*”. When the macroprocessor analyses a macro call, the first char it sees which doesn't belong to this set is taken as the delimiter. In examples throughout this documentation, as well as in the example sites provided along with Thalassa, we usually delimit with a colon, but sometimes we have to use something different, like “|” (in case the colon is to be present within one of the arguments). However, you can use whatever punctuation character you like.
WARNING: don't use non-ASCII characters as delimiters, especially if you use UTF-8. Problems are almost guaranteed if you do. It is also not a good idea to use whitespace chars in this role.
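The delimiter rule can be illustrated with a few lines of Python. This is only a sketch of the rule as described above, not the ScriptPlusPlus parser: it works on the inside of a single call, ignores nested calls within arguments, and the macro name foo in the second example is hypothetical.

```python
import string

# Characters allowed in a macro name, as listed in the text above.
NAME_CHARS = set(string.ascii_letters + string.digits + "_*")


def split_call(body: str) -> tuple[str, list[str]]:
    """Split the inside of a macro call into (name, arguments)."""
    for i, ch in enumerate(body):
        if ch not in NAME_CHARS:
            # The first non-name character is the delimiter for this call.
            return body[:i], body[i + 1:].split(ch)
    return body, []  # no delimiter at all: a call without arguments


print(split_call("ltgt:3 < pi < 4"))  # ('ltgt', ['3 < pi < 4'])
print(split_call("foo|12:30|x"))      # ('foo', ['12:30', 'x'])
print(split_call("now"))              # ('now', [])
```

The second call shows why a different delimiter is sometimes needed: with “|” chosen as the delimiter, the colon inside 12:30 stays part of the argument.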
It is very common for the result of one macro to be passed to another macro as one of its arguments. For example, the macro named rfcdate turns a Unix datetime value into a human-readable form as defined by RFC 2822, something like “29 Mar 2023 19:15:00 +0000”. So, to get the current date and time, we need to write “%[rfcdate:%[now]]”. It is obvious that the simple form doesn't work here: we can write “%[rfcdate:%now%]” and it will work, but if we try “%rfcdate:%now%%”, the macro rfcdate will receive an empty argument (and hence will fail), while the now macro will not be called at all.
Macro calls in their simple form work a bit faster, so sometimes it makes sense to use the simple form, but it obviously doesn't work even for the shortest compositions of calls, like the one we've just discussed. This is why the nesting flavor of macro calls is there.
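In ordinary programming terms, the nested call %[rfcdate:%[now]] is just function composition. The Python sketch below models both macros as functions on strings; the function bodies are our stand-ins built on the standard library, and Python's formatter also prints the day of the week, so the exact output of Thalassa's rfcdate may differ.

```python
import time
from datetime import datetime, timezone
from email.utils import format_datetime


def now() -> str:
    """Toy model of the now macro: Unix time as a decimal string."""
    return str(int(time.time()))


def rfcdate(unix_time: str) -> str:
    """Toy model of rfcdate: format a Unix time in RFC 2822 style (UTC)."""
    moment = datetime.fromtimestamp(int(unix_time), tz=timezone.utc)
    return format_datetime(moment)


print(rfcdate("1680117300"))  # prints: Wed, 29 Mar 2023 19:15:00 +0000
print(rfcdate(now()))         # the current date and time
```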
Before we start discussing the last macro call flavor, the lazy one, we'd like to repeat our warning one more time. It is almost always possible to avoid lazy macro calls, and it is highly recommended that you don't use them at all. You can just skip the text until the next heading, and stay safe. If you don't understand why we give this recommendation, it means you can accidentally introduce a security hole into your site's implementation and remain unaware of it until a catastrophe happens. You've been warned.
When the macroprocessor performs macro expansion, it does precisely the following. First, it analyses the macro call and determines where the call starts and where it ends; for nested calls this is not as easy as it may sound. The next step is to break the macro call down using the delimiter chars, so that the macroprocessor knows the name of the macro and all its arguments. For the simple flavor of macro calls, we're almost done: composition is impossible here, so the macroprocessor just calls the function corresponding to the extracted macro name and passes all the arguments to it as an array of strings. The function computes the desired result of macro expansion and returns it, and the macroprocessor appends this result to the text being composed.
For the nesting flavor, things are different, as each of the arguments may contain further macro calls. So in this case the macroprocessor has to effectively create another instance of itself and use that instance to process each of the arguments; only then can it call the corresponding function, this time passing it an array composed of the processing results for each argument instead of the arguments themselves.
Lazy calls work in a completely different way. Once the macroprocessor has the arguments, it immediately calls the corresponding function, passing it the array of raw (unprocessed) arguments. But once the function returns the result, this result is processed again before it goes to the target text.
Consider for example the macro named readfile. It takes a file name as the argument and reads the file; used with the simple or nesting flavor of calls, this allows you to insert a whole file's content into your parameter (e.g. into an HTML page being generated). If you use readfile in a lazy macro call, you'll be unable to compute the name of the file, so you have to know it in advance, but this might not be a problem. The real problems may arise from the fact that in this case the file's content will get processed by the macroprocessor, so, should there be pieces of text looking like macro calls, they will get macro-expanded. Again, this might be no problem if the file is written by you and you're sure there's nothing wrong in it. It is even possible that you decide to do this intentionally. But in case the file you read this way may (at least in theory) be modified by someone else, then your security hole is ready to serve: everyone who can technically supply such a file to your site's implementation can do whatever can be done with Thalassa CMS macros, and that's a lot of things. It is definitely more than you'd want to allow arbitrary people to do on your server.
Once again: if all this sounds complicated to you, simply don't use the lazy flavor of macro calls, and that's all.
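For readers who prefer code to prose, the three expansion strategies described in this section can be modelled with a toy macroprocessor in Python. This is a sketch written for illustration only and is not how ScriptPlusPlus is implemented: error handling is minimal, the upper macro is hypothetical, now returns a fixed value, and readfile reads from an in-memory dictionary instead of the file system.

```python
import string

NAME_CHARS = set(string.ascii_letters + string.digits + "_*")


def find_close(text, start, open_ch, close_ch):
    """Index of the bracket closing a call whose body starts at `start`."""
    depth, i = 1, start
    while i < len(text):
        if text.startswith("%" + open_ch, i):
            depth, i = depth + 1, i + 2
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
            i += 1
        else:
            i += 1
    raise ValueError("unterminated macro call")


def split_args(text, delim):
    """Split on `delim`, skipping delimiters inside nested %[..] or %{..}."""
    parts, current, depth, i = [], "", 0, 0
    while i < len(text):
        if text.startswith(("%[", "%{"), i):
            depth, current, i = depth + 1, current + text[i:i + 2], i + 2
            continue
        ch = text[i]
        if ch in "]}" and depth > 0:
            depth -= 1
        if ch == delim and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
        i += 1
    parts.append(current)
    return parts


def split_call(body):
    """Split a call body into (name, raw arguments)."""
    for i, ch in enumerate(body):
        if ch not in NAME_CHARS:
            return body[:i], split_args(body[i + 1:], ch)
    return body, []


def expand(text, macros):
    out, i = [], 0
    while i < len(text):
        if text[i] != "%":                      # ordinary character
            out.append(text[i])
            i += 1
        elif text.startswith("%%", i):          # doubled percent
            out.append("%")
            i += 2
        elif text.startswith("%[", i):          # nesting call
            end = find_close(text, i + 2, "[", "]")
            name, args = split_call(text[i + 2:end])
            args = [expand(a, macros) for a in args]   # arguments first
            out.append(macros[name](*args))            # result used as is
            i = end + 1
        elif text.startswith("%{", i):          # lazy call
            end = find_close(text, i + 2, "{", "}")
            name, args = split_call(text[i + 2:end])
            result = macros[name](*args)               # raw arguments
            out.append(expand(result, macros))         # result re-expanded
            i = end + 1
        else:                                   # simple call
            end = text.index("%", i + 1)
            name, args = split_call(text[i + 1:end])
            out.append(macros[name](*args))
            i = end + 1
    return "".join(out)


FILES = {"note.txt": "time is %[now]"}          # stand-in for real files
macros = {
    "now": lambda: "1680117300",                # fixed value for the demo
    "upper": lambda s: s.upper(),               # hypothetical macro
    "readfile": lambda name: FILES[name],
}

print(expand("%[upper:at %[now]]", macros))     # AT 1680117300
print(expand("%upper:at %now%%", macros))       # AT now%
print(expand("%[readfile:note.txt]", macros))   # time is %[now]
print(expand("%{readfile:note.txt}", macros))   # time is 1680117300
```

The last two lines show the risk discussed above: with the nesting call the file's content is inserted verbatim, while with the lazy call whatever looks like a macro call inside the file gets executed.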
As mentioned above, for a nesting macro call, the process of macro expansion involves applying the macroprocessor to every argument of the macro. In other words, every argument of a nesting call gets computed as a separate text to be macroprocessed. In terms of functional programming, all arguments are evaluated first, and only then is the actual macro applied to the results. This is exactly what is usually called the eager evaluation model.
It is important to understand that this happens to every argument of the call, every time the call is processed. The macro system used in Thalassa CMS is not a programming language, and it doesn't provide “special” macros that could skip evaluation of some of their arguments depending on values of other arguments. Such “selective” evaluation is simply impossible in this implementation.
Thalassa CMS provides some “conditional” macros, which allow you to choose one of two or more variants. What is critically important to understand here is that all variants will be computed every time a call to such a macro is processed. The macro implementation will choose one of the variants, but only after all the variants have been computed.
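Let's consider a more or less simple example. The ifeq macro compares its first two arguments and returns the third one if they are equal, or the fourth one otherwise. The opt macro gives access to various options given in the options section group. The last macro for our example is already known to us: it is readfile, which we explained earlier. Now look at the following: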
%[ifeq:%[opt:scheme:lights]:night:%[readfile:night.txt]:%[readfile:default.txt]]
Well, the result of this is more or less obvious: if the option scheme/lights is set to night, then the whole construction will be replaced with the contents of the night.txt file, otherwise the contents of default.txt will be used. What is not so obvious is that both files will be read. Sometimes this can be a problem, especially if one of the files doesn't actually exist and an unintentional attempt to read it produces an error.
For this particular example, it is easy to avoid the problem:
%[readfile:%[ifeq:%[opt:scheme:lights]:night:night.txt:default.txt]]
Used this way, ifeq doesn't choose between the results of two readfile calls (after performing them both), as it did in the previous example; instead, it just chooses one of the two file names, and the choice is used as readfile's argument.
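The difference between the two variants is the same as in any eagerly evaluated language, so it can be demonstrated with plain Python functions. In the sketch below, readfile, opt and ifeq are toy stand-ins for the macros (the two-argument signature of opt is inferred from the example call, and the files and options live in dictionaries of our own making); the list named reads records which files were actually requested.

```python
reads = []  # every file name passed to readfile ends up here

FILES = {"night.txt": "dark theme", "default.txt": "light theme"}
OPTIONS = {("scheme", "lights"): "night"}


def readfile(name):
    reads.append(name)
    return FILES[name]


def opt(group, name):
    return OPTIONS[(group, name)]


def ifeq(a, b, then, otherwise):
    # By the time we get here, both `then` and `otherwise`
    # have already been computed by the caller.
    return then if a == b else otherwise


# Variant 1: ifeq chooses between the results of two readfile calls.
first = ifeq(opt("scheme", "lights"), "night",
             readfile("night.txt"), readfile("default.txt"))
first_reads = list(reads)
print(first, first_reads)   # dark theme ['night.txt', 'default.txt']

# Variant 2: ifeq chooses a file name, and only that file is read.
reads.clear()
second = readfile(ifeq(opt("scheme", "lights"), "night",
                       "night.txt", "default.txt"))
print(second, reads)        # dark theme ['night.txt']
```

Both variants produce the same text, but only the second one avoids touching default.txt.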
In fact, Thalassa provides ways to always avoid unnecessary computations, but attention must be paid to this, and, first things first, the person who writes configuration files should at least understand what the problem is.