Contents:
[page ] ini section group
%[page: ] macro
First things first: if you haven't read the short introduction to pages given within the Thalassa CGI overview, please be sure to read it before you continue.
When you see a URL like
http://www.example.com/foo/thalcgi.cgi/bar/buzz.html
you might suspect that the page available there is not a real HTML file but something generated by a “CGI script” (which has nothing to do with scripts) named thalcgi.cgi, but only because you know what thalcgi.cgi actually is. To an unaware observer, this is just the URL of another HTML page, nothing special. Furthermore, client software, including browsers, is not aware of CGIs either, so it doesn't handle such URLs in any special way.
Browsers even assume that a trailing “/” in a URL means it is a directory. Suppose you've got a link on your page, given as an unqualified file name, like <a href="buzz.html">. If your page is requested as http://example.com/foo/thalcgi.cgi/bar/, the browser will assume the link points to http://example.com/foo/thalcgi.cgi/bar/buzz.html, but if the same page is requested as http://example.com/foo/thalcgi.cgi/bar (note that the trailing slash is absent), the same link will be assumed to point to http://example.com/foo/thalcgi.cgi/buzz.html, because the “bar” path component is now considered a file name rather than a directory name. It doesn't matter that in reality there's neither a file nor a directory.
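The effect of the trailing slash can be reproduced outside a browser. The following is a minimal sketch that uses Python's standard `urljoin`, which follows the same relative-reference resolution rules browsers use; it is only an illustration and has nothing to do with how Thalassa itself is implemented.

```python
from urllib.parse import urljoin

LINK = "buzz.html"  # the unqualified href from the example above

# Base URL ends with a slash: "bar/" is treated as a directory.
with_slash = urljoin("http://example.com/foo/thalcgi.cgi/bar/", LINK)

# Base URL has no trailing slash: "bar" is treated as a file name
# and is replaced by the link target.
without_slash = urljoin("http://example.com/foo/thalcgi.cgi/bar", LINK)

print(with_slash)
print(without_slash)
```

The two printed URLs differ only in whether the “bar” component survives, which is exactly the difference described above.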
Hereinafter, we'll use the term “path” to denote the part of a URI that lies to the right of the name of the CGI, starting with the leading slash, unless it is explicitly stated that the word “path” is used with some other meaning.
You can configure your thalcgi.cgi installation to serve as many different pages as you want. They are configured with ini file sections of the page group, headed like [page ID]. For the ID, there are exactly two possibilities: it must be either a “full path”, starting with “/”, like /bar/buzz.html, or a “tree id”, like “bar”, which must contain no slashes at all. Sections identified with a full path serve that exact path only, while sections that have a tree id as their ID serve all possible paths with the given first component. E.g., the [page bar] section will serve all paths like /bar, /bar/buzz.html, /bar/abra/schwabra/cadabra.html and so on. In this case, macros give you access to the other path components, and there's even a parameter that lets you choose which paths to accept and which to reject (with either a “404 path not found” or a “403 forbidden” error). Such page sections are called multipath hereinafter.
It is undefined what will happen in case the ID contains slashes but doesn't start with one (like foo/bar). In the present version such page sections will be silently ignored, but this is likely to change in the future.
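The matching rules for section IDs can be summarized in a few lines of code. This is a sketch of the rules as described here, not Thalassa's actual implementation; the function name `section_serves` is hypothetical, and the sketch says nothing about which section wins when both a full-path section and a tree-id section could serve the same path.

```python
def section_serves(page_id, path):
    """Tell whether a [page ID] section would serve the given path."""
    if page_id.startswith("/"):
        # Full path: serves that exact path only.
        return path == page_id
    if "/" in page_id:
        # Slashes but no leading slash: behaviour is undefined;
        # the present version silently ignores such sections.
        return False
    # Tree id: serves every path whose first component equals the id.
    components = path.split("/")  # "/bar/x" -> ["", "bar", "x"]
    return len(components) > 1 and components[0] == "" and components[1] == page_id


print(section_serves("bar", "/bar/abra/schwabra/cadabra.html"))
print(section_serves("/bar/buzz.html", "/bar/other.html"))
```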
[page ] ini section group
The following parameters are recognized by Thalassa CGI within sections that belong to the [page ] group: session_required, embedded, post_allowed, path_predicate, template, selector, check_fnsafe, post_content_limit, post_param_limit, action, reqargs and reqarg. Besides that, parameters with arbitrary names can be added; their values will be accessible with the %[page: ] macro.
The session_required, embedded and post_allowed parameters are effectively boolean values; they may be set to yes or no (actually, anything but yes is considered equal to no). They specify, respectively, whether the page requires a work session to be established, whether the page is intended to be embedded somewhere (e.g., into an iframe tag on a statically-generated page), and whether the page accepts POST requests. Macroprocessing is not done on these three parameters.
In the present version the embedded parameter's value is only used by the %[page:ifembedded:] macro function and doesn't affect the functioning of the CGI program in any other way. It may be useful if you prefer to have common templates for your HTML header and footer, but don't want to display some elements, such as the site's heading, on embedded pages.
The path_predicate parameter is only used for multipath page sections, that is, sections headed like [page name] where name doesn't contain any slashes. Let's recall that such sections serve all paths with the given first component, such as /name/foo, /name/bar, /name/foo/bar/buzz.html and so on. If the path_predicate parameter is specified, its value is passed through the macroprocessor, and it should result in either “yes”, “no” or “reject”. If the macroprocessing results in any other string, it is considered equal to “no”. If the value is “yes”, the request is processed; for any other value the CGI program rejects the request, which effectively means it displays the error page. The error is “403 forbidden” if the macroexpansion resulted in the “reject” string; otherwise it is “404 path not found”.
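The mapping from the expanded path_predicate value to the outcome is small enough to write down as a table in code. The sketch below only restates the rules above; the function name `predicate_outcome` is hypothetical, and comparing the string exactly (without trimming whitespace) is an assumption.

```python
def predicate_outcome(expanded):
    """Map the macroexpanded path_predicate value to what the CGI does."""
    if expanded == "yes":
        return "process"
    if expanded == "reject":
        return "403 forbidden"
    # "no" and every unrecognized string are treated alike.
    return "404 path not found"


for value in ("yes", "no", "reject", "maybe"):
    print(value, "->", predicate_outcome(value))
```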
Within the path_predicate parameter's value, actual path components may be referred to as %1%, %2% etc.; e.g., for the /name/foo/bar/buzz.html path the %1% will expand to foo, %2% to bar and %3% to buzz.html (the first path component, which is the name of the section, name in this example, may be accessed as %0%, but it is not recommended to rely on this). The same is true for all other parameters within multipath [page ] ini sections.
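To make the numbering of the positionals concrete, here is a small sketch that substitutes %N% references with path components. It handles only the numeric positionals, not the real macroprocessor syntax; the function name `expand_positionals` is hypothetical, and both the handling of a trailing slash and the expansion of a missing component to an empty string are assumptions.

```python
import re


def expand_positionals(text, path):
    """Replace %0%, %1%, ... in text with the components of path."""
    components = path.strip("/").split("/")

    def replace(match):
        index = int(match.group(1))
        # Assumption: a component that is not present expands to "".
        return components[index] if index < len(components) else ""

    return re.sub(r"%(\d+)%", replace, text)


path = "/name/foo/bar/buzz.html"
print(expand_positionals("%1% %2% %3%", path))
```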
There's another check for acceptability of a particular path, activated by setting the check_fnsafe parameter. It is useful in case some of the path components are going to be used as file names or file name parts. The value of the parameter, if it is set, gets passed through the macroprocessor; the result is broken down into words, using the apostrophe “'” and the doublequote “"” as grouping symbols (both an apostrophe within doublequotes and a doublequote within apostrophes are considered plain chars), much like in the Shell command line. Every “word” is then tested to see whether it can safely serve as a filesystem path component, which means it doesn't contain any whitespace or control characters, any characters from the !"#$%&'()*,/<>=?[\]^`{|}~ set, or any characters with codes greater than 126, and doesn't start with either “.” or “-”. If the test fails, the CGI refuses to continue, and the error page is displayed with the “406 path not acceptable” error.
WARNING: Setting the check_fnsafe
parameter correctly
is critical for security; failure to do so can result in someone getting
unauthorized access to your server's filesystem, which might have horrible
consequences.
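The check_fnsafe test can be approximated as follows. This is a sketch written from the description above, not the code Thalassa runs: the names `split_words`, `is_fnsafe` and `check_fnsafe` are hypothetical, the splitter knows nothing about backslash escapes, and rejecting an empty word is an assumption.

```python
FORBIDDEN = set("!\"#$%&'()*,/<>=?[\\]^`{|}~")


def split_words(text):
    """Split on whitespace; ' and " group characters into one word."""
    words, current, quote, in_word = [], [], None, False
    for ch in text:
        if quote is not None:
            if ch == quote:
                quote = None          # closing quote ends the group
            else:
                current.append(ch)    # the other quote char is plain here
        elif ch in "'\"":
            quote, in_word = ch, True
        elif ch.isspace():
            if in_word:
                words.append("".join(current))
                current, in_word = [], False
        else:
            current.append(ch)
            in_word = True
    if in_word:
        words.append("".join(current))
    return words


def is_fnsafe(word):
    """True if word looks safe to use as a filesystem path component."""
    if not word or word[0] in ".-":   # empty word rejected by assumption
        return False
    return all(32 < ord(ch) <= 126 and ch not in FORBIDDEN for ch in word)


def check_fnsafe(expanded_value):
    """Apply the per-word test to every word of the expanded value."""
    return all(is_fnsafe(word) for word in split_words(expanded_value))


print(check_fnsafe("foo buzz.html"))
print(check_fnsafe("foo '../secret'"))
```

Under this sketch, a value built from a path component such as `../secret` is rejected twice over: it starts with “.” and it contains “/”.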
The template
parameter is perhaps the simplest to explain: it
defines the page's content to be displayed to the user. The value is
passed through the macroprocessor.
The selector parameter's value, after macroexpansion, is used as the specifier for values of “unrecognized” parameters available through the %[page: ] macro; see the description of the macro for details.
The rest of the recognized parameters are only used with POST requests. The post_content_limit and post_param_limit parameters are relatively easy to explain: they set page-specific values for the POST data size limits whose default (global) values are set by parameters of the same names within the [global] ini section (see that section's description for the explanation).
The parameters reqargs
and reqarg
will be
described later in a separate section.
The action parameter is hard to explain without an introduction to POST request handling in general, so we postpone its description until the webforms handling documentation.
%[page: ] macro
Once the CGI program has analysed the path and selected which particular [page ] ini section to use, the %[page: ] macro becomes available.
The macro requires at least one argument. If the first argument is one of ifsessionrequired, ifpostallowed or ifembedded (which are the names of the three supported functions), the macro takes two more arguments, for then and else. All three functions check a certain condition and return the then argument if the condition is met, and the else argument otherwise. The conditions are, respectively, whether the page being served is configured to require an active work session, whether it accepts POST requests, and whether it is marked as embedded. See the description of the session_required, embedded and post_allowed page section parameters for details.
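The behaviour of the three functions can be modelled with a lookup table. This is only a sketch of the semantics described above; the name `page_function` and the dictionary representation of a page section are hypothetical.

```python
# Which page section parameter each macro function inspects.
FUNCTION_TO_PARAMETER = {
    "ifsessionrequired": "session_required",
    "ifpostallowed": "post_allowed",
    "ifembedded": "embedded",
}


def page_function(section, function, then, otherwise):
    """Return `then` if the corresponding flag is set, else `otherwise`."""
    parameter = FUNCTION_TO_PARAMETER[function]
    # Anything but the literal "yes" counts as "no".
    return then if section.get(parameter) == "yes" else otherwise


section = {"embedded": "yes", "post_allowed": "true"}
print(page_function(section, "ifembedded", "bare layout", "full layout"))
print(page_function(section, "ifpostallowed", "form", "no form"))
```

Note that `post_allowed = true` in the sample section does not enable POST, because only `yes` is recognized.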
If the first argument is not one of the three function names mentioned above, it should be the only argument (the rest of the arguments, if there are any, will be ignored). The argument is then used as the name of a parameter within the same [page ] section. Briefly (and incompletely) speaking, the macro call in this case expands to the value of that parameter, after the value is passed through the macroprocessor. This allows one to have more HTML snippets right within the page section.
In contrast with %[html:] snippets, these cannot be used as parametrized templates. This is because within a [page ] ini section, the “positionals” (that is, numerical macros like %1%, %12% etc.) are used to refer to the path components, so no way would be left to refer to template arguments if there were any.
However, these parameters are there not because we want to have more
snippets, although that can be convenient in itself. Their primary purpose
is to make it easier to specify different versions of the same
page, and it is achieved with the help of the
selector
parameter.
Once the [page ] ini section is chosen based on the path from the request being served, the selector parameter's value, if any, gets passed through the macroprocessor and the result is stored. The result should be a simple identifier, or else something may go wrong. This identifier is used by the %[page: ] macro as the specifier when accessing arbitrary parameters of the [page ] ini section.
For example, if the following parameters are set:
[page /foo.html]
selector = buzz
motto = Whatever hits the fan will not be distributed evenly
motto:buzz = It is not cheating, it's a team work
slogan = Life is short, smile while you still have teeth
slogan:usb = Life is too short to remove USB safely
slogan:lazy = I am not lazy, I am just on my energy saving mode
and the /foo.html path is requested, %[page:motto] will expand to “It is not cheating, it's a team work”, while %[page:slogan] will become “Life is short, smile while you still have teeth”.
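The lookup shown in this example can be sketched as “try the name with the specifier first, then fall back to the plain name”. The code below covers only what the example demonstrates; the function name `page_param` is hypothetical, macroexpansion of the value is omitted, and returning an empty string for a parameter that is absent altogether is an assumption.

```python
PAGE_SECTION = {
    "selector": "buzz",
    "motto": "Whatever hits the fan will not be distributed evenly",
    "motto:buzz": "It is not cheating, it's a team work",
    "slogan": "Life is short, smile while you still have teeth",
    "slogan:usb": "Life is too short to remove USB safely",
    "slogan:lazy": "I am not lazy, I am just on my energy saving mode",
}


def page_param(section, name, selector):
    """Look up name:selector first, then fall back to the plain name."""
    specific = f"{name}:{selector}"
    if selector and specific in section:
        return section[specific]
    # Assumption: an entirely missing parameter yields an empty string.
    return section.get(name, "")


selector = PAGE_SECTION["selector"]
print(page_param(PAGE_SECTION, "motto", selector))
print(page_param(PAGE_SECTION, "slogan", selector))
```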
Please note this example is just to show how things work. In real-life configurations, a constant selector (as in this example) makes no sense at all; to be of any use, it must be an expression containing macros, especially conditionals, and sometimes the expression becomes pretty complicated.
This section's content may be hard to understand just because it is unclear what it's all about and how to use it. You will not need all this stuff until you begin with the user comments facility. It is absolutely safe to skip this section now and return to it right before reading about user comments.
The so-called “request arguments” are actually values one can set within a
[page ]
ini section and access within other sections,
typically the [comments]
section. The most obvious use for
them is when outside of the [page ]
section (typically a
multipath one) you need to access some information passed via the
URI: within the [page ]
section that information is
available via the %1%
, %2%
and their family, but
these “positional arguments” are not available in other places of the
configuration file.
A [page ]
ini section may include two parameters related
to the “request argument” facility: reqargs
and
reqarg
.
The reqargs
parameter controls which request
arguments are to be set; its value is a whitespace-separated list of
identifiers you'd like to use to identify your request arguments.
The value for each of the arguments is set by adding a reqarg
parameter with the argument's identifier as the
specifier.
For example:
reqargs = alpha beta gamma
reqarg:alpha = the value for the “alpha” request arg
reqarg:beta = this one is for “beta”
reqarg:gamma = and “gamma” goes here
In a real-life situation, these values will probably use macros, especially the “positionals” (%1%, %2% etc.).
%[reqarg: ] macro
The %[reqarg: ] macro is used to access the request arguments. The macro always accepts exactly one argument: the identifier of the request argument to be queried. It returns the respective value, or an empty string in case there's no request argument with such a name.
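The interplay between reqargs, reqarg and the %[reqarg: ] macro can be modelled with two small functions. This is a sketch of the described behaviour only: the names `collect_request_args` and `reqarg` are hypothetical, macroexpansion of the values is left out, and treating an identifier that is listed in reqargs but has no reqarg parameter as an empty string is an assumption.

```python
def collect_request_args(section):
    """Build the request arguments declared by a [page ] section."""
    identifiers = section.get("reqargs", "").split()
    # Assumption: a listed identifier without a value becomes "".
    return {ident: section.get(f"reqarg:{ident}", "") for ident in identifiers}


def reqarg(request_args, identifier):
    """Model of the macro: the value, or "" if no such argument exists."""
    return request_args.get(identifier, "")


section = {
    "reqargs": "alpha beta gamma",
    "reqarg:alpha": "first value",
    "reqarg:beta": "second value",
    "reqarg:gamma": "third value",
}
args = collect_request_args(section)
print(reqarg(args, "beta"))
print(repr(reqarg(args, "delta")))
```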