There's one thing you should keep in mind when configuring the CGI program.
The web server runs CGIs in the directories where they are located, so
whenever the CGI receives control, its current working directory is the
directory where the binary resides. The program relies on this from the
very start: it expects to find its configuration file (thalcgi.ini) in its
current working directory, which effectively means the file must be placed
in the same directory as the thalcgi.cgi binary itself.
As usual, the file is in the ini format.
In the present version, the configuration file name is hardcoded. If you
really need it to have another name or be located elsewhere, then either
edit the file thalcgi.cpp (see the THALASSA_CGI_CONFIG_PATH macro near the
start of the file) and rebuild, or simply build with
-DTHALASSA_CGI_CONFIG_PATH="your/path/file.ini".
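The compile-time override described above usually relies on a guarded
macro definition. The following is only a minimal sketch of that pattern,
not the actual contents of thalcgi.cpp: the default value "thalcgi.ini" and
the helper config_path() are assumptions made for illustration.

```cpp
#include <string>

// Assumed default: a bare file name, looked up in the current working
// directory. The real definition lives near the start of thalcgi.cpp.
#ifndef THALASSA_CGI_CONFIG_PATH
#define THALASSA_CGI_CONFIG_PATH "thalcgi.ini"
#endif

// Hypothetical helper: returns the path the program would try to open.
std::string config_path()
{
    return THALASSA_CGI_CONFIG_PATH;
}
```

If the macro expands to a string literal, as assumed here, the quotes have
to reach the compiler, so from a POSIX shell the option would typically be
written as -DTHALASSA_CGI_CONFIG_PATH='"your/path/file.ini"'.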
WARNING: the configuration file must be kept private as it contains
a secret for checking the captcha. Okay, the most serious harm you can get
by leaking your thalcgi.ini is that someone circumvents your captcha, well,
unless you place something more sensitive in your configuration file on
your own, but you don't, do you? Anyway, please don't underestimate the
possible consequences. Unfortunately, the file has to be placed inside your
web space, so care must be taken not to expose it to arbitrary people
through your web server. It is strongly recommended to use suexec so that
your CGIs are executed under a UID/GID other than your web server's, and to
keep the configuration file readable to its owner only (e.g., mode 0600).
The CGI program even performs a check at every start.
In case the current UID differs from the UID of the configuration file's
owner, it assumes there's no suexec (perhaps we're running under the UID of
the HTTP server), skips the check and lets things proceed. It doesn't mean
everything's fine, but there is hardly anything such a simple program can
do about it. If, however, the UIDs are equal, the program checks whether
the configuration file grants any permissions to others, and refuses to
continue if so. A built-in error page is sent to the client. In case you
see it, do something like chmod o-rwx thalcgi.ini to fix the problem. If
you place your thalcgi.ini into your web space with the help of the
thalassa program (e.g., as a binary object), which is recommended, be sure
to place “chmod = 600” into the binary configuration section.
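The startup check described above can be sketched roughly as follows. This
is an illustration of the logic only, not the code from thalcgi.cpp: the
names ConfigCheck, classify and check_config_file are hypothetical, and
using getuid() for "the current UID" is an assumption.

```cpp
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

enum class ConfigCheck { skipped, ok, exposed, unreadable };

// Pure decision step: compares the file owner with the current UID and,
// only when they match, looks at the permission bits for "others".
ConfigCheck classify(uid_t owner, mode_t mode, uid_t current)
{
    if (owner != current)
        return ConfigCheck::skipped;   // probably no suexec; nothing to do
    if (mode & S_IRWXO)
        return ConfigCheck::exposed;   // any r/w/x bit for others is set
    return ConfigCheck::ok;
}

// Wrapper that collects the real data with stat(2) and getuid(2).
ConfigCheck check_config_file(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return ConfigCheck::unreadable;
    return classify(st.st_uid, st.st_mode, getuid());
}
```

With this sketch, a file in mode 0644 owned by the calling user is reported
as exposed, which corresponds to the situation where the built-in error
page is sent; after chmod o-rwx the result becomes ok.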
Just like the thalassa program, the CGI program actively uses macros in its
configuration file. Besides the common macros, there are some CGI-specific
macros, only available in the thalcgi.ini file, and, certainly, in some
contexts there are context-specific macros as well.
Some of these are directly related to particular features such as pages, sessions, comments and so on; such macros will be documented along with the respective features. However, some of the CGI-specific macros don't fall into any of the feature-specific categories. To keep the picture consistent, we'll document them right here.
The %[req: ] macro provides access to certain properties of the HTTP
request being handled. As usual, its first parameter must be the name of
a function; the following functions are supported:
%[req:method] returns the request method, which is either GET or POST; other request methods are not supported by the Thalassa CGI program, so control will never reach the program parts that expand this macro;
%[req:document_root] is the filesystem path to the server's Document Root, as your HTTP server sees it;
%[req:host], %[req:script] and %[req:path] return the respective parts of the URL from the request being served; e.g., if the user requests something like http://www.example.com/cgi/thalcgi.cgi/foo/bar, then the three functions will return www.example.com, /cgi/thalcgi.cgi and /foo/bar, respectively;
%[req:port] returns the TCP port number being used; usually this is 80 for HTTP and 443 for HTTPS, but not necessarily, because any TCP server, including a web server, can technically be run on any TCP port;
%[req:param:NAME] returns the “query parameter” of the given name, or an empty string if there's no such parameter;
%[req:cookie:NAME] returns the given cookie's value or an empty string if there's no such cookie; note that Thalassa CMS only sets and handles one cookie, named thalassa_sessid, but this function can still be useful in case you run other CGI scripts on your site and they set/handle something else.
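To make the host/script/path split more tangible, here is a small sketch of
how a CGI program can obtain these parts from its environment. Which
variables Thalassa actually reads is not stated here; the use of the
standard CGI variables SERVER_NAME, SCRIPT_NAME and PATH_INFO, as well as
the names RequestParts and request_parts_from_env, are assumptions made for
illustration.

```cpp
#include <cstdlib>
#include <string>

struct RequestParts {
    std::string host;    // e.g. www.example.com
    std::string script;  // e.g. /cgi/thalcgi.cgi
    std::string path;    // e.g. /foo/bar
};

// Returns the variable's value, or an empty string if it is unset.
static std::string env_value(const char *name)
{
    const char *v = std::getenv(name);
    return v ? v : "";
}

// Assumed mapping: the web server splits the URL and passes the pieces
// to the CGI through these environment variables.
RequestParts request_parts_from_env()
{
    RequestParts r;
    r.host = env_value("SERVER_NAME");
    r.script = env_value("SCRIPT_NAME");
    r.path = env_value("PATH_INFO");
    return r;
}
```

For the example URL above, a typical server would set SCRIPT_NAME to
/cgi/thalcgi.cgi and PATH_INFO to /foo/bar, so the three fields match what
the three macro functions are documented to return.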
The %[getenv: ] macro gives access to the environment variables; actually
it is just a wrapper for the well-known getenv function from the standard
C library. It accepts exactly one argument, the name of the variable, and
returns the respective value, or an empty string in case the variable is
empty or unset; there's no way to tell these apart. As the CGI protocol
makes heavy use of the environment, this macro becomes an important tool,
especially in case some information you need for some reason is not
available through the %[req:] macro.
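The "empty or unset" behaviour follows naturally from wrapping getenv. The
sketch below shows the idea; the function name getenv_or_empty is
hypothetical and the code is not taken from Thalassa.

```cpp
#include <cstdlib>
#include <string>

// std::getenv returns a null pointer for an unset variable and a pointer
// to "" for an empty one; mapping both to an empty string loses the
// distinction, exactly as described for the macro.
std::string getenv_or_empty(const char *name)
{
    const char *value = std::getenv(name);
    return value ? std::string(value) : std::string();
}
```

A call such as getenv_or_empty("REMOTE_ADDR") corresponds to what
%[getenv:REMOTE_ADDR] would be expected to yield, assuming the server sets
that variable.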
The global parameters are set within the [general] section.
At present, three parameters are expected: userdata_dir,
post_content_limit and post_param_limit.
The userdata_dir parameter sets the path to the database directory, where
sessions, user accounts and other dynamic information are stored.
The path must be either absolute (not recommended), or relative to the
CGI's working directory (which is, let's repeat it, the directory where the
CGI binary is placed).
It's better to place the database directory outside of your web tree, so
the path will start with one or more “../”. Make sure the directory is
available for “write” and “execute” for the CGI program (that is, for the
UID/GID under which the CGI runs). The directory itself must exist, which
means you must create it yourself, the program won't do this for you;
however, it will create all necessary subdirectories and all work files
inside.
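The requirements on the database directory can be verified with a few
system calls. The following is a sketch under the assumption that checking
with stat(2) and access(2) is adequate for your setup; the function name
userdata_dir_usable is hypothetical, and note that access() tests the real
UID/GID of the calling process.

```cpp
#include <sys/stat.h>
#include <unistd.h>

// True if the path exists, is a directory, and the caller has both
// "write" and "execute" (search) permission on it.
bool userdata_dir_usable(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return false;
    return access(path, W_OK | X_OK) == 0;
}
```

To be meaningful, such a check has to run under the same UID/GID as the
CGI itself, and relative paths must be resolved from the directory where
the CGI binary is placed.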
The other two parameters, post_content_limit and post_param_limit, are
intended to limit the size of what the client can send within a POST
request; both parameters set the default value, which can be overridden on
a per-page basis (with parameters of the same names within the [page ]
configuration section), and both are given as integers, in kilobytes (damn,
a kilobyte is 1024 bytes, don't trust anyone who tries to convince you
otherwise). The post_content_limit is the maximum size of the POST request
body, which means that in case the value of the CONTENT_LENGTH environment
variable is greater than this value times 1024, the CGI refuses to work and
displays the respective error page.
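The size check amounts to a single comparison. Here is a minimal sketch of
it; the function name post_body_allowed is hypothetical, and the handling
of a missing or malformed CONTENT_LENGTH (treated as an empty body and as
a refusal, respectively) is an assumption, not documented behaviour.

```cpp
#include <cstdlib>

// content_length: raw value of the CONTENT_LENGTH variable (may be null).
// limit_kb: the post_content_limit value, in kilobytes of 1024 bytes.
bool post_body_allowed(const char *content_length, long limit_kb)
{
    if (!content_length || !*content_length)
        return true;                 // assumption: no body to speak of
    char *end = nullptr;
    long long len = std::strtoll(content_length, &end, 10);
    if (*end != '\0' || len < 0)
        return false;                // assumption: refuse malformed values
    // Refused only when strictly greater than the limit times 1024.
    return len <= static_cast<long long>(limit_kb) * 1024;
}
```

With a limit of 64, a body of exactly 65536 bytes passes and one of 65537
bytes is refused.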
The funny thing about post_param_limit is that it is completely ignored by
the current Thalassa CGI version, because the present version doesn't
handle the multipart/form-data POST body format (which is used for forms
that upload files).
Just like the thalassa program configuration, the thalcgi.ini can (and
perhaps should) contain the [html] section, in which HTML snippets and
simple templates are defined. The snippets and templates can be accessed
with %[html:snippet_name] macro calls.
Both the ini file section and the macro work exactly the same way as in the
thalassa program; please refer to the [html] section description for
details and examples.
In case something goes wrong, the CGI program has to display an error page. When there are problems with the configuration file itself, e.g., it doesn't exist, is unreadable or is in a wrong format, the program uses a built-in template for the error page. The same is true if you didn't specify a custom template for the error page, but relying on that is strongly discouraged.
The custom error page template is set by creating an [errorpage] section,
which is presently supposed to contain only one parameter, named template.
The parameter's value is passed through the macroprocessor to make the
actual HTML code. Two context-specific macros are available within the
template: %errcode% expands to the 3-digit error code (like 404 for “page
not found”), and %errmessage% expands to the error message, as composed by
the program.
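The effect of the two context-specific macros can be illustrated with
a plain string substitution. This is a deliberately simplified sketch: the
real macroprocessor does much more than this, and the names replace_all and
expand_error_template are hypothetical.

```cpp
#include <string>

// Replaces every occurrence of `from` in `s` with `to`.
static void replace_all(std::string &s, const std::string &from,
                        const std::string &to)
{
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();   // continue after the inserted text
    }
}

// Substitutes the error code and message into the template text.
std::string expand_error_template(std::string tmpl, int code,
                                  const std::string &message)
{
    replace_all(tmpl, "%errcode%", std::to_string(code));
    replace_all(tmpl, "%errmessage%", message);
    return tmpl;
}
```

For a template value such as <h1>%errcode%</h1><p>%errmessage%</p>, the
code 404 and the message "page not found", the sketch produces
<h1>404</h1><p>page not found</p>.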
In the present version all messages are hardcoded, which is likely to change in the future.
The error page is one of the three special pages supported by Thalassa. In
contrast to ordinary pages, special pages are not bound to particular
paths; by their nature, they can be displayed in place of any other page in
case Thalassa CGI can't (or is not allowed to) display what the user
requests. The two other special pages are [nocookiepage], which is
displayed instead of the requested page in case the page requires a work
session to be active but the user doesn't have one, and [retrycaptchapage],
displayed when the user tried to solve the CAPTCHA test but failed. Both
pages are supposed to contain the CAPTCHA test challenge and the input form
to submit the answer.
The two pages will be documented along with session handling.