The C++ and plain C subsets used for Thalassa

Contents:

Preamble
#include"" vs. #include<>
Restrictions for C++

No C++ standard library
No standard-invented “features”
No loop var definition within for head
No exception handling
No RTTI nor dynamic_cast
No member pointers
No GNU extensions
Default argument values
Templates are okay, but not for containers

Restrictions for plain C

No features from C99 and later "standards"
Allowed subset of the standard library
No GNU extensions

Conclusion

Preamble

Technical standards are usually made by committees. On the other hand, it is well known that committees, by their very nature, are unable to produce anything useful. In the very best case, committee-made things are useless, but far more often they are seriously harmful.

Speaking of programming languages in particular, the committees are in the habit of issuing a command to the whole world: from now on, the language that people know becomes totally different, and everyone must agree.

It seems obvious that no one in the world can have the power to command the whole world. No one, ever. If governments let various “standards bodies”, such as ISO, do what they do, this only means the governments have gone far beyond the limits of what is acceptable.

C99 and later “standards” describe languages which are totally different, and they have nothing to do with the C language; only C90 was more or less close to what the C language is. The same is true for the C++ language, but starting with the very first “standard”, issued in 1998. None of the so-called “C++ standards” has ever had anything in common with what C++ really is.

Honest behaviour would be to give these specifications different names. There are examples of such good behaviour in history, such as the language named Scheme, even though everyone understands it is just another Lisp dialect. Even C++ itself is an example of more or less honest naming: although the name obviously suggests that it is the same C but better, it is still a different name, and Stroustrup never tried to convince the world that plain C is from now on obsolete and C++ should be used instead.

With this in mind, it becomes clear that what standards committees do is, plainly speaking, fraud. Industry is conservative, and it is very hard to convince it to adopt a completely new language; so the committees use the names of well-known languages to promote what they create.

BTW, even if we take these standards simply as specifications for new languages, they are horrible. This is because the committee members bear no responsibility for what they do. They don't even have to implement their own ideas; they only need to vote for them, and other people will have no choice but to implement what the committee voted for.

We can't stop the committees, at least right now. We've got no power to disperse them. But there's one thing we can do right now: namely, we can boycott everything the damn committees do. So, let's do at least what we can.

#include"" vs. #include<>

Before the committees touched all this with their dirty hands, the difference between #include "" and #include <> was obvious: the form #include "" was used for the headers that belong to your program itself, while #include <> was for header files external to your program, that is, headers from libraries, no matter whether it is the so-called “standard” library or any other.

In particular, for Thalassa, all the libraries included within the source tarball are still libraries, so their headers MUST be included with #include <>.

The #include "" form is only to be used for headers that reside within the cms/ subdirectory.

Restrictions for C++

No C++ standard library

The so-called standard library of C++ must not be used in any form. Simply speaking, you must not include any header files that have no “.h” suffix and you must not use any names from that damn “namespace std”, either with an explicit std:: prefix or with the using namespace directive (as we'll see in the next section, the using directive is prohibited as such, because namespaces themselves are prohibited).

Compilers usually provide “legacy” headers such as iostream.h, vector.h and the like. These are prohibited, too.

It is, of course, strictly forbidden to include “plain C compatibility” headers such as cstdio, cstring and so on. As will be mentioned later, some header files from the plain C standard library are not allowed, but many of them are, and they must be included exactly as you would include them in plain C.

People often ask one (stupid) question: okay, but what should be used instead of the standard library? Sometimes it is possible to answer this question correctly. Instead of iostream, either plain C “high-level” input-output may be used (that is, the functions and types declared in stdio.h), or you can use I/O system calls such as open, close, read, write etc. directly. Instead of the (very stupid and ugly) string class, this particular project (Thalassa CMS) uses the class named ScriptVariable, defined by the scriptpp library.
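As a rough illustration of both options, here is a minimal sketch. The function names write_all and report_count are made up for this example and are not taken from Thalassa; the first one goes through the write system call directly, the second through stdio.

```cpp
#include <stdio.h>
#include <unistd.h>

// Write exactly len bytes to fd, retrying after short writes.
// Returns len on success, -1 if write() fails or makes no progress.
// (Assumption: interrupted calls are simply treated as errors here.)
int write_all(int fd, const char *buf, int len)
{
    int done = 0;
    while (done < len) {
        int r = write(fd, buf + done, len - done);
        if (r <= 0)
            return -1;
        done += r;
    }
    return done;
}

// The stdio way: formatted output through a FILE stream.
// Returns whatever fprintf returns (the number of characters printed).
int report_count(FILE *stream, const char *name, int count)
{
    return fprintf(stream, "%s: %d\n", name, count);
}
```

Which of the two fits better presumably depends on whether formatting or control over the descriptor matters more in the particular place.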

It is very important to understand (and accept!) that no replacement for STL templates (containers and algorithms) is allowed, either from other libraries or written by yourself. Data structures are built exactly the same way as in plain C, manually, for every particular task. There is no such thing as “generic” data structures. Period.
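To give an idea of what “manually, for every particular task” may look like, here is a sketch of a singly linked list written for one job only, counting hits per page. The task and all the names (PageHit, count_hit and so on) are invented for this example; nothing here is taken from the Thalassa sources.

```cpp
#include <string.h>

// One node of a list that serves exactly one purpose.
struct PageHit {
    const char *page;   // not owned; the caller keeps the string alive
    int hits;
    PageHit *next;
};

// Count one more hit for the page, adding a node at the front if needed.
// Returns the (possibly new) head of the list.
PageHit *count_hit(PageHit *head, const char *page)
{
    PageHit *p;
    for (p = head; p; p = p->next) {
        if (strcmp(p->page, page) == 0) {
            p->hits++;
            return head;
        }
    }
    p = new PageHit;
    p->page = page;
    p->hits = 1;
    p->next = head;
    return p;
}

// Returns the number of hits recorded for the page, 0 if it is unknown.
int hits_for(const PageHit *head, const char *page)
{
    const PageHit *p;
    for (p = head; p; p = p->next)
        if (strcmp(p->page, page) == 0)
            return p->hits;
    return 0;
}

// Release all nodes of the list.
void delete_hits(PageHit *head)
{
    while (head) {
        PageHit *tmp = head;
        head = head->next;
        delete tmp;
    }
}
```

The list knows what it stores and why, so there is nothing to parametrize.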

No standard-invented “features”

Actually, none of the “features” introduced by the so-called “standards” is allowed here. Just to start with, your code must not contain any of the following keywords: using, namespace, typename (the class keyword is to be used in template arguments that represent types), nullptr (the numerical 0 must be used to denote the null address), auto (damn if you feel you need it, you're wrong: don't prevent the compiler from detecting your errors!), constexpr, consteval, constinit, noexcept, final, override, import, module, requires, export, co_yield, co_return, co_await, wchar_t, char8_t, char16_t, char32_t, alignas, alignof, register, static_assert, thread_local...

Something may be missing from this list, but you've got the idea: if a new keyword is “added” by another damn “standard”, or the meaning of an existing keyword is changed, then the keyword is prohibited here.

Not only keywords are prohibited; all the “concepts” introduced in these “standards” are prohibited, too. Don't even think about all these lambdas, coroutines, structured bindings, move semantics (that is when the type of a constructor's argument is declared with &&), variadic templates etc.

Once again: everything that comes from C++ “standards” is prohibited. Period.
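A small sketch of code that stays within these limits may help: class instead of typename in a template, 0 instead of nullptr, an explicitly typed variable instead of auto. Both functions are hypothetical and exist only to show the spelling.

```cpp
// 'class', not 'typename', introduces a type parameter.
template <class T>
T larger_of(T a, T b)
{
    return a < b ? b : a;
}

// Returns a pointer to the first ch in str, or 0 (not nullptr) if absent.
const char *find_char(const char *str, char ch)
{
    const char *p;      // the type is written out, no 'auto'
    for (p = str; *p; p++)
        if (*p == ch)
            return p;
    return 0;
}
```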

No loop var definition within for head

Please don't do this:

    for (int i = 0; i < 10; ++i) {

Instead, define your loop variable right before the loop head, like this:

    int i;
    for (i = 0; i < 10; i++) {

No exception handling

This is a decision made for this particular project. Actually, exceptions don't come from “standards”, and despite their terrible inefficiency, sometimes it is okay to use them, as they really can save a lot of programmers' time. However, in Thalassa CMS exceptions are not used.
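The text does not say how errors are reported instead. One common approach, assumed here purely for illustration, is a return code plus an output parameter; parse_port is a made-up function, not a part of Thalassa.

```cpp
// Parse a TCP port number written in decimal digits.
// Returns 1 and stores the value in *port on success;
// returns 0 and leaves *port untouched on any error.
int parse_port(const char *text, int *port)
{
    long v = 0;
    const char *p;
    if (!*text)
        return 0;               // empty string
    for (p = text; *p; p++) {
        if (*p < '0' || *p > '9')
            return 0;           // not a digit
        v = v * 10 + (*p - '0');
        if (v > 65535)
            return 0;           // out of range, stop before overflow
    }
    if (v < 1)
        return 0;               // port 0 is rejected as well
    *port = (int)v;
    return 1;
}
```

The caller checks the result right at the call site, so the error path stays visible in the code.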

No RTTI nor dynamic_cast

Those don't come from standards either, but RTTI is too monstrous, and dynamic_cast is too slow. Don't use either of them.
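In many cases the question “what is the actual type of this object?” can be avoided altogether by letting a virtual method do the work. The hierarchy below is hypothetical and serves only as a sketch of that idea.

```cpp
class Node {
public:
    virtual ~Node() {}
    // Each subclass answers for itself; nobody inspects its type.
    virtual int CountLeaves() const = 0;
};

class Leaf : public Node {
public:
    virtual int CountLeaves() const { return 1; }
};

class Pair : public Node {
    Node *left;
    Node *right;
public:
    // The pair takes ownership of both children.
    Pair(Node *l, Node *r) : left(l), right(r) {}
    virtual ~Pair() { delete left; delete right; }
    virtual int CountLeaves() const
        { return left->CountLeaves() + right->CountLeaves(); }
};
```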

No member pointers

There is absolutely nothing wrong with member pointers, except for one thing: if you really need a member pointer, then your class (or structure, heh?) is overcomplicated. Instead of member pointers, better try to refactor your classes and methods so that they stop being that complicated.

No GNU extensions

Well, do we really have to mention that horrible things like nested functions, VLAs and lots of other monsters that gcc supports as “extensions” are not to be used? Heh, actually they are so ugly that even the damn standard committees don't want to adopt them.

Default argument values

Default values for arguments of functions and methods are discouraged but still allowed. However, we restrict what can serve as the default value. The standards are too liberal on this. Within Thalassa code, only explicit compile-time constants may be used in this role. These include explicit numerical, char and string literals, macros that are known to expand to such literals, and enum constants. Nothing else is allowed.
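Here is a sketch of what fits this rule: one default is a macro that expands to a literal, one is an enum constant, one is a string literal. The function field_width and everything around it are invented for the example.

```cpp
#include <string.h>

#define DEFAULT_INDENT 4        // a macro known to expand to a literal

enum quote_mode { quote_none, quote_double };

// Width of a field: indentation, the text, optional quotes, a suffix.
int field_width(const char *text, int indent = DEFAULT_INDENT,
                quote_mode mode = quote_none, const char *suffix = ";")
{
    int w = indent + (int)strlen(text) + (int)strlen(suffix);
    if (mode == quote_double)
        w += 2;                 // room for the two quote characters
    return w;
}

// Not acceptable under this rule, although compilers take it:
//     int field_width(const char *text, int indent = compute_indent());
```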

Templates are okay, but not for containers

Many similar guides prohibit templates altogether. This might surprise you, but here we disagree: within Thalassa CMS, templates as such are allowed. What is not allowed is using templates to create generic container classes and “algorithms” for them, as STL does.

People often ask something like “hey, but what else can templates be used for?!” Okay, if you don't know the answer, then simply don't use templates at all. However, once you encounter a small task where templates can make things better (without being used to create another damn generic container), then, well, recall this section.

Restrictions for plain C

Thalassa CMS is mainly implemented in C++, with only some smaller modules written in plain C. However, there actually is some plain C code, so we have to explain what's okay and what's not okay in such code.

There's one trivial thing we must always remember: C and C++ are two different programming languages.

No features from C99 and later "standards"

First of all, it is prohibited to use VLAs.

Once again, it is prohibited to use VLAs.

For those who didn't understand: it is prohibited to use VLAs.

But, well, VLAs are not the only thing to be prohibited. As a rule, C99 is not C, and all the later crap such as C11 or C23 has nothing to do with the C language at all; if we write in C, then let's write in C.

There are no “line” comments in C, that is, those that start with “//”. In plain C, only /* ... */ comments are allowed.

A variable or a type in plain C can only be defined (or even just declared) either in global space or at the very start of a block; no declarations and definitions may come after a statement. So, in particular, this is illegal:

  int f(int n)
  {
      int *a;
      a = malloc(n);
      int x;
      /* ... */
  }

This is because the variable x is defined after the statement (the one which contains malloc). No more declarations are allowed in the block once at least one statement is encountered.

Well, remember we're discussing plain C now. In C++, declarations may be placed wherever you want, but plain C is not C++.
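For comparison, a function with all its declarations moved to the start of the block might look as follows. This is only a sketch: the name sum_first and what the function computes are made up, and the cast on malloc is there merely so that the fragment is accepted by a C++ compiler as well; everything else stays within what C90 allows.

```cpp
#include <stdlib.h>

/* Sum of the numbers 1..n, computed through a temporary array.
   Returns -1 if the memory cannot be allocated. */
int sum_first(int n)
{
    int *a;     /* all declarations come first... */
    int x;
    int i;

    a = (int *)malloc(n * sizeof(int));    /* ...statements follow */
    if (!a)
        return -1;
    x = 0;
    for (i = 0; i < n; i++) {
        a[i] = i + 1;
        x += a[i];
    }
    free(a);
    return x;
}
```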

Certainly, all these ugly things like complex numbers, wide chars, L-strings, are not allowed.

However, there are some less obvious limitations, too. First of all, there is no bool type in plain C. Arithmetic zero stands for false, any arithmetic non-zero is true, and in case you need to explicitly specify boolean truth, use 1.

Please don't even think about using various “optimization hints” such as the restrict keyword or the likely and unlikely macros. Simply forget about them.

Also, in plain C there are neither designated initializers nor compound literals. So, in the following code everything is wrong:

  struct mystr s1 = { .name = "John", .count = 5, .avg = 2.7 };
  s2 = (struct mystr) { .name = "John", .count = 5, .avg = 2.7 };

You may feel sorry about these, as they are convenient. The problem is that they come from “standards”.
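What remains available is the positional initializer and plain field-by-field assignment. The sketch below assumes a layout for struct mystr, since the text does not show one; the field types are guessed from the initializer values, and set_john is a made-up helper.

```cpp
/* Assumed layout; the order of fields matters for the initializer. */
struct mystr {
    const char *name;
    int count;
    double avg;
};

/* Positional initializer: values follow the order of the fields. */
struct mystr s1 = { "John", 5, 2.7 };

/* Instead of assigning a compound literal, fill the fields one by one. */
void set_john(struct mystr *s)
{
    s->name = "John";
    s->count = 5;
    s->avg = 2.7;
}
```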

And one more thing: there are no inline functions in C.

Allowed subset of the standard library

If something is included in the standard C library, this doesn't automatically mean you should use it. The trend is that one day we'll have to create our own library to be used instead of the “standard” one, and the less we use of libc now, the less effort it will take to get rid of it.

Unfortunately, it is a bit hard to give a complete answer as to what is okay and what is not okay here; maybe such an answer will be given later. As of now, the following is definitely allowed:

system call wrappers and their infrastructure such as constant definitions;
library functions of the exec* family;
the higher-level input/output functions declared in stdio.h, except for fread and fwrite which are senseless (use syscalls instead), and the gets function which must never be used for obvious security reasons;
the errno variable;
the functions malloc, free, getenv, setenv, unsetenv, exit and _exit from the stdlib.h header file;
the string manipulation functions declared in the string.h header file;
the math functions declared in the math.h header file and available with -lm.

On the other hand, the following is definitely not allowed:

any “extensions” such as GNU extensions;
everything that depends on locales, such as functions from the ctype.h header file; we have to make an exception for the [fvs]printf function family here, and that's a problem, but perhaps our own version of these functions will not depend on locales;
everything that makes a static build impossible, such as getpwnam and company;
everything that uses or depends on threads;
functions invented specially to support threads, usually named with the _r suffix, such as strtok_r.

For all the features not included in either of these lists, the situation is subject to discussion. For features that “formally” appear on both lists, such as the strtok_r function which is also “a function from string.h”, the list of prohibitions takes precedence.
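As an example of living without one of the prohibited headers, character classification from ctype.h can be replaced by trivial ASCII-only checks that do not depend on locales. The names below are hypothetical; whether Thalassa has such helpers of its own is not stated here.

```cpp
/* ASCII-only replacement for isdigit(). */
int ascii_is_digit(int c)
{
    return c >= '0' && c <= '9';
}

/* ASCII-only replacement for isspace(): the six whitespace
   characters of the "C" locale and nothing else. */
int ascii_is_space(int c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\v' || c == '\f';
}
```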

No GNU extensions

We already discussed this for C++, and we have to repeat it for plain C, too: horrible things like nested functions and lots of other monsters invented by the gcc team as “extensions” are not to be used: they are so ugly that even the damn standard committees don't want to adopt them.

Conclusion

This text is not an invitation to any discussion. If you think we're wrong, feel free to keep that opinion, but please don't waste your time and ours trying to convince us. Actually, we tend to consider any attempt to do so a reason for an immediate and permanent ban.

Furthermore, the list of prohibitions given here should be considered incomplete, which means more restrictions can be added to it in the future. On the other hand, none of the restrictions will ever be removed.

Having said all this, we warmly welcome typo reports and corrections to the language this text is written in. The author is not a native English speaker, so any feedback on particular wording, grammar, all these tenses etc. is appreciated, especially when it comes from native English speakers.