C++11 generalized attributes

Generalized attributes are a new C++11 core language feature that is not much talked about. The reason is that attributes are not easily extensible by the user. In their current form, one cannot define custom attributes without also modifying the C++ compiler in one way or another to support them (more on this later). Rather, their official purpose is to allow future C++ extensions without needing to add new keywords and augment the language grammar. But, I think, it is fairly reasonable to expect that attributes will also be used for proprietary extensions as well as to embed domain-specific languages (DSL) into C++. In fact, even the original proposal for this feature suggested that attributes can be used to handle OpenMP parallelism annotations.

The ability to create a C++-embedded DSL is what got me interested in the generalized attributes in the first place. ODB (the project I am currently working on) is a compiler-based C++ ORM that uses a #pragma-based DSL to embed database-related information, such as table and column names, database types, etc., into C++ classes. While the #pragma approach works fairly well, it has its drawbacks, mainly the fact that pragmas cannot mix with C++ constructs; they always have to appear on a separate line. In this respect, C++11 generalized attributes seemed much more flexible since they are part of the language proper. So I decided to explore this feature in more detail to see if it can be considered as a potential future replacement for pragmas in ODB.

But first, let’s see what the generalized attributes are all about. As I mentioned above, there is not much information on this feature. The two primary sources are the original proposal paper called Towards support for attributes in C++ and the C++11 standard itself. The original proposal paper went through many revisions (the link above is to revision 6, which seems to be the last). While the proposed wording changes for the standard in the latest revision are almost (but not exactly) what ended up in the standard, the rest of the paper (discussion, examples, etc.) hasn’t been updated and in many cases is incorrect per the published standard. The problem with the standard is that it is dry, uses its own terminology (what’s a declarator-id?), and often lacks motivation and examples. In the case of attributes, the changes are spread over many chapters, making it difficult to see the whole picture.

I spent quite a bit of time going back and forth between the various revisions of the paper and the standard trying to put the pieces together. The result is what I hope is a more approachable description of the C++11 generalized attributes.

Attributes can be specified for almost any C++ construct. A single attribute or a comma-separated list of attributes must be enclosed in double square brackets. For example:

int x [[foo]];          // foo applies to variable x
void f [[foo, bar]] (); // foo and bar apply to function f

An attribute name can be optionally qualified with a single-level attribute namespace and followed by attribute arguments enclosed in parentheses. The format of attribute arguments is attribute-dependent. For example:

int x [[omp::shared]];
 
[[likely(true)]] if (x == 0)
{
  ...
}
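For a concrete point of reference, here is a minimal sketch that uses attributes real compilers already understand instead of the made-up foo and omp::shared above. The assumptions: [[noreturn]] comes from C++11 itself, [[deprecated]] with a string argument was only standardized later (C++14), and gnu::unused is a vendor attribute in the gnu attribute namespace that compilers other than GCC and Clang may ignore with a warning. The function names are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Standard C++11 attribute, no arguments: fail() never returns normally.
[[noreturn]] void fail (const std::string& what)
{
  throw std::runtime_error (what);
}

// Namespace-qualified vendor attribute on a parameter: gnu::unused is
// understood by GCC and Clang; other compilers may warn and ignore it.
int checked_size (int n, int scratch [[gnu::unused]] = 0)
{
  if (n < 0)
    fail ("negative size");

  return n;
}

// Attribute with an argument (standard since C++14): callers of
// old_size() should get a warning that includes the message.
[[deprecated("use checked_size() instead")]]
int old_size (int n)
{
  return checked_size (n);
}
```

Note that the argument syntax really is up to the attribute: deprecated takes a string literal, while other attributes may take identifiers or expressions.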

Let’s now look at how to specify attributes for various C++ constructs. When we introduce or refer to a named entity, then, with a few exceptions, an attribute applies (or appertains, in the C++ standard terminology) to the immediately preceding entity. In other cases, attributes normally apply to the following entity. Not very clear, I know. In my experience, the best way forward is to get an intuitive understanding by looking at enough examples. So let’s start with the named entities, which are variables, functions, and types. For variables and functions, including function parameters, the attribute is specified after the name (called declarator-id in the standard):

int x [[foo]], y [[bar]]; // foo applies to x,
                          // bar applies to y
 
int f [[foo]] (int i[[bar]]); // foo applies to f
                              // bar applies to i

But we can also specify an attribute at the beginning of the declaration, in which case it applies to all the names (this is done to support syntax similar to the storage class specifier, such as static or thread_local). For example:

[[omp::shared]] int x, y; // omp::shared applies to both
                          // x and y
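The two positions can be tried out with a real attribute. The following is a small sketch that assumes a C++17 compiler, where [[maybe_unused]] is a standard attribute that suppresses unused-entity warnings; the function and variable names are hypothetical.

```cpp
// The attribute after the parameter name applies to parameter c only.
int sum_first_two (int a, int b, int c [[maybe_unused]])
{
  // At the beginning of the declaration: applies to both lo and hi,
  // so neither should draw an unused-variable warning.
  [[maybe_unused]] int lo = a, hi = b;

  // After a name: applies to tmp1 only; tmp2 carries no attribute
  // (and does not need one since it is used below).
  int tmp1 [[maybe_unused]] = a, tmp2 = b;

  return a + tmp2;
}
```

Moving the attribute from one position to the other changes which variables it covers but, for this particular attribute, not the behavior of the code.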

Ok, let’s now look at types. Things are a bit more complicated here and we will start with references to types (just to be clear, by reference here I mean “referring to a previously-declared type using its name” rather than forming a reference, as in “l-value reference”). Similar to variables and functions, an attribute for a reference to a type is specified at the end. For example:

int [[foo]] x; // foo applies to x's type

Note that such an attribute affects a type only for the declaration in which it appears. In other words, the above declaration doesn’t attach attribute foo to the fundamental type int but rather to the type of variable x. Here is another example that illustrates this point:

int [[foo]] x, y; // foo applies to x's and y's type
int z;            // foo does not apply to z's type

If we want to create a type with an attribute, then we can use the typedef or alias declaration:

typedef int [[foo]] int_foo;
using int_foo = int [[foo]];

Interestingly, the above two declarations can also be written like this, which should have the same semantics:

typedef int int_foo [[foo]];
using int_foo [[foo]] = int;

If we are declaring our own class type, then we can also specify its attributes (a roundabout way to achieve the same would be to first declare it without any attributes and then use the typedef or alias declaration to add them). You would expect that attributes come after a class name or a class body, but here we have another exception: the attributes come after the class keyword and before the class name (or body, if there is no name). Here is an example:

class [[foo]] c
{
};
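As an illustration with an attribute that compilers act on, the sketch below marks a class with [[nodiscard]]. This assumes a C++17 compiler (the attribute is not part of C++11); the status type and the save() function are hypothetical.

```cpp
// The attribute sits between the class-key and the class name.
struct [[nodiscard]] status
{
  bool ok;
};

// Any function returning status inherits the "do not ignore" hint.
status save (int id)
{
  return status {id > 0};
}

bool try_save (int id)
{
  // save (id);          // discarding the result should draw a warning
  status s = save (id);  // using the result is fine
  return s.ok;
}
```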

Putting all of the above together we can have quite an elaborate attribute sequence:

[[attr1]] class [[attr2]] c {...} [[attr3]] x [[attr4]], y;
 
// attr1 applies to variables x and y
// attr2 applies to class c
// attr3 applies to x's and y's types
// attr4 applies to variable x

Ok, those are the rules for named entities such as variables, functions, and types. Attributes can also be specified for simple statements, blocks, as well as selection (if, switch), iteration (for, while, do), and jump (break, continue, return) statements. In all these cases, attributes come first. For example:

[[attr1]] for (...) // attr1 applies to for-loop
[[attr2]] {         // attr2 applies to block
  [[attr3]] f ();   // attr3 applies to statement
  [[attr4]] break;  // attr4 applies to break
}
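A statement attribute that exists in practice is [[fallthrough]], which applies to a null statement inside a switch. The sketch below assumes a C++17 compiler (the attribute was added after C++11); the weight() function is hypothetical.

```cpp
int weight (char grade)
{
  int w = 0;

  switch (grade)
  {
  case 'a':
    w += 2;
    [[fallthrough]]; // attribute on a null statement: the missing
                     // break is intentional
  case 'b':
    w += 1;
    break;
  default:
    break;
  }

  return w;
}
```

Here the attribute comes first and the statement it applies to is just the semicolon that follows it.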

Finally, attributes can appear on their own in the namespace scope, for example:

[[attr1]];
 
namespace n
{
  [[attr2]];
}

What do such attributes apply to? The standard seems to indicate that this is attribute-dependent. For example, we can have attributes that always apply to the translation unit, or to the current namespace, or even to the declaration immediately following them.

These are the C++ constructs for which you will most likely encounter or want to add an attribute. There are other less-likely places, such as the base class specification, as well as the case and default labels. Interestingly, one feature that was left attribute-unaware is namespaces. In particular, we won’t be able to write something like this:

namespace [[visibility(hidden)]] n
{
}

My guess is that the standards committee felt that namespace-scope attributes will be sufficient to provide similar functionality:

namespace n
{
  [[visibility(hidden)]];
}

Let’s now get back to the idea of using attributes to embed a domain-specific language (DSL) into C++. To add support for an attribute we would need to modify the compiler. While this probably won’t be possible any time soon for, say, VC++, things are much more promising in the open-source world. GCC has had support for compiler plugins since version 4.5.0 (if you are interested, I wrote a series of posts on parsing C++ with GCC plugins). Among other things, a plugin can register custom pragmas and GCC attributes. I am sure when GCC supports C++11 attributes, it will be possible to register custom ones as well. Clang will probably have a similar mechanism.

Note that this doesn’t mean our DSL has to be limited to just these compilers. If we do some kind of static analysis, then nothing prevents us, for example, from using a GCC-based analyzer in a VC++-based project. Even if the purpose of our DSL is to provide some additional functionality to the resulting application (e.g., persistence or extended type information), then it can be portable if we make it a pre-compiler with standard C++ as its output. This is exactly how ODB works. The GCC-based ODB compiler compiles a header that defines a class and generates a set of C++ files that contain database persistence code for this class. This code can then be compiled with any C++ compiler, including VC++.

Finally, to get a sense of attributes’ usability, I compared a pragma-based DSL example that is currently implemented in ODB to a hypothetical C++11 attribute-based version. A typical persistent class declaration in ODB might look like this:

#pragma db object
class person
{
  ...
 
  #pragma db id auto
  unsigned long id_;
 
  std::string first_;
  std::string last_;
};

And an attribute-based approach could look like this:

class [[db::object]] person
{
  ...
 
  unsigned long id_ [[db::id, db::auto]];
  std::string first_;
  std::string last_;
};

It definitely looks tidier. And here is how they compare if we add a bit more customization:

#pragma db object table("people")
class person
{
  ...
 
  #pragma db id auto type("INT")
  unsigned long id_;
 
  #pragma db column("first_name")
  std::string first_;
 
  #pragma db column("last_name")
  std::string last_;
};

And the attribute-based approach:

class [[db::object, db::table("people")]] person
{
  ...
 
  unsigned long id_ [[db::id, db::auto, db::type("INT")]];
  std::string first_ [[db::column("first_name")]];
  std::string last_ [[db::column("last_name")]];
};

Which one do you like better?
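One practical point in favor of the attribute version is that it remains ordinary C++ for compilers that know nothing about the db namespace. The following is a rough sketch, not something taken from ODB: the db:: attributes are the hypothetical ones from the example above, the accessors and constructor are made up for illustration, and the assumption is that the compiler ignores attributes it does not recognize (typically with a warning; this behavior is only spelled out as a requirement from C++17 on). I left db::auto out to keep the sketch free of keywords inside attributes.

```cpp
#include <string>

// Hypothetical db:: attributes: a compiler without DSL support is
// expected to skip them, so the class behaves like plain C++.
class [[db::object, db::table("people")]] person
{
public:
  person (const std::string& first, const std::string& last)
    : first_ (first), last_ (last) {}

  unsigned long id () const {return id_;}
  const std::string& first () const {return first_;}
  const std::string& last () const {return last_;}

private:
  unsigned long id_ [[db::id, db::type("INT")]] = 0;
  std::string first_ [[db::column("first_name")]];
  std::string last_ [[db::column("last_name")]];
};
```

A DSL-aware tool would read the attributes, while every other compiler would see just the class.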
