Coding Standards

Style

Clang-Format

clang-format is the code formatter of choice for Yc. It is to be run on every C and header file in the repo, with the exception of files directly copied in from other sources, such as src/ur/seedrng.c, which should be left untouched. This is so that when those files are updated from upstream, it is easier to see what changed.

Because the formatting that clang-format produces changes between versions, the only version of clang-format that should be run on the repo is the latest version.

New Files

New C files must use docs/template.c from this repo as a starting point, new public headers must use docs/template.h from this repo as a starting point, and new private headers must use docs/template_internal.h from this repo as a starting point, with the following changes:

  • <year_range> must be replaced by the range of years that copyright applies to.

  • <file_description> must be replaced by a 1-2 sentence description of the content of the file.

  • <GROUP> must be replaced by the name of the module the file is a part of (see below) in all uppercase.

  • <group> must be replaced by the name of the module the file is a part of in all lowercase.

  • <file_path> must be replaced by the full path to the file from the root of the repository.

  • <include_file_path> must be replaced by the full path to the file from the include/ directory.

  • <group_description> must be replaced by the Doxygen description for the module.

  • <internal_group_description> must be replaced by the Doxygen description for the internal portions of the module.

  • The // <stuff> comment should be removed.

The position of the // <stuff> comment is where code, definitions, and declarations should be placed.

In addition, if code in the file is copied and/or modified from some other code, then the following must be put after the Apache License header and before ****** END LICENSE BLOCK ******:

 *	****************************************************************************
 *
 *	Copyright <year> <copyright_holders>
 *
 *	<original_license>

Whitespace

Trailing and other extraneous whitespace is not allowed.

Indent

Indents must be tabs, set to a width of 4 characters, although that setting only applies in clang-format; programmers can set them to whatever they wish in their editors, as long as the style is as though clang-format had been run.

Alignment

After indent to match the line above, continuation lines must be aligned using only spaces. This applies to all alignments.

Continuation width must be 4 spaces in all cases that it is not explicitly specified.

Lines

There must be no more than one empty line between source code lines.

Length

Lines must be no longer than 80 characters.

Braces

Opening braces must be wrapped to their own line. They must not be indented. In addition, there must be no blank line after opening braces.

Expressions

Operands

Operands of an expression that is split over multiple lines must be aligned.

Operators

Operands on a continuation line must be aligned.

There must be one space between an operator and an operand, in every case. If necessary to fit the line length, expressions must be broken after operators. This is to ensure that continuation lines always begin with an operand, since operands are aligned.

If there are parentheses in an expression, then if the expression inside continues on another line, it must line up one space after the opening bracket.

This is bad:

stuff=more_stuff+way_more_stuff-you_know_who_is_going_to_get_you*there_is_a_sad_sort_of_clanging*(he_who_must_not_be_named/infinity_and_beyond+wont_you_be_my_neighbor*star_wars_vs_star_trek)

This is good:

stuff = more_stuff + way_more_stuff - you_know_who_is_going_to_get_you *
        there_is_a_sad_sort_of_clanging * (he_who_must_not_be_named /
                                           infinity_and_beyond +
                                           wont_you_be_my_neighbor *
                                           star_wars_vs_star_trek)

Array Indices

There must be no space between array index square brackets and operands. There must be spaces between operands and operators in square brackets.
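
As a minimal sketch of the rule (the procedure name y_example_sumPairs is hypothetical and exists only for this illustration), the brackets sit directly against the operand, while operators inside the brackets keep their spaces:

```c
#include <stddef.h>

// Sums every pair of adjacent elements in values.
int
y_example_sumPairs(const int* values, size_t len)
{
	int sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; ++i)
	{
		// No space before the brackets; spaces around the + inside them.
		sum += values[i] + values[i + 1];
	}

	return sum;
}
```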

Booleans

All Yzena code should use the <stdbool.h> header for booleans, not ints.

This allows for undefined behavior sanitizers to check for undefined behavior.

In addition, only logical operations should be performed on types that are booleans; math operations are not permitted.
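
A short sketch of what this looks like in practice, assuming a hypothetical procedure named y_example_anyNegative. The boolean is only ever assigned from comparisons and combined with logical operators:

```c
#include <stdbool.h>
#include <stddef.h>

// Returns true if any element of values is negative.
bool
y_example_anyNegative(const int* values, size_t len)
{
	bool found = false;
	size_t i;

	// The bool is only negated and assigned; no math is done on it.
	for (i = 0; !found && i < len; ++i)
	{
		found = values[i] < 0;
	}

	return found;
}
```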

String Literals

String literals can be broken across multiple lines unless they are error messages.

Pointer Alignment

Pointer alignment must be with the type, not the name. In addition, to make that rule not cause problems, pointer and non-pointer declarations must be on separate lines, even when they are the same type.
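
The following sketch shows one way to read the rule. The procedure y_example_strLen is hypothetical; the point is only where the asterisk sits and that the pointer and non-pointer declarations are on separate lines:

```c
#include <stddef.h>

// Returns the length of a NUL-terminated string.
size_t
y_example_strLen(const char* str)
{
	// The asterisk is attached to the type, and the char pointer and the
	// plain char are declared on separate lines.
	const char* cursor = str;
	char current = *cursor;

	while (current != '\0')
	{
		cursor += 1;
		current = *cursor;
	}

	return (size_t) (cursor - str);
}
```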

Declarations

All declarations needed in a block must be at the beginning of the block.

Casts

Casts must have a space after them and must not have spaces in the parentheses.
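
For illustration only, with a hypothetical procedure named y_example_lowByte, a cast in this style might look like:

```c
#include <stdint.h>

// Returns the lowest 8 bits of value.
uint8_t
y_example_lowByte(uint32_t value)
{
	// No spaces inside the cast parentheses; one space after the cast.
	return (uint8_t) (value & 0xFFu);
}
```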

Includes

Includes must be grouped in the following priority order:

  1. Public headers for the repo.

  2. Private headers for the current module in the repo.

  3. Private headers for parent modules in the repo.

  4. Private headers for all other modules in the repo.

  5. Headers for dependencies.

  6. System headers.

The .clang-format file includes this sorting order.

Within groups, #includes must be sorted alphabetically.

If needed, #define values that affect headers must be placed right before including those headers, or if there are multiple, right before the first of the headers.

This is bad:

#define stuff
#include <not_affected_by_stuff.h>
#include <affected_by_stuff.h>

This is good:

#include <not_affected_by_stuff.h>
#define stuff
#include <affected_by_stuff.h>

Procedures

Declarations

Procedure declarations must have the return type on the line before the procedure name. There must be no space before the opening bracket. The parameters must be on the same line as the name, unless they do not fit in the line limit, in which case the overflowing parameters must be wrapped, aligning to the opening bracket.

There must be no space between the opening bracket and the first parameter, as well as no space between the last parameter and closing bracket. There must be no space or void in between parentheses if the parameter list is empty.
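
A sketch of a declaration and definition laid out this way follows. The procedure y_example_clampedCopyLength and its parameters are hypothetical; the long names are chosen only to force the parameter list to wrap:

```c
#include <stddef.h>

// The return type is on its own line, and the overflowing parameter is
// aligned to the opening bracket using spaces.
size_t
y_example_clampedCopyLength(size_t source_length, size_t destination_capacity,
                            size_t maximum_length);

size_t
y_example_clampedCopyLength(size_t source_length, size_t destination_capacity,
                            size_t maximum_length)
{
	size_t result = source_length;

	if (result > destination_capacity) result = destination_capacity;
	if (result > maximum_length) result = maximum_length;

	return result;
}
```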

Calls

Actual parameters to procedures must be aligned after the opening bracket. There must be no space before the opening bracket. There must be no space between the opening bracket and the first argument, as well as no space between the last argument and the closing bracket. There must be no space in between parentheses if the argument list is empty.

Preprocessor

Macros

All preprocessor macros that do not take arguments must have their definitions surrounded by parentheses.
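
One likely reason for the rule is operator precedence at the point of use. The sketch below uses two hypothetical macros to show the difference; neither name exists in Yc:

```c
// Without parentheses, the expansion mixes with surrounding operators.
#define y_EXAMPLE_BAD_WIDTH 4 + 4

// With parentheses, the definition always behaves as a single value.
#define y_EXAMPLE_WIDTH (4 + 4)

// Returns height multiplied by the parenthesized width of 8.
int
y_example_area(int height)
{
	return height * y_EXAMPLE_WIDTH;
}
```

With the unparenthesized macro, an expression such as 2 * y_EXAMPLE_BAD_WIDTH expands to 2 * 4 + 4 rather than 2 * (4 + 4).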

Any macros that are meant only for use in #if and other preprocessor constructs can use the YC_ prefix. All other macros, ones that are meant to be used as though they are actually C code, should use the y_ prefix.

Any macros that have the YC_ prefix must be defined in all cases. They must be defined to 0 when not active, and they must always be used with #if. No #ifdef’s or #ifndef’s are allowed for Yc macros. They are allowed for system-defined macros.
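
As a sketch, assume a hypothetical macro YC_EXAMPLE_DEBUG that a real build would receive from the build system. It is defined here, to 0, only so the example stands alone, and it is tested with #if rather than #ifdef:

```c
// Hypothetical macro: always defined, 0 when inactive and 1 when active.
#define YC_EXAMPLE_DEBUG (0)

// Returns the log level to use for a requested level.
int
y_example_logLevel(int requested)
{
#if YC_EXAMPLE_DEBUG
	return requested;
#else
	return requested > 1 ? 1 : requested;
#endif // YC_EXAMPLE_DEBUG
}
```

Because the macro is always defined, a misspelled name in an #if can be caught with compiler warnings about undefined macros, which #ifdef would silently accept.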

Escaped newlines in preprocessor macros must be aligned as far to the left as possible while leaving at least one space between the last significant character and the backslash (\). Also, if a macro goes beyond one line, the macro definition must be on a different line than the macro name and the definition must be indented.

Continuation lines in macros must be aligned in the same way as expressions.

This is bad:

#define SOME_LONG_MACRO SOME_REALLY_LONG_VALUE_WHICH_I_DO_NOT_REALLY_CARE_ABOUT + SOME_OTHER_REALLY_LONG_VALUE_THAT_I_DO_NOT_CARE_ABOUT_BUT_MUST_HAVE_4_CHAR_COUNT + ANOTHER_VALUE_WHOSE_SOLE_PURPOSE_IS_TO_ADD_A_THIRD_LINE

This is good:

#define SOME_LONG_MACRO                                                                 \
	(SOME_REALLY_LONG_VALUE_WHICH_I_DO_NOT_REALLY_CARE_ABOUT +                          \
	 SOME_OTHER_REALLY_LONG_VALUE_THAT_I_DO_NOT_CARE_ABOUT_BUT_MUST_HAVE_4_CHAR_COUNT + \
	 ANOTHER_VALUE_WHOSE_SOLE_PURPOSE_IS_TO_ADD_A_THIRD_LINE)

Typically, macros must only return values; they cannot create control flow, which includes returning from the current function. The only exceptions to this are meant to be in yc system code.

Comments

The purpose of comments in code is to aid understanding of the code itself.

Comment Style

Trailing comments must never be used, except to label what an #endif is closing. Otherwise, comments must always come on the previous line, including functional comments like // NOLINTNEXTLINE.

Comments must use C++-style comments (//). All comments with prose must have a space after the comment characters. All code that is commented out must not have spaces after the comment characters.

Types of Comments

There are 9 types of comments (including doc comments):

  • Function comments.

  • Design comments.

  • Why comments.

  • Teacher comments.

  • Checklist comments.

  • Guide comments.

  • Trivial comments.

  • Debt comments.

  • Backup comments.

“Function comments” are doc comments and will be talked about in their own section below.

“Design comments” are about design, and this post says about design comments:

The design comment basically states how and why a given piece of code uses certain algorithms, techniques, tricks, and implementation. It is an higher level overview of what you’ll see implemented in the code. With such background, reading the code will be simpler. Moreover I tend to trust more code where I can find design notes. At least I know that some kind of explicit design phase happened, at some point, during the development process.

Design comments are required, insofar as it is possible to write them. Any bit of design that is not in the design documents should be in comments in the code.

“Why comments” are about the why:

Why comments explain the reason why the code is doing something, even if what the code is doing is crystal clear.

Why comments are allowed and are encouraged. Be as detailed as you wish and maybe a lot more.

“Teacher comments” are about the domain:

They teach…the domain (for example math, computer graphics, networking, statistics, complex data structures) in which the code is operating, that may be one outside of the reader skills set, or is simply too full of details to recall all them from memory.

Teacher comments are allowed and encouraged.

“Checklist comments” are meant to give a checklist of something to do when certain things change:

Specifically a checklist comment does one or both of the following things:

  • It tells you a set of actions to do when something is modified.

  • It warns you about the way certain changes should be operated.

Checklist comments are required.

“Guide comments” are about the “what” in that they describe what code is doing:

Guide comments’ sole reason to exist is to lower the cognitive load of the programmer reading some code.

They do not add anything to the code. However, they are allowed and, in fact, required, so long as they do not drop to the level of trivial comments (see below).

Guide comments are headers to every code “paragraph,” or set of expressions and statements that logically go together. Unless such a guide comment would become a trivial comment, every code paragraph should have a guide comment header.

“Trivial comments” are guide comments that do not lower the cognitive load of the programmer reading the code. They are not allowed.

An example:

// Increment the length of our array.
array_len++;

“Debt comments” are statements about technical debt such as the existence of known bugs (// XXX: ... or // FIXME: ...) or TODO’s (// TODO: ...). They are required, but the problem should be fixed as soon as possible. At that point, the debt comment can be removed.

“Backup comments” are when people comment out code because they are too insecure to delete it when a change happens. Backup comments are not allowed; Yc is under version control.

Instead, put “version comments,” which are comments placed in the code where the old code was deleted and that reference the commit in which the code was deleted, such as:

// VERSION: Code here was deleted in commit 189eg1890egab.

This allows the deleted code to be easily found, if necessary.

Doc Comments

This repo has Doxygen enabled. There is a build option to enable generating documentation from doc comments. The option, if enabled, must make a separate build target that will only build documentation. In other words, building documentation is not required. Also, the option must be enabled by default.

There is a separate option for building documentation from private doc comments. This is for development only, and it should be disabled by default.

When building documentation for public use, documentation from private doc comments should be disabled, and documentation from public doc comments must be enabled.

All struct definitions (including fields), enum definitions (including each value unless the enum is for assert messages), union definitions (including each field), function definitions, and preprocessor macro definitions must be fully documented. In other words, with documentation generation on for both public interface and private data/code (if applicable), Doxygen must return no errors or warnings.

Everything but struct fields, enum values, and union fields must be documented using this style of Doxygen comment:

/**
 *
 */

The remaining ones must be documented with “three-slash” doc comments (i.e., the doc comment begins with three slashes like this: ///). In that case, there must be a space after the three slashes.
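
Putting the two styles together might look like the sketch below. The struct y_example_buf and the procedure y_example_buf_remaining are hypothetical and are not part of Yc:

```c
#include <stddef.h>

/**
 * A hypothetical growable buffer of bytes.
 */
typedef struct y_example_buf
{
	/// The bytes stored in the buffer.
	unsigned char* data;

	/// The number of bytes currently in use.
	size_t len;

	/// The number of bytes allocated.
	size_t cap;

} y_example_buf;

/**
 * Returns the number of allocated but unused bytes in @a buf.
 * @param buf  The buffer to query. Must not be NULL.
 * @return     The count of unused bytes.
 */
size_t
y_example_buf_remaining(const y_example_buf* buf)
{
	return buf->cap - buf->len;
}
```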

Control Flow

There must be a space between a control flow keyword and the opening bracket.

Single Lines

Loops and if statements can elide braces if the statement is only one line, if there are no other control flow statements within, and if the complete statement can fit into 80 characters. If so, it must be put on the same line as the control flow statement.

If an if statement has an else, both must be only one line to elide the braces. This includes every part of an else if chain. In addition, the statement must be on a separate line from the if or else.

An else if is an exception to the requirements to be on the same line; an else if should always be set together.
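
The sketch below is one reading of these rules, using a hypothetical procedure named y_example_sign. It assumes that a lone if keeps its statement on the same line, while an if with an else puts each statement on its own line:

```c
// Returns -1, 0, or 1 according to the sign of value.
int
y_example_sign(int value)
{
	// A lone if with a one-line statement: same line, no braces.
	if (value == 0) return 0;

	// An if with an else: each statement is on its own line, no braces.
	if (value < 0)
		return -1;
	else
		return 1;
}
```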

Switch

All case statements must be indented one level beyond the switch statement. All case statements’ bodies must be indented one level beyond the case statement itself. There must be a blank line between case statements, except in the case of a fallthrough which must not have a blank line between case statements.

If more than one case has the exact same code, it must not be duplicated, and the cases shall be stacked on top of each other without braces. No fallthrough comment is required.

All case statements in a switch must have braces. The break statement (if it exists) must be inside the braces. This is to prevent compile errors when variables are assigned inside a case statement.

If there is no break statement, there must be the following comment and statement before the closing brace and indented to one level more than the closing brace:

// Fallthrough.
y_fallthrough;

This is to tell compilers and other developers that the omission is intentional. y_fallthrough uses compiler directives where possible as well.

This is bad:

switch(stuff)
{
case 0: return 1;
case 1:
	printf("stuff");
	break;
case 2:
	printf("stuff1");
case 3:
	printf("stuff2");
	break;
case 4:
case 5:
case 6:
	printf("stuff again");
	break;
}

This is good:

switch (stuff)
{
	case 0:
	{
		return 1;
	}

	case 1:
	{
		printf("stuff");
		break;
	}

	case 2:
	{
		printf("stuff1");
		// Fallthrough.
		y_fallthrough;
	}
	case 3:
	{
		printf("stuff2");
		break;
	}

	case 4:
	case 5:
	case 6:
	{
		printf("stuff again");
		break;
	}
}

Standards

Modules

Code must be split up into modules. Generally, a module consists of one source file, and all of the code in the source file must be related. Module names must only be one word, and are used in naming.

Names

These naming rules are here to prevent name clashes within the repo and also when the libraries are used in outside code.

Public data structures (structs, enums, and unions) must all be named in the following manner:

<Module>_[<Submodule>_](<Type>)*

where <Module> is usually y_ but can be a name like Yao, Yvm, or Rig, and must start with a capital letter, and <Type> is the type name with a capital letter and in CamelCase.

Examples:

y_Vector
y_Map
y_Multiplexer
rig_Rig

Private data structures must use the same prefix as public ones. They can be capitalized however would be best, with a bias towards lowercase snake case.

Examples:

y_vec
y_map
rig_rig

All preprocessor macros must be of the form:

<MODULE>(_<SUBMODULE>)*_<MACRO_NAME>

where <MODULE> is the name of the module with lowercase letters and <SUBMODULE> is the name of submodules with all capital letters, and <MACRO_NAME> is the macro name with all capital letters.

<MODULE> is almost always going to be y.

All procedure names must be of the form:

<module>(_<submodule>)*_<procedureName>()

where <module> and <submodule> are names of the module and submodules with all lowercase letters, and <procedureName> is the name of the procedure in camelCase with a lowercase first letter.

Typedefs

Data structures (structs, enums, and unions) must all be defined in headers. If the data structure is private, it must be defined in a private header in the source directory.

Struct, enum, and union definitions must always be typedefed as follows:

typedef (struct|enum|union) <name>
{
	...

} <name>;

This is to prevent proliferation of the struct, enum, and union keywords throughout code.

Repo Structure

This repo must be structured with at least the following subdirectories:

  • docs (build system and Doxygen code to generate docs)

  • include (public headers)

  • lib (git submodules or copies of dependencies)

  • src (source code)

  • tests (tests)

  • tools (tools and scripts for the repo)

In addition, the include directory must contain a directory that has the same name as the repo. This is to prevent header clash with the C standard library and other standard headers.

Cross-Platform Builds

All code must be buildable on any supported platform at all times (except in early development, which only includes development time before any beta is released).

Dependencies

This repo must have minimal dependencies. If dependencies are necessary, they must be Yzena projects as much as possible.

Dependencies are allowed in the following cases:

  1. Security (use audited libraries).

  2. Code that is not a “core business function” for which there are well-maintained and regularly released alternatives with a known person to contact and pay if necessary.

As an example, Yar will use SQLite, Curl, and BearSSL.

Submodules

Modules that are applications must have all of their dependencies as git submodules to enable building without dependency failures.

Libraries must only have dependencies as git submodules if the library also encapsulates the dependency.

Error Checking

All error checking must be explicit and must never be skipped, even for malloc() and its companions. There is one exception: a call to a function that itself checks for errors and jumps out if one exists. In that case, the called function will not return an error, and the calling function will not have to check for one.

When checking for an error, it is not allowed to take a shortcut assuming the value of either the error or success value.

In addition, when checking a value for an error, the macro y_err must be used. This makes it clearer that it is checking for an error, and on certain compilers, it generates much more efficient code for the non-error case.

This is bad:

y_Status status = y_vec_push(vector, &value);
if (status) return status;

This is good:

y_Status status = y_vec_push(vector, &value);
if (y_err(status != y_STATUS_SUCCESS)) return status;
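
To show the pattern in a form that compiles on its own, the sketch below supplies stand-in definitions of y_Status, y_STATUS_SUCCESS, and y_err. These stand-ins are assumptions made for illustration; the real definitions live in the Yc headers and may differ. The branch hint via __builtin_expect is likewise only a guess at how y_err could produce better code for the non-error case. The procedures y_example_store and y_example_storeTwice are hypothetical.

```c
#include <stddef.h>

// Stand-in status type; assumed, not copied from Yc.
typedef enum y_Status
{
	y_STATUS_SUCCESS,
	y_STATUS_INVALID,

} y_Status;

// Stand-in for y_err: hints that the error branch is unlikely.
#if defined(__GNUC__) || defined(__clang__)
#define y_err(e) (__builtin_expect(!!(e), 0))
#else
#define y_err(e) (e)
#endif // defined(__GNUC__) || defined(__clang__)

// Stores value in slot, failing if slot is NULL.
y_Status
y_example_store(int* slot, int value)
{
	if (y_err(slot == NULL)) return y_STATUS_INVALID;

	*slot = value;

	return y_STATUS_SUCCESS;
}

// Stores value in both slots, checking every call explicitly.
y_Status
y_example_storeTwice(int* first, int* second, int value)
{
	y_Status status;

	status = y_example_store(first, value);
	if (y_err(status != y_STATUS_SUCCESS)) return status;

	status = y_example_store(second, value);
	if (y_err(status != y_STATUS_SUCCESS)) return status;

	return y_STATUS_SUCCESS;
}
```

Comparing against y_STATUS_SUCCESS by name means the check stays correct even if the numeric value of the success status changes.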

Libraries

The contents of this section apply only to libraries; repos that build only an application do not have to follow these guidelines.

Encapsulation

If the library deals with any data that clients (users of the library) must not touch directly, the library must encapsulate all of its data and expose a handle to that data behind a typedef to an opaque pointer.

As an example from vectors:

typedef struct y_vec* y_Vector;

where y_vec is not defined (it must be defined in private headers).

Public procedures must (usually) then take the opaque pointer (y_Vector in this case) as the first argument, and non-public procedures must (usually) take the defined pointer (y_vec* in this case) as the first argument. This allows developers to easily differentiate between the two when looking at source code.

Creating a typedef with a void pointer (void*) is not allowed; it requires superfluous casting and the compiler cannot warn clients of the library when they are accidentally passing the wrong type (like passing a y_Vector to procedures that take a y_NVector).

If there is any data that clients must have access to, it must be accessible through getters and setters, whose names must be of the following forms:

// Getter
<module>(_<submodule>)*_<dataName>()

// Setter
<module>(_<submodule>)*_set<DataName>()

where <module> and <submodule> are the names of the module and submodules in all lowercase, <dataName> is the name of the data in camelCase with a lowercase first letter, and <DataName> is the name of the data in CamelCase with a capital first letter.

Exceptions to the setter rule can be made where they make sense. For example, if there is a boolean field called enabled in a struct called layout, such layouts might be enabled, disabled, and queried for their enabled status with these procedures:

bool
y_layout_enabled();

void
y_layout_enable();

void
y_layout_disable();
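
The encapsulation and accessor rules in this section can be sketched together as follows. Everything named y_example_counter or y_example_Counter is hypothetical. In a real library, the handle typedef would sit in a public header and the struct definition in a private header; they share one block here only so the sketch is self-contained.

```c
#include <stdlib.h>

// Public header portion: clients see only the opaque handle.
typedef struct y_example_counter* y_example_Counter;

// Private header portion: the struct is defined out of clients' view.
typedef struct y_example_counter
{
	/// The current count.
	unsigned long count;

} y_example_counter;

// Creates a counter, or returns NULL if allocation fails.
y_example_Counter
y_example_counter_create(unsigned long start)
{
	y_example_counter* c = malloc(sizeof(y_example_counter));

	if (c == NULL) return NULL;

	c->count = start;

	return c;
}

// Getter
unsigned long
y_example_counter_count(y_example_Counter c)
{
	return c->count;
}

// Setter
void
y_example_counter_setCount(y_example_Counter c, unsigned long count)
{
	c->count = count;
}

// Releases a counter created by y_example_counter_create.
void
y_example_counter_free(y_example_Counter c)
{
	free(c);
}
```

Because the handle is a pointer to a named but undefined struct rather than a void pointer, the compiler can reject a call that passes a handle of the wrong type.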