Lexical aspects

Encoding§

Source code consists of ASCII characters only. Multi-byte characters (UTF-8 encoded only) are allowed within string literals and comments. This allows Styx code to be lexed without decoding.
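
The reason no decoding is needed follows from a property of UTF-8: every byte of a multi-byte sequence has its high bit set, so it can never be mistaken for an ASCII delimiter such as a double quote or a slash. The following is a minimal Python sketch of that property, not part of the Styx lexer; the function name opaque_bytes is hypothetical.

```python
def opaque_bytes(source: bytes) -> list:
    """Return the offsets of the bytes that are not plain ASCII.

    A byte-oriented lexer can step over these inside string literals and
    comments without ever decoding them.
    """
    return [i for i, b in enumerate(source) if b >= 0x80]


# A string literal and a line comment, both holding multi-byte characters.
sample = '"héllo" // ☃'.encode("utf-8")
offsets = opaque_bytes(sample)

# None of the non-ASCII bytes collides with a delimiter the lexer looks for.
collisions = [i for i in offsets if sample[i] in b'"/*\n']
```

Here collisions is always empty, which is why the closing quote or the end of a comment can be found by comparing raw bytes.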

SheBang line§

A Styx source file may start with a Unix-style shebang line:

#! /usr/bin/styx
unit u;
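
How a lexer might step over such a line can be sketched in a few lines of Python. This is an illustration only, not the actual implementation; skip_shebang is a hypothetical helper name.

```python
def skip_shebang(source: str) -> int:
    """Return the offset at which regular lexing should start."""
    if source.startswith("#!"):
        end = source.find("\n")
        # Without a newline the whole file is the script line.
        return len(source) if end == -1 else end + 1
    return 0


src = "#! /usr/bin/styx\nunit u;\n"
rest = src[skip_shebang(src):]
```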

Keywords§

The following identifiers are reserved and cannot be used directly as identifiers:

Keywords used for declarations or statements§

Keywords used for the built-in types§

Keywords used for special values§

Identifier§

Identifier ::= [a-zA-Z_][\w]*

Identifiers start with an alphabetic character or an underscore, followed by any number of alphanumeric characters or underscores. While keywords cannot be used as identifiers, a special syntax exists to denote an identifier that would otherwise be a keyword. This works by prefixing the keyword with a $. Although the bare keyword remains unusable as an identifier in the code, this is useful when such a name has to be written, for example when RTTI or serialization is involved.

_identifier // ok
identifier  // ok
false       // error
$false      // ok
5th_wheel   // error
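
The rules above can be checked with a short Python sketch. It follows the prose description; the KEYWORDS set is a placeholder holding only the one keyword used in the examples, and accepting the $ form only when the rest is a keyword is an assumption.

```python
import re

# re.ASCII restricts \w to [a-zA-Z0-9_], matching the ASCII-only source rule.
IDENTIFIER = re.compile(r"[a-zA-Z_]\w*", re.ASCII)

# Placeholder: the real list of reserved words is longer.
KEYWORDS = {"false"}


def is_valid_identifier(text, keywords=KEYWORDS):
    """Check a token against the identifier rules described above."""
    if text.startswith("$"):
        # Escaped form: assumed valid when what follows is a keyword.
        return text[1:] in keywords
    return IDENTIFIER.fullmatch(text) is not None and text not in keywords


results = {
    name: is_valid_identifier(name)
    for name in ["_identifier", "identifier", "false", "$false", "5th_wheel"]
}
```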

IntegerLiteral§

Grammar§

Note that the RHS is a regex.

IntegerLiteral ::= [1-9][0-9_]*

Description§

Integer literals are made of decimal digits, optionally separated by underscores past the first decimal digit.

1000000     // ok
1_000_000   // ok
_1_000_000  // error
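
As an illustration, the grammar can be applied directly in Python and the value obtained by dropping the separators. This is a sketch, not the Styx lexer; parse_integer_literal is a hypothetical name.

```python
import re

# Regex copied from the grammar above.
INTEGER_LITERAL = re.compile(r"[1-9][0-9_]*")


def parse_integer_literal(text):
    """Return the value of a decimal literal, or None when it is rejected."""
    if INTEGER_LITERAL.fullmatch(text) is None:
        return None
    # Underscores are only visual separators and carry no value.
    return int(text.replace("_", ""))
```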

HexLiteral§

Grammar§

Note that the RHS is a regex.

HexLiteral ::= (0x|0X)[0-9A-Fa-f][0-9A-Fa-f_]*

Description§

Hex literals are made of a prefix, either 0X or 0x, followed by hex digits, optionally separated by underscores past the first hex digit.

0xDEADBEEF      // ok
0X1111_F0F0     // ok
0x_B            // error

BinLiteral§

Grammar§

Note that the RHS is a regex.

BinLiteral ::= (0b|0B)[01][01_]*

Description§

Bin literals are made of a prefix, either 0B or 0b, followed by bin digits, optionally separated by underscores past the first bin digit.

0b1001001   // ok
0B10_10_10  // ok
0b_10       // error
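
The HexLiteral and BinLiteral rules share the same shape: a two-character prefix, one mandatory digit, then digits or underscores. A Python sketch can therefore handle both with one table. The function and table names are hypothetical and the code only illustrates the grammar.

```python
import re

# Prefix letter -> (regex for the digits after the prefix, numeric base).
PREFIXED = {
    "x": (re.compile(r"[0-9A-Fa-f][0-9A-Fa-f_]*"), 16),
    "b": (re.compile(r"[01][01_]*"), 2),
}


def parse_prefixed_literal(text):
    """Return the value of a hex or bin literal, or None when rejected."""
    if len(text) < 3 or text[0] != "0":
        return None
    entry = PREFIXED.get(text[1].lower())  # accepts 0x/0X and 0b/0B
    if entry is None:
        return None
    digits_re, base = entry
    digits = text[2:]
    if digits_re.fullmatch(digits) is None:
        return None
    return int(digits.replace("_", ""), base)
```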

FloatLiteral§

Grammar§

Note that the RHS is a regex.

FloatLiteral ::= [0-9][0-9_]*.[0-9][0-9_]*+((E|e)(\+|-)[0-9_]+)?

Description§

Float literals begin with an optional integral part, followed by a dot and then the fractional part. Within each part, digits can be separated by underscores past the first digit of the part. If the integral part is present then the fractional part becomes optional. The fractional part may end with an exponent, that is e or E, an optional sign and an integral number.

0.0     // ok
.0      // ok
0       // ok
_0.0    // error
._0     // error
0.0E+1  // ok
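
The following Python sketch encodes the prose description and the examples of this section, rather than the regex as written in the grammar. Two points are assumptions: the dot and the fractional part are treated as one optional unit, and the exponent digits follow the same underscore rule as the other parts.

```python
import re

PART = r"[0-9][0-9_]*"          # digits, underscores allowed past the first
EXPONENT = r"[eE][+-]?" + PART  # assumption: same digit rule for the exponent

# Either [integral] '.' fractional [exponent], or an integral part alone.
FLOAT_LITERAL = re.compile(
    r"(?:(?:{p})?\.{p}(?:{e})?|{p})".format(p=PART, e=EXPONENT)
)


def is_float_literal(text):
    """True when the whole text is accepted by the sketch above."""
    return FLOAT_LITERAL.fullmatch(text) is not None


checks = {s: is_float_literal(s) for s in ["0.0", ".0", "0", "_0.0", "._0", "0.0E+1"]}
```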

Sign prefixes§

There are no sign prefixes; however, a NegExpression can be used to negate literals, and the negation is evaluated at compile time.
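
In practice this means the minus sign and the digits are two separate tokens, and a later stage folds them into one constant. The Python sketch below illustrates the idea with a hypothetical two-token lexer; it is not how the Styx compiler is actually structured.

```python
import re

# Two token kinds only: a minus sign and a decimal integer literal.
TOKEN = re.compile(r"\s*(?:(?P<minus>-)|(?P<int>[1-9][0-9_]*))")


def tokenize(source):
    """Split the source into ('minus', '-') and ('int', digits) tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError("unexpected input at offset %d" % pos)
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def fold_negation(tokens):
    """Fold leading minus tokens into the literal that follows them."""
    sign = 1
    for kind, text in tokens:
        if kind == "minus":
            sign = -sign
        else:
            return sign * int(text.replace("_", ""))
    raise ValueError("no literal to negate")
```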

Suffixes§

There are no suffixes to denote the type. This is done using a CastExpression, which is always evaluated at compile time when it follows a literal and the target is a basic type.

StringLiteral§

Grammar§

Note that the RHS is a regex.

StringLiteral ::= "([\w\s](\\")*)*"

Description§

String literals are enclosed in double quotes and may contain escape sequences.

"foo bar"           // foo bar
"foo\tbar"          // foo    bar
"\"foo\" \"bar\""   // "foo" "bar"

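The three examples above can be reproduced with a small Python sketch. It handles only the two escapes that appear in those examples (a tab and a double quote) and rejects anything else; the helper name string_literal_value is hypothetical.

```python
# Only the escapes shown in the examples; other sequences are not covered.
ESCAPES = {"t": "\t", '"': '"'}


def string_literal_value(token):
    """Return the text denoted by a double-quoted literal token."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a string literal")
    body, out, i = token[1:-1], [], 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1  # look at the character after the backslash
            if i == len(body) or body[i] not in ESCAPES:
                raise ValueError("escape not covered by this sketch")
            ch = ESCAPES[body[i]]
        out.append(ch)
        i += 1
    return "".join(out)
```
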
Recognized escape sequences are

RawStringLiteral§

Grammar§

Note that the RHS is a regex.

RawStringLiteral ::= `\w*`

Description§

Raw string literals are enclosed in backticks. Escape sequences are not processed.

`raw "string"\0 literal`  // raw "string"\0 literal
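
As the example shows, the content between the backticks is taken verbatim. In a Python sketch this amounts to stripping the two delimiters; raw_string_value is a hypothetical name.

```python
def raw_string_value(token):
    """Return the text of a backtick-delimited literal, unchanged."""
    if len(token) < 2 or token[0] != "`" or token[-1] != "`":
        raise ValueError("not a raw string literal")
    return token[1:-1]


# The backslash and the double quotes come out exactly as written.
value = raw_string_value(r'`raw "string"\0 literal`')
```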

Character literals§

There are no character literals. String literals containing a single character are used for this purpose. When inference is involved, s8* is preferred.

const s8 = "a";     // s8
const auto = "a";   // s8*

Comments§

Line comments§

Line comments start with a double slash and end at the end of the line.

not_commented1 // commented
not_commented2

Multi line comments§

Multi-line comments start with a slash followed by an asterisk and end with an asterisk followed by a slash. They may span several lines.

They can contain UTF-8 encoded multi-byte characters, except directional overrides.

not_commented1 /* commented
commented
*/ not_commented2
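
Both comment forms can be skipped with a simple scan. The Python sketch below is an illustration, not the actual lexer. It assumes that "directional overrides" means U+202D and U+202E (the actual lexer may reject more), and skip_comment is a hypothetical name.

```python
# Assumption: the two Unicode override characters, LRO and RLO.
DIRECTIONAL_OVERRIDES = {"\u202d", "\u202e"}


def skip_comment(source, pos):
    """Return the offset just past the comment starting at pos.

    Returns pos unchanged when no comment starts there.
    """
    if source.startswith("//", pos):
        end = source.find("\n", pos)
        # The newline itself is not part of the comment.
        return len(source) if end == -1 else end
    if source.startswith("/*", pos):
        end = source.find("*/", pos + 2)
        if end == -1:
            raise ValueError("unterminated multi-line comment")
        if any(c in DIRECTIONAL_OVERRIDES for c in source[pos:end]):
            raise ValueError("directional override in comment")
        return end + 2
    return pos


src = "not_commented1 /* commented\ncommented\n*/ not_commented2"
after = src[skip_comment(src, src.index("/*")):]
```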