Lexical Structure
Lexemes are recognized as the longest sequences of characters matching a token description. If one wants two lexemes that would be interpreted as one when written one after another, they must be separated by whitespace.
Whitespace
<whitespace> ::= <whitechr>+
<whitechr> ::= | \n | \r | \f | \t | \v | <unispace>
Legal whitespaces of ZK-SecreC consist of one or more whitespace characters. Blank space, line feed, carridge return, form feed, tab and vertical tab are interpreted as whitespace characters. In addition, every Unicode space character fits.
Comments
ZK-SecreC supports line comments and block comments. Line comments start with two slashes (//
) and end with a line break. Block comments are opened with /*
and closed with */
. Block comments can be nested.
Names
Identifiers
<varid> ::= <idfirst> <idother>*
<idfirst> ::= <large> | <small>
<idother> ::= <idfirst> | <digit> | '
<large> ::= <asclarge> | <unilarge>
<asclarge> ::= [A-Z]
<small> ::= _ | <ascsmall> | <unismall>
<ascsmall> ::= [a-z]
<digit> ::= <decdigit> | <unidigit>
<decdigit> ::= [0-9]
An identifier is a Unicode string that can contain alphanumeric characters, decimal digits, underscores and apostrophes. Identifiers must begin with an alphanumeric character or an underscore.
Operators
ZK-SecreC allows defining new operators. Operators consist of one or more characters in the following supply:
!#$%&*+/<=>?@\^|-~
Special character sequences
Some character sequences are reserved for special purposes and cannot be used as names of user-defined identifiers or operators.
Keywords
ZK-SecreC has the following keywords:
_
arr
as
bool
break
Challenge
continue
Convertible
dbg_assert_eq
dbg_assert
default
Domain
eff
else
ExtendedArithmetic
extern
false
Field
fn
for
forall
if
impl
in
inf
infix
infixl
infixr
let
list
mut
Nat
PermuatationCheck
post
pre
prover
pub
public
Qualified
rec
ref
return
self
Self
sieve
store
Stage
string
struct
trace
true
tuple
type
uint
unchecked
unit
Unqualified
use
Vectors
Vectorization
verifier
where
while
wire
with
witness
zip
Keywords self
and Self
have no purpose in the current version of the language. They are reserved for prospective future extensions. Keyword witness
is no more used in the language but it is still reserved for giving a proper error message in case it is used in a program.
Special symbol sequences
There are also symbol sequences that cannot be used as operators since they have other meaning in ZK-SecreC programs:
->
=>
@
$
;
::
:
,
=
Symbol sequences +
, *
, <=
, !
also have predefined meaning but still serve as builtin operators. In addition, special symbol sequences .
and ..
are used in ZK-SecreC.
Parentheses
The following symbol sequences are used as opening delimiters:
(
.(
[
{
{#
{##
Closing delimiters always consist of a single symbol:
)
]
}
When matching the delimiters, all symbols in the opening delimiter other than parentheses are ignored. For instance,
{ ... .( ... {# ... } ... [ ... ] ... ) ... }
has balanced delimiters (assuming that there are no delimiters other than those explicitly shown): Delimiters 1 and 2 match with delimiters 8 and 7, respectively, whereas delimiters 3 and 4 match each other and delimiters 5 and 6 match each other.
Literals
Unsigned integer literals
Unsigned integer literals can be written as binary, octal, decimal or hexadecimal numbers. Binary, octal and hexadecimal numbers must start with corresponding prefix:
Prefix | Radix |
---|---|
0b | binary |
0o | octal |
0x | hexadecimal |
Binary, octal and decimal numbers are written using the usual digit characters. Hexadecimal numbers can also contain letters a-f, either capitalized or not. An integer literal may contain any positive number of digits.
Boolean literals
There are two built-in boolean literals: true
and false
.
String literals
String literals are delimited by the double quote ("
) characters. Any double quote characters within the string must be escaped by a backslash character. A backslash itself must be escaped, too. One can use \n
and \t
to encode the newline and the tabular character, respectively. It is also allowed to encode characters via their ASCII/Unicode codes in decimal like in \65
, as well as in binary, octal or hexadecimal notation like in \b1000001
, \o101
and \x41
.