GP1 is a statically typed, multi-paradigm programming language with an emphasis on brevity and explicitness. It provides both value and reference types, as well as higher-order functions and first-class support for many common programming patterns.
This document serves as a quick, informal reference for developers of GP1 (or anyone who's curious).
A variable is declared with either the var or con keyword, for mutable and immutable assignment respectively, alongside the assignment operator, <-. An uninitialized variable MUST have an explicit type, and cannot be accessed until it is assigned. A variable that is initialized in its declaration may have an explicit type; if the type is omitted, it is inferred where possible. Normal type-coercion rules apply in assignments, as described in the Coercion and Casting section.
Non-ASCII Unicode characters are allowed in variable names as long as the character doesn't cause a parsing issue. For example, whitespace characters are not allowed in variable names.
Some examples of assigning variables:
var x: i32; // x is an uninitialized 32-bit signed integer
var y <- x; // this won't work, because x has no value
x <- 7;
var y <- x; // this time it works, because x is now 7
con a: f64 <- 99.8; // a is immutable
a <- 44.12; // this doesn't work, because con variables cannot be reassigned
The following lines are equivalent,
con a <- f64(7.2);
con a: f64 <- 7.2;
con a <- 7.2; // 7.2 is implicitly of type f64
con a <- 7.2D; // With an explicit type suffix
as are these.
var c: f32 <- 9;
var c <- f32(9);
var c: f32 <- f32(9);
var c <- 9F;
Variable assignments are expressions in GP1, which can enable some
very interesting code patterns. For example, it allows multiple
assignments on one line with the following syntax.
con a <- var b <- "death and taxes"
assigns the string "death and taxes" to both a and b, leaving you with one constant and one variable containing separate instances of identical data. This is equivalent to writing con a <- "death and taxes" and var b <- "death and taxes" each on their own line.
Assignment as an expression also eliminates much of the need to define
variables immediately before the control structure in which they're
used, which improves readability.
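A minimal sketch of that idea, built only from syntax shown elsewhere in this document. The helper answer is hypothetical, and the sketch assumes that a name bound in the matched expression is visible inside the arms, which this reference does not state explicitly.

```
fn answer(): i32 { 42 } // hypothetical helper, only here to produce a value

// The declaration doubles as the expression being matched, so no
// separate line is needed to introduce n before the match.
match con n <- answer() {
    k when k % 2 == 0 => println("${k} is even"),
    _ => println("${n} is odd"),
};
```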
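kind | types |
---|---|
unsigned integer | u8, u16, u32, u64, u128, u256, usize, byte |
signed integer | i8, i16, i32, i64, i128, i256, isize |
floating point | f16, f32, f64, f128, f256 |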
GP1 has signed integer, unsigned integer, and floating point numeric types. Numeric types take the form of a single-letter indicator followed by the type's size in bits. The indicators are i (signed integer), u (unsigned integer), and f (floating point). usize and isize are pointer-width types. For example, on a 64-bit system, usize is a 64-bit unsigned integer. However, it must be cast to u64 when assigning to a u64 variable. The type byte is an alias for u8.
Numeric operators are as one expects from C, with the addition of ** as a power operator.
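A brief sketch of the arithmetic operators. It assumes that integer division truncates as it does in C, which this reference implies but does not spell out.

```
con a: i32 <- 7;
con b: i32 <- 2;
assert((a + b) == 9);
assert((a - b) == 5);
assert((a * b) == 14);
assert((a / b) == 3);  // assumed: integer division truncates, as in C
assert((a % b) == 1);
assert((b ** 3) == 8); // ** raises b to the third power
```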
Numeric literals have an implicit type, or the type can be specified by a case-insensitive suffix. For example:
var i1 <- 1234; // implicitly i32
var f1 <- 1234.5; // implicitly f64
var i3 <- 1234L; // i64
var u3 <- 1234ui; // u32
var f2 <- 1234.6F; // f32
The complete set of suffixes is given below.
suffix | corresponding type |
---|---|
s | i16 |
i | i32 |
l | i64 |
p | isize |
b | byte |
us | u16 |
ui | u32 |
ul | u64 |
up | usize |
f | f32 |
d | f64 |
q | f128 |
bool is the standard boolean type with support for all the usual operations. The boolean literals are true and false. Boolean operators are as one expects from C, with the exception that NOT is !! instead of !.
Bitwise operators can be applied only to integers and booleans. They are single counterparts of the doubled boolean operators, e.g. boolean negation is !!, so bitwise negation is !.
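A short sketch contrasting the two operator families. Only !! and ! are named above; the use of &&, ||, &, and | here is an assumption that follows the doubled/single pattern and the C-like baseline.

```
con p <- true;
con q <- false;
assert(p && !!q);      // boolean AND combined with boolean NOT
assert(p || q);        // boolean OR
assert((!p) == false); // bitwise NOT applied to a boolean

con m: u8 <- 12;       // binary 00001100
con n: u8 <- 10;       // binary 00001010
assert((m & n) == 8);  // bitwise AND gives 00001000
assert((m | n) == 14); // bitwise OR gives 00001110
```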
char is a Unicode character of variable size. Char literals are single-quoted, e.g. 'c'. Any single valid char value can be used as a literal in this fashion.
string is a Unicode string. String literals are double-quoted, e.g. "Hello, World.".
GP1 supports typical array operations.
var tuples : (i32 i32)[]; // declare array of tuples
var strings : string[]; // declare array of strings
var array <- i32[n]; // declare and allocate array of n elements
// n is any number that can be coerced to usize
con nums <- {1, 2, 3}; // immutable array of i32
Use the length property to access the number of elements in an allocated array. Attempting to access length of an unallocated array raises an exception.
var colors <- {"Red", "White", "Blue"}; // allocate array
var count <- colors.length; // count is usize(3)
Arrays can be indexed with any integer type (signed or unsigned). Negative values wrap from the end (-1 is the last element). An exception occurs if the index is out of range in either direction, i.e. no modulo operation is performed.
var w <- {1, 2, 3, 4, 5, 6, 7};
w[0] // first element, 1
w[-1] // last element, 7
var x <- isize(-5);
w[x] // 5th to last element, 3
Tuples group multiple values into a single value with anonymous, ordered fields. () is an empty tuple. ("hello", i32(17)) is a tuple of type (string i32). Tuple fields are named like indices, i.e. (u128(4), "2").1 would be "2".
The unit type, represented as a 0-tuple, is written ().
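A small sketch of building a tuple and reading its fields, using only the syntax described above; the variable names are illustrative.

```
con pair <- ("hello", i32(17)); // type (string i32)
assert(pair.0 == "hello");      // fields are numbered from 0
assert(pair.1 == 17);

con unit <- ();                 // the empty tuple, i.e. the unit value
```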
regex is a regular expression. GP1 regex format is identical to that of .NET 5 and very similar to that of gawk.
Some examples of defining named functions:
fn sum(a: f32, b: f32): f32 { a + b } // takes parameters and returns an f32
fn twice_println(s: string) { // takes parameters and implicitly returns ()
    println("${s}\n${s}");
}
fn join_println(a: string, b: string): () { // takes parameters and explicitly returns ()
    println("${a} ${b}");
}
fn seven(): u32 { 7 } // takes no parameters and returns the u32 value of 7
There are a number of syntaxes allowed for calling a given function. This is because the caller is allowed to assign to zero or more of that function's parameters by name. Parameters assigned by name are freely ordered, while those assigned positionally bind, from left to right, to the first parameter in the function definition that is still unassigned. With regard to the join_println function defined above, this means that all of the following are valid and behave identically.
join_println(a <- "Hello,", b <- "World.");
join_println(b <- "World.", a <- "Hello,");
join_println(b <- "World.", "Hello,");
join_println("Hello,", "World.");
Function names may be overloaded. For example, join_println could be additionally defined as
fn join_println(a: string, b: string, sep: string) {
    println("${a}${sep}${b}");
}
and then both join_println("Hello,", "World.", " ") and join_println("Hello,", "World.") would be valid calls.
Functions may be defined and called within other functions. You may be familiar with this pattern from functional languages like F#, wherein a wrapper function is often used to guard an inner recursive function (GP1 permits both single and mutual recursion in functions). For example:
fn factorial(n: u256): u256 {
    fn aux(n: u256, accumulator: u256): u256 {
        match n > 1 {
            true => aux(n - 1, accumulator * n),
            _ => accumulator,
        }
    }
    aux(n, 1)
}
Arguments are passed by value by default. For information on the syntax used in this example, refer to Control Flow.
Closures behave as one would expect in GP1, exactly like they do in most other programming languages that feature them. Closures look like this:
var x: u32 <- 8;
var foo <- { y, z => x * y * z}; // foo is a closure; its type is fn<u32 u32 | u32>
assert(foo(3, 11) == (8 * 3 * 11)); // true
x <- 5;
assert(foo(3, 11) == (8 * 3 * 11)); // true
con bar <- { => x * x }; // bar is a closure of type `fn<u32>`
assert(bar() == 25); // true because closure references already-defined x
They are surrounded by curly braces. Within the curly braces goes an optional, comma-separated parameter list, followed by a required => symbol, followed by an optional expression. If no expression is included, the closure implicitly returns ().
The match expression uses the same => symbol because the when section of a match arm is an implicit closure. The => symbol in particular was chosen for closures for two reasons. One, arrows are conventional for expressing anonymous functions, and two, the space between the lines of an equals sign is enclosed by them.
Lambdas are nearly identical to closures, but they don't close over their environment, and they use the -> symbol in place of =>. A few examples of lambdas:
con x: u32 <- 4; // this line is totally irrelevant
con square <- { x -> x * x }; // this is not valid, because the type of the function is not known
con square <- { x: u32 -> x * x }; // this is fine, because the type is specified in the lambda
con square: fn<u32 | u32> <- { x -> x * x }; // also fine, because the type is specified in the declaration
Functions are first-class citizens in GP1, so you can assign them to variables, pass them as arguments, etc. However, using the function definition syntax is suboptimal when using function types. Instead, there is a separate syntax for function types. Given the function fn sum(a: f64, b: f64): f64 { a + b } the function type is expressed fn<f64 f64 | f64>, meaning a function that accepts two f64 values and returns an f64. Therefore,
fn sum(a: f64, b: f64): f64 { a + b }
con sum: fn<f64 f64 | f64> <- { a, b -> a + b };
con sum <- { a: f64, b: f64 -> a + b };
are all equivalent ways of binding a function of type fn<f64 f64 | f64> to the constant sum.
Here's an example of how to express a function type for a function
argument.
fn apply_op(a: i32, b: i32, op: fn<i32 i32 | i32>): i32 {
    op(a, b)
}
The above example provides an explicit type for the argument op. You could safely rewrite this as
fn apply_op(a: i32, b: i32, op: fn): i32 {
    op(a, b)
}
because the compiler can safely infer the function type of op. Type inference only works to figure out the function signature, so fn apply_op(a:i32, b:i32, op):i32 { . . . } is not allowed.
Refer to Variables and Constants for information on the syntax used in this section.
Numeric types are automatically coerced into other numeric types as long as that coercion is not lossy. For example,
var x: i32 <- 10;
var y: i64 <- x;
is perfectly legal (the 32-bit value fits nicely in the 64-bit variable). However, automatic coercion doesn't work if it would be lossy, so
var x: i64 <- 10;
var y: i32 <- x;
doesn't work. This holds for numeric literals as well.
Unsurprisingly, var x: i32 <- 3.14 wouldn't compile. The floating point value can't be automatically coerced to an integer type. So what does work? Casting via the target type's pseudo-constructor works.
con x: f64 <- 1234.5; // okay because the literal can represent any floating point type
con y: f64 <- f16(1234.5); // also okay, because any f16 can be losslessly coerced to an f64
con z: i32 <- i32(x); // also okay; uses the i32 pseudo-constructor to 'cast' x to a 32-bit integer
assert(z == 1234);
con a: f64 <- 4 * 10 ** 38; // this value is greater than the greatest f32
con b: f32 <- f32(a); // the value of b is the maximum value of f32
This approach is valid for all intrinsic types. For example, var flag: bool <- bool(0) sets flag to false and var txt: string <- string(83.2) sets txt to the string value "83.2". Such behavior can be implemented by a programmer on their own types via a system we'll discuss in the Interfaces section.
Every GP1 program has an entry-point function. Within that function, statements are executed from top to bottom and left to right. The entry-point function can be declared with the entry keyword in place of fn and returns an integer, which will be provided to the host operating system as an exit code. Naturally, this means that the handling of that code is platform-dependent once it passes the program boundary, so it's important to keep in mind that a system may implicitly downcast or otherwise modify it before it is made available to the user. If no exit code is specified, or if the return type of the function is not an integer, GP1 assumes an exit code of usize(0) and returns that to the operating system.
The following program prints Hello, World. and exits with an error code.
entry main(): usize {
    hello_world();
    1
}
fn hello_world() {
    println("Hello, World.");
}
The entry function may have any name; it's the entry keyword that makes it the entry point. The entry function may also be implicit. If one is not defined explicitly, the entire file is treated as being inside an entry function. Therefore,
println("Hello, World.");
is a valid and complete program identical to
entry main(): usize {
    println("Hello, World.");
}
This behavior can lend GP1 a very flexible feeling akin to many scripting languages.
In a program where there is an entry-point specified, only expressions made within that function will be evaluated. This means that the following program does NOT print anything to the console.
entry main(): usize {
    con x: usize <- 7;
}
println("This text will not be printed.");
In fact, this program is invalid. Whenever there is an explicit entry point, no statements may be made in the global scope.
At this time, GP1 has only one non-looping conditional control structure, in two variants: match and match all. The syntax is as follows, where expr and each arm_expr* are expressions and each pattern* is a pattern matching option (refer to Pattern Matching for more info).
match expr {
    pattern1 => arm_expr1,
    pattern2 => arm_expr2,
    _ => arm_expr3,
}
The match expression executes the first arm whose pattern matches the value of expr. The match all expression executes every arm whose pattern matches. Both flavors return their last executed expression.
The when keyword may be used in a given match arm to further restrict the conditions of execution, e.g.
con fs <- 43;
con is_even <- match fs {
    n when n % 2 == 0 => " is ",
    _ => " is not ",
};
print(string(fs) + is_even + "even.");
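To contrast the two variants described above, here is a hedged sketch that runs the same arms under match and under match all. It assumes that the wildcard arm counts as a matching arm in match all, which this reference does not state explicitly.

```
con n <- 12;

// match stops at the first arm that matches, so only "even" is printed.
match n {
    k when k % 2 == 0 => println("even"),
    k when k % 3 == 0 => println("divisible by 3"),
    _ => println("checked"),
};

// match all runs every arm that matches, in order, so all three
// lines are printed (assuming the wildcard arm also runs).
match all n {
    k when k % 2 == 0 => println("even"),
    k when k % 3 == 0 => println("divisible by 3"),
    _ => println("checked"),
};
```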
Several looping structures are supported in GP1: loop, for, while, and do/while, along with continue and break to help control program flow. All of these are statements.
loop { . . . } // an unconditional loop -- runs forever or until broken
for i in some_iterable { . . . } // loop over anything that is iterable
while some_bool { . . . } // classic conditional loop that executes until the predicate is false
do { . . . } while some_bool // traditional do/while loop that ensures body executes at least once
Pattern matching behaves essentially as it does in SML, with support for various sorts of destructuring. It works in normal assignment and in match arms. It will eventually work in function parameter assignment, but perhaps not at first.
For now, some examples.
con a <- ("hello", "world"); // a is a tuple of strings
(b, c) <- a;
assert(b == "hello" && c == "world");
fn u32_list_to_string(l: List<u32>): string { // this assumes that square brackets are used for linked lists
    fn elements(l: List<u32>): string { // inner helper, so the brackets are added only once
        match l { // the bit before the arrow in each arm is a pattern
            [] => "", // [] matches any empty list
            [e] => string(e), // [e] matches any single-element list
            h::t => string(h) + ", " + elements(t), // h::t matches the head and tail of the list to h and t, respectively
        }
    }
    "[" + elements(l) + "]"
}
Interfaces are in Version 2 on the roadmap.
Enums are pretty powerful in GP1. They can be the typical enumerated type you'd expect, like
enum Coin { penny, nickle, dime, quarter } // 'vanilla' enum
var a <- Coin.nickle;
assert(a == Coin.nickle);
Or an enum can have an implicit field named value, like
enum Coin: u16 { penny(1), nickle(5), dime(10), quarter(25) }
var a <- Coin.nickle;
assert(a == Coin.nickle);
assert(a.value == 5);
Or an enum can be complex with a user-defined set of fields, like
enum CarModel(make: string, mass: f32, wheelbase: f32) { // enum with multiple fields
    gt ( "ford", 1581, 2.71018 ),
    c8_corvette ( "chevy", 1527, 2.72288 )
}
A field can also have a function type. For example,
enum CarModel(make: string, mass: f32, wheelbase: f32, gasUsage: fn<f32 | f32>) {
    gt ( "ford", 1581, 2.71018, { miles_traveled -> miles_traveled / 14 } ),
    c8_corvette ( "chevy", 1527, 2.72288, { miles_traveled -> miles_traveled / 19 } )
}
var my_car <- CarModel.c8_corvette;
var gas_used <- my_car.gasUsage(200); // estimate how much gas I'd use on a 200 mile trip
Equivalence of enums is not influenced by case values, e.g.
enum OneOrAnother: u16 { one(0), another(0) }
con a <- OneOrAnother.one;
con b <- OneOrAnother.another;
assert(a != b);
assert(a.value == b.value);
It's important to remember that enums are 100% always totally in every conceivable fashion immutable. To make this easier to enforce, only value types are allowed for enum fields.
Records are types with named fields, defined with the record keyword. Fields are defined in the record block and behavior is defined in the optional impl block.
For example,
record Something {
    label: i32 // field label followed by some type
} impl { . . . } // associated functions. This is different from having functions in the fields section because impl functions are not assignable.
If the record implements some interface, SomeInterface, the impl would be replaced with impl SomeInterface, and the functions of SomeInterface would be defined alongside any other functions of the Something record.
Unions are the classic discriminated sum type.
union BinaryTree {
    Empty,
    Leaf: i32,
    Node: (BinaryTree BinaryTree),
}
Refer to Generics for info on the syntax used in this section.
Type aliasing is provided with the type keyword, e.g.
type TokenStream Sequence<Token>
type Ast Tree<AbstractNode>
fn parse(ts: TokenStream): Ast { . . . }
Notice how much cleaner the function definition looks with the aliased types. This keyword is useful mainly for readability and domain modeling.
Generics are in Version 2 on the official GP1 roadmap. They roughly use C++ template syntax or Rust generic syntax.
GP1 has three operators involved in handling references: #, &, and @. These are immutable reference, mutable reference, and dereference, respectively.
Some examples of referencing/dereferencing values:
var a <- "core dumped";
var b <- &a; // b is a mutable reference to a
assert(a == @b);
assert(a != b);
@b <- "missing ; at line 69, column 420";
assert(a == "missing ; at line 69, column 420");
b <- &"missing ; at line 420, column 69";
assert(a != "missing ; at line 420, column 69");
var c <- #b; // c is an immutable reference to b
assert(@c == b);
assert(@@c == a);
@c <- &"kablooey"; // this does not work. `c` is an immutable reference and cannot be used to assign its referent.
Naturally, only var values can be mutated through references.
The reference operators may be prepended to any type, T, to describe the type of a reference to a value of type T, e.g.
fn set_through(ref: &string) { // this function takes a mutable reference to a string and returns `()`
    @ref <- "goodbye";
}
var a <- "hello";
set_through(&a);
assert(a == "goodbye");