Compiler/blang.cfg at main · SteveBeeblebrox/Compiler · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
# Blaster's language is line oriented, one statement per line, although
# empty lines are permitted.  The language is case insensitive, terminals
# may be separated by whitespace or commas.

# "Immediate" notation in blang is used for more than just immediate data, and
# its syntax is designed to be easily verified on sight and straightforward to
# emit from code.  The rest of this file is easier to grok if you know this
# notation up front.
#
# Basically, immediates begin with an octothorpe (#,otrp) have an optional
# number base specifier, n value field, and an optional multiplier field.
#   # (x|o) (digits)+ ((/|)(w|f|i|b)|)
#      | + octal       word + | | +- simple byte count (the default)
#      +-- hex        float --+ |
#                               +--- size of an instruction
#
# (yes, there are architectures where w and i are not the same value)
#
# If no base is specified, digits are considered in base 10. The multipliers
# hold the byte valued sizes of a machine word (w), floating point value (f),
# or instruction (i).  b is a multiplier of 1.  The optional forward slash
# is needed (but not required, so beware!) when specifying values in hex
# with the f multiplier:
#   #xfff  is the value  2^(13)-1
#   #xff/f is the value (2^(9) -1)*(size of a float)
#
# Some more examples:
#   #10    The number ten
#   #o10   The number eight (octal)
#   #x10i  The size of 16 instructions (x10=16)
#   #-3i   The only way to specify a negative value is in decimal
#
# The base and value field may be omitted, in which case the implicit value is
# one, which can be handy:
#   #w => size of a machine word   #f => size of a float
#   #i => size of an instruction

###
# We want these at the top because the rules are scanned only when they are expected,
# and we want to gobble up the largest possible token (otherwise, if after  sz  we get
# sz tokens detected with such things as  label @3w f
###

###
# sz in hexmode can't detect b, f, the slash will turn this off and go to SZ mode
# to detect either [wfib] (in INITIAL state) or szsep
###

ASM       -> Asms $
Asms      -> Asms Asm
           | Asm

# Nine types of operations
# There are data segment instructions, a labeling instructions, register
# operations, PC register operations, two conversion routines for int<->float,
# unary and binary arithmetic operators, and of course emit.

# There are three labeling instructions: label, init, and function.  None
# consume space in the program's memory map.

# A label is used to mark a data address for more human readable output by
# the emulator, the Identifier is in alphabet encoding.

# init  sets the base address of executable code (where execution begins).

# function labels the next instruction as the first operation of a function's
# prologue.

# The emit instruction is for the zlang keyword, it requires an address of the
# string and two general purpose regs for the inital character offset and
# length of output.
# The rand instruction if for the zlang rand keyword, there are three forms
#   rand GReg   rand GReg GReg GReg   and   rand FReg
Asm       -> DATA eol
           | label DataAddr labelname eol
           | init DataAddr eol
		   | function funsig eol
           | REGOP
           | PCOP
           | CONV
           | UNARY
           | BINARY
           | emit Addr GReg GReg eol
           | emit GReg eol
           | emit FReg eol
           | rand_ GReg eol
           | rand_ GReg GReg GReg eol
           | rand_ FReg eol
           | eol

# Use DATA statements to setup global variables in what is often referred
# to as the .data segment. Examples:
#   DATA 0000 #0.314159e1
#   DATA #1w  #-32
#   DATA 8    x41lphabetx45ncoding
# Immediate values used with data statements have no range limits
DATA      -> data_ DataAddr Data
Data      -> alphenc
           | ImmVal
           | NegImmVal
           | ImmFloat

# Standard register manipulation ops as one would expect in a machine level
# instruction set.  Two letter abbreviations and whole words permitted for the
# operation.  Examples
#   MV  R0,R10
#   load  F4,@4123
#   store R3,@fp[#o10w]   // data flows left->right
# to right)
REGOP     -> mv  IReg IReg eol
           | ld  IReg Addr eol
           | ld  IReg ImmVal eol
           | ld  IReg NegImmVal eol
           | st  IReg Addr eol
           | mv  FReg FReg eol
           | ld  FReg Addr eol
           | ld  FReg ImmFloat eol
           | st  FReg Addr eol
           | push IReg eol
           | pop  IReg eol
           | push FReg eol
           | pop  FReg eol

# Integer registers are general purpose regs labeled  R0 to R<Nr>,
# as well as the special purpose regs (program counter, stack pointer, return
# address, frame pointer).
IReg      -> GReg
           | pc
           | sp
           | ra
           | fp

# Values may be expressed in decimal, octal, or hexadecimal
# ===============================
# octal |hexadecimal  |multiplier
# ======+=============+==========
# oh=o  |ecks=x       |sz=w|f|i|b
# -------------------------------
# without an octal or hexadecimal spec, base 10 is assumed, with just a sz
# multiplier the implicit value is 1
NonNegValue -> oh   octval Slashsz
             | ecks hexval Slashsz
             |      posdec Slashsz
             | sz
# negative values have a minus sign and must be expressed in base 10
NegValue    ->      negdec Slashsz
Slashsz   -> slash sz
           | sz
           | lambda

# at=@
# Addresses may be expressed in three ways:
#  - @PosValue, or
#  - @PosOrNegValue(IReg) for register-offset form, or
#  - @IReg use the address stored in a general purpose or specialized register
# Address used for  data  or  label  instructions cannot use register oriented
# forms, since these are evaluated for memory map layout, not during execution.
# %lex% - at terminals switch to a
DataAddr  -> at NonNegValue
Addr      -> DataAddr
           | at NonNegValue oparen IReg cparen
           | at    NegValue oparen IReg cparen
           | at IReg

# otrp=#
# "Immediate" value notation
ImmVal    -> otrp NonNegValue
NegImmVal -> otrp NegValue
# floatvals must have at least one digit and either a decimal point or an [eE]
# in them to be recognized as a floating point literal
ImmFloat  -> otrp floatval

# Program counter (PC) standard manipulations.  CALL stores the PC into the
# return address register (R), and RETURN puts the RA value back into PC.
# IFZ  jumps to Addr if IReg is zero (Ireg is an integer register).
# IFNZ jumps to Addr if IReg is non zero.
PCOP      -> jump Addr eol
           | call Addr eol
           | return_ eol
           | ifz  IReg Addr eol
           | ifnz IReg Addr eol

# The only two instructions that permit integer and floating point registers
# to interact converts from one form to the other.  float -> integer simply
# truncates, it does not round.
CONV      -> int_ GReg FReg eol
           | float_ FReg GReg eol

GReg   -> greg
FReg   -> freg
# Most arithmetic operations (and CONVs) are limited to the general purpose R
# integer registers.  Addition and substraction can be performed on the
# specialized registers as well (PC, FP, ...).

# There are floating point registers F0, F1, ... F<Nf>.  These _might_ share
# bits with their like named integer register (or two, depending on type sizes of
# the architecture.  If bits are shared, then writing two `R0` would likely corrupt
# `F0` --- see a project specific write-up for details on a project by project case.

# There are three number domains:  bitwise, integer, and floating point.
# R registers must be used for bitwise and integer operations,

# Arithmetic operations that apply to more than one (such ADD) use the
# register arguments to dictate the actual operation performed:
#   ADD R13,R4,R15 => Add integers R4 and R15, result into R13
#   ADD F3,F4,F5   => Add floating points F4 and F5, result into F3
#
# All arithmetic binary operations (including bitwise) permit a second immediate
# operand.  So you can write
#   ADD  R7 R7 #1
# instead of
#   LOAD R0 #1
#   ADD  R7 R7 R0

# You can use either the mnemonic term or a symbol term for an opertion.  The
# destination register Rd may be the same as either or both argument registers.

# Unary ops
#  mnemonic symbol(s) example(s)     Domain
#  CHS      -         - R4,R4        integer & float
#  ABS                abs f3 f8      integer & float
#  SIGN               sign r3 r2     integer & float; r3=-1 if R2 < 0, r3=+1 otherwise
#  COMPL    ~         compl Rd,Rx    integer (bitwise)
#  NOT      !         ! Rd,Rx        integer; zero=>one  non-zero=>zero
#  INV                inv Fd,Fx      1/x, float only
# Binary ops
#  ADD      +         + Rd,Rx,Ry     integer & float
#  SUB      -         - Rd,Rx,Ry     integer & float
#  MUL      *         * Rd Rx Ry     integer & float
#  DIV      /         / Fd,Fx,#1.23  integer (truncated fraction), float
#  REM      %         % Rd,Rx,Ry     integer or float domains
#  AND      &         & Rd,Rx Ry     integer (bitwise)
#  OR       |         | Rd,Rx,Ry     integer (bitwise)
#  XOR                xor Rd Rx Ry   integer (bitwise)
#  NAND               nand Rd,Rx,Ry  integer (bitwise)
# Relational ops; resultant register Rd always a GReg, has either a 0 (false)
# or 1 (true) stored in it as result.
#  LT       <         < Rd,Rx,Ry     integer & float
#  LTE      <=        <= Rd,Fx,Fy    integer & float
#  GT       >         > Rd,Rx,Ry     integer & float
#  GTE      >=        >= Rd,Fx,#1.1  integer & float
#  EQ       ==        == Rd,Rx,#5    integer & float
#  NEQ      !=        neq Rd,Fx,Fy   integer & float

UNARY  -> chs GReg GReg eol
        | chs FReg FReg eol
        | abs_ GReg GReg eol
        | abs_ FReg FReg eol
        | sign GReg GReg eol
        | sign FReg FReg eol
        | compl_ GReg GReg eol
        | not_ GReg GReg eol
        | inv FReg FReg eol

INTFLOATBOOLOPS -> lt | lte | gt | gte | eq | neq
INTFLOATBINOPS -> add | sub | mul | div_ | rem_
BWIBINOPS -> and_ | or_ | xor_ | nand_

BINARY -> INTFLOATBINOPS GReg GReg GReg eol
        | INTFLOATBINOPS GReg GReg ImmVal eol
        | INTFLOATBINOPS FReg FReg FReg eol
        | INTFLOATBINOPS FReg FReg ImmFloat eol
		| INTFLOATBOOLOPS GReg GReg GReg eol
		| INTFLOATBOOLOPS GReg GReg ImmVal eol
		| INTFLOATBOOLOPS GReg FReg FReg eol
		| INTFLOATBOOLOPS GReg FReg ImmFloat eol
        | BWIBINOPS GReg GReg GReg eol
        | BWIBINOPS GReg GReg ImmVal eol