I'll be doing today's puzzles in x64 assembler, specifically NASM
running on Linux. Assembler is very simple, to the point that doing
anything with it is kind of complex; it's like C on steroids, in that
regard. x64 has a large number of registers, many of which
overlap. For example, rax
is 64 bits, eax
is the first 32 bits of
rax
, ax
is the first 16 bits of rax
, and al
is the first 8
bits of rax
. The names are not very consistent.
Assembler programmers tend to use macros a lot to make the language bearable, but I won't be bothering with any of those today. We could also push and pop function arguments and results to a stack like a higher-level language would do under the covers, but I'll just be using registers and static variables.
For today's first problem, we need to count the unique spots on a grid
that a guard will visit. We receive the grid as a series of text lines
making a character grid, where the guard's initial position is marked
^
, free spaces are marked .
, and obstructions are marked #
. The
guard moves forward if they can; if they can't, because there's an
obstruction, they turn 90 degrees right.
We can start by defining space to store the grid, and a procedure for
visiting it. Initialised data goes in the .data
section,
uninitialised data goes in the .bss
section, and code goes in the
.text
section. We make use of all three. We also call the read
syscall, which is defined as ssize_t read(int fd, void buf[.count], size_t count);
. On x64 Linux, the first 3 arguments to a syscall are
passed in rsi
, rdi
, and rdx
in that order, and the syscall
number is placed in rax
. We'll write a read_grid
procedure that
populates grid
.
One trick we use throughout is using xor reg, reg
to zero out
reg
. This is the normal way in x64 assembler, as it's both shorter
(in machine code) and faster than mov reg, 0
.
section .data
sys_read equ 0
sys_write equ 1
sys_open equ 2
sys_close equ 3
sys_exit equ 60
height equ 130
width equ 131 ; the last character is the newline
infile db "2024/input/day-06.txt", 0
section .bss
grid resb height * width
section .text
read_grid:
mov rdi, infile
xor rsi, rsi ; 0 for read-only
mov rax, sys_open
syscall
mov rdi, rax ; transfer the input file descriptor to rdi
mov rsi, grid ; use grid as the output buffer
mov rdx, height * width ; number of bytes to read
mov rax, sys_read
syscall
ret
Once the grid is populated, we can just scan through it to find the
starting position. We'll reserve 2 bytes to store the x- and
y-coordinates, and name them g_x
and g_y
(g for guard), and then
another two to track our current direction, g_vx
and g_vy
. We'll
do the work in a procedure called init_position
, which will step
through each cell in search of a ^
, at which point it will
initialise our four g_*
variables (g_vx
and g_vy
could be
handled anywhere, but handling them here would make it very easy to
modify it to support a starting position of >
indicating a starting
direction of right and so on, which are mentioned in the instructions
but not required). init_position
will return 0
in rax
if it
finds the position, and 1
if it doesn't:
init_position:
xor rdx, rdx ; Initialise y-coordinate to 0
dec rdx ; Start at -1 because it's incremented at the top of the loop
.next_row:
inc rdx
cmp rdx, height
jge .not_found
xor rax, rax ; set x-coordinate to 0 at the start of each line
.next_char:
cmp rax, width - 1
jge .next_row
mov rcx, rdx ; start with the y-offset
imul rcx, width ; scale the y-offset by the number of rows
add rcx, rax ; add the x-offset into the given row
mov bl, [grid+rcx] ; load al with the `rax`th byte of `grid`
cmp bl, '^'
je .found
inc rax ; move to next character
jmp .next_char
.found:
;; Initialise position to located spot with direction facing up
mov byte [g_vx], 0
mov byte [g_vy], -1
mov [g_x], al
mov [g_y], dl
xor rax, rax
ret
.not_found:
mov rax, 1
ret
Now that we are loading the grid and finding the starting position, there are two more steps to go: counting the visited spaces and printing the result. We'll start with printing the result, because it's a little more straightforward.
We'll define a procedure called print_int
that takes an integer in
rax
and prints it with a newline to stdout
. We'll need to define
new constants for syscall numbers as well as a buffer to store the
string representation. rcx
will hold the length of the string, and
we'll build it in reverse.
section .data
stdout equ 1
lf equ 0xa ; ASCII linefeed
section .bss
buffer resb 16
section .text
print_int:
mov rdi, buffer
xor rcx, rcx ; initialise character count to 0
mov rbx, 10 ; the base to divide by
dec rdi ; allocate space for trailing newline
inc rcx ; increment character count
mov byte [rdi], lf ; store trailing newline
.loop: xor rdx, rdx
div rbx ; divide rax by 10, leaving quotient in rax and remainder in rdx
add dl, '0' ; convert the remainder to ASCII
dec rdi ; move buffer pointer backwards
mov [rdi], dl ; store character in the buffer
inc rcx ; increment character count
test rax, rax ; check if quotient is 0
jnz .loop ; if not, continue the loop
mov rdx, rcx ; leave string length in rdx for sys_write
mov rsi, rdi ; starting address of string
mov rdi, stdout ; the file descriptor to write to
mov rax, sys_write
syscall
ret
Next, we'll handle tracking visited states. The grids aren't very
large, so it's feasible to just use a 2D array with enough elements
for every space on the grid, indexed like y * width + x
. Because
visited-or-not is a simple boolean, we only really need 1 bit to keep
track of each cell, so we can dramatically cut down on space by making
it a bit array. This just means that once we calculate the index, we
divide it by 8, where the quotient is the index in bytes into the
array (because we can't address individual bits) and the remainder is
the bit position in that byte. We can get the quotient of a division
by 8 (binary 1000) by shifting the bits in r9 right by 3, and we can
get the remainder by bitwise-anding the number with 7 (binary 111):
section .bss
;; +7 to make sure that, rounding down, we'll have enough bytes
visited resb (height * width + 7) / 8
section .text
visit:
mov r9, rdx
imul r9, width
add r9, rax
mov rcx, r9
shr r9, 3 ; Divide index by 8 to get the byte offset
and cl, 7 ; Get the bit position within the byte
mov r8b, 1
shl r8b, cl ; Shift 1 by the bit position
or [visited+r9], r8b ; Set the bit
ret
At the end we'll also need to count the number of bits in this array,
which isn't too complicated. We use rcx
to keep track of a loop that
counts down through all of the bytes in visited
, using the movzx
instruction to load the single bytes into the 64-bit register r8
,
extending zeroes out so as not to change the value. Then we use
popcnt
to count the number of set (populated) bits in r8
, store
the result in r9
, and then add it to our running total in rax
. We
use jz
(jump if zero) to detect when the loop is over and return,
otherwise we unconditionally jump to the top of the loop and continue
with the next byte:
count_visited:
xor rax, rax
mov rcx, (height * width + 7) / 8
.loop: movzx r8, byte [visited+rcx-1]
popcnt r9, r8
add rax, r9
dec rcx
jz .done
jmp .loop
.done: ret
Now we can move onto walk_guard
. It runs in a loop that begins by
loading rax
and rdx
with the guard's x- and y-coordinates, calling
visit
for that space, calculating the next square the guard will
walk onto, and looking that space up in the grid. If it's an
obstruction, we turn, otherwise we save our updated g_x
and g_y
and go back to the start of the loop:
walk_guard:
.loop:
movzx rax, byte [g_x]
movzx rdx, byte [g_y]
call visit
add al, [g_vx]
cmp al, width - 2
ja .done
add dl, [g_vy]
cmp dl, height - 1
ja .done
mov rcx, rdx
imul rcx, width
add rcx, rax
mov bl, [grid+rcx]
cmp bl, '#'
je .turn
mov [g_x], al
mov [g_y], dl
jmp .loop
.turn: call turn_guard
jmp .loop
.done:
ret
Turning is implemented in another procedure, turn_guard
, which just
uses several jumps to determine which direction we're facing and how
to turn clockwise. One of g_vx
and g_vy
will always be 0, which we
can use to break up the branches a little more succinctly:
turn_guard:
cmp byte [g_vx], 0
je .from_vertical
cmp byte [g_vx], -1 ; facing left
je .left_to_up
mov byte [g_vx], 0 ; we were facing right, so face down
mov byte [g_vy], 1
ret
.left_to_up:
mov byte [g_vx], 0 ; we were facing left, so face up
mov byte [g_vy], -1
ret
.from_vertical:
cmp byte [g_vy], -1 ; facing up
je .up_to_right
mov byte [g_vx], -1 ; we were facing down, so face left
mov byte [g_vy], 0
ret
.up_to_right:
mov byte [g_vx], 1 ; we were facing up, so face right
mov byte [g_vy], 0
ret
Finally, we can put it all together in a special procedure called
_start
, where program execution begins (in a C program, _start
's
job would be to call main
and handle its return value). _start
is
honestly simple enough that I'll just show you the completed file
instead:
section .data
sys_read equ 0
sys_write equ 1
sys_open equ 2
sys_close equ 3
sys_exit equ 60
stdout equ 1
lf equ 0xa
height equ 130
width equ 131 ; the last character is the newline
infile db "2024/input/day-06.txt", 0
section .bss
grid resb height * width
visited resb (height * width + 7) / 8
g_x resb 1
g_y resb 1
g_vx resb 1
g_vy resb 1
buffer resb 16
section .text
global _start
_start:
call read_grid
call init_position
call walk_guard
call count_visited
call print_int
mov rdi, 0
mov rax, sys_exit
syscall
read_grid:
mov rdi, infile
xor rsi, rsi ; 0 for read-only
mov rax, sys_open
syscall
mov rdi, rax ; transfer the input file descriptor to rdi
mov rsi, grid ; use grid as the output buffer
mov rdx, height * width ; number of bytes to read
mov rax, sys_read
syscall
ret
init_position:
xor rdx, rdx ; Initialise y-coordinate to 0
dec rdx ; Start at -1 because it's incremented at the top of the loop
.next_row:
inc rdx
cmp rdx, height
jge .not_found
xor rax, rax ; set x-coordinate to 0 at the start of each line
.next_char:
cmp rax, width - 1
jge .next_row
mov rcx, rdx ; start with the y-offset
imul rcx, width ; scale the y-offset by the number of rows
add rcx, rax ; add the x-offset into the given row
mov bl, [grid+rcx] ; load al with the `rax`th byte of `grid`
cmp bl, '^'
je .found
inc rax ; move to next character
jmp .next_char
.found:
;; Initialise position to located spot with direction facing up
mov byte [g_vx], 0
mov byte [g_vy], -1
mov [g_x], al
mov [g_y], dl
xor rax, rax
ret
.not_found:
mov rax, 1
ret
print_int:
mov rdi, buffer
xor rcx, rcx ; initialise character count to 0
mov rbx, 10 ; the base to divide by
dec rdi ; allocate space for trailing newline
inc rcx ; increment character count
mov byte [rdi], lf ; store trailing newline
.loop: xor rdx, rdx
div rbx ; divide rax by 10, leaving quotient in rax and remainder in rdx
add dl, '0' ; convert the remainder to ASCII
dec rdi ; move buffer pointer backwards
mov [rdi], dl ; store character in the buffer
inc rcx ; increment character count
test rax, rax ; check if quotient is 0
jnz .loop ; if not, continue the loop
mov rdx, rcx ; leave string length in rdx for sys_write
mov rsi, rdi ; starting address of string
mov rdi, stdout ; the file descriptor to write to
mov rax, sys_write
syscall
ret
walk_guard:
.loop:
movzx rax, byte [g_x]
movzx rdx, byte [g_y]
call visit
add al, [g_vx]
cmp al, width - 2
ja .done
add dl, [g_vy]
cmp dl, height - 1
ja .done
mov rcx, rdx
imul rcx, width
add rcx, rax
mov bl, [grid+rcx]
cmp bl, '#'
je .turn
mov [g_x], al
mov [g_y], dl
jmp .loop
.turn: call turn_guard
jmp .loop
.done:
ret
turn_guard:
cmp byte [g_vx], 0
je .from_vertical
cmp byte [g_vx], -1 ; facing left
je .left_to_up
mov byte [g_vx], 0 ; we were facing right, so face down
mov byte [g_vy], 1
ret
.left_to_up:
mov byte [g_vx], 0 ; we were facing left, so face up
mov byte [g_vy], -1
ret
.from_vertical:
cmp byte [g_vy], -1 ; facing up
je .up_to_right
mov byte [g_vx], -1 ; we were facing down, so face left
mov byte [g_vy], 0
ret
.up_to_right:
mov byte [g_vx], 1 ; we were facing up, so face right
mov byte [g_vy], 0
ret
visit:
mov r9, rdx
imul r9, width
add r9, rax
mov rcx, r9
shr r9, 3 ; Divide index by 8 to get the byte offset
and cl, 7 ; Get the bit position within the byte
mov r8b, 1
shl r8b, cl ; Shift 1 by the bit position
or [visited+r9], r8b ; Set the bit
ret
count_visited:
xor rax, rax
mov rcx, (height * width + 7) / 8
.loop: movzx r8, byte [visited+rcx-1]
popcnt r9, r8
add rax, r9
dec rcx
jz .done
jmp .loop
.done: ret
I haven't actually done part 2 yet, sorry! I got very busy for a few days and fell quite behind, so rather than spending too much time debugging my solution to this puzzle, I decided to move on for now. Don't worry, I'll come back to it soon.