Little Macros

Macros I made that aren't large enough for one article each

Published on 2022-03-22

Lately, I have been writing a ton of Racket macros to get more familiar with the practices required. I know I have written about them in the past, but I really do have a blast making them. Here's a collection of Racket macros I have made between now and the last post I made.

Hash Accessor Made Easy

Something I deal with pretty frequently is the use of hash type objects, either mutable or immutable. There is a certain kind of... code smell that comes with them. I have written about my choice in using Racket to parse JSON before, so this one may be familiar for any readers out there.

> (define my-hash (make-immutable-hash '((1 . 2) (3 . 4))))
> (hash-ref my-hash 1)
2
> (hash-ref my-hash 3)
4
> (hash-ref my-hash 10)
hash-ref: no value found for key
  key: 10

Overall a pretty easy interface to work with. hash-ref does accept an optional third argument to use as a failure result, but building that behavior by hand makes for good macro practice. To do this we need to use hash-has-key? to check for a key's existence.

> (hash-has-key? my-hash 1)
#t
> (hash-has-key? my-hash 10)
#f

Instead of a nasty error we get a safe Boolean value we can use for conditional statements. In my days of old when I was still parsing Trello files, a default value when trying to do a hash reference was vital when going through deeply nested JSON data.

(define (get-key-or H K E)
  (if (hash-has-key? H K)
      (hash-ref H K)
      E))

> (get-key-or my-hash 3 "oops")
4
> (get-key-or my-hash 10 "oops")
"oops"

This works and serves as a decent starting point, but sometimes we don't want to provide a default value and would rather go back to hash-ref raising an error. That is handy for certain types of programs. In that case, maybe we can convert the default to an optional keyword argument.

(define (get-key-or H K #:default [default 0])
  (cond
   [(hash-has-key? H K) (hash-ref H K)]
   [else default]))

This works fine, but it never lets the error bubble up unless we compare the default against some constant, which raises the question: which value should signal that we want the error?

(define (get-key-or H K #:default [default 0])
  (cond
    [(hash-has-key? H K) (hash-ref H K)]
    [else (if (eqv? default 0)
              (error "no key")
              default)]))

This only opens up a can of worms: a caller who genuinely wants 0 as the default can no longer get it. Unless we re-write the code to somehow capture the error instead, pre-defining a sentinel value like this is not ideal.

Instead, what we should consider is the notion that leaving out the default argument results in a different code path. If a default value was passed, provide some safe wrapping; if it wasn't, call hash-ref and let it raise its error. For this we use a macro to provide two different code paths based on the number of arguments.

(define-syntax get-key-or
 (syntax-rules ()
  [(get-key-or H K) (hash-ref H K)]
  [(get-key-or H K E)
    (cond
     [(hash-has-key? H K) (hash-ref H K)]
     [else E])]))

Now if you use the macro without providing a default value, you will receive the hash-ref error when a key is not found. If you provide a default value, the macro will return said value when the key is not found. This type of macro switches between an error-raising code path and an error-free code path, making it convenient for all cases.
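Here's a quick sketch of both paths in action. The card hash is made-up data, loosely shaped like the nested JSON I mentioned earlier, and the macro is repeated only so the snippet runs on its own.

```racket
#lang racket

;; Same macro as above, repeated so this snippet is self-contained.
(define-syntax get-key-or
  (syntax-rules ()
    [(get-key-or H K) (hash-ref H K)]
    [(get-key-or H K E)
     (cond
       [(hash-has-key? H K) (hash-ref H K)]
       [else E])]))

;; Hypothetical nested data, not taken from a real Trello export.
(define card (hash 'name "Write post" 'badges (hash 'votes 3)))

;; Three-argument form: missing keys fall back to the default.
(define votes (get-key-or (get-key-or card 'badges (hash)) 'votes 0))
(define due (get-key-or card 'due "none"))

;; Two-argument form: a missing key still raises hash-ref's error,
;; which we catch here only to show that it happened.
(define raised?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (get-key-or card 'due)
    #f))
```

One thing to keep in mind: because this is a macro, the H and K expressions are pasted into the expansion twice in the three-argument form, so they should be cheap and free of side effects.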

Substring

I hate doing substrings in Racket. It's not fun. The boilerplate you have to write surrounding substrings of any kind in Racket is just plain frustrating.

> (substring "Hello World!" 0 5)
"Hello"
> (substring "" 3)
substring: starting index is out of range
  starting index: 3
  valid range: [0, 0]
  string: ""

substring on its own will throw an error for an out-of-bounds value, meaning you cannot index above or below the string range. This is fairly annoying because you are then required to wrap your code in a check that your index is within the string's range. Compare that with slicing in Python and Ruby:

# python
>>> "Hello world!"[:5]
'Hello'

# ruby
irb(main):001:0> "Hello world!"[..5]
=> "Hello "

Primitive, but simple, and both quietly clamp an index that runs past the end of the string. In my head there's very little reason you would want substring to fail. It's one of the core string cutting methods that doesn't involve rolling back into a list by yourself.

The sane way of working around this is by creating a safety check, which involves computing the length of a given string first, then comparing the requested index against that length.

(define (substr stringy n-chars)
  (let ([len (string-length stringy)])
    (if (< len n-chars)
        stringy
        (substring stringy 0 n-chars))))

Now it's a simple check of whether we are over-substringing our string, so we can avoid a catastrophic substring failure. Unfortunately there are a number of ways we could do a substring: it could take either one index or two. This doesn't feel fitting for a normal function, as we'll see here.

; two index substring - now we can't one-index substr
(define (substr stringy left-pos right-pos)
  (let ([len (string-length stringy)])
    (if (< left-pos len)
        (if (< right-pos len)
            (substring stringy left-pos right-pos)
            (substring stringy left-pos len))
        "")))

While we now have the precision of being able to substring between two indices properly, we lose out on the cool functionality of having a one-position substring. It wouldn't make sense to use keyword arguments either, because then it simply becomes too much typing each time you need to do #:left-pos x #:right-pos y.

A macro makes more sense here as it can expand to two different definitions from one binding.

(define-syntax substr
  (syntax-rules ()
    [(substr S upto)
      (let ([len (string-length S)])
        (if (< len upto) S (substring S 0 upto)))]
    [(substr S L R)
      (let ([len (string-length S)])
        (if (< L len)
            (if (< R len)
                (substring S L R)
                (substring S L len))
            ""))]))

And now you get the best of both worlds.

> (substr "Hello" 2)
"He"
> (substr "Hello there" 3 99)
"lo there"
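
One trade-off worth knowing: as a macro, substr is not a value, so it can't be handed to map or stored in a list. If that ever matters, the same clamping rules can be sketched as a function with case-lambda. The name substr-fn is made up here so it doesn't clash with the macro.

```racket
#lang racket

;; Function-based sketch of the same clamping rules as the substr macro.
(define substr-fn
  (case-lambda
    ;; One index: keep at most the first `upto` characters.
    [(s upto)
     (substring s 0 (min upto (string-length s)))]
    ;; Two indices: clamp the right edge to the string length, and
    ;; return "" when the left edge is already past the end.
    [(s left right)
     (let ([len (string-length s)])
       (if (< left len)
           (substring s left (min right len))
           ""))]))

;; Being a plain procedure, it works with higher-order functions.
(define trimmed (map (lambda (s) (substr-fn s 3)) '("Hello" "Hi")))
```

The macro still wins on not needing a second name, so treat this as an alternative rather than a replacement.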

Timing

Sometimes you want to measure how long some block of code takes to evaluate. It's pretty simple to calculate by subtracting one sample time from another.

(define (how-long-to-sleep)
  (define start-time (current-seconds))
  (sleep 5)
  (define end-time (current-seconds))
  (printf "(sleep 5) took ~a seconds" (- end-time start-time)))

That works for timing an ordinary bit of code, but to write that boilerplate timing code each time is just pure frustration. What about passing the evaluation to a function?

(define (time-me code)
  (define start-time (current-seconds))
  code
  (define end-time (current-seconds))
  (printf "?? took ~a seconds" (- end-time start-time)))

Doesn't actually work. We cannot pass code into the function and hope that it gets timed, because Racket evaluates arguments before the function body runs: by the time start-time is sampled, the work is already done. Instead, we must create a macro that can time code for us.

(define-syntax-rule (time-it expr)
  (begin
    (define start-time (current-seconds))
    expr
    (define end-time (current-seconds))
    (printf "Operation took ~a seconds" (- end-time start-time))))

There, that works. However, if we wanted to do something like:

> (define intense-result
    (time-it (ackerman-function 10000 2000)))

We cannot, because:

1. The last expression of our macro is simply void because of the printf call.
2. We cannot nest define expressions within themselves.

The second point means that define is not allowed in an expression context, such as the right-hand side of another define, even when it is wrapped in begin.

(define x (begin (define y 3) y))
string:1:17: define: not allowed in an expression context

To work around this, we instead must use let expressions to nest our results. The code below, while primitive, does get the job done and allows you to capture the result of the expression you are trying to time.

(define-syntax-rule (time-it expr)
  (begin
    (let ([start-time (current-seconds)])
      (let ([result expr])
        (let ([end-time (current-seconds)])
          (printf "Operation took ~a seconds\n"
                  (- end-time start-time))
          (values result))))))

Now the result gets properly timed, evaluated, and is able to be stored as a binding to a define or let, all without compromising on code. Simply wedge your code into a time-it call and boom, plain and simple code timing.

> (time-it (sleep 5))
; pause for 5 seconds
Operation took 5 seconds

> (define (factorial x) (if (eqv? 0 x) 1 (* x (factorial (sub1 x)))))

; do the next one at your own peril - it's intensive
> (define result (time-it (factorial 60000)))
; give it a few seconds
Operation took 2 seconds

(Alternatively, you could use current-inexact-milliseconds for sub-second resolution.)
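
A minimal sketch of that alternative, assuming the same shape as time-it. The name time-it-ms is made up for this example, and I've flattened the nested lets into a let* purely for readability.

```racket
#lang racket

;; Variant of time-it that samples current-inexact-milliseconds
;; instead of current-seconds, so short operations don't show as 0.
(define-syntax-rule (time-it-ms expr)
  (let* ([start-time (current-inexact-milliseconds)]
         [result expr]
         [end-time (current-inexact-milliseconds)])
    (printf "Operation took ~a ms\n" (- end-time start-time))
    result))

;; The timed expression's value is still returned and can be bound.
(define total (time-it-ms (for/sum ([i (in-range 1000000)]) i)))
```

Racket also ships a built-in time form that prints cpu, real and gc time for its body, which may be all you need if you don't care about the message format.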

Subprocessing

One thing I like doing in Racket is invoking other programs on the system with fixed arguments. At the system level this typically involves a fork(), with the running Racket VM as the parent of the child process. Child processes can then be tied to Racket's resource-management system, called a custodian. Depending on the settings, a process can outlive the Racket VM, be shut down when the Racket VM shuts down, or be closed explicitly from Racket.

To work with a subprocess, you use the function literally named subprocess. It returns four values which you must bind with define-values, namely the process managed by Racket, and its three data ports for input, output and errors.

(define (start-program bin-name args)
  (define-values (S I O E)
    (subprocess
      (current-output-port)
      (current-input-port)
      'stdout ; error port redirect to stdout
      (find-executable-path bin-name)
      args))
  (subprocess-wait S))

While it looks like a lot at first, it's a very handy function I use a lot in my work and non-work related code, and I find it to be way easier to use than most other subprocess libraries in other languages.

The subprocess object itself will run, but the Racket VM does not run at the same pace as the subprocess itself. The child process can be faster or slower, and the way to make them synchronize is by using Events and a bit of thread blocking to sync up timings in Racket code. The best way to ensure a process runs completely is by blocking the Racket thread from continuing until the child process has finished, by doing subprocess-wait.

However, subprocess comes with some gotchas. The final value we passed, args, does not work for multiple arguments. After the ports and the executable path, subprocess takes each command-line argument as its own separate argument, not as a single list. The code above does not account for that, so we need a rest argument and apply.

(define (start-program bin-name . args)
  (define-values (S I O E)
    (apply subprocess
           `(,(current-output-port)
             ,(current-input-port)
             stdout
             ,(find-executable-path bin-name)
             ,@args)))
  (subprocess-wait S))

This works perfectly fine and is very useful.

> (start-program "sleep" "5")
; sleeps for 5 seconds
> (start-program "seq" "10" "15")
10
11
12
13
14
15
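
As a side note, apply also accepts fixed arguments before the final list, so the quasiquote isn't strictly required. Below is a sketch of start-program written that way; the spliced and applied bindings are made-up names that show the same argument-building rule with list, so it can be checked without launching anything.

```racket
#lang racket

;; Variant of start-program: apply takes the fixed arguments first
;; and the list of remaining arguments last.
(define (start-program bin-name . args)
  (define-values (S I O E)
    (apply subprocess
           (current-output-port)
           (current-input-port)
           'stdout ; error port redirect to stdout
           (find-executable-path bin-name)
           args))
  (subprocess-wait S))

;; Same rule, demonstrated with `list` instead of `subprocess`.
(define extra '("10" "15"))
(define spliced `(seq ,@extra))          ; quasiquote with splicing
(define applied (apply list 'seq extra)) ; leading argument, then the list
```

Both styles build the same call; which one reads better is mostly taste.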

The new pain is writing code that is actually fun. Right now, this isn't super fun to use, because you constantly have to declare the binary, then also declare the arguments. Wouldn't it be better to simply capture the binary first as a closure, that way we can apply it later?

Thankfully that's called currying, which exists in Racket. The concept is capturing a function call with partial arguments and freezing it as a new procedure, so we can use it later. This is a highly useful technique that has been around in computer science for decades.
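
If the idea is new to you, here's a tiny sketch with plain functions and no subprocesses involved. It assumes curry from racket/function, which #lang racket pulls in; greet, hello and tagged are made-up names for illustration.

```racket
#lang racket

;; A fixed-arity function: curry holds on to the first argument
;; and waits for the second one.
(define (greet greeting name)
  (string-append greeting ", " name))

(define hello (curry greet "Hello"))
(define greeting (hello "World"))

;; A variadic function: the first application only captures
;; arguments, the next application makes the actual call.
(define tagged (curry list 'seq))
(define call (tagged "10" "15"))
```

The variadic case is the one that matters for start-program, since it also takes a rest argument.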

To curry in Racket, we import curry from racket/function, since it isn't part of racket/base. We can improve on our start-program function by wrapping it with curry.

> (define sleeper (curry start-program "sleep"))
> (sleeper "5")
; sleeps for 5 seconds

This is still slightly prone to problems. The binary might not exist, and typing out curry each time seems like user overhead for something we aim to simplify. Surely there must be a way to test whether the program exists on the system, and then generate code that we can use instantly.

Let's define a new keyword to replace define by using a macro. The goal is to provide something that vaguely resembles Lisp macros that exist in Common Lisp (defvar, defun, defparameter, defmacro), so this way we can use it only to create curried processes.

(define-syntax-rule (defproc pname)
 (define pname
  (let ([procpath (find-executable-path (symbol->string 'pname))])
   (if (eqv? #f procpath)
       (error 'defproc "Executable not found, is it installed?")
       (lambda args
        (begin
         (define-values (S I O E)
          (apply
            subprocess
            `(,(current-output-port)
              ,(current-input-port)
              stdout
              ,procpath
              ,@args)))
         (subprocess-wait S)))))))

Oof, that's a mouthful, but it works! No longer do we have to explicitly write out a boring define expression; we can just use this macro in its stead and it will generate the necessary code for us.

> (defproc seq) ; the linux seq command
> (seq "1" "3")
1
2
3
> (defproc ping) ; now for net pinging
> (ping "192.168.1.1")
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.381 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.842 ms

Awesome stuff, right?

I have exhausted all my tiny macros for the moment, so if you made it to the end, thank you for reading and give these macros a try! There is always room for improvement!