web crawler in Racket

I usually like to suggest projects to students as part of their evaluation in the ‘programming language’ course. The course uses the Racket language and we follow the SICP book, so the question is always what good projects for the students would be. Getting data from different sources and combining them in a flexible user interface is a very common idea. Today I decided to investigate the difficulty of developing a simple web crawler in Racket. Since I usually code in Common Lisp, I was looking for something similar to CL libraries like Drakma and Closure.

Using DrRacket only for teaching is not enough to be comfortable with the Racket ecosystem. First I had to discover how to install an HTML parsing library. This was done with:

raco pkg install html-parsing

After deciding which libraries to use, I had to understand their interfaces. My first Racket code to retrieve and parse a simple HTML page is:

#lang racket

(require net/http-client)
(require html-parsing)

;; http-sendrecv returns three values: the status line, the
;; response headers, and an input port with the response body.
(let-values (((status headers body)
              (http-sendrecv "arademaker.github.io"
                             "/about.html")))
  (html->xexp body))

I am not sure this is the most efficient way to do it, but it is surely simple enough to start with.
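With the page parsed into an x-expression, the natural next step for a crawler would be extracting the links to follow. A minimal sketch of how that could look (the function name extract-links is my own, and it naively collects every href attribute in the tree, not only those on anchor elements):

#lang racket

(require net/http-client
         html-parsing)

;; Walk an x-expression, collecting the values of all href attributes.
;; In the x-exp representation produced by html->xexp, attribute lists
;; are tagged with the symbol @, e.g. (a (@ (href "/about.html")) "About").
(define (extract-links xexp)
  (match xexp
    ;; an attribute list: keep the href values
    [(list '@ attrs ...)
     (filter-map (match-lambda
                   [(list 'href url) url]
                   [_ #f])
                 attrs)]
    ;; any other list: recurse into the children
    [(list items ...) (append-map extract-links items)]
    ;; atoms (symbols, strings): nothing to collect
    [_ '()]))

For instance, (extract-links '(html (body (a (@ (href "/about.html")) "About")))) evaluates to '("/about.html"). From here, a crawler loop would fetch each collected URL in turn, which I leave as an exercise.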