This reminded me of one of my high school computer science assignments: find all the words in a single Boggle board, and try to optimize your solution a bit. The point was to teach recursion/backtracking and data structures. The intended solution was roughly: start at a square, check whether the current prefix is a valid prefix of some word, move to a neighbor recursively, and emit any words you find. Trying to optimize naturally motivates a trie data structure.
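For concreteness, here's a minimal sketch of that intended solution (the dict-of-dicts trie and flat 4x4 string board are illustrative choices, not the assignment's):

```python
# A minimal sketch of the trie-based backtracking approach.
def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True  # end-of-word marker
    return root

def solve(grid, trie, n=4):
    """grid is a flat n*n string, e.g. 'abcdefghijklmnop'."""
    found = set()
    def dfs(i, node, prefix, used):
        ch = grid[i]
        if ch not in node:
            return  # current prefix is not a valid prefix: backtrack
        node = node[ch]
        if "$" in node:
            found.add(prefix + ch)  # emit any word we complete
        used.add(i)
        r, c = divmod(i, n)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if 0 <= r + dr < n and 0 <= c + dc < n:
                    j = (r + dr) * n + (c + dc)
                    if j not in used:
                        dfs(j, node, prefix + ch, used)
        used.remove(i)
    for start in range(n * n):
        dfs(start, trie, "", set())
    return found
```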
I found it to be at least an order of magnitude faster, though, to invert the solution: loop through each word in the dictionary and check whether it exists in the grid! The dictionary is small compared to the number of grid paths, checking whether a single word exists in the grid is very fast and requires little backtracking, and the approach lends itself well to heuristic filtering.
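And a sketch of that inverted search, with a cheap letter-set check standing in for the heuristic filtering (the exact filters aren't specified above):

```python
# A sketch of the inverted approach: for each dictionary word, DFS the grid
# for a path spelling it. The board is assumed to be a flat 4x4 string.
def neighbors(i, n=4):
    r, c = divmod(i, n)
    return [(r + dr) * n + (c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and 0 <= r + dr < n and 0 <= c + dc < n]

def word_in_grid(word, grid):
    def dfs(pos, k, used):
        if grid[pos] != word[k]:
            return False
        if k == len(word) - 1:
            return True
        return any(dfs(nb, k + 1, used | {pos})
                   for nb in neighbors(pos) if nb not in used)
    return any(dfs(start, 0, frozenset()) for start in range(len(grid)))

def find_all_words(grid, dictionary):
    present = set(grid)
    # Words using a letter absent from the grid are rejected immediately,
    # which prunes the vast majority of the dictionary before any search.
    return [w for w in dictionary if set(w) <= present and word_in_grid(w, grid)]
```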
Sorry, but this doesn’t pass the smell test. The article mentions 200,000 random 4x4 boards/second on a single core of an M2. That’s a ~4GHz chip, so a budget of only ~20,000 cycles per board. There are 200,000 words in the dictionary. You can’t possibly do something for every word in the dictionary; it would be far too slow.
It sounds like your Trie implementation had a bug or inefficiency.
I think GP mentioned it was on a _single_ boggle board.
Your best bet in that case is to store the dictionary in a Trie or DAWG structure that can be mmapped directly from disk.
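As a rough illustration of the idea, here is a toy trie laid out as a flat array of fixed-size records, which can be traversed directly from an mmapped buffer without deserialization; a real DAWG would additionally merge shared suffixes, and this particular record layout is just an assumption:

```python
# A sketch of a trie stored as a flat array of fixed-size records so the
# file can be mmapped and searched in place.
import mmap
import struct

RECORD = struct.Struct("<B26I")  # is_word flag + 26 child node indices

def build_flat_trie(words):
    nodes = [[0] + [0] * 26]  # node 0 is the root
    for word in words:
        cur = 0
        for ch in word:
            slot = ord(ch) - ord("a") + 1
            if nodes[cur][slot] == 0:
                nodes.append([0] + [0] * 26)
                nodes[cur][slot] = len(nodes) - 1
            cur = nodes[cur][slot]
        nodes[cur][0] = 1  # mark end of word
    return b"".join(RECORD.pack(*n) for n in nodes)

def contains(buf, word):
    """Look a word up directly in the mapped buffer."""
    cur = 0
    for ch in word:
        node = RECORD.unpack_from(buf, cur * RECORD.size)
        cur = node[ord(ch) - ord("a") + 1]
        if cur == 0:
            return False
    return RECORD.unpack_from(buf, cur * RECORD.size)[0] == 1

with open("trie.bin", "wb") as f:
    f.write(build_flat_trie(["cat", "cats", "dog"]))
with open("trie.bin", "rb") as f:
    buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
print(contains(buf, "cats"), contains(buf, "ca"))  # True False
```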
My first thought is that there is surely an Integer Linear Programming approach that can solve this within a few seconds using some very advanced solver like Gurobi.
These solvers can very often find the globally optimal solution - and prove that this solution is optimal - much faster than exhaustive search.
Of course they do use a smart exhaustive search, applying branch-and-bound as described in this article, but advanced solvers also use branch-and-cut, in which cutting planes in the solution space are generated, along with a variety of other approaches.
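For readers who haven't seen it, a generic branch-and-bound skeleton looks roughly like this (purely illustrative; the node, bound, and value helpers are placeholders, not anything from the article or a commercial solver):

```python
# A generic branch-and-bound skeleton: expand partial solutions, pruning
# any subtree whose optimistic upper bound can't beat the best complete
# solution found so far (the incumbent).
def branch_and_bound(root, children, upper_bound, value, is_complete):
    best, best_val = None, float("-inf")
    stack = [root]
    while stack:
        node = stack.pop()
        if upper_bound(node) <= best_val:
            continue  # prune: this subtree can't improve on the incumbent
        if is_complete(node):
            node_val = value(node)
            if node_val > best_val:
                best, best_val = node, node_val
        else:
            stack.extend(children(node))
    return best, best_val
```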
One interesting thing, however, is that GPUs are still not particularly applicable for solving Mixed Integer Linear Programs to sufficient accuracy. There are methods like PDLP that can use GPUs to solve these problems, but they are restricted to something like 1e-4 accuracy, whereas the state of the art is more like 1e-9.
I actually did try ILP, see https://stackoverflow.com/questions/79422270/why-is-my-z3-an...
I tried Z3 and OR Tools. I didn't try Gurobi. But this was enough to make me think ILP was a dead end. (There were a lot of dead ends in this project.)
I don't know much about integer programming, though, and I'd love to be proven wrong.
I saw that! In my experience, problems that seem completely intractable using open source tools often get solved in seconds using state of the art commercial approaches.
If you want to give it a try, I'd love to hear if that's the case! It's deleted in the repo now, but here's code to generate a spec for an ILP solver: https://github.com/danvk/hybrid-boggle/blob/62d3f01aed802734...
One interesting thing about Boggle is that the number of variables (16 cells) is very small compared to the number of coefficients on how they combine (the number of possible words).
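To make that structure concrete, here is a rough sketch of how such a model might look in OR-Tools CP-SAT (one of the solvers tried above), shrunk to a 2x2 board and a toy word list; it is not the deleted spec from the repo:

```python
# A rough sketch of a Boggle model in OR-Tools CP-SAT. The cell variables
# are the only real decisions; one indicator per (word, path) pair is
# where the coefficient blow-up comes from.
from itertools import product
from ortools.sat.python import cp_model

N = 2  # toy 2x2 board; the real problem is 4x4
CELLS = range(N * N)
WORDS = {"at": 1, "tan": 1, "ant": 1}  # word -> point value (toy list)

def neighbors(c):
    r, col = divmod(c, N)
    for dr, dc in product((-1, 0, 1), repeat=2):
        if (dr, dc) != (0, 0) and 0 <= r + dr < N and 0 <= col + dc < N:
            yield (r + dr) * N + (col + dc)

def paths(length):
    """All simple paths of the given length through adjacent cells."""
    out = []
    def dfs(path):
        if len(path) == length:
            out.append(tuple(path))
        else:
            for nb in neighbors(path[-1]):
                if nb not in path:
                    dfs(path + [nb])
    for c in CELLS:
        dfs([c])
    return out

model = cp_model.CpModel()
# x[c][l] == 1 iff cell c holds letter l; each cell holds exactly one.
x = [[model.NewBoolVar(f"x_{c}_{l}") for l in range(26)] for c in CELLS]
for c in CELLS:
    model.AddExactlyOne(x[c])

score = []
for word, pts in WORDS.items():
    spelled = []
    for path in paths(len(word)):
        z = model.NewBoolVar(f"z_{word}_{path}")  # this path spells word?
        for cell, ch in zip(path, word):
            model.AddImplication(z, x[cell][ord(ch) - ord("a")])
        spelled.append(z)
    if spelled:
        y = model.NewBoolVar(f"y_{word}")  # word found anywhere; scores once
        model.AddMaxEquality(y, spelled)
        score.append(pts * y)

model.Maximize(sum(score))
solver = cp_model.CpSolver()
solver.Solve(model)
print(solver.ObjectiveValue())  # 3: e.g. letters a,n,t spell all three words
```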
I am very intrigued by this. I’ll do some thinking this evening about how a tight Boggle model might look.
Great! Feel free to reach out -- my email isn't hard to find.
My favorite part of the write-up is the first sentence after the "What if there's a bug?" section.
Fun, love word game computations! Reminds me a bit of the challenge to place all the letters of the alphabet in as small a grid as possible, with valid words: https://gamepuzzles.com/alphabest.htm
I made a word game based on a similar concept, featuring different letters every day: https://spaceword.org
Somewhat related, I used minisat.js to generate Boggle boards that contain a given list of words!
[0] https://benrbray.com/projects/unboggler
Where can I find the wordlist that they used?
Edit: Found it here: https://coursera.cs.princeton.edu/algs4/assignments/boggle/f...
You can see all the wordlists I used here: https://github.com/danvk/hybrid-boggle/tree/main/wordlists
The proof used ENABLE2K — repeating it for other wordlists would require another ~23,000 CPU hours each.
Simulated annealing [1] is mentioned but not explained in the list of modifications to hill climbing. The technique is roughly: accept modifications to the board that decrease the score, with a probability inversely related to the magnitude of the decrease and which shrinks as the search progresses. This helps avoid getting stuck in local maxima; a short sketch follows below.
[1] https://en.wikipedia.org/wiki/Simulated_annealing
EDIT: Somehow I didn't see that simulated annealing was mentioned by name (but not explained), ha!
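For concreteness, a minimal sketch of that accept rule, assuming some score() over boards (e.g. total Boggle points) and a geometric cooling schedule (both illustrative):

```python
# A minimal sketch of simulated annealing over boards.
import math
import random

LETTERS = "abcdefghijklmnopqrstuvwxyz"

def mutate(board):
    """Change one random cell (board as a flat string)."""
    i = random.randrange(len(board))
    return board[:i] + random.choice(LETTERS) + board[i + 1:]

def anneal(board, score, steps=100_000, t0=5.0, t1=0.01):
    best = cur = board
    best_s = cur_s = score(cur)
    for step in range(steps):
        t = t0 * (t1 / t0) ** (step / steps)  # temperature cools over time
        cand = mutate(cur)
        cand_s = score(cand)
        # Accept improvements always; accept a score drop of `delta` with
        # probability exp(delta / t), which shrinks both with the size of
        # the drop and as the temperature falls.
        delta = cand_s - cur_s
        if delta >= 0 or random.random() < math.exp(delta / t):
            cur, cur_s = cand, cand_s
            if cur_s > best_s:
                best, best_s = cur, cur_s
    return best, best_s
```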
Annealing is mentioned a few times in the post but not discussed in any detail. I found that hill climbing with an expanded "pool" of boards and exhaustive search of neighbors was the most reliable way to get from a random starting point to the highest-scoring board: https://github.com/danvk/hybrid-boggle/blob/main/boggle/hill...
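Roughly, the pooled variant looks like this (a simplified sketch assuming score() and neighbors() helpers, not the linked implementation):

```python
# A simplified sketch of pooled hill climbing: keep the top pool_size
# boards, exhaustively score every neighbor of every board in the pool
# (neighbors() might yield all single-letter changes), and stop when the
# best score stops improving.
def pooled_hill_climb(seed_boards, score, neighbors, pool_size=50):
    pool = sorted(set(seed_boards), key=score, reverse=True)[:pool_size]
    best = score(pool[0])
    while True:
        candidates = {nb for b in pool for nb in neighbors(b)} | set(pool)
        pool = sorted(candidates, key=score, reverse=True)[:pool_size]
        if score(pool[0]) <= best:
            return pool[0], best  # no neighbor beat the incumbent
        best = score(pool[0])
```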
Somehow my eyes missed that! Edited my comment.
The article actually does mention using this technique, though it doesn't explain it, so thanks for the background from someone who isn't familiar with this space!
The article does indeed mention simulated annealing though?
Somehow I didn't see that, good catch! (It's mentioned but not explained.) Edited my comment.
Oh, that's very interesting. I've used this idea before in solvers but didn't know that this is what it's called!