## Word chains: Generalizing the chaining rules

The word chain puzzle presented at codekata.com gives defines the following rule.

Successive entries in the chain must all be real words, and each can differ from the previous word by just one letter

Which I intepreted, from their samples, that substituion of one character at each step was allowed. At wordchains.com they define rules that allow more transitions:

A word chain is a simple word game, the object of which is to change one word into another through the smallest possible number of steps. At each step a player may perform one of four specific actions upon the word in play – either

add a letter,remove a letterorchange a letterwithout switching the order of the letters in play, orcreate an anagramof the current word.

It is quite easy to modify the haskell implementation to use this different ruleset instead. The only thing that is needed is another implementation of the neighbours function:

neighbours :: Word -> Word -> Bool neighbours xs ys = case length xs - length ys of 0 -> chardiff xs ys == 1 || anagram xs ys 1 -> elem ys (substrings xs) -1 -> elem xs (substrings ys) _ -> False where anagram xs ys = xs /= ys && sort xs == sort ys chardiff xs ys = length $ filter (\(a,b) -> a /= b) $ zip xs ys substrings xs = unfoldr generate ([],xs) where generate (_,[]) = Nothing generate (xs,(y:ys)) = Just (xs++ys, (xs++[y],ys))

Source code is stored at github.

## Word puzzle in C++

Last post presented a solution to the word chain problem based on basic graph theory and implemented in Haskell. It was concluded that the initialization of the graph from the dictionary was time consuming (about four minutes on my laptop). C++ is known for industrial strength library implementations and strong compilers. Can we improve the performance using C++ then? I decided to try out, equipped with Microsoft Visual Studio 2013 and the boost graph library.

##### Type definitions

Before we begin coding we need a few type definitions.

typedef boost::adjacency_list<boost::listS, boost::vecS, boost::undirectedS> graph_t; typedef std::vector<std::string> dictionary_t; typedef dictionary_t::size_type index_t; typedef std::map<std::string, index_t> index_lookup_t;

Line 1: The graph type. The boost graph library provides (at least) two graph implementation; the matrix graph which is for dense graphs and the adjacency list which is for sparse graphs. For this problem the graph is definitely a sparse one we go for the latter option. The template parameters specifies how nodes are their neighbors are stored. I will not go into detail about this but it is reasonably well documented in the library manual. Next template argument specifies that we want an undirected graph since the neighborhood concept for word chains is symmetric.

Line 2: The dictionary type which is simply a vector of strings.

Line 3: The index type is the index used for the vector.

##### Constructing the graph

Now, we can write some code. First thing is to load the dictionary and construct the graph.

dictionary_t dictionary( istream_iterator<string>(ifstream("dictionary.txt")), istream_iterator<string>()); index_lookup_t word2index; transform( dictionary.begin(), dictionary.end(), counting_range<index_t>(0, dictionary.size()).begin(), inserter(word2index, word2index.begin()), make_pair<const string&, const index_t&>); graph_t g(dictionary.size()); for (auto i = 0U; i < dictionary.size(); ++i) { for (auto j = i + 1; j < dictionary.size(); ++j) { if (neighbours(dictionary[i], dictionary[j])) add_edge(i, j, g); } }

Line 1-3 leverages stream iterators to read the file data and construct the vector.

Line 6-10 uses the the std algorithm transform to construct the mapping from words to indices. The idea is to create pairs of indices and words and the insert the result into the map using the inserter class. boost counting_range comes handy to produce an enumeration from zero to the size of the dictionary. In Haskell we used an infinite list [0..] in this case.

Line 12-19 loops over the words and insert edges for every words that are neighbors.

Implementing the neighbor test is simple.

bool neighbours(const std::string& s1, const std::string& s2) { const size_t length = s1.length(); if (length != s2.length()) return false; int diff = 0; for (auto i = 0U; i < length && diff <= 1; ++i) { if (s1[i] != s2[i]) ++diff; } return diff == 1; }

##### Finding the shortest path

Once we got the graph we want to find the shortest path between to words.

vector<vertex_descriptor_t> predecessors(num_vertices(ctx.g)); dijkstra_shortest_paths( ctx.g, sourceIndex, predecessor_map(&predecessors[0]). weight_map(fixed_weight_pmap()) ); list<string> path; auto index = targetIndex; while (index != sourceIndex) { path.push_back(ctx.dictionary[index]); index = predecessors[index]; } path.push_back(ctx.dictionary[index]); path.reverse(); return path;

The boost graph library comes with the dijkstra_shortest_paths function for this purpose. We pass in:

- the graph
- the source node

and (using the slightly confusing solution for passing named parameters):

- a predecessor map where to store the result
- a weight map that defines the weights for the edges

- The weight map was a bit of a struggle. The standard way is to provide a weight for each edge but that seemed like a waste since every edge had the weight one. I ended up writing a very simple class implementing the read only property map concept.

struct fixed_weight_pmap { }; template <class K> inline int get(const fixed_weight_pmap& pa, const K& k) { return 1; } namespace boost { template<> struct property_traits<fixed_weight_pmap> { typedef int value_type; }; }

##### Conclusions

So… how about performance? Does not still take four minutes to initialize? No. Nope. Not at all. Not even close. It takes less than ten seconds. Pretty impressive. The boost graph library is fast. It is generic. Unfortunately, it is not always easy to use. Partly due to the generality and the amount of concepts/sub-libraries/styles used (one example is the named parameters). Partly due to (imo) cumbersome documentation which too often is missing a good example. I hope this article can (very) slightly improve on the last deficiency.

##### References

The entire source code is stored on github.

The red book. Boost graph library user guide and manual.

## Word chains in Haskell

An example of a word chain puzzle (generally described at codekata.com) is to find words that links “cat” to the word “dog” using words from the dictionary only replacing one character at the time. The shortest chain that solves the example is:

cat –> cot –> cog –> dog

The problem can be seen as a graph problem where each word is a node and all nodes that only differs by one character are joined with an edge. Using such a representation, the solution to the problem can be calculated using a standard shortest path algorithm.

Lets look at how this can be implemented in Haskell. Representing graphs in (pure) functional languages are are tricky but Martin Erwig’s Functional Graph Library comes to rescue. Use cabal to download and install it from hackage.

> cabal install fgl

Once installed we can import it.

import Data.Graph.Inductive.Graph import Data.Graph.Inductive.Tree import Data.Graph.Inductive.Query.SP

Building the graph turns out to be straight forward.

graph :: Dictionary -> Gr Word Int graph dictionary = let nodes :: [LNode Word] nodes = zip [0..] dictionary edges :: [LEdge Int] edges = [(i,i',1) | (i,w) <- nodes, (i',w') <- nodes, i /= i', neighbours w w'] in mkGraph nodes edges

Nodes are pairs of an integer and a generic data type which in this case are strings. One line 5 all nodes are created by zipping an infinite list of integer indexes (who does not love that Haskell actually allows this abstraction) with our dictionary which is simple a list of strings.

Edges are three tuples; index of the first node, index of the second node and the edge weight which is always one in the case of word chains. At line 7 we leverage list comprehensions to construct all the edges. All that is needed above that is a function that checks if two words are “neighbors” meaning that they only differ one character.

neighbours :: Word -> Word -> Bool neighbours xs ys = length xs == length ys && distance 0 xs ys == 1 where distance d [] [] = d distance d (x:xs) (y:ys) = case (x==y, d) of (False, 0) -> distance 1 xs ys (False, _) -> -1 -- no use to continue... (True, d) -> distance d xs ys

The code above tries to be a bit efficient but I am sure it can be expressed more concise. So, now we can build a graph and luckily the graph library also provides a shortest path implementation:

sp :: (Graph gr, Real b) => Node -> Node -> gr a b -> Path

That is all we need to put our solution together into a working program.

main = do contents <- readFile "dictionary.txt" let dictionary = lines contents putStrLn "Initializing..." t1 <- getCurrentTime let !g = graph dictionary t2 <- getCurrentTime putStrLn $ "Done. Took " ++ (show (diffUTCTime t2 t1)) let toIndex w = fromJust $ elemIndex w dictionary let fromIndex i = dictionary!!i let loop = do putStrLn "source:" sourceString <- getLine putStrLn "target:" destString <- getLine let source = toIndex sourceString let target = toIndex destString let path = sp source target g putStrLn $ show $ map fromIndex path loop loop

Line 2 reads the dictionary from a text file and line 3 splits each line into a separate string which produces a list of words.

Line 6 creates the graph. It using the BangPatterns extension (!) to avoid lazy initialization. This is not exactly needed but I find it nice to be able to initialize the graph and then startup.

Line 8 and 9 are converters between word index and actual word.

Line 13 and 15 retrieves source and target words from the user (such as “cat” and “dog”).

Line 18 executes the shortest path search for the words given by the user.

That’s it. Lets compile an optimized executable of this program using ghc.

> ghc -O main.hs

When executing the program it spends a lot of time on the initialization. The graph takes almost four minutes to construct on my laptop for a dictionary of approximately 58000 words. A bit annoying for the user. Having constructing the graph the results are calculated instantly which is nice. In the next blog post, we will implement the same solution using c++ and boost graph library.

(The entire source code is published on github)