In-Depth

Solving Sliding Tiles with Artificial Intelligence (and Some C#)

True artificial intelligence is years away. But we can demonstrate the challenges of AI decisionmaking using C# to derive solutions for the classic sliding tiles puzzle.

[Editor's Note: Several readers have been asking about a sample file, Heap.cs, that was missing from the Code Download. It's there now; apologies for the inconvenience to the author and to our readers.]

 Artificial intelligence refers to the process by which we develop a nonliving rational agent. A rational agent is an entity capable of perceiving his environment and acting rationally to the perceptions obtained on it. Rationality in this sense is given by the appropriate decision-making of the agent, considered appropriated if it's aimed towards maximizing some desired outcome.

Humans are seen as the finest example of living rational agents -- we receive perceptions from our environment all the time and react to them by making decisions that usually go in favor of our standard of living.

The human mind is still the most capable, complex and sophisticated intelligence in the known universe. There's probably not going to be an artificial intelligence in the near future that will achieve the power that the human mind currently possesses. Even if that's the case, if we have such sophisticated, complex, evolved mind, why do we need to create artificial intelligences?

For some domain-specific environments, artificial intelligences provide a better reaction to perceptions than the reaction provided by humans. Artificial intelligences present a feasible, sometimes optimal outcome to extremely complicated problems by taking advantage of their facility to compute millions of instructions in short periods of time. Even though we have a more elaborated intelligence in terms of reasoning, computers actually defeat us in the field of calculations. For calculation-related environments we create artificial intelligences that can speed up the process of obtaining a maximized outcome.

The calculator, that simple machine that we frequently use in our daily life is a mere example of how we exploit an artificial intelligence in a calculation environment; we lack the ability to correctly execute algebraic operations in short periods of time, therefore, we created the calculator to ease our life.


We can demonstrate another example of artificial intelligence, using a rational agent represented by a C# program for solving a sliding tiles puzzle. The rationality of the agent will be provided by an A* search algorithm tied to several heuristics that will help guide the search towards a maximized outcome.

The Puzzle
The Sliding Tiles Puzzle was created by chess player and puzzle maker Sam Loyd (1841-1911) in the 1870s. The puzzle consists of a N x M board shown in Figure 1, where each cell could be represented as a number, a letter, an image or basically anything you can think of.

A 3 x 3 Sliding Tiles Puzzles Board
[Click on image for larger view.] Figure 1: A 3 x 3 Sliding Tiles Puzzles Board

The task in the sliding tiles puzzle is to rearrange the tiles of some starting state in a manner in which all of them end up arranged matching another configuration known as the goal state (see Figure 2).

Moving Tiles from Start State to Goal State
[Click on image for larger view.] Figure 2: Moving Tiles from Start State to Goal State

To be able to reach the goal state a sequence of moves is required; one move consists of swapping the empty tile with some other tile. The solution to the previous example would be obtained as shown in Figure 3.

Moving From Start to Goal Requires Moving Two Tiles
[Click on image for larger view.] Figure 3: Moving From Start to Goal Requires Moving Two Tiles

A logical question probably pops up into the reader's mind when he thinks of the starting state; could I truly find the goal state from this initial state? Sam Loyd once offered $1,000 to anyone who could solve the puzzle in Figure 4.

Can You Solve This 4 x 4 Board?
[Click on image for larger view.] Figure 4: Can You Solve This 4 x 4 Board?

Although several people claimed they found a solution, the reality is that the previous puzzle has no solution. So, how do we know if a given initial configuration is solvable?

It has been proved that a configuration is solvable if the following conditions hold:

  • If the width is odd, every solvable state has an even number of inversions.
  • If the width is even, every solvable state has: a) an even number of inversions if the empty tile is on an odd numbered row counting from the bottom; b) an odd number of inversions if the empty tile is on an even numbered row counting from the bottom.

An inversion occurs when some tile precedes another tile with a lower value; the goal state has zero inversions. For instance, considering a 4 x 4 board the number 12 at the upper left corner will have a total of 11 inversions as numbers 1 through 11 come after it. In general, an inversion is a pair of tiles (a, b) such that a appears before b, but a is greater than b.

The A* algorithm
The agent we'll be developing searches the state space (universe of all possible board configurations) using an A* Informed Search (https://en.wikipedia.org/wiki/A*_search_algorithm Hart, 1968). The search is informed as it selects the next step considering domain knowledge which is represented by the value of a function associated to every state of the problem (see Figure 5). The possible states are: up, down, left and right each, related to moving the empty tile in those directions.

A* Informed Search
[Click on image for larger view.] Figure 5: A* Informed Search

The key element in the A* resides in the value assigned to every state as it gives the possibility of exponentially reducing the time invested in finding the goal state from a given initial state (see Figure 6). The function outputting the value bound to every state s can be decomposed as:

f(s) = g(s) + h(s)

where g is the cost of reaching s from the initial state, usually translated as the number of moves executed from the initial state up to s and h is the rational component of the agent, the component defining the cost of all possible paths starting at s, this component is known as the heuristic.

A* Informed Search
[Click on image for larger view.] Figure 6: Finding Goal State from Given Initial State

The heuristic function is the manner in which we attach our empirical rules to the agent -- therefore, the better rules we add the sooner we'll reach the goal state. The function MisplacedTiles(s) that outputs the number of misplaced tiles for some state s can be considered as a feasible heuristic since it always provides insight on how far we are from reaching the goal state. An important note when creating or including a heuristic is its admissibility, necessary in order to guarantee that the A* algorithm will be complete (it finds a solution) and optimal (it finds an optimal solution).

A heuristic is admissible if it never overestimates the minimum cost of reaching the goal state from some node s, and its value must remain lower or equal than the cost of the shortest path from s to the goal state. The Misplaced Tiles heuristic is admissible since every tile out of place must be moved at least once in order to arrange them into the goal state.

As depicted in the last figure the set of states can be modeled as a graph where the children of a state node are obtained by moving the empty tile in all possible directions. Since the problem can be modeled as a graph it's logical to use a graph search algorithm for locating the goal state.

The basic skeleton of the A* is that of a Breadth First Search. The difference lies on the selection of the next node to enqueue or expand. In a BFS all nodes possess value 1 therefore it does not matter what node to expand -- in the A* we'll always expand the node that maximizes the desired outcome, the node with minimum cost (shortest path to reach the goal state) in the particular case of the Sliding Tiles Puzzle.

Figure 7 illustrates the execution of the A* algorithm using Misplaced Tiles as heuristic function.

Running the A* Algorithm via Misplaced Tiles
[Click on image for larger view.] Figure 7: Running the A* Algorithm via Misplaced Tiles

It's important to note when calculating any heuristic that we never take into account the empty tile -- if we do then we could be overestimating the real cost of the shortest path to the goal state, which makes the heuristic non admissible. Consider what would have happened in the prior example if we would have taken into account the empty tile, as shown in Figure 8.

What Happens When We Account For the Empty Tile
[Click on image for larger view.] Figure 8: What Happens When We Account For the Empty Tile

In the previous node we are just one step from reaching the goal state but the heuristic claims that we are at least two steps (8 and empty misplaced) from the goal state. It's clearly an overestimation of the real cost, thus, non admissible.

To start analyzing part of the code let us examine the StateNode class in Listing 1 that we'll be using to represent states or configurations on the board.

Listing 1: StateNode Class

        class StateNode<T>: IComparable<StateNode<T>> where T: IComparable
        {
            public double Value { get; set; }
            public T[,] State { get; private set; }
            public int EmptyCol { get; private set; }
            public int EmptyRow { get; private set; }
            public int Depth { get; set; }
            public string StrRepresentation { get; set; }

            public StateNode() { }

            public StateNode(T[,] state, int emptyRow, int emptyCol, int depth)
            {
                if(state.GetLength(0) != state.GetLength(1))
                    throw new Exception("Number of columns and rows must be the same");
                
                State = state.Clone() as T[,];
                EmptyRow = emptyRow;
                EmptyCol = emptyCol;
                Depth = depth;

                for (var i = 0; i < State.GetLength(0); i++)
                {
                    for (var j = 0; j < State.GetLength(1); j++)
                        StrRepresentation += State[i, j] + ",";
                }
            }

            public int Size
            {
                get { return State.GetLength(0); }
            }

            public void Print()
            {
                for (var i = 0; i < State.GetLength(0); i++)
                {
                    for (var j = 0; j < State.GetLength(1); j++)
                        Console.Write(State[i,j] + ",");
                    Console.WriteLine();
                }
                Console.WriteLine();
            }

            public int CompareTo(StateNode<T> other)
            {
                if (Value > other.Value)
                    return 1;
                if (Value < other.Value)
                    return -1;

                return 0;
            }
        }

The first point to notice in the StateNode class is its generic feature. We are making it generic since we want to have a sliding tiles puzzle with numbers, letters, images, etc. and the T generic type could embody any of the previous types. We are also requiring that the T type be comparable, this can prove to be useful -- even necessary -- if you want to implement a solvability test.

Remember that you'll need to calculate the number of inversions through elements comparisons. Also note that we are assuming the board will have the same width and height (n=m) and that the empty tile is "0". A description of every field, property is presented in this list:

  • Value: represents the value of f = g + h
  • T[,] State: the configuration or state represented by this node
  • EmptyCol: the column number where the empty tile is located on the board
  • EmptyRow: the row number where the empty tile is located on the board
  • Depth: depth of this state from the initial state
  • StrRepresentation: string representation of the configuration (i.e "1,2,3,4,5,6,7,8,0,". Built upon creation of the object, in the constructor.

comments powered by Disqus

Featured

Subscribe on YouTube