**Implementing A* on iOS**
Kirk Spaziani
# Introduction
## Background (Graph Theory)
[Graph Theory](https://en.wikipedia.org/wiki/Graph_theory) is the study of structures that
model pairwise relations between objects. _Graph Theory_ is applicable to many problems
in Computer Science and aptitude in _Graph Theory_ is positively correlated with passing
those pesky tech interviews. One very common problem in the _Graph Theory_ space is finding
the _Shortest Path_ between two vertices.
A _Vertex_ (sometimes node) is an abstraction of a point. An _Edge_ connects two vertices.
A _Path_ between two vertices is a sequence of edges that connect the vertices. In Figure
[sample-graph] below, there are 3 paths from `V1` to `V3`. One path, let's call it path
`P1` consists of edges `{ e1, e2 }`. Another path, `P2`, consists of edges `{ e3, e7, e8 }`.
The final path, `P3`, consists of `{ e3, e4, e5, e6, e9 }`.
***********************************************************************
* .--. e2
* .----->| V2 +-------------------.
* .--. e1| '--' |
* | V1 +----' v
* '-+' .+-.
* | .------->+ V3 |
* | |e8 '-+'
* | | ^
* e3| .-+. |e9 .--.
* | .---------------->+ V8 | '--+ V9 |
* | |e7 '--' '+-'
* | | |
* | .+-. .--. .--. |
* '--->+ V4 +------->+ V5 +------>+ V7 +---------'
* '--' e4 '--' e5 '--' e6
***********************************************************************
[Figure [sample-graph]: Sample Graph]
Of the paths `{ P1, P2, P3 }`, which is the _Shortest_? Each _Edge_ of a graph has a
_Weight_, which is an abstraction for the time required or difficulty in traversing the
_Edge_. The [Shortest Path](https://en.wikipedia.org/wiki/Shortest_path_problem) then, is
the _Path_ where the sum of all the _Edge Weights_ is the smallest.
## Why is finding the Shortest Path Important?
When you ask Google Maps to get you home, it will guide you through the _Shortest Path_.
In this context, the _Shortest Path_ is the one that will take you the least amount of
time. It will look at traffic, weather, construction, and other road conditions to
determine the _Edge Weights_, the time it will take to traverse each road subsection. As
described above, the _Shortest Path_ is the sequence of roads with the lowest total
_Weight_ (time to get home).
Now think of almost any video game. Bad guys need to get from where they are to where you
are if they want to kill you. They could just wander around randomly but that wouldn’t be
very effective. Bad Guys in “Classic” games often knew their position relative to yours
and would attempt to move towards you. An enterprising gamer would soon be able to trick
them into getting stuck behind an object. The bad guy was too stupid to navigate to your
position because it couldn’t calculate a proper path. (Bad AI is Bad).
Any game where you click on a location and expect a character to move there has to
calculate the _Shortest Path_ to that location so the character can navigate there. Some
games have hundreds or thousands of entities constantly trying to navigate. This forces us
to consider efficiency as well as correctness.
There are many algorithms to find _Shortest Paths_. Some are designed to find the
_Shortest Path_ between two _Vertices_. Some will find all shortest paths from one
_Vertex_ to all others. Some will even find the _Shortest Paths_ from all to all in a
_Graph_.
## A*
I would like to talk a little bit about one algorithm in particular, A* (A Star). One way
to increase efficiency in finding the shortest path is by not exploring edges that are
unlikely to be part of the actual shortest path. A* works by using a heuristic to
determine what vertex to process next. A* is very useful for grid based games because the
heuristic is very easy to write against a grid.
“But Kirk!” you exclaim, “How is a grid like a graph?”
If you consider each square of the grid to be a _Vertex_, the _Edge_ is the infinitely
small border between the square and its neighbor squares. This even allows representation
of edge weights. In a game you might expect walking through mountains to be more expensive
than walking on roads.
I’m going to describe a very high level description of the algorithm. There are many
places on the web to learn about most of the specifics, one purpose of this article is to
elaborate on an important detail that most (even very well written articles) seem to
ignore.
The algorithm traverses the _Graph_ beginning with the start point. For each _Vertex_ it
visits, it adds to the _Closed Set_ and checks to see if it has reached its destination.
If it is at the destination it builds the path and returns. If not it explores all the
neighboring _Vertices_ that are not in the _Closed Set_ and adds them to the _Open Set_.
Then it selects the best option from the _Open Set_ and repeats the visitation process.
When the algorithm needs to choose the next node to visit, it consults the _Open Set_.
A*’s advantage is that it chooses the node that is most likely based on the information
that it currently has to be the next step on the _Shortest Path_. How does it do that?
When the _Vertex_ is added to the _Open Set_, the estimated _Weight_ of the final path
that includes that _Vertex_ is calculated by adding the current path weight with the value
the heuristic returns for the remainder of the trip. When _Vertices_ are being added to
the _Open Set_, the _Open Set_ might already contain that _Vertex_ from a different
origin. At this point it needs to be _Reprioritized_. Conceptually that’s a simple
operation right? You search the _Open Set_ to see if the _Vertex_ is there. You check to
see if this path is better, and you update the relevant values.
“Ok, So are we done?”
Nope. Pathfinding is an expensive operation, obviously accuracy is important, but if the
calculations bring your game to a screeching halt, then we have a problem. Do we want less
accurate faster pathfinding? Or can we get more performance out of the algorithm I
described? Let’s try first to keep the accuracy and find the inefficiency.
# Performance
The _Open Set_ I described supports these operations:
* __Insert Vertex__
* __Remove “Best” Vertex__
* __Update Vertex__ (remove re-add)
### So what happens if we use an NSMutableArray for those operations?
* __Insert Vertex__ is O(1), ok so far, so good.
* __Remove "Best" Vertex__ is a search O(n), and a remove O(n). Actually the search is an
always search everything so best case is n operations to find the index to remove.
Then the removal will copy the items in the array after the removal.
* __Update Vertex__ is a linear search, which means O(n). In our cases however, it is
essentially free because we already located it.
### Can we use CFBinaryHeap?
CFBinaryHeap is a bit unfriendly to use but we can wrap it if necessary to make it feel
like a Cocoa object! Isn’t this a great idea? Let’s see:
* __Insert Vertex__ has an average case of O(1) and a worst case of O(log n).
* __Remove "Best" Vertex__ is O(log n) – that’s way better.
* __Update Vertex__
- Remove items until the item to be modified is located.
- Modify the item.
- Re-add all the items O(n).
This solution seems a lot better than the NSMutableArray, but the __Update Vertex__ is
still a sore spot, being likely effectively slower, if not asymptotically. Can we make it
better?
### Best Way is Best.
Imagine if we could create a data structure for the _Open Set_ that had these properties:
* __Insert Vertex__ has an average case of O(1) and a worst case of O(log n).
* __Remove "Best" Vertex__ is O(log n).
* __Update Vertex__ is O(log n).
Wow! What data structure is it!? It's a custom Priority Queue implemented with a heap and
a dictionary. CFBinaryHeap does not provide functions to do what we need so it's necessary
to implement the heap from scratch. The heap must be implemented using an array so that
each entry can be accessed by index. The dictionary is a mapping of the _Vertex_ (key) to
index (value). Whenever the heap updates the location of a _Vertex_, that information also
needs to be sent to the dictionary.
# Implementations
This article was originally published in August 2014, and it described the A*
implementation used by [The Lonely Dwarves](https://goo.gl/bmtztf) as written in
Objective-C. Times have changed a bit, and this article focuses on the Swift
implementation. The legacy Objective-C Implementation may be reposted at a later date.
[Swift A* Implementation](astar-swift.md.html)