What is a Depth-First Search (DFS) Algorithm?
Building on our previous story about graphs and graph traversal algorithms, this time we will look into a depth-first search algorithm. A depth-search algorithm also traverses a graph by exploring it vertex-by-vertex, but it does it by following the vertical order of the vertices.
Although the depth-first search algorithm does not guarantee the shortest path between any two reachable vertices in a graph, it is widely used in many applications. Some of those are: finding connected components, performing topological sorting, finding the bridges of a graph, determining the closeness of any two vertices in a graph or a tree, and solving puzzles with a unique solution, such as labyrinths.
Algorithm Overview
The depth-first algorithm begins by denoting the start vertex as visited and placing it into the map of visited nodes.
The algorithm will check if the vertex corresponds to the entity being searched for (in our example below, this is commented as a trivial check).
If the entity being searched for is found, the algorithm will stop executing and it will return the corresponding vertex. Otherwise, the algorithm will loop through its neighboring vertices and recursively descent to each one of them.
This way, the algorithm will:
- a) eventually find the target entity along the downward path;
- b) reach the last (leaf) vertex in the branch, backtrack through the graph (implementation-wise: it will return to the previous caller in the function call stack) and repeat the descent along the next neighboring vertex;
- c) exhaust the graph by marking all the vertices as visited without finding the target entity;
- d) never finish in case of a non-termination, i.e. an infinite graph.
In short, contrary to some other algorithms (see the blog on the breadth-first search algorithm), the depth-first search algorithm will always attempt to go as far and as narrow as possible to find the solution, hence its name.
What Are Properties of DFS?
The depth-first search method is efficient and simple in terms of traversing a graph.
However, it might take a significant amount of time to find the solution in a deep graph even if the solution lies relatively shallow to the starting vertex, but away from the starting path.
Specifically, the next path of the graph can be explored only after the search traverses the entire previous path.
In some cases, this property can be alleviated by limiting the search depth (space complexity) in graphs with familiar structures, i.e., by knowing where the solution can be expected in a graph. Alternatively, the total cost of the search can also be limited (time complexity), allowing a traversal of only a fixed number of vertices.
Implementation DFS Python
The implementation of our depth-first search algorithm by a function DFS()
has four required and one optional parameter.
- The
graph
parameter expects an initialized Graph object (see the blog on the breadth-first search algorithm, the section on graphs). - The
start
parameter takes the starting vertex, which we choose freely (remember, a graph is not a tree, there is no absolute root). - The
visited
parameter references a map, i.e. a dictionary of visited vertices whose values are the edges along the search path. The parameter is defined externally so that we can resume the search at a later moment and construct the search path. - The
target
parameter is the entity we want to find in the graph, enclosed in a vertex. - The
depth
parameter is optional (defaults to 1), and tracks the depth of the currently explored vertex for visualization purposes.
For a better understanding of the algorithm and its implementation, each step is precisely described in the code below.
import graph sep = ' ' # The 'depth' parameter tracks the depth in the call stack # the algorithm is currently at, for visualization purposes. def DFS(graph, vertex, visited, target=None, depth=1): print(sep*depth + f'Exploring vertex {vertex.entity()}') # The starting vertex is visited first and has no leading edges. # If we did not put it into 'visited' in the first iteration, # it would end up here during the second iteration, pointed to # by one of its children vertices as a previously unvisited vertex. visited[vertex] = None result = None # Trivial check #1: searches for None are immediately terminated. if target is None: print(f' The vertex {target} does not exist') return result # Trivial check #2: if the entity is in the starting vertex. elif target == vertex.entity(): result = vertex return result # Otherwise, search through the lower-level vertices for edge in graph.adjacent_edges(vertex): # Gets the second endpoint. v_2nd_endpoint = edge.opposite(vertex) # Examines the second endpoint. if v_2nd_endpoint not in visited: # Keep searching at the lower level, from the second endpoint. result = DFS(graph, v_2nd_endpoint, visited, target, depth+1) print(sep*depth + f'Returning to vertex {vertex.entity()}') # Add the second endpoint to 'visited' and maps the leading # edge for the search path reconstruction. visited[v_2nd_endpoint] = edge # If the search was successful, stop the search if result is not None: break return result
Before we can test the algorithm, we have to initialize a graph and build it by adding vertices and edges to it:
# Initializes an empty graph (object). g = Graph() # Loads the graph with the first ten vertices. for i in range(10): g.add_vertex(i) # Constructs the 'vertices' dictionary for a more # convenient access during the graph construction. vertices = {k.entity():k for k in g.vertices()} # Constructs an arbitrary graph from # the existing vertices and edgs. g.add_edge(vertices[0], vertices[1]) g.add_edge(vertices[0], vertices[2]) g.add_edge(vertices[0], vertices[4]) g.add_edge(vertices[4], vertices[3]) g.add_edge(vertices[3], vertices[5]) g.add_edge(vertices[0], vertices[5]) g.add_edge(vertices[2], vertices[6]) # Initializes the visited dictionary # and the search path. visited = {} path = []
Now that we have prepared everything, we can test the DFS()
and see how it works. Here is the part of the code that runs the algorithm, constructs the search path (if there is one), and shows in a step-by-step manner how it proceeds through the graph:
# Starts the search. result = DFS(g, vertices[5], visited, 6) # If the entity is found... if result is not None: # The search path ends with the found vertex # (entity). Each vertex is a container for # its real-world entity. path_vertex = result # The entity is added to the 'path'. path.append(path_vertex.entity()) # Constructs the rest of the search path # (if it exists)... while True: # Gets a discovery edge # leading to the vertex. path_edge = visited.get(path_vertex) # If the path vertex is the root, # it has no discovery edge... if path_edge is None: break # Otherwise, gets the second # (parent vertex) endpoint. path_vertex = path_edge.opposite(path_vertex) # The entity is added to the 'path'. path.append(path_vertex.entity()) print('Search path found:', end=' ') # The path is reversed and starts # with the root vertex. print(*reversed(path), sep=' -> ') # Otherwise... else: print('\nEntity is not found')
The test run gave us the output:
Exploring vertex 5 Exploring vertex 3 Exploring vertex 4 Exploring vertex 0 Exploring vertex 1 Returning to vertex 0 Exploring vertex 2 Exploring vertex 6 Returning to vertex 2 Returning to vertex 0 Returning to vertex 4 Returning to vertex 3 Returning to vertex 5 Search path found: 5 -> 3 -> 4 -> 0 -> 2 -> 6
Here’s an intermediate state of the algorithm — can you figure out the next steps?
Based on the output, we can see that the search started from vertex 5 and that the DFS()
has found the entity vertex 6. The entire search path is also displayed, however, we should note that the search path is not the shortest one:
5 -> 0 -> 2 -> 6
If we run a search for a non-existing entity, the algorithm will traverse the whole graph and form a traversal tree, showing the order in which the vertices were visited.
# Starts the search. result = DFS(g, vertices[5], visited, 66) … Exploring vertex 5 Exploring vertex 3 Exploring vertex 4 Exploring vertex 0 Exploring vertex 1 Returning to vertex 0 Exploring vertex 2 Exploring vertex 6 Returning to vertex 2 Returning to vertex 0 Returning to vertex 4 Returning to vertex 3 Returning to vertex 5
The entity is not found. Here’s the final state visually:
Efficiency Analysis
Theoretically speaking, the depth-first search algorithm’s time complexity is O(|V| + |E|), where V represents the number of vertices, and E represents the number of edges.
However, the practical time and space complexities depend on a specific implementation, guided by its domain of application. The algorithm will process each vertex once and each edge twice, requiring a constant amount of time in processing an edge.
The algorithm is more space-efficient than some other algorithms, such as the breadth-first search algorithm, because it keeps track only of its current path by relying on the vertex’s neighboring edges. However, it uses recursion and is inherently limited by the maximum depth of the call stack. This property gets very pronounced as the traversal progresses through a very deep graph.
The speed of the algorithm is largely determined by the graph depth and order of the neighboring edges.
Conclusion
In this article, we learned about the depth-first search algorithm.
- First, we explained what a depth-first search algorithm is.
- Second, we got took a look at what are its common purposes and applications.
- Third, we went through an explanation of how the algorithm works.
- Fourth, we examined the algorithm’s main properties.
- Fifth, we went through the implementation of the algorithm, which is based on the Graph abstract data structure (for class implementation, see the blog on the breadth-first search algorithm). We also tested the algorithm by calling its main function, DFS(), and analyzed its steps of execution.
- Sixth, we analyzed the algorithm efficiency and compared it to another domain-representative algorithm.
In the end, we concluded that regardless of its efficiency, if the solution exists, the depth-first search algorithm may not always find it, or may take a practically infinite time before actually reaching the solution. However, we also determined that certain steps can be made to improve the algorithm’s efficiency and applicability, such as limiting the depth or the total number of traversed vertices.