Graph Algorithms — BFS, Dijkstra, PageRank, and Network Analysis

Graph Algorithms: Networks, Routes, and Connections

The Algorithms That Map the World, Rank the Web, and Connect Everyone to Everyone

Google Maps finds the fastest route from your house to the airport in under a second, considering every road, traffic light, construction zone, and real-time traffic jam in the entire city. LinkedIn shows you that your ex-colleague is connected to the CEO of your dream company through exactly 3 people. Netflix knows that people who watched Breaking Bad and The Wire also watched Ozark -- and recommends it to you. These aren't three different problems. They're all graph problems. And many of the algorithms that solve them date back to the 1950s.

A graph is a deceptively simple concept: things (called nodes or vertices) connected by relationships (called edges). But that simplicity is what makes graphs so powerful -- they can model almost anything. Social networks, road maps, flight routes, web links, molecular structures, supply chains, electrical grids, neural networks, and recommendation systems all reduce to graphs. The algorithms we'll cover in this topic are the tools for extracting meaning from those connections: finding the shortest path, exploring all reachable nodes, and determining which nodes are most important.

These aren't abstract exercises. Google's $2 trillion valuation traces directly to one graph algorithm. Every GPS device runs another. And the concept of "six degrees of separation" -- the idea that any two people on Earth are connected through at most six intermediaries -- is a graph property that BFS can verify. Let's build up from the basics.

By the numbers:

6 -- degrees of separation: the maximum hops between any two people in most social networks
O(V + E) -- time complexity of BFS and DFS: proportional to nodes plus edges, not nodes squared
1998 -- the year Larry Page and Sergey Brin published the PageRank algorithm, the foundation of Google
200+ -- ranking factors in modern Google search, but PageRank's graph logic remains the conceptual core

Graphs Are Everywhere You Look

Before we dive into algorithms, let's internalize how many things in the world are naturally graphs. Once you see it, you can't unsee it.

A social network is a graph. Every person is a node. Every friendship, follow, or connection is an edge. Facebook's social graph has over 3 billion nodes and hundreds of billions of edges. When LinkedIn shows "People You May Know," it's running a graph algorithm that finds nodes close to you in the network that you haven't connected with yet.

A road map is a graph. Every intersection is a node. Every road segment is an edge. The weight of each edge is the travel time or distance. Google Maps, Apple Maps, and Waze all store the entire planet's road network as a weighted graph and run shortest-path algorithms millions of times per day.

The internet is a graph. Every web page is a node. Every hyperlink is a directed edge (page A links to page B, but B doesn't necessarily link back). The World Wide Web contains tens of billions of nodes, and Google has crawled and indexed a significant fraction of them. PageRank, the algorithm that made Google dominant, works by analyzing the structure of this graph.

Airline routes form a graph. Airports are nodes. Flights are directed, weighted edges (direction: origin to destination; weight: ticket price, flight time, or distance). When you search Kayak for the cheapest route from New York to Tokyo with one layover, the system is running a shortest-path algorithm on this flight graph.

Even molecular structures are graphs. Atoms are nodes. Chemical bonds are edges. Drug discovery algorithms analyze molecular graphs to predict how new compounds will interact with biological systems. The same graph theory that routes your GPS also helps design medicines.

Key Insight

If your data involves connections between things -- and you care about those connections -- you have a graph problem. The power of graph algorithms is that the same techniques work regardless of what the nodes and edges represent. The algorithm that finds the shortest road route is the same algorithm that finds the cheapest flight path, the most efficient network route, or the closest friend-of-friend in a social network. One set of algorithms, infinite applications.

Graph Vocabulary: Directed, Undirected, and Weighted

Graphs come in three main flavors (and a single graph can be both directed and weighted), and the distinction matters for choosing the right algorithm.

Undirected graphs have edges that go both ways. If Alice is friends with Bob, then Bob is friends with Alice. Facebook friendships, road maps (most roads are two-way), and chemical bonds are undirected. One edge represents a mutual relationship.

Directed graphs (digraphs) have edges with a direction. If Alice follows Bob on Twitter, Bob doesn't necessarily follow Alice back. Web links, flight routes (a flight from NYC to LAX is different from LAX to NYC), and river systems are directed. Each edge has a source and a destination.

Weighted graphs assign a number to each edge representing some cost, distance, or capacity. The road from A to B might take 15 minutes while the road from A to C takes 5 minutes. Without weights, we can find the path with the fewest edges. With weights, we can find the path with the lowest total cost -- which is almost always what we actually want.
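The three flavors map to slightly different adjacency-list shapes in code. A minimal sketch in Python (all names here are hypothetical, chosen to mirror the examples above):

```python
# Undirected: store each friendship in both directions -- one mutual edge
friends = {"alice": ["bob"], "bob": ["alice"]}

# Directed: an edge is one-way -- alice follows bob, but not vice versa
follows = {"alice": ["bob"], "bob": []}

# Weighted: pair each neighbor with a cost (here, minutes of travel)
travel_time = {"A": [("B", 15), ("C", 5)], "B": [], "C": []}
```

The adjacency-list form used here is the usual choice for sparse graphs; an adjacency matrix trades memory for constant-time edge lookups on dense ones.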

BFS: Breadth-First Search -- Exploring Layer by Layer

Breadth-first search explores a graph by visiting all neighbors first, then all neighbors' neighbors, then all their neighbors, expanding outward in concentric layers. Think of dropping a stone in a pond -- the ripples spread outward, reaching nearby nodes before distant ones.

BFS uses a queue (first in, first out) to track which nodes to visit next. Start at a source node, add it to the queue, and repeat: take the next node from the queue, visit all its unvisited neighbors, and add each one to the queue. This guarantees that you visit nodes in order of their distance (number of edges) from the source.
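The queue-based procedure just described can be sketched in a few lines of Python. The graph and names below are a made-up friendship network for illustration:

```python
from collections import deque

def bfs_distances(graph, source):
    """Return the hop count (shortest unweighted distance) from source
    to every reachable node. graph is an adjacency-list dict."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()          # FIFO: oldest discovery first
        for neighbor in graph[node]:
            if neighbor not in dist:    # first visit = shortest path
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

# Tiny undirected "friendship" graph (hypothetical names)
graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}
print(bfs_distances(graph, "alice"))  # erin is 3 hops (degrees) from alice
```

The distance table this returns is exactly the "degrees of separation" measurement: each value is the layer number at which BFS first reached that node.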

[Figure: BFS layer-by-layer exploration from node A, using a queue. Visit order: A, B, C, D, E, F, G, H. BFS guarantees: the first time you reach a node, you found the shortest path (fewest edges).]
BFS explores the graph in concentric layers from the source node A. Layer 0 is A itself. Layer 1 includes A's direct neighbors (B, C, D). Layer 2 includes their unvisited neighbors (E, F, G). Layer 3 reaches H. Because BFS visits nodes in order of distance, the first time it reaches any node, that path is guaranteed to be the shortest.

The critical property of BFS: it finds the shortest path in an unweighted graph (fewest edges). Because it explores layer by layer, the first time it reaches any node, it has found the path with the minimum number of hops. This is exactly what you want for "degrees of separation" calculations. LinkedIn's "how you're connected" feature is BFS: start at your profile, expand to your connections (layer 1), then their connections (layer 2), until you reach the target person. The layer number is the degree of separation.

BFS runs in O(V + E) time, where V is the number of nodes (vertices) and E is the number of edges. It visits every node once and examines every edge once. For a social network with 1 million users and 50 million friendships, BFS processes the entire graph in 51 million operations -- completely manageable on modern hardware.

DFS: Depth-First Search -- Going Deep Before Going Wide

Depth-first search takes the opposite approach. Instead of exploring all neighbors before moving deeper, DFS follows one path as far as it goes, then backtracks when it hits a dead end and tries the next path. Think of exploring a maze by always turning left until you hit a wall, then backing up and trying the next option.

DFS uses a stack (last in, first out) instead of a queue. Start at a source node, push it onto the stack, and repeat: pop the top node, visit it, push all its unvisited neighbors. The stack naturally ensures that the most recently discovered path is followed first.

DFS doesn't find shortest paths -- it finds a path, not necessarily the shortest one. But it excels at other tasks. Cycle detection: DFS can determine whether a graph contains cycles (useful for detecting deadlocks in operating systems or circular dependencies in build systems). Topological sorting: given tasks with dependencies (task B requires task A to finish first), DFS can produce a valid execution order. This is how build systems like Make, package managers like npm, and course prerequisite planners work.

BFS (Breadth-First Search)

Exploration pattern: Layer by layer, like ripples in a pond.

Data structure: Queue (FIFO).

Finds shortest path: Yes, in unweighted graphs.

Memory usage: Stores all nodes at the current layer -- can be large for wide graphs.

Best for: Shortest path, "degrees of separation," level-order traversal, finding nearest targets.

DFS (Depth-First Search)

Exploration pattern: Deep along one path, then backtrack. Like exploring a maze.

Data structure: Stack (LIFO) or recursion.

Finds shortest path: No. Finds a path, not the shortest.

Memory usage: Stores nodes along the current path -- efficient for deep, narrow graphs.

Best for: Cycle detection, topological sorting, maze solving, connected component analysis.

Dijkstra's Algorithm: Finding the Shortest Weighted Path

BFS finds the shortest path when all edges have equal weight (or no weight at all). But in the real world, edges have different costs. The road from A to B might take 5 minutes while the road from A to C takes 20 minutes. A path with more edges might still be faster if those edges have smaller weights. This is where Dijkstra's algorithm comes in -- the algorithm behind every GPS navigation system on Earth.

Dijkstra's algorithm, conceived by Edsger Dijkstra in 1956 and published in 1959, uses a greedy strategy: always expand the unvisited node with the smallest known distance from the source. It maintains a running table of the shortest known distance to every node, updating these distances as it discovers shorter paths through newly explored nodes.

Walk through Dijkstra's algorithm on a small road network, finding the shortest path from node S (start) to node T (target):

[Figure: Dijkstra's algorithm finding the shortest path from S to T on a small weighted graph.]

Step 1: Start at S (dist = 0). Neighbors: A (4), B (2). B is closer.
Step 2: Visit B (dist = 2). Update: A via B = 2 + 1 = 3 (better than 4!), D via B = 2 + 3 = 5.
Step 3: Visit A (dist = 3). Update: C via A = 3 + 3 = 6, D via A = 3 + 5 = 8 (worse than 5).
Step 4: Visit D (dist = 5). Update: T via D = 5 + 3 = 8.
Step 5: Visit C (dist = 6). Update: T via C = 6 + 4 = 10 (worse than 8).
Step 6: Visit T (dist = 8). Done!

Shortest path: S -> B -> D -> T (total distance: 8).
Dijkstra's algorithm greedily expands the closest unvisited node at each step. The key insight in Step 2: the path S->B->A (distance 3) is shorter than the direct path S->A (distance 4). Dijkstra's algorithm discovers this by always expanding the node with the smallest known distance, ensuring it never misses a shorter detour. The gold-highlighted edges show the final shortest path: S to B to D to T, total distance 8.

Notice what happened in Step 2: Dijkstra's algorithm discovered that the path through B to A (distance 3) was shorter than going directly from S to A (distance 4). This is why the algorithm works -- by always expanding the cheapest unvisited node, it naturally discovers shortcuts. The greedy choice at each step leads to the globally optimal solution.
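The walkthrough above translates almost directly into code using a priority queue. This sketch encodes the same road network and reproduces the same answer:

```python
import heapq

def dijkstra(graph, source):
    """Shortest weighted distances from source. graph maps each node to
    a list of (neighbor, weight) pairs; weights must be non-negative."""
    dist = {source: 0}
    heap = [(0, source)]                # priority queue keyed by distance
    while heap:
        d, node = heapq.heappop(heap)   # greedy: cheapest unvisited first
        if d > dist.get(node, float("inf")):
            continue                    # stale entry; a shorter path won
        for neighbor, weight in graph[node]:
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d  # found a shorter detour
                heapq.heappush(heap, (new_d, neighbor))
    return dist

# The road network from the walkthrough above
roads = {
    "S": [("A", 4), ("B", 2)],
    "A": [("C", 3), ("D", 5)],
    "B": [("A", 1), ("D", 3)],
    "C": [("T", 4)],
    "D": [("T", 3)],
    "T": [],
}
print(dijkstra(roads, "S")["T"])  # 8 -- via S -> B -> D -> T
```

The "stale entry" check is the standard trick for using Python's heapq here: rather than decreasing a key in place, the code pushes a duplicate entry and skips any popped entry that has already been beaten.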

The time complexity of Dijkstra's algorithm with a priority queue (binary heap) is O((V + E) log V). For a road network with millions of intersections and roads, this completes in fractions of a second. Real GPS systems use optimized variants (like A* search, which adds a heuristic to guide the search toward the destination) that are even faster in practice.

Real-World Example

Google Maps processes over 1 billion kilometers of directions daily. The road network for the US alone has approximately 60 million nodes (intersections) and 150 million edges (road segments). Dijkstra's algorithm with a priority queue finds the optimal route through this graph in about 100 milliseconds. Variants like Contraction Hierarchies (which precompute shortcuts between distant nodes) can find continent-spanning routes in under 1 millisecond. The edge weights aren't static -- they update in real time based on GPS data from millions of Android phones, reflecting current traffic conditions. The algorithm runs the same way; only the weights change.

PageRank: The Algorithm Worth $2 Trillion

In 1998, two Stanford PhD students named Larry Page and Sergey Brin published a paper describing an algorithm to rank web pages by importance. Their insight: a web page is important if important pages link to it. This circular definition is actually well-defined mathematically, and the algorithm that computes it is called PageRank. It became the foundation of Google.

The idea is elegantly simple. Imagine a "random surfer" clicking links on the web. They start on a random page, follow a random link on that page, follow a random link on the next page, and so on. After surfing for a very long time, the fraction of time spent on each page converges to a stable distribution. Pages that the surfer visits frequently -- because many other pages link to them, or because important pages link to them -- end up with a high PageRank score.

[Figure: PageRank scores in a five-page web: Page A (0.38), Page B (0.18), Page C (0.18), D (0.12), E (0.14). Node size reflects importance; Page A ranks highest because all other pages link to it.]
PageRank models the web as a directed graph. Each page distributes its rank equally among its outgoing links. Page A has the highest rank (0.38) because it receives links from all four other pages -- including B and C, which themselves have moderate rank. The key insight: a link from an important page is worth more than a link from an obscure one. This recursive definition is what made Google's search results dramatically better than its competitors in the late 1990s.

Mathematically, PageRank is computed iteratively. Start by assigning every page equal rank (1/N, where N is the number of pages). Then repeatedly update each page's rank based on the ranks of pages that link to it, divided by how many outgoing links those pages have. After enough iterations, the ranks converge to stable values. The random surfer also has a "damping factor" (typically 0.85) -- an 85% chance of following a link and a 15% chance of jumping to a random page, which prevents the algorithm from getting stuck in loops.
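That iterative computation fits in about twenty lines. The sketch below implements the update rule just described, including the damping factor; the five-page web is a hypothetical example loosely matching the figure:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank. links maps each page to the pages it links to.
    Pages with no outgoing links ("dangling") spread rank to all pages."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}      # start with equal rank 1/N
    for _ in range(iterations):
        # 15% chance of jumping to a random page, spread over all pages
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:                    # split rank among out-links
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:                           # dangling page: rank goes everywhere
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical five-page web: every page links (directly) to A
web = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": ["A", "E"], "E": ["A"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # A -- the most linked-to page wins
```

Because each page redistributes all of its rank every round, the scores always sum to 1 and can be read as the fraction of time the random surfer spends on each page.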

PageRank was revolutionary because it treated the web's link structure as a vote of confidence. A link from a high-PageRank page (like a major news site) carried more weight than a link from a random blog. This produced dramatically better search rankings than the keyword-stuffing approaches of 1990s search engines. Google's modern algorithm uses over 200 ranking factors, but the graph-based logic of PageRank remains its conceptual foundation.

Network Flow and Matching: Optimizing Capacity

Beyond path-finding and ranking, graph algorithms solve a class of problems about maximizing flow through networks with limited capacity. Imagine a highway system where each road has a maximum throughput (cars per hour). Given a source (suburb) and destination (city center), what's the maximum number of cars per hour that can travel from source to destination simultaneously, respecting every road's capacity?

This is the maximum flow problem, and it has surprisingly diverse applications. Airport gate assignment: planes are nodes, compatible gates are edges, and the algorithm finds the maximum number of planes that can be assigned to gates simultaneously. Server load balancing: clients are sources, servers are destinations, and edge capacities represent how many connections each server can handle. Kidney donor matching: patients and donors form a bipartite graph, and the algorithm maximizes the number of compatible transplant pairs.

The most famous application is network routing. Internet traffic between two cities must flow through routers and cables, each with bandwidth limits. The maximum flow algorithm determines the theoretical maximum throughput between any two points in the network -- and identifies bottleneck links where adding capacity would increase overall throughput the most.
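One classic way to compute maximum flow is the Edmonds-Karp method: repeatedly use BFS to find a path from source to sink with spare capacity, push as much flow as the path's bottleneck allows, and stop when no such path remains. A sketch, with a tiny hypothetical road network as the example:

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: BFS for an augmenting path, push bottleneck flow,
    repeat. capacity is a dict of dicts: capacity[u][v] = edge limit."""
    # Residual graph: remaining forward capacity plus zero-capacity
    # reverse edges (raised when flow is pushed, allowing "undo").
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs_path():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return None                      # no augmenting path: done

    flow = 0
    while (parent := bfs_path()) is not None:
        # Find the bottleneck capacity along the path...
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # ...then push that much flow along it.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
    return flow

# Hypothetical road capacities, in cars per hour
caps = {"s": {"a": 10, "b": 5}, "a": {"t": 7}, "b": {"t": 9}, "t": {}}
print(max_flow(caps, "s", "t"))  # 12 = min(10, 7) + min(5, 9)
```

The reverse edges are the subtle part: they let a later augmenting path cancel flow pushed by an earlier one, which is what makes the greedy-looking loop converge to the true maximum.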

Key Insight

The Max-Flow Min-Cut theorem, one of the most beautiful results in graph theory, states that the maximum flow through a network equals the minimum cut -- the smallest total capacity of edges that, if removed, would completely disconnect the source from the destination. This means finding the bottleneck and finding the maximum throughput are the same problem. Network engineers use this to identify exactly which links to upgrade to increase overall capacity.

Graph Algorithms at Scale: How Companies Handle Billions of Nodes

The algorithms we've covered work well for graphs that fit in one computer's memory. But Facebook's social graph has 3 billion nodes and hundreds of billions of edges. Google's web graph is even larger. At this scale, specialized systems are required.

Google's Pregel (and its open-source successor, Apache Giraph) pioneered "think like a vertex" distributed computing. Each node in the graph runs a small program: it receives messages from its neighbors, updates its state, and sends messages to its neighbors. The system distributes nodes across thousands of servers, each processing its portion of the graph in parallel. PageRank is a natural fit for this model -- each page receives rank contributions from pages linking to it, updates its rank, and sends its new rank to pages it links to. After enough rounds, all ranks converge.

Neo4j, a graph database, stores graph data natively with each node physically linked to its neighbors on disk. This makes traversal operations (BFS, DFS, shortest path) dramatically faster than running them on a relational database, where joins between tables are required to follow edges. Social networks, recommendation engines, and fraud detection systems increasingly use graph databases because the query patterns naturally match graph traversal.

Facebook social graph: 3B+ nodes, 200B+ edges
Google web graph (indexed): tens of billions of pages
LinkedIn connections: 1B+ members, billions of edges
Google Maps road network (US): 60M intersections, 150M road segments

Answers to Questions Students Actually Ask

When should I use BFS vs. DFS? Use BFS when you need the shortest path or when you're exploring outward from a starting point (nearest neighbor search, level-by-level traversal, "degrees of separation"). Use DFS when you need to explore all possibilities (maze solving, cycle detection, topological sorting) or when the graph is very deep and narrow. If the graph is a tree, BFS gives you level-order traversal (left to right, top to bottom) while DFS gives you pre-order, in-order, or post-order traversal depending on when you process each node.

Why can't Dijkstra's algorithm handle negative edge weights? Dijkstra's greedy strategy assumes that once a node is visited with its shortest distance, no shorter path exists. Negative weights break this assumption -- you might find a shorter path later by taking a detour through a negative-weight edge. For graphs with negative weights, use the Bellman-Ford algorithm, which is slower (O(V*E)) but handles negative edges correctly. If the graph has negative-weight cycles (a loop whose total weight is negative), no algorithm can find a "shortest" path because you can go around the cycle infinitely, making the distance arbitrarily negative.

Is PageRank still used by Google? The original PageRank algorithm is just one of over 200 signals Google uses to rank pages. Modern Google search incorporates machine learning (RankBrain, BERT, MUM), user behavior signals, content quality analysis, and many other factors. But the fundamental insight -- that link structure reveals importance -- remains embedded in Google's ranking system. The concept has also spread far beyond web search: citation analysis in academic papers, influence measurement in social networks, and even ranking proteins in biological networks all use PageRank variants.

What's the difference between a graph and a tree? A tree is a special type of graph: it's connected (every node is reachable from every other), has no cycles (no loops), and has exactly N-1 edges for N nodes. Every tree is a graph, but not every graph is a tree. Graphs can have cycles, disconnected components, and any number of edges. Your file system is a tree. A social network is a graph. A road map is a graph. If you can get from any node back to itself by following edges, you have a graph with cycles, not a tree.
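The tree definition in that answer is directly checkable: count the edges, then confirm connectivity with a traversal. A small sketch (example graphs are hypothetical):

```python
from collections import deque

def is_tree(nodes, edges):
    """Check the tree definition: exactly N - 1 undirected edges,
    and every node reachable from the first one (connected).
    N - 1 edges plus connectivity together rule out cycles."""
    if len(edges) != len(nodes) - 1:
        return False                    # wrong edge count: not a tree
    adj = {v: [] for v in nodes}
    for u, v in edges:                  # undirected: record both directions
        adj[u].append(v)
        adj[v].append(u)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:                        # plain BFS reachability check
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

print(is_tree(["a", "b", "c"], [("a", "b"), ("b", "c")]))              # True
print(is_tree(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")]))  # False: cycle
```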

How hard is it to implement these algorithms? BFS and DFS are remarkably short -- about 15-20 lines of code in Python. Dijkstra's algorithm with a priority queue is about 25-30 lines. PageRank's iterative version is about 20 lines. The algorithms themselves are not complex to code. The challenge is representing the graph efficiently (adjacency list vs. adjacency matrix, depending on graph density) and handling edge cases (disconnected graphs, self-loops, parallel edges). If you can implement BFS, you have the foundation for most other graph algorithms.

Where Graph Algorithms Take You Next

Graph algorithms are prerequisites for some of the most impactful areas of computer science. Machine learning on graphs -- graph neural networks -- is one of the fastest-growing areas of AI, used for drug discovery, social network analysis, and recommendation systems. Network security uses graph analysis to detect anomalous communication patterns that indicate intrusions. Compiler optimization represents program control flow as a graph and uses graph algorithms to optimize code. Supply chain optimization models factories, warehouses, and transportation routes as a weighted graph and uses network flow algorithms to minimize cost.

The deeper insight is that graphs are the most general data structure -- arrays, linked lists, and trees are all special cases of graphs. Any problem involving entities and relationships between them is a graph problem, even if it doesn't look like one at first. Learning to see the graph hidden inside a real-world problem is one of the most valuable skills in computer science. Once you can model a problem as a graph, you have access to 70 years of brilliantly optimized algorithms ready to solve it.

The takeaway: Graph algorithms are the tools that map the world, rank the web, and connect everyone to everyone. BFS finds the shortest unweighted path by exploring layer by layer. DFS dives deep for cycle detection and topological sorting. Dijkstra's algorithm finds the cheapest weighted path -- powering every GPS device on Earth. PageRank turned the web's link structure into a measure of importance and built a $2 trillion company. These algorithms, most invented in the 1950s and 1960s, run behind the navigation apps, social networks, recommendation engines, and search engines you use every day. Understanding them means understanding the connective tissue of modern technology.