AI search
Problem-solving agents meet different environments
⌛  ~1.5 h 🤓  Intermediate
19.09.2024
This post is part of the AI theory educational series from my free course. Please keep in mind that the correct sequence of posts is outlined on the course page; the order they appear in under Research can be arbitrary.

I'm also happy to announce that I've started working on standalone paid courses, so you can support my work and get affordable educational material. These courses will be of a completely different quality, with more theoretical depth and a niche focus, and will feature challenging projects, quizzes, exercises, video lectures, and supplementary material. Stay tuned!


Artificial intelligence search — often referred to simply as "search" — plays a crucial role in the broader field of AI, serving as a core framework for problem-solving agents that need to find solutions systematically in well-defined or sometimes partially defined spaces. The concept of "search" in AI revolves around exploring possible sequences of actions and outcomes, attempting to arrive at a goal that fulfills certain criteria or solves a particular challenge. At its foundation, AI search is about generating and evaluating states — or sets of possible configurations — using a structured process that can be as naive as exploring every possibility or as sophisticated as leveraging heuristics, constraints, or other domain-specific knowledge.

I find AI search especially motivating because of its wide applicability. From routing problems (like shortest paths in transportation networks) and scheduling tasks (like optimizing resource allocation) to game-playing agents and complex constraint satisfaction puzzles, search mechanisms help cut down potentially astronomically large solution spaces to something manageable. In daily practice, search strategies empower advanced software and intelligent systems, allowing them to identify optimal — or near-optimal — solutions in the face of complexity and uncertainty.

This article is intended for readers who have some background in machine learning and data science but wish to dive deeper into the theoretical and practical details of AI-based search strategies. I will walk through fundamental ideas, from the most basic uninformed search techniques to heuristically guided approaches, local search, constraint satisfaction, and adversarial search. I will also highlight relevant insights from research (e.g., from major conferences like NeurIPS or journals like JMLR) that have shaped the modern understanding of AI search and its associated methods. By the end, you should feel confident about where each search strategy fits and how to design or choose the right one for your specific problem context.

Below is the overall structure of this article:

  1. Introduction
    — Motivation for AI search
    — Definition of search in AI
    — Problem-solving agents and their role in search
    — Overview of key search paradigms and techniques

  2. Fundamentals of problem-solving by searching
    — Defining the search problem (states, actions, goals)
    — State space representation
    — Search trees vs. search graphs
    — Example problems
    — Algorithmic problem-solving basics

  3. Uninformed (blind) search strategies
    — Characteristics and limitations of uninformed search
    — Breadth-first search (BFS)
    — Depth-first search (DFS)
    — Uniform-cost search (Dijkstra's algorithm)
    — Comparison of time and space complexities

  4. Informed (heuristic) search strategies
    — Heuristic functions and their role in search
    — Heuristic evaluations (admissibility and consistency)
    — Greedy best-first search
    — A* search algorithm
    — Analysis of performance and optimality

  5. Local search and optimization problems
    — Motivation for local search (large or infinite state spaces)
    — Hill-climbing and variants
    — Simulated annealing
    — Genetic algorithms and other evolutionary methods
    — Local search in continuous spaces

  6. Search in complex and uncertain environments
    — Dealing with nondeterministic actions
    — Search in partially observable environments
    — Online search agents for unknown environments

  7. Constraint satisfaction problems (CSPs)
    — Defining CSPs (variables, domains, constraints)
    — Constraint propagation and inference in CSPs
    — Backtracking search for CSPs
    — Local search for CSPs
    — The structure of problems (constraint graphs, problem decomposition)

  8. Adversarial search and games
    — Motivation for adversarial search (two-player games, competitive environments)
    — Optimal decisions in games (minimax)
    — Heuristic alpha-beta tree search and pruning
    — Monte Carlo tree search
    — Stochastic games and partially observable games
    — Limitations of game search algorithms

  9. Classical search and its extensions
    — Review of classic search algorithms
    — When to use informed vs. uninformed search
    — Iterative deepening search strategies
    — Bidirectional search

  10. Advanced topics
    — Algorithmic problem solving at scale
    — Enhancing heuristic functions (pattern databases, learning heuristics)
    — Incorporating domain-specific knowledge
    — Memory-bounded search (IDA*, SMA*)
    — Dealing with real-time constraints

In what follows, I will expand extensively on each section, aiming to provide a robust understanding of AI search for experienced data scientists and ML engineers.

fundamentals of problem-solving by searching

defining the search problem (states, actions, goals)

At the heart of any AI search method is the formal definition of a problem as a set of states, a set of actions that cause state transitions, and one or more goal conditions. A "state" is an abstract representation of the configuration of the system at a given point. For example, in a sliding tile puzzle, a state might be an arrangement of the tiles on the board; in a pathfinding scenario, a state might be a particular location on a map along with certain resources or constraints.

An "action" is a choice that transforms one state into another. Actions can have preconditions (requirements that must be met before the action is applicable), and they may have effects (how the state changes once the action is applied). A "goal" condition is a set of constraints or properties that define when the problem is solved. For instance, the goal in the sliding tile puzzle might be to arrange the tiles in ascending numeric order.

The interplay of states, actions, and goals frames the search space. Agents systematically explore this space to find a path from an initial state to a goal state.

Formally, a search problem can be defined as a tuple $(S, A, c, s_0, G)$ where:

  • $S$ is the set of possible states.
  • $A$ is the set of actions or operators that transition between states.
  • $c(s, a, s')$ defines the cost of applying action $a$ in state $s$ that leads to state $s'$.
  • $s_0$ is the initial state.
  • $G \subseteq S$ is the set of goal states.
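To make the tuple concrete, here is a minimal sketch of how such a problem definition might look in Python. The `SearchProblem` container and its field names are illustrative assumptions for this article, not a standard API:

from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

@dataclass
class SearchProblem:
    # s_0: the initial state
    initial_state: Hashable
    # A(s): actions applicable in state s
    actions: Callable[[Hashable], Iterable[Hashable]]
    # result(s, a): the state reached by applying action a in state s
    result: Callable[[Hashable, Hashable], Hashable]
    # c(s, a, s'): the cost of that transition (defaults to unit cost)
    cost: Callable[[Hashable, Hashable, Hashable], float] = lambda s, a, s2: 1.0
    # goal_test(s): whether s belongs to the goal set G
    goal_test: Callable[[Hashable], bool] = lambda s: False

Any of the concrete algorithms below can be phrased in terms of such a container, though for brevity the later code snippets simply take the relevant callbacks as separate arguments.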

state space representation

The "state space" is the conceptual or literal graph whose nodes correspond to states, and edges correspond to the application of actions. Often, the major challenge lies in how to encode states in a way that is both expressive enough to capture the problem constraints, yet lean enough to be computationally feasible. If the state representation is too large or complex, the search problem can become intractable due to exponential growth in the number of possible states.

For instance, in a scheduling problem, each state can represent the assignment of tasks to resources. However, enumerating all possible assignments might be infeasible if the number of tasks and resources is large. Hence, effective representation and subsequent search methods are essential for real-world applicability.

search trees vs. search graphs

When an AI agent starts from an initial state $s_0$ and branches out by applying actions, we often visualize a "search tree". Each node in the tree represents a state, and each edge corresponds to an action. A search tree is an excellent conceptual tool because it underscores the branching factor — i.e., the average number of successors for each state — and it helps illustrate naive search expansions. However, some states can be reached by multiple paths, which would be represented multiple times in a pure tree.

A "search graph" is a more compact representation that recognizes repeated states, effectively merging paths that arrive at the same state. While search trees are simpler to illustrate conceptually, many practical algorithms internally keep track of visited states to avoid redundant computations.

example problems

Search as a problem-solving framework spans a wide array of applications:

  • Pathfinding in grids and graphs: Classic route-finding, navigation, or robotics tasks.
  • Puzzle solving: Sliding tile puzzles, Rubik's Cube solutions, or Sudoku.
  • Scheduling and planning: Coordinating tasks under deadlines and resource constraints.
  • Network optimization: Optimizing flow across a network subject to capacity constraints (sometimes solved by specialized algorithms but grounded in the same conceptual framing of states and transitions).

Historically, the well-known "8-puzzle" or "15-puzzle" has served as a popular test for search algorithms, illustrating the complexity that can emerge even from seemingly small or simple problem definitions.

algorithmic problem-solving basics

A systematic way to solve search problems typically involves the following steps:

  1. Formulate the problem precisely (i.e., define states, actions, costs, and goal conditions).
  2. Define or select an algorithmic strategy for exploring the state space.
  3. Execute the search procedure (which might be BFS, DFS, A*, or a more specialized strategy).
  4. Return the path or sequence of actions once a goal state is reached (if one exists).

One must also be prepared to address practical concerns such as memory usage, computational complexity, suboptimal solutions, or real-time constraints that limit search depth or expansions. In the sections that follow, I'll cover various strategies that differ mainly by how they expand states and in what order, whether or not they use heuristics, and how they cope with large or partially known search spaces.

uninformed (blind) search strategies

Uninformed search — also known as "blind search" — refers to a category of algorithms that do not exploit additional knowledge about the domain; they do not know how "good" or "close" a state is to the goal. Instead, they blindly explore the state space in a systematic manner. While uninformed methods guarantee certain completeness properties (given enough time and memory), they can be dramatically inefficient. Situations with large state spaces or complicated constraints might render uninformed approaches unusable in practice. Nonetheless, they form the foundation upon which more sophisticated approaches are built, and they are crucial for understanding key concepts like branching factor, depth, and memory usage.

breadth-first search (BFS)

Breadth-first search systematically explores all states at a given depth before moving to the next depth level. This method ensures that if a solution exists at the shallowest level in the search tree, BFS will find it first.

Algorithm (informal outline):

  1. Begin at the initial state $s_0$.
  2. Place $s_0$ in a queue (FIFO structure).
  3. While the queue is not empty:
    • Dequeue a state $s$.
    • If $s$ is a goal state, return the solution.
    • Otherwise, enqueue all unvisited successors of $s$.

Here is a minimal Python example to illustrate BFS:


from collections import deque

def bfs(start, goal_test, get_successors):
    """
    start: initial state
    goal_test: function(state) -> bool
    get_successors: function(state) -> list of successor states
    """
    visited = {start}          # mark states when they are first enqueued
    queue = deque([start])

    # parent dict to reconstruct the path once a goal is found
    parent = {start: None}

    while queue:
        current = queue.popleft()
        if goal_test(current):
            # Reconstruct path by walking parent links back to the start
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            path.reverse()
            return path

        for succ in get_successors(current):
            if succ not in visited:
                visited.add(succ)
                parent[succ] = current
                queue.append(succ)

    return None  # no solution found

BFS is complete if the branching factor is finite. The time and space complexity can be expressed in terms of the branching factor $b$ and the depth of the shallowest goal $d$. In the worst case, BFS visits on the order of $O(b^d)$ nodes, which can be prohibitive. Nonetheless, BFS remains popular in scenarios where the state space is small or when we prioritize finding the shallowest solution.
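As a quick usage sketch, here is the bfs function above applied to a tiny undirected graph given as an adjacency dictionary; the graph itself is made up purely for illustration:

# A toy undirected graph (purely illustrative)
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D'],
    'D': ['B', 'C', 'E'],
    'E': ['D'],
}

path = bfs(start='A',
           goal_test=lambda s: s == 'E',
           get_successors=lambda s: graph[s])
print(path)  # ['A', 'B', 'D', 'E']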

depth-first search (DFS)

Depth-first search, in contrast, explores one branch of the tree deeply before backtracking. This approach is easy to implement using recursion or a stack, and it uses far less memory than BFS. However, DFS can get stuck exploring an extremely deep (or infinite) branch, making it incomplete for some unbounded state spaces.

Algorithm (informal outline):

  1. Start at $s_0$.
  2. Follow one child from $s_0$ to the greatest depth until you reach a goal or can go no further.
  3. Backtrack to the first sibling node and continue.

A simple Python DFS (recursive version) might look like this:


def dfs(start, goal_test, get_successors, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    
    if goal_test(start):
        return [start]
    
    for succ in get_successors(start):
        if succ not in visited:
            path = dfs(succ, goal_test, get_successors, visited)
            if path is not None:
                return [start] + path
    
    return None

DFS has a worst-case time complexity of $O(b^m)$, where $m$ is the maximum depth of the search tree. If the search space is deep or unbounded, this can be extremely large. However, DFS typically has more manageable memory requirements (proportional to $O(bm)$ in the worst case).

uniform-cost search (dijkstra's algorithm)

Uniform-cost search expands the node with the lowest path cost $g(n)$ from the initial state to $n$. It is in fact Dijkstra's algorithm applied to search graphs, commonly used to find the shortest path in a weighted graph. Instead of searching by shallowest depth, it searches by the least cumulative cost so far.

Key idea: Keep a priority queue (often implemented as a min-heap) that orders states by their cumulative path cost. The algorithm always expands the node that has the smallest $g(n)$. Thus, if costs are non-negative, uniform-cost search is guaranteed to find the optimal solution.

Uniform-cost search often outperforms BFS in cases where the step costs differ because BFS implicitly assumes uniform step cost. However, uniform-cost search can degrade in performance if there are many small-cost edges near the root, as it could expand a large set of low-cost partial paths.
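A minimal sketch of uniform-cost search, mirroring the BFS interface above but with weighted successors (here `get_successors` is assumed to return `(successor, cost)` pairs); this is an illustrative implementation, not production code:

import heapq
from itertools import count

def uniform_cost_search(start, goal_test, get_successors):
    tie = count()  # tie-breaker so unorderable states never get compared
    frontier = [(0.0, next(tie), start)]   # priority = cumulative cost g(n)
    best_g = {start: 0.0}
    parent = {start: None}

    while frontier:
        g, _, state = heapq.heappop(frontier)
        if g > best_g.get(state, float('inf')):
            continue  # stale queue entry
        if goal_test(state):
            # reconstruct path by walking parent links
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return list(reversed(path)), g

        for succ, cost in get_successors(state):
            new_g = g + cost
            if new_g < best_g.get(succ, float('inf')):
                best_g[succ] = new_g
                parent[succ] = state
                heapq.heappush(frontier, (new_g, next(tie), succ))

    return None, float('inf')  # no solution found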

comparison of time and space complexities

Uninformed search strategies often suffer from exponential time complexity relative to the search depth. The choice between BFS, DFS, and uniform-cost search will hinge on whether:

  • The solution is known to be at a shallow depth (BFS might be favored).
  • We have a limited memory budget (DFS might be the only feasible option).
  • We care about path cost optimality (uniform-cost search is essential if costs vary).

A general complexity measure for BFS is $O(b^d)$, while DFS is $O(b^m)$, where $m$ can be much larger than $d$. Uniform-cost search complexity depends on the path cost distribution and still might become large, but it guarantees optimality under non-negative edge costs.

informed (heuristic) search strategies

A "heuristic function" h(n)h(n) provides an estimate of the cost (or distance) from node nn to a goal. It encodes additional domain knowledge that helps guide search toward the goal more efficiently. Well-chosen heuristics can dramatically reduce the number of states visited, often turning an otherwise intractable problem into something solvable in practice.

Heuristics are particularly well-studied in puzzles and pathfinding problems. For instance, in the "8-puzzle," a common heuristic is the sum of Manhattan distances of each tile from its goal position, ignoring potential interactions between tiles. This heuristic is admissible, meaning it never overestimates the actual distance to the goal. Admissible heuristics ensure that certain algorithms (like A*) will find an optimal solution.
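As a concrete illustration, here is one way such a Manhattan-distance heuristic could be coded for the 8-puzzle; the flat tuple state encoding (index = row * 3 + column, with 0 standing for the blank) is an assumption of this sketch:

def manhattan_distance(state, goal):
    """Sum of Manhattan distances of tiles from their goal positions.

    state, goal: tuples of length 9; index i holds the tile at row i // 3,
    column i % 3, with 0 standing for the blank (which is ignored).
    """
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
print(manhattan_distance((1, 2, 3, 4, 5, 6, 0, 7, 8), goal))  # 2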

heuristic evaluations (admissibility and consistency)

Two critical properties for heuristics in the context of informed search algorithms are:

  1. Admissibility: A heuristic $h(n)$ is admissible if $h(n) \leq h^*(n)$ for all $n$ (where $h^*(n)$ is the true minimum cost to reach a goal from $n$). This ensures the heuristic never overestimates the cost.

  2. Consistency (or Monotonicity): A heuristic is consistent if $h(n) \leq c(n, a, n') + h(n')$ for every successor $n'$ of $n$ generated by action $a$ with cost $c(n, a, n')$. Consistency implies admissibility, and it ensures that the values of the heuristic function do not "jump" in a way that can break search algorithms.

greedy best-first search

Greedy best-first search expands nodes in the order of their heuristic value $h(n)$ alone. In other words, it always pursues the path that appears to lead most directly to the goal, ignoring the cost it took to get there.

  • Strength: It can be extremely fast in guiding the search toward the goal if the heuristic is strong.
  • Weakness: It is not optimal and can sometimes behave like a naive depth-first approach, wandering down a path that seems promising according to $h(n)$ but is actually expensive overall.

a* search algorithm

A* is one of the most famous and widely used informed search algorithms, balancing both the cost so far and the heuristic estimate:

$$ f(n) = g(n) + h(n) $$

where $g(n)$ is the cost from the initial state to $n$, and $h(n)$ is the heuristic estimate from $n$ to the goal. A* always expands the node with the smallest $f(n)$. With an admissible heuristic $h$, A* is guaranteed to find an optimal solution. If $h$ is also consistent, A* is optimally efficient among algorithms that use the same heuristic information: no other optimal algorithm is guaranteed to expand fewer nodes.

Pseudocode snippet:


import heapq

def a_star_search(start, goal_test, get_successors, h):
    """
    start: initial state
    goal_test: function(state) -> bool
    get_successors: function(state) -> list of (successor, cost)
    h: heuristic function(state) -> float
    """
    open_list = []
    heapq.heappush(open_list, (h(start), 0, start))  # (f(n), g(n), state)

    best_g = {start: 0}        # cheapest path cost found so far for each state
    parent = {start: None}

    while open_list:
        f_val, g_val, current = heapq.heappop(open_list)

        if g_val > best_g.get(current, float('inf')):
            continue  # stale entry: a cheaper path to this state was found later

        if goal_test(current):
            # reconstruct path by walking parent links back to the start
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            path.reverse()
            return path

        for succ, cost in get_successors(current):
            new_g = g_val + cost

            # Only keep this successor if it improves on the best known path to it
            if new_g < best_g.get(succ, float('inf')):
                best_g[succ] = new_g
                parent[succ] = current
                heapq.heappush(open_list, (new_g + h(succ), new_g, succ))

    return None

A*'s performance depends heavily on the quality of the heuristic. A perfect heuristic that exactly computes the remaining cost (i.e., $h(n) = h^*(n)$) guides the search straight toward the goal, expanding only nodes that lie along an optimal path.

analysis of performance and optimality

With an admissible heuristic, A* is guaranteed to find an optimal solution if the branching factor is finite. Its time complexity can still be exponential in the worst case, but in many practical applications (like real-world pathfinding or moderately sized puzzles), it performs impressively. The memory footprint can also become a limiting factor since it stores all generated nodes in memory (open and closed sets). Variants like IDA* (Iterative Deepening A*) address this issue, which I will touch on in the "Advanced Topics" section.

local search and optimization problems

motivation for local search (large or infinite state spaces)

For real-world problems that can involve extremely large or even unbounded state spaces, traditional systematic searches (like BFS, DFS, or A*) can run out of memory or time. Moreover, some problems, such as continuous optimization tasks, do not even have a neatly defined discrete state space to traverse in a graph-like manner. In these cases, local search methods provide a viable solution by iteratively improving (or attempting to improve) a single candidate state (or a small population of states in the case of evolutionary methods) until a stopping criterion is reached.

Local search is also quite practical for problems that do not require an absolutely optimal solution but instead a "good enough" solution. This is typical in many scheduling and design optimization tasks.

hill-climbing and variants

Hill-climbing is perhaps the simplest local search technique. Starting from an initial candidate, it attempts small modifications (moves) that improve the objective or heuristic value, continuing until no neighbor offers improvement. This approach can easily get stuck in local optima, hence the existence of variants:

  • Stochastic hill-climbing: Randomly chooses among neighboring states that improve the evaluation, introducing some randomness.
  • First-choice hill-climbing: Evaluates neighbors in random order until it finds a move that results in a net improvement, typically used when the branching factor is large.
  • Random restarts: Repeatedly restarts hill-climbing from random initial states to attempt escaping local optima.

simulated annealing

Simulated annealing is inspired by the physical process of heating and slowly cooling metal to relieve internal stresses and achieve a strong crystal lattice. In the algorithmic analogy, the "temperature" parameter starts high, allowing random moves that may worsen the objective function, and gradually decreases to reduce the probability of such moves. This mechanism helps the search escape local optima by occasionally allowing a "bad" move with a probability that decreases over time.

A commonly used acceptance probability for a move from state $s$ to state $s'$ with evaluation difference $\Delta E = E(s') - E(s)$ is:

$$ P(\Delta E) = \exp\left(-\frac{\Delta E}{T}\right) $$

where $T$ is the current temperature. If $\Delta E < 0$ (the move is better), it is always accepted.
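A compact sketch of a simulated-annealing loop built around this acceptance rule; the geometric cooling schedule and the parameter values are illustrative choices, not part of any standard:

import math
import random

def simulated_annealing(initial, energy, random_neighbor,
                        t_start=1.0, t_end=1e-3, alpha=0.995):
    """Minimize energy(state) using a geometric cooling schedule."""
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    t = t_start

    while t > t_end:
        candidate = random_neighbor(current)
        delta = energy(candidate) - current_e
        # Always accept improvements; accept worse moves with prob exp(-delta / t)
        if delta < 0 or random.random() < math.exp(-delta / t):
            current, current_e = candidate, current_e + delta
            if current_e < best_e:
                best, best_e = current, current_e
        t *= alpha  # cool down

    return best, best_e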

genetic algorithms and other evolutionary methods

Genetic algorithms (GAs) operate on a population of candidate solutions (often encoded as "chromosomes") and use biologically inspired operations — selection, crossover, mutation — to iteratively evolve better solutions. The main steps are:

  1. Initialization of the population.
  2. Selection of the fittest individuals to reproduce.
  3. Crossover between selected individuals to create new offspring.
  4. Mutation of offspring to maintain diversity in the population.
  5. Replacement of part of the population with new offspring.

Over many generations, GAs can yield high-quality solutions even for complex search spaces, given sufficient time and a well-designed fitness function.
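The following sketch wires these five steps together for bit-string chromosomes; the tournament selection, single-point crossover, population size, and mutation rate are placeholder choices you would tune per problem:

import random

def genetic_algorithm(fitness, chrom_len, pop_size=50,
                      generations=200, mutation_rate=0.01):
    # 1. Initialization: random bit-string population
    pop = [[random.randint(0, 1) for _ in range(chrom_len)]
           for _ in range(pop_size)]

    for _ in range(generations):
        # 2. Selection: tournament of size 2
        def select():
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            # 3. Crossover: single point
            point = random.randrange(1, chrom_len)
            child = p1[:point] + p2[point:]
            # 4. Mutation: flip bits with small probability
            child = [1 - g if random.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        # 5. Replacement: full generational replacement
        pop = next_pop

    return max(pop, key=fitness)

# Toy usage: maximize the number of ones ("one-max")
print(genetic_algorithm(fitness=sum, chrom_len=20))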

local search in continuous spaces

Local search algorithms also extend to continuous domains. Gradient-based methods like gradient descent or stochastic gradient descent (though typically used in machine learning contexts) can be seen as local search strategies that move in the direction of steepest descent in the solution space. Such methods can be combined with heuristics, momentum terms, or advanced adaptation techniques (e.g., Adam optimizer in deep learning) to refine local search performance in high-dimensional continuous optimization.

Local search algorithms are essential in engineering, finance, and operations research, where solution spaces might be enormous, and a thoroughly systematic search would be entirely impractical. Although local search does not guarantee an optimal solution, it is often one of the few feasible approaches for large-scale problems.

search in complex and uncertain environments

dealing with nondeterministic actions

In real-world scenarios, actions might not always have a single guaranteed effect. A delivery robot, for instance, could slip or face unexpected obstructions. In such nondeterministic environments, an action might lead to several possible outcomes. When formulating a search problem, we can represent this by branching into multiple successor states for each action, each with a certain probability or range of outcomes.

Algorithms like AND-OR search can handle environments in which the agent must succeed under the worst-case outcome or must handle branching contingencies. If there is a nontrivial probability associated with each outcome, we can incorporate expected cost or utility in the search framework.

search in partially observable environments

When the agent does not fully observe the underlying state, we can adopt a belief state approach, in which each "state" in our search representation is actually a set (or distribution) of possible real states. Searching in the space of belief states often leads to combinatorial explosions, but techniques like Partially Observable Markov Decision Processes (POMDPs) provide a framework for acting optimally under uncertainty and partial observability.

online search agents for unknown environments

Sometimes the agent must act before it has complete knowledge of the environment; this setting is known as online search. Here, the agent might not know the transition model or the cost structure in advance. It must actively explore and discover the environment while still striving toward a goal. Online versions of BFS or DFS can be used, but they need to incorporate real-time updates to the state space. Some algorithms maintain a learned map of the environment and replan incrementally (e.g., LRTA*, or Learning Real-Time A*, which updates its heuristic values based on local experiences).

constraint satisfaction problems (csps)

defining csps (variables, domains, constraints)

Constraint Satisfaction Problems are a specialized form of search where states are defined by the values of a fixed set of variables. The goal is to find an assignment of values to variables that satisfies all the given constraints. Examples include map coloring (assigning colors to regions on a map so that no two adjacent regions have the same color), cryptarithmetic puzzles, Sudoku, and various scheduling and resource allocation tasks.

A CSP is typically specified by:

  • A set of variables $X = \{ x_1, x_2, \dots, x_n \}$.
  • A domain $D(x_i)$ of possible values for each variable.
  • Constraints $C$ that specify allowable combinations of values.

constraint propagation and inference in csps

Instead of blindly enumerating all possible value assignments, CSP solvers can use constraint propagation to reduce the search space. By systematically applying constraints, we can eliminate impossible or inconsistent values early. For example, arc consistency algorithms (like AC-3) propagate domain reductions: if a certain value of $x$ is incompatible with every possible value of $y$, then we can remove that value from $x$'s domain.
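A minimal AC-3 sketch over a binary CSP. Here `domains` maps variables to sets of values, `neighbors` maps each variable to its constraint partners, and `constraint(x, vx, y, vy)` returns whether that pair of assignments is allowed; these interfaces are assumptions of the sketch:

from collections import deque

def ac3(domains, neighbors, constraint):
    """Enforce arc consistency; returns False if some domain becomes empty."""
    queue = deque((x, y) for x in domains for y in neighbors[x])

    while queue:
        x, y = queue.popleft()
        # Remove values of x that have no support in y's domain
        revised = False
        for vx in set(domains[x]):
            if not any(constraint(x, vx, y, vy) for vy in domains[y]):
                domains[x].discard(vx)
                revised = True
        if revised:
            if not domains[x]:
                return False  # inconsistency detected
            # Re-check arcs pointing at x
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))
    return True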

backtracking search for csps

Backtracking is essentially a depth-first search that attempts to assign values to variables sequentially. If a partial assignment violates any constraint, the algorithm backtracks immediately. Intelligent ordering heuristics (like the minimum remaining values heuristic) can greatly enhance efficiency by choosing the variable that is most constrained first.

Example: Suppose you have a CSP with variables $x_1, x_2, x_3$, each with domain $\{1, 2, 3\}$, and constraints $x_1 \neq x_2$, $x_2 \neq x_3$, and $x_1 \neq x_3$. A naive approach enumerates all combinations (27 in total). A backtracking approach with intelligent ordering can reduce the expansions drastically by pruning inconsistent partial solutions as soon as they appear.
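Below is a small backtracking sketch applied to that three-variable all-different instance, with a simple minimum-remaining-values ordering; the `domains`/`constraints` encoding is just one illustrative choice:

def backtracking_search(domains, constraints, assignment=None):
    """domains: var -> list of candidate values.
    constraints: list of (var_a, var_b) pairs that must take different values."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(domains):
        return assignment

    def consistent(var, value):
        # value is allowed if no already-assigned neighbor holds the same value
        for a, b in constraints:
            if var in (a, b):
                other = b if a == var else a
                if assignment.get(other) == value:
                    return False
        return True

    # Minimum-remaining-values: expand the variable with the fewest consistent values
    unassigned = [v for v in domains if v not in assignment]
    var = min(unassigned, key=lambda v: sum(consistent(v, x) for x in domains[v]))

    for value in domains[var]:
        if consistent(var, value):
            assignment[var] = value
            result = backtracking_search(domains, constraints, assignment)
            if result is not None:
                return result
            del assignment[var]  # backtrack on failure

    return None

domains = {'x1': [1, 2, 3], 'x2': [1, 2, 3], 'x3': [1, 2, 3]}
constraints = [('x1', 'x2'), ('x2', 'x3'), ('x1', 'x3')]
print(backtracking_search(domains, constraints))  # {'x1': 1, 'x2': 2, 'x3': 3}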

local search for csps

Local search can also solve CSPs by starting with a complete but inconsistent assignment (e.g., assign all variables random values) and then iteratively making changes that reduce the number of violated constraints. For example, in a map-coloring CSP, you could randomly color each region, then repeatedly pick a conflicted region and change its color to reduce the conflict count. This is the essence of the min-conflicts algorithm, famously applied to the n-queens problem for extremely large n.
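A bare-bones min-conflicts loop, assuming the caller supplies a `conflicts(var, value, assignment)` counter for the problem at hand; the step budget and random tie-breaking are implementation choices of this sketch:

import random

def min_conflicts(variables, domains, conflicts, max_steps=10_000):
    """conflicts(var, value, assignment) -> number of violated constraints."""
    # Start from a complete (likely inconsistent) random assignment
    assignment = {v: random.choice(domains[v]) for v in variables}

    for _ in range(max_steps):
        conflicted = [v for v in variables
                      if conflicts(v, assignment[v], assignment) > 0]
        if not conflicted:
            return assignment  # all constraints satisfied

        var = random.choice(conflicted)
        # Move var to the value that minimizes its conflict count
        assignment[var] = min(domains[var],
                              key=lambda val: conflicts(var, val, assignment))

    return None  # no solution found within the step budget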

the structure of problems (constraint graphs, problem decomposition)

CSPs can be visualized using constraint graphs, where each node is a variable, and edges denote constraints between variables. Some graphs have specific structures (like trees or bipartite forms) that admit polynomial-time solutions (tree-structured CSPs can be solved efficiently via dynamic programming over the tree). More advanced decomposition methods, like cycle cutset or tree decomposition, can also reduce the complexity of certain CSPs by transforming them into simpler subproblems.

adversarial search and games

motivation for adversarial search (two-player games, competitive environments)

Adversarial search arises in game-playing scenarios where two (or more) agents compete. One agent's gain is often another's loss, so the search problem is no longer about finding a single path to a goal but maximizing a utility function in the presence of an opponent who tries to minimize it.

Classic examples include board games like chess, checkers, Go, or tic-tac-toe. The hallmark of adversarial search is the presence of a second player (or multiple players) whose actions directly impact your outcome.

optimal decisions in games (minimax)

Minimax is the foundational technique for deterministic, perfect-information games involving two players: the "maximizing" player tries to maximize the score, while the "minimizing" player tries to minimize it. A minimax tree alternates layers labeled "max" and "min". The leaf nodes are assigned utility values (the outcome from the perspective of the maximizing player), and these values get backed up to the root by alternating min or max operations at each layer.

Formal definition:

$$ \text{Minimax}(n) = \begin{cases} \max_{n' \in \text{children}(n)} \text{Minimax}(n'), & \text{if } n \text{ is a max node},\\[6pt] \min_{n' \in \text{children}(n)} \text{Minimax}(n'), & \text{if } n \text{ is a min node}. \end{cases} $$

This works perfectly for small game trees but becomes infeasible for large games like chess or Go because the branching factor is huge.
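As a minimal sketch, this recurrence translates almost line-for-line into Python; `get_children` and `utility` are placeholder callbacks standing in for a concrete game's move generator and terminal evaluation:

def minimax(node, is_max, get_children, utility):
    children = get_children(node)
    if not children:                 # terminal node: return its utility
        return utility(node)

    values = [minimax(child, not is_max, get_children, utility)
              for child in children]
    return max(values) if is_max else min(values)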

heuristic alpha-beta tree search and pruning

Alpha-beta pruning enhances minimax search by pruning branches that cannot affect the final decision. It maintains two bounds: alpha (the best value so far for the maximizing player) and beta (the best value so far for the minimizing player). When alpha becomes greater than or equal to beta, further exploration of that branch can be cut off. This yields the same result as pure minimax but often with far fewer node expansions, particularly in well-ordered trees.
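The same recursion as the minimax sketch above, with alpha-beta cutoffs added; it returns the identical value while skipping branches that cannot change the decision:

def alpha_beta(node, is_max, get_children, utility,
               alpha=float('-inf'), beta=float('inf')):
    children = get_children(node)
    if not children:
        return utility(node)

    if is_max:
        value = float('-inf')
        for child in children:
            value = max(value, alpha_beta(child, False, get_children, utility,
                                          alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the min player already has a better option
        return value
    else:
        value = float('inf')
        for child in children:
            value = min(value, alpha_beta(child, True, get_children, utility,
                                          alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break  # alpha cutoff: the max player already has a better option
        return value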

monte carlo tree search

Monte Carlo Tree Search (MCTS) gained prominence through successes in games with enormous branching factors, such as Go (famously used in AlphaGo by DeepMind). MCTS incrementally builds a game tree through repeated random simulations. The four phases of MCTS are:

  1. Selection: Start at the root and select child nodes according to some policy (e.g., UCB1) until you reach a leaf.
  2. Expansion: Expand the leaf node by generating one or more child nodes.
  3. Simulation: Play out random moves from the expanded node until a terminal state is reached.
  4. Backpropagation: Propagate the simulation result back up the tree, updating statistics (win rates, Q-values, etc.) at each node.

After many iterations, the most visited or highest-value child of the root is typically chosen as the next move. MCTS is widely cited in top-tier AI game research (Browne et al., IEEE Transactions on Computational Intelligence and AI in Games, 2012).
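For reference, the UCB1 rule mentioned in the selection phase is commonly written as

$$ \text{UCB1}(i) = \frac{w_i}{n_i} + c \sqrt{\frac{\ln N}{n_i}} $$

where $w_i$ is the total reward accumulated by child $i$, $n_i$ is its visit count, $N$ is the visit count of its parent, and $c$ is an exploration constant (often set around $\sqrt{2}$).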

stochastic games and partially observable games

In some games, chance (dice rolls, card draws) or hidden information (opponent's cards) must be taken into account, leading to a more general search approach that includes chance nodes in the game tree or a belief-based representation. Pioneering approaches combine minimax or alpha-beta style expansions with expected utility calculations to handle chance nodes effectively (e.g., searching game trees for backgammon or poker).

limitations of game search algorithms

Even with alpha-beta pruning or MCTS, game search can run into severe computational challenges. Real games like chess have an astronomically large state space. Heuristic evaluation functions become indispensable, domain knowledge is often incorporated (e.g., advanced piece-value heuristics in chess), and specialized data structures (like transposition tables) help avoid repeated expansions of identical positions. Nonetheless, building a strong game-playing AI often demands sophisticated engineering, search enhancements, and efficient parallelization.

classical search and its extensions

review of classic search algorithms (bfs, dfs, uniform cost, a*)

Classical search provides the building blocks for problem-solving in AI:

  • BFS: Guaranteed to find the shallowest solution, but can be memory-intensive.
  • DFS: Efficient memory usage, but can get stuck in deep or infinite paths.
  • Uniform-cost search: Extends BFS to handle varying costs, guaranteeing an optimal solution.
  • A*: Incorporates heuristics to guide search more intelligently.

Informed (heuristic) search methods (like A*) should be used whenever a useful heuristic function is available. If no heuristic is readily available or one can't be reliably derived, uninformed approaches may be the only choice. However, in real-world projects, domain knowledge is often gleaned from subject matter experts to craft heuristics, or we might adopt learning methods to generate approximate heuristics.

iterative deepening search strategies

Iterative deepening depth-first search (IDDFS) is a strategy that runs DFS with a depth limit of $l$ and increments $l$ in each iteration until the goal is found. This approach combines the memory advantage of DFS with the completeness of BFS. IDDFS re-explores states multiple times but still often proves effective, especially when the search space is large but solutions exist at relatively shallow depths.
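A compact sketch of iterative deepening built from a depth-limited DFS; the interface mirrors the earlier dfs example, with an extra depth cap (the default of 50 is arbitrary):

def depth_limited_dfs(state, goal_test, get_successors, limit):
    if goal_test(state):
        return [state]
    if limit == 0:
        return None
    for succ in get_successors(state):
        path = depth_limited_dfs(succ, goal_test, get_successors, limit - 1)
        if path is not None:
            return [state] + path
    return None

def iterative_deepening_search(start, goal_test, get_successors, max_depth=50):
    # Run depth-limited DFS with limits 0, 1, 2, ... until a solution appears
    for limit in range(max_depth + 1):
        path = depth_limited_dfs(start, goal_test, get_successors, limit)
        if path is not None:
            return path
    return None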

bidirectional search

Bidirectional search runs two simultaneous searches: one forward from the initial state and one backward from the goal (or a set of goals). By meeting in the middle, the search can drastically reduce the number of expanded nodes from $O(b^d)$ to roughly $O(b^{d/2})$ in each direction, yielding $O(b^{d/2})$ overall. The challenge often lies in implementing the backward search if the goal is not a single explicit state, or if the actions are not easily reversible.

advanced topics

algorithmic problem solving at scale

Real-world problems often require searching huge state spaces under strict time or memory constraints. Researchers in conferences like NeurIPS or ICML regularly propose advanced search methods that incorporate sampling, parallelization, or advanced data structures to handle scale:

  • Parallel and distributed search: Use multiple processors/GPUs to expand states in parallel.
  • Hierarchical planning: Divide a problem into subproblems or layers.
  • Symbolic search: Represent states and expansions in a compressed symbolic form (Binary Decision Diagrams, for example) to handle large state spaces.

enhancing heuristic functions (pattern databases, learning heuristics)

Well-designed heuristic functions can drastically improve search. "Pattern databases" precompute exact solution costs for smaller subproblems and use them as admissible heuristics for the full problem. For instance, in the 15-puzzle, a pattern database might store the exact number of moves needed to place a subset of tiles correctly, ignoring the others. Combining multiple pattern databases (carefully, to preserve admissibility) can yield powerful heuristics.

Machine learning techniques can also learn heuristics from experience. Reinforcement learning or self-play methods (like those used in AlphaZero) can generalize heuristics for complex domains, surpassing human-engineered knowledge in some cases.

incorporating domain-specific knowledge

A large part of success in AI search often hinges on domain-specific knowledge that drastically prunes the state space. In scheduling, for instance, knowledge about priority constraints or typical resource bottlenecks can guide the search away from obviously infeasible or suboptimal paths. In puzzle solving, specific transformations or known patterns can skip large swaths of states.

memory-bounded search (ida*, sma*)

To handle the excessive memory usage of algorithms like A*, researchers introduced depth-first variants that iteratively deepen on $f(n)$ instead of depth:

  • IDA* (Iterative Deepening A*) uses successive depth limits in terms of the $f(n)$ value, not the level in the tree.
  • SMA* (Simplified Memory-bounded A*) is similar to A* but discards the worst node when memory is full, storing only essential parts of the tree.

These methods aim to preserve the optimality of A* while managing memory constraints more gracefully.
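To make the IDA* idea concrete, here is a minimal sketch that iteratively raises the f-bound; the interfaces mirror the earlier A* example and the whole snippet is illustrative only:

def ida_star(start, goal_test, get_successors, h):
    def search(path, g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f, None            # exceeded the bound: report the overshoot
        if goal_test(node):
            return f, list(path)
        minimum = float('inf')
        for succ, cost in get_successors(node):
            if succ in path:          # avoid cycles along the current path
                continue
            path.append(succ)
            t, found = search(path, g + cost, bound)
            path.pop()
            if found is not None:
                return t, found
            minimum = min(minimum, t)
        return minimum, None

    bound = h(start)
    while True:
        bound, found = search([start], 0, bound)
        if found is not None:
            return found
        if bound == float('inf'):
            return None               # no solution exists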

dealing with real-time constraints

Sometimes, the agent must produce an action within a strict time limit (real-time decision-making). In these cases, the search might be truncated after exploring some limited number of states. A partial plan can be executed, with re-planning happening on the fly. Real-time heuristic search algorithms (like RTA*, LRTA*) adapt the heuristic based on actual outcomes in the environment, gradually improving their estimates over repeated trials.

This style of search is especially relevant in robotics and real-time strategy games, where actions must be taken continuously, and the environment might change faster than a comprehensive search can complete.

putting it all together

AI search is more than just BFS or A*; it's a broad family of algorithms and techniques designed to navigate complex spaces of possibilities and to systematically (or sometimes stochastically) pinpoint solutions in an efficient manner. Search methods range from naive to highly informed, from exhaustive expansions to local or evolutionary methods, from deterministic to probabilistic approaches, and from single-agent optimization to adversarial game-playing scenarios. Understanding these methods — as well as how to select, configure, or combine them — is critical for designing intelligent agents capable of tackling a variety of tasks.

Modern research extends classical search concepts in fascinating ways. For instance, advanced heuristics might be learned via neural networks, as seen in the latest chess, shogi, or Go engines (Silver et al., Science 2018), where search complements deep learning. Constraint satisfaction has advanced to handle dynamic and distributed CSPs, and local search has evolved into large-scale optimization frameworks used in myriad industries. Adversarial search strategies form the foundation of many game-playing AIs and even negotiation or bidding agents in economic domains. Meanwhile, real-time search, belief state representation, or hierarchical expansions cater to partial observability and uncertain or rapidly changing environments.

In practice, the best approach to any search problem often involves a combination of these methods, along with domain-specific tricks, data structures (like transposition tables or pattern databases), or real-time learning of heuristics. Many of the underlying frameworks introduced here continue to inspire new research directions, from partial expansions to incorporate machine learning predictions, to memory-bounded or distributed solutions for massive search tasks, to synergy with reinforcement learning and generative models for complex decision-making.

illustrative figures and code

Below, I'll add a few schematic images to help visualize key concepts:

[Figure placeholder: "Example of a search tree". A basic search tree expanding from an initial state. Each branching factor leads to a new set of states. Dashed lines can represent pruned branches.]

[Figure placeholder: "Constraint graph illustration". Constraint graph for a simple coloring problem. Each node is a region/variable, edges denote adjacency constraints.]

And, for completeness, here is a more detailed Python snippet demonstrating a local search approach (a simple hill-climbing variant with random restarts, applied to a TSP-like tour-length minimization problem):


import random

def cost_function(solution, distances):
    # Total tour length, including the return leg to the starting city
    total_dist = 0
    for i in range(len(solution) - 1):
        total_dist += distances[solution[i]][solution[i + 1]]
    total_dist += distances[solution[-1]][solution[0]]
    return total_dist

def get_random_neighbor(solution):
    # Swap two random positions in the tour
    new_solution = solution[:]
    i, j = random.sample(range(len(solution)), 2)
    new_solution[i], new_solution[j] = new_solution[j], new_solution[i]
    return new_solution

def hill_climb_random_restart(nodes, distances, num_restarts=10, max_no_improve=100):
    best_solution = None
    best_cost = float('inf')

    for _ in range(num_restarts):
        # Generate a random initial tour
        current_solution = nodes[:]
        random.shuffle(current_solution)
        current_cost = cost_function(current_solution, distances)

        # First-choice hill climbing: sample random neighbors and accept improvements;
        # give up after max_no_improve consecutive non-improving samples
        no_improve = 0
        while no_improve < max_no_improve:
            neighbor = get_random_neighbor(current_solution)
            neighbor_cost = cost_function(neighbor, distances)

            if neighbor_cost < current_cost:
                current_solution = neighbor
                current_cost = neighbor_cost
                no_improve = 0
            else:
                no_improve += 1

        if current_cost < best_cost:
            best_cost = current_cost
            best_solution = current_solution[:]

    return best_solution, best_cost

This code outlines a basic approach: for a given set of nodes (cities) and a distance matrix, it tries to reorder them to minimize the total travel distance. It performs multiple restarts to avoid local minima. In practice, you can refine such a method with more sophisticated strategies, advanced metaheuristics, or domain-specific constraints and optimizations.

final remarks

Search is a fundamental pillar of AI, providing general-purpose methods to systematically explore sets of possible solutions. Practically every AI curriculum includes search early on because it not only clarifies how intelligent agents tackle problems but also reveals the key challenges — like exponential growth in complexity and the potential for sophisticated heuristic guidance or partial expansions.

Selecting or designing the right search approach involves carefully weighing problem size, available domain knowledge (to craft heuristics or constraints), memory limits, time requirements, and whether we're dealing with single-agent optimization, multiagent adversarial scenarios, or uncertain and partially observable worlds. From BFS to A* to local search, from CSP solvers to adversarial and real-time search, these techniques shape the backbone of problem-solving in AI. Research continues to refine and innovate upon these core concepts, integrating them with modern machine learning and domain-specific knowledge, pushing the frontiers of what search-based AI can achieve.

I hope this comprehensive deep dive into AI search strategies provides clarity and sparks ideas for applying these concepts to your own projects, research, and further study.
