Fire is a natural agent of change for our planet’s survival, yet it can cause devastating effects (economic, societal, environmental, etc.) when it encroaches on our daily lives. In the midst of a wildland fire, incident commanders are bombarded with massive amounts of data, accurate or not, and must decide in real time how to allocate available resources to extinguish the fire with minimal damage.
The scenario is modeled as an attacker-defender game in which the defender (resources with fire retardants) protects its assets (homes, businesses, power plants, etc.) while the attacker (wildland fires) attempts to deliver maximum destruction to those assets. The problem can be formulated in terms of optimal control theory using Dynamic Programming (DP), the gold standard of optimization, which exhaustively searches the solution space for the minimum cost. Its drawback stems directly from that exhaustive search: the processing time needed to compute the minimum cost increases exponentially with the complexity of the system. For this reason, DP is generally executed offline in real-world applications. Given the large solution space of a wildland fire scenario, offline execution of DP is problematic because resource allocation decisions must be made in real time.
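To make the trade-off concrete, the following is a minimal sketch of a finite-horizon DP on a toy version of the defender's problem. It is not the model used in this research: the stage structure (one threatened asset per stage), the function names, and the probabilities `p_hit` and `p_stop` are all assumptions chosen for illustration. The brute-force variant enumerates all 2^n allocation plans, which shows why exhaustive search becomes expensive as the number of stages grows.

```python
from functools import lru_cache
from itertools import product


def dp_min_expected_loss(values, units, p_hit=0.9, p_stop=0.7):
    """Backward-induction DP on a toy model (all parameters hypothetical).

    Stage t threatens asset t with value values[t]. Committing one retardant
    unit lowers the damage probability from p_hit to p_hit * (1 - p_stop).
    J(t, r) = min over u in {0, 1}, u <= r, of stage loss + J(t + 1, r - u).
    """
    n = len(values)

    @lru_cache(maxsize=None)
    def cost_to_go(t, r):
        if t == n:  # no assets left to protect
            return 0.0
        best = float("inf")
        for u in (0, 1):  # u = 1 means "commit one unit at this stage"
            if u > r:
                continue
            p_damage = p_hit * (1.0 - p_stop) if u else p_hit
            best = min(best, p_damage * values[t] + cost_to_go(t + 1, r - u))
        return best

    return cost_to_go(0, units)


def brute_force_min_expected_loss(values, units, p_hit=0.9, p_stop=0.7):
    """Same toy model, solved by enumerating every one of the 2**n plans."""
    best = float("inf")
    for plan in product((0, 1), repeat=len(values)):
        if sum(plan) > units:  # plan uses more units than are available
            continue
        loss = sum(
            (p_hit * (1.0 - p_stop) if u else p_hit) * v
            for u, v in zip(plan, values)
        )
        best = min(best, loss)
    return best
```

Both functions return the same minimum expected loss on this toy model; the DP reuses the cost-to-go of each (stage, remaining units) pair, whereas the enumeration revisits every plan.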
The current research effort seeks to demonstrate a new control algorithm, based on Neuro-Fuzzy Dynamic Programming (NFDP), that nearly replicates the results of the DP algorithm while executing in real time and remaining robust to uncertainties. An artificial neural network provides the approximate cost-to-go function for the DP, fulfilling the need for real-time execution. The neural network is trained by approximate policy iteration using Monte Carlo simulations. Since the sensors may provide inaccurate or incomplete data about the environment, a fuzzy logic component is integrated to make the system robust. The problem is also extended to include multiple layers of defense, as opposed to a single-layer attempt to eliminate the incoming threat. The multi-layered defense requires a distinct approach to calculating future expected costs in the NFDP algorithm, since a fire must successfully elude three layers of defense to constitute an attack on an asset.
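The effect of layering on expected cost can be illustrated with a short sketch. It assumes, purely for illustration, that each layer intercepts the fire independently with a known probability; the research itself does not state this, and the function names and numbers below are hypothetical. Under that assumption the expected loss of an asset is its value times the probability that the fire eludes every layer, and a seeded Monte Carlo simulation should approach the same figure.

```python
import random


def leak_probability(intercept_probs):
    """Probability a fire eludes every layer (assumes independent layers)."""
    leak = 1.0
    for p in intercept_probs:
        leak *= 1.0 - p
    return leak


def expected_asset_loss(asset_value, intercept_probs):
    """Closed-form expected loss: asset value times the leak probability."""
    return asset_value * leak_probability(intercept_probs)


def monte_carlo_asset_loss(asset_value, intercept_probs, n_trials=20000, seed=0):
    """Estimate the same expected loss by simulating individual fires."""
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    hits = 0
    for _ in range(n_trials):
        # The fire reaches the asset only if it eludes each layer in turn.
        if all(rng.random() >= p for p in intercept_probs):
            hits += 1
    return asset_value * hits / n_trials
```

With hypothetical intercept probabilities of 0.5, 0.4, and 0.3, the leak probability is 0.5 × 0.6 × 0.7 = 0.21, so an asset valued at 10 has an expected loss of 2.1 under this assumption.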
Four control methodologies are examined in the research: a greedy heuristic, DP, Neuro-Dynamic Programming (NDP), and NFDP. DP and the heuristic serve as benchmark cases; the premise of the heuristic is to protect the highest-valued assets at all costs. The methodologies are compared on three criteria: processing time, remaining asset health, and scalability. Processing time quantifies whether the requirement of real-time decisions is met. Asset health measures how well the defender protected its assets from the attacker. Scalability measures how well the algorithm copes with increased complexity. With proper adjustments to the architecture and training techniques of the artificial neural network and fine-tuning of the fuzzy controller parameters, NFDP demonstrates the ability to make decisions in real time, obtain near-optimal results in the presence of uncertain sensor data, and scale well with increased complexity.
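The premise of the greedy benchmark can be sketched in a few lines. This is only one plausible reading of "protect the highest-valued assets at all costs"; the function name, the asset names, and the one-unit-per-asset rule are assumptions made for illustration, not details taken from the research.

```python
def greedy_allocate(asset_values, units):
    """Assign one defensive unit to each of the highest-valued assets.

    asset_values: dict mapping a (hypothetical) asset name to its value.
    units: number of defensive units available.
    Returns the names of the defended assets, highest value first.
    """
    ranked = sorted(asset_values, key=asset_values.get, reverse=True)
    return ranked[:max(units, 0)]
```

For example, `greedy_allocate({"plant": 8, "home": 5, "shop": 1}, 2)` returns `["plant", "home"]`. Such a rule needs only a sort, which makes it fast, but it ignores how likely each asset is to be attacked.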