1Search-based Planning in RoboticsMaxim Likhachev Research Assistant ProfessorComputer and Information ScienceUniversity of Pennsylvania22Questions are welcome at anypoint3Introduction•I am heading Planning Groupat University of Pennsylvania4Introduction•The robots my group works on:segbotrobotic dog5Introduction•The robots my group works on:mobile manipulatorhousehold robot PR26Introduction•The robots my group works on:unmanned helicopterquadrotor7Introduction•The problem my group works on:•intelligent decision-making-what path to follow -where to look-where/how to move arms-where to fly and where to land…8Typical Robotic ArchitecturePlannerControllerpath/motionmotor commandspose sensors (GPS, Compass,IMU,…)PerceptionLocalizationpose estimateobstacle sensors (Lasers, Stereo camera,…)2D or 3D map estimate9Motion/Path Planning•Task:-findafeasible(andcost-minimal)path/motionfromthecurrentconfigurationoftherobottoitsgoalconfiguration(oroneofitsgoalconfigurations)•Twotypesofconstraints:-environmentalconstraints(e.g.,obstacles)-dynamics/kinematicsconstraintsoftherobot•Generatedmotion/pathshould(objective):-beanyfeasiblepath-minimizecostsuchasdistance,time,energy,risk,…10Motion/Path PlanningExamples (of what is usually referred to as path planning):11Motion/Path PlanningExamples (of what is usually referred to as motion planning):Piano Movers’ problem12Motion/Path PlanningExamples (of what is usually referred to as motion planning):Planned motion for a 6DOF robot arm13Motion/Path PlanningExamples (of what is usually referred to as motion planning):Planned motion for a 20DOF robot armSearch-based Planning•Construct a graph representing the planning problem•Find a (least-cost) path in the graph14Search-based Planning•General: applicable to any problem that can be formulated as finding a least-cost path in a graph•Theoretically well-grounded and thoroughly studied•Optimally efficient methods exist (e.g., BFS, Dijkstra’s search, A*)•Simple to understand and implement15Search-based PlanningSome of the research issues:-how to construct a graph that is compact and represents the problem well-how to search large complex graphs in real-time-how to deal with uncertainty in execution and information16Outline of the Rest of the Talk17•Graph construction•Graph search•Dealing with uncertaintyOutline of the Rest of the Talk18•Graph construction•Graph search•Dealing with uncertaintyGraph Construction•One of the most common approaches: approximate cell decomposition (gridworld)discretizeplanning mapS1S2S3S4S5S6convert into a graphsearch the graph for a least-cost path from sstartto sgoalS1S2S3S4S5S62D graph: each state represents 19Graph Construction•One of the most common approaches: approximate cell decomposition (gridworld)discretizeplanning mapS1S2S3S4S5S6convert into a graphsearch the graph for a least-cost path from sstartto sgoalS1S2S3S4S5S62D graph: each state represents 20Such plans are infeasible for many robotic systemsGraph Construction•Research on constructing graphs with feasible pathsmulti-resolution lattice graphs (joint work with D. Ferguson)21each state represents action templatereplicate it onlineeach transition is feasible(constructed beforehand)outcome state is the center of the corresponding cellGraph Construction•Research on constructing graphs with feasible pathsmulti-resolution lattice graphs (joint work with D. Ferguson)22planning on 4D lattice with Anytime D*:part of efforts by Tartanracing team from CMU for the Urban Challenge 2007 racemotion primitives4D planGraph Construction•Research on constructing compact but rich-enough graphstime-bounded lattice graphs for planning in dynamic environments (joint work with A. Kushleyev)23planning on a graph that combines together: high-D states with time as an additional state variable () low-D states without time variable ()lattice with states that have time as state variables2D gridworldGraph Construction•Research on constructing compact but rich-enough graphstime-bounded lattice graphs for planning in dynamic environments (joint work with A. Kushleyev)24planning on a graph that combines together: high-D states with time as an additional state variable () low-D states without time variable ()Graph Construction•Research on constructing compact, feasible and rich-enough graphsmobile manipulation (joint work with B. Cohen, S. Chitta)25planning on a 4D graph where: each state represents each transition corresponds to a small change in door angle and movementpart of efforts by Willow GarageOutline26•Graph construction•Graph search•Dealing with uncertaintyGraph Search•Theoretically doable using Dijkstra’s (BFS, A*, etc.)27S2S1Sgoal2g=1h=2g= 3h=1g= 5h=02S4S33g= 2h=2g= 5h=11Sstart11g=0h=3S2S1Sgoal2g=2S4S33g= 1Sstart113Challenges:-graph is usually huge (high-D) and doesn’t even fit into memory-edgecosts are often too time-consuming to compute all-plans need to be generated within seconds (real-time constraints)-plans need to be constantly re-generated as robot moves and map is updatedGraph Search28ATRV navigating initially-unknown environmentplanning map and path•Planning in -partially-known environments is a repeated process-dynamic environments is also a repeated processGraph Search29•Planning in -partially-known environments is a repeated process-dynamic environments is also a repeated processplanning in dynamic environmentsGraph Search30planning in dynamic environments•Need to re-plan fast!•Two ways to help with this requirementanytime planning –return the best plan possible within T msecsincremental planning –reuse previous planning effortsGraph Search•Our research on graph search:-graph search that finds a possibly suboptimal solution fast and then refines it until it either runs out of planning time or it finds a provably optimal solution (anytime graph search)-graph search that reuses its previous search efforts and finds a new plan much faster whenever edgecosts change and robot moves (incremental graph search)-graph search that explores huge graphs in randomized fashion and returns solutions with probabilistic guarantees on solution suboptimality (randomized graph search)-…3132•Constructing anytime search based on weighted A*:-find the best path possible given some amount of time for planning-do it by running a series of weighted A* searches with decreasing ε:ε=2.513 expansions solution=11 movesε=1.515 expansions solution=11 movesε=1.020 expansions solution=10 movesGraph Search33•Constructing anytime search based on weighted A*:-find the best path possible given some amount of time for planning-do it by running a series of weighted A* searches with decreasing ε:ε=2.513 expansions solution=11 movesε=1.515 expansions solution=11 movesε=1.020 expansions solution=10 movesGraph Search•After every search it is guaranteed that:–cost(solution) ≤ εcost(optimal solution)34•Constructing anytime search based on weighted A*:-find the best path possible given some amount of time for planning-do it by running a series of weighted A* searches with decreasing ε:ε=2.513 expansions solution=11 movesε=1.515 expansions solution=11 movesε=1.020 expansions solution=10 moves•Inefficient because –many state values remain the same between search iterations–we should be able to reuse the results of previous searchesGraph Search35•Constructing anytime search based on weighted A*:-find the best path possible given some amount of time for planning-do it by running a series of weighted A* searches with decreasing ε:ε=2.513 expansions solution=11 movesε=1.515 expansions solution=11 movesε=1.020 expansions solution=10 moves•ARA*: Anytime version of A* search-an efficient version of the above that reuses state values within any search iterationGraph Search36•A series of weighted A* searches •ARA*: Anytime version of A* searchε=2.513 expansions solution=11 movesε=1.515 expansions solution=11 movesε=1.020 expansions solution=10 movesε=2.513 expansions solution=11 movesε=1.51 expansion solution=11 movesε=1.09 expansions solution=10 movesGraph SearchGraph Search•Our research on graph search:-graph search that finds a possibly suboptimal solution fast and then refines it until it either runs out of planning time or it finds a provably optimal solution (anytime graph search)-graph search that reuses its previous search efforts and finds a new plan much faster whenever edgecosts change and robot moves (incremental graph search)-graph search that explores huge graphs in randomized fashion and returns solutions with probabilistic guarantees on solution suboptimality (randomized graph search)-…37•Incremental search: reuses state values from previous searchescost of least-cost paths to sgoalinitiallycost of least-cost paths to sgoalafter the door turns out to be closedGraph Search•Our research on graph search:-graph search that finds a possibly suboptimal solution fast and then refines it until it either runs out of planning time or it finds a provably optimal solution (anytime graph search)-graph search that reuses its previous search efforts and finds a new plan much faster whenever edgecosts change and robot moves (incremental graph search)-graph search that explores huge graphs in randomized fashion and returns solutions with probabilistic guarantees on solution suboptimality (randomized graph search)-…38•Incremental search: reuses state values from previous searchescost of least-cost paths to sgoalinitiallycost of least-cost paths to sgoalafter the door turns out to be closedOnly the values of states in grey changeGraph Search•Our research on graph search:39•D* Lite: Incremental searchinitial search by backwards A*second search by backwards A*initial search by D* Litesecond search by D* LiteGraph Search•Anytime D*: –decrease and update edge costs at the same time–re-compute a path by reusing previous state-values40part of efforts by Tartanracing team from CMU for the Urban Challenge 2007 raceplanning on 4D lattice ()graph with anytime incremental search Anytime D*:Graph Search•Our research on graph search:41part of efforts by Tartanracing team from CMU for the Urban Challenge 2007 raceplanning on 4D ()graph with anytime incremental search Anytime D* (joint work with D. Ferguson):42Graph Search•Sometimes weighted A* can find a solution easily, sometimes it can not SstartSgoal43Graph Search•Sometimes weighted A* can find a solution easily, sometimes it can not•R*: randomized version of A* –generates dynamically these random points–computes a solution using easy-to-solvetransitions SstartSgoaleasy-to-solveeasy-solveeasy-to-solveeasy-hard-to-solve44Weighted A* with K distant random successors while (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: compute a path s → s’using weighted A*; set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;SstartSgoalOPEN = {sstart}next state to expand: sstartR* Search by Maxim Likhachev & Anthony Stentz45Weighted A* with K distant random successors while (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: compute a path s → s’using weighted A*; set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;SstartSgoalproblem dependent distance (e.g., h(s,s’) = Δ)OPEN = {sstart}next state to expand: sstart46Weighted A* with K distant random successors while (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: compute a path s → s’using weighted A*; set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;SstartSgoalOPEN = {}expanding sstartS2S1S3c(sstart→s1)c(sstart→s2)c(sstart→s3)47Weighted A* with K distant random successors while (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: compute a path s → s’using weighted A*; set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;SstartSgoalOPEN = {s1, s2, s3}next state to expand: s3S2S1S3c(sstart→s1)c(sstart→s2)c(sstart→s3)g(s3) = c(sstart→s3)g(s2) = c(sstart→s2)g(s1) = c(sstart→s1)48Weighted A* with K distant random successors while (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: compute a path s → s’using weighted A*; set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;SstartSgoalOPEN = {s1, s2}expanding s3S2S1S3hard-to-solveR* avoids solving it until REALLY necessaryS5S6S4hard-R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalOPEN = {s1, s2}AVOID = {}expanding s3S2S1S3S5S6S4R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalOPEN = {s1, s2, s6, s4}AVOID = {s5, sgoal}next state to expand: s2S2S1S3S5S6S4R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalOPEN = {s1, s6, s4, sgoal}AVOID = {s5}expanding s2S2S1S3S5S6S4S8S7S9R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalS2S1S3S5S6S4S8S7S9OPEN = {s1, s6, s4, s7, s8, s9, sgoal}AVOID = {s5}next state to expand: sgoalR* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalS2S1S3S5S6S4S8S7S9OPEN = {s1, s6, s4, s7, s8, s9}AVOID = {s5}expanding sgoalDONE!R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalS2S1S3S5S6S4S8S7S9OPEN = {s1, s6, s4, s7, s8, s9}AVOID = {s5}next state to expand:DONE!R* Algorithmwhile (sgoalnot expanded)remove swith smallest f(s) = g(s) + w h(s,sgoal)from OPEN,and if empty then from AVOID;ifs comes fromAVOID, then compute a path from its predecessor tos;generate Krandom successors of state sat distance Δ and sgoalif within Δ;for each generated successor s’: try to compute a path s → s’using weighted A*; if succeed and g(s’) ≤ w h(sstart,s’), set g(s’) =g(s) + c(s→s’)and insert s’into OPEN;otherwise, set g(s’) =g(s) + h(s,s’) and insert s’into AVOID;SstartSgoalS2S1S3S5S6S4S8S7S9OPEN = {s1, s6, s4, s7, s8, s9}AVOID = {s5}next state to expand:DONE!Graph Search with R*R* with w = 2R* with w = 1.5R* with w = 1Theoretical Properties of R*•Theorem 1: w-suboptimality in the limit of KSuppose, R* always generates all of the states that lie at distance Δ from s. Then cost(solution) ≤ w cost(optimal solution).•Theorem 2: w-suboptimality w.r.t. the generated tree ГEvery time R* expands a state s, the cost of the found path from sstartto s is no more than g(s) ≤ w cГ(sstart,s), where cГ(sstart,s) is the cost of an optimal path from sstartto s in Г.•Theorem 3: probabilistic w-suboptimality w.r.t. the original graphThe probability that a particular run of R* results in a path whose cost is no more than w2cost(optimal path) is bounded from below by the formula given in the paper (for uniform-cost tree-like domains).R* on Real Robot•planning in 8D ( for each foothold)•heuristic is Euclidean distance from the center of the body to the goal location•intermediate goals in R* are locations of the center of the body (2D)Outline60•Graph construction•Graph search•Dealing with UncertaintyDealing with Uncertainty•Common approach to dealing with uncertainty-assume no uncertainty, plan a path, re-plan during execution-fast deterministic planning-can make use of anytime/incremental/real-time graph searches61Dealing with Uncertainty•Common approach to dealing with uncertainty-assume no uncertainty, plan a path, re-plan during execution-fast deterministic planning-can make use of anytime/incremental/real-time graph searches-but making assumptions can be highly suboptimal62(e.g., ignoring uncertainty in actuation when moving along the cliff)Maxim Likhachev, University of PennsylvaniaMarkov Decision Processes•Dealingwithuncertaintyinexecution-assumingnouncertainty(deterministicgraph):-reasoningaboutpossibilityofslipping(MDP):S1S2S3S4S5S6S1S2S3S4S5S6convert into a graphS1S2S3S4S5S6S2S3S5convert into an MDPS4-an action has more than one outcome-each outcome is associated with probability and cost Maxim Likhachev, University of PennsylvaniaMarkov Decision Processes•Moregeneralexample:S2S1Sgoal2c(s1,a1,sgoal) = 2S4S331Sstart11a1P(sgoal|s1,a1)=0.9P(sgoal|s1,a1)=0.1c(s1,a1,s2) = 2•Robotneedstocomputeapolicyπ:mappingfromastateontoanaction•Optimalpolicyπ*:minimizestheexpectedcost-to-goalMaxim Likhachev, University of PennsylvaniaMarkov Decision ProcessesS2S1Sgoal2c(s1,a1,sgoal) = 2S4S331Sstart11a1P(sgoal|s1,a1)=0.9P(sgoal|s1,a1)=0.1c(s1,a1,s2) = 2•Optimalpolicyπ*:minimizestheexpectedcost-to-goalπ*=argminπE{cost-to-goal}expectation over outcomes•cost-to-goalforπ1=(sstart,s2,s4,s3,sgoal)is1+1+3+1=6•cost-to-goalforπ2=(trytogothroughs1)is:0.9*(1+2+2)+0.9*0.1*(1+2+2+2+2)+0.9*0.1*0.1*(1+2+2+2+2+2+2)+…Dealing with Uncertainty•Our Research on planning with Uncertainty is how to convert solving MDPs into a series of graph searches•How to guarantee convergence•What theoretical properties can be guaranteed•How to employ anytime/incremental graph searches to search these graphsConclusionsSearch-based planning is an nice overlap of graph algorithms (pure Computer Science) and RoboticsResearch challenges in search-based planning:How to construct a compact but rich enough graph representationHow to search this graph given real-time constraints and memory constraintsHow to deal with uncertainty in sensing, execution, environment