Research on the Bias Sampling RRT Algorithm for Supermarket Chain Distribution Routes under the O2O Model

Abstract: This article proposes the Bias Sampling RRT algorithm and applies it to optimizing supermarket chain distribution routes under the O2O model. For chain retail stores undergoing transformation and upgrading, the actual terrain along distribution routes directly affects whether goods ordered online are delivered offline on time. The Bias Sampling RRT algorithm, used as a path search method for time window vehicle routing problems, can find the optimal path that meets time window constraints. Its applicability and effectiveness are demonstrated through map simulations. The simulation results show that, compared with the RRT method, the Bias Sampling RRT algorithm yields a shorter distribution path and a shorter distribution time. The method is well suited to the distribution activities of chain supermarkets or single-store retail enterprises that operate in complex real terrain while undergoing transformation and upgrading.


Introduction
According to a survey of five major supermarket chains implementing the O2O e-commerce model, timely "last mile" delivery is a bottleneck. This problem can be summarized as a time window vehicle routing problem. The time window vehicle routing problem arises when multiple vehicles must be allocated to different destinations during logistics or distribution, and each destination has a specific time window that defines the time range in which a vehicle may arrive. The time window mainly concerns the time limit within which a product must travel from the production location to the final consumer. This time limit may be determined by various factors, such as product shelf life, market demand, transportation time, and cost. Transportation time is an important factor, and once the distribution destinations are specified, the choice of route directly affects it. To achieve the optimal distribution strategy and schedule, enterprises must ensure that the transportation time is not too long. Otherwise, if the product cannot reach consumers within the specified time, sales opportunities may be lost or additional costs and losses may be incurred.
Distribution distance refers to the distance a product travels during transportation and distribution from the production location to the final consumer. This process may involve multiple intermediate links, such as warehouses, distribution centers, and transportation vehicles. The impact of distribution distance on transportation time is obvious, but a more important and easily overlooked issue is that, for the same distribution distance, transportation time varies with the terrain. This is the focus of this article.
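The point that equal distances can yield unequal transportation times can be illustrated with a short calculation. The sketch below is only illustrative: the terrain categories, the speeds in `ASSUMED_SPEED_KMH`, and the function name `travel_time_minutes` are assumptions introduced here, not measurements from the surveyed supermarket chains.

```python
from typing import Iterable, Tuple

# Hypothetical average speeds (km/h) per terrain type; illustrative values only.
ASSUMED_SPEED_KMH = {"flat": 40.0, "hilly": 25.0, "congested": 15.0}


def travel_time_minutes(segments: Iterable[Tuple[float, str]]) -> float:
    """Sum the travel time over (length_km, terrain) segments of a route."""
    return sum(
        60.0 * length_km / ASSUMED_SPEED_KMH[terrain]
        for length_km, terrain in segments
    )


# Two routes of the same length (10 km) over different terrain.
flat_route = [(10.0, "flat")]
hilly_route = [(10.0, "hilly")]
print(travel_time_minutes(flat_route), travel_time_minutes(hilly_route))
```

Under these assumed speeds, the hilly route takes noticeably longer than the flat one even though both cover 10 km, which is why distance alone is a poor proxy for delivery time.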
In classic time window problems, the factors to consider include vehicle capacity, travel time, destination time windows, and delivery priority. The goal is to find an optimal routing scheme that delivers all goods within the specified time windows while minimizing vehicle travel distance and cost.
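The time window condition described above can be written as a simple feasibility check on one vehicle's route. The following is a minimal sketch under stated assumptions: the `Stop` record, the `route_schedule` function, and the rule that a vehicle arriving early waits until the window opens are illustrative choices, not part of the surveyed formulation, and vehicle capacity and priorities are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stop:
    name: str
    travel_time: float          # minutes from the previous stop (or the depot)
    earliest: float             # time window opens, minutes after departure
    latest: float               # time window closes
    service_time: float = 0.0   # minutes spent unloading at the stop


def route_schedule(stops: List[Stop], start_time: float = 0.0) -> Optional[List[float]]:
    """Return the service start time at each stop, or None if a window is missed.

    Assumed rule: a vehicle that arrives early waits for the window to open;
    arriving after the window closes makes the whole route infeasible.
    """
    clock = start_time
    schedule = []
    for stop in stops:
        arrival = clock + stop.travel_time
        if arrival > stop.latest:
            return None
        begin = max(arrival, stop.earliest)
        schedule.append(begin)
        clock = begin + stop.service_time
    return schedule


on_time = [Stop("A", 10, 0, 30, 5), Stop("B", 20, 40, 60, 5)]
too_slow = [Stop("A", 10, 0, 30, 5), Stop("B", 50, 40, 60, 5)]
print(route_schedule(on_time), route_schedule(too_slow))
```

In the second route only the travel time to stop B differs, for example because of harder terrain, and that alone is enough to violate the window.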
Heuristic algorithms or exact algorithms are commonly used to solve vehicle routing problems with time windows. Heuristic algorithms are usually based on greedy strategies: they select the next destination according to certain rules and adjust the choice according to the destination's time window. Exact algorithms enumerate the possible routing schemes, calculate the cost of each, and select the optimal one. The time window vehicle routing problem has a wide range of applications in logistics and distribution, such as express delivery, food distribution, and cargo distribution. Reasonable route planning can improve distribution efficiency, reduce transportation costs, and raise customer satisfaction.
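As an illustration of the greedy strategy mentioned above, the sketch below repeatedly visits the nearest customer whose time window can still be met. It is one simple instance of a greedy heuristic, not any specific published algorithm; the function name `greedy_route`, the straight-line travel model, the constant `speed`, and the wait-if-early rule are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple

# (x, y, window_open, window_close); a hypothetical customer record.
Customer = Tuple[float, float, float, float]


def greedy_route(depot: Tuple[float, float],
                 customers: Dict[str, Customer],
                 speed: float = 1.0) -> List[str]:
    """Visit, at each step, the nearest customer whose window can still be met.

    Customers whose window can no longer be met are left unserved.
    """
    position, clock = depot, 0.0
    remaining = dict(customers)
    route: List[str] = []
    while remaining:
        best_name, best_dist = None, math.inf
        for name, (x, y, _open, close) in remaining.items():
            dist = math.hypot(x - position[0], y - position[1])
            if clock + dist / speed <= close and dist < best_dist:
                best_name, best_dist = name, dist
        if best_name is None:
            break  # no remaining customer can be reached in time
        x, y, open_, _close = remaining.pop(best_name)
        clock = max(clock + best_dist / speed, open_)  # wait if early
        position = (x, y)
        route.append(best_name)
    return route


customers = {"a": (1, 0, 0, 10), "b": (5, 0, 0, 10), "c": (2, 0, 0, 1.5)}
print(greedy_route((0, 0), customers))
```

Customer "c" is never served in this example because its window closes before the vehicle can arrive, which shows how a purely distance-driven rule interacts with time windows.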
However, both heuristic algorithms and exact algorithms make simplifying assumptions that ignore variations in the actual terrain. As a result, vehicles can fall into local optima on the expected path, cannot leave them quickly, and fail to meet the destination's time window. To address this, this article proposes an improved RRT algorithm for vehicle routing problems with time windows.

Vehicle Routing Problem with Time Windows
The time window vehicle routing problem is a research hotspot that has attracted the attention of many scholars over the past few decades. Desaulniers and colleagues conducted detailed research and analysis on the application of time window constraints in vehicle routing and scheduling problems [1]. Solomon proposed an insertion-based heuristic algorithm for solving vehicle routing problems with time window constraints [2]. Subsequently, a heuristic algorithm based on tabu search was proposed to solve periodic and multi-depot vehicle routing problems [3]. In 2003, Bent proposed a two-stage algorithm based on hybrid local search for solving vehicle routing problems with time windows [4]. Since 2004, multiple works have provided detailed introductions to the research progress and challenges of vehicle routing problems with time windows [5,6]. These studies provide effective algorithms and methods for solving vehicle routing problems in practical applications.

RRT Algorithm
Path planning is one of the key technologies for autonomous navigation of unmanned devices. It refers to finding a collision-free path from the starting point to the target point in an environment with obstacle threats, according to certain evaluation criteria. Common path planning algorithms fall into two main categories: sampling-based methods (e.g., the rapidly exploring random tree) and search-based methods (e.g., A*). The search-based approach requires pre-processing the environmental information into a raster (grid), and its computational effort grows exponentially with the spatial dimension. The Rapidly Exploring Random Tree (RRT) algorithm is an incrementally growing random sampling algorithm proposed by LaValle in 1998 [7]. The algorithm has the following advantages: by sampling points, it can search the whole state space with probabilistic completeness; it expands quickly and searches the space efficiently; and it relies on collision detection to steer the path clear of obstacles, which avoids explicitly modelling the environment space. Because of these advantages, the RRT algorithm is widely used to solve complex path planning problems.
However, the RRT algorithm also has some drawbacks. Specifically, because the algorithm is highly random, the search is not targeted and generates many redundant points. In addition, as a purely random search algorithm it is insensitive to the type of environment: when the C-space contains a large number of obstacles or narrow channels, the algorithm converges slowly and its efficiency drops significantly. As a result, RRT can be computationally expensive and time-consuming, and can easily become trapped in dead zones.
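To make the sample, extend, and collision-check loop described above concrete, the following is a minimal sketch of a basic RRT in a 2D square map. It is not the implementation used in the experiments: circular obstacles, the helper `segment_is_free`, the parameter names (`bounds`, `step`, `max_iter`, `seed`), and the termination rule are all assumptions chosen to keep the example short.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Circle = Tuple[float, float, float]  # (center_x, center_y, radius), assumed obstacle model


def segment_is_free(a: Point, b: Point, obstacles: List[Circle]) -> bool:
    """True if the segment a-b stays strictly outside every circular obstacle."""
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    for cx, cy, r in obstacles:
        # Parameter of the point on the segment closest to the circle centre.
        t = 0.0 if length_sq == 0 else ((cx - ax) * dx + (cy - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
        if math.hypot(ax + t * dx - cx, ay + t * dy - cy) <= r:
            return False
    return True


def rrt(start: Point, goal: Point, obstacles: List[Circle],
        bounds: Tuple[float, float] = (0.0, 10.0), step: float = 1.0,
        max_iter: int = 5000, seed: int = 0) -> Optional[List[Point]]:
    """Grow a tree from start by pure random sampling; return a path or None."""
    rng = random.Random(seed)
    nodes: List[Point] = [start]
    parent: Dict[int, Optional[int]] = {0: None}
    for _ in range(max_iter):
        sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        # Nearest existing node to the random sample.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0:
            continue
        # Extend from the nearest node toward the sample by at most one step.
        scale = min(step, d) / d
        new = (near[0] + scale * (sample[0] - near[0]),
               near[1] + scale * (sample[1] - near[1]))
        if not segment_is_free(near, new, obstacles):
            continue  # collision: discard this sample
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        # Stop once the goal is within one step and the last edge is free.
        if math.dist(new, goal) <= step and segment_is_free(new, goal, obstacles):
            path, i = [goal], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


path = rrt((1.0, 1.0), (9.0, 9.0), [(5.0, 5.0, 1.5)])
print(len(path) if path else "no path found")
```

Because every sample is drawn uniformly from the whole map, many nodes grow away from the goal, which is the redundancy the drawbacks above refer to.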
Kuffner and LaValle proposed the RRT-Connect algorithm, an improved version of the RRT algorithm that performs single-query path planning more efficiently [8]. LaValle used the RRT-Connect algorithm to solve path planning problems under dynamic constraints [9]. In 2010, Karaman et al. proposed the RRT* algorithm, which achieved more efficient and optimized motion planning through incremental sampling and reconnection strategies [10]. In 2019, Li, Y., Sun, and others proposed the RRT*-Smart algorithm, which introduced an intelligent sampling strategy on top of the RRT* algorithm, further improving the efficiency and quality of path planning.
These studies provide effective algorithms and methods for path planning problems, and have been widely applied in fields such as robot navigation and autonomous driving.

RRT Algorithm and the Time Window Vehicle Routing Problem
The RRT algorithm and the time window vehicle routing problem are research objects from different fields, with both connections and differences. The connection is that both belong to path planning: each involves finding an optimal path, or a path satisfying specific constraints, in a given environment. The differences are as follows.
(1) Different research fields: the RRT algorithm is mainly applied in fields such as robot navigation and autonomous driving, aiming to find the optimal path for a robot or vehicle from the starting point to the target point. The time window vehicle routing problem is mainly applied in logistics and transportation, aiming to determine the routes and schedules of a group of vehicles so that customers' time window constraints are met.
(2) Different problem definitions: the RRT algorithm focuses on random exploration and search in a given environment to discover feasible paths. The time window vehicle routing problem involves optimal path planning for transporting goods or passengers from the origin to the destination within a predetermined time window.
(3) Different constraint conditions: RRT algorithms usually do not consider time window constraints and mainly focus on path connectivity, obstacle avoidance, and path length. The time window vehicle routing problem must consider the time window requirements of each customer to ensure delivery of goods or pick-up of passengers within the specified time frame.
Although the RRT algorithm and the time window vehicle routing problem differ in research field and problem definition, both are specific applications of path planning. In practical applications, the RRT algorithm can be used as a path search method for time window vehicle routing problems to find the optimal path that satisfies time window constraints.
To address the aforementioned shortcomings of RRT, a biasing mechanism was first added to the original RRT algorithm, replacing pure random sampling with 95% random sampling plus 5% biased sampling; this achieved promising results in our simulation experiment. Building on this idea, the bias sampling method was further improved by increasing the weight of the heuristic information provided by the target point in the sampling process. At the same time, an escape mechanism is added to increase the chance of jumping out of a local optimum, so that over-reliance on heuristic information does not leave the path trapped there. In addition, by performing collision detection on the line between the current point and the target point, sampling points that connect directly to the target point can be detected in advance, which speeds up the convergence of the algorithm. Case studies on maps with typical characteristics demonstrate that our approach significantly reduces path-finding time while generating paths of similar length, which indicates that simple heuristic information is effective for optimizing the RRT algorithm.

Method
A. Early Detection of Direct Connection Points
After each successful node generation and before proceeding to the next sampling, the algorithm determines whether the last successfully generated node can be connected directly to the target endpoint. Collision detection is performed on the line segment between this node and the endpoint. If the segment is determined to be safe, the node is considered directly connected and the endpoint is used as the next sampling point. If the distance between the node and the endpoint is greater than the expand distance, the node is extended by one step towards the endpoint and a new node is generated; otherwise, the endpoint is added to the path and the successfully generated path is returned.
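The direct-connection test described above can be sketched as follows. This is an illustrative reading of the step, not the code used in the experiments: circular obstacles, the helper `segment_is_free`, the function name `direct_connection_step`, and the tuple return values are assumptions made for the example.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Circle = Tuple[float, float, float]  # (center_x, center_y, radius), assumed obstacle model


def segment_is_free(a: Point, b: Point, obstacles: List[Circle]) -> bool:
    """True if the segment a-b stays strictly outside every circular obstacle."""
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    for cx, cy, r in obstacles:
        t = 0.0 if length_sq == 0 else ((cx - ax) * dx + (cy - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
        if math.hypot(ax + t * dx - cx, ay + t * dy - cy) <= r:
            return False
    return True


def direct_connection_step(node: Point, goal: Point, obstacles: List[Circle],
                           expand_dist: float) -> Optional[Tuple[str, Point]]:
    """Check whether `node` sees the goal and, if so, move toward it.

    Returns None when the straight line to the goal is blocked,
    ("extend", new_node) when the goal is visible but farther than one step,
    and ("reached", goal) when the goal can be added to the path directly.
    """
    if not segment_is_free(node, goal, obstacles):
        return None  # fall back to ordinary sampling
    d = math.dist(node, goal)
    if d > expand_dist:
        scale = expand_dist / d
        new_node = (node[0] + scale * (goal[0] - node[0]),
                    node[1] + scale * (goal[1] - node[1]))
        return ("extend", new_node)
    return ("reached", goal)


print(direct_connection_step((0.0, 0.0), (10.0, 0.0), [], 2.0))
print(direct_connection_step((0.0, 0.0), (10.0, 0.0), [(5.0, 0.0, 1.0)], 2.0))
```

Once the line to the goal is clear, every later call extends straight along it, so no further random samples are needed for the final stretch.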
B. Bias Sampling
To make better use of the heuristic information provided by the target endpoint, the probability of bias sampling is increased to 95%, with a 5% probability of random sampling retained, instead of using the original bias sampling strategy. Specifically, the relative position of the generated node and the target endpoint, recorded after the previous sampling, is used during bias sampling to determine the extent of the area to be sampled next, so that the sampled point is closer to the target point with a higher probability. Taking Fig. 1 as an example, the next sampling range will be limited to section A.
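A possible reading of this sampling rule is sketched below. Since Fig. 1 is not reproduced here, the exact shape of section A is an assumption: the sketch takes it to be the rectangle between the last generated node and the map boundary on the side facing the target endpoint. The function name `sample_point` and its parameters are likewise hypothetical.

```python
import random
from typing import Tuple

Point = Tuple[float, float]


def sample_point(last_node: Point, goal: Point, bounds: Tuple[float, float],
                 rng: random.Random, bias_prob: float = 0.95) -> Point:
    """Draw the next sample: biased with probability bias_prob, else uniform.

    Assumed biased region: on each axis, only the side of `last_node`
    that faces the goal is sampled (a stand-in for section A in Fig. 1).
    """
    lo, hi = bounds
    if rng.random() >= bias_prob:
        # Plain random sample over the whole square map.
        return (rng.uniform(lo, hi), rng.uniform(lo, hi))
    x_range = (last_node[0], hi) if goal[0] >= last_node[0] else (lo, last_node[0])
    y_range = (last_node[1], hi) if goal[1] >= last_node[1] else (lo, last_node[1])
    return (rng.uniform(*x_range), rng.uniform(*y_range))


rng = random.Random(0)
print(sample_point((3.0, 4.0), (9.0, 9.0), (0.0, 10.0), rng))
```

Setting `bias_prob` to 0.05 would give a weakly biased baseline, while 0.95 matches the strongly biased setting described above.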

C. Escape Mechanism
Increasing the probability of biased sampling means that there may be an over-reliance on the heuristic information provided by the endpoint, which may cause the algorithm to fall into a local optimum. In addition, in maps with many obstacles, or where obstacles are present in multiple directions over a short distance (such as a groove structure), large bias sampling weights may waste considerable computational resources, and the algorithm may fail to converge within the maximum number of iterations. Therefore, a suitable escape mechanism is necessary to take advantage of the stochastic nature of the RRT algorithm and its powerful spatial search capability. Specifically, whenever bias sampling has failed to generate a new node a cumulative three times, the escape mechanism is activated. In the escape mechanism, the location of a new "pseudo-endpoint" that provides the heuristic information is determined first. The rules are as follows. The last successfully generated node is defined as the previous node. As shown in Fig. 2, the pseudo-endpoint is generated with 40% probability at the position horizontally symmetric to the true endpoint with respect to the previous node, with 40% probability at the position vertically symmetric to the true endpoint with respect to the previous node, and with 20% probability at the position centrosymmetric to the true endpoint with respect to the previous node. The region to be sampled next is then determined from the relative position information provided by the pseudo-endpoint, using a heuristic similar to the one described above. Sampling continues until a new node is successfully generated.
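The pseudo-endpoint rule can be sketched as below. Fig. 2 is not reproduced here, so the geometric interpretation is an assumption: "horizontally symmetric" is taken to mean mirroring the endpoint's x coordinate about the previous node, "vertically symmetric" mirroring its y coordinate, and "centrosymmetric" mirroring both. The names `pseudo_endpoint`, `escape_triggered`, and `FAILURES_BEFORE_ESCAPE` are hypothetical.

```python
import random
from typing import Tuple

Point = Tuple[float, float]

# From the text: escape after three cumulative failed biased samples.
FAILURES_BEFORE_ESCAPE = 3


def escape_triggered(failed_bias_samples: int) -> bool:
    """True once biased sampling has failed often enough to trigger escape."""
    return failed_bias_samples >= FAILURES_BEFORE_ESCAPE


def pseudo_endpoint(prev: Point, goal: Point, rng: random.Random) -> Point:
    """Mirror the true endpoint about the previous node (assumed geometry).

    40%: mirror x only, 40%: mirror y only, 20%: mirror both (point reflection).
    """
    px, py = prev
    gx, gy = goal
    u = rng.random()
    if u < 0.4:
        return (2 * px - gx, gy)        # horizontal mirror
    if u < 0.8:
        return (gx, 2 * py - gy)        # vertical mirror
    return (2 * px - gx, 2 * py - gy)   # central symmetry


rng = random.Random(0)
if escape_triggered(3):
    print(pseudo_endpoint((5.0, 5.0), (8.0, 9.0), rng))
```

A mirrored point may fall outside the map, so an implementation along these lines would presumably clamp the resulting sampling region to the map bounds; that detail is not specified in the text.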

Results
The proposed improved algorithm was tested 10 times on each of the four designed maps with typical characteristics shown in Fig. 3. The average generated path length and the average time taken to successfully find a path were calculated, and the results are shown in Fig. 4 and Fig. 5. It is worth noting that when the original RRT algorithm is used to find a path on Map 2, there is a 60% chance that it exceeds its maximum number of iterations and fails to find a path. On Maps 3 and 4, this figure is 30% and 20% respectively. In contrast, the proposed algorithm finds paths in a much shorter time with a success rate close to 100%, demonstrating the superiority of the proposed improvement strategy.

Conclusion
The experimental results show that the proposed biased sampling RRT algorithm, which uses information provided by the endpoint, can significantly reduce the time required to find a path while generating paths of similar length. In addition, the proposed escape mechanism increases the probability of successfully finding a path and helps avoid local optima. The simulation maps represent different actual distribution paths, and the calculation results show that, compared with the RRT method, the distribution path can be shorter and the distribution time can be reduced. The method is well suited to the distribution activities of chain supermarkets or single-store retail enterprises that operate in complex terrain while undergoing transformation and upgrading.
The limitation of this article is that the algorithm was not simulated on real distribution maps. Future research will draw real maps, identify map samples with terrain differences, run our algorithm on them, and obtain comparative results, in order to further optimize the algorithm and accelerate its application in practical distribution work.

Fig 3. Four designed maps: 1) map provided by PythonRobotics; 2) map with a groove structure; 3) map with many (24) obstacles; 4) map with a narrow channel

Fig 4. Average path length generated by the proposed algorithm and original RRT