NEO4J SEARCH: A* OPTIMIZATION
06 MARCH 2018
Written in 2026, backdated to 2018.
First real performance problem. Vessel tracking with Neo4J hit a wall. Need to find shortest paths in a 13,000-vertex graph. Dijkstra’s search, which Neo4J ships with, slowed to a crawl after 4,000.
Spent better part of the week replacing Dijkstra’s algorithm with A* search—Python shop, junior developer, took more than a day to set up toolchain and build the project. Haversine function as heuristic uses distance between ports as the crow flies to steer search:
private double computeHeuristic(
final double lat1, final double lon1,
final double lat2, final double lon2) {
final int earthRadius = 6371;
final double kmToNM = 0.539957;
final double latDistance = Math.toRadians(lat2 - lat1);
final double lonDistance = Math.toRadians(lon2 - lon1);
final double a = Math.sin(latDistance / 2)
* Math.sin(latDistance / 2)
+ Math.cos(Math.toRadians(lat1))
* Math.cos(Math.toRadians(lat2))
* Math.sin(lonDistance / 2)
* Math.sin(lonDistance / 2);
final double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
return earthRadius * c * kmToNM;
}
Core search loop updates costs when better path found:
private void updateCosts(
final int source, final int target,
final double newCost, final double heuristic) {
final double oldCost = gCosts.getOrDefault(target, Double.MAX_VALUE);
if (newCost < oldCost) {
gCosts.put(target, newCost);
fCosts.put(target, newCost + heuristic);
path.put(target, source);
}
}
Verdict: 300x speedup—scaled to 13,000 vertices.
Upstreamed changes: Neo4J v3.4.0 | Full source.
NOTE: Despite impressive gain, performance horizon visible. Unlikely to scale past 16,000 vertices without caching.