Detect GPS spikes in location data

I have a dataset from the GPS log of my Google Account, and I'd like to remove outliers from the CSV that clearly should not be there.

For example, the GPS shows you at 1,1 > 1,2 > 9,6 > 1,2 > 1,1: a large jump in location that, a few seconds later, returns to approximately where it was before.

I have already tried filtering by GPS velocity, but that could also remove points that were recorded while flying. It also failed when the GPS was normal, then jumped up to 500 km away a little later, stayed there for 10 minutes, and then corrected itself, because the velocity between the points at the wrong location is low enough to pass the “speed test”.

How would I detect these in a dataset of around 430k rows? Cases such as traveling in a plane with very infrequent GPS updates would have to be handled as well.

Answer

I have settled on a hybrid solution.

  • Velocity limit: I used the distance function of the geopy module to compute the distance between two GPS points. From that distance and the timestamps in the CSV I then calculated the velocity between the two points. If it is over a certain threshold (in meters per second here, adjustable to your needs), the point is not written to the output CSV.

Code:

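from datetime import datetime
from geopy import distance

# coords_1 and coords_2 are (latitude, longitude) tuples of the previous and current row
d1 = distance.distance(coords_1, coords_2).m  # Distance in meters

FMT = "%Y-%m-%d %H:%M:%S"  # Formatting so it matches the CSV
Time = (datetime.strptime(cur_line["Time"], FMT)
        - datetime.strptime(pre_line["Time"], FMT)).total_seconds()
# Identical timestamps would divide by zero; such rows are treated as too fast here
Velocity = d1 / Time if Time > 0 else float("inf")
if Velocity < 800:  # m/s, set this to your needs
    pass  # DO Stuff, e.g. write cur_line to the output CSV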
  • Law of Cosines: calculate the angle formed by three consecutive points; if the angle at the middle point is too narrow, remove that point.

Code:

from math import degrees
from geopy import distance
from trianglesolver import solve

# Side lengths in meters of the triangle formed by three consecutive points
d1 = distance.distance(coords_1, coords_2).m
d2 = distance.distance(coords_2, coords_3).m
d3 = distance.distance(coords_3, coords_1).m

degTresh = 30.0
spike = False  # default when a side is (almost) zero and no angle can be computed
if d1 > 0.01 and d2 > 0.01 and d3 > 0.01: # if they are 0, there will be an error
    a,b,c,A,B,C = solve(a=d1, b=d2, c=d3) # Calculate the angles from the sides
    A,B,C = degrees(A), degrees(B), degrees(C) # Convert radians to degrees
    # C is opposite side c = d3, so it is the angle at the middle point (coords_2).
    # An angle of a triangle never exceeds 180 degrees, so only the lower bound matters.
    spike = C < degTresh
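
The angle test above comes down to the law of cosines, so it can also be computed with the standard library alone. The following is only a sketch, not the code from the answer: the function name `angle_at_middle` and its parameters are hypothetical, and the three side lengths are assumed to share one unit (for example the meters obtained from geopy). Clamping the cosine guards against rounding errors when the three points are almost collinear.

```python
from math import acos, degrees


def angle_at_middle(d12, d23, d13):
    """Angle in degrees at point 2 of the triangle 1-2-3.

    d12, d23 and d13 are the side lengths between the numbered points.
    """
    if d12 <= 0 or d23 <= 0:
        raise ValueError("the two sides meeting at the middle point must be positive")
    # Law of cosines: d13^2 = d12^2 + d23^2 - 2 * d12 * d23 * cos(angle)
    cos_angle = (d12 ** 2 + d23 ** 2 - d13 ** 2) / (2 * d12 * d23)
    # Rounding can push the value slightly outside [-1, 1]; clamp before acos
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return degrees(acos(cos_angle))
```

A point that jumps far away and comes straight back gives a small angle (long d12 and d23, short d13), while movement along a roughly straight path gives an angle close to 180 degrees.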

These two methods combined worked fairly well and, most of the time, even remove small GPS spikes that occur when standing still.
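
As a rough illustration of how the two checks might be combined in one pass, here is a minimal sketch under several assumptions. It swaps geopy for a plain haversine distance so that it runs with the standard library only; the column names `Lat` and `Lon` and the function names are hypothetical (only `Time` and its format come from the code above); each point is compared against the last kept point, which is a design choice and not something the answer specifies.

```python
from datetime import datetime
from math import radians, degrees, sin, cos, asin, acos, sqrt

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp format used in the answer
MAX_SPEED = 800.0          # m/s, same threshold as in the answer
MIN_ANGLE = 30.0           # degrees, same threshold as in the answer
MIN_SIDE = 0.01            # meters; shorter sides are treated as "no movement"


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def is_spike(prev, cur, nxt):
    """True if cur fails the speed test or forms a narrow angle with its neighbours."""
    a, b, c = ((r["Lat"], r["Lon"]) for r in (prev, cur, nxt))
    d1, d2, d3 = haversine_m(a, b), haversine_m(b, c), haversine_m(c, a)
    dt = (datetime.strptime(cur["Time"], FMT)
          - datetime.strptime(prev["Time"], FMT)).total_seconds()
    # Velocity limit (skipped when the timestamps are identical)
    if dt > 0 and d1 / dt >= MAX_SPEED:
        return True
    # No angle can be computed if cur coincides with one of its neighbours
    if min(d1, d2) <= MIN_SIDE:
        return False
    # Law of cosines for the angle at cur, clamped against rounding errors
    cos_c = (d1 * d1 + d2 * d2 - d3 * d3) / (2 * d1 * d2)
    angle = degrees(acos(max(-1.0, min(1.0, cos_c))))
    return angle < MIN_ANGLE


def remove_spikes(rows):
    """Return rows without spikes; the first and last rows are kept unchecked."""
    kept = rows[:1]
    for i in range(1, len(rows) - 1):
        if not is_spike(kept[-1], rows[i], rows[i + 1]):
            kept.append(rows[i])
    if len(rows) > 1:
        kept.append(rows[-1])
    return kept
```

With rows read via `csv.DictReader`, the `Lat` and `Lon` values would need converting to `float` first. Note that a single pass like this is not designed for the case from the question where the GPS stays at the wrong location for several minutes.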

User contributions licensed under: CC BY-SA