Improved efficiency versus iterating over two large Pandas Dataframes





I have two HUGE Pandas dataframes with location-based values, and I need to update df1['count'] with the number of records from df2 that are less than 1000 m (1 km) from each point in df1.



Here's an example of my data, imported into Pandas



df1 =       lat    long  valA  count
      0 123.456  986.54     1      0
      1 223.456  886.54     2      0
      2 323.456  786.54     3      0
      3 423.456  686.54     2      0
      4 523.456  586.54     1      0

df2 =       lat    long  valB
      0 123.456  986.54     1
      1 223.456  886.54     2
      2 323.456  786.54     3
      3 423.456  686.54     2
      4 523.456  586.54     1


In reality, df1 has ~10 million rows and df2 has ~1 million.



I created a working nested for loop using the Pandas DataFrame.itertuples() method. It works fine for smaller test data sets (df1 = 1k rows and df2 = 100 rows takes about an hour to complete), but the full data set is orders of magnitude larger and would take years to complete based on my calculations. Here's my working code:



import pandas as pd
import geopy.distance as gpd

file1 = r'C:\path\file1.csv'
file2 = r'C:\path\file2.csv'

df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)

df1.sort_values(['long', 'lat'], inplace=True)
df2.sort_values(['long', 'lat'], inplace=True)

for irow in df1.itertuples():
    count = 0
    indexLst = []
    Location1 = (irow[1], irow[2])

    for jrow in df2.itertuples():
        Location2 = (jrow[1], jrow[2])
        if gpd.distance(Location1, Location2).kilometers < 1:
            count += 1
            indexLst.append(jrow[0])
    if count > 0:  # only update df1 if a match is found
        df1.at[irow[0], 'count'] = count
        df2.drop(indexLst, inplace=True)  # drop rows already counted from df2 to speed up the next iteration

# save updated df1 to a new csv file
outFileName = 'combined.csv'
df1.to_csv(outFileName, sep=',', index=False)


Each point in df2 needs to be counted only once, since the points in df1 are evenly spaced. To that end, I added a drop statement to remove rows from df2 once they have been counted, in hopes of speeding up each subsequent iteration. I also tried a merge/join initially, instead of the nested loops, but was unsuccessful.



At this stage, any help on improving efficiency here is greatly appreciated!



Edit:
The goal is to update the 'count' column in df1 (as shown below) with the number of points from df2 that are <1 km away, and output the result to a new file.



df1 =       lat    long  valA  count
      0 123.456  986.54     1      3
      1 223.456  886.54     2      1
      2 323.456  786.54     3      9
      3 423.456  686.54     2      2
      4 523.456  586.54     1      5









python pandas performance loops dataframe

asked Nov 26 '18 at 21:05 – dP8884 (edited Nov 27 '18 at 18:09)

  • Welcome @dP8884. To add clarity to the question: am I understanding correctly that the intent of this code is to take a pair of lat/longs from df1 and count the number of lat/long points in df2 that are less than 1 km away? So in the end you would have the lat/longs in df1 with an updated count column holding the number of points found in df2 less than 1 km away?

    – jtweeder
    Nov 26 '18 at 21:15











  • Yes, that is correct. I'll update my question to reflect what the expected output should look like. Thank you.

    – dP8884
    Nov 26 '18 at 21:28








  • There should be a way to do some sort of filtering based on lat/long combos you know are out of range (i.e., where the lat or long are more than a degree apart), but I don't know the best way to do that in your case.

    – Eliot K
    Nov 26 '18 at 23:13


















2 Answers














Having done this kind of thing frequently, I've found a couple of best practices:



1) Try to use numpy and numba as much as possible



2) Try to leverage parallelization as much as possible



3) Skip loops for vectorized code (we use a loop with numba here to leverage parallelization).



In this particular case, I want to point out the slowdown introduced by geopy. While it is a great package and produces more accurate distances than the Haversine method, it is much slower (I have not looked at the implementation to see why).



import numpy as np
from geopy import distance

origin = (np.random.uniform(-90,90), np.random.uniform(-180,180))
dest = (np.random.uniform(-90,90), np.random.uniform(-180,180))

%timeit distance.distance(origin, dest)


216 µs ± 363 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)



At that rate, computing 10 million × 1 million distances would take approximately 2,160,000,000 seconds, or about 600,000 hours. Even parallelism will only help so much.
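For reference, a quick back-of-the-envelope check of that estimate (a sketch using the 216 µs per call measured above):

pairs = 10_000_000 * 1_000_000   # 1e13 pairwise distance computations
seconds = pairs * 216e-6         # ~216 µs per geopy distance call
print(seconds)                   # ~2.16e9 seconds
print(seconds / 3600)            # ~600,000 hours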



Because you're only interested in points that are very close, I would suggest using the Haversine distance (which is less accurate at greater distances).



from numba import jit, prange, vectorize

@vectorize
def haversine(s_lat, s_lng, e_lat, e_lng):

    # approximate radius of earth in km
    R = 6373.0

    s_lat = s_lat * np.pi / 180.0
    s_lng = np.deg2rad(s_lng)
    e_lat = np.deg2rad(e_lat)
    e_lng = np.deg2rad(e_lng)

    d = np.sin((e_lat - s_lat)/2)**2 + np.cos(s_lat)*np.cos(e_lat) * np.sin((e_lng - s_lng)/2)**2

    return 2 * R * np.arcsin(np.sqrt(d))

%timeit haversine(origin[0], origin[0], dest[1], dest[1])


1.85 µs ± 53.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



That's already a 100-fold improvement. But we can do better. You may have noticed the @vectorize decorator I added from numba. This allows the previously scalar Haversine function to become vectorized and take vectors as inputs. We'll leverage this in the next step:



@jit(nopython=True, parallel=True)
def get_nearby_count(coords, coords2, max_dist):
    '''
    Input: `coords`: List of coordinates, lat-lngs in an n x 2 array
           `coords2`: Second list of coordinates, lat-lngs in a k x 2 array
           `max_dist`: Max distance to be considered nearby
    Output: Array of length n with a count of coords nearby coords2
    '''
    # initialize
    n = coords.shape[0]
    k = coords2.shape[0]
    output = np.zeros(n)

    # prange is a parallel loop when operations are independent
    for i in prange(n):
        # comparing a point in coords to the arrays in coords2
        x, y = coords[i]
        # returns an array of length k
        dist = haversine(x, y, coords2[:,0], coords2[:,1])
        # sum the boolean of distances less than the max allowable
        output[i] = np.sum(dist < max_dist)

    return output


Hopefully, you'll now have an array equal to the length of the first set of coordinates (10 million in your case). You can then just assign this to your data frame as your count!
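As a minimal sketch of that last step (assuming the 'lat'/'long' column names and the 1 km threshold from the question, and df1/df2 already loaded), the wiring could look roughly like this:

coords1 = df1[['lat', 'long']].to_numpy()
coords2 = df2[['lat', 'long']].to_numpy()

df1['count'] = get_nearby_count(coords1, coords2, 1.0)  # 1.0 km threshold
df1.to_csv('combined.csv', index=False)                 # same output file as in the question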



Test time 100,000 x 10,000:



n = 100_000
k = 10_000

coords1 = np.zeros((n, 2))
coords2 = np.zeros((k, 2))

coords1[:,0] = np.random.uniform(-90, 90, n)
coords1[:,1] = np.random.uniform(-180, 180, n)
coords2[:,0] = np.random.uniform(-90, 90, k)
coords2[:,1] = np.random.uniform(-180, 180, k)

%timeit get_nearby_count(coords1, coords2, 1.0)


2.45 s ± 73.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)



Unfortunately, that still means you'll be looking at something around 20,000+ seconds. And this was on a machine with 80 cores (utilizing 76ish, based on top usage).



That's the best I can do for now, good luck (also, first post, thanks for inspiring me to contribute!)



PS: You might also look into Dask arrays and the map_blocks() function to parallelize this step (instead of relying on prange). How you partition the data may influence total execution time.
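If you try that route, a rough, untested sketch of the map_blocks() idea might look like the following; the chunk size, the drop_axis handling, and how Dask's scheduler interacts with numba's prange are all assumptions you would need to tune:

import dask.array as da

# assumes coords1 (n x 2) and coords2 (k x 2) arrays as above
coords1_da = da.from_array(coords1, chunks=(1_000_000, 2))  # chunk size is a guess

counts = da.map_blocks(
    lambda block: get_nearby_count(block, coords2, 1.0),
    coords1_da,
    drop_axis=1,   # each (m, 2) block maps to a length-m vector of counts
    dtype=float,
)

result = counts.compute()  # runs the blocks in parallel, then gathers the result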



PPS: 1,000,000 x 100,000 (100x smaller than your full set) took 3 min 27 s (207 seconds), so the scaling appears to be linear and a bit forgiving.



PPPS: Implemented with a simple latitude difference filter:



@jit(nopython=True, parallel=True)
def get_nearby_count_vlat(coords, coords2, max_dist):
    '''
    Input: `coords`: List of coordinates, lat-lngs in an n x 2 array
           `coords2`: List of port coordinates, lat-lngs in a k x 2 array
           `max_dist`: Max distance to be considered nearby
    Output: Array of length n with a count of coords nearby coords2
    '''
    # initialize
    n = coords.shape[0]
    k = coords2.shape[0]
    coords2_abs = np.abs(coords2)
    output = np.zeros(n)

    # prange is a parallel loop when operations are independent
    for i in prange(n):
        # comparing a point in coords to the arrays in coords2
        point = coords[i]
        # subsetting coords2 to reduce haversine calc time. Value .02 is from
        # playing with Gmaps and will need to change for max_dist > 1.0
        coords2_filtered = coords2[np.abs(point[0] - coords2[:,0]) < .02]
        # in case of no matches
        if coords2_filtered.shape[0] == 0: continue
        # returns an array of length k
        dist = haversine(point[0], point[1], coords2_filtered[:,0], coords2_filtered[:,1])
        # sum the boolean of distances less than the max allowable
        output[i] = np.sum(dist < max_dist)

    return output





answered Nov 26 '18 at 23:15, edited Dec 6 '18 at 15:40 – ernestk

  • Thanks! This is excellent and well explained. Let me digest it a bit more and implement it on a sample of my data and I'll let you know how it turns out. Also, this is my first post as well, so thanks for the feedback as it seems to have given me just enough privilege to begin up-voting posts (yours first of course). :)

    – dP8884
    Nov 27 '18 at 17:59













  • Eliot K brought up a good point of making this faster by reducing the search space, but geo-coords give me a headache. I think I found a quick way to filter the results, but on latitude only. I tested it quickly on my laptop (quad core). My original method took 70 secs (100k coords x 50k coords) while the quick latitude distance filter took it down to 2.27 secs. That's with completely random coordinates. You can definitely improve filtering, especially with a sorted df2. I added the change to the code above. Filtering on longitude doesn't seem to be worth it (costly compared to Haversine)

    – ernestk
    Nov 27 '18 at 18:58













  • Thanks to both ernestk and Eliot K. This seems to have gotten my process down from ~90 years to <1 day (running on a VM with Xeon 12 core @2.1GHz).

    – dP8884
    Nov 28 '18 at 0:12

































I've done something similar lately, though not with lat/lon, and I only had to find the nearest point and its distance. For this, I used the scipy.spatial.cKDTree package. It was quite fast.



I think that in your case, you could use the query_ball_point() function.



from scipy import spatial
import pandas as pd

file1 = r'C:\path\file1.csv'
file2 = r'C:\path\file2.csv'

df1 = pd.read_csv(file1)
df2 = pd.read_csv(file2)
# Build the index
tree = spatial.cKDTree(df1[['long', 'lat']])
# Then query the index with query_ball_point()

You should give it a try.
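To make that concrete, here is a minimal, illustrative sketch of the counting step with query_ball_point(). It assumes the 'lat'/'long' column names from the question, builds the tree on df2 (the points being counted), and approximates the 1 km radius with a degrees-of-latitude conversion, so treat the radius handling as a rough assumption rather than an exact geodesic 1 km:

from scipy import spatial

# assumes df1 and df2 are loaded as in the question
tree = spatial.cKDTree(df2[['lat', 'long']].to_numpy())

# ~1 km expressed in degrees of latitude; degrees of longitude cover fewer km
# away from the equator, so in practice you would widen this radius and then
# refine the matches with an exact distance check (e.g. haversine)
radius_deg = 1.0 / 111.0

neighbors = tree.query_ball_point(df1[['lat', 'long']].to_numpy(), r=radius_deg)
df1['count'] = [len(idx) for idx in neighbors]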






answered Nov 27 '18 at 19:49 – Akarius
