Extracting an adjacency matrix containing haversine distance from points on map

I am extracting 10 lat/long points from Google Maps and placing them into a text file. The program should read the text file, calculate the haversine distance between each pair of points, and store the results in an adjacency matrix. The adjacency matrix will eventually be fed to a 2-opt algorithm, which is outside the scope of the code I am about to present.



The following code is functional, but extremely inefficient. If I had 1000 points instead of 10, the adjacency matrix would need 1000 x 1000 iterations to be filled. Can this be optimized?



import csv
from haversine import haversine
import matplotlib.pyplot as plt
import numpy as np


def read_two_column_file(file_name):
    with open(file_name, 'r') as f_input:
        csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True)
        long = []
        lat = []
        for col in csv_input:
            x = float(col[0])  # converting to float
            y = float(col[1])
            long.append(x)
            lat.append(y)

    return long, lat


def display_points(long, lat):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(long, lat)
    plt.show()


def main():
    long, lat = read_two_column_file('latlong.txt')

    points = []
    for i in range(len(lat)):
        coords = tuple([lat[i], long[i]])  # converting to tuple to be able to perform haversine calc.
        points.append(coords)

    hav = []
    for i in range(len(lat)):
        for j in range(len(long)):
            hav.append(haversine(points[i], points[j]))

    np.asarray(hav)
    adj_matrix = np.reshape(hav, (10, 10))  # reshaping to 10 x 10 matrix
    print(adj_matrix)

    display_points(long, lat)


main()


Sample Input:



35.905333, 14.471970
35.896389, 14.477780
35.901281, 14.518173
35.860491, 14.572245
35.807607, 14.535320
35.832267, 14.455894
35.882414, 14.373217
35.983794, 14.336096
35.974463, 14.351006
35.930951, 14.401137


Sample Output:



[[ 0.          1.15959635  5.15603243 12.15003864 12.66090817  8.06760374
11.25481465 17.31108648 15.37358741 8.34541481]
[ 1.15959635 0. 4.52227294 11.19223786 11.50131214 7.32033758
11.72388583 18.35259685 16.41378987 9.29953014]
[ 5.15603243 4.52227294 0. 7.44480948 10.26177912 10.15688933
16.24592213 22.1101544 20.18967731 13.40020548]
[12.15003864 11.19223786 7.44480948 0. 7.01813758 13.28961044
22.25645098 29.42422794 27.49154954 20.48281039]
[12.66090817 11.50131214 10.26177912 7.01813758 0. 9.22215871
19.74293886 29.16680205 27.25540014 19.97465594]
[ 8.06760374 7.32033758 10.15688933 13.28961044 9.22215871 0.
10.66219491 21.06632671 19.24994647 12.24773666]
[11.25481465 11.72388583 16.24592213 22.25645098 19.74293886 10.66219491
0. 11.67502344 10.21846781 6.08016463]
[17.31108648 18.35259685 22.1101544 29.42422794 29.16680205 21.06632671
11.67502344 0. 1.93885474 9.20353461]
[15.37358741 16.41378987 20.18967731 27.49154954 27.25540014 19.24994647
10.21846781 1.93885474 0. 7.28280909]
[ 8.34541481 9.29953014 13.40020548 20.48281039 19.97465594 12.24773666
6.08016463 9.20353461 7.28280909 0. ]]


Plot:

[image not included: scatter plot produced by display_points]

python performance beginner coordinate-system

edited Nov 13 '18 at 9:22
asked Nov 12 '18 at 7:25 by Rrz0

  • Welcome to Code Review and congratulations on writing a decent question on your first try.
    – Mast
    Nov 12 '18 at 9:10

  • Thanks for the feedback!
    – Rrz0
    Nov 12 '18 at 11:09

  • This is a great first question, I hope you get some good feedback.
    – esote
    Nov 12 '18 at 14:54

  • If you want your code to be error prone, you actually want it so it is easy to have bugs. You maybe meant error proof?
    – Graipher
    Nov 12 '18 at 15:55

  • Well spotted @Graipher! Seems like I want suggestions on how to actually include bugs in my code. Thanks for pointing out, will edit.
    – Rrz0
    Nov 12 '18 at 16:11

1 Answer

More than half of your code is being used to convert from one data format to another (from two lat and long lists to tuples, and then from a list to an array).



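The easiest version to understand would be to use numpy.loadtxt:



def read_two_column_file(file_name):
    # Recent NumPy versions require a single-character delimiter;
    # the space after each comma is ignored when the floats are parsed.
    return np.loadtxt(file_name, delimiter=",")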


The result is a 2D numpy.array. However, numpy.loadtxt is actually a lot slower than it could be, so you could also use pandas.read_csv instead:



import pandas as pd

def read_two_column_file(file_name):
    return pd.read_csv(file_name, header=None).values


Which one is faster depends on the size of your file.
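
A quick way to check a loader like this without touching the file system is to feed it an in-memory buffer. The following is only a sketch, assuming the "value, value" layout of the sample input; the variable names are illustrative and not part of the original code.

```python
import io

import numpy as np

# First three rows of the sample input, in the same "value, value" layout.
sample = io.StringIO(
    "35.905333, 14.471970\n"
    "35.896389, 14.477780\n"
    "35.901281, 14.518173\n"
)

# np.loadtxt accepts any file-like object; each input row becomes one array row.
points = np.loadtxt(sample, delimiter=",")

print(points.shape)  # (3, 2)
print(points[:, 0])  # first column of the file
print(points[:, 1])  # second column of the file
```

Slicing by column, as in the last two lines, is what makes the separate lat and long lists unnecessary.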



Now we need to modify the display_points function to work with this new data format:



def display_points(points):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(points[:, 0], points[:, 1])
    plt.show()


Now for the actual calculation. First, you can use itertools.product to get all ordered pairs of points (itertools.combinations_with_replacement would yield only n(n+1)/2 of the n² pairs, which is not enough to fill the matrix). Then you can insert the distances directly into a flat array and reshape it into a square matrix:



from itertools import product

def main():
    points = read_two_column_file('latlong.txt')

    adj_matrix = np.empty(len(points)**2)
    for i, (point1, point2) in enumerate(product(points, repeat=2)):
        adj_matrix[i] = haversine(point1, point2)
    # reshape returns a new array, so the result has to be assigned
    adj_matrix = adj_matrix.reshape((len(points), len(points)))

    print(adj_matrix)
    display_points(points)
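
The difference between ordered and unordered pairs matters for how the matrix gets filled. The sketch below counts both for n = 10 and shows one possible way to exploit the symmetry of a distance function, evaluating it once per unordered pair and mirroring the result. The helper name symmetric_matrix and the stand-in metric (absolute difference, used so the example runs without the haversine package) are assumptions for illustration only.

```python
from itertools import combinations_with_replacement, product

import numpy as np


def symmetric_matrix(items, dist):
    """Fill an n x n matrix by evaluating dist once per unordered pair."""
    n = len(items)
    matrix = np.zeros((n, n))
    for i, j in combinations_with_replacement(range(n), 2):
        # dist(a, b) == dist(b, a), so one call fills both mirrored cells.
        matrix[i, j] = matrix[j, i] = dist(items[i], items[j])
    return matrix


n = 10
unordered = sum(1 for _ in combinations_with_replacement(range(n), 2))
ordered = sum(1 for _ in product(range(n), repeat=2))
print(unordered, ordered)  # 55 100

# Stand-in metric instead of haversine, purely for demonstration.
demo = symmetric_matrix([0.0, 1.5, 4.0], lambda a, b: abs(a - b))
print(demo)
```

This roughly halves the number of distance evaluations, but the work still grows quadratically with the number of points, because every entry of the matrix has to be produced.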


This can probably be further improved by using numpy.meshgrid to get the combinations of points and using a vectorized version of the haversine function.
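
One possible vectorized version is sketched below. It uses NumPy broadcasting in place of numpy.meshgrid to form all pairwise differences, and it assumes that each row of the input is (latitude, longitude) in degrees. The function name haversine_matrix and the Earth-radius constant are illustrative choices, not taken from the haversine package.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant for this sketch


def haversine_matrix(points):
    """Pairwise great-circle distances in km for an (n, 2) array of (lat, lon) degrees."""
    lat, lon = np.radians(np.asarray(points, dtype=float)).T
    # Broadcasting a column against a row yields all n x n differences at once.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point overshoots outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Three easy-to-check points: one degree apart along the equator and a meridian.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
dist = haversine_matrix(points)
print(dist.round(3))
```

The whole matrix is computed without a Python-level loop, at the cost of holding a few temporary n x n arrays in memory.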






answered Nov 12 '18 at 17:34 by Graipher

  • Thanks for the constructive feedback. Will get back after I look into all suggestions and try to implement.
    – Rrz0
    Nov 12 '18 at 17:51

  • For some reason I'm getting an error on Line 28 being: adj_matrix[i] = haversine(point1, point2). ValueError: not enough values to unpack (expected 2, got 1). I haven't yet found what's wrong.
    – Rrz0
    Nov 12 '18 at 18:36

  • @Rrz0 That sounds odd. Will also investigate when I get home later.
    – Graipher
    Nov 12 '18 at 18:43

  • I suspect tuple unpacking gone wrong, somewhere.
    – Mast
    Nov 13 '18 at 10:29

  • @Rrz0: I just tested it again, it works on my machine with Python 3.6.3. Did you maybe not replace the read_two_column_file function?
    – Graipher
    Nov 13 '18 at 13:22