Extracting an adjacency matrix containing haversine distance from points on map
I am extracting 10 lat/long points from Google Maps and placing these into a text file. The program should be able to read in the text file, calculate the haversine distance between each pair of points, and store the distances in an adjacency matrix. The adjacency matrix will eventually be fed to a 2-opt algorithm, which is outside the scope of the code I am about to present.
The following code is functional, but extremely inefficient. If I had 1000 points instead of 10, the adjacency matrix would need 1000 x 1000 iterations to be filled. Can this be optimized?
import csv
from haversine import haversine
import matplotlib.pyplot as plt
import numpy as np

def read_two_column_file(file_name):
    with open(file_name, 'r') as f_input:
        csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True, )
        long = []
        lat = []
        for col in csv_input:
            x = float(col[0])  # converting to float
            y = float(col[1])
            long.append(x)
            lat.append(y)
    return long, lat

def display_points(long, lat):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(long, lat)
    plt.show()

def main():
    long, lat = read_two_column_file('latlong.txt')
    points = []
    for i in range(len(lat)):
        coords = tuple([lat[i], long[i]])  # converting to tuple to be able to perform haversine calc.
        points.append(coords)
    hav = []
    for i in range(len(lat)):
        for j in range(len(long)):
            hav.append(haversine(points[i], points[j]))
    np.asarray(hav)
    adj_matrix = np.reshape(hav, (10, 10))  # reshaping to 10 x 10 matrix
    print(adj_matrix)
    display_points(long, lat)

main()
Sample Input:
35.905333, 14.471970
35.896389, 14.477780
35.901281, 14.518173
35.860491, 14.572245
35.807607, 14.535320
35.832267, 14.455894
35.882414, 14.373217
35.983794, 14.336096
35.974463, 14.351006
35.930951, 14.401137
Sample Output:
[[ 0. 1.15959635 5.15603243 12.15003864 12.66090817 8.06760374
11.25481465 17.31108648 15.37358741 8.34541481]
[ 1.15959635 0. 4.52227294 11.19223786 11.50131214 7.32033758
11.72388583 18.35259685 16.41378987 9.29953014]
[ 5.15603243 4.52227294 0. 7.44480948 10.26177912 10.15688933
16.24592213 22.1101544 20.18967731 13.40020548]
[12.15003864 11.19223786 7.44480948 0. 7.01813758 13.28961044
22.25645098 29.42422794 27.49154954 20.48281039]
[12.66090817 11.50131214 10.26177912 7.01813758 0. 9.22215871
19.74293886 29.16680205 27.25540014 19.97465594]
[ 8.06760374 7.32033758 10.15688933 13.28961044 9.22215871 0.
10.66219491 21.06632671 19.24994647 12.24773666]
[11.25481465 11.72388583 16.24592213 22.25645098 19.74293886 10.66219491
0. 11.67502344 10.21846781 6.08016463]
[17.31108648 18.35259685 22.1101544 29.42422794 29.16680205 21.06632671
11.67502344 0. 1.93885474 9.20353461]
[15.37358741 16.41378987 20.18967731 27.49154954 27.25540014 19.24994647
10.21846781 1.93885474 0. 7.28280909]
[ 8.34541481 9.29953014 13.40020548 20.48281039 19.97465594 12.24773666
6.08016463 9.20353461 7.28280909 0. ]]
Plot: [image not preserved]
python performance beginner coordinate-system
1
Welcome to Code Review and congratulations on writing a decent question on your first try.
– Mast
Nov 12 '18 at 9:10
Thanks for the feedback!
– Rrz0
Nov 12 '18 at 11:09
2
This is a great first question, I hope you get some good feedback.
– esote
Nov 12 '18 at 14:54
2
If you want your code to be error prone, you actually want it so it is easy to have bugs. You maybe meant error proof?
– Graipher
Nov 12 '18 at 15:55
1
Well spotted @Graipher! Seems like I want suggestions on how to actually include bugs in my code. Thanks for pointing out, will edit.
– Rrz0
Nov 12 '18 at 16:11
edited Nov 13 '18 at 9:22
asked Nov 12 '18 at 7:25
Rrz0
1 Answer
More than half of your code is being used to convert from one data format to another (from two lists, lat and long, to a list of tuples, and then from a list to an array).
The easiest to understand version would be to use numpy.loadtxt:

def read_two_column_file(file_name):
    # comma-separated values; the space after each comma is tolerated when parsing floats
    return np.loadtxt(file_name, delimiter=",")
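To make the resulting data format concrete, here is a minimal sketch that parses the first three rows of the sample input from an in-memory buffer instead of a file. The `io.StringIO` buffer and the single-character `","` delimiter are assumptions made for illustration; they are not part of the original answer.

```python
import io

import numpy as np

# First three rows of the sample input, held in memory instead of latlong.txt.
sample = io.StringIO(
    "35.905333, 14.471970\n"
    "35.896389, 14.477780\n"
    "35.901281, 14.518173\n"
)

# Each row of the file becomes one row of a 2D float array.
points = np.loadtxt(sample, delimiter=",")

print(points.shape)  # (3, 2)
print(points[0])     # first point as a length-2 array
```

Each row can then be passed directly as one point, and `points[:, 0]` and `points[:, 1]` give the two columns.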
This is then a 2D numpy.array. However, this is actually a lot slower than it could be, so you could also use pandas.read_csv instead:

import pandas as pd

def read_two_column_file(file_name):
    return pd.read_csv(file_name, header=None).values
Which one is faster depends on the size of your file.
Now we need to modify the display_points function to work with this new data format:

def display_points(points):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(points[:, 0], points[:, 1])
    plt.show()
Now for the actual calculation. First, you can use itertools.combinations_with_replacement to get all pairs of points. Since the distance is symmetric, each unordered pair only needs to be computed once and can then be written to both (i, j) and (j, i) of the array:

from itertools import combinations_with_replacement

def main():
    points = read_two_column_file('latlong.txt')
    n = len(points)
    adj_matrix = np.zeros((n, n))
    # iterate over index pairs with i <= j and mirror each result across the diagonal
    for i, j in combinations_with_replacement(range(n), 2):
        adj_matrix[i, j] = adj_matrix[j, i] = haversine(points[i], points[j])
    print(adj_matrix)
    display_points(points)
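One detail worth checking when using `combinations_with_replacement`: for n points it yields n(n+1)/2 unordered pairs (each point paired with itself and with every later point), not n², so the results have to be mirrored across the diagonal to fill a square matrix. The following is only a sketch of that idea. The helper name `fill_symmetric` is hypothetical, and a plain Euclidean stand-in distance is used so that it runs without the `haversine` package.

```python
from itertools import combinations_with_replacement


def fill_symmetric(points, dist):
    """Build an n x n matrix (list of lists), calling dist once per unordered pair."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations_with_replacement(range(n), 2):
        # write the same value to both sides of the diagonal
        matrix[i][j] = matrix[j][i] = dist(points[i], points[j])
    return matrix


def planar_distance(p, q):
    # stand-in distance (plain Euclidean), NOT haversine
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
pairs = list(combinations_with_replacement(range(len(pts)), 2))
print(len(pairs))  # 6, i.e. n * (n + 1) / 2 for n = 3
print(fill_symmetric(pts, planar_distance))
```

This roughly halves the number of distance calls compared with the full double loop, but the work is still quadratic in the number of points.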
This can probably be further improved by using numpy.meshgrid to get the combinations of points and using a vectorized version of the haversine function.
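A vectorized version could look roughly like the sketch below. It uses NumPy broadcasting rather than numpy.meshgrid to form all pairwise differences, and it implements the haversine formula directly instead of calling the `haversine` package. The function name `haversine_matrix` and the Earth radius constant are assumptions made for illustration, and points are assumed to be (latitude, longitude) pairs in degrees.

```python
import numpy as np

# Assumed mean Earth radius in kilometres; adjust if another value is required.
EARTH_RADIUS_KM = 6371.0088


def haversine_matrix(points):
    """Return the (n, n) matrix of great-circle distances in km.

    points: array-like of shape (n, 2) holding (latitude, longitude) in degrees.
    """
    rad = np.radians(np.asarray(points, dtype=float))
    lat = rad[:, 0]
    lon = rad[:, 1]
    # Broadcasting a column against a row gives every pairwise difference.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against rounding pushing a marginally outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


sample_points = [
    (35.905333, 14.471970),
    (35.896389, 14.477780),
    (35.901281, 14.518173),
]
adj_matrix = haversine_matrix(sample_points)
print(adj_matrix.shape)  # (3, 3)
```

All n² entries are still computed, but the loop runs inside NumPy instead of the Python interpreter, which is usually much faster for a few thousand points. Memory grows quadratically with the number of points, since several (n, n) temporaries are created.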
Thanks for the constructive feedback. Will get back after I look into all suggestions and try to implement.
– Rrz0
Nov 12 '18 at 17:51
1
For some reason I'm getting an error on Line 28, being: adj_matrix[i] = haversine(point1, point2). ValueError: not enough values to unpack (expected 2, got 1). I haven't yet found what's wrong.
– Rrz0
Nov 12 '18 at 18:36
@Rrz0 That sounds odd. Will also investigate when I get home later.
– Graipher
Nov 12 '18 at 18:43
I suspect tuple unpacking gone wrong, somewhere.
– Mast
Nov 13 '18 at 10:29
@Rrz0: I just tested it again, it works on my machine with Python 3.6.3. Did you maybe not replace the read_two_column_file function?
– Graipher
Nov 13 '18 at 13:22
answered Nov 12 '18 at 17:34
Graipher