Extracting an adjacency matrix containing haversine distances from points on a map

I am extracting 10 lat/long points from Google Maps and placing them in a text file. The program should read in the text file, calculate the haversine distance between each pair of points, and store the results in an adjacency matrix. The adjacency matrix will eventually be fed to a 2-opt algorithm, which is outside the scope of the code I am about to present.

The following code is functional, but extremely inefficient. If I had 1000 points instead of 10, the adjacency matrix would need 1000 x 1000 iterations to be filled. Can this be optimized?

import csv
from haversine import haversine
import matplotlib.pyplot as plt
import numpy as np


def read_two_column_file(file_name):
    with open(file_name, 'r') as f_input:
        csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True, )
        long = []
        lat = []
        for col in csv_input:
            x = float(col[0])  # converting to float
            y = float(col[1])
            long.append(x)
            lat.append(y)

    return long, lat


def display_points(long, lat):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(long, lat)
    plt.show()


def main():
    long, lat = read_two_column_file('latlong.txt')

    points = []
    for i in range(len(lat)):
        coords = tuple([lat[i], long[i]])  # converting to tuple to be able to perform haversine calc.
        points.append(coords)

    hav = []
    for i in range(len(lat)):
        for j in range(len(long)):
            hav.append(haversine(points[i], points[j]))

    np.asarray(hav)
    adj_matrix = np.reshape(hav, (10, 10))  # reshaping to 10 x 10 matrix
    print(adj_matrix)

    display_points(long, lat)


main()

Sample Input:

35.905333, 14.471970
35.896389, 14.477780
35.901281, 14.518173
35.860491, 14.572245
35.807607, 14.535320
35.832267, 14.455894
35.882414, 14.373217
35.983794, 14.336096
35.974463, 14.351006
35.930951, 14.401137

Sample Output:

[[ 0.          1.15959635  5.15603243 12.15003864 12.66090817  8.06760374
  11.25481465 17.31108648 15.37358741  8.34541481]
 [ 1.15959635  0.          4.52227294 11.19223786 11.50131214  7.32033758
  11.72388583 18.35259685 16.41378987  9.29953014]
 [ 5.15603243  4.52227294  0.          7.44480948 10.26177912 10.15688933
  16.24592213 22.1101544  20.18967731 13.40020548]
 [12.15003864 11.19223786  7.44480948  0.          7.01813758 13.28961044
  22.25645098 29.42422794 27.49154954 20.48281039]
 [12.66090817 11.50131214 10.26177912  7.01813758  0.          9.22215871
  19.74293886 29.16680205 27.25540014 19.97465594]
 [ 8.06760374  7.32033758 10.15688933 13.28961044  9.22215871  0.
  10.66219491 21.06632671 19.24994647 12.24773666]
 [11.25481465 11.72388583 16.24592213 22.25645098 19.74293886 10.66219491
   0.         11.67502344 10.21846781  6.08016463]
 [17.31108648 18.35259685 22.1101544  29.42422794 29.16680205 21.06632671
  11.67502344  0.          1.93885474  9.20353461]
 [15.37358741 16.41378987 20.18967731 27.49154954 27.25540014 19.24994647
  10.21846781  1.93885474  0.          7.28280909]
 [ 8.34541481  9.29953014 13.40020548 20.48281039 19.97465594 12.24773666
   6.08016463  9.20353461  7.28280909  0.        ]]

Plot: (image not preserved)

asked Nov 12 '18 at 7:25 by Rrz0 (edited Nov 13 '18 at 9:22)

Welcome to Code Review and congratulations on writing a decent question on your first try. – Mast, Nov 12 '18 at 9:10

Thanks for the feedback! – Rrz0, Nov 12 '18 at 11:09

This is a great first question, I hope you get some good feedback. – esote, Nov 12 '18 at 14:54

If you want your code to be error prone, you actually want it so it is easy to have bugs. You maybe meant error proof? – Graipher, Nov 12 '18 at 15:55

Well spotted @Graipher! Seems like I want suggestions on how to actually include bugs in my code. Thanks for pointing out, will edit. – Rrz0, Nov 12 '18 at 16:11

python performance beginner coordinate-system

1 Answer

More than half of your code is being used to convert from one data format to another (from two separate lat and long lists to tuples, and then from a list to an array).

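The easiest to understand version would be to use numpy.loadtxt:

def read_two_column_file(file_name):
    # A single-character delimiter is used here; the space after each comma
    # is ignored when the field is parsed as a float.
    return np.loadtxt(file_name, delimiter=",")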

This returns a 2D numpy.array. However, numpy.loadtxt can be a lot slower than necessary, so you could also use pandas.read_csv instead:

import pandas as pd


def read_two_column_file(file_name):
    return pd.read_csv(file_name, header=None).values

Which one is faster depends on the size of your file.
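To make the new data format concrete, here is a minimal sketch (not part of the original code) that parses three rows of the sample input from an in-memory buffer instead of a file. The names sample, first_column and second_column are made up for illustration. Whichever reader is used, the result should be an (n, 2) array whose columns can be sliced directly:

```python
import io

import numpy as np

# Three rows copied from the question's sample input, held in memory
# so the sketch runs without a latlong.txt file on disk.
sample = io.StringIO(
    "35.905333, 14.471970\n"
    "35.896389, 14.477780\n"
    "35.901281, 14.518173\n"
)

points = np.loadtxt(sample, delimiter=",")

print(points.shape)           # one row per point, two columns
first_column = points[:, 0]   # first value of every row
second_column = points[:, 1]  # second value of every row
print(first_column, second_column)
```

This column slicing is what lets the plotting function below take a single points argument instead of two separate lists.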

Now we need to modify the display_points function to work with this new data format:

def display_points(points):
    plt.figure()
    plt.ylabel('longitude')
    plt.xlabel('latitude')
    plt.title('longitude vs latitude')
    plt.scatter(points[:, 0], points[:, 1])
    plt.show()

Now for the actual calculation. First, you can use itertools.product to get all ordered pairs of points (itertools.combinations_with_replacement would yield only the n(n+1)/2 unordered pairs, which is too few to fill an n x n array). Then you can insert the distances directly into the correct position of a flat array and reshape it at the end:

from itertools import product


def main():
    points = read_two_column_file('latlong.txt')

    adj_matrix = np.empty(len(points)**2)
    for i, (point1, point2) in enumerate(product(points, repeat=2)):
        adj_matrix[i] = haversine(point1, point2)
    # reshape returns a new array, so the result has to be assigned
    adj_matrix = adj_matrix.reshape((len(points), len(points)))

    print(adj_matrix)
    display_points(points)

This can probably be further improved by using numpy.meshgrid to get the combinations of points and using a vectorized version of the haversine function.
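As a rough sketch of what such a vectorized version might look like, the snippet below uses NumPy broadcasting (rather than numpy.meshgrid) to compute all pairwise distances at once. It is an illustration under stated assumptions, not the haversine package's own implementation: the function name haversine_matrix and the constant EARTH_RADIUS_KM are made up here, the radius is an assumed mean Earth radius, and each row of points is assumed to be (latitude, longitude) in degrees.

```python
import numpy as np

# Assumed mean Earth radius in kilometres (illustrative constant).
EARTH_RADIUS_KM = 6371.0088


def haversine_matrix(points):
    """Pairwise great-circle distances in km.

    points: array-like of shape (n, 2), rows assumed to be
    (latitude, longitude) in degrees.
    """
    lat, lon = np.radians(np.asarray(points, dtype=float)).T

    # Broadcasting an (n, 1) column against a (1, n) row gives every
    # pairwise difference as an (n, n) array, with no Python loop.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]

    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# First three rows of the question's sample input.
points = np.array([
    [35.905333, 14.471970],
    [35.896389, 14.477780],
    [35.901281, 14.518173],
])
adj_matrix = haversine_matrix(points)
print(adj_matrix)
```

The result is symmetric with a zero diagonal. The values depend on which column is treated as latitude, so they may not match a matrix computed with the columns in the other order. Memory use still grows as n squared, since the full matrix is materialized.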

answered Nov 12 '18 at 17:34 by Graipher

Thanks for the constructive feedback. Will get back after I look into all suggestions and try to implement. – Rrz0, Nov 12 '18 at 17:51

For some reason I'm getting an error on Line 28 being: adj_matrix[i] = haversine(point1, point2). ValueError: not enough values to unpack (expected 2, got 1) I haven't yet found what's wrong. – Rrz0, Nov 12 '18 at 18:36

@Rrz0 That sounds odd. Will also investigate when I get home later. – Graipher, Nov 12 '18 at 18:43

I suspect tuple unpacking gone wrong, somewhere. – Mast, Nov 13 '18 at 10:29

@Rrz0: I just tested it again, it works on my machine with Python 3.6.3. Did you maybe not replace the read_two_column_file function? – Graipher, Nov 13 '18 at 13:22
