Taking text from a file and formatting it
My code takes numbers from a large text file, splits each line to normalise the spacing, and places the values into a 2-dimensional array. The code is used to get data for a job scheduler that I'm building.
# reading in workload data
def getworkload():
    work = []
    strings = []
    with open("workload.txt") as f:
        read_data = f.read()
    # one entry per line of the file
    jobs = read_data.split("\n")
    for j in jobs:
        # collapse runs of whitespace into single spaces
        strings.append(" ".join(j.split()))
    for i in strings:
        work.append([float(s) for s in i.split(" ")])
    return work

print(getworkload())
The text file is over 2000 lines long, and looks like this:
1 0 1835117 330855 640 5886 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
2 0 2265800 251924 640 3124 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
3 1 3114175 -1 640 -1 945 -1 -1 -1 5 2 1 4 9 -1 -1 -1
4 1813487 7481 -1 128 -1 20250 -1 -1 -1 5 3 1 5 8 -1 -1 -1
5 1814044 0 122 512 1.13 1181 -1 -1 -1 1 1 1 1 9 -1 -1 -1
6 1814374 1 51 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
7 1814511 0 55 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
8 1814695 1 51 512 -1 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
9 1815198 0 75 512 2.14 1181 -1 -1 -1 1 1 1 2 9 -1 -1 -1
10 1815617 0 115 512 1.87 1181 -1 -1 -1 1 1 1 1 9 -1 -1 -1
…
It takes 2 and a half minutes to run but I can print the returned data. How can it be optimised?
python performance csv formatting
edited 39 mins ago by 200_success
asked 20 hours ago by timtti
Welcome to Code Review. I'm afraid this question does not match what this site is about. Code Review is about improving existing, working code. If you're having trouble getting something working, or are asking for features, you'd better ask on Stack Overflow (the main site).
– Calak
20 hours ago
The code works, as I can print work_row without any problems and I know that work will be a two-dimensional array/list. I just believe it can be sped up.
– timtti
20 hours ago
"If I try to print work the text is too long and I get an overflow error": to me, that sounds like you have a problem. Try to reformulate your question to remove this doubt.
– Calak
20 hours ago
1 Answer (accepted)
You are doing a lot of unnecessary work. Why split each row only to join it with single spaces and then split it again by those single spaces?
Instead, here is a list comprehension that should do the same thing:
def get_workload(file_name="workload.txt"):
    with open(file_name) as f:
        return [[float(x) for x in row.split()] for row in f]
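This uses the fact that file objects are iterable: iterating over one yields one line at a time. Calling split() with no argument already splits on any run of whitespace and discards the trailing newline, so the intermediate join is not needed.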
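As a minimal sketch of why the extra join/split pass is redundant, the snippet below parses a small in-memory sample both ways and compares the results. The names parse_original and parse_direct, the sample text, and the blank-line guards are illustrative assumptions and not part of the code above; io.StringIO stands in for the real file.

```python
import io

# Hypothetical sample with irregular spacing, standing in for workload.txt.
SAMPLE = "1  0 1835117   330855 640\n2 0 2265800 251924   640\n"


def parse_original(text):
    """Mirror the question's approach: split, re-join, then split again."""
    work = []
    for line in text.split("\n"):
        normalised = " ".join(line.split())
        # Assumption: skip blank lines, such as the empty string that
        # follows a trailing newline, since float("") would raise.
        if normalised:
            work.append([float(s) for s in normalised.split(" ")])
    return work


def parse_direct(lines):
    """Mirror the answer's approach: iterate over lines and split once."""
    # Assumption: blank lines are skipped here too, so both versions agree.
    return [[float(x) for x in row.split()] for row in lines if row.strip()]


original = parse_original(SAMPLE)
direct = parse_direct(io.StringIO(SAMPLE))
print(original == direct)  # prints True
```

Both functions return the same nested list for this sample; the direct version makes one splitting pass per line instead of three.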
answered 16 hours ago
Graipher