For a persistent perceptual experience, why is video able to have a lower frame rate than audio?
In film, images are typically shown to us at around 24 frames per second, but modern sound files often have 44,100 or 48,000 samples per second.
There's a threshold, above roughly 12 fps, at which we perceive successive frames as unified motion rather than as individual pictures (cf. the phi phenomenon, persistence of vision, beta movement). But to get this unified experience in the auditory domain, we need a much higher "frame rate". Why is this?
Tags: perception, psychophysics, sensory-perception
asked 9 hours ago by RECURSIVE FARTS
2 Answers
Sound is pressure waves; young humans can hear (i.e., detect pressure waves) up to about 20 kHz. To reproduce these high-frequency waves through a speaker from a time-domain signal, the sampling rate must be at least twice the highest frequency to be represented. In practice, those very high frequencies carry little of the content of music, and essentially none of speech, so ~44 kHz is sufficient. Inside the cochlea there is a membrane structured to vibrate at different frequencies along its length. At the higher frequencies, neurons don't actually respond to every cycle of the sound wave; they respond to the envelope, so it is possible to respond to frequencies much higher than the rates at which neurons can fire.
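To make the sampling constraint concrete, here's a minimal numpy sketch (my illustration; the specific tone and rates are just examples): a 20 kHz tone sampled at a film-like 24 Hz yields exactly the same samples as an 8 Hz alias, which is why audio needs a sampling rate above twice the highest frequency of interest.

```python
import numpy as np

# Illustrative sketch of the Nyquist constraint (example numbers, not from
# any study): a sinusoid sampled below twice its frequency is
# indistinguishable from a lower-frequency alias.

tone_hz = 20_000            # near the upper limit of young human hearing
fs_low = 24                 # a film-like "sampling rate", far below Nyquist
n = np.arange(48)           # two seconds' worth of 24 Hz samples
t = n / fs_low

undersampled = np.sin(2 * np.pi * tone_hz * t)

# The alias lands at |tone_hz - round(tone_hz / fs_low) * fs_low| = 8 Hz:
alias_hz = abs(tone_hz - round(tone_hz / fs_low) * fs_low)
alias = np.sin(2 * np.pi * alias_hz * t)

print(alias_hz)                          # -> 8
print(np.allclose(undersampled, alias))  # -> True: the samples are identical
```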
Vision depends on the detection of photons. A photon hits a photosensitive molecule in a photoreceptor in the retina, causing a chemical change. That changed molecule binds to a protein, which triggers a cascade of events that ultimately alters the release of a neurotransmitter. Vision is slow: the cascade in response to a single photon takes on the order of hundreds of milliseconds. We can detect things somewhat faster than that because the visual system responds to changes, so the slope of that response is a relevant feature, but overall this slow process means that light information is low-pass filtered. As long as a signal changes sufficiently faster than this low-pass cutoff, the differences between a frame-by-frame signal and a smooth one mostly go unnoticed. That said, 24 frames per second is not a hard limit: modern monitors often run much faster (60–144 Hz) because higher frame rates matter for the perception of smooth motion at high speeds. Slower frame rates suffice when the frame-to-frame changes are small.
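As a toy model of that low-pass filtering (a sketch assuming a first-order filter with a ~100 ms time constant; real phototransduction is more complicated), one can compare the filtered response to a smoothly varying luminance signal against the same signal held frame-by-frame at 24 fps:

```python
import numpy as np

# Toy model (my sketch, not a published one): treat phototransduction as a
# first-order low-pass filter and compare its response to a smooth luminance
# signal versus the same signal sampled-and-held at 24 fps.

fs = 10_000                   # simulation rate, Hz
tau = 0.1                     # assumed ~100 ms integration time constant
t = np.arange(0, 2, 1 / fs)

smooth = 0.5 + 0.5 * np.sin(2 * np.pi * t)       # 1 Hz luminance drift

frame_rate = 24
frame_t = np.floor(t * frame_rate) / frame_rate  # start time of each frame
held = 0.5 + 0.5 * np.sin(2 * np.pi * frame_t)   # sample-and-hold at 24 fps

def low_pass(x, tau, fs):
    """First-order exponential low-pass filter: y' = (x - y) / tau."""
    alpha = 1 / (tau * fs)
    y = np.empty_like(x)
    y[0] = x[0]
    for i in range(1, len(x)):
        y[i] = y[i - 1] + alpha * (x[i] - y[i - 1])
    return y

# After filtering, the 24 fps staircase and the smooth signal differ by only
# a few percent of the signal range; the frame steps largely wash out.
diff = np.abs(low_pass(held, tau, fs) - low_pass(smooth, tau, fs))
print(diff.max())
```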
In nature, a lot of things vibrate at high frequencies, into the thousands of Hz, so there are good evolutionary reasons to detect high-frequency sounds. However, very few things move at those speeds, and those that do are typically not behaviorally relevant at that timescale (e.g., you don't need to resolve every sweep of an insect's wings to recognize it as an insect).
– Bryan Krause (answered 7 hours ago, edited 5 hours ago)
– Chris Rogers (5 hours ago): Do you have any references for your claims?

– Bryan Krause (5 hours ago): @ChrisRogers I don't generally provide references for intro-textbook-level knowledge. Everything here is available on Wikipedia, for people who like to use Wikipedia, or in any introductory neuroscience textbook.

– Bryan Krause (5 hours ago): @ChrisRogers I think that policy makes sense where it makes sense and not where it doesn't. If you have a link to a Meta post explaining it, I'll have a look, but I'm not aware of such a policy applying to knowledge like "sound is pressure waves" or "vision is the detection of photons." I don't see the value in adding links to Wikipedia, or an arbitrary textbook source that most people aren't likely to have access to anyway.

– Bryan Krause (5 hours ago): @ChrisRogers I could provide references for those, but those examples are exactly why I feel referencing everything is a bit silly: neither of those numbers is in any way relevant to this answer. Twice the highest frequency is the Nyquist rate, a basic concept in physics. The exact human hearing range is also not important, as long as it's well above the 24 Hz frame rate of film.

– StrongBad (4 hours ago): @ChrisRogers I agree with Bryan; there is nothing in this answer that is not covered in a general reference like Wikipedia (not that I think Wikipedia is an authoritative source, but if something is in Wikipedia, it should be considered common knowledge). Since the question can be answered with common knowledge, maybe it is not a good fit for the site, as it is not an advanced question in psychology & neuroscience ...
I don't have a full answer, but this might get things started...
You are mixing up two concepts: frame rate and sampling rate. In a video presented at 24 fps, each frame potentially contains a wide range of spatial frequencies. Typically the spatial frequencies are limited by the number of pixels, but you can low-pass filter each frame to reduce them (you will end up with a blurry picture). This spatial filtering has nothing to do with the frame rate.
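For example, here is a minimal sketch of that spatial low-pass filtering (my illustration; the frame is random noise and the kernel width is arbitrary):

```python
import numpy as np

# Sketch: spatially low-pass filter a single frame with a separable box blur.
# This removes fine detail (high spatial frequencies) while the video's
# frame rate is untouched.

rng = np.random.default_rng(0)
frame = rng.random((480, 640))     # one hypothetical grayscale frame

k = 9                              # box-blur width in pixels (arbitrary)
kernel = np.ones(k) / k

# Separable blur: filter each row, then each column.
blur = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, frame)
blur = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, blur)

# Spectral power away from the low-frequency corners of the 2D FFT should drop:
hi_orig = np.abs(np.fft.fft2(frame))[100:380, 100:540].mean()
hi_blur = np.abs(np.fft.fft2(blur))[100:380, 100:540].mean()
print(hi_blur < hi_orig)           # -> True: high spatial frequencies attenuated
```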
The 44.1 kHz sampling rate of an audio signal is more akin to the spatial frequencies of a picture/frame than to the frame rate of a video. An audio analogue of video frames would be something like decomposing the signal into slices with the short-time Fourier transform (STFT), setting each slice to a constant spectrum (and phase?), and reconstructing. Reconstructing a signal from a modified STFT is non-trivial (cf. Griffin & Lim, 1984). Given the difficulties in the process and the lack of an application, I am not sure anyone has really investigated how the duration of the slices affects perception.
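A rough sketch of that construction (my own; it freezes each slice's spectrum and phase over several hops and then naively inverts, and the naive inverse of such a modified STFT is only an approximation, which is exactly the problem Griffin & Lim address):

```python
import numpy as np
from scipy.signal import stft, istft

# Sketch of "audio frames": slice the signal with an STFT, hold one complex
# spectrum (magnitude and phase) constant across several hops, and
# resynthesize by overlap-add. Parameters are arbitrary examples.

fs = 44_100
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)   # decaying 440 Hz tone

nperseg = 2048                                     # slice length ~46 ms
f, times, Z = stft(x, fs=fs, nperseg=nperseg)

hold = 4                                           # hops per frozen "frame"
for start in range(0, Z.shape[1], hold):
    Z[:, start:start + hold] = Z[:, start][:, None]

# Naive inverse; the modified Z is no longer a consistent STFT, so this is
# only an approximate reconstruction (cf. Griffin & Lim, 1984).
_, x_frames = istft(Z, fs=fs, nperseg=nperseg)
print(len(x), len(x_frames))
```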
– StrongBad (answered 4 hours ago)
– Bryan Krause (4 hours ago): Lack of an application? Audio compression (MP3, etc.) typically uses a Fourier transform (or some sort of wavelet). Pretty much every dimension of compression has been investigated to find the most efficient encoding that limits perceived degradation, since the compression is lossy.

– StrongBad (3 hours ago): @BryanKrause Yes, but I don't see the link between those types of frames and the idea of a frame as a constant segment of signal. Maybe there is one and I am missing the relevant literature; if so, I would love to see an answer with more references ... as I said, I don't have a full answer.