calculating a pdf and cdf from data?
| Paul Mcmillan 2008-01-14, 7:29 pm |
| Hi, I'm trying to plot a PDF and CDF from ping data that
I've been collecting. Each line of the data looks like this:
2007-12-10 16:43:32Z 68
This is repeated some 200,000 times.
I've been able to bring these results into MATLAB in
columns using the following:
function [pingtime, ping] = pingread(file_name)
% Read lines such as "2007-12-10 16:43:32Z 68" into a 7-column matrix:
% year, month, day, hour, minute, second, ping.
fid = fopen(file_name);
pingtime = [];
line = fgetl(fid);
% fgetl returns -1 at end of file, so ischar() ends the loop cleanly.
while ischar(line) && ~isempty(line) && line(1) == '2'
    % Turn the date and time separators into spaces so sscanf sees 7 integers.
    line = strrep(line, '-', ' ');
    line = strrep(line, ':', ' ');
    line = strrep(line, 'Z', '');
    pingtime = [pingtime; sscanf(line, '%d')'];
    line = fgetl(fid);
end
fclose(fid);
% The ping value is the 7th column.
ping = pingtime(:,7);
I now need to plot a CDF and PDF. I've been trying
Y = pdf('rayl',X,A)
but I'm not sure how to get it working. I only need to
use the 7th column of my data ('ping').
Can anyone help?
Many thanks,
Paul
| |
| Volkan 2008-01-14, 7:29 pm |
|
> i now need to plot a cdf and pdf. ive been trying
>
> Y = pdf('rayl',X,A)
>
You are using the wrong function. pdf evaluates a known
distribution function (in this case Rayleigh) at the requested
points. Try the hist function instead. hist returns the bin
counts, which you can normalize to estimate the distribution,
and a cumulative sum of those counts gives the CDF.
Volkan
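For readers following along, here is a rough sketch of the hist approach described above. The sample values, the bin count and the variable names (nbins, pdf_est, cdf_est) are assumptions for illustration, not something posted in the thread; in practice the 'ping' column from pingread would be used.

```matlab
% Illustrative sample only; in practice use the 'ping' column from pingread.
ping = [60 62 62 60 62 62 60 62 64 68 62 62]';

nbins = 5;                               % assumed bin count; adjust to taste
[counts, centers] = hist(ping, nbins);   % bin counts and bin centers

binwidth = centers(2) - centers(1);      % bins are equally spaced here
pdf_est = counts / (sum(counts) * binwidth);  % scaled so the bar area is 1
cdf_est = cumsum(counts) / sum(counts);       % running fraction of samples

subplot(2,1,1); bar(centers, pdf_est, 1); title('Empirical PDF');
subplot(2,1,2); stairs(centers, cdf_est); title('Empirical CDF');
```

Dividing by the bin width as well as the sample count is what makes the bars a density rather than plain relative frequencies.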
| |
| Paul Mcmillan 2008-01-14, 7:29 pm |
| "Volkan " <volkan@buyukgungor.gmail.com> wrote in message
<fmg4cm$2u6$1@fred.mathworks.com>...
> You are using the wrong function. pdf evaluates a known
> distribution function (in this case rayleigh) at requested
> points. Try hist function. Hist also returns the
> distribution, so using that you can calculate cdf.
>
> Volkan
Ah, right, OK. When I run my original file I get 7 columns,
with the ping one as follows:
ping =
60
62
62
60
62
62
60
62
64
68
62
10000
62
How do I get a nice PDF plot from this using the hist function?
There are a lot more results than above. The 10,000 value
is where a connection timed out. Could I filter these
easily?
Thanks (I'm new to coding!)
Paul
| |
| Volkan 2008-01-14, 7:29 pm |
|
> there are a lot more results than above. the 10,000 value
> is where a connection timed out. could i filter these
> easily?
Filtering timeouts may be a very good idea, since they will
completely throw off the bins that the hist function
automatically calculates.
[code snippet missing from the archived post]
will filter them out.
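One possible way to do the filtering, as a sketch only: it assumes every timeout is recorded as exactly 10000, as in the sample Paul posted, and the names timeout_value, ping_ok and n_timeouts are made up for the example.

```matlab
% Sample taken from the values posted earlier in the thread.
ping = [60 62 62 60 62 62 60 62 64 68 62 10000 62]';

timeout_value = 10000;                 % assumed marker for a timed-out ping
ping_ok = ping(ping < timeout_value);  % logical indexing keeps the real replies
n_timeouts = numel(ping) - numel(ping_ok);  % worth reporting separately
```

Keeping the timeout count lets you quote the loss rate alongside the histogram instead of silently dropping it.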
But if you really need to have them in your graph, you will
need to manually tell hist what bins to use, so that
valuable information will not be squeezed to the extreme
left of a linear 0-10000 scale.
[reference missing from the archived post]
probably has all the information you will need. Just play
around with parameters to get a graph that suits you.
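As a sketch of telling hist which bins to use: the second argument can be a vector of bin centers instead of a bin count. The centers below are hypothetical, picked to suit the sample values in this thread, and plotting by bin index is just one way to stop the timeout bin from stretching the axis.

```matlab
% Sample taken from the values posted earlier in the thread.
ping = [60 62 62 60 62 62 60 62 64 68 62 10000 62]';

% Hypothetical centers: fine bins around the usual values, one far bin
% that collects the timeouts.
centers = [56:2:74, 10000];
counts = hist(ping, centers);          % a vector argument means bin centers

% Plot against the bin index so the timeout bar sits next to the others,
% then label each bar with its center value.
idx = 1:numel(centers);
bar(idx, counts);
set(gca, 'XTick', idx, 'XTickLabel', num2str(centers(:)));
```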
Volkan
| |
| Peter Perkins 2008-01-14, 7:29 pm |
| Paul Mcmillan wrote:
> hi im trying to plot a pdf and cdf from ping data that
> i've been collecting. the data looks as such
> i now need to plot a cdf and pdf. ive been trying
>
> Y = pdf('rayl',X,A)
Paul, that line will evaluate the PDF of a Rayleigh distribution with a
known parameter. You want something empirical before you fit a
distribution, I presume. I don't know anything about "ping data" or how
you're preprocessing your data, but since you have the Statistics
Toolbox, I suggest using the DFITTOOL GUI to plot your data and fit a
distribution to it. In particular, look at the empirical CDF plots and
the nonparametric kernel density estimate.
You can do any of what DFITTOOL does from the command line, but I think
you'll find the GUI convenient.
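For anyone who prefers the command line, a minimal sketch of the two plots mentioned above might look like the following. It uses the Statistics Toolbox functions ecdf and ksdensity; the sample values and the name ping_ok are assumptions for illustration, and this is not necessarily what DFITTOOL does internally.

```matlab
% Requires the Statistics Toolbox. Sample values only; in practice use the
% ping data with the timeouts already removed.
ping_ok = [60 62 62 60 62 62 60 62 64 68 62 62]';

[F, x] = ecdf(ping_ok);        % empirical CDF: step heights F at values x
[f, xi] = ksdensity(ping_ok);  % kernel density estimate on a default grid

subplot(2,1,1); stairs(x, F); title('Empirical CDF (ecdf)');
subplot(2,1,2); plot(xi, f);  title('Kernel density estimate (ksdensity)');
```

The kernel estimate gives a smooth curve without having to choose bins, at the cost of choosing (or accepting the default) bandwidth.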
Hope this helps.
- Peter Perkins
The MathWorks, Inc.