For the sake of clarity we have color-coded this step-plot to distinguish its features. Notice that the more accurately these observations are measured, the less likely you are to observe two that are identical. A very large sample of very accurately measured observations would therefore give an almost smooth curve, which allows mathematical statisticians to simplify their calculations rather a lot.
The following code produces much the same sort of result using R.
Note, for comparison, we have produced two versions of the same plot.
Note:
- Because a step-plot is drawn in the order that values are presented to it, those values must be sorted beforehand.
- In order to get the scatterplot to draw the first vertical, we repeat the first y value (using c(y[1],y)) and begin with a zero frequency (using c(0,rr)).
- The ecdf function does not need the data to be sorted beforehand, and passes the plot function virtually everything it needs.
- We have merely used the pch and col arguments to change the plot symbol to a small red dot, rather than the usual circle. If you want to omit the points altogether use plot(ecdf(y), pch = ' ')
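- Somewhat irritatingly, plotting the output of R's ecdf function does not draw in any verticals by default (in current versions of R, adding verticals = TRUE requests them), and it gives less direct control over colour than a plot you build yourself. Hence we usually prefer to produce our own ECDF plots.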
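The notes above can be sketched roughly as follows. This is a minimal illustration rather than a definitive listing: it assumes y is a numeric sample and that rr holds the cumulative relative frequencies (1:n)/n. The simulated data, the seed, the colours and the name y_sorted are our own illustrative choices.

```r
# Illustrative data (an assumption): 20 simulated observations
set.seed(1)
y <- rnorm(20)
n <- length(y)

# Draw the two versions side by side
op <- par(mfrow = c(1, 2))

# Version 1: a hand-built step-plot
y_sorted <- sort(y)   # a step-plot is drawn in the order values are given
rr <- (1:n) / n       # assumed definition: cumulative relative frequency

# Repeat the first value and start from zero so the first vertical is drawn
plot(c(y_sorted[1], y_sorted), c(0, rr), type = "s", col = "blue",
     xlab = "y", ylab = "Cumulative relative frequency")
points(y_sorted, rr, pch = 20, col = "red")

# Version 2: R's built-in ecdf, which does not need sorted data
Fn <- ecdf(y)
plot(Fn, pch = 20, col = "red")

par(op)               # restore the previous plot layout
```

With type = "s" each segment runs horizontally first and then vertically, which is why the extra starting point at zero is needed to get the first vertical.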