Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)




Just a note

  • y=c(...) assigns the data to variable y.

  • m=NULL; i=0 combines two instructions on the same line - which is what the semicolon (;) is for. Putting more than one instruction on the same line is merely to save space.
  • m=NULL ensures there is a variable called m, ready to receive the smoothed means, and assigning NULL to m ensures it is completely empty. This is because, although R is reasonably clever, it is not bright enough to find elements of a variable which does not exist!
  • i=0 sets up variable i, which will be used as a counter in the next set of instructions.
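As a minimal sketch of what these set-up instructions do (the values 2.5 and 4 are invented purely for illustration), starting m as NULL gives an empty variable whose elements can then be filled in one at a time:

```r
m = NULL; i = 0          # two instructions on one line, separated by a semicolon
n_before = length(m)     # m exists but is completely empty, so its length is 0

m[1] = 2.5               # assigning to element 1 makes m one value long
m[3] = 4                 # assigning to element 3 extends m; the skipped element 2 is NA
n_after = length(m)      # m now holds 2.5, NA, 4
```

Any element of m that is skipped in this way is filled with NA.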

  • length(y) finds how many values are in variable y
  • while(i < length(y)) repeatedly evaluates its condition (i < length(y)) and, as long as it is found to be true, executes the next instruction, then re-evaluates the condition, and so on. Once (i < length(y)) is false, the next instruction is skipped, and the remainder of the programme is executed. In this case y contains 52 values, so this loop will be repeated 52 times.
  • The curly brackets { } tell R to execute any instructions enclosed therein as if they were a single instruction. The {} brackets ensure that none of the second set of instructions can be evaluated separately. The combination of while and its next instruction{s} are known as an iterative loop.
  • i=i+1 adds 1 to i each time it is evaluated
  • m[i] = tells R to assign the result of evaluating the right-hand half of the expression to the ith element of variable m.
  • (y[i-2]+y[i-1] + y[i] + y[i+1]+y[i+2]) /5 calculates the 5 point moving average, and contains several linked instructions. Collectively, they find the mean of the i-2th, i-1th, ith, i+1th and i+2th elements of variable y, and the result is put into the ith element of variable m. If any of these elements do not exist, no mean can be calculated, so that element of m ends up being NA (Not Available).
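The loop described above can be sketched as follows. This is only an illustration under stated assumptions: the eight data values are invented (the original y holds 52 values), and the explicit if test is an addition of this sketch. It makes the NA results at both ends visible rather than relying on how R treats out-of-range subscripts, since in R a subscript of 0 returns an empty result and a negative subscript drops elements.

```r
# Hypothetical data standing in for y=c(...); these values are invented
y = c(3, 5, 4, 6, 8, 7, 9, 10)

m = NULL; i = 0
while (i < length(y)) {
  i = i + 1
  if (i > 2 && i <= length(y) - 2) {
    # all five values y[i-2] ... y[i+2] exist, so their mean can be found
    m[i] = (y[i-2] + y[i-1] + y[i] + y[i+1] + y[i+2]) / 5
  } else {
    # too close to either end for a full 5 point window
    m[i] = NA
  }
}
```

With these eight values, m has the same length as y, and its first two and last two elements are NA. R's built-in stats::filter(y, rep(1/5, 5)) calculates the same centred average, also with NA at both ends.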

  • The next instruction, plot(y), plots a scatterplot of the unsmoothed data (variable y) against an 'index' of their order, that is, each value's position in y.

  • points(m,type='l') overplots the smoothed data in variable m as a lineplot, apart from values that are Not Available (for a 5-point average, the first two and last two). The instruction lines(m) would achieve the same result. Once again, because the points function was only given one variable, it plots each value against its position in variable m.
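A small self-contained sketch of the two plotting instructions, assuming invented data and a smoothed series typed in by hand (both hypothetical, for illustration only):

```r
# Hypothetical data and its 5 point moving average; values are invented
y = c(3, 5, 4, 6, 8, 7, 9, 10)
m = c(NA, NA, 5.2, 6.0, 6.8, 8.0, NA, NA)

plot(y)                  # scatterplot of y against its index 1, 2, ..., length(y)
points(m, type = 'l')    # smoothed values as a line; NA positions are not drawn
# lines(m)               # would draw the same line
```

Because m is the same length as y, each smoothed value is drawn above the index of the observation it is centred on.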

  • Notice that, in addition to the smoothed data set being shorter than the original, any embedded missing value (that has not been interpolated) will result in at least (M=)5 values being lost from the smoothed data. One way to overcome the first of these problems is to use the first and last values as they are, instead of a mean, then calculate the second and second to last values from M-1 values.
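One possible reading of that end-point fix, for M=5, is sketched below. It assumes invented data, at least 5 values in y, and that 'M-1 values' means the four values of the window that actually exist (for the second point, y[1] to y[4]). The original note does not spell this out, so treat that choice as an assumption.

```r
# Hypothetical data; values are invented and y must hold at least 5 values
y = c(3, 5, 4, 6, 8, 7, 9, 10)
n = length(y)

m = rep(NA, n)
# full 5 point means wherever the whole window fits inside y
for (i in 3:(n - 2)) m[i] = mean(y[(i - 2):(i + 2)])

# first and last values are used as they are, instead of a mean
m[1] = y[1]
m[n] = y[n]

# second and second to last values from the M-1 = 4 values available
m[2]     = mean(y[1:4])
m[n - 1] = mean(y[(n - 3):n])
```

The smoothed series then has no NA at either end, at the cost of the end values being less heavily smoothed than the rest.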