Tuesday, January 15, 2013

Introduction to R – Control Structures

Like every other programming language, R has control structures that let you control the flow of your code's execution.

if and else test a condition. The else section is optional.

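As a minimal sketch of if/else (the variable names x and y and the values are made up for illustration):

```r
x <- 5  # hypothetical input value

# Test a condition; the else branch runs only when the condition is FALSE.
# At the top level, "else" must sit on the same line as the closing brace.
if (x > 3) {
  y <- 10
} else {
  y <- 0
}

print(y)  # 10, because 5 > 3
```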

If the goal is just to assign a value to a variable, you can use if/else as an expression and assign its result directly, for example y <- if (x > 3) 10 else 0.

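A short sketch of the expression form, again with made-up values. Because if/else returns the value of the branch it ran, the result can be assigned in one step:

```r
x <- 2  # hypothetical input value

# The whole if/else expression evaluates to the value of the chosen branch.
y <- if (x > 3) {
  10
} else {
  0
}

print(y)  # 0, because 2 is not greater than 3
```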

for executes a loop a fixed number of times. It takes a variable and assigns it successive values from a sequence or vector.

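The following sketch shows a few common ways to write a for loop. The vector letters_vec and the accumulator total are illustrative names, not part of any R API:

```r
letters_vec <- c("a", "b", "c", "d")

# Loop over an index sequence; seq_along() also behaves well for empty vectors.
for (i in seq_along(letters_vec)) {
  print(letters_vec[i])
}

# Loop over the elements directly.
for (letter in letters_vec) {
  print(letter)
}

# Accumulate a result: the loop variable takes the values 1, 2, ..., 10 in turn.
total <- 0
for (i in 1:10) {
  total <- total + i
}
print(total)  # 55
```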

while executes a loop as long as a condition is true. It begins by testing the condition: if it is true, the loop body executes and the condition is tested again; if not, R skips the loop.

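A minimal while sketch (count and skipped are illustrative names). The second loop shows that the body never runs when the condition is already false:

```r
count <- 0

# The condition is checked before every iteration.
while (count < 5) {
  print(count)
  count <- count + 1
}

# count is now 5, so this condition is FALSE from the start and the body is skipped.
skipped <- TRUE
while (count < 5) {
  skipped <- FALSE
}
print(skipped)  # TRUE
```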

repeat executes an infinite loop; the loop keeps running until break is called.

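A small sketch of repeat with break. The counter i and the stopping threshold are made up for illustration:

```r
i <- 0

repeat {
  i <- i + 1
  # Without this break the loop would never end.
  if (i >= 3) {
    break
  }
}

# Execution continues here, on the first line after the loop.
print(i)  # 3
```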

break stops the execution of a loop; execution continues from the first line of code after the loop, as in the repeat loop described above.

next skips the rest of the current iteration of a loop and moves on to the next one.

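As an illustration of next, this sketch collects the even numbers from 1 to 10 by skipping the odd ones (evens is an illustrative name):

```r
evens <- integer(0)

for (i in 1:10) {
  if (i %% 2 != 0) {
    next  # odd number: skip the rest of this iteration
  }
  evens <- c(evens, i)
}

print(evens)  # 2 4 6 8 10
```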

Writing multiple lines of code in the interactive command-line environment is hard. I used the script editor to write the code in this post and then copied it into the R console.

Loop functions

Loop functions are very similar to loops. They are just more compact and easier to use on the command line.

lapply loops over a list and evaluates a function on each element. If the first argument isn't a list, it is coerced to one (using as.list). lapply always returns a list. Any arguments passed to lapply beyond the FUN parameter are matched to the ellipsis (...) and then passed on as arguments to FUN. FUN can be an anonymous function.

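The points above can be sketched as follows. The list x, the anonymous function, and its power argument are made up for illustration:

```r
x <- list(a = 1:5, b = c(10, 20, 30))

# Apply mean to each element; the result is always a list.
means <- lapply(x, mean)
print(means)  # $a is 3, $b is 20

# 1:3 is not a list, so lapply coerces it with as.list().
# power = 2 is matched to ... and passed on to the anonymous function.
squares <- lapply(1:3, function(n, power) n^power, power = 2)
print(squares)  # a list holding 1, 4 and 9
```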

sapply tries to simplify the result of lapply if possible. If the result is a list where every element has length 1, it returns a vector. If the result is a list where every element is a vector of the same length (>1), it returns a matrix. If it can't figure things out, it returns a list.

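The three outcomes can be sketched with one small list (x and the result names are illustrative):

```r
x <- list(a = 1:4, b = c(10, 20, 30))

# Every result has length 1, so sapply returns a named vector.
as_vector <- sapply(x, mean)
print(as_vector)

# Every result has length 2 (min and max), so sapply returns a 2 x 2 matrix.
as_matrix <- sapply(x, range)
print(as_matrix)

# The results have different lengths (2 and 3), so sapply falls back to a list.
as_list <- sapply(x, function(v) v[v > 2])
print(as_list)
```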

apply applies a function over the margins of an array. It is most often used to apply a function to the rows or columns of a matrix. It takes three arguments: the array; the margin, which indicates which dimension the function is applied over; and the function itself. For example, for a matrix with 20 rows and 10 columns, passing 2 as the margin applies the function to the columns, so sum returns a vector of length 10 containing the sum of each column. Passing 1 as the margin applies the function to the rows, so sum returns a vector of length 20 containing the sum of each row.

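A sketch of the two margins on a 20 x 10 matrix. The matrix contents here are arbitrary (the numbers 1 to 200), chosen only so the output is reproducible:

```r
x <- matrix(1:200, nrow = 20, ncol = 10)

# Margin 2: apply sum to each column, giving one value per column.
col_sums <- apply(x, 2, sum)
print(length(col_sums))  # 10

# Margin 1: apply sum to each row, giving one value per row.
row_sums <- apply(x, 1, sum)
print(length(row_sums))  # 20
```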

For sums and means of matrix dimensions, there are some shortcuts:

  • rowSums = apply(x, 1, sum)
  • rowMeans = apply(x, 1, mean)
  • colSums = apply(x, 2, sum)
  • colMeans = apply(x, 2, mean)
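The equivalences in the list can be checked with a quick sketch. The matrix is arbitrary, and all.equal is used rather than == so that tiny floating-point differences in the means do not matter:

```r
x <- matrix(as.numeric(1:200), nrow = 20, ncol = 10)

# Each shortcut should agree with the corresponding apply() call.
same_row_sums  <- isTRUE(all.equal(rowSums(x),  apply(x, 1, sum)))
same_row_means <- isTRUE(all.equal(rowMeans(x), apply(x, 1, mean)))
same_col_sums  <- isTRUE(all.equal(colSums(x),  apply(x, 2, sum)))
same_col_means <- isTRUE(all.equal(colMeans(x), apply(x, 2, mean)))

print(c(same_row_sums, same_row_means, same_col_sums, same_col_means))
```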

tapply applies a function over subsets of a vector. It is roughly equivalent to using split and lapply together (tapply also simplifies the result by default). split takes a vector or other object and splits it into groups determined by a factor or list of factors.
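A small sketch comparing tapply with split followed by lapply. The values and the grouping factor are made up for illustration:

```r
values <- c(1, 2, 3, 10, 20, 30)
groups <- factor(c("a", "a", "a", "b", "b", "b"))

# One call: the mean of values within each level of groups.
group_means <- tapply(values, groups, mean)
print(group_means)  # a = 2, b = 20

# Two steps: split into a list of groups, then apply mean to each group.
pieces <- split(values, groups)
split_means <- lapply(pieces, mean)
print(split_means)  # a list with the same two means
```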

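mapply is a multivariate version of sapply: it applies a function in parallel over several sets of arguments. For example, mapply(rep, 1:4, 4:1) repeats each element of 1:4 the number of times given by the corresponding element of 4:1.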

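The rep example can be sketched like this (result is an illustrative name). The call is a compact way of writing list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1)):

```r
# rep is called four times: rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1).
result <- mapply(rep, 1:4, 4:1)

# The pieces have different lengths, so the result stays a list.
print(result)
```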

In this post we introduced the basic control structures in R. They are almost the same as in any C-like programming language.

Stay tuned for more R notes.