13  Conditionals in R

13.1 Acknowledgment/License

The original source for this chapter was from the web site

https://datacarpentry.org/semester-biology/

which was built using this underlying code

https://github.com/datacarpentry/semester-biology

and is used under the

Attribution 4.0 International (CC BY 4.0)

license https://creativecommons.org/licenses/by/4.0/.

The material presented here has been modified from the original source.

Accordingly this chapter is made available under the same license terms.

13.2 Source code

If you’d like to work within R Studio using the source code of this chapter, you can obtain it from here.

13.3 Conditionals

  • Conditional statements are when we check to see if some condition is true or not
  • We used these for filtering data in dplyr
  • These statements generate a value is of type "logical"

  • The value is TRUE if the condition is satisfied

  • The value is FALSE if the condition is not satisfied

  • These aren’t the strings “TRUE” and “FALSE”

  • They are a special type of value

  • Conditional statements are made with a range of operators

  • We’ve seen

    • == for equals
    • != for not equals
    • <, > for less than and greater than
    • <=, >= for less than or equal to and greater than or equal to
    • is.na() for is this value null
  • There are others, including %in%, which checks to see if a value is present in a vector of possible values

  • We can combine conditions using “and” and “or”
  • We use the & for “and”
  • Which means if both conditions are TRUE return TRUE
  • If one of the contions is FALSE then return FALSE
  • We use the | for “or”
  • Which means if either or both of the conditions are TRUE return TRUE
  • Vectors of values compared to a single value return one logical per value
  • Checks each value to see if equal to 1
  • This is what subsetting approaches use to subset
  • They keep the values where the value in this condition vector is equal to TRUE
  • Let’s look at an example where we have a vector of sites and a vector the the states they occur in
  • A conditional statement checking if the state is 'FL' returns a vector of TRUE’s and FALSEs
  • So when we filter the site vector to only return values where the state is equal to 'FL'
  • It is the same as pass a vector of TRUE and FALSE values inside the square brackets
  • This keeps the first and second values in site because the values in the vector are TRUE
  • This is how dplyr::filter() and other methods for subsetting data work

13.3.1 Tasks: Choice Operators

Important

Do Tasks 1-4 in Choice Operators

Create the following variables.

Use them to print whether or not the following statements are TRUE or FALSE.

  1. w is greater than 10
  2. "green" is in colors
  3. x is greater than y
  4. Each value in masses is greater than 40.

13.4 if statements

  • Conditional statements generate logical values to filter inputs.
  • if statements use conditional statements to control flow of the program.
if (the conditional statement is TRUE ) {
  do something
}
  • Example
  • x > 5 is TRUE, so the code in the if runs
  • x is now 6^2 or 36
  • Change x to 4
  • x > 5 is FALSE, so the code in the if doesn’t run

  • x is still 4

  • This is not a function, so everything that happens in the if statement influences the global environment

  • Different mass calculations for different vegetation types

13.4.1 Task 1: Basic If Statements

Important

Do Task 1 in Basic If Statements

1. Complete (i.e., copy into your code and them modify) the following if statement so that if age_class is equal to “sapling” it sets y <- 10.

  • Often want to chose one of several options
  • Can add more conditions and associated actions with else if
  • Checks the first condition

  • If TRUE runs that condition’s code and skips the rest

  • If not it checks the next one until it runs out of conditions

  • Can specify what to do if none of the conditions is TRUE using else on its own

13.4.2 Tasks 2-3: Basic If Statements

Important

Do Tasks 2-3 in Basic If Statements

2. Complete the following if statement so that if age_class is equal to “sapling” it sets y <- 10 and if age_class is equal to “seedling” it sets y <- 5.

3. Complete the following if statement so that if age_class is equal to “sapling” it sets y <- 10 and if age_class is equal to “seedling” it sets y <- 5 and if age_class is something else then it sets the value of y <- 0.

13.5 Multiple ifs vs else if

  • Multiple ifs check each conditional separately
  • Executes code of all conditions that are TRUE
  • else if checks each condition sequentially
  • Executes code for the first condition that is TRUE

13.6 Using Conditionals Inside Functions

  • We’ve used a conditional to estimate mass differently for different types of vegetation
  • This is the kind of code we are going to want to reuse, so let’s move it into a function
  • We do this by placing the same code inside of a function
  • And making sure that the function takes all required variables as input
  • We can then run this function with different vegetation types and get different estimates for mass
  • Let’s walk through how this code executes using the debugger
  • When we call the function the first thing that happens is that 1.6 gets assigned to volume and "tree" gets assigned to veg_type
  • The code then checks to see if veg_type is equal to "shrub"
  • It isn’t so the code then checks to see if veg_type is equal to "grass"
  • It isn’t so the code then hits the else statement and executes the code in the else block
  • It assigns NA to mass
  • It then finishes the if/else if/else statement and returns the value for mass, which is NA to the global environment

13.6.1 Task: Size Estimates by Name

Important

Do Size Estimates by Name

13.6.1.1 Part I

The length of an organism is typically strongly correlated with its body mass. This is useful because it allows us to estimate the mass of an organism even if we only know its length. This relationship generally takes the form:

mass = a * length^b

Where the parameters a and b vary among groups. This allometric approach is regularly used to estimate the mass of dinosaurs since we cannot weigh something that is only preserved as bones.

The following function estimates the mass of an organism in kg based on its length in meters for a particular set of parameter values, those for Theropoda (where a has been estimated as 0.73 and b has been estimated as 3.63; Seebacher 2001).

  1. Use this function to print out the mass of a Theropoda that is 16 m long based on its reassembled skeleton.
  1. Create a new version of this function called get_mass_from_length() that takes length, a and b as arguments and uses the following code to estimate the mass mass <- a * length ^ b. Use this function to estimate the mass of a Sauropoda (a = 214.44, b = 1.46) that is 26 m long.

13.6.1.2 Part II

To make it even easier to work with your dinosaur size estimation functions you decide to create a function that lets you specify which dinosaur group you need to estimate the size of by name and then have the function automatically choose the right parameters.

Create a new function get_mass_from_length_by_name() that takes two arguments, the length and the name of the dinosaur group. Inside this function use if/else if/else statements to check to see if the name is one of the following values and if so use the associated a and b values to estimate the species mass.

If the name is not any of these values the function should return NA.

Run the function for: 1. A Stegosauria that is 10 meters long. 2. A Theropoda that is 8 meters long. 3. A Sauropoda that is 12 meters long. 4. A Ankylosauria that is 13 meters long.

Challenge (optional): If the name is not one of values that have a and b values print out a message that it doesn’t know how to convert that group that includes that groups name in a message like “No known estimation for Ankylosauria”. (the function paste() will be helpful here). Doing this successfully will modify your answer to (4), which is fine.

Challenge (optional): Change your function so that it uses two different values of a and b for Stegosauria. When Stegosauria is greater than 8 meters long use the equation above. When it is less than 8 meters long use a = 8.5 and b = 2.8. Run the function for a Stegosauria that is 6 meters long.

Challenge (optional): Rewrite your function so that instead of calculating mass directly it sets the values of a and b to the values for the species (or to NA if the species doesn’t have an equation) and then calls another function to do the basic mass = a * length ^ b calculation.

13.7 Automatically extracting functions

  • Can pull code out into functions
  • Highlight the code
  • Code -> Extract Function
  • Provide a name for the function

13.8 Nested conditionals

  • Sometimes decisions are more complicated
  • For example we might have different equations for some vegetation types based on the age of the plant
  • Can “nest” conditionals inside of one another
  • First checks if the vegetation type is “shrub”
  • If it is checks to see if it is < 5 years old
  • If so does one calculation, if not does another
  • But nesting can be difficult to follow so try to minimize it

13.8.1 Task 4: Basic If Statements

Important

Do Task 4 in Basic If Statements

4. Convert your conditional statement from Task 3 in Section 13.4.2 into a function that takes age_class as an argument and returns y. Call this function 5 times, once with each of the following values for age_class: “sapling”, “seedling”, “adult”, “mature”, “established”.