R ifelse tripped me up

In the continuing saga of climbing the R learning curve, I just found a bug in older code.

Although in hindsight I can’t see what on earth I was thinking.

I have been using ifelse as sort of a replacement for case statements. ifelse is cool because if you have a vector of things that take a handful of values, or that you want to split on a value, you can write a single ifelse call that works on the whole vector. So for example:

## assume ts is a POSIXlt object from January 1, 2008 through January 1, 2009
## (POSIXlt components are accessed with $, and wday == 6 is Saturday)
saturdays <- ifelse(ts$wday == 6, saturdayvalue, otherdayvalue)

This isn’t a great example, because you can accomplish the same thing by indexing. There are cases when it is much more useful than indexing.
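
For comparison, here is a minimal sketch of the indexing version next to the ifelse version. The way ts is built and the values of saturdayvalue and otherdayvalue are made up for illustration; they are not from my real code.

```r
# Hypothetical stand-ins for the objects in the example above
ts <- as.POSIXlt(seq(as.Date("2008-01-01"), as.Date("2009-01-01"), by = "day"))
saturdayvalue <- 1   # assumed value, for illustration only
otherdayvalue <- 0   # assumed value, for illustration only

is.saturday <- ts$wday == 6

# ifelse version: one call over the whole vector
via.ifelse <- ifelse(is.saturday, saturdayvalue, otherdayvalue)

# indexing version: fill with the default, then overwrite the Saturdays
via.index <- rep(otherdayvalue, length(is.saturday))
via.index[is.saturday] <- saturdayvalue

all(via.ifelse == via.index)  # TRUE
```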

However, I made the mistake of thinking that ifelse knows what its target is (what context it is operating in). It doesn’t: the function only looks at its arguments, not at its expected result. So I did something like:

> testCondition <- 2
> list.of.data <- 1:10

> list.of.data
[1]  1  2  3  4  5  6  7  8  9 10
> target.list <- ifelse(testCondition==1,list.of.data,list.of.data*2)

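I expected target.list to be a list of 10 items, either doubled or not, depending on the value of testCondition. In fact, I got just one item, the first element of the doubled vector (the test is a single FALSE, so ifelse returns a single value from the "false" answer):

> target.list
[1] 2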
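
What I actually wanted here was a plain if/else, which picks one whole branch based on a single condition. A minimal sketch, reusing the variable names from above (the name one.item is just for illustration):

```r
testCondition <- 2
list.of.data <- 1:10

# ifelse() returns something shaped like the test, so a scalar test
# gives back a single value
one.item <- ifelse(testCondition == 1, list.of.data, list.of.data * 2)

# if/else evaluates to the whole vector from whichever branch is chosen
target.list <- if (testCondition == 1) list.of.data else list.of.data * 2

length(one.item)     # 1
length(target.list)  # 10
```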

But if you make the condition a vector rather than a scalar, you get a vector result. Consider this:

> ## ifelse needs a vector condition to generate a vector output
> testCondition <- rep(1:3,5)
> testCondition
[1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
> target.list <- ifelse(testCondition==1,list.of.data,rev(list.of.data))
> target.list
[1]  1  9  8  4  6  5  7  3  2 10 10  9  3  7  6

Which is an utterly crazy result. Even now, when I think I’m writing an example of what ifelse does, I was expecting a list of lists, with reversed lists where the condition was false. But no, ifelse generates as a result the same thing that it has for its condition: a simple vector.

Looking again at the code, what that ifelse command did is this: every time testCondition hit 1, it drew from the normally ordered list.of.data, and every time it hit 2 or 3, it drew from the reversed list.of.data, keeping the index in the vector going. So you start with the forward list at the first element, get elements 2 and 3 from the reversed list, then pull element 4 from the forward list, etc. Both lists are recycled when you hit element 11 in the output vector.

> rev(list.of.data)
[1] 10  9  8  7  6  5  4  3  2  1
> list.of.data
[1]  1  2  3  4  5  6  7  8  9 10
> testCondition==1
[1]  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
[13]  TRUE FALSE FALSE
> target.list
[1]  1  9  8  4  6  5  7  3  2 10 10  9  3  7  6
> 
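
To convince myself of that explanation, here is a rough sketch that rebuilds the same result by hand. It is not how ifelse is implemented internally, just the recycling-and-picking logic spelled out; the names test, yes, no and by.hand are made up for this illustration.

```r
list.of.data <- 1:10
testCondition <- rep(1:3, 5)
test <- testCondition == 1

# Stretch both answers to the length of the test by recycling
yes <- rep(list.of.data, length.out = length(test))
no  <- rep(rev(list.of.data), length.out = length(test))

# Start from the "false" answers, then overwrite the positions where test is TRUE
by.hand <- no
by.hand[test] <- yes[test]

all(by.hand == ifelse(test, list.of.data, rev(list.of.data)))  # TRUE
```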

So the shape of the result is entirely based on the test, not on the expected result or on the two true and false answers. If you have a single scalar for your test, then you’ll get a scalar for your answer. If you have a vector of 15 elements for your test, you’ll get a vector of 15 elements for your answer, with the two answers being recycled if and as necessary. And you can generate truly weird results if you don’t think about what you’re doing.
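
If what you want really is one whole vector per test value (the list of lists I was expecting), one way to get it is lapply with a plain if/else inside. This is only a sketch of one option; the argument name is.one is made up for illustration.

```r
list.of.data <- 1:10
testCondition <- rep(1:3, 5)

# One list element per test value: the forward vector where the test is TRUE,
# the reversed vector where it is FALSE
target.list <- lapply(testCondition == 1, function(is.one) {
  if (is.one) list.of.data else rev(list.of.data)
})

length(target.list)  # 15
target.list[[1]]     # forward: test was TRUE
target.list[[2]]     # reversed: test was FALSE
```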
