
Control flow in reduce/inject

On how to skip items, conditionally apply logic and how to break or stop iteration early when using reduce/inject in Ruby.

reduce (also available as inject) is one of the most powerful methods on the Enumerable module, which means it is available on instances of any class that includes this module, including Array, Hash, Set and Range.
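As a quick illustration of that claim, here is a minimal sketch that runs the same reduce call on several Enumerable classes. The helper name sum_values is made up for this example.

```ruby
require "set"

# Hypothetical helper: accepts anything that includes Enumerable
def sum_values(enum)
  enum.reduce(0) { |sum, value| sum + value }
end

sum_values([1, 2, 3])      # => 6
sum_values(1..4)           # => 10
sum_values(Set[1, 2, 3])   # => 6

# A Hash yields [key, value] pairs, so destructure them in the block
{ a: 1, b: 2 }.reduce(0) { |sum, (_key, value)| sum + value } # => 3
```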

reduce can be used in a MapReduce process, is often the basis for comprehensions, and is a great way to group values or to reduce a set of values to a single value.

This article quickly shows you how to skip values / conditionally return values during a reduce iteration and how to break early / return a different value and stop iteration.

Bridge In The Mist in Stockholm, Sweden
Photo by Anders Jildén (https://unsplash.com/@andersjilden) on Unsplash (https://unsplash.com/)

Recap 💬

From the documentation, given an instance enum (an Enumerable) calling enum.reduce:

# Combines all elements of enum by applying a binary
# operation, specified by a block or a symbol that names a
# method or operator.

An example of using reduce would be to write a function that sums all the elements in a collection:

##
# Sums each item in the enumerable (naive)
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  sum = 0
  enum.each do |item|
    sum += item
  end
  sum
end

##
# Sums each item in the enumerable (reduce block)
#
# Each iteration, the result of the block is passed in as previous_result.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce do |previous_result, item|
    previous_result + item
  end
end

##
# Sums each item in the enumerable (reduce method)
#
# Each iteration the :+ symbol is sent as a message to the current result with
# the next value as argument. The result is the new current result.
#
# @param [Enumerable] enum the enumeration of items to sum
# @return [Numeric] the sum
#
def summation(enum)
  enum.reduce(:+)
end

##
# Delegates to Enumerable#sum (available since Ruby 2.4)
#
def summation(enum)
  enum.sum
end

reduce takes an optional initial value, which is used instead of the first item of the collection, when given.
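A small sketch of the difference the initial value makes, most visibly on an empty collection:

```ruby
# Without an initial value the first item seeds the result,
# so an empty collection has nothing to return
[1, 2, 3].reduce(:+)      # => 6
[].reduce(:+)             # => nil

# With an initial value the operation is applied to every item,
# and an empty collection simply returns the initial value
[1, 2, 3].reduce(10, :+)  # => 16
[].reduce(0, :+)          # => 0
```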

How to control the flow?

When working with reduce you might find yourself in one of two situations:

  • you want to conditionally return a different value for the iteration (which is used as base value for the next iteration)
  • you want to break out early (stop iteration altogether)

next ⏭

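The next keyword returns early from the current block invocation and moves on to the next item. The value you pass to next becomes the result of the block, which reduce then uses as the result for the following iteration. This works in any method that yields to a block.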

Let’s say you want the sum of a set of numbers, but with half of any even number and double of any odd number:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    result + i * (i.even? ? 0.5 : 2)
  end
end

Not too bad. But now another business requirement comes in to skip any number under 5:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    if i < 5
      result
    else
      result + i * (i.even? ? 0.5 : 2)
    end
  end
end

Ugh. That’s not very nice Ruby code. Using next it could look like:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
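One thing to watch out for: inside reduce, next should carry the current result. A bare next makes the block yield nil, and nil then becomes the result for the following item. The two helper names below are made up for this sketch.

```ruby
# Skips the number 2 and hands the unchanged sum to the next iteration
def sum_skipping_two(enum)
  enum.reduce(0) do |sum, n|
    next sum if n == 2
    sum + n
  end
end

sum_skipping_two([1, 2, 3]) # => 4

# Same idea, but with a bare next: the block yields nil for the number 2,
# so the following item ends up calling nil + 3
def broken_sum_skipping_two(enum)
  enum.reduce(0) do |sum, n|
    next if n == 2
    sum + n
  end
end

begin
  broken_sum_skipping_two([1, 2, 3])
rescue NoMethodError => e
  puts "bare next lost the result: #{e.class}"
end
```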

next works in any enumeration, so if you’re just processing items using .each , you can use it too:

(1..10).each do |num|
  next if num.odd?
  puts num
end
# 2
# 4
# 6
# 8
# 10
# => 1..10

break 🛑

Instead of skipping to the next item, you can completely stop iteration of an enumerator using break. The value you pass to break becomes the return value of the whole reduce call.

If we have the same business requirements as before, but we have to return the number 42 if the item is exactly 7, this is what it would look like:

def halfly_even_doubly_odd(enum)
  enum.reduce(0) do |result, i|
    break 42 if i == 7
    next result if i < 5
    next result + i * 0.5 if i.even?
    result + i * 2
  end
end
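To make the return value of break explicit, here is a smaller sketch with a made-up helper, capped_sum, that gives up as soon as the running total would pass a limit:

```ruby
# Hypothetical helper: sums the numbers, but stops iterating and returns
# :over_limit as soon as the running total would exceed the limit
def capped_sum(enum, limit)
  enum.reduce(0) do |total, n|
    break :over_limit if total + n > limit
    total + n
  end
end

capped_sum([1, 2, 3], 10) # => 6
capped_sum([5, 6, 7], 10) # => :over_limit (7 is never visited)
```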

Again, this works in any loop. So if you’re using find to try to find an item in your enumeration and want to change the return value of that find, you can do so using break:

def find_my_red_item(enum)
  enum.find do |item|
    # items are Hashes, so read the values by key
    break item[:name] if item[:color] == 'red'
  end
end

find_my_red_item([
  { name: "umbrella", color: "black" },
  { name: "shoe", color: "red" },
  { name: "pen", color: "blue" }
])
# => 'shoe'

StopIteration

You might have heard about or seen raise StopIteration. It is a special exception that an external enumerator (Enumerator#next) raises when it runs out of items, and it is rescued by Kernel#loop, which then simply stops looping. Other iterators such as each or reduce do not rescue it, so its use cases are limited, and you should not try to control flow using raise or fail anyway. The Airbrake blog has a good article about this use case.
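For completeness, a minimal sketch of where StopIteration does fit: Kernel#loop combined with an external enumerator. The helper name drain is made up for this example.

```ruby
# Hypothetical helper: pulls every item out of an external enumerator
def drain(enumerator)
  items = []
  loop do
    # Enumerator#next raises StopIteration once the enumerator is exhausted;
    # Kernel#loop rescues it and ends the loop
    items << enumerator.next
  end
  items
end

drain([1, 2, 3].each) # => [1, 2, 3]
```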

When to use reduce

If you need a guideline for when to use reduce, look no further. I use these four rules to determine if I need reduce, each_with_object or something else.

I use reduce when:

  • reducing a collection of values to a smaller result (e.g. 1 value)
  • grouping a collection of values (use group_by if possible)
  • changing immutable primitives / value objects (returning a new value)
  • you need a new value (e.g. new Array or Hash)
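As a sketch of the grouping and "new value" rules, the made-up helper below builds a fresh Hash on every iteration instead of mutating one. In real code group_by does the same job in one call.

```ruby
# Hypothetical helper: groups words by their first letter using reduce.
# Hash#merge returns a new Hash, so no value is mutated along the way.
def group_by_first_letter(words)
  words.reduce({}) do |groups, word|
    key = word[0]
    groups.merge(key => groups.fetch(key, []) + [word])
  end
end

group_by_first_letter(%w[apple avocado banana cherry])
# => {"a"=>["apple", "avocado"], "b"=>["banana"], "c"=>["cherry"]}

# Equivalent, and preferable when it fits:
%w[apple avocado banana cherry].group_by { |word| word[0] }
```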

Alternatives 🔀

When the use case does not match the guidelines above, most of the time I actually need each_with_object which has a similar signature, but does not build a new value based on the return value of a block, but instead iterates the collection with a predefined “object”, making it much easier to use logic inside the block:

doubles = (1..10).each_with_object([]) do |num, result|
  result << num * 2
  # same as result.push(num * 2)
end
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

doubles_over_ten = (1..10).each_with_object([]) do |num, result|
  result << num * 2 if num > 5
end
# => [12, 14, 16, 18, 20]

Use each_with_object when:

  • building a new container (e.g. Array or Hash). Note that you’re not really reducing the current collection to a smaller result, but instead conditionally or unconditionally mapping values.
  • you want logic in your block without repeating the result value (because you must provide a return value when using reduce)
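The second point is easiest to see with a skip. In this sketch (the helper name is made up) a bare next is fine, because each_with_object ignores the return value of the block and always passes the same object along:

```ruby
# Hypothetical helper: maps words of at least two characters to their length
def long_word_lengths(words)
  words.each_with_object({}) do |word, result|
    next if word.length < 2   # no need to write "next result" here
    result[word] = word.length
  end
end

long_word_lengths(%w[a bb ccc]) # => {"bb"=>2, "ccc"=>3}
```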

My use case

The reason I looked into control flow using reduce is that I was iterating through a list of value objects that represented a migration path. Without using lazy, I wanted an elegant way of representing when these migrations should run, so I used semantic versioning. The migrations enumerable is a sorted list of migrations with a semantic version attached.

migrations.reduce(input) do |migrated, (version, migration)|
  migrated = migration.call(migrated)
  next migrated unless current_version.in_range?(version)
  break migrated
end

The function in_range? determines if a migration is executed, based on the current “input” version, and the semantic version of the migration. This will execute migrations until the “current” version becomes in-range, at which point it should execute the final migration and stop.
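To see the next / break combination run end to end, here is a simplified, self-contained sketch. Everything in it is a stand-in: the versions are plain integers rather than semantic versions, the migrations are lambdas that append a label, and the comparison version >= target_version replaces the real in_range? check.

```ruby
# Hypothetical migrations: [version, callable] pairs, sorted by version
MIGRATIONS = [
  [1, ->(data) { data + ["step 1"] }],
  [2, ->(data) { data + ["step 2"] }],
  [3, ->(data) { data + ["step 3"] }]
].freeze

# Runs migrations in order and stops right after the first one whose
# version reaches target_version (a stand-in for in_range?)
def migrate_up_to(migrations, input, target_version)
  migrations.reduce(input) do |migrated, (version, migration)|
    migrated = migration.call(migrated)
    next migrated unless version >= target_version
    break migrated
  end
end

migrate_up_to(MIGRATIONS, [], 2) # => ["step 1", "step 2"]
```

The migration for version 3 is never called, and no bookkeeping variable outside the block is needed.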

The alternatives were less favourable:

  • take_while, select and friends are able to filter the list, but it requires multiple iterations of the migrations collection (filter, then “execute”);
  • find would be a good candidate, but I needed to change the input so that would require me to have a bookkeeping variable keeping track of “migrated”. Bookkeeping variables are almost never necessary in Ruby.

Photo called 'It’s Own Kind of Tranquility', displaying a series of windmills on either side of a 'water street (canal)' in Alblasserdam, The Netherlands
Photo by Vishwas Katti (https://unsplash.com/@vishkatti) on Unsplash (https://unsplash.com/)
