Map Filter and Reduce

No real consensus about what that means.

But there are some “classic” methods available in Python.

map

map “maps” a function onto a sequence of objects – It applies the function to each item in the list, returning another list

In [23]: l = [2, 5, 7, 12, 6, 4]
In [24]: def fun(x):
             return x*2 + 10
In [25]: map(fun, l)
Out[25]: [14, 20, 24, 34, 22, 18]

But if it’s a small function, and you only need it once:

In [26]: map(lambda x: x*2 + 10, l)
Out[26]: [14, 20, 24, 34, 22, 18]

filter

filter “filters” a sequence of objects with a boolean function – It keeps only those for which the function is True – filtering our the rest.

To get only the even numbers:

In [27]: l = [2, 5, 7, 12, 6, 4]
In [28]: filter(lambda x: not x%2, l)
Out[28]: [2, 12, 6, 4]

If you pass None to filter(), you get only items that evaluate to true:

In [1]: l = [1, 0, 2.3, 0.0, 'text', '', [1,2], [], False, True, None ]

In [2]: filter(None, l)
Out[2]: [1, 2.3, 'text', [1, 2], True]

reduce

reduce “reduces” a sequence of objects to a single object with a function that combines two arguments

To get the sum:

In [30]: l = [2, 5, 7, 12, 6, 4]
In [31]: reduce(lambda x,y: x+y, l)
Out[31]: 36

To get the product:

In [32]: reduce(lambda x,y: x*y, l)
Out[32]: 20160

or

In [13]: import operator
In [14]: reduce(operator.mul, l)
Out[14]: 20160

Comprehensions

Couldn’t you do all this with comprehensions?

Yes:

In [33]: [x+2 + 10 for x in l]
Out[33]: [14, 17, 19, 24, 18, 16]

In [34]: [x for x in l if not x%2]
Out[34]: [2, 12, 6, 4]

In [6]: l
Out[6]: [1, 0, 2.3, 0.0, 'text', '', [1, 2], [], False, True, None]
In [7]: [i for i in l if i]
Out[7]: [1, 2.3, 'text', [1, 2], True]

(Except Reduce)

But Guido thinks almost all uses of reduce are really sum()

Functional Programming

Comprehensions and map, filter, reduce are all “functional programming” approaches}

map, filter and reduce pre-date comprehensions in Python’s history

Some people like that syntax better

And “map-reduce” is a big concept these days for parallel processing of “Big Data” in NoSQL databases.

(Hadoop, MongoDB, etc.)