map & lambda

Introduction

Using lambda can save you having to write a function.

If you’ve not used ‘map’ then we’ll show you how it can perform the same task as lambda in an example

import pandas as pd
pd.set_option('max_rows',10)
import numpy as np
reviews = pd.read_csv("winemag-data-130k-v2.csv",index_col=0)
reviews

next we’ll drop any rows full of NaNs

reviews.dropna()

now we have good data…

reviews.price.mean()

35.363389129985535

We can now use a lambda expression to run all the way down the price column and update it to show whether it is more or less than the mean:

reviews_price_mean = reviews.price.mean()
reviews.price.apply(lambda p : p - reviews_price_mean)

What does this do exactly?

lambda p is equivalent to the price value in each row

p - reviews_price_mean

We subtract the mean review price from the ‘p’ value to give us a positive or negative value compared to the mean price.

By applying it with apply we can go all the way through the dataframe.

We can create a new column reviews[‘price_dfif’] and set that equal to the result of our lambda function.

0               NaN
1        -20.363389
2        -21.363389
3        -22.363389
4         29.636611
            ...    
129966    -7.363389
129967    39.636611
129968    -5.363389
129969    -3.363389
129970   -14.363389
Name: price, Length: 129971, dtype: float64

The result now shows as a price as +/- the mean

Summary:

Using map gives the same results:

reviews.price_diff.map(lambda p : p - reviews_price_mean)
wine-reviews-price-vs-points
For those interested, here are the prices vs points from the reviewers

Both of these ways allow you to apply a function without the need for a traditional Python ‘for’ loop.

Previous article

ffill_bfill