map & lambda
Introduction
Using a lambda can save you having to define a separate named function for a simple, one-off operation.
If you’ve not used ‘map’ before, we’ll also show you how it can perform the same task as ‘apply’ in the example below.
import pandas as pd
pd.set_option('display.max_rows', 10)
import numpy as np
reviews = pd.read_csv("winemag-data-130k-v2.csv",index_col=0)
reviews
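Next we’ll drop any rows that are entirely NaN. dropna returns a new DataFrame rather than changing reviews in place, so we assign the result back:
reviews = reviews.dropna(how='all')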
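To make the dropna behaviour concrete, here is a minimal sketch on a made-up three-row DataFrame (the name toy and its values are hypothetical, not taken from the wine data). It contrasts how='all', which only removes rows where every value is missing, with the default, which removes rows with any missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: row 1 is entirely NaN, row 2 is only missing a price.
toy = pd.DataFrame({
    "country": ["Italy", np.nan, "US"],
    "price": [20.0, np.nan, np.nan],
})

all_nan_dropped = toy.dropna(how="all")  # removes only row 1
any_nan_dropped = toy.dropna()           # default how="any": removes rows 1 and 2

# toy itself is untouched because dropna returns a new DataFrame.
print(len(toy), len(all_nan_dropped), len(any_nan_dropped))  # 3 2 1
```

Note that with how='all', rows that are only missing a price stay in the data, so NaN prices can still show up later.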
Now we can work out the mean price (missing prices are skipped by default):
reviews.price.mean()
35.363389129985535
We can now use a lambda expression to run all the way down the price column and show how far each price is above or below the mean:
reviews_price_mean = reviews.price.mean()
reviews.price.apply(lambda p: p - reviews_price_mean)
What does this do exactly?
lambda p defines an anonymous function whose parameter, p, receives the price value from each row in turn. The body of the function is:
p - reviews_price_mean
We subtract the mean review price from the ‘p’ value to give us a positive or negative value compared to the mean price.
By passing the lambda to apply we run it on every value in the price column and get the results back as a new Series.
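To see that the lambda really is just a function without a name, here is a small sketch using made-up prices (prices, price_mean and diff_from_mean are hypothetical names for illustration). It runs the same calculation once with a lambda and once with an ordinary def function.

```python
import pandas as pd

# Hypothetical toy prices, not taken from the wine dataset.
prices = pd.Series([10.0, 20.0, 60.0], name="price")
price_mean = prices.mean()  # 30.0


def diff_from_mean(p):
    """Return how far a single price is from the mean."""
    return p - price_mean


with_lambda = prices.apply(lambda p: p - price_mean)
with_named = prices.apply(diff_from_mean)

print(with_lambda.tolist())            # [-20.0, -10.0, 30.0]
print(with_lambda.equals(with_named))  # True
```

The lambda version is handy when the function is short and only used once; a named function is easier to reuse and to document.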
We can create a new column, reviews['price_diff'], and set it equal to the result of our lambda function. The output of the apply call looks like this:
0 NaN
1 -20.363389
2 -21.363389
3 -22.363389
4 29.636611
...
129966 -7.363389
129967 39.636611
129968 -5.363389
129969 -3.363389
129970 -14.363389
Name: price, Length: 129971, dtype: float64
The result now shows each price as its difference from the mean (+/-). Row 0 is NaN because that wine has no price recorded.
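The column assignment mentioned above could look something like the sketch below. It uses a made-up four-row DataFrame (toy and toy_price_mean are hypothetical names) so the numbers are easy to check by hand, and it includes one missing price to show how NaN carries through.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing price.
toy = pd.DataFrame({"price": [np.nan, 15.0, 25.0, 50.0]})

# mean() skips the NaN, so this is (15 + 25 + 50) / 3 = 30.0
toy_price_mean = toy.price.mean()

# Store the result of apply in a new column.
toy["price_diff"] = toy.price.apply(lambda p: p - toy_price_mean)

print(toy)
```

The first row of price_diff stays NaN, and the others come out as -15.0, -5.0 and 20.0.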
Summary:
Using map gives the same results:
reviews.price.map(lambda p: p - reviews_price_mean)
Both of these approaches let you apply a function to every value in a column without the need for a traditional Python ‘for’ loop.
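As a quick check of that claim, this sketch compares an explicit ‘for’ loop with map and apply on made-up prices (prices, price_mean and looped are hypothetical names for illustration).

```python
import pandas as pd

# Hypothetical toy prices, not taken from the wine dataset.
prices = pd.Series([10.0, 20.0, 60.0], name="price")
price_mean = prices.mean()  # 30.0

# The traditional loop: build the result one value at a time.
looped = []
for p in prices:
    looped.append(p - price_mean)

# The same calculation with map and with apply.
mapped = prices.map(lambda p: p - price_mean)
applied = prices.apply(lambda p: p - price_mean)

print(mapped.tolist() == looped)  # True
print(mapped.equals(applied))     # True
```

The loop gives a plain Python list, while map and apply hand back a Series that keeps the original index and name.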