Using a lambda can save you having to write a separate named function.
If you’ve not used ‘map’ before, we’ll also show in an example how it can perform the same task as ‘apply’.
import pandas as pd
import numpy as np
reviews = pd.read_csv("winemag-data-130k-v2.csv",index_col=0)
Next we’ll drop any rows full of NaNs.
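As a rough sketch of this step, assuming "full of NaNs" means every column in the row is missing, dropna with how="all" would do it. The small `sample` DataFrame below is made up for illustration and is not the wine data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reviews data (values are invented)
sample = pd.DataFrame({
    "country": ["Italy", np.nan, "US", "France"],
    "price": [np.nan, np.nan, 14.0, 65.0],
})

# how="all" drops a row only when every column in it is NaN
cleaned = sample.dropna(how="all")
print(cleaned)
```

Note that a row with only some values missing (such as a missing price) is kept, so NaN prices can still appear afterwards.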
Now we have good data…
We can now use a lambda expression to run all the way down the price column and show whether each price is more or less than the mean:
reviews_price_mean = reviews.price.mean()
reviews.price.apply(lambda p: p - reviews_price_mean)
What does this do exactly?
In ‘lambda p’, p stands for the price value in each row
p - reviews_price_mean
We subtract the mean review price from the ‘p’ value to give us a positive or negative value compared to the mean price.
By passing it to apply we can run it on every row of the price column.
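To make the "saves you writing a function" point concrete, here is a minimal sketch comparing the lambda with a named function. The `prices` Series and the `diff_from_mean` helper are hypothetical names used only for this illustration.

```python
import pandas as pd

# Hypothetical prices, not taken from the wine dataset
prices = pd.Series([10.0, 20.0, 60.0], name="price")
price_mean = prices.mean()  # 30.0

# Named-function version of the same calculation
def diff_from_mean(p):
    return p - price_mean

with_def = prices.apply(diff_from_mean)

# Lambda version: same logic, no separate function definition
with_lambda = prices.apply(lambda p: p - price_mean)

print(with_lambda.tolist())  # [-20.0, -10.0, 30.0]
```

Both calls should return the same Series; the lambda just keeps the one-line calculation inline.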
We can create a new column reviews['price_diff'] and set that equal to the result of our lambda function.
0               NaN
1        -20.363389
2        -21.363389
3        -22.363389
4         29.636611
            ...
129966    -7.363389
129967    39.636611
129968    -5.363389
129969    -3.363389
129970   -14.363389
Name: price, Length: 129971, dtype: float64
The result now shows each price as +/- the mean.
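Storing the result as a new column might look like the sketch below. It uses a tiny made-up DataFrame (`reviews_small`) rather than the real file, so the numbers are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the reviews DataFrame; the first price is missing
reviews_small = pd.DataFrame({"price": [np.nan, 10.0, 20.0, 60.0]})

# mean() skips NaN by default, so this is the mean of 10, 20 and 60
small_mean = reviews_small.price.mean()

# Assign the result of apply to a new column
reviews_small["price_diff"] = reviews_small.price.apply(lambda p: p - small_mean)
print(reviews_small)
```

A missing price stays NaN in the new column, since NaN minus the mean is still NaN.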
Using map on the price column gives the same results:
reviews.price.map(lambda p: p - reviews_price_mean)
Both of these ways allow you to apply a function without the need for a traditional Python ‘for’ loop.
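For comparison, here is a small sketch that puts the ‘for’ loop version next to apply and map. The `prices` Series is invented for the example.

```python
import pandas as pd

# Hypothetical prices, not taken from the wine dataset
prices = pd.Series([10.0, 20.0, 60.0], name="price")
price_mean = prices.mean()

# Traditional loop: build the result one value at a time
loop_result = []
for p in prices:
    loop_result.append(p - price_mean)

# The same calculation with apply and with map
via_apply = prices.apply(lambda p: p - price_mean)
via_map = prices.map(lambda p: p - price_mean)

print(via_apply.equals(via_map))
```

All three produce the same values; apply and map simply return them as a Series with the original index.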