Conditional Selection
# Ask yourself: why is :9 the correct slice to select the first 10 rows?
babynames_first_10_rows = babynames.loc[:9, :]
# Notice how we have exactly 10 elements in our boolean array argument
babynames_first_10_rows[[True, False, True, False, True, False, True, False, True, False]]
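As a hint for the question above, here is a minimal sketch on a made-up 12-row DataFrame (the `toy` data is an assumption for illustration, not the real babynames dataset). It contrasts label-based slicing with `.loc`, which includes the end label, against position-based slicing with `.iloc`, which excludes the end position, and then applies a hand-written boolean list.

```python
import pandas as pd

# Hypothetical stand-in for babynames, with the default integer index 0..11
toy = pd.DataFrame({"Name": list("ABCDEFGHIJKL"), "Count": range(12)})

by_label = toy.loc[:9, :]      # label-based slice: the end label 9 is included
by_position = toy.iloc[:9, :]  # position-based slice: position 9 is excluded

print(len(by_label))     # 10
print(len(by_position))  # 9

# A boolean list needs exactly one entry per row of the frame being filtered
mask = [True, False] * 5
every_other = by_label[mask]
print(every_other.index.tolist())  # [0, 2, 4, 6, 8]
```

Only the rows whose position lines up with a True entry are kept.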
Writing out a boolean array by hand is impractical for a large DataFrame. To make things easier, we can instead pass a logical condition to .loc or [], which evaluates to a boolean array of the necessary length.
# First, use a logical condition to generate a boolean array
logical_operator = (babynames["Sex"] == "F")
# Then, use this boolean array to filter the DataFrame
babynames[logical_operator].head()
The following print statements inspect the boolean array:
print("There are a total of {} values in 'logical_operator'".format(len(logical_operator)))
print("The 0th item in this 'logical_operator' is: {}".format(logical_operator.iloc[0]))
print("The 235790th item in this 'logical_operator' is: {}".format(logical_operator.iloc[235790]))
print("The 235791th item in this 'logical_operator' is: {}".format(logical_operator.iloc[235791]))
babynames.loc[babynames["Sex"] == "F"].head()
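To see why a condition works in both [] and .loc, the sketch below uses a small made-up DataFrame (the `toy` rows are assumptions, not real babynames records). The comparison produces one True/False value per row, and both selection styles return the same rows.

```python
import pandas as pd

# Hypothetical miniature of babynames
toy = pd.DataFrame({
    "Name": ["Mary", "John", "Anna", "James"],
    "Sex": ["F", "M", "F", "M"],
    "Year": [1910, 1910, 2005, 2005],
})

# The comparison is evaluated element-wise, giving one boolean per row
is_female = toy["Sex"] == "F"
print(is_female.tolist())  # [True, False, True, False]

# Both selection styles accept the boolean Series and keep the True rows
via_brackets = toy[is_female]
via_loc = toy.loc[is_female]
print(via_brackets.equals(via_loc))  # True
```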
# If we want to return data on all females born before the year 2000, we can write:
babynames[(babynames["Sex"] == "F") & (babynames["Year"] < 2000)].head()
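Conditions can be combined with & (and), | (or), and ~ (not). The sketch below, again on a made-up `toy` DataFrame, shows all three. The parentheses around each comparison matter: & and | bind more tightly than == and <, so leaving them out changes how Python groups the expression and typically raises an error.

```python
import pandas as pd

# Hypothetical miniature of babynames
toy = pd.DataFrame({
    "Name": ["Mary", "John", "Anna", "James"],
    "Sex": ["F", "M", "F", "M"],
    "Year": [1910, 1910, 2005, 2005],
})

# AND: both conditions must hold
female_pre_2000 = toy[(toy["Sex"] == "F") & (toy["Year"] < 2000)]
print(female_pre_2000["Name"].tolist())  # ['Mary']

# OR: at least one condition must hold
female_or_pre_2000 = toy[(toy["Sex"] == "F") | (toy["Year"] < 2000)]
print(female_or_pre_2000["Name"].tolist())  # ['Mary', 'John', 'Anna']

# NOT: invert a condition
not_female = toy[~(toy["Sex"] == "F")]
print(not_female["Name"].tolist())  # ['John', 'James']
```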
Boolean array selection is a useful tool, but can lead to overly verbose code for complex conditions.
# Note: the parentheses surrounding the code make it possible to break the code onto multiple lines for readability
(
    babynames[(babynames["Name"] == "Bella") |
              (babynames["Name"] == "Alex") |
              (babynames["Name"] == "Ani") |
              (babynames["Name"] == "Lisa")]
).head()
.isin
# .isin can be used to filter DataFrames: it selects the rows whose value in a given column appears in a list of allowed values
names = ["Bella", "Alex", "Ani", "Lisa"]
babynames[babynames["Name"].isin(names)].head()
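As a quick check that .isin is a shorthand for a chain of == comparisons joined with |, the sketch below compares the two on a made-up `toy` DataFrame (the rows are assumptions for illustration). It also shows ~ used to keep only the names that are not in the list.

```python
import pandas as pd

# Hypothetical miniature of babynames
toy = pd.DataFrame({
    "Name": ["Bella", "Noah", "Alex", "Nina", "Bella"],
    "Count": [12, 30, 7, 5, 9],
})

names = ["Bella", "Alex"]

# One call to .isin ...
with_isin = toy[toy["Name"].isin(names)]

# ... selects the same rows as the chained comparisons
with_or = toy[(toy["Name"] == "Bella") | (toy["Name"] == "Alex")]
print(with_isin.equals(with_or))  # True

# Negating the mask keeps every row whose name is NOT in the list
excluded = toy[~toy["Name"].isin(names)]
print(excluded["Name"].tolist())  # ['Noah', 'Nina']
```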
str.startswith
# .str.startswith can be used to define a filter based on the string values in a Series object
babynames[babynames["Name"].str.startswith("N")].head()
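A small self-contained sketch of the same idea follows, using a made-up `toy` DataFrame that includes one missing name (an assumption added here to illustrate a common pitfall). Passing na=False to .str.startswith treats missing values as non-matches, so the result is a clean boolean mask.

```python
import pandas as pd

# Hypothetical miniature of babynames, with one missing name
toy = pd.DataFrame({
    "Name": ["Noah", "Bella", "Nina", None, "Alex"],
    "Count": [30, 12, 5, 1, 7],
})

# na=False marks missing names as False instead of leaving them missing
starts_with_n = toy[toy["Name"].str.startswith("N", na=False)]
print(starts_with_n["Name"].tolist())  # ['Noah', 'Nina']
```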