NumPy
bella_counts = babynames[babynames["Name"] == "Bella"]["Count"]
# Average number of babies named Bella each year
np.mean(bella_counts)
# Maximum number of babies named Bella born in any single year
np.max(bella_counts)
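To try the NumPy calls above without the full dataset, here is a minimal sketch. It assumes a small, made-up stand-in for `babynames` (the name `toy_names` and its rows are invented for illustration) and shows that NumPy functions accept a pandas Series directly.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for babynames; the values are invented
toy_names = pd.DataFrame({
    "Year": [2019, 2020, 2021, 2021],
    "Name": ["Bella", "Bella", "Bella", "Liam"],
    "Count": [10, 20, 30, 40],
})

# Keep only the rows for "Bella", then take the "Count" column as a Series
toy_bella_counts = toy_names[toy_names["Name"] == "Bella"]["Count"]

# NumPy reductions work on a Series just as they do on an array
mean_count = np.mean(toy_bella_counts)
max_count = np.max(toy_bella_counts)
print(mean_count, max_count)
```

The Series methods `toy_bella_counts.mean()` and `toy_bella_counts.max()` should give the same results.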
.shape & .size
# Return a tuple of the form (number of rows, number of columns)
babynames.shape
# Return the total number of elements, i.e. the number of rows times the number of columns
babynames.size
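The relationship between `.shape` and `.size` can be checked on a small example. This is a sketch on an invented three-row DataFrame (`toy` is a hypothetical name, not part of the original dataset).

```python
import pandas as pd

# Hypothetical 3-row, 2-column DataFrame used only for illustration
toy = pd.DataFrame({
    "Name": ["Bella", "Liam", "Noah"],
    "Count": [30, 40, 50],
})

# .shape is a (rows, columns) tuple, so it can be unpacked
n_rows, n_cols = toy.shape

# .size counts every cell in the DataFrame
total_cells = toy.size
print(n_rows, n_cols, total_cells)
```

For a single column such as `toy["Count"]`, `.shape` is a one-element tuple and `.size` equals the number of rows.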
.describe()
# Return summary statistics (count, mean, standard deviation, minimum, quartiles, maximum) for each numeric column
babynames.describe()
# A different set of statistics (count, unique, top, freq) is reported when .describe() is called on a Series of strings
babynames["Sex"].describe()
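The difference between the two kinds of summary is easiest to see side by side. The sketch below assumes a tiny invented DataFrame (`toy`) with one text column and one numeric column.

```python
import pandas as pd

# Hypothetical data: one text column and one numeric column
toy = pd.DataFrame({
    "Sex": ["F", "F", "M"],
    "Count": [10, 20, 30],
})

# On a DataFrame, .describe() summarizes the numeric columns by default
numeric_summary = toy.describe()

# On a text Series, it reports count, unique, top (most frequent value), and freq
text_summary = toy["Sex"].describe()

print(numeric_summary)
print(text_summary)
```

Passing `include="all"` to `DataFrame.describe()` summarizes both kinds of column in one table, with `NaN` where a statistic does not apply.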
.sample()
# Random sampling underlies processes such as train-test splits, bootstrapping, and cross-validation
# .sample() quickly selects random entries (rows if called on a DataFrame, values if called on a Series); by default it returns one
babynames.sample()
babynames.sample(5).iloc[:, 2:]
# Sample 4 rows from the year 2000 with replacement, so the same row may be drawn more than once
babynames[babynames["Year"] == 2000].sample(4, replace = True).iloc[:, 2:]
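Because `.sample()` is random, its output changes from run to run. The sketch below, on an invented DataFrame (`toy`), shows how the `random_state` parameter makes a draw repeatable and how `replace=True` and `frac` can be used for a bootstrap-style resample and a full shuffle.

```python
import pandas as pd

# Hypothetical 5-row DataFrame used only for illustration
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Count": [1, 2, 3, 4, 5],
})

# The same random_state gives the same rows on every call
first_draw = toy.sample(3, random_state=42)
second_draw = toy.sample(3, random_state=42)

# Bootstrap-style resample: as many rows as the original, drawn with replacement
bootstrap = toy.sample(len(toy), replace=True, random_state=0)

# frac=1 returns every row exactly once, in shuffled order
shuffled = toy.sample(frac=1, random_state=0)

print(first_draw.equals(second_draw))
```

Without `replace=True`, asking for more rows than the DataFrame contains raises an error.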
.sort_values()
# Order a DataFrame or Series by a specified rule; ascending=False gives descending order
babynames.sort_values(by = "Count", ascending=False).head()
# By default, values are sorted in ascending numeric or alphabetical order
# Sort the names from 2021 by count, largest first
# Recall that .head() displays the first five rows of the DataFrame by default
babynames[babynames["Year"] == 2021].sort_values("Count", ascending=False).head()
# Specify the key parameter of .sort_values to sort by a transformed version of the column
# Here, the lambda receives the whole "Name" column as a Series, 'x', and returns the length of each name
babynames.sort_values("Name", key=lambda x: x.str.len(), ascending=False).head(5)
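The `key` argument can be explored on a small example. This sketch uses an invented DataFrame (`toy`) and compares the `key` approach with an alternative that adds a temporary helper column (`name_length`, a hypothetical name).

```python
import pandas as pd

# Hypothetical data with names of clearly different lengths
toy = pd.DataFrame({
    "Name": ["Al", "Christopher", "Bella"],
    "Count": [5, 7, 9],
})

# key receives the whole column as a Series and must return a Series of the same length
by_length = toy.sort_values("Name", key=lambda names: names.str.len(), ascending=False)

# Alternative: compute the lengths as a helper column, sort on it, then drop it
with_helper = (
    toy.assign(name_length=toy["Name"].str.len())
       .sort_values("name_length", ascending=False)
       .drop(columns="name_length")
)

print(by_length)
```

Both approaches should produce the same ordering here; the `key` version avoids creating and dropping an extra column.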