Posts in the 'Computer Science 🌋' category (Page 3)

Aggregation in Pandas

2023.05.23

GroupBy(), Continued. As we learned in the last lecture, a groupby operation involves some combination of splitting a DataFrame into grouped subframes, applying a function, and combining the results. Organizes all rows with the same year into a subframe for that year. Creates a new DataFrame with one row representing each subframe year. Combines all integer rows in each subframe using the sum function. ..
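The split, apply, and combine steps in this excerpt can be sketched by hand and compared with a single `groupby` call. This is only an illustration: the small `babynames` frame below is made-up data, not the dataset used in the post.

```python
import pandas as pd

# Hypothetical toy data (not the dataset from the post)
babynames = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021, 2021],
    "Name": ["Ava", "Liam", "Ava", "Liam", "Noah"],
    "Count": [10, 7, 12, 9, 4],
})

# Split: one subframe per distinct year
subframes = {year: sub for year, sub in babynames.groupby("Year")}

# Apply: sum the integer column of each subframe
sums = {year: sub["Count"].sum() for year, sub in subframes.items()}

# Combine: one entry per year
manual = pd.Series(sums, name="Count").rename_axis("Year")

# The same three steps in one call
grouped = babynames.groupby("Year")["Count"].sum()

print(manual)
print(grouped)
```

Both results hold one total per year, which is the "one row per subframe" shape the excerpt describes.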

Computer Science 🌋/Machine Learning🐼

Aggregating Data with GroupBy in Pandas

2023.05.23

GroupBy # aggregate all rows in babynames for a given year babynames.groupby("Year") # Output: ※ The reason for the strange output: calling .groupby has generated a GroupBy object! .agg ''' The .agg method takes in a function as its argument; this function is then applied to each column of a "mini" grouped DataFrame. We end up with a new DataFrame with one aggregated row per subframe ''' # return the numbe..
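A minimal sketch of the two points in this excerpt: `.groupby` alone returns a GroupBy object, and `.agg` turns it into a DataFrame with one row per group. The toy data and the helper `count_range` are assumptions made for illustration and do not come from the post.

```python
import pandas as pd

# Hypothetical toy data (not the dataset from the post)
babynames = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021, 2021],
    "Name": ["Ava", "Liam", "Ava", "Liam", "Noah"],
    "Count": [10, 7, 12, 9, 4],
})

# Calling .groupby alone does not compute anything yet
grouped = babynames.groupby("Year")
print(type(grouped).__name__)  # a GroupBy object, not a DataFrame

# .agg with a built-in aggregation name: one aggregated row per year
totals = grouped[["Count"]].agg("sum")

# .agg with a user-defined function (hypothetical helper),
# applied to each selected column of every "mini" DataFrame
def count_range(column):
    return column.max() - column.min()

spread = grouped[["Count"]].agg(count_range)

print(totals)
print(spread)
```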

Computer Science 🌋/Machine Learning🐼

Add & Remove Columns

2023.05.23

Add columns # specify the name of the new column -> dataframe["new_column"] # Add a column named "name_lengths" that includes the length of each name babynames["name_lengths"] = babynames["Name"].str.len() babynames.head(5) Sort by the temporary column # Sort by the temporary column babynames = babynames.sort_values(by="name_lengths", ascending=False) babynames.head() .map # First, define a ..
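The add, sort, and remove pattern from this excerpt can be sketched as follows. The three-row frame is made-up data, and dropping the helper column with `drop(columns=...)` is one common way to remove it, assumed here because the excerpt is cut off before that step.

```python
import pandas as pd

# Hypothetical toy data (not the dataset from the post)
babynames = pd.DataFrame({
    "Name": ["Ava", "Charlotte", "Liam"],
    "Count": [10, 5, 7],
})

# Add a temporary column holding the length of each name
babynames["name_lengths"] = babynames["Name"].str.len()

# Sort by the temporary column, longest names first
babynames = babynames.sort_values(by="name_lengths", ascending=False)

# Remove the temporary column once it is no longer needed
babynames = babynames.drop(columns="name_lengths")

print(babynames)
```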

Computer Science 🌋/Machine Learning🐼

Handy Utility Functions in Pandas

2023.05.23

NumPy bella_counts = babynames[babynames["Name"] == "Bella"]["Count"] # Average number of babies named Bella each year np.mean(bella_counts) # Max number of babies named Bella born in a given year max(bella_counts) .shape & .size # return a tuple containing the number of rows and columns babynames.shape # return the total number of elements in a structure, equivalent to the number of rows times ..
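A small runnable version of the calls in this excerpt, using made-up numbers rather than the post's dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the dataset from the post)
babynames = pd.DataFrame({
    "Year": [2020, 2021, 2022, 2020],
    "Name": ["Bella", "Bella", "Bella", "Liam"],
    "Count": [10, 14, 12, 7],
})

bella_counts = babynames[babynames["Name"] == "Bella"]["Count"]

# NumPy and built-in functions accept a Series directly
mean_count = np.mean(bella_counts)
max_count = max(bella_counts)

# .shape is (rows, columns); .size is rows * columns
n_rows, n_cols = babynames.shape
n_elements = babynames.size

print(mean_count, max_count, (n_rows, n_cols), n_elements)
```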

Computer Science 🌋/Machine Learning🐼

Conditional Selection in Pandas

2023.05.23

Conditional Selection # Ask yourself: why is :9 the correct slice to select the first 10 rows? babynames_first_10_rows = babynames.loc[:9, :] # Notice how we have exactly 10 elements in our boolean array argument babynames_first_10_rows[[True, False, True, False, True, False, True, False, True, False]] To make things easier, we can instead provide a logical condition as an input to .loc or [..
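The same ideas on a smaller scale, as a sketch with made-up data: `.loc` slices by label and includes the end label, a boolean list keeps the rows marked `True`, and a logical condition builds that boolean mask automatically.

```python
import pandas as pd

# Hypothetical toy data (not the dataset from the post)
babynames = pd.DataFrame({
    "Name": ["Ava", "Liam", "Noah", "Mia"],
    "Count": [10, 7, 4, 12],
})

# .loc slices are label-based and inclusive, so :3 selects labels 0 to 3 (4 rows)
first_four = babynames.loc[:3, :]

# A boolean list with one entry per row keeps only the True rows
every_other = first_four[[True, False, True, False]]

# A logical condition produces the boolean mask for us
popular = babynames[babynames["Count"] > 8]
popular_names = babynames.loc[babynames["Count"] > 8, "Name"]

print(every_other)
print(popular)
```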

Computer Science 🌋/Machine Learning🐼

Indexing in Pandas

2023.05.22

# elections.loc[0, "Candidate"] - Previous approach elections.iloc[0, 1] A DataFrame is a collection of Series that all share the same index. The index doesn't have to be an integer, nor does it have to be unique. # this sets the index to the "Candidate" column elections.set_index("Candidate", inplace=True) elections.index ''' Index(['~', '~',,,,'~'], dtype='object', name='Candidate', length=182) ''' # ..
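A brief sketch of label-based versus position-based access and of a non-integer, non-unique index. The `elections` rows and candidate names below are placeholders invented for this example.

```python
import pandas as pd

# Hypothetical toy data (placeholder names, not the post's dataset)
elections = pd.DataFrame({
    "Year": [2016, 2016, 2020],
    "Candidate": ["Candidate A", "Candidate B", "Candidate A"],
    "Votes": [100, 90, 120],
})

# Label-based and position-based access to the same cell
by_label = elections.loc[0, "Candidate"]
by_position = elections.iloc[0, 1]

# Use the "Candidate" column as the index (returns a new DataFrame)
by_candidate = elections.set_index("Candidate")

# The index holds strings and contains a repeated label
print(by_candidate.index.is_unique)

# Selecting a repeated label returns every matching row
rows_for_a = by_candidate.loc["Candidate A"]
print(rows_for_a)
```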

Computer Science 🌋/Machine Learning🐼

Basics in Pandas

2023.05.22

Data cleaning: Data cleaning corrects issues in the structure and formatting of data, including missing values and unit conversions. Exploratory data analysis (EDA): EDA describes the process of transforming raw data into insightful observations. It is an open-ended analysis of transforming, visualizing, and summarizing patterns in data. # 'pd' is the conventional alias for Pandas, as 'np' is for Num..

Computer Science 🌋/Machine Learning🐼

Data Science Lifecycle

2023.05.22

1. Ask a Question What do we want to know? A question that is too ambiguous may lead to confusion. What problems are we trying to solve? The goal of asking a question should be clear in order to justify your efforts to stakeholders. What are the hypotheses we want to test? This gives a clear perspective from which to analyze final results. What are the metrics for our success? This gives a clear ..

Computer Science 🌋/Git

Git Commands

2023.05.06

Git Client Install $ sudo apt install git Git Contribution Setting (the author recorded when pushing to the GitHub server) $ git config --global user.name "~" $ git config --global user.email "~@gmail.com" Set VS Code as the default editor $ git config --global core.editor code-oss Set the root folder as a local repository $ git init Command to check $ ls -a // success if you see .git D..

Computer Science 🌋/Network Programming 🛰️

MAC Protocols (3)

2023.05.06

Today, I learned about CSMA collisions, "taking turns" MAC protocols, and cable Internet. Compared to ALOHA, CSMA is a more polite protocol. CSMA: Carrier Sense Multiple Access. CSMA: listen before transmit: If channel sensed idle: transmit entire frame. If channel sensed busy, defer transmission. In human analogy, "don't interrupt others!" Collision can still occur: Due to propagation..
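The listen-before-transmit rule, and why a collision can still happen, can be sketched with a toy timing model. Everything here is an assumption made for illustration: two nodes, a fixed propagation delay, abstract time units, and the hypothetical function names `senses_busy` and `csma_attempt`. It is not a full CSMA implementation.

```python
def senses_busy(sense_time, other_start, other_duration, prop_delay):
    """Return True if the other node's signal is detectable at sense_time.

    The signal reaches the listener prop_delay after it was sent and
    lasts for other_duration (toy model, abstract time units).
    """
    arrival = other_start + prop_delay
    return arrival <= sense_time < arrival + other_duration


def csma_attempt(sense_time, other_start, other_duration, prop_delay):
    """Decide what happens when a node listens at sense_time.

    Assumes the other node started transmitting at or before sense_time.
    """
    if sense_time < other_start:
        raise ValueError("toy model assumes the other node started first")
    if senses_busy(sense_time, other_start, other_duration, prop_delay):
        return "defer"  # channel sensed busy: wait
    if sense_time < other_start + prop_delay:
        # Channel looks idle only because the signal has not arrived yet
        return "collision"
    return "transmit"  # channel is idle: send the entire frame


# Other node sends at t=0 for 10 units; propagation delay is 1 unit
print(csma_attempt(0.5, 0.0, 10.0, 1.0))   # signal still in flight
print(csma_attempt(2.0, 0.0, 10.0, 1.0))   # carrier detected
print(csma_attempt(11.0, 0.0, 10.0, 1.0))  # frame has fully passed
```

In this model the collision window is exactly the propagation delay, so a longer delay between nodes leaves more time for a second node to start sending before it can hear the first.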
