Computer Science 🌋

Computer Science 🌋/Machine Learning

Aggregation in Pandas

GroupBy(), Continued
As we learned last lecture, a groupby operation involves some combination of splitting a DataFrame into grouped subframes, applying a function, and combining the results. Organizes all rows with the same year into a subframe for that year. Creates a new DataFrame with one row representing each subframe year. Combines all integer rows in each subframe using the sum function. ..
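The split, apply, and combine steps can be sketched by hand and compared with a single groupby call. This is only a minimal sketch: the tiny babynames table below is made-up data, and the names per_year, manual, and grouped are my own, not from the lecture.

```python
import pandas as pd

# Hypothetical toy data standing in for the real babynames table.
babynames = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021],
    "Name": ["Bella", "Liam", "Bella", "Liam"],
    "Count": [10, 20, 30, 40],
})

# Split: iterate over one subframe per year.
# Apply: sum the integer "Count" column of each subframe.
per_year = {year: sub["Count"].sum() for year, sub in babynames.groupby("Year")}

# Combine: build a new DataFrame with one row per year.
manual = pd.DataFrame({"Count": per_year}).rename_axis("Year")

# The same three steps in a single call.
grouped = babynames.groupby("Year")[["Count"]].sum()

print(manual)
print(grouped)
```

Both printed tables should hold one row per year with the summed counts.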

Computer Science 🌋/Machine Learning

Aggregating Data with GroupBy in Pandas

GroupBy
# aggregate all rows in babynames for a given year
babynames.groupby("Year")
# Output:
※ The reason for the strange output: calling .groupby has generated a GroupBy object!
.agg
'''
The .agg method takes in a function as its argument; this function is then applied to each column of a "mini" grouped DataFrame. We end up with a new DataFrame with one aggregated row per subframe.
'''
# return the numbe..
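To see both points at once (groupby alone is not a DataFrame, and .agg turns it into one), here is a small sketch. The data is made up, and the helper function largest is a hypothetical name I chose for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the real babynames table.
babynames = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021],
    "Count": [10, 20, 30, 40],
})

grouped = babynames.groupby("Year")

# groupby alone returns a GroupBy object, not a DataFrame.
is_dataframe = isinstance(grouped, pd.DataFrame)


def largest(column):
    """Aggregation function: receives one column of one subframe."""
    return column.max()


# .agg applies the function to each column of each "mini" DataFrame
# and returns a new DataFrame with one row per year.
result = grouped.agg(largest)

print(is_dataframe)
print(result)
```

Any function that reduces a column to a single value can be passed to .agg in the same way.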

Computer Science 🌋/Machine Learning

Add & Remove Columns

Add columns
# specify the name of the new column -> dataframe["new_column"]
# Add a column named "name_lengths" that includes the length of each name
babynames["name_lengths"] = babynames["Name"].str.len()
babynames.head(5)
Sort by the temporary column
# Sort by the temporary column, longest names first
babynames = babynames.sort_values(by="name_lengths", ascending=False)
babynames.head()
.map
# First, define a ..
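The add, sort, and remove cycle can be tried on a few rows. This is a sketch under assumptions: the data is made up, and the drop(columns=...) call at the end is my guess at how the temporary column gets removed, since the excerpt is cut off before that step.

```python
import pandas as pd

# Hypothetical toy data standing in for the real babynames table.
babynames = pd.DataFrame({
    "Name": ["Al", "Isabella", "Liam"],
    "Count": [5, 7, 9],
})

# Add a temporary column holding the length of each name.
babynames["name_lengths"] = babynames["Name"].str.len()

# Sort by the temporary column, longest names first.
babynames = babynames.sort_values(by="name_lengths", ascending=False)

# Remove the temporary column once the rows are in order (assumed step).
babynames = babynames.drop(columns="name_lengths")

print(babynames)
```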

Computer Science 🌋/Machine Learning

Handy Utility Functions in Pandas

NumPy
bella_counts = babynames[babynames["Name"] == "Bella"]["Count"]
# Average number of babies named Bella each year
np.mean(bella_counts)
# Max number of babies named Bella born in a given year
max(bella_counts)
.shape & .size
# return a tuple containing the number of rows and columns
babynames.shape
# return the total number of elements in a structure, equivalent to the number of rows times ..
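A quick way to check these utilities is to run them on a table small enough to verify by eye. The three-row babynames table below is made up for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real babynames table.
babynames = pd.DataFrame({
    "Name": ["Bella", "Liam", "Bella"],
    "Year": [2020, 2020, 2021],
    "Count": [10, 20, 30],
})

bella_counts = babynames[babynames["Name"] == "Bella"]["Count"]

average = np.mean(bella_counts)    # mean of 10 and 30
largest = max(bella_counts)        # larger of 10 and 30

n_rows, n_cols = babynames.shape   # (rows, columns)
total = babynames.size             # rows * columns

print(average, largest, n_rows, n_cols, total)
```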

Computer Science 🌋/Machine Learning

Conditional Selection in Pandas

Conditional Selection
# Ask yourself: why is :9 the correct slice to select the first 10 rows?
babynames_first_10_rows = babynames.loc[:9, :]
# Notice how we have exactly 10 elements in our boolean array argument
babynames_first_10_rows[[True, False, True, False, True, False, True, False, True, False]]
To make things easier, we can instead provide a logical condition as an input to .loc or [..
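The same ideas on a smaller table: .loc slices by label and includes the end label, a boolean list keeps the rows marked True, and a logical condition builds that boolean array for us. The data and the threshold of 15 are made up for this sketch.

```python
import pandas as pd

# Hypothetical toy data with the default integer index 0..3.
babynames = pd.DataFrame({
    "Name": ["Bella", "Liam", "Noah", "Emma"],
    "Count": [10, 20, 30, 40],
})

# .loc slices are label-based and inclusive, so :2 selects labels 0, 1, and 2.
first_3_rows = babynames.loc[:2, :]

# One boolean per row: keep rows 0 and 2.
picked = first_3_rows[[True, False, True]]

# A logical condition produces the boolean array automatically.
popular = babynames[babynames["Count"] > 15]
popular_names = babynames.loc[babynames["Count"] > 15, "Name"]

print(picked)
print(popular_names)
```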

Computer Science 🌋/Machine Learning

Indexing in Pandas

# elections.loc[0, "Candidate"] - Previous approach
elections.iloc[0, 1]
A DataFrame is a collection of Series that all share the same index. The index doesn't have to be an integer, nor does it have to be unique.
# this sets the index to the "Candidate" column
elections.set_index("Candidate", inplace=True)
elections.index
'''
Index(['~', '~', ..., '~'], dtype='object', name='Candidate', length=182)
'''
# ..
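A small sketch of label-based versus position-based access, followed by a non-unique string index. The three-row elections table and its candidate names are placeholders I made up, not the real dataset.

```python
import pandas as pd

# Hypothetical toy data; "Candidate" is the column at position 1.
elections = pd.DataFrame({
    "Year": [2016, 2016, 2020],
    "Candidate": ["A", "B", "A"],
    "Party": ["X", "Y", "X"],
})

by_label = elections.loc[0, "Candidate"]   # row label 0, column name
by_position = elections.iloc[0, 1]         # row position 0, column position 1

# Use the "Candidate" column as the index; labels are strings and may repeat.
elections.set_index("Candidate", inplace=True)
index_is_unique = elections.index.is_unique

# With a repeated label, .loc returns every matching row.
rows_for_a = elections.loc["A"]

print(by_label, by_position, index_is_unique)
print(rows_for_a)
```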

Computer Science 🌋/Machine Learning

Basics in Pandas

Data cleaning: Data cleaning corrects issues in the structure and formatting of data, including missing values and unit conversions.
Exploratory data analysis (EDA): EDA describes the process of transforming raw data into insightful observations. It is an open-ended analysis that involves transforming, visualizing, and summarizing patterns in data.
# 'pd' is the conventional alias for Pandas, as 'np' is for Num..

Computer Science 🌋/Machine Learning

Data Science Lifecycle

1. Ask a Question
What do we want to know? A question that is too ambiguous may lead to confusion.
What problems are we trying to solve? The goal of asking a question should be clear in order to justify your efforts to stakeholders.
What are the hypotheses we want to test? This gives a clear perspective from which to analyze final results.
What are the metrics for our success? This gives a clear ..

Computer Science 🌋/Git

Git Commands

Git Client Install
$ sudo apt install git
Git Contribution Setting (the author shown when pushing to the GitHub server)
$ git config --global user.name "~"
$ git config --global user.email "~@gmail.com"
Set VS Code as the default editor
$ git config --global core.editor code-oss
Set the root folder as a local repository
$ git init
Command to check
$ ls -a   # success if you see .git
D..

Computer Science 🌋/Network Programming 🛰️

MAC Protocols (3)

Today, I learned about CSMA collisions, "taking turns" MAC protocols, and cable Internet. Compared to ALOHA, CSMA is a more polite protocol.
CSMA: Carrier Sense Multiple Access
CSMA: listen before transmitting. If the channel is sensed idle, transmit the entire frame. If the channel is sensed busy, defer transmission. In a human analogy, "don't interrupt others!"
Collisions can still occur: Due to propagation..
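The listen-before-transmit rule can be sketched with two nodes on a shared channel. This is a toy model under my own assumptions, not something from the lecture: node A starts sending first, node B becomes ready at some later time, and A's signal needs prop_delay time units to reach B. The function names and the numbers are hypothetical.

```python
def senses_busy(a_start, frame_time, prop_delay, listen_time):
    """True if A's signal is present at node B at listen_time."""
    arrives = a_start + prop_delay
    return arrives <= listen_time < arrives + frame_time


def outcome(a_start, b_ready, frame_time, prop_delay):
    """What happens when B becomes ready to send (assumes b_ready >= a_start)."""
    if senses_busy(a_start, frame_time, prop_delay, b_ready):
        # Channel sensed busy: B is polite and defers.
        return "defer"
    if a_start <= b_ready < a_start + prop_delay:
        # A is already sending, but its signal has not reached B yet,
        # so B senses an idle channel, transmits, and the frames collide.
        return "collision"
    # A's frame has completely passed B; the channel really is idle.
    return "transmit"


# A starts at t=0, frames last 10 time units, propagation delay is 2.
print(outcome(0, 1, 10, 2))   # B ready before A's signal arrives
print(outcome(0, 3, 10, 2))   # B ready while A's signal is passing
print(outcome(0, 12, 10, 2))  # B ready after A's frame has passed
```

In this model the collision window is exactly the propagation delay, so a longer delay between nodes leaves more room for collisions even though both nodes listen first.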

KB0129
Posts in the 'Computer Science 🌋' category (Page 3)