Data Science

Computer Science/Machine Learning

Life Cycle and Design Review

SRS Review
Q1: Suppose we have 6 people named A, B, C, D, E, and F, and we take an SRS of size 2. What is P(A in sample)? What is P(C and D in sample)?
A: There are 15 equally likely samples: AB, AC, AD, AE, AF, BC, BD, BE, BF, CD, CE, CF, DE, DF, EF. Five of them contain A, so P(A in sample) = 5/15 = 1/3. Only one contains both C and D, so P(C and D in sample) = 1/15.
Q2: We have two classrooms: D8 and D100. D8 has 10 students not named Sam. D100 has 4 students, one is named Sam. Suppose we flip a fair coin to pick a classr..
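The Q1 answer can be checked by brute force. The sketch below enumerates every size-2 sample with the standard library and counts the favorable ones; the helper name `prob` is made up for this illustration and is not from the original post.

```python
from fractions import Fraction
from itertools import combinations

people = ["A", "B", "C", "D", "E", "F"]

# In a simple random sample (SRS) of size 2, every unordered pair is equally likely.
samples = list(combinations(people, 2))

def prob(event):
    """Probability of an event, given as a predicate on a sample (hypothetical helper)."""
    favorable = sum(1 for s in samples if event(s))
    return Fraction(favorable, len(samples))

p_a = prob(lambda s: "A" in s)                 # A is in the sample
p_cd = prob(lambda s: "C" in s and "D" in s)   # both C and D are in the sample

print(len(samples), p_a, p_cd)  # 15 1/3 1/15
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed 1/3 and 1/15.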


Basics in Pandas

Data cleaning: Data cleaning corrects issues in the structure and formatting of data, including missing values and unit conversions.
Exploratory data analysis (EDA): EDA describes the process of transforming raw data into insightful observations. It is an open-ended analysis that involves transforming, visualizing, and summarizing patterns in data.
# 'pd' is the conventional alias for Pandas, as 'np' is for Num..
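As a minimal sketch of the two ideas above, the example below cleans a tiny DataFrame and then summarizes it. The data, the column names, and the choice to fill the missing value with the column mean are all assumptions made for illustration; the original post may handle missing values differently.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: heights recorded in inches, with one missing value.
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cy"],
    "height_in": [60.0, np.nan, 70.0],
})

# Data cleaning, step 1: fill the missing height with the column mean
# (one possible strategy, assumed here for illustration).
df["height_in"] = df["height_in"].fillna(df["height_in"].mean())

# Data cleaning, step 2: unit conversion from inches to centimeters.
df["height_cm"] = df["height_in"] * 2.54

# EDA: summarize the cleaned column (count, mean, std, min, quartiles, max).
summary = df["height_cm"].describe()

print(df)
print(summary)
```

Mean imputation is only one option; dropping rows with `dropna` or leaving the values as missing may be more appropriate depending on the question being asked.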


Data Science Lifecycle

1. Ask a Question
What do we want to know? A question that is too ambiguous may lead to confusion.
What problems are we trying to solve? The goal of asking a question should be clear in order to justify your efforts to stakeholders.
What are the hypotheses we want to test? This gives a clear perspective from which to analyze final results.
What are the metrics for our success? This gives a clear ..

KB0129
Posts tagged 'Data Science'