Machine Learning Sandpit:
- Cover all related topics for the AWS ML certification
#ML Data Basics:
#Pre-processing cheat sheet
- Missing values
- Categorical values
- Normalize data
- Standardize data
- Feature extraction
- Feature selection
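
A minimal sketch of the difference between normalizing and standardizing, assuming a made-up `age` column (the data and the column names `age_norm` and `age_std` are illustrative only, not from these notes):

```python
import pandas as pd

# Hypothetical toy column; the values are invented for illustration.
df = pd.DataFrame({"age": [20, 30, 40, 50, 60]})

# Normalize (min-max): rescale values into the range [0, 1].
age_min, age_max = df["age"].min(), df["age"].max()
df["age_norm"] = (df["age"] - age_min) / (age_max - age_min)

# Standardize (z-score): shift to mean 0 and scale to unit standard deviation.
# ddof=0 uses the population standard deviation.
df["age_std"] = (df["age"] - df["age"].mean()) / df["age"].std(ddof=0)

print(df)
```

Normalizing keeps the values inside a fixed range, while standardizing centres them on zero. Which one fits better depends on the algorithm and on how sensitive the data is to outliers.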
#Data ingestion & validation
- Missing value treatment
- Imputation
- Encoding (binary, one-hot)
- Feature scaling
- Feature engineering
- Class imbalance
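
The steps above can be sketched on a small DataFrame. This is one possible approach, not the only one: the data, the column names (`income`, `colour`, `member`, `label`), and the choice of median and mode as fill values are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy data with gaps in a numeric and a categorical column.
df = pd.DataFrame({
    "income": [40.0, None, 60.0, 80.0],
    "colour": ["red", "blue", None, "red"],
    "member": ["yes", "no", "yes", "no"],
    "label": [0, 0, 0, 1],
})

# Imputation: fill numeric gaps with the median, categorical gaps with the mode.
df["income"] = df["income"].fillna(df["income"].median())
df["colour"] = df["colour"].fillna(df["colour"].mode()[0])

# Binary encoding: map a two-valued column to 0/1.
df["member"] = df["member"].map({"no": 0, "yes": 1})

# One-hot encoding: one indicator column per category.
encoded = pd.get_dummies(df, columns=["colour"])

# Class imbalance check: the share of rows in each class.
class_share = df["label"].value_counts(normalize=True)

print(encoded)
print(class_share)
```

A heavily skewed `class_share` is a hint that resampling or class weights may be worth considering before training.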
#Pandas Basics:
- df = pd.read_csv('data/test.csv')
- "df" is a DataFrame (the pandas container object for holding structured, tabular data); pd.read_csv returns it
- df.isnull().sum() returns a Series with the number of null values in each column (zero means the column has no missing values)
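
A small runnable sketch of the two calls above. Since the contents of 'data/test.csv' are not shown in these notes, an in-memory CSV string with invented columns (`id`, `age`, `city`) stands in for the file:

```python
import io

import pandas as pd

# Hypothetical stand-in for 'data/test.csv'; the columns and rows are invented.
csv_text = "id,age,city\n1,34,Leeds\n2,,York\n3,29,\n4,,Hull\n"

# read_csv accepts a file path or a file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# isnull() marks each missing cell as True; sum() then counts them per column.
null_counts = df.isnull().sum()
print(null_counts)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path.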