## Description

**Course Outline:**

**1. Introduction to Data Science**

- CRISP-DM framework
- Technology stack for Data Science

**2. RDBMS (Oracle) with SQL**

- SQL Introduction (DDL, DML)
- Joins
- Views, Triggers and Procedures
- Advanced SQL for Analytics

**3. Python programming**

- Variables and data types
- Standard I/O
- Operators
- Control flow (if else, for, while, break and continue)
- Data structures (lists, tuples, sets, dictionaries and strings)
- Functions (recursion, lambda functions, map, filter and reduce)
- Modules and Packages
- Working with Python libraries (os, datetime, sys)
- Exception Handling
- Object-oriented programming (classes, objects, OOP concepts)

**4. Exploratory Data Analysis**

- Basic statistics
- Hypothesis testing
- Data distributions (Central Limit Theorem)
- Introduction to visualization
- Plotting with Matplotlib and seaborn
- Introduction to Tableau for Reporting
- Percentiles and Quartiles
- IQR, box-plot and whiskers
- Bar charts, pie charts, line charts and pair plots
- Univariate, bivariate and multivariate analysis
- EDA case study
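
As a small illustration of the percentile, quartile and IQR topics listed above, the following is a minimal sketch using only the Python standard library. The sample data and the helper name `iqr_fences` are hypothetical, and the 1.5 × IQR whisker rule is the common box-plot convention rather than something this outline specifies.

```python
import statistics

def iqr_fences(values):
    """Return (q1, q3, iqr, lower_fence, upper_fence) using the 1.5 * IQR rule."""
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical sample with one obviously extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
q1, q3, iqr, low, high = iqr_fences(sample)

# Points outside the fences are the ones a box plot would draw beyond the whiskers.
outliers = [v for v in sample if v < low or v > high]
print(q1, q3, iqr, outliers)
```

Different quartile conventions (for example `method="exclusive"`) can give slightly different fences on small samples.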

**5. Python for Data Science**

- Introduction to NumPy and NumPy operations
- Getting started with pandas and pandas operations
- Sampling techniques
- Data preprocessing with pandas (Excel, CSV and PDF)
- Missing value analysis (NULL value treatment)
- Data normalization and standardization
- Outlier analysis and treatment
- Web scraping using Beautiful Soup, word clouds

**6. Machine Learning with Python**

**a) Linear Regression:**

- Algebra for regression
- Assumptions of linear regression
- Multiple regression
- Feature selection (VIF and p-values)
- Model building
- Parameter tuning for regression
- Model validation (accuracy, variance, R-squared)
- Bias-variance tradeoff
- Case study on regression

**b) Logistic Regression:**

- Logistic regression intuition
- Sigmoid function, mathematics behind logistic regression
- Feature engineering and collinearity
- Regularization (L1 and L2) and parameter tuning
- Case study on logistic regression
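
To make the sigmoid topic above concrete, here is a minimal sketch in plain Python. The function names, weights and feature values are hypothetical, chosen only for illustration; a real course exercise would more likely use a library such as scikit-learn.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number z into the interval (0, 1)."""
    # Branch on the sign of z so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, features):
    """Probability of the positive class for one observation."""
    # Linear score first, then squash it with the sigmoid.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical coefficients and a single two-feature observation.
p = predict_proba(weights=[1.0, -1.0], bias=0.0, features=[2.0, 2.0])
print(p)
```

A linear score of zero corresponds to a probability of 0.5, which is why 0.5 is the usual default decision threshold.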

**c) Decision Trees:**

- Decision trees introduction
- Homogeneity, Gini index and information gain
- Building decision trees and parameter tuning
- Truncating and pruning trees
- Random forest (ensembles)
- OOB (out-of-bag) error
- Cross-validation, bagging and boosting (XGBoost, AdaBoost and GBM)
- Case study on decision tree and random forest
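
The homogeneity measures named in the list above can be sketched in a few lines of standard-library Python. The helper names and the toy labels are hypothetical, and both functions assume a non-empty list of class labels.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for more mixed nodes."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits of the class distribution in a node."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical node with two classes, split perfectly into two pure children.
parent = ["a", "a", "b", "b"]
print(gini(parent), entropy(parent), information_gain(parent, ["a", "a"], ["b", "b"]))
```

A tree-building algorithm would evaluate a score like this for every candidate split and keep the split with the highest gain (or the lowest weighted impurity).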

**d) K-nearest neighbors for classification**

**e) Model deployment with PMML, H5 and pickle**