• support@conveytechlabs.com

500.00$

Description

Course Outline:

1. Introduction to Data Science

  • CRISP-DM Framework
  • Technology stack for Data Science

2. RDBMS (Oracle) with SQL

  • SQL Introduction (DDL, DML)
  • Joins
  • Views, Triggers and Procedures
  • Advanced SQL for Analytics

3. Python Programming

  • Variables and data types
  • Standard I/O
  • Operators
  • Control flow (if else, for, while, break and continue)
  • Data Structures (Lists, Tuples, Sets, Dictionaries and Strings)
  • Functions (recursive, lambda functions, map, filter and reduce)
  • Modules and Packages
  • Working with Python Libraries (os, datetime, sys)
  • Exception Handling
  • Object-Oriented Programming (Classes, Objects, OOP concepts)

4. Exploratory Data Analysis

  • Basic statistics
  • Hypothesis testing
  • Data distributions (Central Limit Theorem)
  • Introduction to visualization
  • Plotting with Matplotlib and Seaborn
  • Introduction to Tableau for Reporting
  • Percentiles and Quartiles
  • IQR, box plots and whiskers
  • Bar Charts, Pie Charts, Line and Pair charts
  • Univariate, bivariate and multivariate analysis
  • EDA case study

5. Python for Data Science

  • Introduction to NumPy and operations on NumPy
  • Getting started with pandas and operations on pandas
  • Sampling techniques
  • Data Preprocessing with pandas (Excel, CSV and PDF)
  • Missing value analysis (NULL value treatment)
  • Data Normalization and standardization
  • Outlier analysis and treatment
  • Web scraping using BeautifulSoup, word clouds

6. Machine Learning with Python

a) Linear Regression:

  • Algebra for regression
  • Assumptions of linear regression
  • Multiple regression
  • Feature selection (VIF and p-values)
  • Model building
  • Parameter tuning for regression
  • Model validation (Accuracy, Variance, R-squared)
  • Bias-variance tradeoff
  • Case study on regression

b) Logistic Regression:

  • Logistic regression intuition
  • Sigmoid function, mathematics behind logistic regression
  • Feature engineering and collinearity
  • Regularization (L1 and L2) and parameter tuning
  • Case study on logistic regression

c) Decision Trees:

  • Decision trees introduction
  • Homogeneity, Gini index and information gain
  • Building decision trees and parameter tuning
  • Truncating and pruning trees
  • Random forest (ensembles)
  • OOB (out-of-bag) error
  • Cross-validation, bagging and boosting (XGBoost, AdaBoost and GBM)
  • Case study on decision tree and random forest

d) K-nearest neighbors for classification

e) Model deployment with PMML, H5 and pickle

Course Description

Data Science is the study of the extraction of knowledge from data. Being a data scientist requires an integrated skill set spanning mathematics, statistics, machine learning, databases and other branches of computer science. This program is designed for fresh graduates and professionals looking to build a career in Data Science and Analytics. Candidates who complete the program can transition to roles such as data scientist, business analyst, data analyst and machine learning engineer by learning relevant data science techniques, tools and technologies, and by applying them hands-on in industry case studies.

Course Information

  • Duration: 40 hrs
  • Timings: Weekdays, 1-2 hours per day, or weekends, 2-3 hours per day
  • Method: Online/Classroom Training
  • Study Material: Soft Copy

Course Content

DATA SCIENCE | MACHINE LEARNING | BIG DATA ANALYTICS

(With Python)

 

Objectives:

  • Describe what Data Science is and the skill sets needed to be a data scientist.
  • Gain knowledge of every phase of the Data Science lifecycle.
  • Get familiar with Python programming and use Python for data science.
  • Learn RDBMS and SQL, and use SQL for analytics.
  • Learn the statistics and mathematics required for data analytics and machine learning.
  • Explain the significance of exploratory data analysis (EDA) in data science. Apply basic tools (plots, graphs, summary statistics) to carry out EDA.
  • Apply machine learning (regression, classification, clustering and forecasting) in industry-relevant case studies.
  • Gain knowledge of deep learning, text mining and cloud computing.
  • Get introduced to Big Data and use Apache Spark for machine learning.

 

Audience:

  • Professionals seeking advancement in their Data Science career
  • Experienced Project Managers looking to update their skills
  • Fresh graduates seeking a career in Data Science

 

Prerequisites:

  • None

 

Course Timings:

  • Weekdays – Daily one hour

 

Course Duration:

  • 3 months

 

Course Outline

 

  1. Introduction to Data Science
  • CRISP-DM Framework
  • Technology stack for Data Science

 

  2. RDBMS (Oracle) with SQL
  • SQL Introduction (DDL, DML)
  • Joins
  • Views, Triggers, and Procedures
  • Advanced SQL for Analytics

  3. Python Programming
  • Variables and data types
  • Standard I/O
  • Operators
  • Control flow (if else, for, while, break and continue)
  • Data Structures (Lists, Tuples, Sets, Dictionaries, and Strings)
  • Functions (recursive, lambda functions, map, filter and reduce)
  • Modules and Packages
  • Working with Python Libraries (os, datetime, sys)
  • Exception Handling
  • Object-Oriented Programming (Classes, Objects, OOP concepts)
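
As a small illustration of the functions topic listed above (recursion, lambda functions, map, filter and reduce), here is a minimal sketch using only the Python standard library. The names `factorial` and `numbers` are illustrative and are not taken from the course material.

```python
from functools import reduce


def factorial(n: int) -> int:
    """Recursive factorial of a non-negative integer."""
    return 1 if n <= 1 else n * factorial(n - 1)


numbers = [1, 2, 3, 4, 5, 6]

# map applies a function to every element.
squares = list(map(lambda x: x * x, numbers))

# filter keeps only the elements for which the function returns True.
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce folds the list into a single value, starting from 0.
total = reduce(lambda acc, x: acc + x, numbers, 0)

print(factorial(5), squares, evens, total)
```

In everyday Python, list comprehensions and the built-in `sum` often replace `map`, `filter` and `reduce`; the functional forms are shown here because the outline names them.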

 

  4. Exploratory Data Analysis
  • Basic statistics
  • Hypothesis testing
  • Data distributions (Central Limit Theorem)
  • Introduction to visualization
  • Plotting with Matplotlib and Seaborn
  • Introduction to Tableau for Reporting
  • Percentiles and Quartiles
  • IQR, box plots, and whiskers
  • Bar Charts, Pie Charts, Line and Pair charts
  • Univariate, bivariate and multivariate analysis
  • EDA case study
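
The quartile, IQR and box-plot topics above can be sketched in a few lines. This is a minimal example, assuming the common 1.5 x IQR rule for whisker fences and the "inclusive" quantile method from the standard library (Python 3.8 or later); the function name `box_plot_summary` and the sample data are made up for illustration.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, IQR, whisker fences and outliers for a list of numbers."""
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


sample = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]
summary = box_plot_summary(sample)
print(summary)
```

Other quantile conventions exist (for example the "exclusive" method, or the defaults in NumPy and pandas), so quartile values for small samples can differ slightly between tools.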

 

  5. Python for Data Science
  • Introduction to NumPy and operations on NumPy
  • Getting started with pandas and operations on pandas
  • Sampling techniques
  • Data Preprocessing with pandas (Excel, CSV, and PDF)
  • Missing value analysis (NULL value treatment)
  • Data Normalization and standardization
  • Outlier analysis and treatment
  • Web scraping using BeautifulSoup, word clouds
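
To make the difference between normalization and standardization in the list above concrete, here is a minimal sketch in plain Python. It assumes min-max scaling for normalization and z-scores with the population standard deviation for standardization; the function names are illustrative, and in practice the same operations are usually done with NumPy, pandas or scikit-learn.

```python
import statistics


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant column")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / std for v in values]


ages = [10, 20, 30]
normalized = min_max_normalize(ages)
z_scores = standardize(ages)
print(normalized, z_scores)
```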

 

  6. Machine Learning with Python
  a) Linear Regression:
  • Algebra for regression
  • Assumptions of linear regression
  • Multiple regression
  • Feature selection (VIF and p-values)
  • Model building
  • Parameter tuning for regression
  • Model validation (Accuracy, Variance, R-squared)
  • Bias-variance tradeoff
  • Case study on regression

 

  b) Logistic Regression:
  • Logistic regression intuition
  • Sigmoid function, mathematics behind logistic regression
  • Feature engineering and collinearity
  • Regularization (L1 and L2) and parameter tuning
  • Case study on logistic regression
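
The sigmoid function named above maps any real number to a value between 0 and 1, which logistic regression interprets as a probability. Below is a minimal sketch; `predict_proba` and its arguments are hypothetical names for illustration, and the weights are not fitted to any data.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form to avoid overflow in exp(-z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


print(sigmoid(0.0))
print(predict_proba([0.8, -0.4], 0.1, [1.0, 2.0]))
```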

 

  c) Decision Trees:
  • Decision trees introduction
  • Homogeneity, Gini index, and information gain
  • Building decision trees and parameter tuning
  • Truncating and pruning trees
  • Random forest (ensembles)
  • OOB (out-of-bag) error
  • Cross-validation, bagging and boosting (XGBoost, AdaBoost and GBM)
  • Case study on decision tree and random forest
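
The Gini index and information gain listed above are the impurity measures a decision tree uses to choose a split. The following is a minimal sketch for a binary split using the standard definitions; the function names and the toy labels are illustrative only.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
print(gini(parent), entropy(parent))
print(information_gain(parent, ["yes", "yes"], ["no", "no"]))
```

A perfectly mixed two-class node has the highest impurity, and a split that separates the classes completely recovers all of the parent's entropy as information gain.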

 

  d) K-nearest neighbors for classification
  e) Model deployment with PMML, H5, and pickle
  f) Clustering:
  • K-means clustering
  • Hierarchical clustering
  • Principal Component Analysis (PCA)
  • Visualizing clusters
  • Case study: PCA and clustering
  g) SVM (support vector machines)
  h) Market basket analysis
  i) Working with time series data

 

  7. Introduction to Deep Learning
  • Mathematics behind deep learning
  • Forward and backward propagation
  • TensorFlow and Keras frameworks
  • Activation functions (sigmoid, ReLU)
  • Hyperparameter tuning for neural networks
  • CNN and RNN
  • Case study: Neural networks

 

  8. Hadoop for Data Science
  • Introduction to distributed computing
  • MapReduce framework
  • Sqoop
  • Hive

 

  9. Spark for Data Science
  • Introduction to in-memory computing
  • Working with Spark DataFrames
  • PySpark and Spark SQL
  • Spark MLlib for Machine Learning

 

Key Features

  • Career-oriented training.
  • One-to-one live interaction with a trainer.
  • End-to-end explanation of a demo project.
  • Interview guidance with resume preparation.
  • Trainer support by email.
