Learning outcomes:
- Build a strong understanding of programming using Python
- Learn to analyze data using Power BI
- Build a strong understanding of data wrangling and machine learning
- Learn to build machine learning models using scikit-learn
Python Programming
1. Introduction to Python
- Useful Python Resources
- Python Tools and Utilities
- Python Features
2. Python Environment
- Local Environment Setup
- Downloads and Installations
- Setting up Environment Path
3. Executing Python
- Interactive Mode
- Scripting Mode
- Integrated Development Environment
4. Python Basic Syntax
- Python Identifiers
- Reserved Words
- Lines and Indentation
5. Python Variable Types
- Assigning Values to Variables
- Multiple Assignment
- Standard Data Types
- Data Type Conversion
6. Python Basic Operators
- Arithmetic Operators
- Comparison Operators
- Assignment Operators
- Bitwise Operators
- Logical Operators
- Membership Operators
- Identity Operators
- Operator Precedence
7. Python Decision Making
- IF statements
- IF...ELIF...ELSE Statements
- Nested IF statements
8. Python Loops
- While loop
- For loop
- Nested loop
- Break control statement
- Continue statement
- Pass statement
9. Python Numbers
- Number type conversion
- Mathematical functions
- Random number functions
- Trigonometric functions
10. Python Strings
- String special operators
- String formatting operator
- Built-in string methods
11. Python Lists
- Basic list operations
- Indexing and slicing
- Built-in functions and methods
12. Python Tuples
- Basic tuple operations
- Indexing and slicing
- Built-in functions
13. Python Dictionary
- Basic Dictionary operations
- Built-in Functions and Methods
- Use cases
14. Python Functions
- Pass by reference and value
- Function Arguments
- Scope of variables
- Default Argument Values
- Keyword Arguments
- Arbitrary Argument Lists
- Unpacking Argument Lists
- Lambda Expressions
- Documentation Strings
15. Python Modules
- Importing Modules
- Namespaces and scoping
- Packages
16. Python Files I/O
- Writing and Parsing Text Files
- Parsing Text Using Regular Expressions
- Writing and Parsing XML Files
- Writing and Parsing JSON Files
- Writing and Parsing CSV Files
17. Python Exceptions
- The except clause with multiple exceptions
- The try-finally clause
- Argument of an Exception
- Raising an exception
- User-Defined Exceptions
18. Python Classes and Objects
- Creating Classes
- Creating instance objects
- Destroying Objects (Garbage Collection)
- Custom Classes
- Attributes and Methods
- Inheritance and Polymorphism
- Using Properties to Control Attribute Access
19. Functional Programming
- Lambda
- Filter
- Map
- Functools
20. Iterators and Generators
- Itertools
- Generators
- Decorators
21. Collections
- Deque
- Counter
- OrderedDict
- ChainMap
23. Debugging, Testing
24. Regular Expressions
- Characters and Character Classes
- Quantifiers
- Grouping and Capturing
- Assertions and Flags
- The Regular Expression Module
25. Deploying Python Applications
- Pip
- Virtualenv
- The __init__.py files
- The setup.py file
- Installing the package
- Software deployment in Python
Data Analysis
1. Data Quality
- Introduction to Data Quality
- Handling different Data Quality Issues
2. Phases of Data Analysis
- Understanding different phases of a typical Data Analytics Project
3. Understanding of Data
- Intro to types of data
- Derived Facts/Dimensions
- Building dimensions from Facts (Binning)
- Granularity of Data
4. Understanding Data Operations
- Select and Filter
- Simple vs Complex
- Sort
- Group and Aggregate
- Merge
- Pivot
- Unpivot
- Windowing
5. Data Modeling
- Understanding: Unique Keys, Key References, Cardinality, ER Diagram
- Introduction to Data Quality
- The Six Dimensions of Data Quality
6. Excel Refresher
- Frequently used Excel Functions
- Useful Shortcuts for Faster Excel Analysis
- Tables in Excel
- Data Formatting in Excel
- Visualization with Excel
7. Power Query Essentials
- Data Ingestion in Power Query
- Data Quality Checks
- Text Processing
- Data Transformations in Power Query
8. Power BI Essentials
- Overview of Power BI Tools
- Handling Data Types and Formats
- Handling Special Data Category
- Creating Hierarchical Dimensions
- KPI Cards
- Bar Charts / Column Charts
- Filters (Simple vs Complex)
- Slicers
- Formatting & Aesthetics
- Publishing and Sharing your Dashboard
- Exploring different Chart Options
- Understanding Important Terms in a Given Visual
- Pivot/Matrix Tables
- Creating Drilldown Reports
- Introduction to DAX
- Commonly used DAX Functions
- Applications of DAX Concepts
- Exploring different types of visuals
- Publishing Modified Dashboard
9. Probability Theory
- Types of Events
- Idea of a Random Event
- Understanding via Example Datasets
- Discrete vs Continuous Random Variables
- Nominal, Ordinal, Ratio/Interval Data
- Basic Probability Theory
- Idea of MECE events
- Idea of Conditional events / Independent Events
- Idea of Bayes' Theorem
10. Descriptive Statistics
- Different Types of Distributions
- Understanding the Normal distribution
- Parameters defining a Normal distribution
- What is a standard normal distribution?
- The Central limit theorem
- The techniques of data summarization in Statistics
- Measures of central tendency for univariate data
- Mean, Median, Mode, Variance, Covariance, Standard Deviation, etc.
- Skewness & Kurtosis of a distribution
- Meaning of left-skewed and right-skewed data
11. Visualizing univariate data
- Histograms, Box-and-whisker plots, Violin plots, Frequency distributions
- Bi-variate analysis
- Visualizing bi-variate data
12. Inferential Statistics
- Sampling - Why & How
- Understanding confidence interval and p-value
- Null & Alternate Hypothesis
- Tests of Significance
- ANOVA
- Chi-Square Test
- Bayes' Theorem
- Decision Tree - Why & How in Excel
- Multi-variate Analysis
- Applying Concepts of Stats in Regression analysis
- One-tailed vs Two-tailed tests
- Understanding R-Squared
- A/B Testing
Data Wrangling
1. Black Box Introduction to Machine Learning
- What is not Machine Learning
- What is Machine Learning
- Types of ML - Supervised, Unsupervised
- Supervised - Classification, Regression
- Unsupervised - Clustering, Association
- Machine Learning Pipeline
2. Essential NumPy
- Introduction to NumPy
- Creation
- Access
- Stacking and Splitting
- Methods
- Broadcasting
3. Pandas for Machine Learning
- Introduction to Pandas
- Understanding Series & DataFrames
- Loading CSV, JSON
- Connecting to databases
- Descriptive Statistics
- Accessing subsets of data - Rows, Columns, Filters
- Handling Missing Data
- Dropping rows & columns
- Handling Duplicates
- Function Application - map, apply, groupby, rolling, str
- Merge, Join & Concatenate
- Stacking, Unstacking & Melting
- Pivot-tables
- Normalizing JSON
- Application - EDA on Employee data, sales data
4. Understanding Visualization
- Introduction to matplotlib & seaborn
- Basic Plotting
- Title, Labels, Legends, Grid, colormap, xticks, yticks
- Color, linewidth
- Sub Plotting
- Scatter plot
- Histogram
- Bar Graphs
- Plotting distributions
- Plotting 3D data
- Fundamentals of Tableau
Machine Learning
1. Linear Models for Classification & Regression
- Simple Linear Regression using Ordinary Least Squares
- Gradient Descent Algorithm
- Regularized Regression Methods - Ridge, Lasso, Elastic Net
- Logistic Regression for Classification
- Online Learning Methods - Stochastic Gradient Descent & Passive Aggressive
- Robust Regression - Dealing with outliers & Model errors
- Polynomial Regression
- Bias-Variance Tradeoff
- Application - House Price, Cancer Prediction, Insurance Prediction
2. Preprocessing for Machine Learning
- Introduction to Preprocessing
- StandardScaler
- MinMaxScaler
- RobustScaler
- Normalization
- Binarization
- Encoding Categorical (Ordinal & Nominal) Features
- Imputation
- Polynomial Features
- Custom Transformer
- Text Processing
- CountVectorizer
- TF-IDF
- HashingVectorizer
- Image Processing using skimage
3. Decision Trees
- Introduction to Decision Trees
- The Decision Tree Algorithms
- Decision Tree for Classification
- Decision Tree for Regression
- Advantages & Limitations of Decision Trees
- Application - Cloth Prediction
4. Naive Bayes
- Introduction to Bayes' Theorem
- Naive Bayes Classifier
- Gaussian Naive Bayes
- Multinomial Naive Bayes
- Bernoulli Naive Bayes
- Naive Bayes for out-of-core learning
- Application - Text Classification, Sentiment Analysis and Spam & Non-spam classification
5. Composite Estimators using Pipelines & FeatureUnions
- Introduction to Composite Estimators
- Pipelines
- Transformed Target Regressor
- FeatureUnions
- ColumnTransformer
- GridSearch on pipeline
- Application - Author classification
6. Model Selection & Evaluation
- Cross Validation
- Hyperparameter Tuning
- Model Evaluation
- Model Persistence
- Validation Curves
- Learning Curves
7. Feature Selection & Dimensionality Reduction
- Introduction to Feature Selection
- Variance Threshold
- Chi-squared stats
- ANOVA using f_classif
- Univariate Linear Regression Tests using f_regression
- F-score vs Mutual Information
- Mutual Information for discrete values
- Mutual Information for continuous values
- SelectKBest
- SelectPercentile
- SelectFromModel
- Recursive Feature Elimination
- PCA
- SVD
- Application - Credit Risk Prediction
8. Nearest Neighbors
- Fundamentals of Nearest Neighbor Algorithm
- Unsupervised Nearest Neighbors
- Nearest Neighbors for Classification
- Nearest Neighbors for Regression
- Nearest Centroid Classifier
- Application - Nearest neighbors for face inpainting
9. Clustering Techniques
- Introduction to Unsupervised Learning
- Clustering
- Similarity or Distance Calculation
- Clustering as an Optimization Function
- Types of Clustering Methods
- Partitioning Clustering - KMeans & Meanshift
- Hierarchical Clustering - Agglomerative
- Density Based Clustering - DBSCAN
- Measuring Performance of Clusters
- Comparing all clustering methods
- Application - Grouping similar customers
10. Anomaly Detection
- What are Outliers?
- Statistical Methods for Univariate Data
- Using Gaussian Mixture Models
- Fitting an elliptic envelope
- Isolation Forest
- Local Outlier Factor
- Using clustering methods like DBSCAN
- Application - Anomaly detection for credit risk prediction
11. Support Vector Machines
- Introduction to Support Vector Machines
- Maximal Margin Classifier
- Soft Margin Classifier
- SVM Algorithm for Classification
- SVM for Regression
- Hyper-parameters in SVM
- Application - Face recognition and breast cancer classification
12. Dealing with Imbalanced Classes
- What are imbalanced classes & their impact?
- OverSampling
- UnderSampling
- Connecting Sampler to pipelines
- Making classification algorithms aware of imbalance
- Anomaly Detection
- Application - Fraud detection
13. Ensemble Methods
- Introduction to Ensemble Methods
- RandomForest
- AdaBoost
- Gradient Boosting Tree
- VotingClassifier
- XGBoost
- Application - Malicious data detection
14. Recommendation Engine
- Understanding distance vector calculation - cosine, euclidean, manhattan
- Types of Recommendation Engines
- Recommendation based on similarity
- Application - Grouping videos based on description, user rating prediction
15. Time Series Modeling
- Simple Average & Moving Average
- Single Exponential Smoothing
- Holt’s linear trend method
- Holt-Winters seasonal method
- ARIMA
16. Packaging & Deployment
- Creating Python Package
- Deploy trained model behind REST interface
- Deploy model behind API call
- Deploy on AWS cloud (optional)
Mindset for Problem Solving
1. Mathematical Aptitude
- Percentages
- Profit and Loss
- Simple Interest and Compound Interest
- Work and Time
- Probability
- Permutation and Combination
- Time & Speed
- Ratios and Proportions
- Data Interpretation
2. Art of Learning Anything
- What is Intelligence
- Relation of success with intelligence
- Illusion of Learning
- Focused Mode vs Diffuse Mode
- Procrastination
- Improving Recall
- Creating Brain Links
- Visual memory & Data Memory
- Slow Thinking
3. Computational Thinking
- Thinking before Doing/Coding
- Problem Identification
- Decomposition
- Pattern Recognition
- Abstraction
- Algorithm Design
- Computational Thinking Use Case 1
- Computational Thinking Use Case 2
4. Technical Puzzles
- Why are Puzzles part of interviews?
- The Art of solving puzzles
- Approach more important than the solution
- Puzzles for Vertical Thinking
- Puzzles for Horizontal Thinking
Productivity and Decision Making
1. Art of being Super Productive
- Start with Why to make objectives clear
- Thinking Limitless
- The magic of compounding returns
- Deciding what to work on
- Time Management Skills
- Measuring what matters
- Wisely choosing which habits to inculcate
2. Effective Decision Making
- Why is decision making a key skill?
- Components of Decision Making
- Understanding common biases
- Not letting emotions clutter decision making
- Difference between quick decision making & slow decision making
Professional Communication
1. Reading comprehension & Short writing
- Building vocabulary
- Extracting insights from the textual information
- Drawing inferences from multiple stories
- Writing your inferences for others to understand
2. Book Reading & Writing Reviews
- Reading 10 books during the entire course & writing book reviews
- 2 Biographies
- 2 Fictions
- 6 Non-Fictions
3. Effective Understanding & Articulation
- Watching 20 movies from our suggested list
- Writing a 1000-word essay on those movies
- Writing a summary of the movies
4. Group Discussion for decision making
- Understanding why GD is so important in personal & professional life
- The objective of GD - Collectively making the right decision
- 5 GDs on various topics
5. Writing Professional chat/E-mail
- Writing as the most common method of professional communication
- Factors to keep in mind before starting to write
- Points to consider while writing
- Activities after writing
- Difference between chat writing & email writing
6. Making Impressive Presentations
- Why making a presentation is a professional job
- The objective of the presentation
- Attributes of a good presentation
- Why research is key to the presentation
- Making a presentation interactive
- Doing 10 video/live presentations
Computer Fundamentals
1. Operating System Concepts
- Operating System Architecture
- Processes and Process Management
- Threads and Concurrency control
- Scheduling
- Memory Management
- Inter-Process Communication
- Synchronization Constructs
- I/O Management
- Resource Virtualization
- Remote Services
- Distributed Systems
- Introduction to Data Center Technologies
2. Linux Administration
- Introduction to Linux Operating Systems
- Basic Linux Commands
- File Management and Security
- The directory structure of Unix
- User Management
- Groups
- Shell types and basic commands
- Permissions
- sudo
- Systemd Services Start and Stop
- Resource Management with systemctl
- Process Management (top, ps)
- Package Management (yum, apt, rpm)
- Managing disks (lsblk, df, mount, umount, du)
- File systems
3. Data Structures and Algorithms
- Built-in Data Type
- Integers
- Boolean
- Floating
- Character and Strings
- Derived Data Type
- Linked List
- Singly Linked List
- Doubly Linked List
- Circular Linked List
- Array
- Stack
- Queue
- Tree
- Basic Operations
- Traversing
- Searching
- Sorting
- Hashing
- Insertion
- Deletion
- Merging
- Searching techniques
- Binary search
- Linear search
- Recursion
- Fibonacci series
- Sorting Algorithm
- Bubble sort
- Insertion sort
- Selection sort
- Quick sort
- Merge sort
- Bucket sort
4. Database concepts
- Introduction to Databases
- Entity Relationship Model
- Relational Model
- Relational Algebra
- Normalization
- Transactions and Concurrency Control
- DBMS Architecture: 2-level, 3-level
- Data Abstraction and Data Independence
- Database Objects
- Entity-Relationship Model
- Generalization
- Specialization
- Aggregation
- Entity Relationship Diagrams
- Keys in Relational Model
- Candidate key
- Super key
- Primary key
- Alternate key
- Foreign key
- Strategies for Schema design
- Schema Integration
- Data Modeling
- Star Schema in Data Warehouse Modeling
- Data Warehouse Modeling
5. Basic SQL - Syntax
- Data Types
- Operators
- Expressions
- Create Database
- Drop Database
- Select Queries
- Create Table
- Drop Table
- Other Table Operations
- Insert Query
- Where Clause
- AND & OR Clauses
- Update operations
- Delete operations
- Order By clause
- Group By Clause
- Sorting operations
- SQL Constraints
- Types of Joins
- Union Clause
- NULL Values
- Indexing
- Views
6. Software Engineering
- Software Engineering Overview
- Features of Good Software:
- Operational Features
- Transitional Features
- Maintenance Features
- Software Development:
- Requirement Gathering
- Software Design
- Programming
- Software Design:
- Design
- Maintenance
- Programming
- Programming:
- Coding
- Testing
- Integration
- Software Development Life Cycle
- Requirement Gathering
- System Analysis
- Software Design
- Coding
- Testing
- Integration
- Deployment
- Operation and Maintenance
- Types of SDLC
- Waterfall model
- Iterative Model
- Spiral model
- V Model
- Agile Concepts
- DevOps Concepts
- Microservices Architecture
- Features of Microservices Architecture
- Software Requirements
- Software Design Basics
- Analysis & Design Tools
- Data Flow Diagram
- Flow Chart
- Design Strategies
- Function-Oriented Design
- Object-Oriented Design
- User Interface Design
- Command Line Interface (CLI)
- Graphical User Interface (GUI)
- Design Complexity
- Software Testing Overview
- Manual vs Automated Testing
- Testing Approaches
- Black-box testing
- White-box testing
- Unit Testing
- Integration Testing
- Functionality testing
- Acceptance Testing
- Regression Testing
- Quality Control
- Deployment Methods
- Blue-Green Deployment
- Rolling Deployment
- Software Monitoring
- Software Maintenance
7. Tools
- Git
- What is Git?
- Installing Git
- First-Time Git Setup
- Git Basics
- Getting a Git Repository
- Recording Changes to the Repository
- Viewing the Commit History
- Undoing Things
- Working with Remotes
- Tagging
- Git Branching
- Basic Branching and Merging
- Branch Management
- Branching Workflows
- Remote Branches
- Rebasing
- PuTTY
- Installation
- Types of connections
- Connecting to a remote server
- Using Auth keys
- Customizing PuTTY
- Vim
- Vim Basics
- Insert Mode
- Visual Mode
- Command Mode
- Create and Edit a file
- Search and replace in Vim
- Vim diff
- Copy operations
- vimrc file
- Vim Commands