This course has two highlights:

- The course will help you learn the techniques used by industry experts and working professionals.
- The course is delivered in plain English. Most complicated terminology is avoided so that the material is easy to follow.

At the end of the course, you are expected to finish a final project.

The topics covered in this course include:

- Deep Learning / Neural Networks (MLPs, CNNs, RNNs)
- Regression analysis
- K-Means Clustering
- Principal Component Analysis
- Train/Test and cross-validation
- Bayesian Methods
- Decision Trees and Random Forests
- Multivariate Regression
- Multi-Level Models
- Support Vector Machines
- Reinforcement Learning
- Collaborative Filtering
- K-Nearest Neighbor
- Bias/Variance Tradeoff
- Ensemble Learning
- Term Frequency / Inverse Document Frequency (TF/IDF)
- Experimental Design and A/B Tests

Prior coding experience in any language is required for this course, but prior Python experience is not. The course starts with a Python crash course that is easy to follow if you have coded before.

- A desktop computer (Windows, Mac, or Linux) that supports Enthought Canopy 1.6.2 or newer. The installation process is covered during the course.
- Coding or scripting experience is required.
- At least high school level math skills are required.

- Develop using Python notebooks
- Understand statistical measures such as standard deviation
- Visualize data distributions, probability mass functions, and probability density functions
- Visualize data with matplotlib
- Use covariance and correlation metrics
- Apply conditional probability for finding correlated features
- Use Bayes' Theorem to identify false positives
- Make predictions using linear regression, polynomial regression, and multivariate regression
- Understand complex multi-level models
- Use train/test and K-Fold cross-validation to choose the right model
- Build a spam classifier using Naive Bayes
- Use decision trees to predict hiring decisions
- Cluster data using K-Means clustering and classify data using Support Vector Machines (SVM)
- Build a movie recommender system using item-based and user-based collaborative filtering
- Predict classifications using K-Nearest-Neighbor (KNN)
- Apply dimensionality reduction with Principal Component Analysis (PCA) to classify flowers
- Understand reinforcement learning and how to build a Pac-Man bot
- Clean your input data to remove outliers
- Implement machine learning, clustering, and search using TF/IDF at massive scale with Apache Spark's MLlib
- Design and evaluate A/B tests using T-Tests and P-Values

According to Glassdoor, in 2016 data science was the highest-paid field to get into. This follows the basic economics of supply and demand: the demand for data scientists is very high, while the supply is low.

What are some examples of data science?

**Google**. They are the definition of data science. Everything they do is data driven, from their search engine (google.com) to their YouTube efforts to the maximization of ad revenue.

**Amazon**. Each product recommendation that you get comes from Amazon’s sophisticated data science algorithms.

**Facebook**. Facebook generates substantial ad revenue because it holds personal data on all its users. Because you interact with the platform, they know whether you prefer cat videos or dog videos, so they know whether you are a cat person or a dog person.

*“A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”*

On any given day, a data scientist may be required to:

- Conduct undirected research and frame open-ended industry questions
- Extract huge volumes of data from multiple internal and external sources
- Employ sophisticated analytics programs, machine learning and statistical methods to prepare data for use in predictive and prescriptive modeling
- Thoroughly clean and prune data to discard irrelevant information
- Explore and examine data from a variety of angles to determine hidden weaknesses, trends and/or opportunities
- Devise data-driven solutions to the most pressing challenges
- Invent new algorithms to solve problems and build new tools to automate work
- Communicate predictions and findings to management and IT departments through effective data visualizations and reports
- Recommend cost-effective changes to existing procedures and strategies

Every company will have a different take on job tasks. Some treat their data scientists as glorified **Data Analysts** or combine their duties with **Data Engineers**; others need top-level analytics experts skilled in intense machine learning and data visualizations.

*Source: Indeed.com*

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically…

Get your team access to WIISE top 2,000 courses anytime, anywhere.

- Introduction 00:02:41
- Getting What You Need 00:02:36
- Installing Enthought Canopy 00:06:51
- Python Basics, Part 1 00:15:58
- Python Basics, Part 2 00:09:41
- Running Python Scripts 00:03:55
- Introducing the Pandas Library 00:10:14

- Types of Data 00:06:59
- Mean, Median, Mode 00:05:26
- Using mean, median, and mode in Python 00:08:30
- Variation and Standard Deviation 00:11:12
- Probability Density Function; Probability Mass Function 00:03:27
- Common Data Distributions 00:07:45
- Percentiles and Moments 00:12:33
- A Crash Course in matplotlib 00:13:46
- Covariance and Correlation 00:11:31
- Conditional Probability 00:10:16
- Bayes' Theorem 00:05:23

- Linear Regression 00:11:01
- Polynomial Regression 00:08:04
- Multivariate Regression, and Predicting Car Prices 00:09:52
- Multi-Level Models 00:04:36

- Supervised vs. Unsupervised Learning, and Train/Test 00:08:57
- Using Train/Test to Prevent Overfitting a Polynomial Regression 00:05:47
- Bayesian Methods: Concepts 00:03:59
- Implementing a Spam Classifier with Naive Bayes 00:08:05
- K-Means Clustering 00:07:23
- Clustering people based on income and age 00:05:14
- Measuring Entropy 00:03:09
- Decision Trees: Concepts 00:08:43
- Decision Trees: Predicting Hiring Decisions 00:09:47
- Ensemble Learning 00:05:59
- Support Vector Machines (SVM) Overview 00:04:27
- Using SVM to cluster people using scikit-learn 00:05:36

- User-Based Collaborative Filtering 00:07:57
- Item-Based Collaborative Filtering 00:08:15
- Finding Movie Similarities 00:09:08
- Improving the Results of Movie Similarities 00:07:59
- Making Movie Recommendations to People 00:10:22
- Improve the recommender's results 00:05:29

- K-Nearest-Neighbors: Concepts 00:03:44
- Using KNN to predict a rating for a movie 00:12:29
- Dimensionality Reduction; Principal Component Analysis 00:05:44
- PCA Example with the Iris data set 00:09:05
- Data Warehousing Overview: ETL and ELT 00:09:05
- Reinforcement Learning 00:12:44

**Project Description**

Using the Python basics covered in this course, develop the following projects.

**1. Dice Rolling Simulator**

The Goal: As the title suggests, this project involves writing a program that simulates rolling dice. When the program runs, it randomly chooses a number between 1 and 6 (or whatever other range you prefer; the number of sides on the die is up to you) and prints that number. It should then ask whether you’d like to roll again. For this project, you’ll need to set the minimum and maximum number that your die can produce. For a standard die, that means a minimum of 1 and a maximum of 6. You’ll also want a function that randomly picks a number within that range and prints it.

*Concepts to keep in mind:*

- Random
- Integer
- While Loops

This is a good project for beginners that helps establish a solid foundation in basic concepts. If you already have programming experience, the concepts used here will likely be familiar. Python’s print, for example, is similar to JavaScript’s console.log.
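One possible starting point for the dice roller is sketched below. It is only an illustration, not the course's reference solution: the names `MIN_VALUE`, `MAX_VALUE`, `roll_die`, and `play` are made up for this example, and the `input_fn` and `print_fn` parameters are an assumption added so the loop can be tried without typing at a prompt.

```python
import random

MIN_VALUE = 1  # lowest face on the die (illustrative constant)
MAX_VALUE = 6  # highest face; change this for a die with more sides


def roll_die(min_value=MIN_VALUE, max_value=MAX_VALUE):
    """Return a random integer between min_value and max_value, inclusive."""
    return random.randint(min_value, max_value)


def play(input_fn=input, print_fn=print):
    """Roll once, then keep rolling for as long as the user answers 'y'."""
    rolling = True
    while rolling:
        print_fn(f"You rolled a {roll_die()}")
        answer = input_fn("Roll again? (y/n) ").strip().lower()
        rolling = answer == "y"
```

Calling `play()` with no arguments starts an interactive session in the terminal. Here the printing happens in `play` rather than in `roll_die`, which is a design choice that keeps the random draw easy to reuse; printing inside the function, as the project text describes, works just as well.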

**2. Mad Libs Generator**

The Goal: This project is inspired by Summer Son’s Mad Libs project in JavaScript. The program first prompts the user for a series of inputs à la Mad Libs: for example, a singular noun, an adjective, and so on. Once all the information has been entered, the program places those words into a premade story template. You’ll need prompts for user input, and you’ll need to print the full story at the end with the input included.

*Concepts to keep in mind:*

- Strings
- Variables
- Concatenation

A fun beginner project that gets you thinking about how to manipulate user-entered data. Compared to the prior project, this one focuses far more on strings and concatenation. Have fun coming up with some wacky stories!
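A minimal sketch of the Mad Libs idea might look like the following. The story template, the three prompts, and the function names `build_story` and `run` are all invented for illustration; any template and any set of prompts would do. As in the dice example, `input_fn` and `print_fn` are assumed parameters that make the function easy to try out.

```python
def build_story(noun, adjective, verb):
    """Concatenate the user's words into a fixed (made-up) story template."""
    return (
        "Yesterday a " + adjective + " " + noun
        + " was caught " + verb + " in the park."
    )


def run(input_fn=input, print_fn=print):
    """Prompt for each word, then print the completed story."""
    noun = input_fn("Enter a singular noun: ")
    adjective = input_fn("Enter an adjective: ")
    verb = input_fn("Enter a verb ending in -ing: ")
    print_fn(build_story(noun, adjective, verb))
```

Calling `run()` prompts for the three words in turn. The template uses plain `+` concatenation to match the concepts listed above; f-strings or `str.format` are common alternatives once concatenation feels comfortable.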