The Artificial Intelligence: Reinforcement Learning in Python course teaches students to apply reinforcement learning to stock trading and online advertising. Students will also learn about Markov Decision Processes (MDPs), how to calculate sample means and moving averages, and how those calculations relate to stochastic gradient descent.
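That last relationship is worth a quick illustration: the incremental update for a sample mean is exactly one stochastic gradient descent step on the squared error with learning rate 1/n, and swapping in a constant learning rate turns it into an exponentially weighted moving average. Below is a minimal sketch of both updates in plain NumPy (an illustration, not the course’s own code):

```python
import numpy as np

def running_mean(xs):
    # Incremental sample mean: Q <- Q + (1/n) * (x - Q).
    # Each update is one SGD step on the squared error (x - Q)^2
    # with learning rate 1/n.
    q = 0.0
    for n, x in enumerate(xs, start=1):
        q += (x - q) / n
    return q

def moving_average(xs, alpha=0.1):
    # The same update with a constant learning rate gives an
    # exponentially weighted moving average, which tracks
    # nonstationary data better than the plain mean.
    q = 0.0
    for x in xs:
        q += alpha * (x - q)
    return q

xs = np.random.randn(10_000) + 3.0
print(running_mean(xs))    # ~3.0, matches xs.mean()
print(moving_average(xs))  # ~3.0, but weights recent samples more heavily
```

The bandit section of the course (see the course content below) builds directly on this update in its “Calculating a Sample Mean” and “Nonstationary Bandits” lectures.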
The course also shows students how to use OpenAI Gym with zero code changes. The course is usually priced at INR 2,999 on Udemy, but you can currently get Artificial Intelligence: Reinforcement Learning in Python at up to 85% off, i.e. INR 455.
Who all can opt for this course?
- Students who want to learn about artificial intelligence, data science, machine learning, and deep learning
- Professionals who want to gain expertise in artificial intelligence
Course Highlights
Key Highlights | Details |
---|---|
Registration Link | Apply Now! |
Price | INR 2,999 (INR 455 after discount) |
Duration | 14.5 Hours |
Rating | 4.8/5 |
Student Enrollment | 43,633 students |
Instructor | Lazy Programmer Team (https://www.linkedin.com/in/lazyprogrammerteam) |
Topics Covered | Python programming, Reinforcement learning, Markov Decision process, Dynamic programming |
Course Level | Intermediate |
Total Student Reviews | 9,610 |
Learning Outcomes
- Employ reinforcement learning techniques that build on gradient-based supervised machine learning
- Understand reinforcement learning at a technical level
- Recognize the connection between psychology and reinforcement learning
- Implement 17 different reinforcement learning algorithms
- Apply gradient-based machine learning methods to reinforcement learning (see the sketch after this list)
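On that last outcome: one standard bridge from supervised learning to RL is semi-gradient TD(0) prediction with a linear value function, where the bootstrapped target r + γV(s′) is treated as a fixed label for a single SGD step. The sketch below only illustrates that idea; the environment interface and one-hot feature map are invented here, not taken from the course:

```python
import numpy as np

def semi_gradient_td0(env_step, phi, n_features, episodes=500,
                      alpha=0.01, gamma=0.9):
    # Semi-gradient TD(0) prediction with linear V(s) = w . phi(s).
    # Each update treats the bootstrapped target r + gamma*V(s') as a
    # fixed label and takes one SGD step on the squared error.
    w = np.zeros(n_features)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            s2, r, done = env_step(s)        # invented env interface
            v_s = w @ phi(s)
            v_s2 = 0.0 if done else w @ phi(s2)
            delta = r + gamma * v_s2 - v_s   # TD error
            w += alpha * delta * phi(s)      # grad of V(s) w.r.t. w is phi(s)
            s = s2
    return w

# Toy deterministic walk: states 0..5, reward 1 on reaching terminal state 5.
def env_step(s):
    s2 = s + 1
    return s2, float(s2 == 5), s2 == 5

def phi(s):
    x = np.zeros(6)
    x[s] = 1.0                               # one-hot features = tabular case
    return x

w = semi_gradient_td0(env_step, phi, n_features=6)
print(w.round(2))  # approaches [0.66, 0.73, 0.81, 0.9, 1.0, 0.0] for gamma = 0.9
```

With one-hot features this reduces to the tabular TD(0) covered in the Temporal Difference Learning module; richer feature maps lead into the Approximation Methods module (Section 8 below).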
Course Content
S.No. | Module (Duration) | Topics |
---|---|---|
1. | Welcome (40 minutes) | Introduction |
| | Course Outline and Big Picture |
| | Where to get the Code |
| | How to Succeed in this Course |
| | Warmup |
2. | Return of the Multi-Armed Bandit (02 hours 56 minutes) | Section Introduction: The Explore-Exploit Dilemma |
| | Applications of the Explore-Exploit Dilemma |
| | Epsilon-Greedy Theory |
| | Calculating a Sample Mean (pt 1) |
| | Epsilon-Greedy Beginner’s Exercise Prompt |
| | Designing Your Bandit Program |
| | Epsilon-Greedy in Code |
| | Comparing Different Epsilons |
| | Optimistic Initial Values Theory |
| | Optimistic Initial Values Beginner’s Exercise Prompt |
| | Optimistic Initial Values Code |
| | UCB1 Theory |
| | UCB1 Beginner’s Exercise Prompt |
| | UCB1 Code |
| | Bayesian Bandits / Thompson Sampling Theory (pt 1) |
| | Bayesian Bandits / Thompson Sampling Theory (pt 2) |
| | Thompson Sampling Beginner’s Exercise Prompt |
| | Thompson Sampling Code |
| | Thompson Sampling With Gaussian Reward Theory |
| | Thompson Sampling With Gaussian Reward Code |
| | Exercise on Gaussian Rewards |
| | Why don’t we just use a library? |
| | Nonstationary Bandits |
| | Bandit Summary, Real Data, and Online Learning |
| | (Optional) Alternative Bandit Designs |
| | Suggestion Box |
3. | High Level Overview of Reinforcement Learning (16 minutes) | What is Reinforcement Learning? |
| | From Bandits to Full Reinforcement Learning |
4. | Markov Decision Processes (01 hour 59 minutes) | MDP Section Introduction |
| | Gridworld |
| | Choosing Rewards |
| | The Markov Property |
| | Markov Decision Processes (MDPs) |
| | Future Rewards |
| | Value Functions |
| | The Bellman Equation (pt 1) |
| | The Bellman Equation (pt 2) |
| | The Bellman Equation (pt 3) |
| | Bellman Examples |
| | Optimal Policy and Optimal Value Function (pt 1) |
| | Optimal Policy and Optimal Value Function (pt 2) |
| | MDP Summary |
5. | Dynamic Programming (02 hours 04 minutes) | Dynamic Programming Section Introduction |
| | Iterative Policy Evaluation |
| | Designing Your RL Program |
| | Gridworld in Code |
| | Iterative Policy Evaluation in Code |
| | Windy Gridworld in Code |
| | Iterative Policy Evaluation for Windy Gridworld in Code |
| | Policy Improvement |
| | Policy Iteration |
| | Policy Iteration in Code |
| | Policy Iteration in Windy Gridworld |
| | Value Iteration |
| | Value Iteration in Code |
| | Dynamic Programming Summary |
6. | Monte Carlo (58 minutes) | Monte Carlo Intro |
| | Monte Carlo Policy Evaluation |
| | Monte Carlo Policy Evaluation in Code |
| | Monte Carlo Control |
| | Monte Carlo Control in Code |
| | Monte Carlo Control without Exploring Starts |
| | Monte Carlo Control without Exploring Starts in Code |
| | Monte Carlo Summary |
7. | Temporal Difference Learning (37 minutes) | Temporal Difference Introduction |
| | TD(0) Prediction |
| | TD(0) Prediction in Code |
| | SARSA |
| | SARSA in Code |
| | Q Learning |
| | Q Learning in Code |
| | TD Learning Section Summary |
8. | Approximation Methods (01 hour 13 minutes) | Approximation Methods Section Introduction |
| | Linear Models for Reinforcement Learning |
| | Feature Engineering |
| | Approximation Methods for Prediction |
| | Approximation Methods for Prediction Code |
| | Approximation Methods for Control |
| | Approximation Methods for Control Code |
| | CartPole |
| | CartPole Code |
| | Approximation Methods Exercise |
| | Approximation Methods Section Summary |
9. | Interlude: Common Beginner Questions (07 minutes) | This Course vs. RL Book: What’s the Difference? |
10. | Stock Trading Project with Reinforcement Learning (01 hour 21 minutes) | Beginners, halt! Stop here if you skipped ahead |
| | Stock Trading Project Section Introduction |
| | Data and Environment |
| | How to Model Q for Q-Learning |
| | Design of the Program |
| | Code pt 1 |
| | Code pt 2 |
| | Code pt 3 |
| | Code pt 4 |
| | Stock Trading Project Discussion |
11. | Setting Up Your Environment (FAQ by Student Request) (37 minutes) | Anaconda Environment Setup |
| | How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow |
12. | Extra Help With Python Coding for Beginners (FAQ by Student Request) (42 minutes) | How to Code by Yourself (part 1) |
| | How to Code by Yourself (part 2) |
| | Proof that using Jupyter Notebook is the same as not using it |
| | Python 2 vs Python 3 |
13. | Effective Learning Strategies for Machine Learning (FAQ by Student Request) (59 minutes) | How to Succeed in this Course (Long Version) |
| | Is this for Beginners or Experts? Academic or Practical? Fast or slow-paced? |
| | Machine Learning and AI Prerequisite Roadmap (pt 1) |
| | Machine Learning and AI Prerequisite Roadmap (pt 2) |
14. | Appendix / FAQ Finale (08 minutes) | What is the Appendix? |
| | BONUS |
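The modules above run from bandits through MDPs, dynamic programming, and Monte Carlo to temporal difference methods such as SARSA and Q-learning. As a flavour of where the Q-learning lectures land, here is a minimal tabular sketch on a made-up one-dimensional gridworld; the environment, constants, and optimistic initialization are illustrative choices, not the course’s Gridworld code:

```python
import numpy as np

# Made-up 1-D gridworld: states 0..4, actions 0 = left, 1 = right;
# reaching state 4 ends the episode with reward +1.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, float(s2 == GOAL), s2 == GOAL

# Optimistic initial values (also a course topic) encourage early exploration.
Q = np.ones((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

for _ in range(1000):
    s, done = 0, False
    while not done:
        # Epsilon-greedy behaviour policy: the explore-exploit dilemma again.
        a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(Q[s].argmax())
        s2, r, done = step(s, a)
        # Q-learning is off-policy: the target maxes over next-state actions.
        target = r if done else r + gamma * Q[s2].max()
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2

print(Q.round(2))        # learned action values (row 4 is terminal, never updated)
print(Q.argmax(axis=1))  # greedy policy: moves right in every non-terminal state
```

The stock trading project in Section 10 asks the same question at a larger scale: how to model Q for Q-learning when the state space is too big for a table.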
Resources Required
- Understanding of Calculus (derivatives)
- Knowledge of Markov models and probability
- Knowledge of Numpy, Matplotlib
- Beneficial to have experience with at least a few supervised machine learning methods
- Gradient descent
- Good object-oriented programming skills
Featured Review
Luca D’Alessandro (5/5) : One of the best courses I’ve ever taken on the topic. Super useful to learn building basic environments and RL agent dynamics, and to develop intuitions for more sophisticated models. The mathematical rigor comes together with a nice teaching strategy for the coding part.
Pros
- Hermon Alfaro (5/5) : it has the perfect equilibrium between the conceptual explanations and the code.
- Harrison Yoon (5/5) : The provided code is masterfully written and worked without a hitch.
- Donal John (4/5) : This is probably the best course that I have found on the topic of reinforcement learning.
- Marcin Sobociński (5/5) : the explanations given (like in Bellman examples chapter) are just the best I could find anywhere.
Cons
- Javid Jamae (1/5) : Instead of just giving you the problems and solutions and walking you through everything (the way EVERY other Udemy instructor does it) he gives vague descriptions of what you have to do and then says that he expects that you should go figure it out yourself.
- Jonathan Hogg (2/5) : Well, Lazy Programmer will tell you repeatedly throughout all of his courses.
- Niels Pichon (2/5) : The name of the guy is well chosen: Everywhere where he can be lazy he is.
- Con Land (1/5) : To the “Lazy Programmer”, I recommend that you become an “Expert Programmer” before selling courses.
About the Author
The instructor of this course is Lazy Programmer Team, an Artificial Intelligence and Machine Learning Engineer. With a 4.7 instructor rating and 51,767 reviews on Udemy, he offers 17 courses and has taught 188,509 students so far.
- The instructor has also worked as a data scientist, big data engineer, and full-stack software engineer, and currently spends the majority of his time as an artificial intelligence and machine learning engineer with an emphasis on deep learning
- He earned his first master’s degree, in computer engineering with a focus on machine learning and pattern recognition, more than ten years ago
- He was later awarded a second master’s degree in statistics with a focus on financial engineering
- As a data scientist and big data engineer he has worked in online advertising and digital media, optimising click and conversion rates and building data processing pipelines
- He routinely uses big data technologies like Hadoop, Pig, Hive, MapReduce, and Spark
- He has developed deep learning models for text modelling, image and signal processing, user behaviour prediction, and click-through rate estimation
- In his work on recommendation systems he has used collaborative filtering and reinforcement learning, validating the results with A/B testing
- He has taught data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics at universities including Columbia University, NYU, Hunter College, and The New School
- His web programming skills have helped numerous businesses
- He handles all of the server-side backend work, frontend HTML/JS/CSS work, and operations/deployment work
Comparison Table
Parameters | Artificial Intelligence: Reinforcement Learning in Python | Advanced AI: Deep Reinforcement Learning in Python | Deep Learning Prerequisites: Linear Regression in Python |
---|---|---|---|
Offers | INR 455 | INR 455 | INR 455 |
Duration | 14.5 hours | 10.5 hours | 6.5 hours |
Rating | 4.8 /5 | 4.6 /5 | 4.6 /5 |
Student Enrollments | 43,633 | 36,717 | 31,439 |
Instructors | Lazy Programmer Team | Lazy Programmer Team | Lazy Programmer Inc. |
Register Here | Apply Now! | Apply Now! | Apply Now! |