Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown 2 Dimensional Grid Environments

This Jupyter notebook provides a platform for evaluating the performance and reliability of the Q-learning algorithm in various 2-dimensional grid environments using the Gymnasium library (the successor to OpenAI Gym). The related paper was published in the Proceedings of the 11th RSI International Conference on Robotics and Mechatronics (ICRoM 2023), held in Tehran, Iran, on December 19-21, 2023. The paper and the conference presentation slides are included in this repository.

In addition to Q-learning, the notebook includes a from-scratch implementation of the value iteration algorithm. This makes the repository a good starting point for beginners who want to experiment with value iteration, Q-learning, and Gymnasium.

Acknowledgements

Make sure to cite the paper by Amirhossein Nourian et al. if you use this code for your research:

Nourian, A., & Sadedel, M. (2023, December). Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown Environments. In 2023 11th RSI International Conference on Robotics and Mechatronics (ICRoM) (pp. 21-28). IEEE.

Description

This repository contains the following items:

The FrozenLake environment from Gymnasium.
An implementation of the value iteration algorithm.
An implementation of the Q-learning algorithm.
Visualization of the Q-learning results.
The published paper related to the code.
The presentation given at the ICRoM conference.
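
As a rough illustration of the value iteration item above, here is a minimal sketch in plain Python. It is not the notebook's implementation: the `step` helper, the discount factor `gamma=0.9`, and the tolerance are assumptions made for this example, and the transitions are deterministic (no slipping). The map is written in FrozenLake notation (S start, F frozen, H hole, G goal), and actions follow Gymnasium's FrozenLake order (left, down, right, up).

```python
# Hypothetical 4x4 map in FrozenLake notation: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_ROWS, N_COLS = len(GRID), len(GRID[0])
N_STATES = N_ROWS * N_COLS
# (d_row, d_col) per action: 0 left, 1 down, 2 right, 3 up.
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Assumed deterministic model: returns (next_state, reward, done)."""
    row, col = divmod(state, N_COLS)
    if GRID[row][col] in "HG":  # holes and the goal are terminal
        return state, 0.0, True
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N_ROWS - 1)  # walls keep the agent in place
    col = min(max(col + d_col, 0), N_COLS - 1)
    tile = GRID[row][col]
    return row * N_COLS + col, float(tile == "G"), tile in "HG"


def q_value(state, action, values, gamma):
    """One-step lookahead: reward plus discounted value of the next state."""
    nxt, reward, done = step(state, action)
    return reward + (0.0 if done else gamma * values[nxt])


def value_iteration(gamma=0.9, tol=1e-8):
    """Sweep all states until the largest value change drops below tol."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            best = max(q_value(s, a, values, gamma) for a in range(4))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy: the best action in each state under the final values.
    policy = [
        max(range(4), key=lambda a: q_value(s, a, values, gamma))
        for s in range(N_STATES)
    ]
    return values, policy


values, policy = value_iteration()
print("value of the start state:", round(values[0], 5))
```

Because this sketch needs the transition model (`step`) up front, it only applies when the map is known; Q-learning, by contrast, learns from sampled interaction alone.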

Setup

To run the notebook you'll need its dependencies (Gymnasium among them), which should all be available through pip.

No additional setup is needed; just clone the repository:

git clone https://github.com/amirhnourian/Open_AI_Frozenlake.git
cd Open_AI_Frozenlake

Usage

The inputs are the FrozenLake maps and the hyperparameters of the Q-learning algorithm. The output is the Q-table obtained after Q-learning training. You can test the resulting policy with the provided function. Finally, the notebook includes code that counts how many correct policies are found in a given set of episodes, sampled with a sampling coefficient.

To use the code, run the cells in order and follow the instructions in the notebook. I recommend reading the related paper first to familiarize yourself with the code and its purpose. If you have any further questions, please let me know.
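
To make the inputs and outputs described in this section concrete, here is a minimal tabular Q-learning sketch in plain Python. It is an illustration under stated assumptions, not the notebook's code: the function names `train` and `greedy_policy_reaches_goal`, the hyperparameter values, and the deterministic `step` helper (a stand-in for a Gymnasium environment's `step`) are all hypothetical. Ties between equal Q-values are broken at random during training so that the agent keeps exploring states it has not learned anything about yet.

```python
import random

# Hypothetical 4x4 map in FrozenLake notation: S start, F frozen, H hole, G goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_ROWS, N_COLS = len(GRID), len(GRID[0])
N_STATES = N_ROWS * N_COLS
# (d_row, d_col) per action: 0 left, 1 down, 2 right, 3 up.
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def step(state, action):
    """Assumed deterministic stand-in for env.step: (next_state, reward, done)."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), N_ROWS - 1)  # walls keep the agent in place
    col = min(max(col + d_col, 0), N_COLS - 1)
    tile = GRID[row][col]
    return row * N_COLS + col, float(tile == "G"), tile in "HG"


def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, max_steps=100, seed=0):
    """Run epsilon-greedy Q-learning and return the Q-table (states x actions)."""
    rng = random.Random(seed)
    q_table = [[0.0] * 4 for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0  # every episode starts on the S tile
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                best = max(q_table[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in range(4) if q_table[state][a] == best])
            nxt, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + (0.0 if done else gamma * max(q_table[nxt]))
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = nxt
            if done:
                break
    return q_table


def greedy_policy_reaches_goal(q_table, max_steps=100):
    """Follow the greedy policy from the start and report whether it hits G."""
    state = 0
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q_table[state][a])
        state, reward, done = step(state, action)
        if done:
            return reward == 1.0
    return False


q_table = train()
print("greedy policy reaches the goal:", greedy_policy_reaches_goal(q_table))
```

Running `train` with different maps or hyperparameters and repeating the greedy test over many runs is one simple way to estimate how reliably a correct policy is found.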

