Optimising MiniGPT with Supervised Learning and RL Constraints
Created using ChatSlide
MiniGPT is a project that tackles sequence reversal tasks using PyTorch. The architecture is a simple yet effective model that is first pre-trained to learn the task and then fine-tuned with reinforcement learning. Initial results highlight the model's efficiency when pre-training is combined with reinforcement learning techniques, which helps overcome some of the challenges associated with synthetic datasets. Continued research aims to refine these methods further, addressing dataset...
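As a rough illustration of the task described above, the following sketch generates synthetic sequence reversal pairs and scores a predicted sequence against the reversed input. It uses only the Python standard library so it runs without PyTorch. The function names (make_reversal_example, make_dataset, reversal_reward), the default sizes, and the per-position reward are assumptions made for illustration; the summary does not say how the project builds its dataset or defines its reinforcement learning reward.

```python
import random


def make_reversal_example(length, vocab_size, rng):
    """Return (source, target), where target is source reversed."""
    source = [rng.randrange(vocab_size) for _ in range(length)]
    return source, source[::-1]


def make_dataset(n_examples, length=8, vocab_size=10, seed=0):
    """Build a small synthetic dataset of reversal pairs (hypothetical sizes)."""
    rng = random.Random(seed)
    return [make_reversal_example(length, vocab_size, rng) for _ in range(n_examples)]


def reversal_reward(source, prediction):
    """Assumed reward: fraction of positions matching the reversed source.

    Dividing by the longer of the two lengths penalises predictions
    that are too short or too long.
    """
    target = source[::-1]
    denom = max(len(target), len(prediction))
    if denom == 0:
        return 0.0
    matches = sum(p == t for p, t in zip(prediction, target))
    return matches / denom


if __name__ == "__main__":
    for source, target in make_dataset(3, length=5, vocab_size=4, seed=1):
        print(source, "->", target, "reward:", reversal_reward(source, target))
```

In a two-stage setup like the one outlined, pairs such as these could serve as supervised targets during pre-training, and a scalar score like reversal_reward could act as the signal during reinforcement learning fine-tuning. Both uses are sketched here under the stated assumptions rather than taken from the project itself.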