reinforcement learning prompts

very few results