Creating open-ended learning systems could be the key to developing creativity in artificial intelligence. Large language models can help set priorities, narrowing down which directions are worth exploring. And although some fear that open-ended AI could spiral out of control, work on it has already begun.
Two new studies appeared in May, both describing research led by computer scientist Jeff Clune at the University of British Columbia, and both building on his earlier projects. In 2018, he and his colleagues created a system called Go-Explore that learns to play video games through the trial-and-error process known as reinforcement learning. The system periodically saves the agent's progress, later picks out interesting saved states and continues exploring from them. Which states to select is decided by hard-coded rules, an improvement over choosing them at random.
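The loop can be pictured roughly as follows. This is only a minimal sketch in Python, assuming a hypothetical game interface (env.reset, env.step, env.snapshot, env.restore and a discretized state.cell attribute are placeholders, not the actual Go-Explore implementation):

```python
import random

def go_explore(env, iterations=1000, steps_per_visit=50):
    archive = {}   # cell -> saved snapshot of the game state reached there
    visits = {}    # cell -> how many times we have returned to it

    state = env.reset()
    archive[state.cell] = env.snapshot()
    visits[state.cell] = 0

    for _ in range(iterations):
        # 1. Select a promising saved state with a hard-coded rule
        #    (here simply the least-visited cell; the real system used richer
        #    heuristics, but the point is that selection is not random).
        cell = min(archive, key=lambda c: visits[c])
        visits[cell] += 1

        # 2. Return to that state and explore onward from it.
        env.restore(archive[cell])
        for _ in range(steps_per_visit):
            state, reward, done = env.step(random.choice(env.actions))
            # 3. Archive newly reached states so later rounds can resume there.
            if state.cell not in archive:
                archive[state.cell] = env.snapshot()
                visits[state.cell] = 0
            if done:
                break
    return archive
```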
Intelligence glue
This year, Clune's lab developed Intelligent Go-Explore (IGE), which replaces the hand-coded rules with GPT-4 for selecting promising states from the archive. The language model can also intelligently choose actions that help the system explore, and judge whether the resulting states are interesting enough to archive.
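In rough terms, IGE swaps the hard-coded selection rule for calls to the language model. The sketch below is only illustrative, assuming a hypothetical llm(prompt) helper that wraps a GPT-4 call and the same placeholder game interface as above; the real prompts and archive format differ:

```python
from dataclasses import dataclass

@dataclass
class SavedState:
    description: str   # text summary of the state, shown to the language model
    snapshot: object   # saved game state that can be restored later

def ige_step(env, archive, llm, steps_per_visit=20):
    # 1. Ask the language model which archived state is most promising.
    listing = "\n".join(f"{i}: {s.description}" for i, s in enumerate(archive))
    choice = int(llm(f"Saved game states:\n{listing}\n"
                     "Which is most promising to explore from? Reply with its number."))
    env.restore(archive[choice].snapshot)

    for _ in range(steps_per_visit):
        # 2. Ask it which action would best help the agent explore.
        action = llm(f"State: {env.describe()}\nLegal actions: {env.actions}\n"
                     "Which action should the agent take next?")
        state, reward, done = env.step(action)

        # 3. Ask it whether the new state is interesting enough to archive.
        verdict = llm(f"The agent reached: {env.describe()}\n"
                      "Is this state interestingly new compared with the archive? yes/no")
        if verdict.strip().lower().startswith("yes"):
            archive.append(SavedState(env.describe(), env.snapshot()))
        if done:
            break
```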
LLMs are like "intelligence glue": whatever you need them for, you plug them in and they just work.
The researchers tested Intelligent Go-Explore on three types of task that require multi-step reasoning and text processing. In one, the system had to perform mathematical operations to produce the number 24. In another, it had to carry out tasks on a two-dimensional grid, such as moving objects according to text descriptions and instructions. And in the third, it played games driven by text instructions, such as cooking or collecting coins in a maze.
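As an illustration of the first task type, here is a small brute-force solver, assuming the task follows the classic "Game of 24" format of combining four given numbers with the basic arithmetic operations (the exact setup in the study may differ):

```python
from itertools import permutations, product

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '/': lambda a, b: a / b if b else float('inf')}

def solve_24(numbers, target=24.0):
    # Try every ordering of the numbers and every choice of three operators,
    # evaluating left to right: ((a o1 b) o2 c) o3 d.
    for a, b, c, d in permutations(numbers):
        for o1, o2, o3 in product(OPS, repeat=3):
            value = OPS[o3](OPS[o2](OPS[o1](a, b), c), d)
            if abs(value - target) < 1e-6:
                return f"(({a} {o1} {b}) {o2} {c}) {o3} {d}"
    return None   # no left-to-right solution (other groupings are not tried)

print(solve_24([1, 2, 3, 4]))   # -> "((1 + 2) + 3) * 4"
```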
The researchers compared IGE with four other methods. Some of them sampled actions at random, while the others fed the current game state and its history into an LLM.
IGE outperformed all the other methods. In the coin-collecting game, it won 22 of 25 plays, while none of the other methods won a single one.
The system simulates human creativity.
Deciding what to explore next is in many ways the central problem of reinforcement learning, says study co-author Cong Lu, a computer scientist at the University of British Columbia. According to Clune, large language models let these systems go further by drawing on vast amounts of human data.
Not human, and therefore dangerous?
The second new system does not just look for ways to solve given tasks; it searches for interesting new tasks to pose. It builds on OMNI (Open-endedness via Models of human Notions of Interestingness), a method Clune's lab created last year.
In OMNI, a large language model suggests tasks to an AI agent in a given virtual environment, based on tasks the agent has previously completed or attempted. But OMNI was limited to hand-built virtual environments, so the researchers created OMNI-EPIC (OMNI with Environments Programmed In Code), which has the language model write the code for the environments as well.
Its archive is seeded with example tasks, each represented by a natural-language description and the computer code that implements it. OMNI-EPIC picks a task, uses an LLM to generate a description and code for a new variant, then deploys another LLM call to judge whether the new task is novel or creative, and whether it is neither too easy nor too hard. If it is interesting enough, an AI agent is trained on the task through reinforcement learning, and the task is added to the archive.
The process repeats over and over, producing an ever-growing collection of new, increasingly complex tasks and AI agents trained to perform them.
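The loop might look roughly like this. It is a simplified sketch, assuming hypothetical llm(prompt) and train_agent(code) helpers; the actual prompts, environment code and success checks in OMNI-EPIC are considerably more involved:

```python
import random
from dataclasses import dataclass

@dataclass
class Task:
    description: str   # natural-language description of the task
    code: str          # environment/task code generated for it

def omni_epic(seed_tasks, llm, train_agent, iterations=100):
    archive = list(seed_tasks)                      # start from example tasks
    for _ in range(iterations):
        parent = random.choice(archive)             # 1. pick an existing task

        # 2. Ask an LLM for a description and code for a new variant.
        description = llm(f"Propose a new task building on: {parent.description}")
        code = llm(f"Write the environment code for this task:\n{description}")

        # 3. Ask another LLM call to judge novelty and difficulty.
        verdict = llm("Existing tasks:\n"
                      + "\n".join(t.description for t in archive)
                      + f"\nCandidate task:\n{description}\n"
                      "Is it novel, and neither too easy nor too hard? yes/no")
        if not verdict.strip().lower().startswith("yes"):
            continue

        # 4. Train an RL agent on the new task; keep it if the agent learns it.
        if train_agent(code):
            archive.append(Task(description, code))
    return archive
```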
The search aims to automatically discover tasks that are both novel and learnable. OMNI-EPIC has already generated more than 200 problems, including math and literature problems.
According to Jakob Foerster, a computer scientist at the University of Oxford, the systems are not truly open-ended because they use LLMs trained on human data, which limits how creative they can be. Some researchers argue that open-endedness is essential for AI, and that a genuinely open-ended algorithm could innovate, depart from its human roots and produce new, interesting ideas that no human would have thought of. But many experts worry that such a super-intelligent AI could be dangerous, especially if it is not aligned with human values, and some consider open-endedness one of the riskiest areas of machine-learning research.