
The first open source equivalent of OpenAI’s ChatGPT has arrived, but good luck running it on your laptop – or at all.
This week, Philip Wang, the developer known for reverse-engineering closed-source AI systems such as Meta’s Make-A-Video, released PaLM + RLHF, a text-generating model that behaves similarly to ChatGPT. The system combines PaLM, a large language model from Google, with a technique called Reinforcement Learning from Human Feedback – RLHF for short – to create a system that can perform virtually any task ChatGPT can, including drafting emails and suggesting computer code.
But PaLM + RLHF is not pre-trained. That is, the system has not been trained on the example data from the Internet it needs to actually work. Downloading PaLM + RLHF won’t magically give you a ChatGPT-like experience – that would require compiling gigabytes of text from which the model can learn, and finding hardware powerful enough to handle the training workload.
Like ChatGPT, PaLM + RLHF is essentially a statistical tool for predicting words. Fed an enormous number of examples from training data – Reddit posts, news articles, e-books – PaLM + RLHF learns how likely words are to occur based on patterns such as the semantic context of the surrounding text.
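To see what “statistical word prediction” means in miniature, here is a toy, hypothetical sketch: a bigram model that counts which words follow which in a tiny corpus and picks the most probable continuation. Models like PaLM learn far richer patterns with billions of neural network parameters, but the underlying objective – predict the next word from context – is the same.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text a real model trains on.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word (a bigram model).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word and its estimated probability."""
    counts = follow_counts[word]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

print(predict_next("the"))  # ('cat', 0.5) - "cat" follows "the" in 2 of 4 cases
```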
ChatGPT and PaLM + RLHF share a special sauce in Reinforcement Learning from Human Feedback, a technique that aims to better align language models with what users want them to accomplish. RLHF involves training a language model – in the case of PaLM + RLHF, PaLM – and fine-tuning it on a dataset that pairs prompts (e.g., “Explain machine learning to a six-year-old”) with what human volunteers expect the model to say (e.g., “Machine learning is a form of AI…”). The prompts are then fed to the fine-tuned model, which generates several responses, and the volunteers rank all the responses from best to worst. Finally, the rankings are used to train a ‘reward model’ that takes the original model’s responses and sorts them in order of preference, filtering for the best answers to a given prompt.
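The reward model step is the heart of the technique. Below is a minimal sketch of the standard pairwise-ranking objective used to train such a model, written in PyTorch with all names hypothetical – it illustrates the idea, not the API of Wang’s repository. Given two responses to the same prompt, the model learns to assign a higher score to the one human volunteers preferred.

```python
import torch
import torch.nn as nn

# Hypothetical reward model: maps a response embedding to a scalar score.
# Real reward models are large transformers; a small head over pooled
# features is enough to show the training objective.
reward_model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

def ranking_loss(preferred_emb, rejected_emb):
    """Pairwise loss: the preferred response should outscore the rejected one."""
    score_preferred = reward_model(preferred_emb)
    score_rejected = reward_model(rejected_emb)
    # -log sigmoid(difference): the loss shrinks as the score gap widens.
    return -torch.nn.functional.logsigmoid(score_preferred - score_rejected).mean()

# Dummy embeddings standing in for two ranked responses to the same prompt.
preferred = torch.randn(8, 768)  # batch of human-preferred responses
rejected = torch.randn(8, 768)   # the responses ranked lower

optimizer.zero_grad()
loss = ranking_loss(preferred, rejected)
loss.backward()
optimizer.step()
```

Once trained, the reward model serves as the optimization target for a final round of reinforcement learning on the language model itself, typically with an algorithm such as PPO.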
Collecting the training data is an expensive process. And training itself isn’t cheap. PaLM is 540 billion parameters in size, ‘parameters’ being the parts of the language model learned from the training data. A 2020 study estimated the cost of developing a text-generating model with just 1.5 billion parameters at as much as $1.6 million. And training the open source model Bloom, which has 176 billion parameters, took three months on 384 Nvidia A100 GPUs; a single A100 costs thousands of dollars.
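A back-of-the-envelope calculation using the Bloom figures above shows how quickly the compute bill grows. The hourly rate below is an assumption for illustration only; real A100 pricing varies widely by provider and contract.

```python
# Back-of-the-envelope training cost for a Bloom-scale run.
gpus = 384              # A100s used for Bloom (from the paragraph above)
days = 90               # roughly three months of training
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")  # 829,440 GPU-hours

# Hypothetical cloud rate; actual A100 prices differ by provider and contract.
usd_per_gpu_hour = 2.0
print(f"~${gpu_hours * usd_per_gpu_hour:,.0f}")  # ~$1,658,880
```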
Running a trained model the size of PaLM + RLHF is also non-trivial. Bloom requires a dedicated PC with around eight A100 GPUs. Cloud alternatives are pricey: back-of-the-envelope math puts the cost of running OpenAI’s text-generating GPT-3 – which has about 175 billion parameters – on a single Amazon Web Services instance at about $87,000 per year.
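That $87,000 estimate follows from simple always-on arithmetic: an instance billed by the hour, running around the clock. The hourly rate below is hypothetical, chosen only to show how such an estimate is derived; actual AWS pricing depends on instance type and purchase terms.

```python
# Annual cost of one cloud instance kept running 24/7.
hours_per_year = 24 * 365   # 8,760 hours
usd_per_hour = 9.93         # hypothetical hourly rate for illustration
print(f"~${hours_per_year * usd_per_hour:,.0f} per year")  # ~$86,987 per year
```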
Sebastian Raschka, an AI researcher, points out in a LinkedIn post about PaLM + RLHF that scaling up the necessary development workflows can also prove challenging. “Even if someone gives you 500 GPUs to train this model, you still have to deal with infrastructure and need a software framework that can handle it,” he said. “It’s possible, of course, but it’s a big effort right now (of course we’re developing frameworks to make that simpler, but it’s still not trivial).”
That is to say, PaLM + RLHF won’t be replacing ChatGPT today – not unless a well-funded company (or person) goes to the trouble of training it and making it publicly available.
In better news, several other efforts to replicate ChatGPT are progressing rapidly, including one led by a research group called CarperAI. In partnership with the open AI research organization EleutherAI and the startups Scale AI and Hugging Face, CarperAI plans to release the first ready-to-run, ChatGPT-like AI model trained with human feedback.
LAION, the nonprofit that supplied the initial dataset used to train Stable Diffusion, is also spearheading a project to replicate ChatGPT using the newest machine learning techniques. Ambitiously, LAION aims to build an “Assistant of the Future” – one that won’t just write emails and cover letters but “do meaningful work, use APIs, dynamically research information, and much more.” The project is in its early stages, but a GitHub page with resources for it went live a few weeks ago.