Scaling RCL Language Models

April 21, 2022

Authored by: Dr. Morten Middelfart, Sam Martin, and Ben Martin

Abstract

In this paper, we demonstrate that RCL scales sublinearly with dataset size and linearly with compute. We also outline how RCL differs from the transformer architecture and promises better performance. RCL does not employ neural networks; unlike deep learning, RCL training time is a function only of dataset size and compute, and RCL model size is a function of dataset size. Additionally, we demonstrate that RCL continues to scale linearly whether or not we control for entropy. Finally, by employing a shared-nothing architecture and running on CPUs, RCL scales without theoretical limit as 1) dataset size increases or 2) processes are distributed across more machines.
