Unraveling the Commoditization of Large Language Models
Large Language Models (LLMs) have stirred excitement not just among tech enthusiasts and scholars, but also among the general public. OpenAI's ChatGPT, at the forefront, has spurred the emergence of a plethora of competing models, many of them open source. In this article, we delve into the forces propelling the commoditization of LLMs.
Low Switching Costs
One crucial element facilitating the commoditization of Large Language Models is their low switching costs. Transitioning from one LLM to another is relatively straightforward owing to the common language (English) used for queries. This uniformity results in minimal costs when switching, akin to navigating between different e-commerce websites. While LLM providers may offer varying APIs, these distinctions are not substantial enough to significantly elevate switching costs.
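To make this concrete, here is a minimal sketch in Python of what a provider switch can look like in application code. It assumes the provider exposes an OpenAI-compatible chat endpoint, which many commercial and open-source model hosts now do; the second base URL and model name below are hypothetical placeholders, not references to any real service.

```python
from openai import OpenAI

# Provider details live in configuration, not in application logic.
# The "other" entry is a hypothetical placeholder for any host that
# exposes an OpenAI-compatible chat endpoint.
PROVIDERS = {
    "openai": {"base_url": "https://api.openai.com/v1", "model": "gpt-4"},
    "other": {"base_url": "https://llm.example.com/v1", "model": "example-open-model"},
}

def ask(provider: str, prompt: str, api_key: str) -> str:
    """Send the same English prompt to whichever provider is configured."""
    cfg = PROVIDERS[provider]
    client = OpenAI(base_url=cfg["base_url"], api_key=api_key)
    response = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Switching providers is a one-word change at the call site:
# ask("openai", "Summarize our refund policy in one sentence.", api_key="...")
# ask("other", "Summarize our refund policy in one sentence.", api_key="...")
```

Because the prompt itself is plain English, nothing about it has to change when the model behind the endpoint does.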
In stark contrast, shifting between database systems involves significant expense and complexity: data migration, configuration updates, traffic management, adaptation to different query languages, and resolution of performance issues. Incorporating long-term memory into LLMs [4] could enhance their value for businesses but might also make switching providers more expensive. Nonetheless, for applications that only need basic LLM functionality without memory, switching costs remain negligible.
Competition among Pioneering Organizations
OpenAI’s GPT-3.5 initially captured public attention, followed swiftly by the more capable GPT-4. Concurrently, rivals such as Anthropic with Claude 3, Meta with Llama 3, Google with Gemini 1.5 Pro, and others have introduced models, some of which match or surpass OpenAI’s offerings on benchmarks [1, 5].
The abundance of large datasets on the web used to train these models [3] has enabled this rapid progression, although collecting and cleaning that data requires substantial investment in hardware and people. Recognizing the strategic significance of AI, major organizations are keen to reduce their reliance on a handful of providers and are therefore investing heavily in these technologies themselves. The result is fierce competition that compels them to release improved LLM versions frequently and to strengthen the tooling around them. With new models surfacing almost monthly [2], performance keeps improving and prices keep falling, leaving ever fewer meaningful distinctions between providers’ offerings.
Open Source Revolution
Large Language Models are, at their core, software running on hardware, like any other software product. The software industry has significantly democratized such technology through open-source efforts such as Linux and Android.
In the realm of artificial intelligence, intensified rivalry among organizations has made open-sourcing LLMs an enticing strategy to level the playing field. Open-source models like Llama and Mistral enable multiple infrastructure providers to enter the market, boosting competition and lowering the cost of AI services. These models also benefit from community-driven enhancements, which, in turn, aid the original developers.
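To illustrate how low the barrier to running an open-weight model has become, the sketch below loads an instruction-tuned model with the Hugging Face transformers library and generates text locally. The model identifier is only an example of an openly released model; some releases require accepting a license on the Hugging Face Hub first, and a reasonably large GPU (or patience on CPU) is assumed.

```python
from transformers import pipeline

# Any open-weight instruction-tuned model can be substituted here; this
# identifier is just an example of a publicly released model.
generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",
    device_map="auto",  # use an available GPU (requires the accelerate package); omit to run on CPU
)

prompt = "In two sentences, explain why open-source LLMs lower the cost of AI services."
result = generator(prompt, max_new_tokens=120, do_sample=False)
print(result[0]["generated_text"])
```

Open-source inference servers can then expose the same weights behind OpenAI-compatible endpoints, which is exactly how many hosting providers end up competing on price for identical models.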
Moreover, open-source LLMs lay the groundwork for future research, making experimentation more affordable and further narrowing the room for differentiation among competing products. This mirrors Linux’s impact on the server market, where its rise enabled a range of providers to offer standardized server solutions at lower cost, commoditizing server technology.
In Closing
The factors outlined above point towards a future in which LLMs become commoditized. Software practitioners can use this insight to assess how LLMs can tackle specific business problems cost-effectively, while researchers can take advantage of this trend to explore new areas of study built on LLMs.
References
[1] Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, and Ion Stoica. 2024. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. arXiv.org. Retrieved May 4, 2024 from https://arxiv.org/abs/2403.04132
[2] klu.ai. Large Language Models Timeline. Retrieved May 4, 2024 from https://klu.ai/_next/image?url=%2F_next%2Fstatic%2Fmedia%2Fklu-large-language-models-timeline.e9fde945.png&w=750&q=100
[3] Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, and Lianwen Jin. 2024. Datasets for Large Language Models: A Comprehensive Survey. arXiv.org. Retrieved May 4, 2024 from https://arxiv.org/abs/2402.18041
[4] Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, and Furu Wei. 2023. Augmenting Language Models with Long-Term Memory. arXiv.org. Retrieved May 4, 2024 from https://arxiv.org/abs/2306.07174
[5] Vellum. 2024. LLM Leaderboard. Retrieved May 4, 2024 from https://www.vellum.ai/llm-leaderboard#model-comparison

Dhiren Amar Navani is a Senior Software Engineer at Zillow. Visit his blog and newsletter at https://www.softwarebytes.dev/