Unlocking the Power of Generative AI with SageMaker Model Registry
Generative artificial intelligence (AI) foundation models (FMs) are changing how businesses operate, offering the versatility to address a wide range of use cases. The true potential of FMs lies in adapting them to domain-specific data. However, managing these models across the model and business lifecycle can be complex, and operationalizing the pipelines that customize and deploy FMs becomes crucial as they are tailored to different domains and datasets.
Amazon SageMaker, a fully managed service for building, training, and deploying machine learning (ML) models, has seen a surge in adoption for customizing and deploying the FMs that power generative AI applications. SageMaker offers rich features for creating automated workflows that deploy models at scale. One of its standout features for operational excellence in model management is the Model Registry, which catalogs and manages model versions, enabling collaboration and governance. Trained and evaluated models can be stored in the Model Registry for effective lifecycle management.
Recently, Amazon SageMaker introduced new features in the Model Registry to simplify the versioning and cataloging of FMs. Customers can leverage SageMaker to train or fine-tune FMs, including Amazon SageMaker JumpStart and Amazon Bedrock models, and manage these models within the Model Registry. As organizations scale generative AI applications across different use cases, the number of models can multiply rapidly. The SageMaker Model Registry serves as a central inventory for tracking models, versions, and associated metadata.
Exploring New Features of Model Registry
The latest features in the Model Registry streamline FM management, allowing the registration of unzipped model artifacts and the programmatic acceptance of End User License Agreements (EULAs), with no interactive step required from the user.
Overview
The existing Model Registry worked well for smaller, traditional models, but faced challenges with large FMs that required interactive EULA acceptance. With the new features, registering a fine-tuned FM in the Model Registry is more straightforward, enabling seamless deployment for inference.
The model development lifecycle is iterative, involving numerous experimentation cycles to reach optimal model performance. Once trained, models can be registered in the Model Registry, where each registration is tracked as a version within a model group. Versions can be compared on quality metrics and assigned an approval status that indicates whether they are ready to deploy.
After manual approval, a continuous integration and continuous deployment (CI/CD) pipeline can be triggered to deploy models to production. Alternatively, the Model Registry can serve as a repository of approved models that can be accessed and deployed by various teams to build applications around them.
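As a sketch of how that approval step looks in practice: changing a registered version's approval status is a single UpdateModelPackage call, and a CI/CD pipeline can react to the resulting state-change event. The ARN and group name below are illustrative placeholders, and the actual AWS call is left commented out because it requires credentials.

```python
# Sketch of flipping a registered model version's approval status -- the
# change a CI/CD pipeline can listen for. The ARN below is illustrative.

VALID_STATUSES = ("Approved", "Rejected", "PendingManualApproval")

def build_approval_update(model_package_arn: str, status: str = "Approved") -> dict:
    """Build the UpdateModelPackage request that changes approval status."""
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}")
    return {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
    }

request = build_approval_update(
    "arn:aws:sagemaker:us-east-1:123456789012:model-package/genai-models/3"
)

# With AWS credentials configured, the actual call would be:
# import boto3
# boto3.client("sagemaker").update_model_package(**request)
```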
Enhancements in Model Registry
The Model Registry released two key features – ModelDataSource and Source model URI – that expedite deployment and simplify the registration of proprietary models.
ModelDataSource for Improved Deployment
Previously, model artifacts had to be stored together with the inference code in a compressed archive (model.tar.gz) when registering models in the Model Registry. For FMs with billions of parameters, compressing and decompressing these large archives increased latency during endpoint startup. The model_data_source parameter now accepts the location of unzipped model artifacts in Amazon S3, streamlining the registration process and reducing latency during model deployment.
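A minimal sketch of what registering unzipped artifacts looks like via the underlying CreateModelPackage API: the container spec points ModelDataSource at an S3 prefix with CompressionType set to "None". The bucket, image URI, and group name here are placeholders, and the boto3 call itself is commented out because it requires AWS credentials.

```python
# Sketch: an InferenceSpecification container that references uncompressed
# artifacts under an S3 prefix instead of a model.tar.gz archive.
# Bucket, image, and group names are illustrative placeholders.

def build_unzipped_container(image_uri: str, s3_prefix: str) -> dict:
    """Build a container spec whose ModelDataSource is an unzipped S3 prefix."""
    return {
        "Image": image_uri,
        "ModelDataSource": {
            "S3DataSource": {
                "S3Uri": s3_prefix,        # e.g. "s3://my-bucket/llm-artifacts/"
                "S3DataType": "S3Prefix",  # treat the URI as a prefix of objects
                "CompressionType": "None", # artifacts are stored unzipped
            }
        },
    }

container = build_unzipped_container(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference:latest",
    "s3://my-bucket/llm-artifacts/",
)

# Registration itself (requires AWS credentials):
# import boto3
# boto3.client("sagemaker").create_model_package(
#     ModelPackageGroupName="my-fm-group",
#     InferenceSpecification={
#         "Containers": [container],
#         "SupportedContentTypes": ["application/json"],
#         "SupportedResponseMIMETypes": ["application/json"],
#     },
# )
```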
Public JumpStart models and certain FMs require EULA acceptance before use, which previously prevented their storage in the Model Registry. The new EULA acceptance flag in the model_data_source parameter now allows such models to be registered, enabling cataloging, versioning, and metadata association in the Model Registry.
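As a sketch under the same CreateModelPackage request shape, EULA acceptance is expressed by adding a ModelAccessConfig with AcceptEula set to true inside the S3 data source; the image and bucket names below are again placeholders.

```python
# Sketch: adding the EULA acceptance flag to a container spec so a gated
# model can be registered without an interactive acceptance step.
import copy

def accept_eula(container: dict) -> dict:
    """Return a copy of a container spec with the EULA flag set."""
    out = copy.deepcopy(container)
    out["ModelDataSource"]["S3DataSource"]["ModelAccessConfig"] = {"AcceptEula": True}
    return out

gated = accept_eula({
    "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference:latest",
    "ModelDataSource": {
        "S3DataSource": {
            "S3Uri": "s3://my-bucket/jumpstart-llm/",  # placeholder prefix
            "S3DataType": "S3Prefix",
            "CompressionType": "None",
        }
    },
})
```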
Source model URI for Simplified Registration
The Model Registry now supports automatic population of the inference specification for recognized model IDs, including select AWS Marketplace models, SageMaker-hosted models, and versioned model packages already in the Model Registry. The source model URI also facilitates the registration of proprietary JumpStart models from providers such as AI21 Labs, Cohere, and LightOn without the need for an inference specification file upfront.
Previously, users had to provide a complete inference specification when registering trained models in the SageMaker Model Registry. With source_uri support, users can register any model simply by providing a source model URI, making the registration process hassle-free. Subsequently, the model can be packaged with the necessary inference specification for deployment.
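A minimal sketch of the simplified registration path: the CreateModelPackage request carries only the group name and a source URI, with no inference specification. The group name and model reference below are hypothetical, and the AWS call is commented out since it needs credentials.

```python
# Sketch: registering a model by source URI alone, deferring the inference
# specification. Group name and source reference are hypothetical.

def build_source_uri_request(group_name: str, source_uri: str) -> dict:
    """Build a minimal CreateModelPackage request using only a source URI."""
    return {
        "ModelPackageGroupName": group_name,
        "SourceUri": source_uri,
    }

request = build_source_uri_request(
    "genai-models",
    "arn:aws:sagemaker:us-east-1:123456789012:model/my-fine-tuned-llm",  # placeholder
)

# With AWS credentials configured:
# import boto3
# boto3.client("sagemaker").create_model_package(**request)
```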
Conclusion
As businesses embrace generative AI for various applications, robust model management and versioning are crucial. The Model Registry enables organizations to achieve version control, tracking, collaboration, lifecycle management, and governance of FMs effectively. These new features now empower users to better adopt generative AI and drive transformational outcomes.
For more information on Model Registry, visit the SageMaker console and explore the possibilities of managing generative AI models seamlessly across the model lifecycle.
About the Authors
Chaitra Mathur – Principal Solutions Architect at AWS
Kait Healy – Solutions Architect II at AWS
Saumitra Vikaram – Senior Software Engineer at AWS
Siamak Nariman – Senior Product Manager at AWS
Selva Kumar – Software Engineer at AWS