Exploring the Use of Software Product Lines for the Combination of Machine Learning Models
Gomez-Vazquez M., Cabot J.
ACM International Conference Proceeding Series, pp. 26-29, 2024
The size of Large Language Models (LLMs), and Machine Learning (ML) models in general, is a key factor in their capacity and in the quality of their responses. But it comes at a high cost, during both the training and the model execution phases. Recently, various model merging techniques and Mixture of Experts (MoE) architectures have been gaining popularity, as they enable the creation of large models by combining existing ones (the "experts" in the MoE approach). Creating these combinations remains a deeply technical task with many possible configurations to consider. In this sense, this paper aims to democratize the creation of combined ML models by presenting a product line approach to the specification and training of this type of ML architecture, starting from a feature model that helps users define, among other aspects, the types of models they want to combine, the combination strategy and, for the MoE approach, the tasks that should be associated with each expert.
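The abstract does not give a concrete specification format, but a rough sketch may help convey the idea of deriving a model-combination configuration from a feature model. The Python below is purely illustrative: the names (`Expert`, `CombinationSpec`, `gate_mode`) and the model identifiers are assumptions, not the paper's actual syntax or tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical feature-model-style specification for combining ML models.
# All names and fields are illustrative, not the paper's actual notation.

@dataclass
class Expert:
    """One candidate model and the tasks it should handle (MoE case)."""
    model_id: str                          # e.g. a Hugging Face model id
    tasks: List[str] = field(default_factory=list)

@dataclass
class CombinationSpec:
    """Top-level configuration derived from a feature-model selection."""
    base_model: str
    strategy: str                          # e.g. "moe" or a merging method
    experts: List[Expert] = field(default_factory=list)
    gate_mode: Optional[str] = None        # routing policy, MoE only

    def validate(self) -> None:
        # A feature model constrains which feature combinations are valid;
        # this enforces one such cross-tree constraint as an example.
        if self.strategy == "moe" and not self.experts:
            raise ValueError("MoE strategy requires at least one expert")

# Example: an MoE combining two domain experts over a shared base model.
spec = CombinationSpec(
    base_model="mistralai/Mistral-7B-v0.1",
    strategy="moe",
    experts=[
        Expert("codellama/CodeLlama-7b-hf", tasks=["code generation"]),
        Expert("BioMistral/BioMistral-7B", tasks=["biomedical QA"]),
    ],
    gate_mode="hidden",
)
spec.validate()
```

The `validate` method hints at the role the feature model plays: encoding which combinations of model types, strategies and expert-task assignments are admissible before any training is launched.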