C2M3: Cycle-Consistent Multi-Model Merging

Bibliographic details
Title: C2M3: Cycle-Consistent Multi-Model Merging
Authors: Donato Crisostomi, Marco Fumero, Daniele Baieri, Florian Bernard, Emanuele Rodola
Contributors: Crisostomi, Donato, Fumero, Marco, Baieri, Daniele, Bernard, Florian, Rodola, Emanuele
Publication year: 2024
Collection: Sapienza Università di Roma: CINECA IRIS
Subject terms: model merging, neural networks, cycle consistency, frank-wolfe, matching
Description: In this paper, we present a novel data-free method for merging neural networks in weight space. Our method optimizes for the permutations of network neurons while ensuring global coherence across all layers, and it outperforms recent layer-local approaches in a set of challenging scenarios. We then generalize the formulation to the N-models scenario to enforce cycle consistency of the permutations with guarantees, allowing circular compositions of permutations to be computed without accumulating error along the path. We qualitatively and quantitatively motivate the need for such a constraint, showing its benefits when merging homogeneous sets of models in scenarios spanning varying architectures and datasets. We finally show that, when coupled with activation renormalization, the approach yields the best results in the task.
Document type: conference object
Language: English
Relation: ispartofbook:Advances in Neural Information Processing Systems; Thirty-eighth Annual Conference on Neural Information Processing Systems; https://hdl.handle.net/11573/1726455
Availability: https://hdl.handle.net/11573/1726455
Accession number: edsbas.8A365732
Database: BASE
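
Illustrative note: the description refers to cycle consistency, meaning that composing neuron permutations around a closed loop of models (A to B, B to C, C back to A) returns the identity. The sketch below shows one common way such a property can arise, namely by deriving every pairwise permutation from a per-model map into a shared reference ordering. This construction, and the names `to_universe`, `pairwise`, `compose`, and `invert`, are assumptions made for illustration; the record itself does not specify how the paper enforces the constraint, and the permutations here are random rather than fitted to any network.

```python
import random


def compose(p, q):
    """Return the permutation that applies q first, then p (index lists)."""
    return [p[q[i]] for i in range(len(q))]


def invert(p):
    """Return the inverse of permutation p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return inv


def random_perm(n, rng):
    """Return a random permutation of range(n)."""
    p = list(range(n))
    rng.shuffle(p)
    return p


rng = random.Random(0)
n = 6  # hypothetical number of neurons in one layer

# Assumption: each model has one map from its neuron order to a shared order.
to_universe = {name: random_perm(n, rng) for name in "ABC"}


def pairwise(src, dst):
    """Map neurons of src to neurons of dst by passing through the shared order."""
    return compose(invert(to_universe[dst]), to_universe[src])


# Compose around the loop A -> B -> C -> A.
cycle = compose(pairwise("C", "A"),
                compose(pairwise("B", "C"), pairwise("A", "B")))
identity = list(range(n))
print(cycle == identity)
```

Because every pairwise map passes through the same shared ordering, the intermediate maps cancel when composed around the loop, so the result is the identity by construction. Pairwise permutations estimated independently for each pair of models carry no such guarantee.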