Abstract: <p>Continual Learning (CL) is an emerging field that focuses on developing models capable of learning continuously from an incoming stream of data, as opposed to hundreds of passes over static, curated datasets. These models aim to retain previously acquired knowledge while seamlessly integrating new information, often under constraints such as limited storage capacity. To advance this field, we first pinpoint the limitations of the current research paradigm. We address these by (i) imposing more realistic constraints, such as optimizing learning within limited computational resources, and (ii) demonstrating the effectiveness of simple, straightforward algorithms. Additionally, we address flaws in existing metrics and improve data-collection methods to increase the efficiency and applicability of continual models in practical scenarios. We detail our contributions below:</p>
<p><strong>1. General Formulation & Concerns on Relevance of the Online Stream.</strong> We pose a general formulation of the CL problem for classification, inspired by open-set recognition, which requires the model to adapt continually to new classes from a stream of data. We detail how several common assumptions in continual learning oversimplify the problem, reducing the practical relevance of the resulting methods. Our GDumb model demonstrates that even simple ablations, such as ignoring the online stream entirely, can match or outperform continual learning algorithms designed for specific continual learning scenarios, challenging the perceived progress in this field; a sketch of the idea follows.</p>
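<p>To make the "ignoring the online stream" ablation concrete, below is a minimal, illustrative sketch of the greedy class-balanced sampler at the heart of GDumb; class and method names are ours, not the released implementation. At evaluation time, a network is trained from scratch on the stored memory alone, which is the sense in which the online stream itself is never used for learning.</p>

```python
import random
from collections import defaultdict

class GreedyBalancedMemory:
    """Greedy class-balanced memory in the spirit of GDumb's sampler (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = defaultdict(list)  # class label -> stored samples

    def __len__(self):
        return sum(len(v) for v in self.buffer.values())

    def add(self, x, y):
        if len(self) < self.capacity:
            self.buffer[y].append(x)     # memory not full: always accept
            return
        # Memory full: accept only if class y is under-represented,
        # evicting a random sample from the currently largest class.
        largest = max(self.buffer, key=lambda c: len(self.buffer[c]))
        if len(self.buffer[y]) < len(self.buffer[largest]):
            self.buffer[largest].pop(random.randrange(len(self.buffer[largest])))
            self.buffer[y].append(x)
```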
<p><strong>2. Challenges in Continual Representation Learning.</strong> Our RanDumb model explores continual representation learning by using a fixed random transform and a simple linear classifier, revealing that forgoing representation learning entirely outperforms continually learned deep-network representations across a wide range of standard continual learning benchmarks. This finding suggests the need for a principled re-evaluation of how we design and train models for effective continual learning.</p>
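<p>A minimal sketch of the underlying idea, assuming scikit-learn is available: a fixed random Fourier-feature embedding that is never trained, followed by a simple linear classifier updated online. The SGD classifier here is a stand-in for the streaming linear estimator used in the thesis, and all dimensions and hyperparameters are illustrative.</p>

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
embed = RBFSampler(gamma=1.0, n_components=2048, random_state=0)
embed.fit(rng.randn(1, 512))          # fixes the random projection; no learning from data
clf = SGDClassifier()                 # simple linear classifier on top
all_classes = np.arange(10)           # assume the label set is known in advance

for _ in range(100):                  # stand-in for the online stream
    x = rng.randn(8, 512)             # a batch of raw or pre-extracted features
    y = rng.randint(0, 10, size=8)
    z = embed.transform(x)            # fixed random transform, never updated
    clf.partial_fit(z, y, classes=all_classes)
```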
<p><strong>3. Introducing Computational Constraints.</strong> Current literature often overlooks practical computational and time budgets, instead assuming unrestricted resources for processing the data. We develop large-scale datasets, ImageNet2K and Continual Google Landmarks V2, extensively evaluate continual learning under compute-constrained settings, and show that traditional CL methods, even with strategic sampling and distillation, cannot outperform even simple experience-replay baselines. This finding indicates that existing CL approaches are too compute-intensive for real-world applications.</p>
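<p>The compute-constrained protocol can be summarised by the sketch below (PyTorch-style, with illustrative names): every learner, including the plain experience-replay baseline shown here, is allowed only a fixed number of gradient steps per incoming batch, so methods are compared at equal compute rather than at equal passes over the data.</p>

```python
import random
import torch
import torch.nn.functional as F

def budgeted_replay_step(model, optimizer, new_x, new_y, memory,
                         steps_per_batch=1, replay_size=32):
    """One stream step of experience replay under a hard per-batch compute budget."""
    for _ in range(steps_per_batch):                     # the compute budget
        if memory:
            idx = random.sample(range(len(memory)), min(replay_size, len(memory)))
            mem_x = torch.stack([memory[i][0] for i in idx])
            mem_y = torch.stack([memory[i][1] for i in idx])
            x, y = torch.cat([new_x, mem_x]), torch.cat([new_y, mem_y])
        else:
            x, y = new_x, new_y
        loss = F.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    memory.extend(zip(new_x, new_y))                     # naive unbounded memory, for brevity
```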
<p><strong>4. Online Continual Learning without Storage Constraints.</strong> We propose a simple algorithm that continually updates a kNN classifier on top of a fixed, pretrained feature extractor, suited to scenarios with rapidly changing data streams and minimal computational budgets. This method significantly outperforms existing methods in accuracy while maintaining low computational and storage requirements.</p>
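<p>A hedged sketch of the approach, assuming a frozen, pretrained backbone (torchvision's ResNet-50 is used here purely as a stand-in): learning from a new sample amounts to appending its embedding and label to an index, so updates are cheap and nothing previously stored is ever overwritten.</p>

```python
import torch
import torch.nn.functional as F
import torchvision

backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = torch.nn.Identity()        # expose penultimate-layer features
backbone.eval()

feats, labels = [], []                   # the ever-growing kNN "classifier"

@torch.no_grad()
def learn(x, y):                         # x: batch of images, y: integer labels
    feats.append(F.normalize(backbone(x), dim=1))
    labels.append(y)

@torch.no_grad()
def predict(x, k=5):
    z = F.normalize(backbone(x), dim=1)
    sims = z @ torch.cat(feats).T                       # cosine similarity to stored samples
    nn_labels = torch.cat(labels)[sims.topk(k, dim=1).indices]
    return nn_labels.mode(dim=1).values                 # majority vote over k neighbours
```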
<p><strong>5. Rapid Adaptation Metrics in Online Continual Learning.</strong> We critique the traditional metrics used in online continual learning. Our findings suggest that existing methods might not genuinely adapt, but rather memorize irrelevant data patterns. We alleviate this by proposing a new metric based on 'near-future' sample accuracy, which avoids rewarding spurious label correlations and better reflects a model's ability to genuinely adapt to new, incoming data.</p>
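<p>A minimal sketch of the proposed evaluation, assuming a model object that exposes predict and update methods (the function names and the delay parameter are illustrative): before training on each incoming batch, the current model is scored on a batch arriving a short delay ahead in the stream, so a method is rewarded for genuine adaptation rather than for memorizing correlations present in the very next samples.</p>

```python
def near_future_accuracy(stream, model, delay=1):
    """stream: list of (x_batch, y_batch) pairs in arrival order (illustrative)."""
    correct = total = 0
    for t, (x, y) in enumerate(stream):
        if t + delay < len(stream):
            future_x, future_y = stream[t + delay]   # batch arriving `delay` steps ahead
            preds = model.predict(future_x)
            correct += sum(int(p == label) for p, label in zip(preds, future_y))
            total += len(future_y)
        model.update(x, y)                           # only then train on the current batch
    return correct / max(total, 1)
```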
<p><strong>6. Leveraging Webly-Supervised Data.</strong> Tackling the impracticality of continually sourcing large, annotated datasets, we introduce a novel continual data-acquisition paradigm that allows models to adapt to new categories using only category names. We propose a simple method that uses uncurated, webly-supervised data, which not only reduces reliance on costly manual annotation but also demonstrates the potential of internet-sourced data to support effective learning. This is evidenced by our creation of the EvoTrends dataset, which reflects real-world trends at minimal cost.</p>
<p>Overall, this thesis lays the foundations for and advocates a shift towards more computationally efficient continual learning methods that are better suited to real-world applications.</p>