Posts by Collection

portfolio

publications

NTKCPL: Active Learning on Top of Self-Supervised Model by Estimating True Coverage

Published as an arXiv preprint, 2023

This paper approximates the empirical risk over the whole active learning pool using the empirical neural tangent kernel (eNTK) and clustering pseudo-labels (CPL), widening the range of budgets over which active learning is effective.

Recommended citation: Ziting Wen, Oscar Pizarro, and Stefan Williams. "NTKCPL: Active Learning on Top of Self-Supervised Model by Estimating True Coverage." arXiv preprint arXiv:2306.04099 (2023).
Download Paper

Active self-semi-supervised learning for few labeled samples

Published in Neurocomputing, 2024

This paper is about achieving good performance with very few human annotations (on average 1 to 4 labels per class). Prior pseudo-labels transfer information from pre-trained models to semi-supervised training, and active learning improves the accuracy of those prior pseudo-labels.

Recommended citation: Ziting Wen, Oscar Pizarro, and Stefan Williams. "Active self-semi-supervised learning for few labeled samples." Neurocomputing (2024): 128772.
Download Paper

Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models

Published in TMLR, 2024

This paper is about balancing overall cost (labeling cost and training cost) against active learning sampling time. It includes an empirical analysis of which differences in sample selection between the proxy model (used in efficient AL) and the fine-tuned model (used in standard AL) contribute to drops in AL performance, and why.

Recommended citation: Ziting Wen, Oscar Pizarro, and Stefan B. Williams. "Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models." Transactions on Machine Learning Research (2024).
Download Paper

talks

teaching