ℓ1,2-Norm and CUR Decomposition based Sparse Online Active Learning for Data Streams with Streaming Features Conference

Chen, Z, He, Y, Wu, D et al. (2024). ℓ1,2-Norm and CUR Decomposition based Sparse Online Active Learning for Data Streams with Streaming Features. 384-393. 10.1109/BigData62323.2024.10825278

cited authors

  • Chen, Z; He, Y; Wu, D; Zuo, L; Li, K; Zhang, W; Deng, Z

abstract

  • Aiming at learning from a sequence of data instances over time, online learning has attracted increasing attention in the big data era. Two important variants have emerged: sparse online learning, which has been extensively explored by imposing sparsity constraints on online models through techniques such as truncated gradient, ℓ1-norm regularization, ℓ1-ball projection, and regularized dual averaging; and online active learning, which aims to build an online prediction model from a limited number of labeled instances by deploying so-called query strategies to select informative instances over time. However, most existing studies consider sparse online learning or online active learning with fixed feature spaces, whereas in practice the features may evolve dynamically over time. To this end, we propose a novel unified one-pass online learning framework named OASF that performs online active learning and sparse online learning simultaneously, tailored for data streams described by open feature spaces, where new features can emerge constantly and old features may vanish over various time spans. Specifically, we develop an effective online CUR matrix decomposition based on the ℓ1,2 mixed-norm constraint for simultaneously selecting important up-to-date samples in a sliding window and promoting stable and meaningful features in open feature spaces over time. If the loss function is both Lipschitz and convex, a sub-linear regret bound is guaranteed for the proposed algorithm. Extensive experiments on multiple streaming datasets demonstrate the effectiveness of the proposed OASF compared with state-of-the-art online active learning and sparse online learning methods.

publication date

  • January 1, 2024

Digital Object Identifier (DOI)

  • 10.1109/BigData62323.2024.10825278

start page

  • 384

end page

  • 393