[Author: Wang Xinrui | Reviewer: Liu Xinyue] Recently, two members of our research group, Zhang Xuelin (PhD student, class of 2022) and Yang Yongkang (master's student, class of 2024), had their papers "S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection" and "Towards Understanding Generalization of Federated Adversarial Learning: Perspective of Algorithmic Stability" accepted by ICML 2026 (the Forty-Third International Conference on Machine Learning, rated CCF-A), a top-tier international conference in artificial intelligence.

Paper: S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection.
This paper proposes a new method that handles variable selection and robust estimation simultaneously. Semi-supervised learning with manifold regularization is a classical framework for jointly exploiting labeled and unlabeled data. However, the conventional graph Laplacian matrix depends heavily on a prespecified similarity metric; on real-world data containing redundant or noisy input variables, it often imposes inappropriate penalties that severely degrade the predictive power of manifold-regularized methods. To address this, the paper proposes a novel Semi-Supervised Meta Additive Model (S2MAM) based on a bilevel optimization scheme, which integrates meta-learning and sparse additive models into the manifold-regularized semi-supervised learning framework. Through a probabilistic meta strategy, S2MAM automatically assigns masks to input variables to screen out the truly informative ones, and achieves interpretable predictions while updating the similarity matrix. Theoretically, the paper establishes computational convergence guarantees and a statistical generalization error bound for S2MAM, showing that the generalization error decays at a polynomial rate. Experiments on 4 synthetic and 12 real-world datasets with varying levels and categories of noise corruption show that S2MAM effectively resists feature noise while preserving interpretability and computational efficiency, with predictive performance comparable to state-of-the-art deep semi-supervised baselines.
[English abstract] Semi-supervised learning with manifold regularization is a classical framework for jointly learning from both labeled and unlabeled data, where the key requirement is that the support of the unknown marginal distribution has the geometric structure of a Riemannian manifold. Typically, the Laplace-Beltrami operator-based manifold regularization can be approximated empirically by the Laplacian regularization associated with the entire training data and its corresponding graph Laplacian matrix. However, the graph Laplacian matrix depends heavily on the prespecified similarity metric and may lead to inappropriate penalties when dealing with redundant or noisy input variables. To address the above issues, this paper proposes a new Semi-Supervised Meta Additive Model (S2MAM) based on a bilevel optimization scheme that automatically identifies informative variables, updates the similarity matrix, and simultaneously achieves interpretable predictions. Theoretical guarantees are provided for S2MAM, including the computational convergence and the statistical generalization bound. Experimental assessments across 4 synthetic and 12 real-world datasets, with varying levels and categories of corruption, validate the robustness and interpretability of the proposed approach.
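To see why the graph Laplacian penalty is sensitive to noisy input variables, here is a minimal NumPy sketch of Laplacian regularization with a feature mask. The fixed 0/1 mask is purely illustrative; S2MAM learns a probabilistic mask via its bilevel meta strategy, which this sketch does not implement:

```python
import numpy as np

def masked_laplacian_penalty(X, f, mask, sigma=1.0):
    """Laplacian penalty f^T L f, with the similarity matrix W built from
    mask-weighted inputs (an illustration, not the actual S2MAM update)."""
    Xm = X * mask                                   # down-weight input variables
    sq = ((Xm[:, None, :] - Xm[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))              # Gaussian similarity graph
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian L = D - W
    return float(f @ L @ f)                         # smoothness of f on the graph

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                        # pretend only feature 0 matters
f = X[:, 0]                                         # predictions vary with feature 0
all_vars = masked_laplacian_penalty(X, f, np.ones(5))
informative = masked_laplacian_penalty(X, f, np.array([1., 0., 0., 0., 0.]))
```

Because L is positive semi-definite, the penalty equals a weighted sum of squared prediction differences across graph edges; when the four uninformative variables are masked out, the similarity graph reflects only the geometry of the informative variable, which is the intuition behind screening variables before building W.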
Zhang Xuelin of Huazhong Agricultural University is the first author of this paper, and Prof. Chen Hong is the corresponding author. Wang Yingjie of China University of Petroleum (East China), Gong Tieliang of Xi'an Jiaotong University, and Gu Bin of Jilin University also contributed to this work.
Paper: Towards Understanding Generalization of Federated Adversarial Learning: Perspective of Algorithmic Stability.
Federated Adversarial Learning (FAL) effectively enhances model robustness by integrating adversarial training into the federated learning framework. Although recent advances have produced efficient FAL algorithms, existing work has focused mainly on convergence properties, and theoretical understanding of generalization remains limited. To fill this gap, the paper develops a unified theoretical framework for analyzing FAL generalization from the perspective of algorithmic stability. It first analyzes general FAL algorithms based on stochastic gradient descent (SGD) and derives perturbation-dependent generalization bounds, revealing that stronger adversarial attacks lead to degraded generalization. To mitigate the negative impact of adversarial perturbations, the paper introduces Moreau envelope optimization and establishes a perturbation-independent bound, demonstrating its efficacy in simultaneously improving robustness and generalization. Finally, the analysis is extended to the more practically relevant black-box setting, showing that zeroth-order optimization techniques can effectively maintain both robustness and generalization even without access to local gradient information. Extensive experiments on benchmark datasets further corroborate these theoretical findings.
[English abstract] Federated Adversarial Learning (FAL) enhances model robustness by integrating adversarial training into the federated learning framework. Despite recent advances proposing efficient FAL algorithms, existing work has mainly focused on convergence properties, with limited understanding of their generalization capabilities. To address this, we present a unified theoretical framework for analyzing FAL generalization through the lens of algorithmic stability. We first analyze general FAL algorithms based on stochastic gradient descent (SGD) and derive perturbation-dependent generalization bounds, which reveal that stronger adversarial attacks can lead to degraded generalization. To mitigate the impact of adversarial perturbations, we leverage Moreau envelope optimization and establish a perturbation-independent bound, demonstrating its efficacy in simultaneously enhancing both robustness and generalization. Finally, we extend our analysis to the practical black-box setting, demonstrating that zeroth-order optimization techniques can effectively maintain both robustness and generalization even without local gradient access.
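In the black-box setting mentioned above, gradients must be replaced by function-value queries. A standard two-point zeroth-order estimator, shown here as a generic sketch rather than the paper's exact procedure, averages directional finite differences along random Gaussian directions:

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, n_dirs=1000, rng=None):
    """Two-point zeroth-order gradient estimate of f at x:
    averages ((f(x + mu*u) - f(x - mu*u)) / (2*mu)) * u over random
    Gaussian directions u, using only function evaluations."""
    rng = rng if rng is not None else np.random.default_rng(0)
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.normal(size=x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_dirs

f = lambda x: float((x ** 2).sum())    # true gradient is 2x
x = np.array([1.0, -2.0, 0.5])
g = zo_gradient(f, x, n_dirs=5000)     # approximates 2x from queries alone
```

Since E[u u^T] = I for Gaussian directions, the estimator is (for smooth f and small mu) an approximately unbiased gradient surrogate, which is why zeroth-order SGD can retain optimization and stability behavior similar to its first-order counterpart without local gradient access.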