PemSet-FIM: A Two-Stage Adaptive Framework with Local Differential Privacy for Frequent Itemset Mining over Set-Valued Data
Affiliations:

1. College of Electronics & Information Engineering, Liaoning University of Technology; 2. College of Science, Liaoning University of Technology

CLC number:

TP309

Funding:

National Natural Science Foundation of China (62203201); 2024 Basic Scientific Research Funds for Undergraduate Universities of Liaoning Province (LJZZ212410154025); Joint Program of the Department of Science and Technology of Liaoning Province (2025JH2/101800231)

    Abstract:

    Existing local differential privacy (LDP) methods for frequent itemset mining suffer from low candidate quality, severe noise accumulation, and insufficient frequency-estimation accuracy on high-dimensional sparse data. To address these problems, this paper proposes a two-stage adaptive perturbation algorithm for frequent itemset mining under LDP and uses a domain-adaptive perturbation method to study high-accuracy mining of frequent itemsets in high-dimensional sparse settings. The results show that dynamically selecting the perturbation mechanism in the candidate-generation stage reduces noise in the initial candidates and improves the quality of prefix filtering, and that adaptively selecting the mechanism according to subdomain size in the frequency-estimation stage suppresses local variance and improves the overall frequency reconstruction. In experiments on four real-world datasets, the algorithm outperforms existing methods in frequency-estimation accuracy, frequent-item discovery, and running time, and it improves Precision and Recall by up to 15% each on highly sparse datasets. The proposed two-stage adaptive perturbation algorithm can therefore substantially improve frequent itemset mining on high-dimensional sparse data while preserving local differential privacy.
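    The abstract does not name the perturbation mechanisms or the selection rule that PemSet-FIM uses. As a rough illustration of what size-based adaptive selection can look like, the sketch below assumes two standard LDP frequency oracles, generalized randomized response (GRR) and optimized unary encoding (OUE), and picks whichever has the lower approximate estimation variance for a given domain size d and privacy budget eps. All function names are hypothetical, and this is not the authors' implementation.

```python
import math
import random
from collections import Counter

# Assumption: GRR and OUE stand in for the (unnamed) mechanisms in the paper.

def grr_variance(d, eps, n):
    """Approximate variance of a GRR count estimate over a domain of size d."""
    e = math.exp(eps)
    return n * (d - 2 + e) / (e - 1) ** 2

def oue_variance(eps, n):
    """Approximate variance of an OUE count estimate (independent of d)."""
    e = math.exp(eps)
    return n * 4 * e / (e - 1) ** 2

def choose_mechanism(d, eps):
    """Pick the lower-variance oracle; GRR wins exactly when d < 3*e^eps + 2."""
    return "GRR" if d < 3 * math.exp(eps) + 2 else "OUE"

def grr_perturb(value, domain, eps, rng):
    """Report the true value with probability p, otherwise a uniform other value."""
    d = len(domain)
    p = math.exp(eps) / (math.exp(eps) + d - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])

def grr_estimate(reports, domain, eps):
    """Unbiased frequency estimates from GRR reports."""
    n, d = len(reports), len(domain)
    e = math.exp(eps)
    p, q = e / (e + d - 1), 1 / (e + d - 1)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}
```

    Under these assumptions, a small pruned subdomain (for example, a handful of surviving candidates) would be routed to GRR, while a large initial item domain would be routed to OUE. The actual thresholds and mechanisms in PemSet-FIM may differ.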

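Cite this article

Jiao Jingzhe (焦敬哲), Li Xiaohui (李晓会), Zhang Xing (张兴), et al. PemSet-FIM: A Two-Stage Adaptive Framework with Local Differential Privacy for Frequent Itemset Mining over Set-Valued Data[J]. Science Technology and Engineering (科学技术与工程), , ():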

History
  • Received: 2025-12-02
  • Last revised: 2026-04-27
  • Accepted: 2026-05-09