Computer Engineering and Applications, 2021, Vol. 57, Issue (20): 253-262. DOI: 10.3778/j.issn.1002-8331.2006-0059

• Engineering and Applications •

Research on Event-Level Parallelization of the BESIII Experiment Software

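MA Zhentai, ZHANG Xiaomei, SUN Gongxing

  1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Publication date: 2021-10-15 Release date: 2021-10-21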

Event Level Parallelization Research of BESIII Experimental Software

MA Zhentai, ZHANG Xiaomei, SUN Gongxing   

  1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Online: 2021-10-15 Published: 2021-10-21

Abstract:

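To address the heavy memory consumption of job-level parallelism and the complex sorting required by sequence-level parallelism in the BESIII experiment software, this paper proposes an event-level parallelization scheme. Because the data of different events are mutually independent, coarse-grained locking at the granularity of event groups is adopted, striking the best balance between the performance gained from thread parallelism and the overhead of thread interaction. A first-in-first-out (FIFO) queue of event groups is created in memory, and a semaphore is set for each of the three event-group states (idle, data ready, and processing complete) so that the file input thread, the file output thread, and the event-loop processing thread can interact. A mapping table is then built to assign events to the event processing threads and to update their contexts. This mechanism keeps event data flowing in its original order and avoids complex sorting. To avoid the memory waste caused by invalid data, lazy loading is applied to data access. For tuple output under event-level parallelism, a three-layer mapping is established so that each thread only needs to fill its corresponding tree. In the end, memory consumption is reduced by 46.5% and execution performance improves significantly.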
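The interaction between the input thread, the event loop, and the output thread can be illustrated with a minimal Python sketch. It is not the BESIII implementation: the names `EventGroup`, `run_pipeline`, `group_size`, `n_slots`, and `n_workers` are hypothetical, the ring of slots stands in for the event-group FIFO queue, and a thread pool stands in for the mapping table that assigns events to processing threads. Each slot carries one semaphore per state (idle, data ready, processing complete), and because slots are filled, processed, and written in the same cyclic order, the results come out in their original order without any sorting step.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class EventGroup:
    """One slot of the FIFO ring, with one semaphore per group state."""

    def __init__(self):
        self.free = threading.Semaphore(1)   # slot is idle and may be refilled
        self.ready = threading.Semaphore(0)  # input data has been loaded
        self.done = threading.Semaphore(0)   # processing has finished
        self.events = None                   # None marks the end of input
        self.results = None


def run_pipeline(events, process, group_size=4, n_slots=3, n_workers=4):
    """Process events in parallel, group by group, preserving input order."""
    ring = [EventGroup() for _ in range(n_slots)]
    written = []

    def reader():  # stands in for the file input thread
        groups = [events[i:i + group_size]
                  for i in range(0, len(events), group_size)]
        for g, chunk in enumerate(groups + [None]):  # trailing None = end marker
            slot = ring[g % n_slots]
            slot.free.acquire()       # wait until the slot is idle
            slot.events = chunk
            slot.ready.release()      # signal: data ready

    def event_loop():  # hands the events of each ready group to the workers
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            g = 0
            while True:
                slot = ring[g % n_slots]
                slot.ready.acquire()
                if slot.events is None:      # forward the end marker
                    slot.results = None
                    slot.done.release()
                    return
                # pool.map returns results in the order of its inputs
                slot.results = list(pool.map(process, slot.events))
                slot.done.release()   # signal: processing complete
                g += 1

    def writer():  # stands in for the file output thread
        g = 0
        while True:
            slot = ring[g % n_slots]
            slot.done.acquire()
            if slot.results is None:
                return
            written.extend(slot.results)
            slot.free.release()       # signal: slot is idle again
            g += 1

    threads = [threading.Thread(target=f) for f in (reader, event_loop, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written


squares = run_pipeline(list(range(10)), lambda x: x * x)
```

Synchronization happens once per group rather than once per event, which is the coarse-grained trade-off the abstract describes: a larger `group_size` means fewer semaphore operations but more memory held per slot. The sketch omits error handling; an exception in `process` would leave the other threads waiting.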

Keywords: sorting, event-group FIFO queue, event-group allocation, thread interaction, three-layer mapping

Abstract:

Job-level parallelism in the BESIII experimental software consumes a large amount of memory, and sequence-level parallelism requires complex sorting. To solve these problems, this article proposes a parallel solution at the event level. Since the data of each event are independent, data parallelization is selected. Coarse-grained locking at the level of event groups provides the best balance between the performance benefits of thread parallelism and the overhead of thread interaction. An event-group FIFO queue is created and a semaphore is set for each event-group state, so that the file input thread, the file output thread, and the event processing threads interact effectively. A mapping table is established to allocate events to the event processing threads and to update the corresponding contexts. As a result, the data flow in their original order and the sorting work is avoided. Lazy loading is applied to reduce the memory wasted on invalid data. For tuple output under event-level parallelism, a three-layer mapping lets each thread fill only its corresponding tree. The experimental results show that the event-level parallel solution reduces memory consumption by 46.5% and improves performance significantly.
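The lazy-loading idea mentioned in the abstract can be sketched in a few lines of Python. This is an illustration under assumptions, not the BESIII data-access code: the class `LazyEvent`, the helper `make_loader`, and the data names `tracks` and `showers` are all hypothetical, and "invalid data" is read here as data that no algorithm ever requests. Each data object is read only on first access and cached afterwards, so objects that are never requested never occupy memory.

```python
class LazyEvent:
    """Event whose data objects are read only when first requested."""

    def __init__(self, loaders):
        self._loaders = loaders  # name -> zero-argument callable that reads the data
        self._cache = {}         # data objects that have actually been loaded

    def get(self, name):
        if name not in self._cache:  # first access: load now and remember
            self._cache[name] = self._loaders[name]()
        return self._cache[name]

    def loaded(self):
        """Names of the data objects currently held in memory."""
        return sorted(self._cache)


reads = []  # records every simulated file read


def make_loader(name, payload):
    """Build a loader that logs the read and returns a fixed payload."""
    def load():
        reads.append(name)
        return payload
    return load


event = LazyEvent({
    "tracks": make_loader("tracks", [1.2, 3.4]),
    "showers": make_loader("showers", [0.5]),
})
event.get("tracks")  # triggers the only read so far
event.get("tracks")  # served from the cache, no second read
```

No lock is used in the sketch because it assumes that each event is handled by a single processing thread; sharing one `LazyEvent` between threads would require synchronizing `get`.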

Key words: sorting, event-group FIFO queue, event-group allocation, thread interaction, three-layer mapping