Computer Engineering and Applications, 2019, Vol. 55, Issue (8): 175-181.

### Parallel Coordinate Improvement Method for Category Data Analysis

CHEN Hongqian1,2, CHENG Zhongjuan1,2, YANG Qianyu1,2, LI Hui3

1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
2. Beijing Key Laboratory of Big Data Technology for Food Safety, School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
3. College of Management, Beijing Union University, Beijing 100101, China
• Online: 2019-04-15 Published: 2019-04-15

### 一种针对类别数据分析的平行坐标改进方法

1. 北京工商大学 计算机与信息工程学院，北京 100048
2. 北京工商大学 计算机与信息工程学院 食品安全大数据技术北京市重点实验室，北京 100048
3. 北京联合大学 管理学院，北京 100101

Abstract: Category data overlap heavily when drawn in a traditional parallel coordinate system, because records that share a category value map to the same point on an axis. To address this problem, a parallel coordinate method based on category statistics and accumulated offset mapping of the data is proposed. The method first counts the frequency of each category value in the multidimensional data, uses a histogram to show the distribution of the detection results and the number of records, and combines the histogram with parallel coordinates to form improved parallel coordinates. It then introduces an accumulated offset algorithm: data that would otherwise map to a single point are distributed evenly over a region of the coordinate axis, and the extent of that region is determined by the number of data records. Finally, a visual analysis system is designed and implemented in which the improved parallel coordinates support data set filtering, cross analysis, analysis between categories, and analysis between dimensions. The system compares pairs of dimensions through linked views and a chord diagram, and shows the number of records for each dimension in the data set through a word cloud. Experimental results on case data sets show that the proposed method can simultaneously display comparison, sorting, and analysis of the selected data across different dimensions in parallel coordinates, and can visually display the associations between categories.
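
The abstract does not state the accumulated offset formula, so the following is only a minimal sketch of the general idea under stated assumptions: each category receives a band on a normalized axis whose height is proportional to its share of the records, and the records in that band are spaced evenly. The function names `category_bands` and `offset_positions`, the alphabetical band order, and the midpoint placement are illustrative choices, not the authors' algorithm.

```python
from collections import Counter


def category_bands(values, axis_length=1.0):
    """Count each category and give it a band on the axis.

    Assumption: band height is proportional to the category's record
    count, and bands are stacked in sorted category order.
    """
    counts = Counter(values)
    total = len(values)
    bands = {}
    start = 0.0
    for category in sorted(counts):
        height = axis_length * counts[category] / total
        bands[category] = (start, start + height)
        start += height  # accumulate the offset for the next category
    return counts, bands


def offset_positions(values, axis_length=1.0):
    """Map every record to its own axis position instead of a shared point.

    Assumption: the k-th record of a category is placed at the midpoint
    of the k-th of n equal slots inside that category's band.
    """
    counts, bands = category_bands(values, axis_length)
    seen = Counter()
    positions = []
    for value in values:
        low, high = bands[value]
        k = seen[value]
        seen[value] += 1
        positions.append(low + (k + 0.5) * (high - low) / counts[value])
    return positions


if __name__ == "__main__":
    records = ["pass", "fail", "pass", "pass"]
    counts, bands = category_bands(records)
    print(dict(counts))
    print(bands)
    print(offset_positions(records))
```

The per-category counts returned by `category_bands` are the same numbers a histogram beside the axis would show, which is why the sketch computes counting and offsetting together.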