ABSTRACT
Cluster analysis and recognition on big data currently attract widespread attention. This paper focuses on the classification and identification of moving targets at sea. We use a large amount of raw data on moving targets, together with big data analysis, data visualization, cluster analysis, and prediction methods, to identify and classify these targets. We also carry out similarity measurement, fleet formation identification, and analysis of activity patterns.

First, we classify the targets with 7 days or more of navigation data. We explore and analyze the data and remove redundant and “dirty” records. We then define seven feature values that describe each trajectory for the cluster analysis algorithm. Using a similarity measure on these feature values, we take the sum of squared errors (SSE) as the objective function measuring clustering quality for two different clustering results and select the result with the smaller SSE. Combining this with ship behavior characteristics, all moving targets are classified into five categories (fishing boats, sailboats, cargo ships, cruise ships, warships), and the class of each of the given specific moving targets is analyzed.

Second, we analyze the targets with fewer than 7 days of navigation data. The data are first fitted by simple (one-variable) linear regression to obtain the corresponding fitted linear relationship; the classification and recognition model built from the data of 7 days or more is linear. The regression-processed data then undergo feature value analysis and similarity measurement to obtain a cluster center for each single target. Using the five categories obtained from the data of 7 days or more, the correlation coefficient method is applied between the cluster center of the single target and the centers of the five clusters, and the linear regression prediction method selects the class with the largest correlation coefficient.

Next, building on the preceding analysis, we study the characteristics of ship formations, identify probable formations, report the codes of the ships in each formation together with its discovery time and navigation trajectory, and analyze its activity patterns. Ships are first screened by the daily sequence recording whether each ship went to sea from April 1 to April 21: ships with identical sequences form a candidate combination, and we then check whether the ships in a combination belong to the same category. For the combinations that satisfy these conditions, we measure the similarity of the ships' speed and acceleration to obtain the final formations.

Finally, we use the squared-error method to check the accuracy of the different results: whether they are consistent with the actual situation and, where they are not, what causes the discrepancy. Based on the results, we evaluate the model and discuss its advantages and disadvantages for this problem, as a reference for the reader.
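
The initial formation screening described above (grouping ships whose daily at-sea sequences are identical, then keeping only ships of the same category) can be illustrated with a minimal sketch. The function name `screen_formations`, the dictionary layout, and the rule that ships which never sail are skipped are assumptions made for illustration, not details taken from the paper; the later speed and acceleration similarity check is not included.

```python
from collections import defaultdict


def screen_formations(sailed, category):
    """Return candidate formations as sorted lists of ship ids.

    sailed:   dict ship_id -> sequence of 0/1 flags, one per day
              (hypothetical layout; 1 means the ship went to sea that day)
    category: dict ship_id -> class label from the earlier clustering step
    """
    groups = defaultdict(list)
    for ship_id, flags in sailed.items():
        # Assumption: a ship that never goes to sea cannot be in a formation.
        if not any(flags):
            continue
        # Keying on (sequence, category) keeps only ships that share both,
        # which is equivalent to screening by sequence and then by category.
        groups[(tuple(flags), category[ship_id])].append(ship_id)
    # A formation needs at least two ships.
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Toy data with 3 days instead of the 21 days (April 1 to April 21).
sailed = {
    "A": [1, 0, 1],
    "B": [1, 0, 1],
    "C": [1, 0, 1],
    "D": [0, 1, 1],
}
category = {"A": "warship", "B": "warship", "C": "cargo ship", "D": "warship"}
print(screen_formations(sailed, category))  # [['A', 'B']]
```

Ship C shares the sequence of A and B but belongs to a different category, so it is dropped; D has a different sequence. The surviving combinations would then be passed to the speed and acceleration similarity measurement.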