Main Article Content

Ahmad Hassanat
Mutah University
Jordan
Biography
Vol. 7 No. 3 (2018), Articles, pages 15-30
DOI: https://doi.org/10.14201/ADCAIJ2018731530
Accepted: Sep 13, 2018
Copyright

Abstract

Finding the diameter of a dataset in multidimensional Euclidean space is a well-established problem, with well-known algorithms. However, most of the algorithms found in the literature do not scale well with large values of data dimension, so the time complexity grows exponentially in most cases, which makes these algorithms impractical. Therefore, we implemented 4 simple greedy algorithms to be used for approximating the diameter of a multidimensional dataset; these are based on minimum/maximum l2 norms, hill climbing search, Tabu search and Beam search approaches, respectively. The time complexity of the implemented algorithms is near-linear, as they scale near-linearly with data size and its dimensions. The results of the experiments (conducted on different machine learning data sets) prove the efficiency of the implemented algorithms and can therefore be recommended for finding the diameter to be used by different machine learning applications when needed.

Downloads

Download data is not yet available.

Article Details

References

Agarwal, P. K., Har-Peled, S. & Varadarajan, K. R., 2005. Geometric approximation via coresets. Combinatorial and computational geometry, Volume 52, pp. 1-30.

Agarwal, P. K., Matoušek, J. & Suri, S., 1992. Farthest neighbors, maximum spanning trees and related problems in higher dimensions. Computational Geometry, 1(4), pp. 189-201. - https://doi.org/10.1016/0925-7721(92)90001-9

Aggarwal, C. C., 2016. Outlier Analysis. Second Edition ed. New York: Springer.

Bespamyatnikh, S., 1998. An efficient algorithm for the three-dimensional diameter problem. s.l., s.n., p. 137-146.

Broder, A. Z., Glassman, S. C., Manasse, M. S. & Zweig, G., 1997. Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8), pp. 1157-1166. - https://doi.org/10.1016/S0169-7552(97)00031-7

Chan, T. M., 2002. Approximating the diameter, width, smallest enclosing cylinder, and minimum-width annulus. International journal of computational geometry and applications, 12(1), pp. 67-85. - https://doi.org/10.1142/S0218195902000748

Chazelle, B., Edelsbrunner, H., Guibas, L. & Sharir, M., 1992. Diameter, width, closest line pair, and parametric searching. s.l., ACM, pp. 120-129. - https://doi.org/10.1145/142675.142702

Clarkson, K. L. & Shor, P. W., 1989. Applications of random sampling in computational geometry, II. Discrete & Computational Geometry, 4(5), pp. 387-421. - https://doi.org/10.1007/BF02187740

Eg?eciog?lu, Ö. & Kalantari, B., 1989. Approximating the diameter of a set of points in the Euclidean space. Information Processing Letters, 32(4), pp. 205-211. - https://doi.org/10.1016/0020-0190(89)90045-8

Figueroa, K., Navarro, G. & Chavez, E., 2007. Metric Spaces Library. [Online]

Available at: http://www.sisap.org/Metric_Space_Library.html

Finocchiaro, D. V. & Pellegrini, M., 2002. On computing the diameter of a point set in high dimensional Euclidean space. Theoretical Computer Science, Volume 287, p. 501-514. - https://doi.org/10.1016/S0304-3975(01)00258-4

Gudivada, V. N. & Raghavan, V. V., 1995. Content based image retrieval systems. Computer, 28(9), pp. 18-22. - https://doi.org/10.1109/2.410145

Har-Peled, S., 2001. A practical approach for computing the diameter of a point set. s.l., ACM, pp. 177-186. - https://doi.org/10.1145/378583.378662

Hassanat, A. B. & Tarawneh, A. S., 2016. Fusion of Color and Statistic Features for Enhancing Content-Based Image Retrieval Systems. Journal of Theoretical & Applied Information Technology, 88(3).

Imanparast, M., Hashemi, S. N. & Mohades, A., 2016. An efficient approximation for point set diameter in fixed dimensions. preprint arXiv:1610.08543 [cs.CG].

Le Bourdais, F., 2015. The Farthest Neighbors Algorithm. [Online]

Available at: https://flothesof.github.io/farthest-neighbors.html

[Accessed 1 Feb 2018].

Lichman, M., 2013. [Online]

Available at: http://archive.ics.uci.edu/ml

Matoušek, J. & Schwarzkopf, O., 1993. A deterministic algorithm for the three-dimensional diameter problem. s.l., ACM, pp. 478-484. - https://doi.org/10.1145/167088.167217

Pagh, R., Silvestri, F., Sivertsen, J. & Skala, M., 2015. Approximate furthest neighbor in high dimensions. s.l., Springer, pp. 3-14. - https://doi.org/10.1007/978-3-319-25087-8_1

Pagh, R., Silvestri, F., Sivertsen, J. & Skala, M., 2017. Approximate furthest neighbor with application to annulus query. Information Systems, Volume 64, pp. 152-162. - https://doi.org/10.1016/j.is.2016.07.006

Preparata, F. & Shamos, M., 1985. Computational Geometry: an Introduction. New York: Springer. - https://doi.org/10.1007/978-1-4612-1098-6

Ramos, E., 1997. Construction of 1-d lower envelopes and applications. s.l., s.n., p. 57-66. - https://doi.org/10.1145/262839.262864

Ramos, E. A., 2000. Deterministic algorithms for 3-D diameter and some 2-D lower envelopes. s.l., ACM, pp. 290-299. - https://doi.org/10.1145/336154.336215

Supowit, K. J., 1990. New techniques for some dynamic closest-point and farthest-point problems. s.l., Society for Industrial and Applied Mathematics, pp. 84-90.

Williams, R., 2018. On the Difference Between Closest, Furthest, and Orthogonal Pairs: Nearly-Linear vs Barely-Subquadratic Complexity. s.l., Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1207-1215. - https://doi.org/10.1137/1.9781611975031.78

Yao, A. C. C., 1982. On constructing minimum spanning trees in k-dimensional spaces and related problems. SIAM Journal on Computing, 11(4), pp. 721-736. - https://doi.org/10.1137/0211059