Tutorial: Evolutionary Feature Reduction for Machine Learning

In the era of big data, high-dimensional data have become ubiquitous in domains such as social media, healthcare, and cybersecurity. Training machine learning algorithms on such data is often impractical due to the curse of dimensionality. Furthermore, high-dimensional data may contain redundant and/or irrelevant features that obscure the useful information carried by relevant features. Feature reduction can address these issues by building a smaller but more informative feature set.

Feature selection (FS) and feature construction (FC) are the two main approaches to feature reduction. FS aims to select a small subset of the original (relevant) features. FC aims to create a small set of new high-level (informative) features from the original feature set. Although both approaches are essential pre-processing steps, they are challenging because of their large search spaces. The search spaces of FS and FC are also rough: a slight change in the candidate feature set can lead to a substantial change in learning performance. A variety of heuristic search techniques have been applied to feature reduction, but most existing methods still suffer from stagnation in local optima and/or high computational cost. Owing to their powerful search abilities and flexible solution encoding/representation schemes, evolutionary computation (EC) techniques have attracted increasing interest for feature reduction. EC techniques have now become an essential means of handling high-dimensionality issues.
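
To make the distinction between FS and FC concrete, the following is a minimal sketch rather than part of the tutorial material. The toy data, the function names (`select_features`, `construct_feature`, `count_subsets`), and the constructed expression are all assumptions chosen for illustration. It treats FS as applying a binary mask to the original columns, FC as computing a new feature from the original ones, and shows how quickly the number of candidate subsets grows.

```python
# Toy dataset: 4 samples, 3 original features (values are made up for illustration).
X = [
    [1.0, 2.0, 0.5],
    [0.0, 1.0, 1.5],
    [3.0, 0.0, 2.5],
    [2.0, 2.0, 0.0],
]

def select_features(X, mask):
    """Feature selection: keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]

def construct_feature(X, fn):
    """Feature construction: build one new high-level feature from each row."""
    return [[fn(row)] for row in X]

def count_subsets(n_features):
    """Number of non-empty feature subsets an FS method could choose from."""
    return 2 ** n_features - 1

# FS: keep original features 0 and 2, drop feature 1.
selected = select_features(X, mask=[1, 0, 1])

# FC: combine all three original features into a single new one
# (the expression f0 * f1 + f2 is an arbitrary, hypothetical choice).
constructed = construct_feature(X, lambda r: r[0] * r[1] + r[2])

print(selected)
print(constructed)
print(count_subsets(3), count_subsets(100))
```

With only 100 original features there are already 2^100 - 1 non-empty subsets, which is why exhaustive search is not an option and heuristic search is used instead.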

This tutorial first introduces the main concepts and the general framework of feature reduction. It then shows how EC techniques, such as particle swarm optimisation, genetic programming, ant colony optimisation, and evolutionary multi-objective optimisation, can address challenges in feature reduction. The effectiveness of EC-based feature reduction is illustrated through several applications, including bioinformatics, image analysis and pattern classification, and cybersecurity. The tutorial concludes with existing challenges for future research.

Potential audience: This is an introductory tutorial that provides the main concepts and the overall system of evolutionary feature reduction. Attendees will also learn the pros and cons of applying different evolutionary algorithms to feature reduction, and will be introduced to state-of-the-art evolutionary feature reduction algorithms. Some potential future research directions are also explained in the tutorial.

The outline of the proposed tutorial is as follows:
  1. Introduction to feature reduction (20 minutes)
    • What is feature reduction?
    • Why feature reduction?
    • Feature selection and feature construction
    • General framework of feature reduction
    • Filter/Wrapper/Embedded feature reduction
  2. Feature selection (25 minutes)
    • Graph-based representation
    • Tree-based representation
    • Vector-based representation
  3. Feature construction (15 minutes)
    • Why GP for feature construction?
    • Single-tree representations
    • Multi-tree representations
  4. Hybridisation of feature selection and feature construction (5 minutes)
  5. Real-world applications of feature reduction (5 minutes)
  6. Existing issues and challenges (10 minutes)
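
As an illustration of the vector-based representation listed under feature selection, the sketch below encodes a candidate feature subset as a binary vector and improves it with a simple (1+1)-style evolutionary loop. It is only a sketch under stated assumptions, not an algorithm presented in the tutorial: the `relevance` scores, the size `penalty`, and the function names are hypothetical, and the toy fitness stands in for a real filter or wrapper evaluation.

```python
import random

def fitness(mask, relevance, penalty=0.1):
    """Toy score: total relevance of kept features minus a subset-size penalty."""
    kept = sum(mask)
    if kept == 0:
        return float("-inf")  # an empty subset is not a valid solution
    return sum(r for r, bit in zip(relevance, mask) if bit) - penalty * kept

def mutate(mask, rate, rng):
    """Bit-flip mutation: each bit flips independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]

def evolve(relevance, generations=200, seed=0):
    """(1+1)-style evolutionary search over binary feature masks."""
    rng = random.Random(seed)
    n = len(relevance)
    best = [rng.randint(0, 1) for _ in range(n)]
    best_fit = fitness(best, relevance)
    history = [best_fit]
    for _ in range(generations):
        child = mutate(best, 1.0 / n, rng)
        child_fit = fitness(child, relevance)
        if child_fit >= best_fit:  # keep the child only if it is no worse
            best, best_fit = child, child_fit
        history.append(best_fit)
    return best, best_fit, history

# Hypothetical per-feature relevance scores (made up for illustration).
relevance = [0.9, 0.05, 0.7, 0.0, 0.4, 0.02]
best_mask, best_fit, history = evolve(relevance)
print(best_mask, best_fit)
```

Because a child replaces the current mask only when it is no worse, the best score never decreases over the generations. Population-based methods such as particle swarm optimisation can use the same vector encoding but maintain many candidate masks at once.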

  • Dr Bach Nguyen, School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
  • Prof Bing Xue, School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
  • Prof Mengjie Zhang, School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Biography of the organisers:

    Bach Nguyen is currently a Postdoctoral Research Fellow in the School of Engineering and Computer Science at Victoria University of Wellington (VUW). He has published over 20 papers in fully refereed international journals and conferences. His research focuses mainly on evolutionary computation, machine learning, classification, feature selection, transfer learning, and multi-objective optimisation.

    Dr Nguyen currently chairs the IEEE Task Force on Evolutionary Feature Selection and Construction and the Young Professionals of the IEEE New Zealand Central Section. He is also a Vice-Chair of the Data Mining and Big Data Analytics Technical Committee, IEEE Computational Intelligence Society.

    Dr Nguyen co-chaired the IEEE Symposium on Computational Intelligence in Data Mining at the IEEE Symposium Series on Computational Intelligence (SSCI) 2021. He organised the Special Session on Evolutionary Feature Selection, Construction, and Extraction and delivered a Tutorial on Evolutionary Feature Reduction at the IEEE Congress on Evolutionary Computation (IEEE CEC) 2021. He also delivered a Tutorial on Evolutionary Feature Reduction at the IEEE International Conference on Data Mining (ICDM) Workshop 2021. He has served as a program committee member for over 10 international conferences, including AAAI, IJCAI, IEEE CEC, GECCO, and IEEE SSCI, and as a reviewer for over 10 international journals, including IEEE Transactions on Evolutionary Computation and IEEE Transactions on Cybernetics.

    Bing Xue is currently a Professor and Program Director of Science in the School of Engineering and Computer Science at VUW. She has published over 200 papers in fully refereed international journals and conferences. Her research focuses mainly on evolutionary computation, machine learning, classification, symbolic regression, feature selection, evolving deep neural networks, image analysis, transfer learning, and multi-objective machine learning.

    Prof Xue is currently the Vice-Chair of the Evolutionary Computation Technical Committee, Vice-Chair of the IEEE Task Force on Transfer Learning & Transfer Optimization, Vice-Chair (and founding Chair) of the IEEE Task Force on Evolutionary Feature Selection and Construction, and Vice-Chair of the IEEE CIS Task Force on Evolutionary Deep Learning and Applications.

    Prof Xue organised the special session on Evolutionary Feature Selection and Construction at the IEEE Congress on Evolutionary Computation (CEC) in 2015, 2016, 2017, 2018, 2019, and 2020. She has chaired a number of international conference events, including serving as Chair of Women@GECCO 2018 and as a co-Chair of the Evolutionary Machine Learning Track for GECCO 2019 and 2020. She is the Lead Chair of the IEEE Symposium on Computational Intelligence in Feature Analysis, Selection, and Learning in Image and Pattern Recognition (FASLIP) at SSCI 2016, 2017, 2018, 2019, 2020, and 2021, a Program Co-Chair of the 7th International Conference on Soft Computing and Pattern Recognition (SoCPaR2015), a Program Chair of the 31st Australasian Joint Conference on Artificial Intelligence (AI 2018), and Finance Chair for the 2019 IEEE Congress on Evolutionary Computation.

    She is an Associate Editor or Member of the Editorial Board for seven international journals, including IEEE Transactions on Evolutionary Computation, IEEE Computational Intelligence Magazine, and ACM Transactions on Evolutionary Learning and Optimisation.

    Mengjie Zhang is a Fellow of the Royal Society of New Zealand, a Fellow of IEEE, a Panel Member of the Marsden Fund (New Zealand Government Funding), and currently Professor of Computer Science at Victoria University of Wellington, where he heads the interdisciplinary Evolutionary Computation Research Group. He is a member of the University Academic Board, a member of the University Postgraduate Scholarships Committee, Associate Dean (Research and Innovation) in the Faculty of Engineering, and Chair of the Research Committee of the Faculty of Engineering and School of Engineering and Computer Science.

    His research is mainly focused on evolutionary computation, particularly genetic programming, particle swarm optimisation and learning classifier systems with application areas of feature selection/construction and dimensionality reduction, computer vision and image processing, evolutionary deep learning and transfer learning, job shop scheduling, multi-objective optimisation, and clustering and classification with unbalanced and missing data. He is also interested in data mining, machine learning, and web information extraction. Prof Zhang has published over 500 research papers in refereed international journals and conferences in these areas.

    He has been serving as an Associate Editor or Editorial Board Member for over 10 international journals, including IEEE Transactions on Evolutionary Computation, IEEE Transactions on Cybernetics, the Evolutionary Computation Journal (MIT Press), ACM Transactions on Evolutionary Learning and Optimisation, Genetic Programming and Evolvable Machines (Springer), IEEE Transactions on Emergent Topics in Computational Intelligence, Applied Soft Computing, and Engineering Applications of Artificial Intelligence, and as a reviewer for over 30 international journals. He has been a major chair for 8 international conferences. He has also been serving as a steering committee member and a program committee member for over 80 international conferences, including all major conferences in evolutionary computation. Since 2007, he has been listed among the world's top five genetic programming researchers by the GP bibliography (http://www.cs.bham.ac.uk/~wbl/biblio/gp-html/index.html).

    He was the Tutorial Chair for GECCO 2014, an AIS-BIO Track Chair for GECCO 2016, an EML Track Chair for GECCO 2017, and a GP Track Chair for GECCO 2020. Since 2012, he has been co-chairing several parts of the IEEE CEC, SSCI, and EvoIASP/EvoApplications conferences (he has been involved in major EC conferences such as GECCO, CEC, EvoStar, and SEAL). Since 2014, he has been co-organising and co-chairing the special session on evolutionary feature selection and construction at IEEE CEC and SEAL. He has also delivered keynote/plenary talks at IEEE CEC 2018, IEEE ICAVSS 2018, DOCSA 2019, IES 2017, and the Chinese National Conference on AI in Law 2017.

    Prof Zhang was the Chair of the IEEE CIS Intelligent Systems Applications Technical Committee, the IEEE CIS Emergent Technologies Technical Committee, and the IEEE CIS Evolutionary Computation Technical Committee; a Vice-Chair of the IEEE CIS Task Force on Evolutionary Feature Selection and Construction, the IEEE CIS Task Force on Evolutionary Computer Vision and Image Processing, and the IEEE CIS Task Force on Evolutionary Deep Learning and Applications; and the founding chair of the IEEE Computational Intelligence Chapter in New Zealand.