On how to not misuse hierarchical clustering on principal components to define clinically meaningful patient subgroups. Response to: ‘On using machine learning algorithms to define clinical meaningful patient subgroups’ by Pinal-Fernandez and Mammen

Alain Meyer; Lionel Spielmann; François Séverac

doi:10.1136/annrheumdis-2019-215868


Correspondence response


Alain Meyer¹,², Lionel Spielmann³ (ORCID: http://orcid.org/0000-0003-1057-6890), François Séverac⁴,⁵

¹ Exploration Fonctionnelle Musculaire, Service de physiologie, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
² Centre National de Référence des Maladies Auto-Immunes Systémiques Rares de l'Est et du Sud-Ouest, Service de rhumatologie, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
³ Service de Rhumatologie, Hôpitaux Civils de Colmar, Colmar, France
⁴ Service de Santé Publique, GMRC, CHU de Strasbourg, Strasbourg, France
⁵ iCUBE, UMR 7357, équipe IMAGeS, Université de Strasbourg, Strasbourg, France

Correspondence to Dr Lionel Spielmann, Service de Rhumatologie, Hospices Civils de Colmar, Colmar 68024, Alsace (Région), France; lionel.spielmann@ch-colmar.fr



We thank Pinal-Fernandez and Mammen for their interesting methodological comment on our work in which we used hierarchical clustering on principal components to define clinically meaningful subgroups of patients with anti-Ku antibodies.1 2

We fully agree with the conclusion of the authors: ‘machine learning methods may be fundamentally flawed if a cornerstone of the analysis depends upon the incorrect use of a complex biostatistical technique’.

In this regard, the example of hierarchical clustering on principal components that they provide in their comment illustrates how this statistical tool can be misused and generate false discoveries:

First, hierarchical clustering on principal components is a descriptive method suited to describing heterogeneous datasets. Prior …


Linked Articles

Correspondence
On using machine learning algorithms to define clinically meaningful patient subgroups

Iago Pinal-Fernandez, Andrew Lee Mammen
Annals of the Rheumatic Diseases 2019; 79 e128-e128 Published Online First: 21 Jun 2019. doi: 10.1136/annrheumdis-2019-215852
