Welcome to the HandPD dataset

This page presents the HandPD dataset, which comprises handwritten exams from two groups of individuals: (i) the Healthy Group and (ii) the Patient Group, the latter composed of individuals affected by Parkinson's Disease (PD). The handwritten exams were collected at Botucatu Medical School, São Paulo State University, Brazil. The main task consists, essentially, of filling out a form composed of four spirals and four meanders, which are then cropped from the form and stored in "jpg" image format. Some examples extracted from the dataset follow:

Figure 1. Some examples of spirals extracted from the HandPD dataset: (a) 58-year-old male and (b) 28-year-old female individuals of the control group, and (c) 56-year-old male and (d) 65-year-old female individuals of the patient group.
Figure 2. Some examples of meanders extracted from the HandPD dataset: (a) 58-year-old male and (b) 28-year-old female individuals of the control group, and (c) 56-year-old male and (d) 65-year-old female individuals of the patient group.

The dataset contains 92 individuals, divided into 18 healthy people (Healthy Group) and 74 patients (Patient Group). A brief description of each group follows:

Therefore, the entire dataset is composed of 736 images, 368 for each drawing type (spirals and meanders). Within each drawing type, the images are labeled in two groups: the healthy group, containing 72 images, and the patient group, containing 296 images. The images are named ID_EXAM-ID_IMAGE.jpg, in which ID_EXAM stands for the exam's identifier and ID_IMAGE denotes the number of the image within that exam. Notice we may have more than one ID_EXAM for the same individual, since some of them filled out four images per form. The Spiral_HandPD dataset (images) can be downloaded here and the Meander_HandPD images can be downloaded here.
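The counts and the file-naming scheme above can be checked with a short sketch. It assumes every individual contributed exactly four spirals and four meanders, which is consistent with the totals reported here. The function names and the sample file name "0012-3.jpg" are hypothetical and used only for illustration. Identifiers are kept as strings, since the page does not state whether they are purely numeric.

```python
import re
from typing import Dict, NamedTuple

HEALTHY = 18            # individuals in the Healthy Group
PATIENTS = 74           # individuals in the Patient Group
IMAGES_PER_DRAWING = 4  # four spirals and four meanders per individual (assumed)


def expected_counts() -> Dict[str, int]:
    """Recompute the image counts reported for HandPD."""
    healthy = HEALTHY * IMAGES_PER_DRAWING
    patients = PATIENTS * IMAGES_PER_DRAWING
    per_drawing = healthy + patients
    return {
        "healthy_per_drawing": healthy,
        "patients_per_drawing": patients,
        "per_drawing": per_drawing,
        "total": 2 * per_drawing,  # spirals plus meanders
    }


class ImageName(NamedTuple):
    exam_id: str
    image_id: str


# ID_EXAM-ID_IMAGE.jpg: two hyphen-separated identifiers and a jpg extension.
_NAME = re.compile(r"^(?P<exam>[^-]+)-(?P<image>[^-.]+)\.jpg$", re.IGNORECASE)


def parse_image_name(filename: str) -> ImageName:
    """Split a HandPD image file name into its exam and image identifiers."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"not an ID_EXAM-ID_IMAGE.jpg name: {filename!r}")
    return ImageName(match["exam"], match["image"])


if __name__ == "__main__":
    print(expected_counts())
    print(parse_image_name("0012-3.jpg"))  # hypothetical file name
```

Grouping images by exam_id recovers the images of one exam. Mapping exams to individuals requires the csv files, because one individual may have more than one ID_EXAM.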

We have also made available the dataset with the features extracted according to the aforementioned paper in two formats. In text format, we provide the Spiral-LibOPF text file and the Meander-LibOPF text file. In binary format, we provide the Spiral-LibOPF binary file and the Meander-LibOPF binary file. The LibOPF project aims at developing pattern recognition classifiers based on optimum-path forest. The Optimum-Path Forest (OPF) classifier is a framework for fast and simple implementation of graph-based classifiers. More details can be found here.
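As a starting point for reading the text-format feature files, here is a minimal sketch. It assumes the LibOPF text layout is a header line with the number of samples, the number of labels, and the number of features, followed by one line per sample holding an identifier, a label, and the feature values. This page does not describe the layout, so verify it against the LibOPF documentation before relying on it. The function name read_opf_text is hypothetical.

```python
from typing import Iterable, List, Tuple


def read_opf_text(lines: Iterable[str]) -> Tuple[List[int], List[int], List[List[float]]]:
    """Parse an OPF-style text dataset (layout assumed, see the note above).

    Returns the sample identifiers, the labels, and the feature vectors.
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        raise ValueError("empty input")

    # Header: <number of samples> <number of labels> <number of features>
    n_samples, _n_labels, n_feats = (int(value) for value in rows[0][:3])

    ids: List[int] = []
    labels: List[int] = []
    features: List[List[float]] = []
    for row in rows[1:]:
        if len(row) != 2 + n_feats:
            raise ValueError(f"expected {2 + n_feats} fields, got {len(row)}")
        ids.append(int(row[0]))
        labels.append(int(row[1]))
        features.append([float(value) for value in row[2:]])

    if len(ids) != n_samples:
        raise ValueError(f"header declares {n_samples} samples, found {len(ids)}")
    return ids, labels, features


if __name__ == "__main__":
    # Tiny made-up example: 2 samples, 2 labels, 3 features.
    sample = ["2 2 3", "0 1 0.5 1.0 2.0", "1 2 0.1 0.2 0.3"]
    print(read_opf_text(sample))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, for example `with open(path) as handle: read_opf_text(handle)`.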

In order to provide full information about the dataset, we made available a Spiral-csv file and a Meander-csv file with detailed information about the individuals (except their names and personal information). Each file is organized in columns, as follows:

If you use the HandPD dataset, please cite the following paper: CMPB-2016

Also, you can download the paper here. The source code to extract the features proposed in the aforementioned paper can be downloaded here.

The NewHandPD dataset

We designed an improved version of the HandPD dataset that is composed of 66 individuals divided into two groups: Healthy and Patient. The first group comprises 35 individuals and the second contains 31 individuals. Each individual was asked to perform 12 exams: 4 spirals, 4 meanders, 2 circular movements (one circle in the air and another on the paper), and left-handed and right-handed diadochokinesis. During the exam, we also recorded the handwriting dynamics by means of a smart pen (BiSP), which means we have images of the spirals (4), the meanders (4), and the circle on the paper (1), as well as signals for all 12 exams. In short, for each individual we have 9 images and 12 signals, so the reader can combine both sources of information to obtain a more descriptive representation of each individual. Also, NewHandPD is more balanced than the original HandPD dataset. A brief description of each group follows:

Therefore, the NewHandPD dataset is composed of 264 images (104 female and 160 male), 420 signals from healthy individuals and 372 signals from patients. The images are named to provide easy access to each exam. For instance, in sp1-H1.jpg, sp1 is the number of the spiral image and H1 denotes healthy individual number 1. Similarly, sp1-P1.jpg stands for spiral image number 1 from patient number 1. Regarding the signals, file circA-H1.txt holds the circle A exam of healthy individual number 1, and file sigMea1-H1.txt holds the meander exam (first image) of the same individual.
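The naming scheme and the signal counts above can be sketched as follows. The parser assumes every file follows the pattern seen in the four examples on this page (an exam tag, a hyphen, H or P followed by the individual's number, and a jpg or txt extension). Other files in the archives may deviate from it. The names parse_new_handpd_name and NewHandPDFile are hypothetical.

```python
import re
from typing import NamedTuple

HEALTHY = 35          # individuals in the Healthy group
PATIENTS = 31         # individuals in the Patient group
SIGNALS_PER_PERSON = 12  # one signal per exam


class NewHandPDFile(NamedTuple):
    exam: str      # exam tag as written in the file name, e.g. "sp1" or "circA"
    group: str     # "healthy" or "patient"
    subject: int   # number of the individual within the group
    kind: str      # "image" for .jpg, "signal" for .txt


# <exam tag>-<H|P><number>.<jpg|txt>, inferred from the examples on this page.
_NAME = re.compile(r"^(?P<exam>[A-Za-z0-9]+)-(?P<group>[HP])(?P<subject>\d+)\.(?P<ext>jpg|txt)$")


def parse_new_handpd_name(filename: str) -> NewHandPDFile:
    """Split a NewHandPD file name into exam tag, group, individual, and kind."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized NewHandPD file name: {filename!r}")
    group = "healthy" if match["group"] == "H" else "patient"
    kind = "image" if match["ext"] == "jpg" else "signal"
    return NewHandPDFile(match["exam"], group, int(match["subject"]), kind)


if __name__ == "__main__":
    for name in ("sp1-H1.jpg", "sp1-P1.jpg", "circA-H1.txt", "sigMea1-H1.txt"):
        print(parse_new_handpd_name(name))
    # Signal totals: 12 signals for each individual in a group.
    print(HEALTHY * SIGNALS_PER_PERSON, PATIENTS * SIGNALS_PER_PERSON)
```

The exam tag is kept verbatim rather than split into a drawing type and an index, because the page documents only a few tags and any finer split would be a guess.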

The files concerning healthy individuals can be downloaded below:

The files concerning patients can be downloaded below:

Also, the datasets are available in csv format, following the same layout used in the original HandPD: NewSpiral-csv and NewMeander-csv. We also made available the datasets with the extracted features in txt format: NewSpiral-LibOPF-txt and NewMeander-LibOPF-txt. Their binary versions are NewSpiral-LibOPF-binary and NewMeander-LibOPF-binary, respectively.

Note that the feature-based datasets are available in the LibOPF format.

If you use the NewHandPD dataset, please cite the following paper: SIBGRAPI-2016

Also, you can download the paper here.

For more information about the HandPD and NewHandPD datasets, please feel free to contact Clayton (claytontey AT gmail DOT com) or João Paulo (papa.joaopaulo AT gmail DOT com).


Last updated on April 26, 2017.