Author:
Răzvan V. Marinescu
Supervisors:
Prof. Daniel C. Alexander
Dr. Sebastian Crutch
Dr. Neil P. Oxtoby
April 8, 2019
I, Răzvan Valentin Marinescu, confirm that the work presented in this thesis is my
own. Where information has been derived from other sources, I confirm that this has
been indicated in the thesis.
Abstract
In order to find effective treatments for Alzheimer’s disease (AD), a devastating neurode-
generative disease affecting millions of people worldwide, we need to identify subjects
at risk of AD as early as possible. To this end, disease progression models have been
recently developed, which not only perform early diagnosis, but also estimate a unique
disease signature that is used to predict the subjects’ disease stages and future evolution.
However, these models have not yet been applied to rare neurodegenerative diseases,
are not suitable to understand the complex dynamics of biomarkers, work only on large
multimodal datasets, and their predictive performance has not been objectively validated.
In this work I developed novel models of disease progression and applied them to
estimate the progression of Alzheimer’s disease and Posterior Cortical Atrophy, a rare
neurodegenerative syndrome causing visual deficits. My first contribution is a study on
the progression of Posterior Cortical Atrophy, using models already developed: the Event-
based Model (EBM) and the Differential Equation Model (DEM). My second contribution
is the development of DIVE, a novel spatio-temporal model of disease progression that es-
timates fine-grained spatial patterns of pathology, potentially enabling us to understand
complex disease mechanisms relating to pathology propagation along brain networks.
My third contribution is the development of Disease Knowledge Transfer (DKT), a novel
disease progression model that estimates the multimodal progression of rare neurodegen-
erative diseases from limited, unimodal datasets, by transferring information from larger,
multimodal datasets of typical neurodegenerative diseases. My fourth contribution is the
development of novel extensions for the EBM and the DEM, and the development of
novel measures for performance evaluation of such models. My last contribution is the
organization of the TADPOLE challenge, a competition which aims to identify algorithms
and features that best predict the evolution of AD.
Impact Statement
The work presented in this thesis furthers our understanding of the temporal evolution
of Posterior Cortical Atrophy and Alzheimer’s disease. The disease progression models
and evaluation techniques that we developed can help towards understanding underlying
disease mechanisms, aid patient stratification and drug evaluation in clinical trials for
Alzheimer’s disease and Posterior Cortical Atrophy, and can be used in clinical practice
for predicting the future evolution of subjects that are at risk of developing Alzheimer’s
disease.
I published the work in this PhD thesis in two first-author papers (DIVE and TAD-
POLE chapters), and will soon submit another two papers (DKT and PCA chapters).
I have also communicated my results in international conferences. I have also engaged
with the broader scientific community by organising the TADPOLE Challenge, as well
as a couple of hackathons at the PyConUK conference and the CMIC Summer School.
Acknowledgements
There are many great people who have helped my PhD project become reality. First of all,
I’d like to thank my supervisor Daniel Alexander, for his great advice, ideas and research
directions. He has always encouraged me to pursue interesting ideas and supported me
in developing them. Secondly, I’d also like to thank Alexandra Young and Neil Oxtoby
for teaching me disease progression modelling, especially in the early years of my PhD.
I’d also like to thank Sebastian Crutch, Tim Shakespeare, Keir Yong, and other DRC
collaborators, for their help and advice on Posterior Cortical Atrophy and other clinical
aspects of my work. I’d like to thank Marco Lorenzi, for trying to explain mathematics
to a wanna-be mathematician like myself. Marco and Neil are also great guitar players,
which I had the opportunity to hear a few times. I’d also like to thank Sara Garbarino,
for her great spirit, for taking the time to repeatedly listen to my presentations when
rehearsing them, and for reminding me that I was probably the biggest nerd in CMIC.
I’d further like to thank the POND group, for the help they offered me throughout my
PhD, for the great coffees we had after our meetings, and for reminding me that I can’t
deal with non-working technology in hotels during our trips in the Netherlands. I’d like
to thank Gary Zhang for coaching me on how to present my work without putting half
of the audience to sleep, as well as others in MIG and CMIC, for teaching me about
diffusion MRI, machine learning and other imaging techniques. I remember coming to
those meetings in early days of my PhD and not understanding what was being discussed.
In terms of the social aspect, I had a wonderful time at UCL. I’ll miss the trips
organised by Pawel Markiewicz around Wales and Cornwall, where we had a lot of fun
surfing, playing frisbee and BBQ-ing on the beach. I’ll also miss the great camping trips
with the CMIC folks in the Peak District and Lake District, when I attempted driving –
successfully! – for the first time in the UK! I’ll also miss the great time I had with Thore
Bucking, Emma Hill and Kin Quan during the MRes year. I’ll miss the dinners and
lunches such as the EuroPOND celebratory lunch, when we got so excited that we each
ordered 4 glasses of champagne, which got me tipsy. When we came back to UCL after
lunch I realised I was actually breaking the code instead of doing anything useful.
Finally, I’d like to thank my parents, Aurora and Dan Marinescu, for their love and
support, without which I wouldn’t have been able to start the PhD in the first place.
My brother Robert Marinescu, for his funny jokes and good spirit. My grandmother
Anghelută Constantina, for her funny and charismatic character. And my friends and
housemates, in particular Vibhav Mishra, Carlos Gavidia and Mikael Brudfors, for the
wonderful time spent in the Ifor residence, as well as Georgiana Ghetie, Alexandru Barbu
and Oana Lang, for their light-hearted spirit, conversations and for the fun we had in the
last few months of my PhD.
Contents
List of Figures
List of Tables
1 Introduction
1.1 Alzheimer’s Disease
1.2 Posterior Cortical Atrophy
1.3 Disease Progression Models
1.4 Problem Statement
1.5 Justification
1.5.1 Longitudinal Modelling of Posterior Cortical Atrophy
1.5.2 Current Disease Progression Models Cannot Model Complex Dynamics
1.5.3 Comparative Performance of Different Disease Progression Models
1.6 Thesis Contributions
1.6.1 Longitudinal Neuroanatomical Progression of Posterior Cortical Atrophy
1.6.2 DIVE: A Spatiotemporal Progression Model of Brain Pathology in Neurodegenerative Disorders
1.6.3 Disease Knowledge Transfer across Neurodegenerative Diseases
1.6.4 Novel Extensions to the Event-based Model and Differential Equation Model
1.6.5 TADPOLE Challenge: Prediction of Longitudinal Evolution in Alzheimer’s Disease
1.7 Thesis Structure
2.3.2 Causes
2.3.3 Diagnosis
2.3.4 Management
2.3.5 Neuroimaging
2.3.6 Heterogeneity
9 Conclusions
9.1 Summary
9.2 Future Research Directions
9.2.1 Applications to Neurodegenerative Diseases
9.2.2 Applications to Clinical Trials
9.2.3 Methodological Developments
9.2.4 Model Evaluation
F Bibliography
List of Figures
6.5 DIVE estimated clusters and trajectories over the 10 cross-validation folds
6.6 Scatter plot of DIVE-derived DPS scores vs cognitive tests
7.1 Diagram of the proposed framework for joint modelling of multiple diseases.
7.2 The algorithm for estimating the DKT parameters
7.3 DKT Simulation Results - Comparison between true and DKT-estimated biomarker trajectories and subject time-shifts.
7.4 Estimated biomarker trajectories for the “synthetic PCA” disease, plotted alongside true trajectories
7.5 DKT results - biomarker trajectories in the occipital unit and dysfunctionality scores for tAD and PCA
7.6 Estimated multi-modal trajectories for the PCA cohort.
A.1 Labels of the different areas analysed in the EBM progression snapshots
A.2 EBM bootstrap samples of the atrophy sequence for PCA and tAD
A.3 Hypothesis testing of ordering of events within PCA and tAD
A.4 Positional variance diagram estimated by the event-based model, for three PCA subgroups
A.5 EBM bootstrap samples of the atrophy sequence, for three PCA subgroups
A.6 Hypothesis testing of the ordering of events within the three PCA subgroups.
A.7 Testing for statistically significant differences in positions of each biomarker in the EBM abnormality sequences, for both PCA and typical AD.
A.8 Testing for statistically significant differences in biomarker positions in the EBM sequences of PCA subgroups.
B.1 DIVE: Error in DPS scores and trajectory estimation in simulations
C.1 Estimated biomarker trajectories for the “synthetic AD” disease, plotted alongside true trajectories.
List of Tables
6.1 Demographics of the four cohorts from ADNI and DRC
6.2 Performance evaluation of DIVE and two simplified models on the ADNI MRI dataset
8.1 The format of the forecasts for three example subjects. Participants have to predict, for each subject, the probability of clinical diagnosis (CN/MCI/AD), the ADAS-Cog13 score and Ventricle volume, as well as the 50% confidence range. RID - Roster ID is the unique identifier for ADNI subjects, ADAS - ADAS-Cog13, CI - confidence range. Note that, even if the CN/MCI/AD probabilities don’t sum to one, we will normalise them anyway.
8.2 Subject statistics and available data in the TADPOLE datasets D1, D2, D3 and D4.
8.3 Types of TADPOLE submissions that can be made by participants.
8.4 TADPOLE prize allocation scheme using funds from AD charities
B.1 Comparison of DIVE with two more simplistic models on the ADNI MRI dataset.
Chapter 1
Introduction
with hippocampal sparing and limbic predominant cases reported in the literature [14].
This heterogeneity can help us understand disease causes and underlying mechanisms,
and identify risk- and protective-factors. For example, it has been observed that differ-
ent speeds of progression can be due to differences in amyloid-β fibrils among subjects
[15]. Another example is that different ages of onset in familial AD are associated with
different underlying mutations in the PSEN1 gene [16].
A notable example of phenotypic heterogeneity in Alzheimer’s disease is given by
Posterior Cortical Atrophy (PCA). PCA, also called Benson’s syndrome [17], is a neu-
rodegenerative disease similar to AD that results in disruptions of the visual and motor
systems. Early symptoms include blurred vision, inability to read, difficulty with depth
perception and problems navigating through space [18, 19], while late-stage symptoms can
include inability to recognise familiar faces and objects as well as visual hallucinations.
Neuroanatomically, PCA is characterised by atrophy in the superior parietal, occipital
and posterior temporal regions [20, 21]. However, due to the rarity of the disease, only a
limited number of small studies have been done in PCA [18].
disease [29] and showed increased performance in predictions compared to standard ap-
proaches [30], they have some limitations that need to be addressed. First of all, they
have not been applied to some rare neurodegenerative diseases such as Posterior Cor-
tical Atrophy. Secondly, they are not suitable for modelling the complex dynamics of
biomarkers. This is because they work on extracted features, which generally lack impor-
tant information present in the brain’s morphology; also, they cannot exploit biomarker
relationships shared across related diseases. Thirdly, it is not yet clear how to measure the
performance of such models, and no previous literature study has been done to establish
the comparative performance of such models at different prediction tasks.
• Current disease progression models are not appropriate for modelling the complex
dynamics of biomarker measurements.
The work I present in this thesis tries to address these three aspects.
1.5 Justification
1.5.1 Longitudinal Modelling of Posterior Cortical Atrophy
The longitudinal neuroanatomical progression of Posterior Cortical Atrophy has not been
quantified in a comprehensive study so far. Several case studies have been published,
which described the brain pathological progression of PCA [31, 32, 33, 34, 35, 36]. The
only longitudinal study of PCA [37] showed widespread gray matter loss in both PCA and
tAD. However, the numbers were small (17 PCA and 16 tAD) and the time interval was
short (1 year). Larger longitudinal studies are therefore required to robustly estimate the
progression of brain pathology in PCA, which is important for understanding underlying
disease mechanisms and for stratification of subjects in clinical trials.
regions of interest, it has been shown that patterns of pathology in different types of de-
mentia are dispersed and disconnected, as they follow underlying brain networks [38].
In order to study the link between neuroanatomical pathology and brain networks, we
need to develop spatio-temporal models of disease progression that account for changes
over the brain structure, as well as over the disease timeline. Such spatio-temporal mod-
els can help us understand more complex disease mechanisms and enable more accurate
predictions of disease risk, which can aid stratification in clinical trials.
Another limitation of current disease progression models is that it is challenging to
apply them to study rare types of dementia such as PCA. These models generally require
large multimodal datasets which are often not available for rare dementias. Therefore,
there is a need to develop models that can transfer information from larger multimodal
datasets. In particular for PCA, these transfer-learning approaches can enable us to esti-
mate robust, multimodal biomarker trajectories, and to make more accurate predictions
for each subject.
• I estimated the ordering in which brain regions show volume reductions using the
event-based model, and also estimated the rate and extent of volume loss using the
differential equation model. I contrasted these between PCA and tAD, and showed
differences both qualitatively and quantitatively, which were further supported by
statistical tests.
• I showed that three cognitively-defined PCA subgroups show different phenotype-
specific patterns of early atrophy. This was the first study to show quantitative
evidence of heterogeneity within PCA.
• I developed four novel performance metrics that were used to assess the performance
of all the models evaluated.
• I showed that the extended models had better or similar performance compared to
the standard models.
• My results also indicate that the novel performance metrics are more sensitive than
standard approaches based on the prediction accuracy of clinical diagnosis.
• I helped build the website and I created the main training dataset.
• Chapter 4 contains the clinical analysis regarding the progression of Posterior Cor-
tical Atrophy as compared to typical Alzheimer’s disease.
• Chapter 5 presents novel extensions to the event-based model and differential equation
model, which are evaluated against standard implementations based on
performance metrics that I proposed.
• Chapter 6 presents the DIVE model formulation and results on four different
datasets, along with model validation.
• Chapter 7 presents the DKT model formulation, along with results on simulated
data and patient data, and model validation.
• Chapter 9 presents a summary of the work in this thesis, and proposes directions
for further research.
Chapter 2
2.1.1 Symptoms
Symptoms of AD vary depending on the stage of the disease. Some authors [43] split the
symptoms into several categories: pre-dementia stage, mild dementia, moderate dementia
and severe dementia.
Figure 2.1: Prevalence of dementia around the world, along with forecasts for 2030 and
2050. Source: http://www.worldalzreport2015.org/
28 Chapter 2. Background – Alzheimer’s Disease
In the pre-dementia stage, the first symptoms are usually attributed to stress and ageing.
Careful neuropsychological investigations may reveal very mild cognitive impairment five
years before the establishment of clinical diagnosis [43]. The performance of complex
tasks might be reduced, and alterations of behaviour including social withdrawal and
depressive dysphoria might also be already present [43].
In the mild dementia stage, significant impairment of learning and memory are present
[43]. However, short-term and implicit memory are less affected compared to declara-
tive memory. Neuropsychological tests can reveal problems with object naming [44, 45],
semantic difficulties with word generation [44, 45] and inability to draw figures (i.e. constructional
apraxia) [46]. Non-cognitive disturbances are also present at this stage [47]; in particular,
depression has been observed in these mild stages [48].
At the moderate dementia stage, the predominant features are severe short-term memory
impairment [49], along with difficulties in logical reasoning, planning, language [50], read-
ing [51] and writing [52]. More complex actions and activities such as using household
appliances, dressing and eating are gradually lost. Vision-related symptoms triggered by
cognitive deficits also develop, such as spatial disorientation, inability to recognise fa-
miliar faces or illusionary misidentification [53]. Around 20% of patients also experience
visual hallucinations, which may be associated with cholinergic deficits [54].
Patients at this stage cannot survive in their community without help from caregivers.
However, hospital or nursing home admission can be delayed if there is a good support
system in place at the patient’s home.
Figure 2.2: Diagram showing the amyloid hypothesis. Amyloid precursor protein is split
by α-secretase resulting in sAPPα, which might have a neuroprotective role. On the
other hand, splitting by β-amyloid cleaving enzyme (BACE) results in amyloid-β, of
which amyloid-β42 is more prone to self-aggregate and lead to pathogenesis. On the left,
many other factors are shown that are believed to influence this pathway and lead to
more pathology. Reproduced with permission from [11].
1 happening early in the chain of events leading to AD
increases in amyloid toxicity, but no clear evidence of neuronal loss [62, 63] and tau
aggregation as predicted by the amyloid hypothesis [61].
Figure 2.3: Diagram showing different genes which increase the risk for AD (y-axis), as
well as their frequency within the population (x-axis). EOAD genes APP, PSEN1 and
PSEN2 (top-left) give a near-certain risk of developing AD, but are found in a very small
minority of the AD population. APOE4 has a moderate risk, while the other genes have a
lower risk, yet are found in a much larger population. Reproduced from [78], CC BY-NC.
presenilin 1 and 2 respectively [81]. Sporadic AD is the most common, late-onset form
of AD (LOAD), characterised by more complex, non-Mendelian transmission.
In familial AD, several genetic risk factors have been identified so far. In the 1980s,
the discovery of amyloid-β peptides in AD senile plaques and the identification of these
peptides in the brains of people with Down’s Syndrome, caused by abnormalities in chromosome
21 and where dementia was also observed, led to the hypothesis that mutations
of a gene located on chromosome 21 might cause AD in people without Down’s syndrome
[82]. A few years later, a linkage peak was indeed found on chromosome 21 [83], and the
APP gene was identified [84] and confirmed in EOAD families [85]. However, the amount
of heterogeneity observed in EOAD suggested additional genes were involved, and further
genetic linkage analyses led to the discovery of PSEN1 [86] and PSEN2 genes [87]. As of
March 2014, 40, 197 and 25 mutations were reported in APP, PSEN1 and PSEN2 genes
respectively, all with autosomal dominant transmission with complete penetrance, with
the exception of one mutation in the APP gene [81].
In sporadic, late-onset AD, the genetic landscape is much more complex. The most important
risk factor is given by mutations in genes coding for Apolipoprotein E (APOE) [81].
APOE is a protein whose key function is to transport lipids and cholesterol throughout
the body, and has three major isoforms called APOE2, APOE3 and APOE4, corresponding
to alleles 2, 3 and 4. Increased risk of AD due to APOE4 was established in
1993 in three key studies [88, 89, 90]. Until 2005, more than 500 candidate genes other
than APOE had been identified using association studies, with various pathways involved
including tau phosphorylation, vacuolar sorting, glucose and insulin metabolism, nitric
oxide synthesis, oxidative stress, growth factors, inflammation and lipid-related pathways
[81]. However, after the advent of genome-wide association studies (GWAS) in 2005, the
Figure 2.4: Diagram showing different risk factors for AD related to lifestyle and the
associated level of evidence. Reproduced from [106], CC-BY-NC-ND.
first genes outside the APOE locus were identified in two independent studies [91, 92].
Several genes including CLU [91, 92], CR1 [92], BIN1 [93], PICALM [91], ABCA7 [94]
and CD2AP [95] have been since identified. Moreover, associations were also found with
quantitative endophenotypes, which provide more statistical power than yes/no disease
status, such as early age of onset [96, 97], greater burden of amyloid pathology [98, 99],
abnormal levels of cerebro-spinal fluid (CSF) biomarkers [100, 101], decrease in total brain volume
[102, 103] and decreased cognitive scores [104, 99, 105].
2.1.4 Biomarkers
The information in this section was initially written by me for the TADPOLE
Challenge website², with feedback from Esther E. Bron and Daniel C. Alexander. The
material has been subsequently adapted for this thesis.
Over the last decades, various biomarkers have been developed to quantify the severity
of Alzheimer’s disease and track its progression:
• Cognitive tests such as the Mini-Mental State Examination (MMSE) [111] are used
to assess memory and cognitive performance (section 2.1.4.1).
2 https://tadpole.grand-challenge.org/Data/
2.1. Alzheimer’s Disease 33
Figure 2.5: Intercalated pentagons used in the Mini-Mental State Examination (MMSE).
Patients with dementia have difficulty drawing them. Image source: Wikipedia³, CC-SA.
• Cerebro-spinal fluid markers: can be used to measure amyloid plaque deposits [114]
and neurofibrillary tangles through CSF total tau and phosphorylated tau [114]
(section 2.1.4.5).
3 https://commons.wikimedia.org/wiki/File:InterlockingPentagons.svg
Figure 2.6: Comparison between the MRI brain scan of a healthy subject (left) and sub-
jects with different types of mild cognitive impairment (MCI) (middle-right), showing
different patterns of atrophy for each group. MRI is a widely used technology for mea-
suring the spatial distribution and extent of atrophy and for tracking the progression of
Alzheimer’s disease (AD). Reproduced from [117], CC-BY license.
Magnetic resonance imaging (MRI) is a technique used to image the anatomy and the
physiological processes of the brain and other body parts. With MRI, brain structures
can be quantified due to different contrast between gray matter (GM), white matter
(WM), cerebrospinal fluid (CSF) and hard tissue such as the skull. The GM is the brain
tissue that consists of the bodies of neurons, while the WM consists of fibres connecting
the neurons. The cerebrospinal fluid is a clear, colourless fluid providing mechanical and
immunological protection to the brain. Within MRI, different types of contrast between
tissues can be obtained through T1, T2, T1-weighted and T2-weighted images.
Brain MRI has been successfully applied to quantify neurodegeneration in Alzheimer’s
disease. Brain atrophy, which is caused by the death of neurons, can be visually assessed
in MRI scans due to shrinkage of the brain (see Fig. 2.6) and can be quantified using
markers of volume, cortical thickness, surface areas, along with changes in these values
between a baseline and a follow-up scan. These quantitative markers can be obtained
with specialised software such as FreeSurfer [118].
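As a minimal sketch of how a change between a baseline and a follow-up scan might be summarised, the snippet below computes an annualised percentage change for a single regional measurement. The function name and the example volumes are hypothetical and are not taken from any dataset used in this thesis; the simple linear formula is an assumption made for illustration, not the method used in later chapters.

```python
def annualised_percent_change(baseline, follow_up, interval_years):
    """Annualised percentage change of a measurement (e.g. a regional volume).

    Negative values indicate loss (atrophy) relative to baseline.
    This is a simple linear approximation between two time points.
    """
    if baseline <= 0:
        raise ValueError("baseline measurement must be positive")
    if interval_years <= 0:
        raise ValueError("interval between scans must be positive")
    return 100.0 * (follow_up - baseline) / baseline / interval_years


# Hypothetical hippocampal volumes (mm^3) from two scans taken 1.5 years apart.
rate = annualised_percent_change(3500.0, 3395.0, 1.5)
print(f"Annualised change: {rate:.2f}% per year")
```

Dividing by the scan interval makes rates comparable across subjects whose follow-up visits are unevenly spaced.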
MRI-derived biomarkers have both advantages and limitations. They are robust and
have less noise compared to cognitive tests, and are non-invasive. Moreover, they are also
a good indicator of progression from MCI to dementia in an individual subject because
they become abnormal slightly earlier than the onset of dementia-specific symptoms [1,
119]. Limitations of these markers are that MRI scans are expensive, require specialised
equipment to be acquired, and can also suffer from motion artefacts.
Figure 2.7: (top) Fluorodeoxyglucose (FDG) PET images for a cognitively normal sub-
ject (left) and a subject with Alzheimer’s disease (right). FDG PET measures cellular
metabolism, which is known to decrease during the development of AD. There is decreased
metabolism in parietal and frontal regions (gray arrows) in the AD subject compared to
the cognitively normal subject. (bottom) Pittsburgh compound B (PiB) PET image measuring
amyloid uptake in the brain of a healthy control (left) and AD subject (right). There
is widespread amyloid presence in the brain of the AD subject. Image reproduced with
permission from [120].
Figure 2.8: (Left) Diffusion tensor image of a brain showing white matter fibre con-
nections. The colours represent the direction of the connection (red for left-right, blue
for superior-inferior, and green for anterior-posterior). (Middle) Zoomed image into the
small region of interest (ROI), showing the diffusion tensor ellipses. Each ellipse indicates
the direction where water molecules diffused (i.e. moved). (Right) Diagram showing the
difference between isotropic diffusion (i.e. equal in all directions) versus anisotropic dif-
fusion, along with the diffusivity measures that can be computed. Diagram assembled by
me using images from several sources⁴.
of the neuron’s transport system and eventually the neuron’s death (see section
2.1.2.2). AV1451 PET can be used to measure the level of misfolded tau proteins and
is also one of the earliest markers in AD.
PET-derived biomarkers are important because they give information about molecular
processes that happen in the brain. These are usually the first to become abnormal in
the cascade of events that lead to Alzheimer’s disease, and are therefore important early
markers of the disease that is about to unfold [1, 119].
PET scans have some limitations that need to be acknowledged. One main limitation
is that the patient is exposed to ionising radiation, which limits the number of scans they
can take in a specific time interval. PET scans also have a much lower spatial resolution
compared to MRI scans. One other caveat with AV1451 PET (tau imaging) is that it is a
very new tracer that is still under research, with some studies indicating evidence of some
off-target binding in some tau conformations found in non-AD tauopathies [121, 122].
4 Image sources:
http://fmri.uib.no/index.php?option=com_content&view=article&id=68&Itemid=86
https://commons.wikimedia.org/wiki/File:DTI-axial-ellipsoids.jpg
http://www.diffusion-imaging.com/2012/10/voxel-based-versus-track-based.html
Figure 2.9: Diagram showing the cerebro-spinal fluid (CSF) coloured in blue, which is
found in the subarachnoid space around the brain and spinal cord. Source: Wikipedia⁵,
CC license.
made of diffusion tensors estimated at each voxel (middle). Diffusivities parallel and
perpendicular to the fiber direction can then be measured (right).
DTI is important for analysing the progression of Alzheimer’s disease. It has been
shown that AD affects white matter bundles [123]. DTI has also shown great potential
for aiding the diagnosis of dementia [124, 125]. DTI tractography is also important for
building brain structural connectomes which have been shown to be disrupted by different
types of dementias including Alzheimer’s disease [38, 126].
DTI has some limitations. As with other MRI modalities, it is susceptible
to motion artefacts and suffers from partial volume effects, i.e. measures at each voxel are
biased due to averaging across many different cells and types of tissue that are contained
in that voxel. Another limitation is that changes in DTI-derived measures such as fractional anisotropy (FA) are
not specific, and can be attributed to many changes in the underlying cytoarchitecture,
such as neurite density or dispersion [127].
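The diffusivity measures discussed above can be illustrated with a short sketch that derives the standard scalar measures from the three eigenvalues of a diffusion tensor: axial (parallel) diffusivity, radial (perpendicular) diffusivity, mean diffusivity and fractional anisotropy. The function name and the example eigenvalues are hypothetical; the formulas are the standard textbook definitions rather than any specific pipeline used in this thesis.

```python
import math


def dti_scalar_measures(eigenvalues):
    """Standard scalar measures from the three eigenvalues of a diffusion tensor.

    Returns axial (AD), radial (RD) and mean (MD) diffusivity, in the units of
    the input, and the dimensionless fractional anisotropy (FA) in [0, 1].
    """
    # Sort so that l1 is the principal (largest) eigenvalue.
    l1, l2, l3 = sorted(eigenvalues, reverse=True)
    md = (l1 + l2 + l3) / 3.0   # mean diffusivity
    ad = l1                     # diffusivity parallel to the fibre direction
    rd = (l2 + l3) / 2.0        # diffusivity perpendicular to the fibre direction
    spread = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    norm = l1 ** 2 + l2 ** 2 + l3 ** 2
    # FA = sqrt(3/2) * ||lambda - MD|| / ||lambda||; defined as 0 for a zero tensor.
    fa = math.sqrt(1.5 * spread / norm) if norm > 0 else 0.0
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}


# Hypothetical eigenvalues (mm^2/s) for a coherent white-matter voxel.
print(dti_scalar_measures([1.7e-3, 0.3e-3, 0.3e-3]))
# Isotropic diffusion (equal in all directions) gives FA = 0.
print(dti_scalar_measures([1.0e-3, 1.0e-3, 1.0e-3]))
```

Because FA depends only on the relative spread of the eigenvalues, quite different microstructural changes can produce the same FA value, which is one way to see the lack of specificity noted above.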
5 https://en.wikipedia.org/wiki/File:1317_CFS_Circulation.jpg
2.1.5 Diagnosis
A diagnosis of Alzheimer’s disease is usually given based on the person’s medical history,
behaviour and information provided by the relatives. Medical imaging from Magnetic
Resonance Imaging (MRI), Computed Tomography (CT) or Positron Emission Tomography
(PET) can help exclude other types of brain pathologies or types of dementia.
Memory tests from neuropsychological batteries can help characterise the stage of the
disease [128].
The most commonly used diagnostic criteria are from the National Institute of Neuro-
logical and Communicative Disorders and Stroke (NINCDS) and the Alzheimer’s Disease
and Related Disorders Association (ADRDA) [115, 116]. These criteria, commonly called
NINCDS-ADRDA, require evidence of cognitive impairment through neuropsychological
testing for establishing a clinical diagnosis of probable AD, while histopathologic confir-
mation is required for definite confirmation [115, 116].
6
Pre-α is one of the layers from the principal stratum (Pre) of the entorhinal cortex. It is characterised
by cellular islands of large projection cells, and its connections project to the hippocampus.
2.3. Posterior Cortical Atrophy 39
first Ammon’s horn sector [132]. Finally, stages V-VI were marked by the spread of NFTs
and NTs to almost all isocortical association areas [132].
2.2.2 Neuroimaging
In AD, Magnetic Resonance Imaging (MRI) shows gray matter atrophy throughout the
brain, in particular in the hippocampus and entorhinal cortex [9]. In terms of atrophy
progression, it starts in the medial temporal lobe and fusiform gyrus at least 3 years
before an AD diagnosis, and then spreads to the posterior temporal lobe, parietal lobe,
and finally to the frontal lobe. However, the sensorimotor cortex, visual cortex and the
cerebellum are relatively spared [9].
Imaging with Positron Emission Tomography (PET) shows reduced metabolism (FDG)
and increased uptake of amyloid (e.g. AV45) proteins [10]. In early stages of AD, hy-
pometabolism affects the parietotemporal association areas, the posterior cingulate gyrus
and the precuneus. In later stages, frontal cortices also become affected, while the striatum,
thalamus, primary sensorimotor cortices, visual cortices and the cerebellum seem
to be spared [10]. In terms of amyloid deposition, as measured with amyloid PET, early deposits
are found in the precuneus, orbitofrontal, inferior temporal and posterior cingulate, later
followed by the entire prefrontal cortex, lateral temporal and parietal lobes [10]. These
patterns have been validated using autopsy studies [139, 140].
2.3.1 Symptoms
The most common symptoms include general visuospatial and visuoperceptual impair-
ments such as inability to read, blurred vision, light sensitivity, trouble navigating through
space and issues with depth perception [18, 19]. Additional symptoms also include apraxia
(disorder of movement planning), visual agnosia (object recognition deficit) and agraphia
(loss of writing ability) [17, 32]. These symptoms get worse as the disease progresses,
with patients becoming unable to recognise familiar people and objects, navigate
familiar places or draw (see Fig. 2.10). Some studies [141, 142, 143] reported visual
hemineglect (difficulty seeing one half of the visual field) to be frequent in PCA patients,
especially if asymmetrical atrophy takes place in the occipital areas.
PCA patients report higher-order visual problems related to object and space perception,
rather than more basic visual impairments, e.g. in colour and motion, although
impairments in higher-order visual functions might be due to lower-level disruption. One
study [144] reported that all PCA subjects showed impairment in at least one low-level
visual process, and that this correlates with higher-order visuospatial and visuoperceptual
functions, but not with non-visual functions of the parietal lobe, including calculations
and spelling.
Figure 2.10: (A) Visual deficits as shown when a 62-year-old PCA patient was asked
to copy the intersecting pentagons figure [18]. (B) Structural MRI, FDG PET and PiB
PET scans of the same subject. Structural MRI shows atrophy predominant in the
bilateral parietal, posterior temporal and lateral occipital regions (B, top), FDG PET
shows reduced metabolism in the same regions (B, middle), while PiB-PET shows diffuse
amyloid uptake throughout the entire brain (B, bottom) [18].
2.3.2 Causes
The causes of PCA are still unknown, due to the rarity of the disease, the gradual onset of
symptoms and the lack of fully accepted diagnostic criteria [19, 18]. The progressive neurodegeneration
that characterises PCA is often attributed to Alzheimer’s disease pathology
(i.e. aggregation of amyloid plaques and tau tangles), but alternative causes including
dementia with Lewy bodies, corticobasal degeneration and prion disease have also been
identified [18]. One study reported the PCA syndrome in a 4-sibling family with prion
disease [145], suggesting that prion propagation mechanisms might be involved in PCA.
Genetic factors that underlie PCA are also not well understood [18, 19]. Empirical
findings suggest that there are no significant differences in the number of patients with
a positive family history of PCA and typical AD [18]. Some studies also report no
differences in Apolipoprotein E (APOE) genotypes between PCA and typical AD [146, 147,
148, 149], although other studies reported differences in APOE ε4 allele status, with
fewer PCA patients being ε4-positive [150, 151]. These differences have been attributed
to differences in inclusion criteria of PCA with respect to typical AD [18].
2.3.3 Diagnosis
PCA patients face difficulties in diagnosis due to the young age at onset and the fact that
there are no fully accepted diagnostic criteria. Patients are sometimes misdiagnosed with
depression, anxiety or even malingering in early stages of the disease [18]. They are often
initially referred to opticians and ophthalmologists in the belief that ocular abnormalities
are causing their visual deficits, often leading to unnecessary medical procedures such as
cataract surgery. Neuroimaging modalities such as magnetic resonance imaging (MRI),
positron emission tomography (PET) or single photon emission computed tomography
(SPECT) can aid diagnosis of PCA [152].
There are no widely accepted diagnostic criteria, although two sets of criteria have been
proposed so far, by Mendez et al. [146] and Tang-Wai et al. [147]. These criteria suggest the
presence of visual deficits in the absence of eye disease, gradual progression, relative
preservation of anterograde memory, absence of stroke or tumour, and other neuropsy-
chological or imaging abnormalities that are related to parietal or occipital functions.
However, these criteria have some limitations. They are yet to be thoroughly val-
idated outside their centres, and need to be linked to underlying pathology, otherwise
inconsistencies between studies and centres will occur. Moreover, the current criteria
provide no guidance to the level of specificity required for a diagnosis of PCA [18]. It has
been suggested that PCA, when caused by underlying AD pathology, lies on a continuum
of phenotypical variation between AD and purely-visual PCA, with no clearly defined
diagnosis boundary [151, 149, 144].
2.3.4 Management
There is no known effective treatment for PCA that will reverse or stop neurodegeneration
[19]. Patients with PCA are usually treated with the same medication as for AD, namely
cholinesterase inhibitors: tacrine, rivastigmine, galantamine and donepezil [19]. Crutch
et al. [18] suggest that antidepressant drugs might also be appropriate in patients with
low mood, and levodopa or carbidopa could aid individuals with parkinsonism. However,
there are no studies analysing the efficacy of these drugs in PCA patients [19].
A few non-pharmacological therapies have also been attempted recently in some pa-
tients that included psycho-educative programs [153] or a combination of speech therapy,
occupational therapy and physiotherapy [154].
2.3.5 Neuroimaging
Several MRI studies in PCA have shown damage to posterior brain regions. Studies
by Hof et al. [155] and Tang-Wai et al. [147] show a greater concentration of senile
plaques and neurofibrillary tangles in the occipital and parietal lobes and at the occipito-
temporal junction. Cross-sectional studies using voxel-based morphometry have also
shown significant abnormalities in occipital and parietal lobes, followed by the temporal
lobe [144, 21]. When compared directly to typical AD subjects, PCA subjects have shown greater
atrophy in the right parietal lobe and less in the left temporal and hippocampal regions
[20, 18]. Some DTI studies also seem to suggest white matter damage in posterior regions
[156, 157, 158]. See Fig. 2.10 for MRI scans of a PCA patient, showing the typical
posterior pattern of atrophy.
Non-MRI imaging studies in PCA have also shown similar posterior patterns of
abnormality. Functional imaging studies using single photon emission computed tomography
(SPECT) and FDG PET also show reduced function in occipital and parietal regions
[159, 160, 161, 162]. Amyloid pathology, as measured with PiB-PET, has been found to be
higher in occipital and parietal areas compared to typical AD subjects [163, 164, 165, 166],
although this finding was not confirmed in two other studies, which found more diffuse
amyloid uptake [148, 167].
2.3.6 Heterogeneity
Some studies [31, 13] have shown that there is considerable heterogeneity within PCA
itself, where three main PCA subgroups have been reported: primary visual (the striate
cortex, caudal), parietal (dorsal) and occipitotemporal (ventral) [31, 13].
Patients with the primary visual subtype showed deficits in basic vision, with later problems
with memory, attention and presence of visual hallucinations [13, 168]. Imaging showed
reduction in occipital lobe perfusion. In one of the studies, AD diagnosis was confirmed
post-mortem, upon pathological examination [13]. However, evidence for the existence
of this subgroup is very limited, with only two patients identified so far in different case
studies [168, 13], with another study having reported no “pure” visual deficits within a
cohort of n=21 PCA subjects [144].
Patients with the parietal (dorsal) PCA subtype generally show initial visuospatial
symptoms, agraphia (inability to write) and dyspraxia, but have preserved visual fields,
basic perceptual abilities, object recognition and reading, and show biparietal and occipital
deficits, disrupting the dorsal or “where” stream [31, 13].
Patients with the occipitotemporal (ventral) PCA subtype generally show symptoms
related to visual distortion, inability to recognise objects, general topography and written
words and show occipitotemporal pathology, disrupting the ventral or “what” stream
[31, 13].
While all this evidence suggests that there is considerable heterogeneity within the
PCA syndrome, the evidence is limited to a few case studies, with some patients also
having no pathological confirmation of underlying AD pathology. Moreover, some [18,
144] have suggested that these subtypes should not be interpreted as distinct groups, but
rather as points on a continuum of phenotypical variation.
Chapter 3
Figure 3.1: Cartoon showing hypothetical biomarker signatures from two diseases, along
with a cross-sectional snapshot of data from a patient (left). For one patient, disease
staging implies finding the optimal time-shift along the horizontal axis that would match
its data. On the negative y-axis, the histogram of possible stages is shown. Differential
diagnosis can be performed by evaluating the integral of the distribution of stages on the
negative y-axis, and selecting the disease that has the largest integral. Deriving quan-
titative biomarker signatures using disease progression modelling can help with disease
understanding, staging and differential diagnosis. Image courtesy of Neil Oxtoby and
Daniel Alexander.
Figure 3.2: Dynamic biomarkers of the AD cascade as hypothesised by Jack et al. [1]. Aβ
and tau are thought to become abnormal before the onset of any dementia symptoms,
while brain structure, memory and clinical function are thought to become abnormal
later, during MCI and dementia stages. Reproduced with permission from [1].
ical, theoretical model that is meant to be a guide for future researchers modelling disease
progression in Alzheimer’s disease. Hence, the model is not quantitative and cannot be
used to e.g. stage patients. Another limitation is that the x-axis (disease progression)
and y-axis (biomarker abnormality) are not well-defined. The implementations
discussed next make different assumptions about how to define these axes, such
as computing Z-scores with respect to controls [2], or using percentiles over the observed
biomarker values [3]. In the next sections, we will present the development of quantitative
models that address these limitations.
cortical thinning rate against MMSE scores. Jack et al. [176] also used regression against
MMSE to estimate the shape of biomarker trajectories. Doody et al. [177] regressed
biomarkers against time since baseline visit. Driscoll et al. [178] estimated brain volume
trajectories using a mixed effects model against age, using other demographic variables
such as gender and intracranial volume (ICV) as covariates.
These methods have some limitations. Regression methods against clinical markers
are limited by the fact that they cannot estimate biomarker dynamics in pre-clinical
stages. On the other hand, regression against age or time since baseline visit assumes that
all subjects have the same age of disease onset or that disease onset is at baseline visit.
Another method for estimating biomarker trajectories, which is popular in familial
AD, performs non-linear regression of mutation carriers’ data against estimated years
from parent’s onset [179, 180]. However, this method can only be applied to dominantly
inherited AD, which represents only a small percentage of the entire AD population.
Figure 3.3: Diagram showing the key concepts behind the event-based model. We assume
a toy dataset (top-left) of two region-of-interest biomarkers from three patients, which
are at different stages along a hypothetical disease progression timeline (bottom-left).
The aim is to estimate which region became abnormal earlier in the disease process. The
event-based model solves this by fitting a mixture model to the data (top-right), where
the two distributions are assumed to represent normal and abnormal biomarker values
respectively. The measurements from each patient are then assessed according to each
distribution (middle-right). Finally, the sequence of abnormality is estimated from these
values, by placing earlier in the sequence the regions/biomarkers for which there are more
abnormal values in the dataset. Diagram made by me.
the disease timeline, also called temporal heterogeneity. Some models go a step further
and also estimate differences that are due to spatial heterogeneity of the subjects, using
random effects estimating deviations from the population trajectory. Such combined
modelling is challenging, as it introduces identifiability issues.
3.5.1.1 Theory
The event-based model consists of a series of events $E_1, E_2, \dots, E_N$ and an ordering
$S = [s(1), \dots, s(N)]$, which is a permutation of the integers $1, \dots, N$ creating the event
ordering $E_{s(1)}, E_{s(2)}, \dots, E_{s(N)}$. The set of events is specified a priori. Moreover, the
model uses a dataset $X$ which contains a set of measurements $X_i$ for each subject $i$.
These measurements are defined as $X_i = \{x_{i1}, x_{i2}, \dots, x_{iN}\}$, where $x_{ij}$ represents the
value of biomarker $j$ in subject $i$ and is informative of event $E_j$ in subject $i$.
The event-based model makes two key assumptions: first, measurements are mono-
tonic as the disease progresses and secondly, the event ordering is the same across all
patients. The first assumption fits with the hypothetical model presented by Jack et al.
[1] in fig. 3.2. Therefore, a patient for whom event Ej has occurred cannot revert to
a state where event Ej did not occur. This assumption is essential because it ensures
snapshots are informative about the event ordering [23]. The second assumption is nec-
essary to be able to aggregate information about the event ordering from the entire set
of subjects.
The aim of the event-based model is to find the probability density function $p(S|X)$
of an event ordering given the biomarker data. One starts by fitting a model for the
likelihood function $p(x_{ij}|E_j)$, the likelihood of measuring $x_{ij}$ given that event $E_j$ has occurred. A
similar fit is obtained for $p(x_{ij}|\neg E_j)$, the likelihood of measuring $x_{ij}$ given that event $E_j$ has
not occurred. More information about mixture model fitting can be found in section
3.5.1.3. If a subject $i$ is at stage $k$ in the disease progression, events $E_{s(1)}, \dots, E_{s(k)}$ have
occurred while events $E_{s(k+1)}, \dots, E_{s(N)}$ have not occurred. We can therefore define the
likelihood of the data from subject $i$ given ordering $S$ as:
$$p(X_i|S, k) = \prod_{j=1}^{k} p\left(x_{i,s(j)}|E_{s(j)}\right) \prod_{j=k+1}^{N} p\left(x_{i,s(j)}|\neg E_{s(j)}\right) \quad (3.1)$$
where measurements xij are assumed to be independent. Since the subject could
potentially be at any stage k in the progression, we integrate over k:
$$p(X_i|S) = \sum_{k=0}^{N} p(k)\, p(X_i|S, k) \quad (3.2)$$
where p(k) is the prior probability of the subject being at position k in the sequence. A
uniform prior is usually assumed here. Further assuming independence of measurements
across patients we get:
$$p(X|S) = \prod_{i=1}^{P} p(X_i|S) \quad (3.3)$$
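As an illustration, Eqs. 3.1 to 3.3 can be evaluated with a few cumulative products. The following is a minimal sketch, assuming the per-biomarker likelihoods have already been computed from the fitted mixture models and that the stage prior p(k) is uniform; the function name and array layout are hypothetical and not taken from any EBM implementation.

```python
import numpy as np

def ebm_log_likelihood(p_event, p_no_event, order):
    """Return log p(X|S) following Eqs. 3.1-3.3 with a uniform prior p(k) = 1/(N+1).

    p_event[i, j]    : p(x_ij | E_j), assumed precomputed from the mixture model
    p_no_event[i, j] : p(x_ij | not E_j), assumed precomputed as well
    order            : the ordering S as a permutation of 0..N-1 (0-based events)
    """
    p_yes = np.asarray(p_event, dtype=float)[:, order]
    p_no = np.asarray(p_no_event, dtype=float)[:, order]
    n_subjects = p_yes.shape[0]
    ones = np.ones((n_subjects, 1))
    # Column k holds the product over the first k events in S (events that occurred).
    occurred = np.hstack([ones, np.cumprod(p_yes, axis=1)])
    # Column k holds the product over the remaining events (events that did not occur).
    not_occurred = np.hstack([np.cumprod(p_no[:, ::-1], axis=1)[:, ::-1], ones])
    stage_likelihood = occurred * not_occurred       # Eq. 3.1 for k = 0..N
    subject_likelihood = stage_likelihood.mean(axis=1)  # Eq. 3.2, uniform p(k)
    return float(np.sum(np.log(subject_likelihood)))    # log of Eq. 3.3
```

Working in log space for the product over subjects avoids numerical underflow when the number of subjects is large.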
[Figure 3.4 panels. (a) Fonteijn et al. [23]: S old = E1 E2 E3 E4 E5, S new = E4 E2 E3 E1 E5. (b) Young et al. [24]: S old = E1 E2 E3 E4 E5, S new = E2 E3 E4 E1 E5.]
Figure 3.4: MCMC perturbation rules used by (a) Fonteijn et al. [23] and (b) Young et
al. [24]. Both methods assume randomly selected source and target events. The method
by Fonteijn et al. only swaps the source event (E1) with the target event (E4). On the
other hand, the perturbation used by Young et al. moves a source event after a target
event and slides the other biomarkers accordingly. Diagram made by me.
$$p(S|X) = \frac{p(S)\, p(X|S)}{p(X)} \quad (3.5)$$
As the marginal distribution p(X) is analytically intractable, one uses a Markov-
chain Monte Carlo (MCMC) algorithm to sample from the posterior distribution p(S|X).
One assumes flat priors on the sequence S as any sequence could be equally likely. In
the MCMC phase, at each iteration the sequence S can be perturbed by swapping two
randomly chosen events. This perturbation rule has been used by Fonteijn et al. [23].
However, another perturbation method used by Young et al. [24] randomly selects a
source and target event and places the source event after the target event, sliding the
other biomarkers accordingly (see Fig. 3.4). The resulting sequence $S^{new}$ is accepted with
probability $p = \min(1, a)$, where $a = p(X|S^{new})/p(X|S)$. Otherwise the old sequence is
retained and the process is repeated. As MCMC depends on accurate initialisation, one also
runs a greedy ascent algorithm in order to find the sequence with the highest likelihood.
The greedy ascent is very similar to the MCMC phase, the only difference being that
$a$ is set to 1 if $p(X|S^{new}) > p(X|S)$ and to zero otherwise. Depending on the number
of biomarkers, the greedy ascent is run for a few thousand iterations and repeated 10
times, with different random permutations of integers 1, . . . , N as the starting position.
The maximum likelihood sequence obtained from greedy ascent is then used to initialise
MCMC sampling, which usually runs for at least 100,000 iterations, again depending on
problem size.
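The two perturbation rules and the acceptance step described above can be sketched as follows. This is an illustrative sketch only: `log_lik` is a hypothetical user-supplied function returning log p(X|S) for an ordering, and the function names are assumptions rather than the code used in [23] or [24].

```python
import numpy as np

def swap_events(order, rng):
    """Perturbation in the style of Fonteijn et al.: swap two randomly chosen events."""
    new = list(order)
    a, b = (int(v) for v in rng.choice(len(new), size=2, replace=False))
    new[a], new[b] = new[b], new[a]
    return new

def move_event(order, rng):
    """Perturbation in the style of Young et al.: place a source event right after a target event."""
    new = list(order)
    src, tgt = (int(v) for v in rng.choice(len(new), size=2, replace=False))
    target_event = new[tgt]
    event = new.pop(src)                      # the remaining events slide to fill the gap
    new.insert(new.index(target_event) + 1, event)
    return new

def sample_orderings(log_lik, n_events, n_iter, perturb, rng, greedy=False, init=None):
    """Greedy ascent (greedy=True) or Metropolis sampling with a flat prior on S."""
    order = [int(e) for e in rng.permutation(n_events)] if init is None else list(init)
    current = log_lik(order)
    samples = []
    for _ in range(n_iter):
        proposal = perturb(order, rng)
        candidate = log_lik(proposal)
        if greedy:
            accept = candidate > current      # a = 1 if the likelihood improves, else 0
        else:
            # accept with probability min(1, a), a = p(X|S_new) / p(X|S), in log space
            accept = np.log(rng.random()) < candidate - current
        if accept:
            order, current = proposal, candidate
        samples.append(list(order))
    return samples, order, current
```

A typical use would run `sample_orderings(..., greedy=True)` from several random starting permutations, keep the best ordering, and pass it as `init` to a second call with `greedy=False`.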
The resulting MCMC-sampled sequences are usually plotted in a positional variance
matrix M (Fig 3.5), which is a compact way to represent uncertainty in the event ordering.
Each element M (i, j) represents the proportion of times event Es(j) appeared on position
i in the sampled sequences, given some master sequence S. S is usually set to be the
maximum likelihood sequence or the characteristic ordering, which is given by the average
position of the events in the MCMC samples [23].
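As a sketch of how these quantities might be computed from a list of sampled sequences (the function names are hypothetical, and events are indexed from zero):

```python
import numpy as np

def characteristic_ordering(samples):
    """Order events by their average position across the MCMC samples."""
    samples = np.asarray(samples)
    n_events = samples.shape[1]
    # np.argwhere returns (sample, position) pairs; column 1 is the position
    mean_position = [np.mean(np.argwhere(samples == e)[:, 1]) for e in range(n_events)]
    return [int(e) for e in np.argsort(mean_position, kind="stable")]

def positional_variance(samples, master):
    """M[i, j] = proportion of samples in which event master[j] appears at position i."""
    samples = np.asarray(samples)
    n_events = samples.shape[1]
    M = np.zeros((n_events, n_events))
    for j, event in enumerate(master):
        for i in range(n_events):
            M[i, j] = np.mean(samples[:, i] == event)
    return M
```

Each column of M sums to one, since every event occupies exactly one position in each sample.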
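[Figure 3.5 panels. MCMC samples 1, 2, ..., T (sample 1: E2 E1 E4 E3; sample 2: E1 E2 E4 E3; sample T: E2 E1 E3 E4); positional variance matrix with events E2, E1, E4, E3 against event position 1 to 4; maximum likelihood ordering: E2 E1 E4 E3.]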
Figure 3.5: MCMC sampling and positional variance computation. MCMC sampling
finds a series of T samples, which are then used to derive the characteristic ordering,
where events are ordered according to their average position in the MCMC samples.
Entries M (i, j) in the positional variance matrix store the relative number of times each
event appeared in each position in the sequence. The events in the positional variance
matrix are ordered according to the characteristic ordering. Diagram made by me.
parameters for the Gaussian distribution were set as the mean and standard deviation of
biomarker values corresponding to controls, while the limits of the uniform distribution
were set to be the minimum and maximum observed biomarker values. While this works
in familial AD and Huntington’s disease [23] due to well-defined control populations, this
does not work well in sporadic AD due to the control population being not well-defined
– e.g. some controls can already have abnormal amyloid levels, which could result in the
distribution for normal values encompassing all observed values. Therefore, the approach
of Young et al. [24] for sporadic AD involved optimising the mixture model parameters
based on the subjects’ data in a data-driven manner. In this case, prior constraints
were used on the mixture model parameters, i.e. the mean and standard deviation of
the Gaussian distributions, for biomarkers that did not change from healthy to diseased
subjects.
$$\hat{k} = \arg\max_k \, p(k)\, p(X_i|S, k) = \arg\max_k \, p(k) \prod_{j=1}^{k} p\left(x_{i,s(j)}|E_{s(j)}\right) \prod_{j=k+1}^{N} p\left(x_{i,s(j)}|\neg E_{s(j)}\right) \quad (3.6)$$
As before, the prior p(k) is assumed to be uniform. It should be noted that stages
range from zero to N , the number of events. If a subject is at stage k it means that all
events up to and including k have occurred while the events after k have not occurred.
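A minimal sketch of this staging step is given below, assuming a uniform prior p(k) and precomputed likelihood arrays; the function name and array layout are hypothetical.

```python
import numpy as np

def stage_subjects(p_event, p_no_event, order):
    """Return the most likely stage (0..N) of each subject given the ordering S.

    p_event[i, j] = p(x_ij | E_j) and p_no_event[i, j] = p(x_ij | not E_j) are
    assumed to be precomputed; `order` is a permutation of 0..N-1.
    With a uniform p(k), the arg max of p(k) p(X_i|S, k) equals that of p(X_i|S, k).
    """
    p_yes = np.asarray(p_event, dtype=float)[:, order]
    p_no = np.asarray(p_no_event, dtype=float)[:, order]
    ones = np.ones((p_yes.shape[0], 1))
    occurred = np.hstack([ones, np.cumprod(p_yes, axis=1)])
    not_occurred = np.hstack([np.cumprod(p_no[:, ::-1], axis=1)[:, ::-1], ones])
    # np.argmax returns the earliest stage when several stages tie
    return np.argmax(occurred * not_occurred, axis=1)
```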
The event-based model can also be used to classify subjects into controls and AD,
or any other symptomatic subgroups [24]. Given a threshold stage t, one can predict all
subjects having a stage less than or equal to t to be controls and all subjects with stages
3.5. Scalar Biomarker Models 53
greater than t to be patients. The optimal threshold is the one which maximises the
balanced accuracy, defined as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \quad (3.7)$$
where T P , F P , F N , T N represent the number of true positive, false positive, false
negative and true negative subjects respectively.
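The threshold search can be sketched as below. Note that Eq. 3.7 as written is the overall accuracy, whereas balanced accuracy is conventionally defined as the mean of sensitivity and specificity; the sketch provides both so that either can be passed as the metric. All function names are hypothetical, and both classes are assumed to be present in the data.

```python
import numpy as np

def confusion_counts(stages, is_patient, threshold):
    """Subjects with stage > threshold are predicted to be patients."""
    predicted = np.asarray(stages) > threshold
    truth = np.asarray(is_patient, dtype=bool)
    tp = int(np.sum(predicted & truth))
    fp = int(np.sum(predicted & ~truth))
    fn = int(np.sum(~predicted & truth))
    tn = int(np.sum(~predicted & ~truth))
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    """Overall accuracy, as in Eq. 3.7."""
    return (tp + tn) / (tp + fp + fn + tn)

def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity (requires both classes to be present)."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

def best_threshold(stages, is_patient, n_events, metric=balanced_accuracy):
    """Scan thresholds t = 0..N and return the first one maximising the metric."""
    scores = [metric(*confusion_counts(stages, is_patient, t)) for t in range(n_events + 1)]
    best = int(np.argmax(scores))
    return best, scores[best]
```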
3.5.1.5 Discussion
The EBM is a useful tool for modelling the progression of diseases when only limited,
cross-sectional data is available. The model can also be used to stage subjects, in discrete
units, along the disease progression timeline. The model parameters are estimated using
Markov Chain Monte Carlo sampling, based on optimising a conditional likelihood.
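[Figure 3.6 panels. Forward model: from what we want to what we have. Inverse problem: from what we have back to what we want, using the rate of change $\lim_{\Delta t \to 0} \Delta x / \Delta t = \delta x / \delta t = f(x)$ and solving for x with the Euler method: $t_1 = t_0 + \delta t$, $x_1 = x_0 + f(x_0)\,\delta t$.]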
Figure 3.6: Diagram of the Differential Equation Model (DEM). (top-left) Hypothetical
biomarker signature that needs to be reconstructed, along with subject measurements.
(top-middle) To make the model more realistic, each subject is made to follow a slightly
different trajectory due to heterogeneity. (top-right) In practice, we don’t know the
disease stage, so we align the measurements at time since baseline visit. (bottom-right)
The DEM model estimates a rate of change model from the slopes of lines fitted to each
subject’s biomarker data. At least two measurements per subject are required in order
to estimate this slope. (top-middle) The DEM then performs a line integral using the
Euler method to recover the biomarker trajectory (top-right). Diagram made by me.
The model given by f (s) can be parametric (e.g. linear, polynomial) or non-parametric
such as Gaussian Processes (GP). We then perform a line integral along f (s) to recover
s(t). More explicitly, if we take the limit as $\Delta t \to 0$ in Eq. 3.8, we get that:
$$\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{\delta s}{\delta t} = f(s) \quad (3.9)$$
Solving this numerically is done using the Euler method. We set an initial (t0 , s0 ) and
small increment step δt and find the next pair (t1 , s1 ) as follows:
$$t_1 = t_0 + \delta t, \qquad s_1 = s_0 + f(s_0)\,\delta t \quad (3.10)$$
This is repeated until the full curve defined by (t0 , s0 ), (t1 , s1 ), . . . , (tn , sn ) is recon-
structed. Since the DEM model is univariate, the process is repeated independently for
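The two stages of the DEM can be sketched as follows, for a single biomarker. This is only an illustration under stated assumptions: the rate model f is taken to be a polynomial fitted to per-subject slopes against per-subject mean values, and the function names are hypothetical.

```python
import numpy as np

def fit_rate_model(times, values, degree=1):
    """Fit f(s): each subject's slope as a polynomial function of that subject's mean value.

    times[i] and values[i] hold the visit times and measurements of subject i;
    at least two visits per subject are needed to estimate a slope.
    """
    slopes, means = [], []
    for t, y in zip(times, values):
        slopes.append(np.polyfit(t, y, 1)[0])   # slope of the line fitted to one subject
        means.append(np.mean(y))
    return np.poly1d(np.polyfit(means, slopes, degree))

def euler_trajectory(f, s0, t0=0.0, dt=0.01, n_steps=1000):
    """Integrate ds/dt = f(s) with the Euler updates of Eq. 3.10."""
    t = t0 + dt * np.arange(n_steps + 1)
    s = np.empty(n_steps + 1)
    s[0] = s0
    for n in range(n_steps):
        s[n + 1] = s[n] + f(s[n]) * dt
    return t, s
```

For example, `euler_trajectory(fit_rate_model(times, values), s0)` would reconstruct one trajectory starting from the value `s0`; smaller steps `dt` reduce the discretisation error at the cost of more iterations.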
Figure 3.7: Biomarker trajectories estimated by the disease progression model by Jedynak
et al. [2]. Reproduced with permission from [2].
1
A multivariate model would have been able to use information from other biomarkers to help estimate
such a noisy trajectory, and is hence more robust in theory.
56 Chapter 3. Background – Disease Progression Models
• Subjects follow a common disease progression but they have a different age at onset
and progression speed.
• The speed of progression of each subject is the same across the entire disease time-
course.
Biomarker trajectories estimated by the model for typical AD progression are shown
in Fig. 3.7. The model estimates the optimal shape2 of the biomarker trajectories, while
estimating a disease progression score for each subject, which is the stage along the disease
time course. The disease progression score sij for subject i at visit j is defined as a linear
transformation of age tij :
We further define I to be the set of all triplets (i, j, k) for which measurements are
available. Assuming independence across all measurements, we get the following model
conditional likelihood:
$$p(y|\alpha, \beta, \theta, \sigma) = \prod_{(i,j,k) \in I} p(y_{ijk}|\alpha_i, \beta_i, \theta_k, \sigma_k) \quad (3.14)$$
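To make the structure of Eq. 3.14 concrete, the sketch below evaluates its logarithm under several assumptions that are not spelled out in this section: the progression score is taken to be $s_{ij} = \alpha_i t_{ij} + \beta_i$, each biomarker follows a four-parameter sigmoid, and measurement noise is Gaussian with standard deviation $\sigma_k$. The sigmoid parameterisation, the record layout and the function names are hypothetical.

```python
import numpy as np

def sigmoid(s, theta):
    """Assumed four-parameter sigmoid: amplitude a, slope b, centre c, offset d."""
    a, b, c, d = theta
    return a / (1.0 + np.exp(-b * (s - c))) + d

def dps_log_likelihood(records, alpha, beta, theta, sigma):
    """Log of Eq. 3.14, assuming independent Gaussian noise per biomarker.

    records : iterable of (i, k, age, y), one entry per available measurement
              of biomarker k in subject i at a given visit (the set I)
    alpha, beta : per-subject progression speed and time shift
    theta, sigma : per-biomarker sigmoid parameters and noise standard deviation
    """
    total = 0.0
    for i, k, age, y in records:
        score = alpha[i] * age + beta[i]          # assumed linear transformation of age
        residual = y - sigmoid(score, theta[k])
        total += -0.5 * np.log(2.0 * np.pi * sigma[k] ** 2) - 0.5 * (residual / sigma[k]) ** 2
    return float(total)
```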
2
within a parametric family, in this case sigmoidal family
Algorithm 1: The optimisation procedure for the disease progression score by [2].
Figure 3.8: Biomarker trajectories estimated using the self-modelling regression approach
by [3]. Reproduced with permission from [3].
which is not true in many heterogeneous datasets such as ADNI. The DPS model can
also suffer identifiability issues when it attempts to stage very early-stage or late-stage
subjects, as in these time-windows the biomarker trajectories are mostly flat. This issue
can generally be addressed by setting priors on the time-shift and progression-speed of
the subjects.
Using the above residuals, the model is fit by initialising $\gamma_i$ and iterating the following
steps [3]:
1. Given $\gamma_i$, estimate $g_j$ by setting $\alpha_{0ij} = \alpha_{1ij} = 0$ and iterating the following subroutine:
(a) Estimate $g_j$ by a monotone curve fit on $R^{g}_{ij}(t)$.
(b) Estimate $\alpha_{0ij}, \alpha_{1ij}$ using the linear mixed model of $R^{\alpha}_{ij}(t)$.
Repeat steps (a) and (b) until convergence of each $RSS_j = \sum_{i,t} [Y_{ij}(t) - g_j(t + \gamma_i) - \alpha_{0ij} - \alpha_{1ij} t]^2$.
2. Given the estimated $g_j$, set $\alpha_{0ij} = \alpha_{1ij} = \varepsilon_{ij}(t) = 0$ and estimate each $\gamma_i$ with the
average of $R^{\gamma}_{ij}(t)$ over all $j$ and $t$.
Steps 1 and 2 are repeated until convergence of the
total $RSS = \sum_{i,j,t} [Y_{ij}(t) - g_j(t + \gamma_i) - \alpha_{0ij} - \alpha_{1ij} t]^2$.
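where
$$\begin{cases} \alpha_i = \exp(\eta_i), \quad \eta_i \sim \bigotimes_{i=1}^{p} \mathcal{N}(0, \sigma_\eta^2) \\ \tau_i \sim \bigotimes_{i=1}^{p} \mathcal{N}(0, \sigma_\tau^2) \\ \varepsilon_{i,j} \sim \bigotimes_{i,j} \mathcal{N}(0, \sigma^2) \end{cases} \quad (3.19)$$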
The model assumes that each ηi and τi are independent. The parameters of the model
are θ = [p0 , t0 , v0 , ση , στ , σ]. The model above can be re-written as:
The model has several strengths and weaknesses. The main strength lies in the flexible
Riemannian manifold framework, that allows one to create different models depending
on how the inner product is defined. Moreover, the model estimates subject specific
trajectories γi , time shifts τi and progression speeds αi . However, one of the limitations
of the model is that it assumes a parametric form of the biomarker trajectories (i.e.
sigmoidal).
Figure 3.9: Diagram of the voxelwise disease progression model by Bilgel et al. [4]. The
model places biomarker measurements along a latent “progression score” axis, and then
models the dynamics of these measurements using linear functions. Reproduced with
permission from [4].
where $a = [a_1, \dots, a_K]^T$, $b = [b_1, \dots, b_K]^T$ are the coefficients of the linear model and $\varepsilon_{ij}$
is the measurement noise that is independent and identically distributed across different
subjects and visits. The matrix $R(\lambda, \rho)$ is the spatial covariance that is assumed to have
the form $R = \Lambda C \Lambda$, where $\Lambda$ is a diagonal matrix with diagonal elements $\lambda$ and $C$ is a
correlation matrix that is parameterised by $\rho$ [4]. This ensures that the matrix $R(\lambda, \rho)$ is
positive definite. In order to model correlation among voxel measurements, the elements
$C_{kk'}$ of matrix $C$ must be a function of the distance $d \equiv d(k, k')$ between voxels $k$ and $k'$.
Several such options exist:
The model parameters are therefore θ = [m, ν, a, b, λ, ρ]. The model is a mixed effects
model where a, b are the fixed effects and ui are the random effects.
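As an illustration of the covariance structure $R = \Lambda C \Lambda$, the sketch below builds R from voxel coordinates using an exponential correlation function $C_{kk'} = \exp(-d/\rho)$. The exponential form is an assumption chosen for the example and is not necessarily the option used in [4]; the function name is hypothetical.

```python
import numpy as np

def spatial_covariance(coords, lam, rho):
    """Build R = Lambda C Lambda with an assumed exponential correlation C = exp(-d / rho).

    coords : (K, 3) array of voxel coordinates
    lam    : length-K vector of per-voxel standard deviations (diagonal of Lambda)
    rho    : positive correlation length
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances d(k, k')
    C = np.exp(-dist / rho)                      # correlation decays with distance
    Lam = np.diag(np.asarray(lam, dtype=float))
    return Lam @ C @ Lam
```

With distinct voxel locations and positive entries in `lam`, the resulting matrix is symmetric with positive eigenvalues, and its diagonal equals the squared entries of `lam`.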
E-step
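Let $(y, u)$ be the complete data and $\theta' = [m', \nu', a', b', \lambda', \rho']$ be the parameters
estimated at the previous EM iteration. Bilgel et al. [4] show that the E-step integral
$Q(\theta, \theta')$ is proportional to $\sum_i \int \Phi(\tilde{u}_i; \hat{u}'_i, \Sigma'_i)\, l(y_i, \tilde{u}_i; \theta)\, d\tilde{u}_i$, where $\Phi$ is a multivariate
normal probability density function with mean
$$\hat{u}'_i = \left( \sum_j Z_{ij}^{\prime T} R^{\prime -1} Z'_{ij} + V^{\prime -1} \right)^{-1} \left( \sum_j Z_{ij}^{\prime T} R^{\prime -1} (y_{ij} - b') + V^{\prime -1} m' \right) \quad (3.24)$$
and covariance matrix $\Sigma'_i = \left( \sum_j Z_{ij}^{\prime T} R^{\prime -1} Z'_{ij} + V^{\prime -1} \right)^{-1}$. Evaluating the integral gives
the following final form: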
3.6. Spatiotemporal Disease Progression Models 63
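$$Q(\theta, \theta') = -\frac{1}{2} \sum_{ij} \log|R| - \frac{1}{2} \sum_{ij} \left( y_{ij} - Z_{ij} \hat{u}'_i - b \right)^T R^{-1} \left( y_{ij} - Z_{ij} \hat{u}'_i - b \right) - \frac{1}{2} \sum_{ij} Tr\left( Z_{ij}^T R^{-1} Z_{ij} \Sigma'_i \right)$$
$$\qquad - \frac{1}{2} \sum_{i} \log|V| - \frac{1}{2} \sum_{i} \left( \hat{u}'_i - m \right)^T V^{-1} \left( \hat{u}'_i - m \right) - \frac{1}{2} \sum_{i} Tr\left( V^{-1} \Sigma'_i \right) \quad (3.25)$$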
M-step
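At the M-step we need to find $\theta = \arg\max_\theta Q(\theta, \theta')$. The full derivations are given
in [4], yielding the following updates:
$$a = \frac{\left( \sum_i \nu_i \right) \sum_{ij} y_{ij} s'_{ij} - \sum_{ij} y_{ij} \sum_{ij} s'_{ij}}{\left( \sum_i \nu_i \right) \sum_{ij} \left( q_{ij}^T \Sigma'_i q_{ij} + s_{ij}^{\prime 2} \right) - \left( \sum_{ij} s'_{ij} \right)^2} \quad (3.26)$$
$$b = \frac{\sum_{ij} y_{ij} \sum_{ij} \left( q_{ij}^T \Sigma'_i q_{ij} + s_{ij}^{\prime 2} \right) - \sum_{ij} y_{ij} s'_{ij} \sum_{ij} s'_{ij}}{\left( \sum_i \nu_i \right) \sum_{ij} \left( q_{ij}^T \Sigma'_i q_{ij} + s_{ij}^{\prime 2} \right) - \left( \sum_{ij} s'_{ij} \right)^2} \quad (3.27)$$
$$m = \frac{1}{n} \sum_i \hat{u}'_i \quad (3.28)$$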
Figure 3.10: Diagram of the cortical atrophy progression model by Koval et al. [5]. (top)
The model estimates a unique, linear trajectory for the dynamics of cortical thickness
measurements at each point on the brain cortical surface. (bottom) Subject-specific
trajectories ηi and ηj are modelled by a shift of the population trajectory γ0 through
vectors wi and wj . Reproduced with permission from [5].
authors only applied it to a brain surface made of 2,000 nodes, it is unclear whether the
model can scale to higher resolutions.
Figure 3.11: Diagram of the network diffusion model by Raj et al. [6]. The model uses
MRI and DTI data to extract a structural connectome from healthy subjects through
tractography, then computes a connectivity network. Each network is represented as a
graph where nodes represent brain ROIs where there is a certain concentration of toxic
pathogens and edges represent the connectivity strength. Using this matrix, the authors
estimate the eigenvectors of the graph, also called eigenmodes, which are then shown to
correlate with atrophy patterns in normal ageing, AD and bvFTD. More precisely, for
each disease they compute the amount of atrophy within each ROI corresponding to the
graph nodes, and then correlate with the eigenmodes. Reproduced with permission from
[6].
The network diffusion model was introduced by Raj et al. in 2012 [6] and later
extended in 2015 [199]. The model is inspired by evidence that Alzheimer’s disease
pathology spreads along vulnerable pathways in a prion-like manner rather than by spatial
proximity [200, 201, 202]. The model works by simulating the diffusion process of a
pathogenic protein along a structural connectivity graph from healthy controls. Atrophy
and other higher-level pathogenic processes are assumed to be a product of the lower-level
diffusion process. See Fig 3.11 for a diagram of the model.
$$\frac{dx(t)}{dt} = -\beta H x(t) \quad (3.33)$$
where H is the Laplacian matrix of G defined as:
$$H(i, j) = \begin{cases} \sum_{j' \neq i} c_{ij'} & \text{for } i = j \\ -c_{ij} & \text{otherwise} \end{cases} \quad (3.34)$$
We model the cortical atrophy in region k as the accumulation of the disease process:
$$\phi_k(t) = \int_0^t x_k(\tau)\, d\tau \quad (3.35)$$
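The diffusion and accumulation equations have a closed-form solution through the eigendecomposition of H. The sketch below assumes a symmetric connectivity matrix and the standard graph heat-equation sign convention $dx/dt = -\beta H x$, so that the total amount of pathology is conserved; the function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def graph_laplacian(C):
    """Laplacian of Eq. 3.34: row sums of off-diagonal weights on the diagonal, -c_ij elsewhere."""
    C = np.asarray(C, dtype=float)
    off_diag = C - np.diag(np.diag(C))
    return np.diag(off_diag.sum(axis=1)) - off_diag

def diffuse(H, x0, beta, t):
    """x(t) = U exp(-beta * Lambda * t) U^T x0, with H = U Lambda U^T (H symmetric)."""
    eigvals, U = np.linalg.eigh(H)
    return U @ (np.exp(-beta * eigvals * t) * (U.T @ np.asarray(x0, dtype=float)))

def atrophy(H, x0, beta, t):
    """phi(t): time integral of x from 0 to t (Eq. 3.35), computed mode by mode."""
    eigvals, U = np.linalg.eigh(H)
    rate = beta * eigvals
    nonzero = np.abs(rate) > 1e-12
    safe_rate = np.where(nonzero, rate, 1.0)
    # integral of exp(-rate * tau) over [0, t]; equals t for a zero-rate mode
    gain = np.where(nonzero, (1.0 - np.exp(-safe_rate * t)) / safe_rate, t)
    return U @ (gain * (U.T @ np.asarray(x0, dtype=float)))
```

For a two-node graph with unit connectivity and all pathology initially in one node, `diffuse` converges to an equal split between the nodes as t grows.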
Raj et al. [6] present evidence to suggest that the eigenmodes ui with the highest cor-
responding eigenvalues λi represent the areas that are normally affected by key neu-
rodegenerative processes or diseases, such as normal ageing, AD and behavioural variant
frontotemporal dementia (bvFTD) respectively. They suggest that these areas are selec-
tively vulnerable to these types of dementia, in line with previous theories in the field
[38, 205, 126].
The diffusion model by Raj et al. [6] has several advantages. In contrast with the models
presented above, it is able to model the propagation of atrophy along brain connectomes,
which can be used to test the prion hypothesis or other related mechanisms. Secondly, this
approach allows one to test for other hypotheses of network-based pathology spread such
as nodal stress, transneuronal spread, trophic failure, and shared vulnerability [126].
The model also has several limitations. It assumes static networks, even though
the network dynamically evolves during the time course of the disease. It also
assumes a parametric form of the biomarker trajectories, either exponential or sigmoidal.
These discriminative models also have several limitations. First of all, they generally
require labelled data, in the form of a-priori defined clinical categories or stages, which
are usually coarse, inaccurate and biased. These limit the temporal resolution of the
model. Moreover, it is also harder to interpret what these models learn from the data,
which limits their use for understanding the disease process. For some models there is
also a lack of mathematical proofs and guarantees regarding their convergence during
training, as well as behaviour while making predictions.
3.9 Summary
In Fig. 3.1 we show a summary of the main features of data-driven disease progression
models, as well as discriminative models. For each model, we show the trajectory
shape and indicate whether it incorporates latent subject-specific time-shifts (in terms
of intercept or intercept + progression speed), subject-specific trajectories in the form of
random effects, and spatial correlation. For each model, we also indicate the key
limitation.
We can observe several key differences between the models. In terms of time-shifts,
some models such as the DEM or the network diffusion model do not incorporate any
time-shifts, although these could be extended to incorporate such time-shifts. Other
models do not model subject-specific trajectories through random effects. Moreover, only
spatiotemporal or mechanistic models incorporate correlation between different biomarker
measurements.
In conclusion, several models of disease progression have been developed over the last
few years, starting from the early comparisons based on symptomatic groups and moving
on to more data-driven approaches and spatiotemporal models. Further work will focus on
developing more mechanistic models that enable understanding
of the underlying disease process, and can help guide drug development. One example
of this is the recent work of [223], which models the dynamics of pathogenic proteins
in a neural network and can help understand the effects of such pathogenic proteins in
neurodegeneration. However, validation of such models is required through in vitro and
in vivo studies.
In the following chapters, I will present the application of some of these models to
estimate the progression of Posterior Cortical Atrophy (chapter 4), as well as the devel-
opment of two novel models of disease progression (chapters 6 and 7).
Chapter 4
Longitudinal Neuroanatomical Progression of Posterior Cortical Atrophy
This chapter outlines the clinically applied part of my PhD, which focused on modelling
the progression of Posterior Cortical Atrophy using already developed methods. The con-
tent of this chapter is based on the neuroimaging results from the joint publication below,
where I’ve re-written most of the text for this thesis. I performed all the neuroimaging
work: image pre-processing, statistical analysis with EBM and DEM, and the interpreta-
tion of the results. The data from table 4.1 was gathered by Nicholas Firth. Splitting of
PCA patients into cognitively-defined subgroups was done by Silvia Primativo. Details
in section 4.3.1 regarding patient recruitment, patient numbers, clinical diagnosis and
pathological confirmation along with image acquisition details from section 4.3.2 were
taken from our joint publication.
4.1 Publications
• N. C. Firth*, S. Primativo*, R. V. Marinescu*, T. J. Shakespeare, A. Suarez-
Gonzalez, M. Lehmann, A. Carton, D. Ocal, I. Pavisic, R. W. Paterson, C. F.
Slattery, A. J. M. Foulkes, B. H. Ridha, E. Gil-Néciga, N. P. Oxtoby, A. L. Young, M.
Modat, M. J. Cardoso, S. Ourselin, N. S. Ryan, B. L. Miller, G. D. Rabinovici, E. K.
Warrington, M. N. Rossor, N. C. Fox, J. D. Warren, D. C. Alexander, J. M. Schott,
K. X. X. Yongˆ and S. J. Crutchˆ, Longitudinal neuroanatomical and cognitive
progression of posterior cortical atrophy, Brain, 2019. (*) joint first authors (ˆ)
joint senior authors
In the above manuscript, I preprocessed all the imaging data, performed the mod-
elling and statistical analysis of all the imaging data, and created the figures, tables
and diagrams (including statistical tests in the supplementary). I also drafted the
section of the results which was related to the imaging results. Other authors re-
cruited patients, collected the data, performed the analysis of cognitive tests, and
helped draft the initial version of the manuscript.
4.2 Introduction
Posterior Cortical Atrophy (PCA), already described in section 2.3, is a progressive neu-
rodegenerative syndrome causing predominantly visuospatial and visuoperceptual impair-
ments [18]. In order to understand complex disease mechanisms underlying PCA, and
design efficient clinical trials for finding treatments of PCA, we need to be able to accu-
rately estimate the temporal progression of atrophy in PCA and contrast it with typical
AD (tAD). Previous neuroimaging studies of PCA have shown more atrophy in the supe-
rior parietal, occipital and posterior temporal regions as compared to typical AD [20, 21].
However, these studies are all cross-sectional and cannot map the continuous longitudi-
nal progression of the disease. One longitudinal study of PCA [37] showed widespread
gray matter loss in both PCA and tAD, but the numbers were small (17 PCA and 16
tAD) and the time interval was short (1 year). Larger longitudinal studies are therefore
required to robustly estimate longitudinal progression patterns of PCA as compared to
tAD. Moreover, a second aspect that needs to be clarified is the heterogeneity within
PCA itself. Some studies have so far reported three dominant subgroups: primary visual
(the striate cortex, caudal), parietal (dorsal) and occipito-temporal (ventral) [31, 13].
However, evidence for the existence of these groups is mainly limited to individual case
reports [31, 13] and no previous study looked at the temporal progression of brain atrophy
in such subgroups.
The aim of this study is to estimate the progression of MRI brain volumes in PCA
as compared to tAD. We used the event-based model (EBM, section 3.5.1) and the
differential equation model (DEM, section 3.5.2) to estimate the progression of brain
volumes in 361 individuals (117 PCA, 106 tAD and 138 controls) from three centres in
the UK, Spain and US. We also use the event-based model to estimate the progression of
atrophy in three cognitively-defined PCA subgroups. Compared to previous studies, our
study is the first comprehensive study of atrophy progression in PCA. We also provide
the first glimpse into the early progression of atrophy within PCA subgroups.
4.3 Methods
4.3.1 Participants
117 patients with PCA were recruited from three specialist centres: 100 from the Demen-
tia Research Centre (DRC) UK, 9 patients from the University Hospital Virgen del Rocio
(HUVR) Memory disorders Unit, Spain and 8 patients from the University of California
San Francisco (UCSF) Memory and Aging Center, USA. All PCA participants met the two
widely-accepted sets of criteria of Tang-Wai et al. [147] and Mendez, Ghajarania & Perryman [146].
Participants had no clinical features of other neurodegenerative disorders (e.g. visual
hallucinations, pyramidal signs), hence fulfilling the criteria for PCA-pure [36]. 106 tAD
patients and 138 healthy controls recruited from the DRC UK were also used for this
study. tAD subjects all met criteria for probable AD [224]. Available pathological and
molecular analyses for the patients (45/117 = 38% for PCA, 49/106 = 46% for tAD) all
indicated AD pathology.

                  Imaging                                    Neuropsychology
Visits   Number   Age             Visit Interval    Number   Age             Visit Interval
PCA (n=117)
All      89       63.52 ± 6.91    N/A               109      64.49 ± 7.54    N/A
2        46       62.11 ± 6.52    1.03 ± 0.47       70       63.64 ± 7.32    1.18 ± 0.48
3        31       62.75 ± 6.5     0.99 ± 0.47       45       62.73 ± 7.26    1.15 ± 0.45
4        15       61.46 ± 4.44    0.86 ± 0.31       20       63.19 ± 7.00    1.14 ± 0.40
5        9        61.73 ± 4.06    0.81 ± 0.33       7        59.44 ± 4.84    1.06 ± 0.45
6        2        62.35 ± 1.65    0.83 ± 0.24       2        57.22 ± 3.49    1.02 ± 0.35
tAD (n=106)
All      66       66.39 ± 8.58    N/A               58       65.68 ± 7.57    N/A
2        37       66.84 ± 8.83    0.83 ± 1.46       28       64.58 ± 7.08    1.35 ± 0.56
3        21       71.0 ± 6.97     0.53 ± 0.39       5        66.08 ± 2.78    1.26 ± 0.43
4        14       70.89 ± 6.33    0.47 ± 0.33       0        N/A             N/A
5        4        72.08 ± 4.81    0.49 ± 0.33       0        N/A             N/A
6        1        79.9 ± 0.0      0.58 ± 0.40       0        N/A             N/A
Controls (n=138)
All      115      61.87 ± 10.43   N/A               49       63.12 ± 5.90    N/A
2        50       61 ± 12.01      0.79 ± 0.66       18       60.00 ± 5.87    0.91 ± 0.27
3        28       65.75 ± 5.96    0.66 ± 0.52       0        N/A             N/A
4        17       66.82 ± 4.88    0.45 ± 0.28       0        N/A             N/A
5        8        66.11 ± 4.83    0.44 ± 0.25       0        N/A             N/A
6        0        N/A             N/A               0        N/A             N/A

Table 4.1: Demographic details for participants in the PCA study. Number of participants
(n), mean and standard deviation of the age of participants at baseline visit, and mean and
standard deviation of the visit interval are shown per number of visits.
Of all the study participants, 270 had undergone at least one T1 MRI scan and 216
at least one cognitive assessment. Available neuroimaging and neuropsychology data,
stratified by the number of visits, are shown in table 4.1. PCA, tAD and healthy controls
were age-matched (65.44 ± 7.51 for PCA, 65.67 ± 7.57 for tAD and 63.13 ± 5.94 for
controls). The gender proportion was as follows: 39% male for PCA, 62% male for tAD
and 50% male for controls. PCA and tAD subjects had a similar level of impairment as
measured by MMSE scores at first assessment: 20.88 ± 5.17 for PCA, 19.58 ± 5.08 for
tAD and 29.02 ± 0.98 for controls.
For analysing the heterogeneity within PCA, we split the dataset into three groups
based on performance on a suite of cognitive tests. For each subject we computed the
Table 4.2: Baseline population demographics and neuropsychological data for PCA sub-
groups. For every neuropsychological test, we report the number of participants with
available data (n) and the mean and standard deviation of the available measures.
• Episodic memory: short recognition memory test (sRMT) for words and faces
The score for each of the four categories was computed by standardising each of
the sub-scores on a 0-100 scale, corresponding to the minimum and maximum values
obtained by the participants, and then taking the average within each category. The
subjects were then classified into three groups. The worst 1/3 of subjects (n=30) on the
early visual processing tests as compared to the memory tests (i.e. difference between
early visual and memory tests) were assigned to the vision subgroup. The remaining 2/3
of participants were split into two groups based on the difference between visuoperceptual
and visuospatial tasks: subjects with space < object performance (n=30) were assigned to
the space subgroup while remaining subjects (n=29) were assigned to the object subgroup.
Of all the PCA subjects selected for the subgroup analysis, only 23 (vision), 21 (space)
and 18 (object) had imaging data. Demographics and neuropsychological data of subjects
belonging to the PCA subgroups are shown in table 4.2.
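The subgroup assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, higher scores are assumed to mean better performance, the size of the vision subgroup is obtained by rounding n/3, and ties are broken by subject order. None of these details are specified in the text.

```python
def rescale_0_100(values):
    """Map raw sub-scores onto 0-100 using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def assign_subgroups(early_visual, memory, space, obj):
    """Assign each subject to 'vision', 'space' or 'object'.

    All inputs are per-subject category scores on the 0-100 scale
    (higher = better, an assumption), listed in the same subject order.
    """
    n = len(early_visual)
    # Difference between early visual and memory scores: the most negative
    # values mark subjects whose early vision is worst relative to memory.
    diff = [early_visual[i] - memory[i] for i in range(n)]
    order = sorted(range(n), key=lambda i: diff[i])
    n_vision = round(n / 3)  # rounding rule is an assumption
    vision = set(order[:n_vision])
    labels = []
    for i in range(n):
        if i in vision:
            labels.append("vision")
        elif space[i] < obj[i]:
            labels.append("space")
        else:
            labels.append("object")
    return labels


# Hypothetical scores for six subjects.
labels = assign_subgroups(
    early_visual=[10, 80, 20, 90, 70, 60],
    memory=[90, 70, 80, 60, 75, 65],
    space=[50, 30, 50, 70, 40, 80],
    obj=[50, 60, 50, 40, 45, 20],
)
```

Note that the second split is based on the sign of the space/object difference rather than on a median, which is why the space and object subgroups need not have equal sizes.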
Figure 4.1: Diagram of the Differential Equation Model. (a) Subject-specific biomarker
rates of change were measured from line of best fit, i.e. line slope. (b) Rate of change
model: the slopes of each fitted line were plotted against the average biomarker value of
each subject (blue crosses). A non-parametric model (Gaussian Process regression, green
line) was then fitted on measurements. (c) Trajectory reconstruction: A line integral
was performed on the rate of change model. (d) Anchoring process: to give an absolute
time reference, the origin t0 was set as the line that best separates controls from patients,
which have been staged along the time axis using their biomarker data. Diagram made
by me.
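Steps (a) and (c) of the diagram can be sketched in a few lines. This is only an illustration under assumptions: the Gaussian Process rate model of step (b) is replaced by a hypothetical closed-form rate function, the trajectory is reconstructed with simple Euler steps, and all names and values are invented for the example.

```python
def slope(times, values):
    """Step (a): least-squares slope of biomarker value against time for one subject."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def reconstruct_trajectory(rate, x_start, dt, steps):
    """Step (c): integrate dx/dt = rate(x) forward in time from x_start."""
    xs = [x_start]
    for _ in range(steps):
        xs.append(xs[-1] + dt * rate(xs[-1]))
    return xs


# One hypothetical subject with three visits, one year apart.
subject_slope = slope([0.0, 1.0, 2.0], [1.0, 0.5, 0.0])

# Hypothetical rate-of-change model standing in for the fitted GP:
# slow decline near x = 1 and x = 0, fastest decline in between.
def rate(x):
    return -0.3 * x * (1.0 - x)

trajectory = reconstruct_trajectory(rate, x_start=0.9, dt=0.1, steps=200)
```

With this rate function the reconstructed trajectory is a decreasing sigmoid, which is the kind of shape the anchoring step (d) then places on an absolute time axis.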
For estimating uncertainty within the EBM sequence, we used MCMC to take 100,000
samples of the event sequence, starting from the maximum likelihood solution. The
perturbation rule used is described in detail in section 3.5.1.2.
1 The staging of subjects using all their data required an initial trajectory alignment, which we
performed by initially setting t0 to be the mean biomarker value of patients at baseline.
We used these non-parametric tests due to the non-Gaussianity of the data (the data is ordinal,
representing ranks). We used different tests (Wilcoxon vs Mann-Whitney) because in one case
we compare paired samples (two events within the same sequence sample), and in the other
unpaired samples (two events in different sequences, e.g. in a randomly sampled PCA
sequence vs a different randomly sampled tAD sequence). We also thinned
the MCMC samples (1/100) due to dependence between consecutive samples.
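The difference between the paired and unpaired settings can be made concrete with a small sketch that thins a list of samples and computes the two test statistics. It is an illustration only: the function names and the example positions are hypothetical, and only the statistics are computed, not their p-values.

```python
def thin(samples, step=100):
    """Keep every `step`-th MCMC sample to reduce dependence between samples."""
    return samples[::step]


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(a, b):
    """Paired statistic: sum of the ranks of |a - b| over pairs with a > b.

    Pairs with zero difference are dropped before ranking.
    """
    d = [x - y for x, y in zip(a, b) if x != y]
    ranks = average_ranks([abs(v) for v in d])
    return sum(r for r, v in zip(ranks, d) if v > 0)


def mann_whitney_u(a, b):
    """Unpaired statistic: number of (a_i, b_j) pairs with a_i > b_j, ties counting 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical positions of two events across four thinned sequence samples.
pos_a = [3, 1, 4, 2]
pos_b = [1, 2, 2, 2]
w_plus = wilcoxon_signed_rank(pos_a, pos_b)   # paired: same sequence sample
u_stat = mann_whitney_u(pos_a, pos_b)         # unpaired: different sequences
```

The paired statistic uses the within-sample differences, whereas the unpaired statistic compares every sample of one group against every sample of the other.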
For DEM results, we tested for differences in estimated biomarker values at different
timepoints (-10, 0 and 10 years from t0 ) both within- and between-groups. For every pair
of ROIs, within-group differences were assessed using two-tailed unpaired t-tests. For
all ROIs and timepoints, between-group (PCA vs tAD) differences were assessed using
similar two-tailed t-tests. For rejecting null hypotheses, we applied Bonferroni-corrected
significance thresholds for all tests performed on EBM and DEM results.
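As a minimal sketch of the Bonferroni correction mentioned above, the significance level is divided by the number of tests performed. The function name and the base level of 0.05 are assumptions, since the text does not state the uncorrected threshold.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return, for each test, whether p < alpha / m, with m the number of tests."""
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values]


# Three hypothetical p-values: the corrected threshold is 0.05 / 3.
decisions = bonferroni_reject([0.001, 0.02, 0.04])
```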
4.4 Results
4.4.1 Progression of PCA and Typical AD
Fig. 4.2 shows the maximum likelihood progression of atrophy estimated by the EBM, for
both PCA and tAD patients. Snapshots of brain atrophy were taken at model stages 4,
8, 16, 24, 32, 40 and 46 (of 46) using the template from Supplementary Fig. A.1. Figure
4.3 shows the maximum likelihood sequence and the variance in the main sequence. PCA
patients show early atrophy in occipital areas, ventricles and the superior parietal region,
while tAD patients show early atrophy in the amygdala, hippocampus and entorhinal
cortex, followed by temporal areas. The ordering is largely preserved under bootstrapping
(Supplementary Fig. A.2), and supported by statistical testing (Supplementary Fig.
A.3). Differences in abnormality sequences between PCA and tAD are also statistically
significant under Bonferroni corrections (Supplementary Figure A.7).
Fig. 4.4 shows the DEM-estimated biomarker trajectories for PCA (left) and tAD
(right). Confidence estimates of the mean trajectory are also given in Fig. 4.5. Amongst
PCA patients, occipital and parietal atrophy was most evident before t0 , and by t0 we
also observe considerable atrophy in the temporal lobe. Between t0 and 10 years after
t0 , we observe a marked increase in the rate of occipital, parietal and temporal atrophy
and ventricular expansion. By contrast, hippocampal, entorhinal and frontal atrophy
never match the extent of tissue loss in posterior and temporal regions. After 10 years
from t0 , atrophy rates in occipital, parietal and temporal lobes seem to slow down, but
limited data in this time window prevents drawing any clear conclusions. Statistical
testing within the PCA cohort also confirms our conclusions – see Supplementary Tables
A.1, A.2 and A.3.
By contrast, before t0 tAD patients showed most extensive tissue loss in the hippocam-
pus, confirmed by significance tests between hippocampal volume and other regions (p <
4e-05, see Supplementary Figs. A.4 and A.5). After t0 , subsequent rates of change are the
highest for temporal atrophy and ventricular expansion. It is of note that within 12 years
from t0 , model estimates of parietal and ventricular abnormality amongst tAD patients
are equivalent to or exceed the relative extent of hippocampal abnormality. Comparing
PCA and tAD trajectories directly (Fig. 4.5), the separation between groups at t0 is
greatest in parietal (PCA > tAD, p < 1e-6) and hippocampal (tAD > PCA, p < 1e-22)
volumes – see Supplementary table A.7 for full statistical testing.
Figure 4.2: Atrophy progression in PCA and tAD patients according to the event-based
model. White regions are within the volume range of healthy controls, while red regions
show abnormally low volumes by the corresponding stage, with shading indicating the
probability of abnormality. By each stage, a number of biomarkers shaded in red became
abnormal. Brain pictures generated using BrainPainter [227]
Figure 4.3: Uncertainty in the EBM-estimated atrophy sequences for (top) PCA and
(bottom) tAD from Fig 4.2. The ROIs on the Y-axis are ordered according to the timing
of abnormality, from early abnormalities on the top to late abnormalities on the bottom.
The X-axis shows the position of a biomarker in the abnormality sequence. Each pixel at
position (i, j) shows the probability of biomarker j becoming abnormal at position i, with
darker squares showing higher confidence and whiter squares showing lower confidence.
The biomarker orderings are sampled from the EBM posterior distribution.
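The matrix shown in such a figure can be computed from sampled sequences by counting how often each biomarker occupies each position. The sketch below assumes that each sample is a permutation of biomarker indices ordered by position; the function name and example samples are hypothetical.

```python
def positional_variance(samples, n_events):
    """Fraction of sampled sequences in which each event occupies each position.

    samples: list of sequences, each a permutation of range(n_events),
             where seq[i] is the biomarker placed at position i.
    Returns a matrix m with m[i][j] = P(biomarker j becomes abnormal at position i).
    """
    counts = [[0] * n_events for _ in range(n_events)]
    for seq in samples:
        for pos, event in enumerate(seq):
            counts[pos][event] += 1
    return [[c / len(samples) for c in row] for row in counts]


# Four hypothetical sampled orderings of three biomarkers.
samples = [[0, 1, 2], [0, 2, 1], [0, 1, 2], [1, 0, 2]]
matrix = positional_variance(samples, n_events=3)
```

Each row and each column of the resulting matrix sums to one, since every position holds exactly one biomarker and every biomarker has exactly one position in each sample.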
[Figure 4.4 plot legend: Hippocampus, Entorhinal, Whole Brain, Frontal, Parietal, Occipital, Ventricles, Temporal]
Figure 4.4: (a-b) Trajectories of different ROI volumes from the differential equation
model for (a) PCA progression and (b) tAD progression. The x-axis shows the number
of years since t0 , and the y-axis shows the z-score of the ROI volume relative to controls.
The trajectories of the ventricles have been flipped to aid comparison. Overlayed are
histograms of subject stages based on the estimated trajectories.
[Figure 4.5 panel titles recovered from the plot: Occipital, Temporal, Frontal, Parietal; x-axis: Years since t0]
Figure 4.5: Mean trajectories for ROI volumes for PCA and tAD aligned on the same
temporal scale with samples from the posterior distribution showing the confidence of the
mean trajectory. The x-axis shows the number of years since t0 , and the y-axis shows the
z-score of the ROI volume relative to controls. The trajectories for the ventricles have
been flipped to aid visual comparison. The 1 std and 0 std horizontal lines represent the
limit of 1 and 0 standard deviations away from the mean values of controls.
Figure 4.6: Early atrophy progression within the three cognitively-defined PCA subgroups,
as estimated by the EBM. The top figures show snapshots of the atrophy patterns
for the first 7 stages in the EBM, while the last row shows the uncertainty in the
atrophy progression sequence. Brain pictures generated using BrainPainter [227].
4.5 Discussion
In this work we performed one of the first longitudinal studies of atrophy progression
in PCA. Results suggest that in PCA occipital and superior parietal areas are the first
to become affected, followed by temporal areas. By 10 years after t0 , there seems to be
widespread atrophy in the occipital, parietal and temporal areas, as well as ventricular
expansion. In contrast, tAD seems to have significant early atrophy in the hippocampus,
with subsequent temporal atrophy and ventricular expansion starting 5 years after t0 .
Regarding PCA heterogeneity, our study also provided the first glimpse into the early
longitudinal patterns of atrophy within three cognitively defined PCA subtypes. We
found early phenotype-specific patterns of atrophy within each cognitively-defined PCA
subgroup. These patterns of pathology overlap with the pathways that are hypothesised
to be affected within each group: striate cortex for the vision subgroup, dorsal pathway
for the space subgroup and ventral pathway for the object subgroup. Nonetheless, among
the subgroups there is considerable variability in these patterns as well as spatial overlap,
which might suggest that these should not necessarily be interpreted as distinct diseases,
but rather that the patients lie on a continuum of phenotypical variation, as suggested
by [144].
Our study has several strengths. First of all, the large number of PCA subjects with
longitudinal neuroimaging and cognitive data allowed us to perform a robust analysis of
PCA atrophy progression. The EBM and DEM methods we used are all data-driven,
don’t require manual biomarker thresholds and don’t rely on diagnostic classes, which
are often noisy and biased. Moreover, the ability of the EBM to work with limited
cross-sectional data allowed us to estimate the progression of PCA subgroups, which are
small and have limited longitudinal data available. An advantage of the DEM method is
its ability to fit continuous, non-parametric biomarker trajectories based on GPs, which
makes it suitable for modelling biomarkers whose trajectories have varying shapes.
Nevertheless, our study has several limitations that need to be addressed. First of
all, since data was acquired over an extended period of time, not all subjects had CSF,
molecular or pathological confirmation for Alzheimer’s disease. This can be a problem,
as previous studies suggested that at least half of patients who receive a diagnosis of
probable AD actually have other non-AD underlying pathologies [228, 229]. Follow-up
studies will need to have a higher proportion of patients with pathological or molecular
confirmation. Moreover, the data was acquired in three different centres using different
scanners and field strengths, although we adjusted for these covariates.
The EBM and DEM models that we employed also have several limitations that
we acknowledge. First of all, both methodologies assume all subjects follow the same
progression sequence. Secondly, the DEM requires longitudinal data, which prevented
us from fitting the DEM to the PCA subgroups, who lacked enough longitudinal data.
Another assumption made by the EBM is that the control population is well-defined, as
we fit the distribution of normal biomarker values directly on the biomarker values of the
control population. The EBM also assumes simplistic, step-wise biomarker trajectories
that switch from a normal to an abnormal value. With respect to the DEM, the approach
requires a reference timepoint, which we took to be the threshold that best separates
the controls from patients after disease staging.
There are several avenues for future research. Further molecular and pathological
confirmation can be obtained for the remaining patients to ensure they all have a reliable
diagnosis, which will enable an unbiased estimation of the progression sequence. The
EBM and DEM methodologies can be further extended to allow random effects or to fit
different progression sequences for different sub-populations in a data-driven way, such
as the approach of [230]. Information about the rate and extent of atrophy in the PCA
subgroups can also be computed after enough data has been acquired. A well-defined
control population for the EBM can also be defined by selecting only amyloid-negative
subjects or by other types of stratification. The EBM model can be extended to model
more complex trajectory shapes, while the DEM can be further extended to a multivariate
approach that inherently aligns the biomarker trajectories.
Finally, one of the key directions of future research is to understand the disease mech-
anisms underlying PCA. To this end, several methods can be used to estimate these
mechanisms, such as those based on propagation of pathogenic proteins [6, 223] or the
architecture of brain networks [126]. The influence of genetic factors such as Apolipoprotein
E (APOE) status [150, 151] and other factors recently identified [231, 150] from
genome-wide association studies also needs to be understood. This research will lead the way
towards drug development in PCA clinical trials and will allow the selection of robust
outcome measures and fine-grained patient stratification in clinical trials in PCA.
4.6 Conclusion
In this work I performed a statistical analysis of the neuroimaging data from PCA and
tAD subjects from the DRC, HUVR and UCSF centres. I pre-processed all the MRI
images and applied the event-based model and the differential equation model on the
PCA and tAD cohorts, as well as on three cognitively-defined PCA subgroups. The
analysis I made gives the first glimpse into the longitudinal progression of atrophy in
PCA subjects, and into the early longitudinal patterns of atrophy in the vision, space
and object subgroups.
In the following chapter, I will present some novel extensions to the EBM and DEM
models that will enable better estimation of the parameters for the EBM and alignment of
the biomarker trajectories for the DEM. These improvements can provide a more accurate
disease signature, and remove the need for ad-hoc methods of estimating parameters.
Chapter 5
Novel Extensions to the EBM and DEM
5.1 Contributions
In this work I present methodological extensions to the event-based model (EBM) and
differential equation model (DEM) and I evaluate their performance compared to the
standard implementations. In order to assess differences between these methods more
accurately, I also propose novel performance measures based on disease staging consis-
tency and prediction of time elapsed between visits. I formulated and implemented the
novel methodologies, and performed their evaluation. I also pre-processed the DRC MRI
scans. My colleague Alexandra Young pre-processed the ADNI data.
5.2 Introduction
Many data-driven disease progression models (DPMs) that have been presented in chap-
ter 3 make assumptions about the biomarker data and the model parameters, which limit
their usefulness on practical applications. For example, the differential equation model
by [25] is univariate, hence it assumes independence across different biomarkers. In order
to place biomarker trajectories on the same time frame, in the previous chapter we used a
post-hoc anchoring process (see section 4.3.3.2). This anchoring is inaccurate, as it relies
on setting the reference time t0 using biomarker values of a clinical group (i.e. controls
or AD). This anchoring process is challenging because of singularities arising from flat
trajectories¹, and the fact that subjects are at different stages along the disease. Another
limitation of some DPMs is that the fitting algorithm assumes independence between dif-
ferent sets of parameters. While this is done in order to ensure computational tractability,
this yields inaccurate parameter estimates. In particular, the event-based model param-
eter estimation procedure proposed by [23] and [24] assumes that the parameters of the
likelihood models for normal and abnormal values are independent of the abnormality
1 The alignment is performed by setting t0 = 0 so that f (t0 ) = mean(patients). However, if the
trajectory is flat then there are many points t0 that match the mean of patients. Even if the trajectory
is not fully flat, the measurement noise is amplified by the low slope of f .
sequence. Some better parameter estimation procedures are therefore needed, which can
ensure a robust data fit.
The evaluation of the performance of disease progression models is another open problem
that has not been addressed so far. While previous studies used accuracy of clinical
status predictions [24], clinical diagnosis is often not reliable without neuropathological
confirmation: one study reported that a clinical diagnosis of probable AD has between
70.9% and 87.3% accuracy and between 44.3% and 70.8% specificity. Therefore, performance
metrics based on the prediction of clinical diagnosis might not be sufficiently sensitive to
differences in the performance of such algorithms. While [23] computed the number of
subjects with increased staging over time – a performance measure that doesn’t rely on
clinical diagnosis – it does not take model uncertainty of staging into account and it is
specific to discrete models such as the event-based model.
In this chapter we suggest novel extensions to the event-based model and differential
equation model and propose four novel performance measures for evaluating disease
progression models that don’t rely on clinical diagnosis. For the event-based model, we
devise two novel fitting procedures that perform joint optimisation of the parameters of
the normal and abnormal likelihood models, as well as the abnormality sequence. For
the differential equation model, we devise a novel data-driven way to align the biomarker
trajectories to a common axis by estimating trajectory-specific and subject-specific time
shifts. The novel performance measures that we propose exploit uncertainty in the es-
timated stages and are also suitable for evaluating continuous trajectory models. Using
data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Dementia
Research Centre (DRC), UK, we show that the novel models generally have better or
equal performance compared to standard models. Moreover, we also show that the novel
performance measures that we proposed are more sensitive to changes in models than
standard measures based on the prediction of diagnosis or conversion status.
5.3 Methods
5.3.1 EBM Extensions
In this section we outline two novel methods of parameter fitting for the event-based
model: a blocked MCMC sampling of the distribution parameters and event ordering
(section 5.3.1.1), and an Expectation-Maximisation approach (section 5.3.1.2). Further-
more, we also present a novel methodology performing a data-driven temporal alignment
of the differential equation model trajectories (section 5.3.2).
where xij represents the value of biomarker j from subject i and is informative of event Ej
in subject i, P is the number of subjects and N is the number of biomarkers. The abnor-
We maximise this likelihood using blocked MCMC sampling, where at each step we
only propose parameters for biomarker j, i.e. $[\mu_j^n, \sigma_j^n, \mu_j^a, \sigma_j^a]$, along with a new sequence
$S_j^{new}$ where only event $E_j$ changed its position. The distribution parameters for the
other biomarkers and the ordering of the other events $i \neq j$ are kept the same. This
blocked approach can lead to faster convergence because there is strong dependence
between parameters corresponding to the same biomarker and between the position of the
corresponding event in the sequence. The covariance matrix of the proposal distribution
is estimated by taking 100 bootstraps of the dataset and computing the covariance of
$[\mu^c, \sigma^c, \mu^p, \sigma^p]$, where $\mu^c, \sigma^c$ are the mean and standard deviation of the control group
while $\mu^p, \sigma^p$ are the mean and standard deviation of the patient group.
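One blocked update of this kind can be sketched as below. This is a simplified illustration, not the implementation used in this work: the function names and data layout are hypothetical, the proposal uses independent Gaussian steps of a fixed size instead of the bootstrap-estimated covariance, and the log-likelihood is passed in as an arbitrary function.

```python
import math
import random


def propose_block(params, seq, j, rng, step=0.1):
    """Propose new parameters for biomarker j and a new position for event j.

    params: dict mapping biomarker -> [mu_n, sigma_n, mu_a, sigma_a]
    seq:    list of biomarker indices in order of abnormality
    """
    new_params = {k: list(v) for k, v in params.items()}
    # Perturb only the four parameters of biomarker j (independent Gaussian
    # steps here; an assumption replacing the bootstrap covariance).
    new_params[j] = [v + rng.gauss(0.0, step) for v in params[j]]
    new_params[j][1] = abs(new_params[j][1])  # keep standard deviations positive
    new_params[j][3] = abs(new_params[j][3])
    # Move event j to a random position, keeping the order of the other events.
    new_seq = [e for e in seq if e != j]
    new_seq.insert(rng.randrange(len(seq)), j)
    return new_params, new_seq


def metropolis_step(params, seq, j, log_lik, rng):
    """Accept or reject one blocked proposal using the Metropolis rule."""
    new_params, new_seq = propose_block(params, seq, j, rng)
    log_ratio = log_lik(new_params, new_seq) - log_lik(params, seq)
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return new_params, new_seq
    return params, seq
```

Cycling j over all biomarkers gives one full sweep; the parameters and ordering of every biomarker other than j are left untouched by each step.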
$$Q(\theta|\theta^{old}) = \sum_{i=1}^{P} \sum_{z_i} p(Z_i = z_i | X_i, \theta^{old}) \left[ \sum_{j=1}^{z_i} \log p(x_{ij} | E_{S(j)}) + \sum_{j=z_i+1}^{N} \log p(x_{ij} | \neg E_{S(j)}) \right] \tag{5.3}$$
We find the maximum for $\mu_k^n$, the mean of $p(x|\neg E_k)$, by solving $\frac{d}{d\mu_k^n} Q(\theta|\theta^{old}) = 0$.
This gives the following update equation for $\mu_k^n$:
$$\mu_k^n = \sum_{i=1}^{P} x_{ik} w_i^n \tag{5.4}$$
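The update in Eq. 5.4 is a weighted mean, which can be sketched as follows. The weights used here are an assumption of this sketch: each subject is weighted by the posterior probability that event k has not yet occurred, normalised across subjects, which is what setting the derivative to zero yields when $p(x|\neg E_k)$ is Gaussian. The function name and data layout are hypothetical.

```python
def update_mu_normal(x_k, stage_post, pos_k):
    """Weighted-mean update of Eq. 5.4 for the normal-distribution mean of biomarker k.

    x_k:        x_k[i] is the value of biomarker k for subject i
    stage_post: stage_post[i][z] = p(Z_i = z | X_i, theta_old), for z = 0..N
    pos_k:      1-based position of event E_k in the sequence S
    """
    # Event k has not yet occurred for stages z < pos_k, so only those stages
    # contribute to the normal-likelihood term of Q.
    raw = [sum(post[:pos_k]) for post in stage_post]
    total = sum(raw)
    weights = [r / total for r in raw]  # assumed form of w_i^n, summing to 1
    return sum(x * w for x, w in zip(x_k, weights))


# Two hypothetical subjects and N = 2 events (stages 0, 1, 2).
mu_first = update_mu_normal([1.0, 3.0], [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]], pos_k=1)
mu_second = update_mu_normal([1.0, 3.0], [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]], pos_k=2)
```

Subjects who are probably past event k contribute little to the estimate of the normal mean, which is how the sequence and the distribution parameters become coupled in the fit.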
Assuming the data from each subject is conditionally independent given $z_p$, we get
the full likelihood:
$$p(X|t_1, \ldots, t_B, \sigma_1, \ldots, \sigma_B) = \prod_{p=1}^{P} \sum_{z_p} p(Z_p = z_p) \prod_{b=1}^{B} N(x_{pb} | f_b(z_p - t_b), \sigma_b) \tag{5.8}$$
This likelihood can be optimised with any method of choice such as MCMC sampling
or gradient methods. We chose to optimise the model using an iterative approach, where
for each biomarker b we optimise its trajectory shift $t_b$ conditioned on all the other
parameters (Markov blanket), and then estimate its measurement noise $\sigma_b$.
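The likelihood in Eq. 5.8 can be evaluated in log space as sketched below. This is an illustration under stated assumptions: the stages are discretised on a grid, $\sigma_b$ is treated as a standard deviation, and the function name `alignment_log_likelihood` and the logistic toy trajectories are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def alignment_log_likelihood(X, stages, stage_prior, trajectories, shifts, sigmas):
    """Log of Eq. 5.8 on a discrete grid of stages.

    X:            (P, B) biomarker values for P subjects and B biomarkers
    stages:       (Z,) grid of candidate stages z_p
    stage_prior:  (Z,) prior probabilities p(Z_p = z_p)
    trajectories: list of B callables f_b
    shifts:       (B,) trajectory shifts t_b
    sigmas:       (B,) noise standard deviations sigma_b (assumed meaning)
    """
    # pred[z, b] = f_b(z - t_b): predicted value of biomarker b at stage z
    pred = np.stack([f(stages - t) for f, t in zip(trajectories, shifts)], axis=1)
    # log_pdf[p, z, b] = log N(x_pb | f_b(z - t_b), sigma_b)
    log_pdf = norm.logpdf(X[:, None, :], loc=pred[None, :, :], scale=sigmas)
    # Product over biomarkers becomes a sum; add the log prior and
    # marginalise over stages with log-sum-exp for numerical stability.
    per_stage = np.log(stage_prior)[None, :] + log_pdf.sum(axis=2)
    return logsumexp(per_stage, axis=1).sum()

def logistic(s):
    return 1.0 / (1.0 + np.exp(-s))

stages = np.linspace(-10, 10, 41)
prior = np.full(stages.size, 1.0 / stages.size)
X = np.array([[0.2, 0.1], [0.9, 0.7]])           # toy data: 2 subjects, 2 biomarkers
ll = alignment_log_likelihood(X, stages, prior, [logistic, logistic],
                              np.array([0.0, 2.0]), np.array([0.1, 0.1]))
```

The iterative scheme described above would call such a function repeatedly while varying a single shift $t_b$ and holding all other parameters fixed.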
$$C_h = \frac{1}{-N + \sum_{i=1}^{N} T_i} \sum_{i=1}^{N} \sum_{t=2}^{T_i} I[M_t^i > M_{t-1}^i] \tag{5.9}$$
where S is the set of possible stages in the disease progression model. We then define
the soft staging consistency for the whole population as the mean of subject-specific
consistencies for consecutive timepoints:
$$C_s = \frac{1}{-N + \sum_{i=1}^{N} T_i} \sum_{i=1}^{N} \sum_{t=2}^{T_i} C_s^i(t_1, t_2) \tag{5.11}$$
where $a_t^i$ is the age of subject i at timepoint t, $M_t^i$ is the maximum likelihood stage for
subject i at timepoint t and $\tau(M_t^i)$ is the estimated time from onset associated with stage
$M_t^i$. The equivalent soft time-lapse metric $D_s$, which uses probabilistic staging variables
$z_t^i$, is defined as:
$$D_s = \frac{1}{-N + \sum_{i=1}^{N} T_i} \sum_{i=1}^{N} \sum_{t=2}^{T_i} E[\tau(z_t^i) - \tau(z_{t-1}^i)] - (a_t^i - a_{t-1}^i) \tag{5.13}$$
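As an illustration of how these population-level metrics are computed, the hard staging consistency of Eq. 5.9 can be sketched as follows. The function name `hard_staging_consistency` and the toy stage sequences are assumptions made for the example; the strict inequality follows Eq. 5.9 as printed, so a repeated stage counts as inconsistent.

```python
import numpy as np

def hard_staging_consistency(stages_per_subject):
    """Eq. 5.9: fraction of consecutive visit pairs with M_t > M_{t-1}.

    stages_per_subject: list with one sequence of maximum-likelihood stages
    per subject, ordered by visit.
    """
    # Denominator -N + sum_i T_i is the number of consecutive visit pairs.
    n_pairs = sum(len(s) - 1 for s in stages_per_subject)
    n_consistent = sum(
        int(np.sum(np.diff(np.asarray(s)) > 0)) for s in stages_per_subject
    )
    return n_consistent / n_pairs

# Three toy subjects with 3, 2 and 4 visits respectively.
c_h = hard_staging_consistency([[1, 2, 4], [3, 2], [5, 6, 6, 7]])
```

Here four of the six consecutive pairs are strictly increasing, so `c_h` equals 4/6.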
Demographics        CN           PCA          AD
Number              89           74           67
Sex (M/F)           33/56        28/46        35/32
Age (years)         61 ± 11      63 ± 7       66 ± 9
Years from onset    -            4.5 ± 2.8    4.8 ± 2.6
Number of visits    2.8 ± 2.5    2.5 ± 1.7    3.0 ± 2.7
middle and superior occipital and the occipital fusiform and lingual; (e) 5 parietal regions:
superior parietal, angular, precuneus, supramarginal and postcentral; (f) 4 temporal
regions: inferior, middle and superior temporal along with fusiform; (g) 4 frontal regions:
superior, middle and inferior frontal along with precentral; and (h) 3 limbic regions:
entorhinal, parahippocampal and posterior cingulate.
Demographics         CN            MCI         AD
Number               92            129         64
Sex (M/F)            48/44         82/47       34/30
Age (years)          75 ± 5        73 ± 7      75 ± 8
Education (years)    15.6 ± 2.9    15.9 ± 3    15 ± 3
APOE (+/-)           22/70         72/57       45/19
FreeSurfer Version 4.3 was used to compute regional volumes of the hippocampus, en-
torhinal cortex, middle temporal gyrus, fusiform gyrus, ventricles, whole brain and total
intracranial volume (TIV) at baseline, 12- and 24-month follow-up. All regional volumes
were normalised for each subject by dividing by TIV. Atrophy rates for the whole brain
and hippocampus were estimated using the Boundary Shift Integral (BSI) ([232]) from
the scans at baseline and 12-month follow-up. In particular, volume change for the
whole brain was measured using the KN-BSI method ([233]) and for hippocampus using
the MAPS-HBSI method ([234]).
We used the same biomarker set as the one used by [24], which included 14 biomark-
ers in total: (a) three CSF biomarkers: amyloid-β1−42 , phosphorylated tau and total
tau; (b) 3 cognitive tests: Alzheimer’s Disease Assessment Scale - Cognitive Subscale
(ADAS-Cog), Rey Auditory Verbal Learning Test (RAVLT) and the Mini-Mental State
Examination (MMSE); (c) six regional brain volumes: whole brain, ventricles, hippocam-
pus, entorhinal, middle temporal gyrus and fusiform gyrus; (d) rates of atrophy for two
ROIs: hippocampus and whole brain.
5.4 Results
We tested all novel EBM and DEM methods, along with their standard implementations.
We evaluated each model using the staging consistency and time-lapse metrics, using
data from the DRC and ADNI datasets. On the DRC dataset, we also evaluated the
models with respect to diagnosis prediction, while on ADNI we evaluated them based on
prediction of conversion from healthy controls to mild cognitive impairment (MCI) and
from MCI to Alzheimer’s disease.
Table 5.3: Model performance according to staging-based metrics on PCA subjects from
the DRC cohort. The mean and standard deviations are calculated for each testing set
in 10-fold cross-validation.
Table 5.5: Model performance at diagnosis prediction on the DRC cohort. Each en-
try shows the mean and standard deviation of the balanced accuracy across the cross-
validation folds.
In table 5.6 we show the staging-based performance results of the progression models
on the ADNI dataset. As with the DRC results, for each metric we show its mean and
standard deviation over the 10 cross-validation folds. In table 5.7 we also evaluated the
models on how well they predict conversion from MCI to AD at 12-months, 24-months and
36-months from baseline visit. We did not compute results for prediction of conversion
status in controls due to small and very imbalanced datasets (i.e. only ).
5.5 Discussion
5.5.1 Model Performance on DRC cohort
In the PCA cohort, we notice that the extended EBM methods show better results
compared to the standard EBM method, whereas the extended DEM method has equal
performance compared to the standard method. When comparing EBM vs DEM models,
most EBM models perform as well as the DEM models in terms of hard staging consistency
but relatively worse in soft staging consistency. There is also a drop in EBM staging
consistency when moving from the hard to the soft staging consistency, which can be
explained by the discrete nature of the EBM and by the simplistic biomarker trajectories,
effectively modelled as step-functions, which can result in significant staging uncertainty.
In the AD cohort we again find that the novel EBM methods show improvements
over standard methods, while there is no significant difference between the novel and
standard DEM methods. When comparing EBMs vs DEMs, we notice that the EBM
models actually perform better in terms of hard-, but worse in soft-staging consistency.
This could again be due to overly simplistic EBM trajectories that might not offer a good
fit to the data.
In the diagnosis prediction tasks, most disease progression models have similar per-
formance, with only the Standard EBM having a low performance in the PCA vs AD
test. The SVM classifier has slightly worse results compared to the disease progression
models for the Controls vs PCA task, but similar results for the other tasks.
In the ADNI cohort, we notice that the extended EBM and DEM methods have similar
performance to the standard methods. There is again a drop in EBM performance on the
soft consistency metric as compared to the hard consistency. The fact that there is no im-
provement in ADNI data between the novel methods and the standard methods suggests
that the standard methods already offered a good fit on this dataset, and further that the
ADNI dataset has different characteristics compared to the DRC dataset. We attribute
this to the fact that the biomarkers present in the ADNI dataset were multimodal and
included both early-stage molecular markers as well as late-stage cognitive tests, which
enabled even the standard models to robustly estimate the subjects’ disease stages.
The results on conversion prediction in ADNI show that all models have a broadly
similar performance at this task. However, a few clear differences can be noticed in some
models. The model with the best performance at 12-months and 24-months conversion
prediction is the DEM with standard trajectory alignment, while at 36-month conversion
the SVM and the novel EBM and DEM methods perform the best. The fact that different
models have different performance at different durations-of-conversion suggests different
models have better fits on certain time-frames of the disease time course.
5.6 Summary
In this work we presented several extensions of the EBM and the DEM. We further devised
performance metrics that measure the accuracy of the predicted subject stages and clinical
diagnosis. We evaluated the new methodologies on data from two distinct diseases (PCA
vs tAD), and on two independent datasets (ADNI and DRC). Our results show that
in many situations the novel EBM and DEM fitting methods show improvements with
respect to our performance metrics compared to the standard versions.
differences in these methodologies, which might not be detectable in patient datasets due
to inherent measurement noise and disease heterogeneity.
5.7 Conclusion
In this chapter I presented methodological extensions in the EBM and DEM, and evalu-
ated their performance based on a set of performance measures, some of which I proposed.
Future work will focus on evaluating other types of disease progression models presented
in chapter 3, or on devising more sensitive performance metrics, for evaluation on both
simulated data as well as patient datasets.
In the next chapter, I will present DIVE: a novel disease progression model that can
estimate fine-grained spatial patterns of brain pathology, and estimate latent subject-specific
time-shifts. Such a model overcomes some limitations of the EBM and DEM
models, which do not take spatial correlation into account and assume a pre-defined
ROI atlas. DIVE can also help us better understand underlying disease mechanisms by
studying the overlap between spatial patterns of pathology and brain connectomes.
Chapter 6
DIVE: A Spatiotemporal
Progression Model of Brain
Pathology in Neurodegenerative
Disorders
6.1 Publications
• R. V. Marinescu, A. Eshaghi, M. Lorenzi, A. L. Young, N. P. Oxtoby, S. Garbarino,
T. J. Shakespeare, S. J. Crutch and D. C. Alexander, A Vertex Clustering Model
for Disease Progression: Application to Cortical Thickness Images, Information
Processing in Medical Imaging, 2017
6.2 Introduction
Current image-based disease progression models, such as those presented in section 3.5,
estimate the evolution of the disease using a small set of biomarkers corresponding to
pre-defined regions-of-interest (ROI). This ROI parcellation is usually coarse and doesn’t
allow one to find spatially dispersed patterns of atrophy. While spatiotemporal longitudi-
nal models have already been demonstrated [238, 239, 240], these models regress against
pre-defined sets of covariates such as age, time since baseline or clinical markers. This is
problematic because age-based alignment of subjects assumes all subjects have the same
age of disease onset, while for time since baseline, its relationship with disease onset is un-
known. Similarly, clinical markers are noisy, biased, suffer from floor/ceiling and training
effects, are not sensitive in pre-symptomatic phases, and have low test-retest reliability
[241]. Recently, some spatiotemporal models that estimate subject-specific time-shifts
have been developed [4, 5]. However, these models generally cannot recover dispersed
and disconnected pathological patterns, because they assume voxel measurements corre-
late based on spatial distance, either through a distance function or distance from control
points. However, spatially dispersed pathological patterns have been observed in AD and
related dementias and are hypothesised to appear due to the interaction of pathology with
brain networks [38]. Discovering such fine-grained patterns could allow one to understand
underlying mechanisms of pathology propagation along these networks. However, a
spatiotemporal disease progression model that allows recovery of the dispersed and
disconnected atrophy patterns present in AD is not currently available.
In this work, we present DIVE: Data-driven Inference of Vertexwise Evolution. DIVE
is a novel disease progression model with single vertex resolution that makes only weak
assumptions on spatial correlation. In contrast to approaches which model temporal
trajectories for a small set of biomarker measures based on a priori defined ROIs, DIVE
models temporal trajectories for each vertex on the cortical surface. DIVE combines
unsupervised learning and disease progression modelling to identify clusters of vertices
on the cortical surface that show a similar trajectory of brain pathology over a particular
patient cohort. This formulation enables us to estimate a fine-grained spatial distribution
of pathology and also provides a novel parcellation of the brain based on temporal change.
We first test DIVE on synthetic data and show that the model can recover known
biomarker trajectories and time-shifts. We then demonstrate the model on both MRI and
PET data from two cohorts: the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and
the Dementia Research Centre (DRC), UK. We use the model to reveal spatiotemporal
patterns of pathology to a much finer resolution than previous models and demonstrate
the ability to assign subjects to stages that predict progression. Finally, we validate
DIVE in terms of how robust the estimated pathology patterns are and how well the
disease progression scores correlate with cognitive tests. Code for DIVE is available
online: https://github.com/mrazvan22/dive.
6.3 Methods
In this section we describe the mathematical formulation of DIVE (section 6.3.1), then
we show how to fit the model using Expectation Maximisation (section 6.3.6) and we
describe further implementation details of the algorithm (section 6.3.7). Afterwards, we
outline the synthetic data-generation process (section 6.3.8) for testing the model in the
presence of ground truth, as well as the pipeline for pre-processing the ADNI and DRC
datasets (section 6.3.9).
[Figure 6.1 diagram. Recoverable labels: extraction of vertexwise/voxelwise measures; panel B; axes: vertex measure against disease progression score.]
Figure 6.1: Diagram of the proposed DIVE model. DIVE assumes that biomarkers of
pathology (e.g. cortical thinning) can be measured at many vertices (i.e. locations) on the
cortical surface (A), where each vertex has a distinct trajectory of change during disease
progression (B). In (B), each individual has measurements for vertex 1 at three visits.
DIVE assigns to every cortical vertex one of a small set of temporal trajectories describing
the change in some image-based measurement (e.g. cortical thickness, amyloid PET, DTI
fractional anisotropy measures) from beginning to end of the disease progression. The
estimation process simultaneously estimates the set of clusters, the trajectory defining
each cluster, and the position of each subject along the trajectories, which are defined
on a common timeline. The process iterates assignment of each vertex to clusters (red,
green and blue in this diagram) (C), estimation of the trajectory in each cluster (D) and
estimation of the disease progression score (location along trajectory) for each subject
(E), all within an Expectation-Maximisation framework, until convergence. In particular,
(E) shows how the disease progression score, which is initially set to the individual’s age,
converges to the disease stage of the subject. Diagram made by me.
cortical surface (or voxel in the 3D brain volume), we estimate a unique trajectory along
the disease progression timeline (Fig 6.1B), while also estimating subject/visit-specific
disease progression scores (i.e. disease stages). We do that by grouping vertices with
similar biomarker trajectories into clusters (Fig 6.1C), and we estimate a representative
trajectory for every cluster (Fig 6.1D). Each trajectory is a function of subject-/visit-
specific disease progression scores (DPS) (Fig 6.1E). The DPS depends linearly on the
time since baseline visit, but with subject-specific slope and intercept.
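The relationship between visit times, disease progression scores and a cluster trajectory can be sketched as below. The four-parameter sigmoid $a / (1 + \exp(-b(s - c))) + d$ is an assumed form, chosen to be consistent with the parameters $(a_k, b_k, c_k, d_k)$ used later in this chapter; the function names and numerical values are illustrative only.

```python
import numpy as np

def disease_progression_score(t, alpha, beta):
    """DPS of a visit at time t (years since baseline): alpha * t + beta."""
    return alpha * t + beta

def sigmoid_trajectory(dps, a, b, c, d):
    """Assumed four-parameter sigmoid: a / (1 + exp(-b (dps - c))) + d."""
    return a / (1.0 + np.exp(-b * (dps - c))) + d

visits = np.array([0.0, 1.0, 2.0])                      # three annual visits
dps = disease_progression_score(visits, alpha=1.5, beta=-3.0)
values = sigmoid_trajectory(dps, a=-1.0, b=0.5, c=0.0, d=1.0)
```

A subject with a larger slope moves along the common timeline faster, while the intercept places the baseline visit earlier or later on that timeline.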
$$p(V_l^{ij} | \alpha_i, \beta_i, \theta_{Z_l}, \sigma_{Z_l}, Z_l) = N(V_l^{ij} | f(\alpha_i t_{ij} + \beta_i | \theta_{Z_l}), \sigma_{Z_l}) \tag{6.3}$$
where $N(V_l^{ij} | f(\alpha_i t_{ij} + \beta_i | \theta_{Z_l}), \sigma_{Z_l})$ represents the probability density function (pdf) of the
normal distribution that models the measurement noise along the sigmoidal trajectory
of cluster $Z_l$, having variance $\sigma_{Z_l}$. Next, we assume the measurements from different
subjects are independent, while the measurements from the same subject i at different
visits j are linked using the disease progression score from equation 6.1. Moreover, we
also assume a uniform prior on Zl . This gives the following model:
$$p(V_l, Z_l | \alpha, \beta, \theta, \sigma) = \prod_{(i,j) \in I} N(V_l^{ij} | f(\alpha_i t_{ij} + \beta_i | \theta_{Z_l}), \sigma_{Z_l}) \tag{6.4}$$
where $I = \{(i, j)\}$ represents the set of all the subjects i and their corresponding visits j.
Furthermore, $V_l = [V_l^{ij} | (i, j) \in I]$ is the 1D array of all the values for vertex l across every
subject and corresponding visit. Vectors $\alpha = [\alpha_1, \dots, \alpha_S]$ and $\beta = [\beta_1, \dots, \beta_S]$, where
S is the number of subjects, denote the stacked parameters for the subject shifts. If a
subject i has multiple visits, these visits share the same parameters $\alpha_i$ and $\beta_i$. Vectors
$\theta = [\theta_1, \dots, \theta_K]$ and $\sigma = [\sigma_1, \dots, \sigma_K]$, with K being the number of clusters, represent the
stacked parameters for the sigmoidal trajectories and measurement noise specific to each
cluster.
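To make the model concrete, the log of Eq. 6.4 can be evaluated for every vertex under every candidate cluster, as in the sketch below. It assumes the four-parameter sigmoid $a / (1 + \exp(-b(s - c))) + d$ for f and treats $\sigma_k$ as a standard deviation; the name `data_fit_terms` and the toy inputs are hypothetical.

```python
import numpy as np

def data_fit_terms(V, dps, theta, sigma):
    """D[l, k]: Gaussian log-likelihood of vertex l under cluster k.

    V:     (L, n) measurements of L vertices over all n subject visits
    dps:   (n,) disease progression score alpha_i * t_ij + beta_i per visit
    theta: (K, 4) sigmoid parameters (a, b, c, d) per cluster (assumed form)
    sigma: (K,) noise standard deviation per cluster (assumed meaning)
    """
    a, b, c, d = (theta[:, p][:, None] for p in range(4))
    traj = a / (1.0 + np.exp(-b * (dps[None, :] - c))) + d           # (K, n)
    sq_err = ((V[:, None, :] - traj[None, :, :]) ** 2).sum(axis=2)   # (L, K)
    n = V.shape[1]
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - sq_err / (2 * sigma**2)

rng = np.random.default_rng(0)
V = rng.normal(size=(3, 5))                      # 3 vertices, 5 visits (toy data)
dps = np.linspace(-2, 2, 5)
theta = np.array([[1.0, 1.0, 0.0, 0.0], [-1.0, 0.5, 1.0, 1.0]])
sigma = np.array([0.5, 2.0])
D = data_fit_terms(V, dps, theta, sigma)
```

Each row of `D` compares one vertex against all cluster trajectories, which is the quantity the clustering step needs.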
Due to our main motivation of modelling population trajectories and in order to
ensure robustness and identifiability, we did not add random effects to the trajectories of
specific subjects.
Throughout the article, we will use the shorthand zlk = p(Zl = k).
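$$p(V, Z | \alpha, \beta, \theta, \sigma, \lambda) = \prod_{l=1}^{L} \prod_{(i,j) \in I} N(V_l^{ij} | f(\alpha_i t_{ij} + \beta_i | \theta_{Z_l}), \sigma_{Z_l}) \prod_{l_2 \in N_l} \Psi(Z_l, Z_{l_2}) \tag{6.9}$$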
where $\Psi(Z_l, Z_{l_2})$ is a clique term representing the likelihood of a neighbouring vertex
$l_2$ having the same label as vertex l. The formula for the clique term is:
$$\Psi(Z_l = k, Z_{l_2} = k_2) = \begin{cases} \exp(g(\lambda)) & \text{if } k = k_2 \\ \exp(-h(\lambda)) & \text{otherwise} \end{cases} \tag{6.10}$$
where λ is a parameter controlling how much to penalise neighbouring vertices that
belong to distinct clusters, and g and h are positive, monotonic functions over the $\lambda > 0$
range. We choose $g(\lambda) = \lambda$ and $h(\lambda) = \lambda^2$, which results in a concave objective function
for λ, ensuring that it can later be optimised (see M-step).
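As a short check of the concavity claim, consider a single neighbour term under the simplifying assumption that the assignment probabilities are held fixed, with a generic $z \in [0, 1]$ denoting the probability that the neighbour shares the label. The expected log clique term and its second derivative are then:

$$E[\log \Psi] = z \lambda - (1 - z) \lambda^2, \qquad \frac{d^2}{d\lambda^2} E[\log \Psi] = -2 (1 - z) \le 0$$

so each such term is concave in λ, and any non-negative weighted sum of these terms remains concave.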
Therefore, the model parameters that need to be estimated are M = [α, β, θ, σ, λ]
where α and β are the subject specific shifting parameters, θ and σ are the cluster specific
trajectory and noise parameters and λ is the clique parameter denoting the penalisation
of spatially non-smooth assignments of latent variables Z.
6.3.6.1 E-step
In the Expectation step, at iteration u we seek an estimate of $p(Z|V, M^{(u-1)})$, given the
current estimates of the parameters $M^{(u-1)} = [\theta_k^{(u-1)}, \sigma_k^{(u-1)}, \alpha_i^{(u-1)}, \beta_i^{(u-1)}, \lambda^{(u-1)}]$. We
perform this using Iterated Conditional Modes [195], which performs coordinate-wise
gradient ascent. This works by conditioning the clique terms Z on the values of Z from
the previous iterations. This approximation gives the following factorisable likelihood:
$$p(Z|V, M^{(u-1)}) \approx \prod_{l=1}^{L} E_{Z_{N_l}^{(u-1)} | V_l, M^{(u-1)}} \left[ p(Z_l | V_l, M^{(u-1)}, Z_{N_l}) \right] \tag{6.11}$$
The factorised form allows for tractable computation and memory storage of p(Z).
Let $z_{lk}^{(u)} = p(Z_l = k | V_l, M^{(u-1)}, Z^{(u-1)})$. After simplifications we reach the following
update rule:
$$\log z_{lk}^{(u)} \propto D_{lk} + \sum_{l_2 \in N_l} \log \left[ \exp(-\lambda^2) + z_{l_2 k}^{(u-1)} (\exp(\lambda) - \exp(-\lambda^2)) \right] \tag{6.12}$$
The full derivation is given in Supplementary section B.3. In order to enable opti-
misation over λ, a final modification of this step is performed, by considering zlk to be
functions ζlk (λ) over λ. This results in the update equation from Alg. 6.2, line 18 which
is based on pre-defined terms on lines 13-14.
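The update rule in Eq. 6.12 can be sketched for a fixed λ as follows. The sketch assumes the data-fit terms $D_{lk}$ are already available and that each row of assignment probabilities is normalised over clusters; the name `e_step`, the neighbour-list representation and the toy three-vertex chain are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def e_step(D, z_prev, neighbours, lam):
    """One update of the assignment probabilities z[l, k] (Eq. 6.12).

    D:          (L, K) data-fit terms D_lk
    z_prev:     (L, K) probabilities from the previous iteration
    neighbours: list of L integer arrays holding the neighbours N_l
    lam:        clique parameter lambda > 0
    """
    same, diff = np.exp(lam), np.exp(-lam**2)
    # Log of the expected clique potential contributed by every vertex.
    log_clique = np.log(diff + z_prev * (same - diff))               # (L, K)
    # Sum the contributions of the neighbours of each vertex.
    neigh_sum = np.stack([log_clique[n].sum(axis=0) for n in neighbours])
    log_z = D + neigh_sum
    # Normalise over clusters so that each row sums to one.
    return np.exp(log_z - logsumexp(log_z, axis=1, keepdims=True))

# Toy chain of three vertices: the middle vertex has no data preference,
# but both of its neighbours are confidently in cluster 0.
D = np.zeros((3, 2))
z_prev = np.array([[1.0, 0.0], [0.5, 0.5], [1.0, 0.0]])
neighbours = [np.array([1]), np.array([0, 2]), np.array([1])]
z_new = e_step(D, z_prev, neighbours, lam=1.0)
```

In this toy case the middle vertex is pulled strongly towards cluster 0 purely by the clique term, which is the smoothing behaviour the Markov random field is meant to provide.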
6.3.6.2 M-step
In the Maximisation step we try to estimate the model parameters M = (α, β, θ, σ, λ)
that maximise EZ|V,M (u−1) [log p(V, Z|M )]. We cannot simultaneously optimise all 5 sets
of parameters, so we optimise them independently. In order to get the update rule for the
trajectory parameters θk corresponding to cluster k we need to maximise the expected log
likelihood with respect to θk . The key observation here is that if we assume fixed α, β and
Z, then the trajectory parameters $\theta_k$ for every cluster k are conditionally independent, i.e.
$\theta_k \perp\!\!\!\perp \theta_m | (Z, \alpha, \beta, \sigma) \; \forall (k, m), k \neq m$. This allows us to maximise every $\theta_k$ independently
using the following equation:
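1   Initialise $\alpha^{(0)}$, $\beta^{(0)}$, $z_{lk}^{(0)}$
2   while $\theta$, $\sigma$, $\alpha$, $\beta$ or $z_{lk}$ not converged do
      // M-step 1: For each cluster, optimise its trajectory
3     for k = 1 to K do
4       $\theta_k^{(u)} = \arg\min_{\theta_k} \sum_{l=1}^{L} z_{lk}^{(u-1)} \sum_{(i,j) \in I} (V_l^{ij} - f(\alpha_i^{(u-1)} t_{ij} + \beta_i^{(u-1)} | \theta_k))^2 - \log p(\theta_k)$
5       $\theta_k^{(u)} = \text{make\_identifiable}(\theta_k^{(u)})$
6       $(\sigma_k^{(u)})^2 = \frac{1}{|I|} \sum_{l=1}^{L} z_{lk}^{(u-1)} \sum_{(i,j) \in I} (V_l^{ij} - f(\alpha_i^{(u-1)} t_{ij} + \beta_i^{(u-1)} | \theta_k^{(u)}))^2 - \log p(\sigma_k)$
7     end
      // M-step 2: For each subject, optimise its progression speed $\alpha_i$ and time shift $\beta_i$
8     for i = 1 to S do
9       $\alpha_i^{(u)}, \beta_i^{(u)} = \arg\min_{\alpha_i, \beta_i} \left[ \sum_{l=1}^{L} \sum_{k=1}^{K} z_{lk}^{(u-1)} \frac{1}{2 (\sigma_k^{(u)})^2} \sum_{j \in I_i} (V_l^{ij} - f(\alpha_i t_{ij} + \beta_i | \theta_k^{(u)}))^2 \right] - \log p(\alpha_i, \beta_i)$
10    end
      // E-step 1: Define functions $\zeta_{lk}(\lambda)$ computing $z_{lk}$, the probability of
      // vertex l being assigned to cluster k, given fixed λ
11    for l = 1 to L do
12      for k = 1 to K do
          // Pre-compute data fit terms $D_{lk}$
13        $D_{lk}^{(u)} = -\frac{1}{2} \log(2\pi (\sigma_k^{(u)})^2) |I| - \frac{1}{2 (\sigma_k^{(u)})^2} \sum_{(i,j) \in I} (V_l^{ij} - f(\alpha_i^{(u)} t_{ij} + \beta_i^{(u)} | \theta_k^{(u)}))^2$
14        $\zeta_{lk}(\lambda) \approx \exp\left( D_{lk} + \sum_{l_2 \in N_l} \log \left[ \exp(-\lambda^2) + z_{l_2 k}^{(u-1)} (\exp(\lambda) - \exp(-\lambda^2)) \right] \right)$
15      end
16    end
      // M-step 3: optimise clique term λ using above definitions in E-step 1
17    $\lambda^{(u)} = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} \zeta_{lk}(\lambda) \left[ D_{lk} + \lambda \sum_{l_2 \in N_l} \zeta_{l_2 k}(\lambda) - \lambda^2 \sum_{l_2 \in N_l} (1 - \zeta_{l_2 k}(\lambda)) \right]$
      // E-step 2: Compute next $z_{lk}$ using the best λ
18    $z_{lk}^{(u)} = \zeta_{lk}(\lambda^{(u)})$
19  end
20  $\alpha_i^{(u)} = \frac{\alpha_i^{(u)}}{\sigma_N}$, $\beta_i^{(u)} = \frac{\beta_i^{(u)} - \mu_N}{\sigma_N}$    // Re-scale subject shifts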
Figure 6.2: The DIVE parameter estimation algorithm. The algorithm, based on
Expectation-Maximisation, iteratively optimises the assignment of vertices to clusters
(E-step) and the parameters for the biomarker trajectories and subject time-shifts (M-
step).
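$$\theta_k = \arg\max_{\theta_k} \sum_{z_1, \dots, z_L}^{K} p(Z|V, M^{(u-1)}) \log \left[ \prod_{l=1}^{L} \prod_{(i,j) \in I} N(V_l^{ij} | f(\alpha_i t_{ij} + \beta_i | \theta_{z_l}), \sigma_{z_l}) \right] + \log p(\theta_k) \tag{6.14}$$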
A similar conditional independence property also holds for the latent
variables Z. This allows us to decompose the joint distribution over Z, and after ex-
panding the noise model we reach the optimisation problem from Alg. 6.2, line 4. See
Supplementary section B.3 for full derivation. This does not have a closed-form solution,
so we use numerical optimisation for finding θk that maximises the equation from Alg.
6.2, line 4.
A similar equation, yet in closed form, is also obtained for σk (Alg. 6.2, line 6). After
estimating θ and σ for every cluster, we use the new values to estimate the subject specific
parameters α and β. For every subject i, we maximise the expected log likelihood with
respect to αi , βi independently, and after simplifications we obtain the update rule from
Alg. 6.2, line 9, which is again solved using numerical optimisation. For the numerical
optimisation of θ we used the Nelder-Mead method for its robustness, while for α and
β we used the quasi-Newton Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm due to
its fast convergence.
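The two numerical optimisation steps can be sketched with `scipy.optimize.minimize`, which offers both methods. The sketch is a simplified stand-in: it fits an assumed four-parameter sigmoid by plain least squares, omits the priors and the $z_{lk}$ weighting of Alg. 6.2, and uses hypothetical function names and noiseless toy data.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(dps, a, b, c, d):
    """Assumed trajectory form: a / (1 + exp(-b (dps - c))) + d."""
    return a / (1.0 + np.exp(-b * (dps - c))) + d

def fit_trajectory(dps, target, theta0):
    """Fit the sigmoid parameters theta with the Nelder-Mead simplex method."""
    loss = lambda th: np.sum((target - sigmoid(dps, *th)) ** 2)
    return minimize(loss, theta0, method="Nelder-Mead",
                    options={"xatol": 1e-8, "fatol": 1e-10, "maxiter": 5000})

def fit_subject(t, v, theta, alpha0=1.0, beta0=0.0):
    """Fit one subject's (alpha_i, beta_i) with BFGS."""
    loss = lambda ab: np.sum((v - sigmoid(ab[0] * t + ab[1], *theta)) ** 2)
    return minimize(loss, [alpha0, beta0], method="BFGS")

true_theta = (-1.0, 0.5, 1.0, 0.0)
dps = np.linspace(-10, 10, 50)
target = sigmoid(dps, *true_theta)
theta0 = np.array([-0.8, 0.4, 0.5, 0.1])
res_theta = fit_trajectory(dps, target, theta0)

t = np.arange(4.0)                                  # four annual visits
v = sigmoid(1.2 * t - 2.0, *true_theta)             # subject with alpha=1.2, beta=-2
res_subject = fit_subject(t, v, true_theta)
```

Nelder-Mead needs no gradients, which suits the four coupled sigmoid parameters, whereas BFGS builds a curvature approximation from gradients and tends to converge quickly on the smooth two-parameter subject problem.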
The large dimensionality of the dataset (around 163,842 vertices × 400 subjects × 4
timepoints each) makes model fitting extremely difficult from a computational perspective.
Initial optimisation on a smaller subset of around 100 ADNI subjects took around
30h. However, we achieved a significant speed-up in the evaluation of objective functions
by computing a zlk -weighted average of vertex measurements within each cluster (see
Appendix section B.4). This resulted in a final convergence time of around 4-6h depend-
ing on the size of the dataset, using an Intel Xeon E3-1271 @ 3.60GHz CPU. Regarding
memory requirements, loading into memory around 1600 and fitting the model required
around 12GB of RAM. However, we reduced this by a factor of 4 by using
16-bit floating-point representations for the vertexwise biomarkers.
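The reason a $z_{lk}$-weighted average can replace the full set of vertex measurements is sketched below for a single cluster. This is an illustration of the underlying identity rather than the derivation in the appendix: the weighted sum of squared errors equals a term that depends only on the weighted mean plus a constant that does not depend on the trajectory. All variable names and the random toy data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
L, n = 2000, 12                      # vertices, subject visits (toy sizes)
V = rng.normal(size=(L, n))          # vertex measurements
z_k = rng.uniform(size=L)            # soft assignments of vertices to cluster k
f = rng.normal(size=n)               # trajectory values at each visit's DPS

# Naive objective: touches every vertex at every evaluation, O(L * n).
naive = np.sum(z_k[:, None] * (V - f[None, :]) ** 2)

# Pre-compute once per EM iteration: total weight and weighted mean per visit.
weight = z_k.sum()
V_bar = (z_k @ V) / weight                                   # (n,)
const = np.sum(z_k[:, None] * (V - V_bar[None, :]) ** 2)     # independent of f

# Fast objective: O(n) per evaluation, equal to the naive one up to `const`.
fast = weight * np.sum((V_bar - f) ** 2)
```

Because `const` does not change with the trajectory, minimising `fast` gives the same trajectory parameters as minimising `naive`, at a cost that no longer scales with the number of vertices.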
For optimising λ, we again try to optimise in the M-step the expected full data
likelihood under the Z estimates from the previous iteration:
$$\lambda^{(u)} = \arg\max_{\lambda} E_{p(Z|V, M^{(u-1)}, \lambda, Z^{(u-1)})} [\log p(V, Z|M^{(u-1)})] \tag{6.15}$$
We simplify the above equation by expanding the likelihood model and approximating
the joint over Z with the product of the marginals zlk over all vertices l. This results in
the update equation from Alg. 6.2 line 17 – see appendix for full derivation. In this final
equation we also replaced zlk with a function ζlk (λ) over λ, which updates zlk based on
the current value of λ being evaluated. This is done to speed up convergence, as latent
variables zlk are highly coupled with the value of λ being evaluated.
at the clinical visit. We initialise zlk using k-means clustering of the vectors Vl . We
also initialise hyperparameters $\alpha_{shape}$ = 16e4, $\alpha_{rate}$ = 16e4, $\beta_{mean}$ = 0, $\beta_{std}$ = 0.1, which
work well in practice as they result in realistic ranges for $\alpha_i$ and $\beta_i$ of around [0.3, 3]
and [-15, 15] respectively. The reason why we need to give such large numbers of 16e4 is
because there are many vertex measurements (>100,000) that each drag the subject to
an extremity if most values are above/below the population curve. This can be avoided
in the future by adding subject-specific random effects to the population trajectory.
As already explained in [2], the sigmoid parameters $\theta_k$ are not identifiable because
$f(t; a_k, b_k, c_k, d_k) = f(t; -a_k, -b_k, c_k, a_k + d_k)$. We thus need to apply the following
transformation on line 5 of Alg. 6.2: if $b_k^{(u)} < 0$ then $a_k^{(u)} = -a_k^{(u)}$; $b_k^{(u)} = -b_k^{(u)}$; $d_k^{(u)} = d_k^{(u)} - a_k^{(u)}$,
where the assignments are applied in this order, so that the last one uses the already negated $a_k^{(u)}$.
This ensures model identifiability and is performed at every iteration.
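The transformation can be checked numerically with the sketch below, which assumes the sigmoid form $a / (1 + \exp(-b(t - c))) + d$; this form satisfies the identity stated above but is an assumption here. The function `make_identifiable` mirrors the name used in Alg. 6.2 and is otherwise illustrative.

```python
import numpy as np

def sigmoid(t, a, b, c, d):
    """Assumed trajectory form: a / (1 + exp(-b (t - c))) + d."""
    return a / (1.0 + np.exp(-b * (t - c))) + d

def make_identifiable(theta):
    """Return an equivalent parameter vector with slope b >= 0."""
    a, b, c, d = theta
    if b < 0:
        a = -a          # flip the amplitude ...
        b = -b          # ... and the slope
        d = d - a       # uses the flipped a, i.e. d_old + a_old
    return np.array([a, b, c, d])

t = np.linspace(-20, 20, 9)
theta = np.array([-1.0, -0.3, 2.0, 0.5])
theta_id = make_identifiable(theta)
```

Both parameter vectors describe the same curve, so restricting the slope to be non-negative removes the duplicate solution without changing the fit.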
• Scenario 1: as the number of clusters increases, evaluate how well DIVE can esti-
mate the correct number of clusters using AIC and BIC
• Scenario 2: as the trajectories become more similar, test how well we can recover
the assignment of vertices to clusters and the DIVE parameters
1. Sampled baseline age ai1 and shift parameters αi , βi for 300 subjects with 4 time-
points (each timepoint 1 year apart), with ai1 ∼ U (40, 80), αi ∼ Γ(6.25, 6.25),
βi ∼ N (0, 10). Time since baseline has been obtained for every visit j of subject i
as follows: tij = aij − ai1 .
2. Generated three sigmoids with different (slope, centre) parameters: [(-0.1, -15),
(-0.1, 2.5), (-0.1, 20)] (Fig. 6.3a, red lines). Upper and lower limits have been set to
1 and 0 respectively.
3. Randomly assigned every vertex l ∈ {1, . . . , L}, where L = 1000, to a cluster a[l] ∈
{1, 2, 3}
5. Sampled subject data for every vertex l from its corresponding perturbed trajectory
θl with noise standard deviation σl = 1
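The generation steps listed above can be sketched as follows. Several choices are assumptions made only for this illustration: Γ(6.25, 6.25) is read as (shape, rate), the value 10 in N(0, 10) is taken as a standard deviation, the slope parameter is used directly as the sigmoid steepness, and the per-vertex trajectory perturbation is omitted, so every vertex follows its cluster trajectory exactly.

```python
import numpy as np

rng = np.random.default_rng(42)
S, T, L, K = 300, 4, 1000, 3     # subjects, timepoints, vertices, clusters

# Step 1: baseline ages and subject-specific parameters.
age0 = rng.uniform(40, 80, size=S)
alpha = rng.gamma(shape=6.25, scale=1 / 6.25, size=S)   # Gamma(shape, rate)
beta = rng.normal(0, 10, size=S)                        # 10 taken as the std
ages = age0[:, None] + np.arange(T)[None, :]            # visits 1 year apart
t = ages - age0[:, None]                                # time since baseline
dps = alpha[:, None] * t + beta[:, None]                # (S, T)

# Step 2: three sigmoids with limits 1 and 0, given as (slope, centre).
slopes = np.array([-0.1, -0.1, -0.1])
centres = np.array([-15.0, 2.5, 20.0])

# Step 3: random assignment of vertices to clusters (0-based labels).
labels = rng.integers(0, K, size=L)

# Step 5, without the perturbation: noisy samples along the trajectory of
# each vertex's cluster, with noise standard deviation 1.
b = slopes[labels][:, None, None]
c = centres[labels][:, None, None]
clean = 1.0 / (1.0 + np.exp(-b * (dps[None, :, :] - c)))  # (L, S, T)
V = clean + rng.normal(0, 1.0, size=clean.shape)
```

The resulting array `V` holds one noisy measurement per vertex, subject and timepoint, alongside the known `labels` and `dps` that serve as ground truth.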
From the basic simulation, we generated synthetic data for each of the three scenarios
by varying one parameter at a time and kept the other parameters constant, having the
same values as in the basic simulation. We varied the following parameters:
• Scenario 1: number of clusters - 2, 3, 5, 10, 15, 20, 30 and 40. The cluster centres
were spread evenly across a fixed total DPS range where the data was available.
• Scenario 2: distance between trajectory centres (as proportion of total DPS range
sampled) – 0.33, 0.30, 0.23, 0.17, 0.10, 0.07, 0.03 and 0.02
• Scenario 3: number of subjects - 300, 200, 100, 50, 35, 20, 10 and 5
Table 6.1: Demographics of the four cohorts used in our analysis. ADNI MRI and the
DRC cohorts were used for the cortical thickness analysis, while ADNI PET was used for
the PET AV45 analysis. MCI – mild cognitive impairment, SMC - subjective memory
complaints, EMCI – early MCI, LMCI – late MCI.
meant that the images were co-registered, averaged across the 6 five-minute frames, stan-
dardised with respect to the orientation and voxel size and smoothed to produce a uniform
resolution of 8mm full-width/half-max (FWHM).
The DRC dataset consisted of T1 MRI scans from 31 healthy controls, 32 PCA and
23 typical AD subjects with at least 3 scans each and an average of 5.26 scans per
subject. All PCA patients fulfilled both Tang-Wai [147] and Mendez [146] criteria based
on clinical review. The typical AD patients all met the criteria for probable Alzheimer’s
disease [115, 116].
Given that the ADNI and DRC datasets contained subjects with different modalities
or diseases, we ran DIVE independently on the following four cohorts (see Table 6.1 for
demographics):
1. ADNI MRI: controls, MCI and tAD subjects from ADNI (cortical thickness data)
2. DRC tAD: tAD subjects and controls from the DRC dataset (cortical thickness
data)
3. DRC PCA: PCA subjects and controls from the DRC dataset (cortical thickness
data)
4. ADNI PET: AV45 scans from ADNI containing subjects with the following diagnoses:
healthy controls, subjective memory complaints, early MCI, late MCI and Alzheimer’s
disease.
registered images were then registered to the average Freesurfer template. No further
smoothing was performed on these images (FWHM level of zero mm). From these
template-registered volumetric images, cortical thickness measurements were computed
at each vertex (i.e. point) on an average 2D cortical surface manifold. For each vertex we
averaged the thickness levels from both hemispheres in order to later ease visualisation
and to obtain a smaller representation of the input data. Each of the final images had a
resolution of 163,842 vertices on the cortical surface.
Finally, we standardised the data by computing Z-scores for each vertex with respect
to the values of that vertex in the control population. This normalisation step ensures that
the model will not be affected by different thicknesses of the cortex at various locations
on the cortical surface. This step is specific for MRI cortical thickness data, and might
not be necessary for other modalities (e.g. PET).
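The standardisation step can be sketched as below. The function name `zscore_to_controls`, the use of the sample standard deviation and the toy thickness values are assumptions made for the illustration.

```python
import numpy as np

def zscore_to_controls(V, is_control):
    """Standardise each vertex using the control group's mean and std.

    V:          (n_images, n_vertices) cortical thickness values
    is_control: (n_images,) boolean mask selecting the control images
    """
    mu = V[is_control].mean(axis=0)
    sd = V[is_control].std(axis=0, ddof=1)
    return (V - mu) / sd

rng = np.random.default_rng(1)
thickness = rng.normal(loc=2.5, scale=0.3, size=(20, 6))   # toy data, in mm
is_control = np.arange(20) < 8                             # first 8 images are controls
Z = zscore_to_controls(thickness, is_control)
```

After this step a value of -2 at any vertex means two control standard deviations below the control mean, regardless of how thick the cortex normally is at that location.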
6.4 Results
6.4.1 Results on Synthetic Data
In the basic simulation, we obtained a clustering agreement ℵ of 0.97, which suggests
that almost all vertices were assigned to the correct cluster. Fig. 6.3a shows the original
trajectories and the recovered trajectories using our model, plotted against the disease
progression score on the x-axis and the vertex value on the y-axis. In Fig. 6.3b we plotted
the recovered DPS of each subject along with the true DPS. The results for the three
scenarios are shown in Figs. 6.3c-6.3e. In Fig. 6.3c, we show for Scenario 1 the estimated
number of clusters against the true number of clusters using both AIC and BIC criteria.
In Figs. 6.3d-6.3e we show the distributions for ℵ in Scenarios 2 and 3 as the problem
becomes harder in each successive step.
Figure 6.3: (a-b) Results for the basic simulation, where trajectories are relatively well
separated. (a) Reconstructed temporal trajectories (blue) plotted against the true tra-
jectories (red). The x-axis shows the disease progression score (DPS), while the y-axis
shows the biomarker values of the vertices. (b) Estimated subject-specific DPS scores
compared to the true scores. (c-e) Simulation results for the three scenarios: (c) increasing
number of clusters, (d) trajectories becoming similar and (e) decreasing number of
subjects. On the x-axis we show the variable that was changing within the scenario (e.g.
number of clusters), while on the y-axis we show the agreement measure ℵ, representing
the percentage of vertices that were assigned to the correct cluster.
The results show that, in a simple experiment where the trajectories are well separated,
DIVE can very accurately estimate which cluster generated each vertex. Moreover,
the recovered trajectories and DPS scores are close to the true values. The results
of Scenario 1 also suggest that both AIC and BIC are effective at estimating the true
number of clusters, with AIC performing slightly better than BIC for
larger numbers of clusters. On the other hand, the results of the stress-test Scenarios 2
and 3 show that the performance measure ℵ drops when the trajectories become very similar
to each other or when the number of subjects decreases. This happens because
small differences in trajectories are hard to detect in the presence of measurement noise,
while a small number of subjects does not provide enough data to accurately estimate the
parameters. Similar decreases in performance for Scenarios 2 and 3 are also observed for
other measures, such as the error in recovered trajectories or DPS scores (Supplementary
Fig. B.1).
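To make the agreement measure ℵ concrete, the sketch below computes the fraction of vertices assigned to the correct cluster. Since cluster labels are arbitrary, it assumes that the estimated labels are first matched to the true labels by searching over all relabelings; this matching step is an assumption of the sketch, and `cluster_agreement` is a hypothetical name.

```python
from itertools import permutations

def cluster_agreement(true_labels, est_labels, n_clusters):
    """Fraction of vertices assigned to the correct cluster, maximised over
    all relabelings of the estimated clusters (labels are arbitrary)."""
    best = 0.0
    for perm in permutations(range(n_clusters)):
        # perm[e] is the true label that estimated label e is mapped to
        hits = sum(perm[e] == t for t, e in zip(true_labels, est_labels))
        best = max(best, hits / len(true_labels))
    return best

# Toy example: eight vertices, three clusters, one vertex misassigned
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
est_labels = [1, 1, 1, 0, 0, 2, 2, 0]
agreement = cluster_agreement(true_labels, est_labels, 3)
print(agreement)
```

The exhaustive search over permutations is only practical for a small number of clusters; for larger numbers an assignment solver such as the Hungarian algorithm would be a more suitable choice.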
6.4. Results 113
• different in distinct modalities: cortical thickness from MRI vs amyloid load from
AV45 PET.
6.4.2.2 Results
The optimal number of clusters, as estimated with AIC, was three for the ADNI MRI
dataset, three for the DRC tAD dataset, five for the DRC PCA dataset and eighteen for
the ADNI PET dataset. Fig. 6.4a (left) shows the results from the ADNI MRI dataset,
where in the left image we coloured the vertices on the cortical surface according to the
cluster they most likely belong to. We assigned a colour for each cluster (both the brain
figures on the left and the trajectory figures on the right) according to the extent of
pathology of its corresponding trajectory at a DPS score of 1. The cluster colours range
from red (severe pathology) to blue (moderate pathology). In Fig. 6.4a (right), we show
the resulting cluster trajectories with samples from the posterior distribution of each θk .
Similar results are shown for the other three datasets: the DRC tAD dataset (Fig. 6.4b),
DRC PCA dataset (Fig. 6.4c) and the ADNI PET dataset (Fig. 6.4d).
In tAD subjects from the ADNI dataset (Fig. 6.4a), we see more
severe cortical thinning mainly in the inferior temporal lobe (red cluster), dispersed
atrophy in parietal and frontal regions (green cluster), and relative sparing of the
inferior frontal and occipital lobes. In tAD subjects from the DRC dataset (Fig. 6.4b),
we see a relatively similar pattern, however with more pronounced atrophy in the supra-
marginal cortex (red cluster) compared to ADNI. This could be due to the younger ages
of controls and tAD subjects in the DRC dataset as compared to ADNI. The spatial
distribution of cortical thinning found with DIVE resembles results from previous longi-
tudinal studies such as [244, 245]. However, in contrast to these approaches, our model
gives insight into the timing and rate of atrophy and is also able to stage subjects across
the disease time course. We also find that the cluster trajectories in the DRC tAD dataset
have similar dynamics to the ADNI MRI dataset, although they show a clearer separation
between each other.
In the PCA subjects (Fig. 6.4c), we find that atrophy is mainly focused on the
posterior part of the brain, with limited spread in the motor cortex, anterior temporal
and frontal areas. This posterior-focused pattern of atrophy is different from the one found
in the tAD datasets, and agrees with previous findings in the literature [18, 20]. However,
as opposed to the results from [20] which showed posterior regions uniformly affected,
we notice that there are two clusters within the posterior region with different pathology
dynamics, with the superior parietal and supramarginal areas affected more than the
remaining posterior regions. This might be attributable to DIVE’s ability to model
subjects’ disease onset and progression speed, along with non-linear cortical thinning
dynamics. Differences in the subjects analysed and the merging of left
and right hemispheres could also contribute to this discrepancy.
In ADNI PET (Fig. 6.4d) we see that the regions with the highest amyloid uptake
are more spatially continuous, comprising the precuneus and anterior frontal areas. On
the other hand, the anterior-superior temporal gyrus shows the least uptake of amyloid.
This result closely matches the result by [4], which used a completely different dataset
and modelling technique. These results using AV45 PET are also noticeably different
from results using cortical thickness (e.g. Fig. 6.4a), which have more high-frequency
patterns and only give 3-5 optimal clusters instead of 20. The layers of clusters starting
from the precuneus and frontal lobes, which range from severe to less severe pathology,
suggest a continuum of variation in vertex trajectories in the case of the PET dataset
(Fig 6.4d, right). These trajectories all start with a low amyloid SUVR, between 0 and
0.25, but in late stages the trajectories for some clusters such as cluster 0 can reach an
SUVR of 1.5. This continuum might arise because the PET images
have a much lower resolution than MR images and were smoothed by ADNI during the
pre-processing steps.
• Clinical validity of DPS scores: test whether the subject disease progression scores,
based purely on MRI or PET data, correlate with cognitive tests such as Clinical
Dementia Rating Scale - Sum of Boxes (CDRSOB), Alzheimer’s Disease Assessment
Scale - Cognitive (ADAS-COG), Mini-Mental State Examination (MMSE) and Rey
Auditory and Verbal Learning Test (RAVLT).
Figure 6.4: DIVE-estimated clusters (left column) and corresponding disease
progression trajectories (right column) on four datasets: (a) ADNI MRI (b) DRC tAD
(c) DRC PCA and (d) ADNI PET. We coloured each cluster according to the extent of
pathology (cortical thickness or amyloid uptake) at DPS=1.
Figure 6.5: (top) Clusters estimated from 10-fold cross-validation training sets on the
ADNI MRI dataset. (bottom) Estimated trajectories for each fold.
and actual measurements, averaged over all subjects and all locations on the brain; to
evaluate these predictions, for every subject we use the first n-1 scans for training and
the last scan for testing the prediction.
Fig. 6.5 shows the brain clusters and corresponding trajectories, estimated for all the
cross-validation folds after fitting the model on the training data. The clusters have been
coloured using a similar colour scheme as in Fig. 6.4. In Fig 6.6 we show scatter plots
of the DPS scores with clinical measures such as CDRSOB, ADAS-COG, MMSE and
RAVLT.
Figure 6.6: Scatter plots of the DPS scores estimated from the ADNI MRI dataset,
plotted against four cognitive tests: CDRSOB, ADAS-COG, MMSE and RAVLT. For
each cognitive test we also report the Pearson correlation coefficient and p-value. The
disease progression scores, computed only based on MRI cortical thickness data, correlate
with these cognitive measures, suggesting that the DPS scores are clinically meaningful.
Model              CDRSOB (ρ)      ADAS13 (ρ)      MMSE (ρ)        RAVLT (ρ)       Prediction (RMSE)
DIVE               0.37 ± 0.09     0.37 ± 0.10     0.36 ± 0.11     0.32 ± 0.12     1.021 ± 0.008
ROI-based model    0.36 ± 0.10     0.35 ± 0.11     0.34 ± 0.13     0.30 ± 0.13     1.019 ± 0.010
No-staging model   *0.09 ± 0.06    *0.03 ± 0.09    *0.05 ± 0.06    *0.02 ± 0.06    *1.062 ± 0.024
Table 6.2: Performance evaluation of DIVE and two simplified models on the ADNI MRI
dataset using 10-fold cross-validation. In the middle four columns, we show between-
subject correlations between the DPS scores and several cognitive tests: CDRSOB,
ADAS-Cog13, MMSE and RAVLT. The last column shows the prediction error (RMSE)
of cortical thickness values from follow-up scans. (*) Statistically significant differences
between the model and DIVE, Bonferroni corrected for multiple comparisons.
The results in Fig. 6.5 demonstrate that DIVE is robust in cross-validation, as the
estimated clusters and trajectory parameters are all similar across folds. The average
Dice overlap scores across the 10 folds were 0.77, 0.76 and 0.90 for clusters 0, 1 and
2 respectively. The DIVE-derived DPS scores, which were estimated purely from
MRI data, are also clinically relevant as they correlate with cognitive tests (Fig. 6.6).
The performance of DIVE in terms of subject staging and biomarker prediction also
compares favourably with the simpler no-staging and ROI-based models (Table 6.2). Results
show that DIVE has comparable performance to the ROI-based model, both in terms
of subject staging and cortical thickness prediction. The fact that DIVE performs
similarly to a simpler model with fewer parameters suggests that the estimated
patterns are meaningful. Moreover, DIVE offers qualitative insight into the fine-grained
spatial patterns of pathology and their temporal progression. Furthermore, the No-
staging model performs significantly worse than DIVE, both in terms of subject staging
and for biomarker prediction. This suggests that, when modelling progression of AD, it
is important to account for the fact that patients are at different stages along the disease
time-course.
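The Dice overlap reported above can be illustrated with a short sketch. It assumes that cluster labels have already been aligned across folds and that the overlap is averaged over all pairs of folds; the exact averaging scheme is not stated in the text, so this choice and the function names are assumptions.

```python
def dice(set_a, set_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two sets of vertex indices."""
    if not set_a and not set_b:
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))

def mean_pairwise_dice(fold_labels, cluster):
    """Average Dice overlap of one cluster over all pairs of folds."""
    masks = [{v for v, c in enumerate(labels) if c == cluster}
             for labels in fold_labels]
    pairs = [(a, b) for i, a in enumerate(masks) for b in masks[i + 1:]]
    return sum(dice(a, b) for a, b in pairs) / len(pairs)

# Toy example: three folds, six vertices, one cluster label per vertex
fold_labels = [
    [0, 0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1, 2],
    [0, 0, 0, 1, 2, 2],
]
score = mean_pairwise_dice(fold_labels, cluster=0)
print(round(score, 3))
```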
6.5 Discussion
6.5.1 Summary and Key Findings
We presented DIVE, a spatiotemporal model of disease progression that clusters vertex-
or voxel-wise measures of pathology in the brain based on similar temporal dynamics.
The model highlights, for the first time, groups of cortical vertices that exhibit a similar
temporal trajectory over the population. The model also estimates the temporal shift
and progression speed for every subject. We applied the model to vertex-wise cortical
thickness data from three MRI datasets (ADNI, DRC tAD and DRC PCA), as well
as an amyloid PET dataset (ADNI). Our model found qualitatively similar patterns of
cortical thinning in tAD subjects from two independent datasets (ADNI and DRC).
Moreover, it found different patterns of pathology dynamics in two distinct diseases
(tAD and PCA) and in different types of data (PET and MRI-derived cortical thickness).
Finally, DIVE also provides a new way to parcellate the brain that is specific to the
temporal trajectory of a particular disease, and enables staging of individuals at risk of
disease, which can potentially help stratification in clinical trials.
The characteristics of the subjects’ data used for training can affect the DIVE out-
put. For instance, in cortical thinning analyses we standardised the data with respect
to controls, which might have already shown cortical thinning due to early pathology.
This can be mitigated through enrichment of the control population to amyloid-negative
individuals. DIVE also relies on subjects spanning the entire disease progression, so in-
clusion of subjects in middle stages is recommended for a robust estimation of trajectories
and spatial patterns. To reliably estimate the subject-specific time shift and progression
speed, multiple follow-up scans are required. We mitigated this by using only subjects
with at least three scans, and further placing informative priors on these parameters.
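The standardisation with respect to controls mentioned above can be sketched as follows, assuming it takes the form of a z-score computed from the mean and standard deviation of the control group. The exact procedure used for DIVE is not spelled out in this section, so the z-score form, the example values and the function name are assumptions.

```python
from statistics import mean, stdev

def standardise_to_controls(values, is_control):
    """Z-score every measurement using the mean and standard deviation
    of the control subjects only."""
    controls = [v for v, c in zip(values, is_control) if c]
    mu, sigma = mean(controls), stdev(controls)
    return [(v - mu) / sigma for v in values]

# Hypothetical cortical thickness (mm) at one vertex for five subjects
thickness = [2.6, 2.4, 2.5, 2.1, 1.9]
is_control = [True, True, True, False, False]
z = standardise_to_controls(thickness, is_control)
print([round(x, 2) for x in z])
```

If some controls already show early thinning, the control mean is lower and the control spread larger, which shrinks the z-scores of patients; this is the effect that enrichment for amyloid-negative controls is meant to reduce.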
The DIVE-estimated spatial patterns are patchier in MRI compared to PET scans,
which had lower resolution and were smoothed a priori. However, we believe MRI images
should not be smoothed a priori, as the spatial correlation mechanism within DIVE
enables it to automatically remove high-frequency patterns from MRI that are not meaningful.
Moreover, such a priori smoothing could potentially remove dispersed patterns of
pathology that arise due to underlying disruption of brain networks.
random effects, or different progression dynamics for distinct subgroups, using unsuper-
vised learning methods like the SuStaIn model by [29]. While SuStaIn, just like DIVE,
estimates clusters and trajectories within the dataset, the clusters in SuStaIn are made
of subjects with similar disease progression, while the clusters in DIVE are made of ver-
tices with similar progression. Future work could combine clustering along both subjects
and vertices simultaneously to estimate disease subtypes with distinct spatiotemporal
dynamics at the vertexwise level.
There are several potential future applications of DIVE. One of the advantages of
DIVE is that it can be used to study the link between disconnected patterns of brain
pathology and connectomes extracted from diffusion tractography or functional MRI
(fMRI). Such an analysis would enable further understanding of the exact underlying
mechanisms by which the brain is affected by the disease. Our model, which can es-
timate fine-grained spatial patterns of pathology, is more suitable than standard ROI-
based methods for studying the link between pathology and these structural or functional
connectomes, because white matter or functional connections have a fine-grained and
spatially-varying distribution of endpoints on the cortex.
Apart from studying the link with brain connectomes, there are other potential ap-
plications for DIVE. While we only applied it to vertexwise data, the model can also be
applied to study voxelwise data. Moreover, DIVE can be applied to other modalities or
types of data, including FDG PET, tau PET, DTI or Jacobian compression maps from
MRI. The model can also be extended to cluster points on the brain surface
according to a more complex disease signature comprising two or more biomarkers.
For example, using our cortical thickness and amyloid PET datasets from ADNI, we
could have clustered points on the brain based on both modalities simultaneously. Such
complex disease signatures can offer important insights into the relationships between
different modalities and underlying disease mechanisms.
DIVE is a spatiotemporal model that can be used for accurately predicting and stag-
ing patients across the progression timeline of neurodegenerative diseases. The spatial
patterns of pathology can also be used to test mechanistic hypotheses which consider AD
as a network vulnerability disorder. All these avenues can contribute to disease understanding,
patient prognosis, and clinical trials assessing the efficacy of a putative
treatment for slowing down cognitive decline.
6.6 Conclusion
In this chapter I developed DIVE, a spatiotemporal model of disease progression that
estimates fine-grained spatial patterns of brain pathology, while simultaneously placing
subjects optimally on a disease time axis. I applied it to two typical AD MRI datasets
(ADNI and DRC), one dataset of PCA patients, and one typical AD PET dataset. I also
tested the robustness of the method in simulations and under cross-validation, and
compared its performance to simpler feature-based models.
In the next chapter, I will present another model, DKT, that can transfer information
across different types of dementias in order to estimate the progression of rare dementias
from limited, unimodal datasets.
Chapter 7
7.1 Contributions
In this chapter I present Disease Knowledge Transfer (DKT), a novel method for trans-
ferring biomarker information between related neurodegenerative diseases. I performed
the mathematical modelling, implementation of the DKT method, data pre-processing,
statistical analysis and model validation. The TADPOLE dataset has been assembled by
myself and Neil Oxtoby, with suggestions from the EuroPOND team. The PCA dataset
was acquired by the Dementia Research Centre, UK.
While the original DKT implementation relied on a non-parametric GP disease pro-
gression model by Marco Lorenzi [246] as a building block, for this thesis I chose a
simpler parametric model, due to the complexity of fitting hierarchical, non-parametric,
latent-space models.
7.2 Publications
• R. V. Marinescu, M. Lorenzi, S. B. Blumberg, P. Planell-Morell, A. L. Young, N.
P. Oxtoby, A. Eshaghi, K. X. X. Yong, S. Crutch, D. C. Alexander, arXiv, 2019.
7.3 Introduction
The estimation of accurate biomarker signatures in Alzheimer’s disease (AD) and related
neurodegenerative diseases is crucial for understanding underlying disease mechanisms,
predicting subjects’ progressions, and selecting the right subjects in clinical trials. As a
result, data-driven disease progression models (chapter 3) were proposed that reconstruct
long term biomarker signatures from collections of short term individual measurements.
When applied to large datasets of typical AD, disease progression models have helped
to understand the earliest events in the Alzheimer’s disease cascade
[28, 24] and the heterogeneity of AD [29], helped discover novel genes involved in AD [247],
and improved predictions over standard approaches [30]. However, by necessity
these models require large datasets that are both multimodal and
longitudinal. Such data are not available in rare neurodegenerative diseases. In particular,
122 Chapter 7. Disease Knowledge Transfer across Neurodegenerative Diseases
most datasets for rare neurodegenerative diseases come from local clinical centres, are
unimodal (e.g. MRI only) and limited both cross-sectionally and longitudinally – this
makes the application of disease progression models extremely difficult. Moreover, such a
model estimated from common diseases such as typical AD may not generalise to specific
variants. For example, in Posterior Cortical Atrophy – a neurodegenerative syndrome
causing visual disruption – posterior regions such as the occipital lobe and superior pari-
etal regions are affected early, instead of the hippocampus and temporal regions that are
affected early in typical AD.
The problem of limited data in medical imaging has so far been addressed through
transfer learning methods. Such techniques have been successfully used to improve the
accuracy of AD diagnosis [248, 249] or prediction of MCI conversion [250], but have two
key limitations. First, they use deep learning or other machine learning methods, which
are not interpretable and do not allow us to understand underlying disease mechanisms
that are either specific to rare diseases, or shared across related diseases. Secondly, these
models cannot be used to forecast the future evolution of subjects at risk of dementia,
which is important for selecting the right subjects in clinical trials.
We propose Disease Knowledge Transfer (DKT), a generative joint model that esti-
mates continuous multimodal biomarker progressions for multiple neurodegenerative dis-
eases simultaneously – including rare neurodegenerative diseases – and which inherently
performs transfer learning between the modelled phenotypes. This is achieved by exploit-
ing biomarker relationships that are shared across diseases, whilst accounting for differ-
ences in the spatial distribution of brain pathology. DKT is interpretable, which allows us
to understand underlying disease mechanisms, and can also predict the future evolution
of subjects at risk of diseases. We apply DKT on Alzheimer’s variants and demonstrate
its ability to predict non-MRI trajectories for patients with Posterior Cortical Atrophy,
in the absence of such data. This is done by fitting DKT to two datasets simultaneously: (1)
the TADPOLE Challenge [236] dataset containing subjects from the Alzheimer’s Disease
Neuroimaging Initiative (ADNI) with MRI, FDG-PET, DTI, AV45 and AV1451 scans
and (2) MRI scans from patients with Posterior Cortical Atrophy from the Dementia
Research Centre (DRC), UK. We first show that the estimated non-MRI trajectories for
PCA subjects are plausible as they agree with previous literature findings. We finally val-
idate DKT on three datasets: 1) simulated data with known ground truth, 2) TADPOLE
sub-populations with different progressions and 3) 20 DTI scans from controls and PCA
patients from the DRC, showing it yields favourable performance compared to standard
approaches. Code for DKT is available online: https://github.com/mrazvan22/dkt.
7.3. Introduction 123
[Figure 7.1 diagram. Top row: disease-specific models, showing temporal, frontal and occipital dysfunction scores progressing from normal to abnormal. Bottom row: Temporal Unit and Occipital Unit, showing amyloid, tau and MRI biomarker values plotted against temporal and occipital dysfunction respectively, from normal to abnormal.]
Figure 7.1: Diagram of the proposed framework for joint modelling of multiple diseases.
We assume that each disease can be modelled as the evolution of abstract dysfunctionality
scores (Y-axis, top row), each one related to different brain regions. Each region-specific
dysfunctionality score then further models (X-axis, bottom row) the progression of sev-
eral modality-specific biomarkers within that same region. For instance, the temporal
dysfunction, modelled as a biomarker in the disease specific model (top row), is the
X-axis in the disease agnostic model (temporal unit, bottom row), which aggregates to-
gether abnormality from amyloid, tau and MR imaging within the temporal lobe. The
biomarker correlations within the bottom units are assumed to be disease agnostic and
shared across all diseases modelled. Disease knowledge transfer can then be achieved via
the disease-agnostic units.
7.4 Methods
7.4.1 DKT Framework
Fig. 7.1 shows the overall diagram of our proposed framework for joint modelling of
diseases. We assume that the progression of each disease (X-axis, top row) can be mod-
elled as the evolution of abstract dysfunctionality scores, each one related to different
brain regions (top row). Each dysfunctionality score is then modelled as the progression
of several biomarkers within that same region, but acquired using different noninvasive
imaging modalities (bottom row). Each group of biomarkers in the bottom row will be
called a functional unit, because its biomarkers are linked through a
common “function” in a disease-agnostic way, since they relate to the same underlying
brain region. Biomarker groupings into functional units are defined a priori. We
choose to model the correlations within each unit using the disease progression model
(DPM) by Jedynak et al. [2], but any other DPM can also be used. The DPM allows
us to reconstruct unit-specific dysfunction progression manifolds (bottom row, X axis),
which can be used for staging subjects. Finally, we use the same DPM to express the
progression within each disease (Fig. 7.1, top) in terms of the dysfunction scores estimated
within each functional unit. More precisely, the X-axis dysfunction scores from
the functional units become Y-axis measurements in the disease specific models.
The model has a concise mathematical formulation. We assume a set of given biomarker
measurements Y = [yijk |(i, j, k) ∈ Ω] for subject i at visit j in biomarker k, where Ω is
defined as the set of available biomarker measurements, since subjects can have missing
data at various visits. We assume that each subject i at each visit j has an underlying
disease stage sij = βi + mij , where mij represents the months since baseline visit for
subject i at visit j and βi represents the time shift of subject i. We further denote by θk
the parameters used to represent the trajectory for biomarker k ∈ K within its functional
unit ψ(k), where ψ: {1, ..., K} → Λ is a function that maps each biomarker k to a unique
functional unit l ∈ Λ, where Λ is the set of functional units. Moreover, we denote by
λld the parameters for the trajectory of the dysfunction score corresponding to functional
unit l ∈ Λ in the space of disease d. These definitions allow us to formulate the likelihood
for a single measurement yijk as follows:
p(y_{ijk} \mid \theta_k, \lambda_{d_i}^{\psi(k)}, \beta_i, \epsilon_k) = \mathcal{N}\left(y_{ijk} \mid g\left(f(\beta_i + m_{ij}; \lambda_{d_i}^{\psi(k)}); \theta_k\right), \epsilon_k\right)    (7.1)
where g( . ; θk ) represents the trajectory of biomarker k within functional unit ψ(k) and
f( . ; λ_{d_i}^{ψ(k)}) represents the trajectory of the functional unit ψ(k) within the space of
disease di . To be precise, di ∈ D represents the index of the disease space where subject i
belongs, where D is the set of all diseases modelled. For example, MCI and tAD subjects
from ADNI as well as tAD subjects from the DRC cohort can all be assigned di = 1,
while PCA subjects from the DRC dataset can be assigned di = 2. Healthy controls
can be assigned to either disease space, although a more precise assignment would take
molecular biomarkers into account. Variable ε_k denotes the variance of measurements for
biomarker k.
We extend the above model to multiple subjects, visits and biomarkers to get the full
model likelihood:
p(y \mid \theta, \lambda, \beta, \epsilon) = \prod_{(i,j,k) \in \Omega} p(y_{ijk} \mid \theta_k, \lambda_{d_i}^{\psi(k)}, \beta_i, \epsilon_k)    (7.2)
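The likelihood above can be evaluated with a few lines of code. The sketch below assumes that both f and g are four-parameter sigmoids of the form a / (1 + exp(−b(x − c))) + d, which is an assumption about the parametric shape rather than something stated in this passage; the dictionary-based data layout and all function names are likewise hypothetical.

```python
import math

def sigmoid(x, params):
    """Assumed trajectory shape: a / (1 + exp(-b (x - c))) + d."""
    a, b, c, d = params
    return a / (1.0 + math.exp(-b * (x - c))) + d

def log_normal_pdf(y, mean, variance):
    """Log-density of a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (y - mean) ** 2 / variance)

def log_likelihood(measurements, theta, lam, beta, eps, psi, disease):
    """Log of the full likelihood: a sum of per-measurement Gaussian terms.

    measurements: iterable of (i, m_ij, k, y_ijk), i.e. the index set Omega
    theta[k]: trajectory parameters of biomarker k within its unit
    lam[(l, d)]: trajectory parameters of unit l in disease d
    beta[i]: time shift, eps[k]: noise variance
    psi[k]: functional unit of biomarker k, disease[i]: disease index d_i
    """
    total = 0.0
    for i, m_ij, k, y in measurements:
        stage = beta[i] + m_ij                                   # s_ij
        dysfunction = sigmoid(stage, lam[(psi[k], disease[i])])  # f(. ; lambda)
        predicted = sigmoid(dysfunction, theta[k])               # g(. ; theta_k)
        total += log_normal_pdf(y, predicted, eps[k])
    return total

# Toy setting: one subject, one biomarker, one functional unit, one disease
theta = {0: (1, 5, 0.2, 0)}
lam = {(0, 0): (1, 0.3, -4, 0)}
beta, eps, psi, disease = {0: 1.5}, {0: 0.05 ** 2}, {0: 0}, {0: 0}

y_pred = sigmoid(sigmoid(beta[0] + 2.0, lam[(0, 0)]), theta[0])
ll_exact = log_likelihood([(0, 2.0, 0, y_pred)], theta, lam, beta, eps, psi, disease)
ll_noisy = log_likelihood([(0, 2.0, 0, y_pred + 0.1)], theta, lam, beta, eps, psi, disease)
print(ll_exact > ll_noisy)
```

Working with the sum of log-densities rather than the product of densities avoids numerical underflow when Ω contains many measurements. The time shift and the time since baseline must be expressed in the same unit.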
7.4. Methods 125
Figure 7.2: The algorithm for estimating the DKT parameters. The algorithm succes-
sively updates the biomarker trajectories within the functional units (disease agnostic
models), dysfunctionality trajectories (disease specific) and subject-specific time shifts
until convergence.
k that belong to functional unit l (i.e. ψ(k) = l). Finally, Ωi (line 13) represents all
measurements from subject i, for all biomarkers and visits.
The algorithm we proposed in Figure 7.2 has a complexity of O(I ∗ S), where S is the
number of subjects in the dataset and I is the number of iterations until convergence. In
practice, convergence is achieved after around 10-15 iterations, which takes around 1h on
a Xeon CPU E5-2680 @ 2.5GHz.
Disease model
– functional unit l0 : θ0 = (1, 5, 0.2, 0), θ2 = (1, 5, 0.55, 0) and θ4 = (1, 5, 0.9, 0)
– functional unit l1 : θ1 = (1, 10, 0.2, 0), θ3 = (1, 10, 0.55, 0) and θ5 = (1, 10, 0.9, 0)
– “synthetic AD” disease: λ^0_0 = (1, 0.3, −4, 0) and λ^1_0 = (1, 0.2, 6, 0).
– “synthetic PCA” disease: λ^0_1 = (1, 0.3, 6, 0) and λ^1_1 = (1, 0.2, −4, 0).
Subject model
• For each subject and each biomarker, we generated data for four consecutive visits,
each visit one year apart, using a noise standard deviation of 0.05.
These trajectory and subject parameters were chosen to mimic the TADPOLE and
DRC cohorts, described below. Before fitting DKT on the synthetic dataset, we discarded
the data from biomarkers k0 , k1 , k4 and k5 for all subjects within the synthetic PCA
cohort, to simulate the lack of multimodal data in these subjects. Remaining biomarkers
k2 and k3 , for which data was still available in the synthetic PCA cohort, are assumed to
be of the same modality (e.g. MRI volume) but to represent measurements from different
brain regions (e.g. temporal and occipital).
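The generation of the synthetic dataset can be sketched as follows, using the trajectory parameters, the four yearly visits and the noise standard deviation of 0.05 listed above. Several ingredients are assumptions made only for this sketch: the sigmoid form a / (1 + exp(−b(x − c))) + d and the ordering of its four parameters, the number of subjects per cohort, and the uniform range of the time shifts.

```python
import math
import random

def sigmoid(x, params):
    """Assumed trajectory shape: a / (1 + exp(-b (x - c))) + d."""
    a, b, c, d = params
    return a / (1.0 + math.exp(-b * (x - c))) + d

# Trajectory parameters as listed in the text
THETA = {0: (1, 5, 0.2, 0), 2: (1, 5, 0.55, 0), 4: (1, 5, 0.9, 0),     # unit l0
         1: (1, 10, 0.2, 0), 3: (1, 10, 0.55, 0), 5: (1, 10, 0.9, 0)}  # unit l1
PSI = {k: k % 2 for k in THETA}                     # biomarker -> functional unit
LAMBDA = {(0, 0): (1, 0.3, -4, 0), (1, 0): (1, 0.2, 6, 0),   # synthetic AD (d = 0)
          (0, 1): (1, 0.3, 6, 0), (1, 1): (1, 0.2, -4, 0)}   # synthetic PCA (d = 1)
NOISE_SD, N_VISITS = 0.05, 4

def generate_cohort(n_subjects, disease, rng, kept_biomarkers=None):
    """Return rows (subject, years_since_baseline, biomarker, value)."""
    kept = set(THETA) if kept_biomarkers is None else set(kept_biomarkers)
    rows = []
    for i in range(n_subjects):
        beta = rng.uniform(-10, 10)          # assumed time-shift range, in years
        for visit in range(N_VISITS):        # visits one year apart
            for k in sorted(kept):
                dys = sigmoid(beta + visit, LAMBDA[(PSI[k], disease)])
                y = sigmoid(dys, THETA[k]) + rng.gauss(0, NOISE_SD)
                rows.append((i, visit, k, y))
    return rows

rng = random.Random(0)
ad_rows = generate_cohort(100, disease=0, rng=rng)       # cohort sizes are assumed
pca_rows = generate_cohort(50, disease=1, rng=rng, kept_biomarkers=[2, 3])
print(len(ad_rows), len(pca_rows))
```

Keeping only biomarkers 2 and 3 in the synthetic PCA cohort mimics the discarding step described above, so that the remaining four trajectories in that cohort can only be recovered through the shared functional units.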
7.5 Results
7.5.1 Synthetic Results
Fig. 7.3 shows the true and estimated subject shifts and trajectories for each functional
unit l and biomarker k. In the top-left figures we show scatter plots of the true shifts
(y-axis) against estimated shifts (x-axis), for the ’synthetic AD’ and ’synthetic PCA’
diseases. On the top-right and middle-left figures, we show the trajectories of the func-
tional units within disease d = 0 (synthetic AD) and d = 1 (synthetic PCA). In the
middle-right and bottom-left figures, we show the biomarker trajectories within units l0
and l1 . In Figure 7.4, we show the corresponding trajectories of PCA patients, which, as
opposed to Fig. 7.3, are plotted directly against the time-shifts, as is normally done in
a classical disease progression model. We also show the true trajectories and the data of
the synthetic PCA cohort.
The results in Fig. 7.3 suggest that the DKT-estimated trajectories match closely
(mean absolute error, MAE < 0.058) with the true trajectories, for both the unit-
trajectories within the disease-specific models and the biomarker trajectories within the
disease-agnostic models. Moreover, the subject time-shifts are very close (R² > 0.98) to
the true time-shifts. When plotted directly against the disease space, the estimated PCA
trajectories also match the true trajectories, even when there is a complete lack of such
data (Fig. 7.4, biomarkers 0,1,4 and 5). There are however small errors in biomarkers
0 and 5 which are due to measurement noise (confirmed by experiments with smaller
noise level – not shown here). The equivalent trajectories estimated for the synthetic AD
cohort also show very good agreement with the true trajectories (Fig. C.1).
[Figure 7.3 panels. Subject shifts: true shifts against estimated shifts for CTL/AD (R² = 0.997) and CTL2/PCA (R² = 0.988). Dis0 and Dis1 estimated and true trajectories: dysfunctionality score of Unit0 and Unit1 against disease progression score (MAE = 0.057 reported for Dis0). Unit0 and Unit1 estimated and true trajectories: biomarker value against dysfunctionality score, for biomarkers 0, 2, 4 and biomarkers 1, 3, 5 respectively (MAE = 0.015 reported for Unit1).]
Figure 7.3: Comparison between true and DKT-estimated subject time-shifts and
biomarker trajectories. (top-left) Scatter plots of the true shifts (y-axis) against esti-
mated shifts (x-axis), for the ’synthetic AD’ (left) and ’synthetic PCA’ (right) diseases.
We also show the DKT-estimated and true trajectories of the functional units within
the ’synthetic AD’ disease (top-right) and the ’synthetic PCA’ disease (middle-left). For
these figures, the x-axis measures the normalised disease progression score si while the
y-axis measures the dysfunctionality scores f (si ; λld ). Finally, we also show the biomarker
trajectories within unit 0 (middle-right) and unit 1 (bottom), where the x-axis represents
the dysfunctionality scores f (si ; λld ) and the y-axis represents the biomarker value.
[Figure 7.4 panels: one panel per biomarker, with Disease Progression (years) on the x-axis. Surviving annotations: MAE = 0.042 (biomarker 3), MAE = 0.029 (biomarker 4) and MAE = 0.058 (biomarker 5).]
Figure 7.4: Estimated biomarker trajectories for the “synthetic PCA” disease, plotted
alongside true trajectories. The trajectories of biomarkers 0, 1, 4 and 5 were
estimated without any data from the “synthetic PCA” disease, based only on the
disease-agnostic correlations with biomarkers 2 and 3.
7.5. Results 131
Figure 7.5: (a) DKT-estimated biomarker trajectories in the occipital functional unit.
Subject data from ADNI and our local DRC cohort are also shown. The X-axis, defined
as the occipital dysfunctionality score, represents the time-shifts (in months) of each
subject. (b-c) Progression of DKT-estimated dysfunctionality scores for (b) typical AD
and (c) PCA.
Figure 7.6: Estimated multi-modal trajectories for the PCA cohort. The only data that
were available were the MRI volumetric data. The dynamics of the other biomarkers
have been inferred by the model using data from typical AD, taking into account the
different spatial distribution of pathology in PCA as compared to typical AD.
7.6. Validation on DTI Data in PCA 133
• A separate test set of 20 DTI scans from controls and PCA patients from our own
cohort.
To split TADPOLE into subgroups with different progression, we used the SuStaIn
model by [29], which resulted in three subgroups: hippocampal, cortical and subcortical,
with prominent early atrophy in the hippocampus, cortical and subcortical regions
respectively. To evaluate prediction accuracy, we computed the rank correlation between
the DKT-predicted biomarker values and the measured values in the test data. We
used the rank correlation instead of the mean squared error as it is not susceptible to
systematic biases of the models when predicting “unseen data” in a certain disease. We also
compared the performance of DKT at predicting unseen data with four other models:
• Multivariate: A multivariate Gaussian Process model with RBF kernel that predicts
a DTI ROI marker from multiple MRI markers.
• Spline: a univariate cubic spline regression model that predicts the DTI biomarker
based on the corresponding MRI biomarker, independently for each region.
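The choice of rank correlation over mean squared error described above can be illustrated with a small sketch. It is not the evaluation code used for Table 7.1: the biomarker values and the form of the bias are made-up assumptions, and Spearman's rank correlation is taken as the rank correlation measure. A prediction that is systematically shifted and rescaled keeps a perfect rank correlation while its mean squared error is clearly non-zero.

```python
import math

def average_ranks(values):
    """1-based ranks in increasing order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_squared_error(predicted, measured):
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)

# Hypothetical measured values and a systematically biased prediction.
measured = [0.42, 0.35, 0.51, 0.29, 0.46]
predicted = [0.8 * m + 0.2 for m in measured]  # monotonic bias, ordering preserved

print(spearman(predicted, measured))            # close to 1: ordering is preserved
print(mean_squared_error(predicted, measured))  # non-zero: the bias is penalised
```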
Validation results are shown in Table 7.1, for information transfer from the hippocampal
to the cortical TADPOLE subgroup, as well as for PCA subjects from our DRC cohort. When
predicting missing DTI markers from the TADPOLE cortical subgroup as well as PCA
subjects, the DKT correlations are generally high for the cingulate, hippocampus and
parietal lobe, and lower for the frontal lobe. DKT further shows favourable performance
compared to the other models, due to its ability to disentangle the progressions of each
disease separately. In particular, it shows the best results for DTI FA prediction in the
parietal and temporal lobes on both datasets, and similar performance to the latent-stage
model on the PCA dataset for the cingulate, frontal lobe and hippocampus (differences
here are not statistically significant). Predicting unseen data in these diseases/subtypes
is a challenging problem: the models yield poor predictions for the occipital lobe
(negative correlations), most likely due to overfitting.
Table 7.1: Performance evaluation of DKT and four other statistical models of decreasing
complexity. We show the rank correlation between predicted biomarkers and measured
biomarkers in (top) TADPOLE subgroups – information transfer from hippocampal sub-
group to cortical subgroup – and (bottom) PCA. (*) Statistically significant difference
in the performance of DKT vs the other models, based on a two-tailed t-test, Bonferroni
corrected.
7.7 Discussion
We presented DKT, a framework that enables, for the first time, joint modelling of
biomarker progressions in multiple neurodegenerative diseases simultaneously. The frame-
work allows the inference of biomarker trajectories in rare diseases, for which there is not
enough data to allow estimation of such trajectories, and accounts for a different spatial
distribution of pathology between distinct types of dementia. This further enables us
to understand the complex mechanisms of rare diseases, as well as mechanisms shared
between different types of related diseases.
We provided an example implementation of DKT using specific models of the biomarker
trajectories, measurement noise and link function (the disease progression score). How-
ever, DKT should be considered as a general framework for joint modelling of biomarker
trajectories within different diseases simultaneously. The actual implementation of DKT
can thus be extended to use non-parametric trajectories, or more complex link functions
that estimate not only subject time-shifts but also progression speed or higher order
terms.
While in this work we have focused on Alzheimer’s variants such as tAD and PCA,
DKT can also be applied to other progressive neurodegenerative diseases of non-Alzheimer’s
type such as tauopathies (e.g. Frontotemporal dementia), synucleinopathies (e.g. Parkin-
son’s disease), other neurodegenerative diseases such as Huntington’s disease or Multiple
Sclerosis, and even the normal ageing process. Cognitive tests can also be included in
the disease-specific sub-models of DKT, or even allocated in the functional units of the
regions that are responsible for those tasks, based on previous voxel-based morphometry
studies. However, some care needs to be exercised when selecting the biomarkers and
grouping them into functional units, as in some diseases the assumption of
disease-agnostic dynamics might not hold for some groups of molecular biomarkers. For example,
some non-Alzheimer’s tauopathies such as Frontotemporal dementia might show tau ab-
normalities but no corresponding amyloid abnormalities within the same region. In the
case of Frontotemporal dementia, we recommend including higher-level biomarkers such
as glucose metabolism from FDG, white matter degeneration from DTI or volume from
structural MRI, but one should exclude amyloid markers.
Our work has several limitations: 1) DKT assumes all subjects within a disease follow
the same trajectory, without considering heterogeneity within the disease population;
2) the allocation of biomarkers into functional units has to be done using a priori
human knowledge; 3) DKT currently works only on extracted brain features, discarding
important information present in the brain morphometry; 4) for validation, the synthetic
experiment we ran was limited to only one setting of the parameters; and 5) the
validation on patient data was done only on a small set of 20 DTI scans, due to the lack
of multimodal data in PCA.
There are several potential avenues for further research: 1) to account for hetero-
geneity, DKT can also be easily extended to include subject-specific effects; 2) improved
schemes for biomarker allocation to functional units can take connectivity into account, or
derive it from the data automatically; 3) to account for brain morphometry and connec-
tivity, DKT can be extended into a fully spatio-temporal model, by estimating continuous
changes in volumetric brain images – in this case, each voxel can have an associated dys-
functionality score that is derived from measurements of various modalities from that
voxel; 4-5) DKT can be further validated on more complex synthetic experiments with
variable parameter settings, and on patient data from ADNI, where the population could
be a-priori split into sub-groups with different progressions. On these subgroups, DKT
can be used to transfer biomarker modalities that have been left out during training.
7.8 Conclusion
In this work I presented DKT, a novel method that can empower studies of rare dementias
with limited biomarker data by leveraging data from larger datasets of related dementias.
When applied to synthetic data with ground truth, I showed that DKT can robustly
recover biomarker trajectories in two distinct diseases and also subject-specific time-shifts.
I also applied DKT to multimodal imaging biomarkers from the TADPOLE Challenge
dataset, where I showed that it can estimate plausible non-MRI biomarker trajectories
for Posterior Cortical Atrophy in the absence of such data for this disease. I validated
the performance of DKT on a test set of 20 DTI scans from PCA and controls, and showed
that DKT has similar or better performance compared to simpler models.
In the next chapter, I will present the TADPOLE Challenge, which evaluates the
performance of algorithms and features at predicting the future evolution of subjects at
risk of AD. As opposed to the work performed in this chapter, the TADPOLE challenge
aims to evaluate a much larger set of algorithms and features, comprising regression
techniques, disease progression models, machine learning techniques and even manual
predictions made by clinicians.
Chapter 8
8.1 Contributions
In this chapter I present the design of The Alzheimer’s Disease Prediction Of
Longitudinal Evolution (TADPOLE) Challenge, which aims to predict the evolution of subjects
at risk of Alzheimer’s disease. The challenge was organised by the European Progression
Of Neurological Disease (EuroPOND) consortium, in collaboration with the Alzheimer’s
Disease Neuroimaging Initiative (ADNI). The key organisers of the challenge were, in
alphabetical order: Daniel Alexander, Frederik Barkhof, Esther Bron, Nick Fox, Stefan
Klein, Razvan Marinescu (myself), Neil Oxtoby and Alexandra Young.
I contributed suggestions to the challenge design, helped write the website, assembled
the TADPOLE D2 longitudinal dataset and the data dictionary, and wrote benchmark
prediction scripts. I also built the leaderboard system, which performs live evaluation
of the participants’ submissions. I further helped promote the competition at several
medical imaging conferences, and organised two mini-competitions at the PyConUK
conference and at the CMIC summer school, 2018.
Daniel Alexander proposed the main design of the challenge, secured funding, helped
write the website, and wrote simple prediction scripts. Neil Oxtoby contributed to chal-
lenge design, helped me validate the D2 dataset, built the D3 cross-sectional dataset,
helped write the website, organised webinars and promoted the competition. Alexandra
Young contributed to challenge design, helped write the website, performed simulations
to establish which target biomarkers are most suitable and promoted the competition. Es-
ther Bron and Stefan Klein contributed to challenge design and helped write the website.
Nick Fox and Frederik Barkhof provided valuable suggestions on the challenge design.
Arthur Toga and Michael Weiner offered access to the ADNI database.
8.2 Publications
• R. V. Marinescu, N. P. Oxtoby, A. L. Young, E. E. Bron, A. W. Toga, M. W. Weiner,
F. Barkhof, N. C. Fox, S. Klein, D. C. Alexander and the EuroPOND Consortium,
138 Chapter 8. TADPOLE Challenge: Prediction of Evolution in Alzheimer’s Disease
8.3 Introduction
As already mentioned in section 3, early diagnosis of dementia is important in order
to enable the administration of treatments in early disease stages, before the onset of
cognitive decline. While such early and accurate diagnosis of dementia can be challenging,
this can be aided by quantitative biomarker measurements taken from magnetic resonance
imaging (MRI), positron emission tomography (PET), and cerebro-spinal fluid (CSF)
samples extracted from lumbar puncture. It has been hypothesised for AD [1, 119, 252,
253] that all these biomarkers become abnormal at different intervals before symptom
onset, suggesting that together they can be used for accurate prediction of onset and
overall disease progression in individuals. In particular, some of the early biomarkers
become abnormal decades before symptom onset, and can thus facilitate early diagnosis.
Several approaches for predicting AD-related target variables (e.g. clinical diagnosis,
cognitive/imaging biomarkers) have been proposed which leverage multimodal biomarker
data available in AD. Traditional longitudinal approaches based on statistical regression
model the relationship of the target variables with other known variables. Examples in-
clude regression of the target variables against clinical diagnosis [131], cognitive test scores
[193, 175], rate of cognitive decline [177], and retrospectively staging subjects by time
to conversion between diagnoses [254]. Another approach involves supervised machine
learning techniques such as support vector machines, random forests, and artificial neural
networks, which use pattern recognition to learn the relationship between the values of a
set of predictors (biomarkers) and their labels (diagnoses). These approaches have been
used to discriminate AD patients from cognitively normal individuals [207, 255], and for
discriminating at-risk individuals who convert to AD in a certain time frame from those
who do not [256, 257]. The emerging approach of disease progression modelling aims
to reconstruct biomarker trajectories or other disease signatures across the disease pro-
gression timeline, without relying on clinical diagnoses or estimates of time to symptom
onset. Examples include models built on a set of scalar biomarkers to produce discrete
[23, 24] or continuous [2, 3, 25] biomarker trajectories; richer but less comprehensive
models that leverage structure in data such as MR images [258, 259, 4]; and models of
disease mechanisms [38, 126, 6, 28].
These models have shown promise for predicting AD biomarker progression when us-
ing existing test data, but few have been tested on truly unseen future data. Moreover,
different investigators test these models on different datasets (including subsets of a sin-
gle dataset) and use different processing pipelines. Community challenges have proved
effective, in the medical image analysis field and beyond, for providing unbiased com-
parative evaluations of algorithms and tools designed for a particular task. Previous
challenges that focused on prediction of AD progression include the CADDementia chal-
lenge [39], which aimed to predict clinical diagnosis from MRI scans. A similar challenge,
the “International challenge for automated prediction of MCI from MRI data” [40] asked
8.4. Competition Design 139
participants to predict diagnosis and conversion status from extracted MRI features of
subjects from the ADNI study [260]. Yet another challenge, The Alzheimer’s Disease
Big Data DREAM Challenge [261], asked participants to predict cognitive decline from
genetic and MRI data.
The Alzheimer’s Disease Prediction Of Longitudinal Evolution (TADPOLE) Chal-
lenge aims to identify the data, features and approaches that are the most predictive of
AD progression. In contrast to previous challenges, our motivation is to improve future
clinical trials through identification of patients most likely to benefit from an effective
treatment, i.e., those at early stages of disease who are likely to progress over the short-
to-medium term (1-5 years). Identifying such subjects reliably helps cohort selection by
focusing on groups that highlight positive treatment effects. The challenge thus focuses
on forecasting three key features: clinical status, cognitive decline, and neurodegeneration
(brain atrophy), over a five-year timescale. It uses rollover¹ subjects from the ADNI
study for whom a history of measurements is available, and who are expected to continue
in the study, providing future measurements for testing. Since the test data does not exist
at the time of forecast submissions, the challenge provides a completely unbiased basis
for performance comparison. TADPOLE goes beyond previous challenges by drawing
on a vast set of multimodal measurements from ADNI which support prediction of AD
progression.
Figure 8.1: TADPOLE Challenge design. Participants are required to train a predictive
model on a training dataset (D1 and/or others) and make forecasts for different datasets
(D2, D3) by the submission deadline. Evaluation will be performed on a test dataset
(D4) that is acquired after the submission deadline.
¹ i.e. subjects who enrolled in the previous ADNI2 study and who will continue in the third phase.
Table 8.1: The format of the forecasts for three example subjects. Participants have to
predict, for each subject, the probability of clinical diagnosis (CN/MCI/AD), the ADAS-
Cog13 score and Ventricle volume, as well as the 50% confidence range. RID - Roster ID
is the unique identifier for ADNI subjects, ADAS - ADAS-Cog13, CI - confidence range.
Note that the CN/MCI/AD probabilities are normalised at evaluation time, so they do not
need to sum to one.
8.5 Forecasts
Since we do not know the exact time of future data acquisitions for any given individual,
TADPOLE challenge participants are required to make, for every individual, month-by-
month forecasts of three key biomarkers: (1) clinical diagnosis which can be either cog-
nitively normal (CN), mild cognitive impairment (MCI) or probable Alzheimer’s disease
(AD); (2) ADAS-Cog13 (ADAS13) score; and (3) ventricle volume (divided by intra-
cranial volume). Evaluation is performed using forecasts at the months that correspond
to data acquisition. TADPOLE forecasts are required to be probabilistic and some eval-
uation metrics will account for forecast probabilities provided by participants. Methods
or algorithms that do not produce probabilistic estimates can still be used, by setting
binary probabilities (zero or one) and default confidence intervals.
Participants are required to submit forecasts in a standardised format (see Table
8.1). For clinical status, relative likelihoods of each option (CN, MCI, and AD) for
each individual should be provided. These are normalised at evaluation time; negative
likelihoods are set to zero. For ADAS13 and ventricle volume, participants need to
provide a best-guess value as well as a 50% confidence interval for each individual. This
50% confidence interval (as opposed to the more standard 95%) was chosen to provide a
more symmetric and less noisy evaluation of over- and under-estimation of the confidence
interval, because similar sample sizes of data fall inside and outside the interval.
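The normalisation of clinical-status likelihoods described above can be sketched as follows. This is a minimal illustration rather than the official evaluation script: the function name is hypothetical, and the handling of a forecast whose likelihoods are all zero or negative (raising an error) is an assumption, since the text does not specify it.

```python
def normalise_likelihoods(cn, mci, ad):
    """Turn relative likelihoods for CN, MCI and AD into probabilities.

    Negative likelihoods are set to zero, then the three values are
    rescaled to sum to one, as described in the text.
    """
    clipped = [max(0.0, float(v)) for v in (cn, mci, ad)]
    total = sum(clipped)
    if total == 0.0:
        # Assumption: an all-zero forecast carries no information, so reject it.
        raise ValueError("at least one likelihood must be positive")
    return tuple(v / total for v in clipped)

# Relative likelihoods that do not sum to one, with one negative entry.
print(normalise_likelihoods(2.0, 1.0, -1.0))

# A non-probabilistic method can submit binary probabilities (zero or one).
print(normalise_likelihoods(0, 1, 0))
```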
8.6 Data
We provide participants with a standard ADNI-derived dataset (available via the Labora-
tory Of NeuroImaging: LONI) which they can use to train their algorithms, removing the
need to pre-process the ADNI data themselves or merge different spreadsheets. However,
participants are allowed to use a custom training set, by adding any other ADNI data
or data from other studies. The software code used to generate the standard dataset is
openly available in a GitHub repository² and on the ADNI website, packaged with the
standard dataset in the LONI ADNI database.
² https://github.com/noxtoby/TADPOLE
8.7. TADPOLE Datasets 141
Figure 8.2: Venn diagram of the TADPOLE datasets derived from ADNI data, for train-
ing (D1), longitudinal prediction (D2), cross-sectional prediction (D3) and the test set
(D4). D3 is a subset of D2, which in turn is a subset of D1. Other non-ADNI data can
also be used for training.
and (c) a test data set, which contains the patient outcomes against which we will evaluate
forecasts — in TADPOLE, this data did not exist at the time of submitting forecasts.
In order to evaluate the effect of different methodological choices, we prepared three
standard data sets for training and prediction:
• D1: The TADPOLE standard training set draws on longitudinal data from the
entire ADNI history. The data set contains a set of measurements for every in-
dividual that has provided data to ADNI in at least two separate visits (different
dates) across three phases of the study: ADNI1, ADNI GO, and ADNI2.
• D2: The TADPOLE longitudinal prediction set contains as much available data
as we could gather from the ADNI rollover individuals for whom challenge partici-
pants are asked to provide forecasts. D2 includes all available time-points for these
individuals.
• D3: The TADPOLE cross-sectional prediction set contains a single (most recent)
time point and a limited set of variables from each rollover individual in D2. Al-
though we expect worse forecasts from this data set than D2, D3 represents the
information typically available when selecting a cohort for a clinical trial.
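The relationship between D2 and D3 described in the list above (a single, most recent time point and a limited set of variables per rollover individual) can be sketched as below. This is not the script used to build the TADPOLE datasets: the record layout and the field names other than RID are hypothetical, chosen only for illustration.

```python
from datetime import date

def cross_sectional_set(visits, keep_fields):
    """Keep only the most recent visit per subject, restricted to keep_fields.

    visits: list of dicts with at least "RID" (subject identifier) and
    "exam_date" (a date); both the layout and the names are assumptions.
    """
    latest = {}
    for visit in visits:
        rid = visit["RID"]
        if rid not in latest or visit["exam_date"] > latest[rid]["exam_date"]:
            latest[rid] = visit
    return [
        {field: latest[rid][field] for field in ("RID", "exam_date", *keep_fields)}
        for rid in sorted(latest)
    ]

# Hypothetical longitudinal records for two subjects.
longitudinal = [
    {"RID": 1, "exam_date": date(2015, 3, 1), "ADAS13": 12.0, "ventricles": 0.021, "FDG": 1.31},
    {"RID": 1, "exam_date": date(2016, 3, 5), "ADAS13": 14.0, "ventricles": 0.023, "FDG": 1.27},
    {"RID": 2, "exam_date": date(2016, 1, 9), "ADAS13": 30.0, "ventricles": 0.041, "FDG": 1.02},
]

for row in cross_sectional_set(longitudinal, keep_fields=("ADAS13", "ventricles")):
    print(row)
```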
The forecasts will be evaluated on future data (D4 – test set) from ADNI3 rollovers, ac-
quired after the challenge submission deadline. In addition to the three standard datasets
(D1, D2 and D3), challenge participants are allowed to use any other data sets that might
serve as useful additional training data.
Fig. 8.2 shows a diagram highlighting the nested structure of datasets D1–D3. Ta-
ble 8.2 shows the proportion of biomarker data available in each dataset. There are a
considerable number of entries with missing data, especially for some biomarkers such as
tau imaging (AV1451). We also estimated the expected number of subjects and avail-
able data for D4, using information from the ADNI3 procedures and using rollovers from
previous ADNI studies (Table 8.2, right-most column) – See E.1 for more information on
D4 estimates. Based on our estimates, we believe the size of D4 (around 330 subjects, 1
visit/subject) should be enough for a reliable evaluation of TADPOLE submissions.
Subject statistics        D1        D2        D3        D4
Nr. of subjects           1667      896       896       330
Visits per subject        7.6±3.8   8.5±4.2   1.0±0.0   1.0±0.0
Diagnosis* (%)
  CN                      31        38        45        39
  MCI                     56        57        39        49
  AD                      13        5         16        12
Data availability**
  Cognitive tests (%)     70        68        84        62
  MRI (%)                 62        56        75        69
  FDG-PET (%)             16        20        0         20
  AV45-PET (%)            16        22        0         19
  AV1451-PET (%)          0.7       1.1       0         19
  DTI (%)                 6         8         0         15
  CSF (%)                 18        19        0         14
Table 8.2: Subject statistics and available data in the TADPOLE datasets D1, D2 and D3.
There is a considerable amount of missing data in some biomarkers such as AV1451. Num-
bers for D4 are estimated based on ADNI3 procedures (see ADNI3 procedures manual)
and rollovers from previous ADNI studies. (*) Diagnosis at baseline visit. (**) Percentage
of all visits (across all subjects) that have measurements for desired biomarker.
8.8 Submissions
There are two kinds of submissions that challenge participants can make. A simple entry
requires a minimal forecast and a description of methods; it makes participants eligible
for the prizes but not co-authorship on the scientific paper documenting the results. A
simple entry can use any training data or prediction sets and forecast at least one of the
target outcome variables (clinical status, ADAS13 score, or ventricle volume). A full entry
makes participants eligible for consideration as co-authors on the publication documenting
the results. Such a full entry requires a complete forecast for all three outcome variables on
all subjects from the D2 prediction set, along with a description of the methods. Each
individual participant is limited to a maximum of three submissions. This restriction has
been introduced to avoid the risk of participants tuning their method on the test set by
submitting multiple predictions for a range of algorithm settings. Although not required
for a full entry, participants are strongly encouraged to submit predictions also for D3.
Prizes are awarded to the best submissions regardless of the choice of training sets
(D1/custom) and prediction sets (D2/D3). However, the additional submissions support
the key scientific aims of the challenge by allowing us to separate the influence of the
choice of training data, post-processing pipelines, and modelling techniques or prediction
algorithms. The target variables used for evaluation, in particular ventricle volume, will
use the same post-processing pipeline as the standard data sets D1-D3.
Beyond the standard training dataset (D1), participants can include additional fore-
casts from ”custom” (i.e. constructed by the participant) training data or custom post-
processing of the raw data from subjects in the standard training set. The same applies
to the prediction sets D2 and D3, which can be customised by the participants if desired,
e.g. a prediction set with different features from the same individuals as in D2 and D3.
Table 8.3 shows the twelve possible combinations of subject sets, processing and predic-
tion sets, from which a full-entry submission must contain at least one of the first four
(ID 1–4).
Table 8.3: Types of submissions that can be made by participants, for different types of
training sets, prediction sets and post-processing pipelines.
The multiclass Area Under the ROC Curve (mAUC) is a simple generalisation of the
area under the ROC curve applicable to problems with more than two classes [267]. The
AUC $\hat{A}(c_i \mid c_j)$ for classification of a class $c_i$ against another class $c_j$ is:

$$\hat{A}(c_i \mid c_j) = \frac{S_i - n_i (n_i + 1)/2}{n_i n_j} \qquad (8.1)$$

where $n_i$ and $n_j$ are the number of points belonging to classes $i$ and $j$, respectively, while
$S_i$ is the sum of the ranks of the class $i$ test points after ranking all the class $i$ and $j$
data points in increasing likelihood of belonging to class $i$. We further define the average
AUC for classes $i$ and $j$ as $\hat{A}(c_i, c_j) = 0.5\,(\hat{A}(c_i \mid c_j) + \hat{A}(c_j \mid c_i))$. The overall mAUC is then
8.9. Forecast Evaluation 145

$$\mathrm{mAUC} = \frac{2}{L(L-1)} \sum_{i=2}^{L} \sum_{j=1}^{i-1} \hat{A}(c_i, c_j) \qquad (8.2)$$

where $L$ is the number of classes. The class probabilities that go into the calculation of
$S_i$ in the first equation are $p_{CN}$, $p_{MCI}$ and $p_{AD}$, which are derived from the likelihoods of
each ADNI subject being assigned to each diagnostic class, by normalising to have unity
sum.
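The mAUC computation in equations 8.1 and 8.2 can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the official TADPOLE evaluation script: the function names and the input format (one dictionary of class probabilities per subject) are hypothetical, tied scores are given their mean rank, and every class is assumed to occur at least once in the test data.

```python
from itertools import combinations

def average_ranks(values):
    """1-based ranks in increasing order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pairwise_auc(probs, labels, ci, cj):
    """A(ci|cj) of equation 8.1: rank class ci and cj points by probability of ci."""
    subset = [(p[ci], y) for p, y in zip(probs, labels) if y in (ci, cj)]
    ranks = average_ranks([score for score, _ in subset])
    n_i = sum(1 for _, y in subset if y == ci)
    n_j = len(subset) - n_i
    s_i = sum(r for r, (_, y) in zip(ranks, subset) if y == ci)
    return (s_i - n_i * (n_i + 1) / 2) / (n_i * n_j)

def multiclass_auc(probs, labels, classes):
    """Equation 8.2: mean of the averaged pairwise AUCs over all class pairs."""
    pairs = list(combinations(classes, 2))  # L(L-1)/2 unordered pairs
    total = sum(
        0.5 * (pairwise_auc(probs, labels, a, b) + pairwise_auc(probs, labels, b, a))
        for a, b in pairs
    )
    return total / len(pairs)

# Hypothetical forecasts for six subjects.
labels = ["CN", "CN", "MCI", "MCI", "AD", "AD"]
probs = [
    {"CN": 0.80, "MCI": 0.15, "AD": 0.05},
    {"CN": 0.70, "MCI": 0.20, "AD": 0.10},
    {"CN": 0.20, "MCI": 0.60, "AD": 0.20},
    {"CN": 0.10, "MCI": 0.70, "AD": 0.20},
    {"CN": 0.05, "MCI": 0.25, "AD": 0.70},
    {"CN": 0.10, "MCI": 0.10, "AD": 0.80},
]
print(multiclass_auc(probs, labels, ["CN", "MCI", "AD"]))
```

Averaging over the unordered class pairs is equivalent to the factor 2/(L(L-1)) in equation 8.2, since there are L(L-1)/2 such pairs.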
where TP, FP, TN, FN represent the number of true positives, false positives, true
negatives and false negatives for classification as class i. In this case, true positives
are data points with true label i and correctly classified as such, while the false
negatives are the data points with true label i and incorrectly classified to a different
class j ≠ i. True negatives and false positives are defined similarly. The overall BCA is
given by the mean of all the balanced accuracies for every class.
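As an illustration of the quantities defined above, the sketch below computes a balanced accuracy per class and averages it over classes. The per-class formula used here, the mean of sensitivity TP/(TP+FN) and specificity TN/(TN+FP), is the standard definition of balanced accuracy and is an assumption, since the equation itself does not appear in this text. The function name and example labels are hypothetical.

```python
def balanced_class_accuracy(true_labels, predicted_labels, classes):
    """Mean over classes of 0.5 * (sensitivity + specificity), one-vs-rest."""
    per_class = []
    pairs = list(zip(true_labels, predicted_labels))
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = sum(1 for t, p in pairs if t != c and p != c)
        # Assumption: every class occurs at least once and is not the only
        # class present, so neither denominator is zero.
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        per_class.append(0.5 * (sensitivity + specificity))
    return sum(per_class) / len(per_class)

# Hypothetical true and predicted diagnoses for four subjects.
true_dx = ["CN", "CN", "MCI", "AD"]
pred_dx = ["CN", "MCI", "MCI", "AD"]
print(balanced_class_accuracy(true_dx, pred_dx, ["CN", "MCI", "AD"]))
```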

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \tilde{M}_i - M_i \right| \qquad (8.4)$$

where $N$ is the number of data points (forecasts) evaluated, $M_i$ is the actual biomarker
value in individual $i$ in future data, and $\tilde{M}_i$ is the participant’s best prediction for $M_i$.
where $\tilde{C}_i$ is the participant’s relative confidence in their $\tilde{M}_i$ estimate. We estimate $\tilde{C}_i$ as
the inverse of the width of the 50% confidence interval of their biomarker estimate:

$$\mathrm{CPA} = |\mathrm{ACP} - \mathrm{NCP}| \qquad (8.7)$$

where NCP is the nominal coverage probability, the target for the confidence intervals,
and ACP is the actual coverage probability, defined as the proportion of measurements
that fall within the corresponding confidence interval. In TADPOLE, we set NCP to
be 0.5, which means that ideally 50% of the measurements would fall inside the
confidence interval. The CPA can take values between 0 and 1, and lower scores are
better.
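The continuous-variable metrics above can be sketched as follows. The MAE and CPA functions follow equations 8.4 and 8.7 directly. The confidence-weighted error is an assumption: the text defines the weights as the inverse width of the 50% confidence interval, but the weighted formula itself does not appear here, so a weighted mean of absolute errors is used for illustration. Function names, example values and the treatment of measurements exactly on an interval bound (counted as inside) are likewise hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Equation 8.4: mean of |prediction - measurement| over all forecasts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def coverage_probability_accuracy(lower, upper, actual, nominal=0.5):
    """Equation 8.7: |ACP - NCP|, with ACP the fraction of measurements
    that fall inside their confidence interval (bounds counted as inside)."""
    inside = sum(1 for lo, hi, a in zip(lower, upper, actual) if lo <= a <= hi)
    return abs(inside / len(actual) - nominal)

def confidence_weighted_error(predicted, lower, upper, actual):
    """Assumed form: absolute errors weighted by the inverse interval width."""
    weights = [1.0 / (hi - lo) for lo, hi in zip(lower, upper)]
    weighted = sum(w * abs(p - a) for w, p, a in zip(weights, predicted, actual))
    return weighted / sum(weights)

# Hypothetical ADAS13 forecasts with 50% confidence intervals.
predicted = [10.0, 20.0, 30.0, 40.0]
lower = [9.0, 18.0, 29.0, 39.0]
upper = [11.0, 22.0, 31.0, 41.0]
actual = [12.0, 19.0, 30.0, 44.0]

print(mean_absolute_error(predicted, actual))
print(coverage_probability_accuracy(lower, upper, actual))
print(confidence_weighted_error(predicted, lower, upper, actual))
```

In this example two of the four measurements fall inside their intervals, so the actual coverage matches the nominal 50% and the CPA is zero.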
8.10 Prizes
We are extremely grateful to Alzheimer’s Research UK, The Alzheimer’s Society, and
The Alzheimer’s Association for sponsoring a prize fund of £30,000. At the time of first
submission, we proposed six separate prizes, as outlined in Table 8.4, but reserve the
right to reallocate the prize money depending on the numbers of participants eligible for
each prize. The first four are general categories (open to all challenge participants) and
constitute one prize for the best forecast of each feature as well as one for overall best
performance. The last two prizes are for two different student categories.
Prize amount   Outcome measure    Performance metric      Eligibility
£5,000         Clinical status    mAUC                    all
£5,000         ADAS13             MAE                     all
£5,000         Ventricle volume   MAE                     all
£5,000         Overall best       Lowest sum of ranks*    all
£5,000         Clinical status    mAUC                    University teams
£5,000         Clinical status    mAUC                    High-school teams
Table 8.4: Prize allocation scheme using funds from Alzheimer’s Research UK, The
Alzheimer’s Society and The Alzheimer’s Association. There are six prizes awarded for
different outcome measures, the last two of which are open only to university and
high-school teams. (*) The overall best team will be the team that obtains the lowest
sum of ranks in the clinical status, ADAS13 and ventricle volume categories.
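The "lowest sum of ranks" criterion for the overall prize can be sketched as below. The team names and scores are invented for illustration, the function names are hypothetical, and ties are not handled, since the text does not say how they would be resolved. Note that a higher mAUC is better, whereas a lower MAE is better.

```python
def rank_teams(scores, higher_is_better):
    """Map each team to its rank (1 = best) for one outcome measure."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {team: position for position, team in enumerate(ordered, start=1)}

def overall_sum_of_ranks(mauc, adas_mae, ventricle_mae):
    """Sum each team's ranks across the three categories."""
    rankings = [
        rank_teams(mauc, higher_is_better=True),            # clinical status
        rank_teams(adas_mae, higher_is_better=False),       # ADAS13
        rank_teams(ventricle_mae, higher_is_better=False),  # ventricle volume
    ]
    return {team: sum(r[team] for r in rankings) for team in mauc}

# Hypothetical results for three teams.
mauc = {"TeamA": 0.90, "TeamB": 0.85, "TeamC": 0.88}
adas_mae = {"TeamA": 5.0, "TeamB": 4.2, "TeamC": 4.5}
ventricle_mae = {"TeamA": 0.45, "TeamB": 0.40, "TeamC": 0.50}

totals = overall_sum_of_ranks(mauc, adas_mae, ventricle_mae)
winner = min(totals, key=totals.get)
print(totals, winner)
```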
8.11 Discussion
We have outlined the design of the TADPOLE Challenge, which aims to identify algo-
rithms and features that can best predict the evolution of Alzheimer’s disease. Challenge
participants use historical data from ADNI in order to predict three key outcomes: clini-
cal diagnosis, ADAS-Cog13 and ventricle volume. Determining which features and algo-
rithms best predict AD evolution can aid refinement of cohorts and endpoint assessment
for clinical trials, and can provide accurate prognostic information in clinical settings.
The TADPOLE Challenge was designed to be transparent and accessible. To this end,
all of our scripts are available in an open repository⁵. We also created a public forum⁶
where we answer participant questions. Finally, in order to enable participants to share
algorithm performance results throughout the competition, we created a leaderboard
system⁷ that evaluates submissions on an existing test dataset and publishes the results
live on our website.
Going forward, we hope that by November 2018 sufficient data will be available from
ADNI3 rollovers for a first meaningful evaluation of the forecasts. We plan to publish the
results on the website in January 2019, and then submit a publication of the results soon
after. However, we reserve the right to delay evaluation until sufficient data is available.
At that time, we will also evaluate the impact and interest of the first phase of TADPOLE
within the community, to guide decisions on whether to organise further submission and
evaluation phases.
The D4 test set could have different properties from the training set, which can affect
the performance of certain algorithms. For example, some algorithms could perform better
on different forecast time windows (short-term vs long-term) or on subjects with
different properties (e.g. those with more follow-up training data vs those with less
data). At the evaluation stage, we will thus consider performing the evaluation on
different splits of the test set, in order to understand what kind
⁵ TADPOLE repository: https://github.com/noxtoby/TADPOLE
⁶ TADPOLE forum: https://groups.google.com/forum/#!forum/tadpolechallenge
⁷ Leaderboard: https://tadpole.grand-challenge.org/leaderboard/
8.12 Conclusion
In this chapter I presented the TADPOLE Challenge, which aims to identify algorithms
and features that best predict the evolution of subjects at risk of Alzheimer’s disease.
The outcomes of the challenge will be made available early in 2019, after sufficient data
has been acquired. In the next chapter, I will present future work on the TADPOLE
Challenge, as well as on the other chapters of the thesis.
Chapter 9
Conclusions
9.1 Summary
In chapter 2, I first gave an overview of Alzheimer’s disease (section 2.1) by describing its
symptoms, disease causes and mechanisms, various risk factors involved, how it is cur-
rently diagnosed and the key biomarkers available to quantitatively measure Alzheimer’s
disease pathology. Afterwards, in section 2.2 I described the progression of AD biomarkers
and the Braak staging scheme. Finally, in section 2.3 I performed a literature review on
PCA, and described its symptoms, disease causes, diagnosis, management, neuroimaging
and heterogeneity. Throughout the section, I compared and contrasted PCA with
typical AD.
In chapter 3, I presented the state of the art in disease progression modelling. I
started with the hypothetical model by Jack et al. [1] (section 3.1), then presented early
models of progression based on symptomatic groups (section 3.2), then moved to
continuous models which regress against one biomarker (section 3.3) and survival analysis
models that compute time until an event such as clinical conversion occurs (section 3.4).
I then presented state of the art methods that combine multiple biomarker measurements
and generally compute latent time shifts and other hidden variables. I categorised them
into models based on scalar biomarker measurements (section 3.5), spatiotemporal mod-
els (section 3.6) which model changes both in brain structure and over time, as well as
mechanistic models (section 3.7) which can be used to infer underlying disease mecha-
nisms. Finally, I presented a summary of key machine learning methods that have been
frequently used in medical imaging, especially for diagnosis and prognosis (section 3.8).
In chapter 4, I presented a longitudinal comparison of Posterior Cortical Atrophy
with typical Alzheimer’s disease, analysing the progression of atrophy from MRI. I first
presented the demographics (section 4.3.1) of the cohort from the Dementia Research
Centre, UK that I analysed, using data obtained by my collaborators. I then described
the methodology I applied, which involved adaptations of the event-based model and
the differential equation model to this specific dataset (section 4.3.3). I showed that
there were differences in the progression of brain volumes in PCA as opposed to typical
AD, where phenotype-specific areas were affected early in the disease process (section
4.4.1). Moreover, I also showed that there were differences in atrophy progression in
three cognitively-defined PCA subtypes, highlighting the amount of heterogeneity within
PCA (section 4.4.2). Finally, in section 4.5 I discussed the findings of our study, the
strengths and limitations of our methods, and suggested directions for future research.
In chapter 5, I presented methodological advances in two disease progression models,
the event-based model and the differential equation model. In section 5.3.3, I presented
novel performance metrics that I designed, which enable us to compare the performance
of these novel methods against the standard implementations. In section 5.4, I showed
that the novel EBM methods perform better than the standard EBM, while the novel DEM
methods perform as well as the standard method on those datasets. This also
suggested that the novel metrics that we proposed are sensitive to these small changes in
the EBM and DEM methodologies.
In chapter 6 I presented Data-Driven Inference of Vertexwise Evolution (DIVE), a
novel spatiotemporal disease progression model of brain pathology in neurodegenerative
disorders. In section 6.2 I first reviewed existing literature and motivated the need for such
a model, due to the presence of dispersed atrophy patterns in AD caused by disruption
in underlying brain connectomes [38]. I then presented the mathematical formulation of
DIVE in section 6.3. I performed simulations to show that DIVE can reliably estimate
cluster assignments, trajectory parameters and subject time-shifts in the presence of
ground truth (section 6.3.8). Afterwards, I tested DIVE on four different datasets with
distinct diseases (typical AD and PCA) and modalities (MRI and PET), and showed
that it can recover meaningful patterns of pathology, which agree with previous findings
in the literature, but offer us more spatial resolution, along with estimates of biomarker
dynamics and subject-specific time shifts. Finally, in section 6.4.3 I validated DIVE
by showing that the estimated clusters and trajectories are robust under 10-fold cross-
validation, and that it has favourable predictive performance compared to simpler models.
In chapter 7 I presented Disease Knowledge Transfer (DKT), a novel model that
robustly learns patterns of progression from several types of dementia combined. This
allows the inference of biomarker signatures in rare, atypical types of dementia, which
is otherwise difficult due to the lack of multimodal, longitudinal data. In section 7.4, I
presented the DKT framework, which I designed to be flexible, allowing one to plug in any
disease progression model within each disease-agnostic and disease-specific unit. Using
simulations, I then showed in section 7.5.1 that DKT can accurately estimate biomarker
trajectories in two distinct diseases, and even when there is a lack of data for one of the
diseases, through correlations with other known markers. When applied to patient data
(section 7.5.2), I showed that DKT can estimate plausible biomarker trajectories, and
showed that it has favourable performance compared to standard models. Compared to
previous deep transfer learning approaches, DKT is also interpretable and can predict
the future evolution of subjects at risk of neurodegenerative diseases.
In chapter 8, I presented the design of the TADPOLE Challenge, which aims to
identify algorithms and features that can best predict the progression of subjects at risk
of AD. The challenge was organised jointly by myself and my collaborators, and we
had 33 international teams who made more than 90 submissions. For the challenge, I
helped write the website, assembled the main training dataset, built a live leaderboard
system that allowed instant evaluation of the predictions, and promoted the competition
at various conferences. I also wrote the paper describing the design of the challenge [236].
9.2. Future Research Directions 151
• Imaging predicting cognition: If we split the PCA population based on the discrep-
ancy between occipital-hippocampal values at baseline, does that predict distinct
patterns of cognitive impairment? One can hypothesise that relatively lower occip-
ital volumes for basic visual-PCA predict early visual deficits, with memory deficits
later on. On the other hand, relatively lower hippocampal volumes would predict
early multi-domain cognitive deficits, with visual deficits later on.
• Relationship between posterior and anterior patterns of atrophy: Does greater in-
ferior posterior atrophy predict greater inferior anterior atrophy, and vice-versa?
Moreover, based on the cognitively-defined subgroups, is atrophy in dorsolateral
prefrontal lobe different in the three cognitive subgroups, in the following manner:
(highest) space > object > vision (lowest)? Similarly, is inferior prefrontal atrophy
different between the three subgroups in the following manner: (highest) object >
space > vision (lowest)?
models on biomarkers other than MRI brain volumes, such as cortical thickness from
MRI, PET biomarkers (amyloid, tau, FDG), as well as DTI biomarkers (FA, MD, AD).
The multimodal biomarker trajectories estimated in PCA with the EBM, DEM and DIVE
models can also be compared with the ones inferred by DKT.
(GRN) and C9ORF72 genes. So far, an extension of the event-based model, which es-
timates multiple progression patterns in sub-populations, has been applied to FTD [29].
Applying spatiotemporal models such as DIVE or multi-disease models such as DKT
would help understand the heterogeneity and progression of FTD, find early biomarkers
and allow better stratification in FTD clinical trials. Moreover, the heterogeneity present
in FTD, combined with genetic information, can be used to further validate the DKT
model by checking how robustly it can transfer biomarker trajectories between different
FTD genetic groups.
actually have underlying mixed pathologies [282, 283], this analysis requires both in-vivo
longitudinal data along with post-mortem pathological confirmation. This has recently
become available in ADNI, which now has autopsy data for 56 AD and 52 age-matched
controls [284].
Such associations were not significant for simpler hippocampal volume or cortical amyloid
markers on their own [247]. Extending such work by adding other types of biomarker
data available in ADNI can identify further loci. Moreover, associations can also be found
between genes and various regions in the brain, and even with pathology identified at
voxelwise level from DIVE, using an approach similar to [285].
Longitudinal Neuroanatomical
Progression of Posterior Cortical
Atrophy
– Lateral Surface
∗ Frontal Pole (FRP)
∗ Superior Frontal Gyrus (SFG)
∗ Middle Frontal Gyrus (MFG)
∗ Opercular part of the Inferior Frontal Gyrus (OpIFG)
∗ Orbital part of the Inferior Frontal Gyrus (OrIFG)
∗ Triangular part of the Inferior Frontal Gyrus (TrIFG)
∗ Precentral Gyrus (PrG)
– Medial Surface
∗ Superior Frontal Gyrus, medial segment (MSFG)
∗ Supplementary Motor Cortex (SMC)
∗ Medial Frontal Cortex (MFC)
∗ Gyrus Rectus (GRe)
Figure A.1: Labels of the different areas analysed in the EBM progression snapshots from
chapter 4.
158 Appendix A. Longitudinal Neuroanatomical Progression of PCA
– Lateral Surface
∗ Temporal Pole (TMP)
∗ Superior Temporal Gyrus (STG)
∗ Middle Temporal Gyrus (MTG)
∗ Inferior Temporal Gyrus (ITG)
– Supratemporal Surface
∗ Planum Polare (PP)
∗ Transverse Temporal Gyrus (TTG)
∗ Planum Temporale (PT)
– Inferior Surface
∗ Fusiform Gyrus (FuG)
– Lateral Surface
∗ Postcentral Gyrus (PoG)
∗ Supramarginal Gyrus (SMG)
∗ Superior Parietal Lobule (SPL)
∗ Angular Gyrus (AnG)
– Medial Surface
∗ Postcentral Gyrus, medial segment (MPoG)
∗ Precuneus (PCu)
– Lateral Surface
– Cingulate Cortex
∗ Anterior cingulate gyrus (ACgG)
∗ Middle cingulate gyrus (MCgG)
∗ Posterior cingulate gyrus (PCgG)
– Medial Temporal Cortex
∗ Parahippocampal Gyrus (PHG)
∗ Entorhinal Area (Ent)
Figure A.2: Bootstrap samples of the atrophy sequence as estimated by the event-based
model, for the PCA and typical AD cohorts. The maximum likelihood sequences were
estimated using the EBM from 100 bootstrap datasets, with replacement, stratified by
diagnosis.
Figure A.3: Hypothesis testing of ordering of events within PCA (top) and typical AD
(bottom). We sampled 10,000 sequences from the EBM posterior using MCMC sampling
and kept only every 100th sample in order to reduce correlation between samples. We applied
the non-parametric paired Wilcoxon signed rank test for every pair of biomarkers (x,y).
The null hypothesis is defined as H0: event A (Y-axis) becomes abnormal at the same
time as event B (X-axis), while the alternative hypothesis is H1: event A (Y-axis) becomes
abnormal before event B (X-axis). The black squares show the pairs of biomarkers where
the null hypothesis was rejected at alpha=0.05/(N*(N-1)/2), thus surviving Bonferroni
correction.
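The thinning step and the corrected significance level described in this caption can be sketched as follows. This is a minimal illustration and not the code used for the thesis: the function names `thin_chain` and `bonferroni_alpha` are hypothetical, and the MCMC chain is replaced by placeholder integers.

```python
def thin_chain(samples, step=100):
    """Keep every `step`-th sample of an MCMC chain to reduce autocorrelation."""
    return samples[step - 1::step]


def bonferroni_alpha(n_biomarkers, alpha=0.05):
    """Significance level corrected for all unordered pairs of biomarkers."""
    n_pairs = n_biomarkers * (n_biomarkers - 1) // 2
    return alpha / n_pairs


# Placeholder chain: 10,000 dummy "sequences" (plain integers stand in for them).
chain = list(range(10_000))
kept = thin_chain(chain, step=100)   # 100 samples remain after thinning
```

The thinned samples would then be passed to the paired test for each pair of biomarkers, with `bonferroni_alpha(N)` as the rejection threshold.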
Figure A.4: Positional variance diagram estimated by the event-based model, for the
three PCA subgroups: Basic visual impairment group, Space perception impairment and
Object perception impairment.
Figure A.5: Bootstrap samples of the atrophy sequence as estimated by the event-based
model, for the three PCA subgroups: Basic visual impairment group, Space perception
impairment and Object perception impairment.
Figure A.6: Hypothesis testing of the ordering of events within the three PCA subgroups.
Hypothesis tests were designed as in Figure A.3.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 1.74e-04* - - - - - - -
Hippo. 1.20e-02 4.95e-02 - - - - - -
Entorhinal 1.61e-12* 1.27e-06* 5.29e-10* - - - - -
Occipital 7.93e-03 4.16e-06* 1.20e-04* 9.44e-12* - - - -
Temporal 2.66e-01 1.17e-02 3.12e-01 5.90e-10* 1.81e-03 - - -
Frontal 9.58e-01 1.57e-04* 1.07e-02 1.52e-12* 8.68e-03 2.49e-01 - -
Parietal 3.45e-04* 1.31e-08* 2.52e-07* 3.17e-15* 8.84e-01 7.91e-05* 4.08e-04* -
Table A.1: Statistical testing for significant differences in volumes of different brain re-
gions of PCA subjects at -10 years before reference t0 . Shown here are p-values from
two-tailed t-tests. (*) Statistically significant differences at significance level = 1.78e-3,
Bonferroni corrected for all 28 comparisons.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 1.52e-16* - - - - - - -
Hippo. 6.03e-13* 8.95e-06* - - - - - -
Entorhinal 4.78e-14* 5.66e-01 9.60e-04* - - - - -
Occipital 1.32e-06* 3.17e-17* 1.45e-14* 5.25e-16* - - - -
Temporal 3.57e-01 1.75e-16* 5.22e-13* 2.90e-14* 1.66e-05* - - -
Frontal 7.31e-12* 1.67e-04* 7.72e-01 4.38e-03 1.62e-14* 3.50e-12* - -
Parietal 1.53e-07* 1.41e-21* 3.30e-19* 2.68e-19* 2.20e-01 8.33e-06* 3.39e-18* -
Table A.2: Statistical testing for significant differences in volumes of different brain re-
gions of PCA subjects at t0 . See Supp. Table A.1 for information on statistical testing.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 5.97e-01 - - - - - - -
Hippo. 7.63e-13* 4.14e-13* - - - - - -
Entorhinal 5.88e-11* 2.34e-11* 1.23e-03* - - - - -
Occipital 4.04e-02 1.44e-01 3.00e-17* 1.06e-15* - - - -
Temporal 2.83e-03 1.22e-02 1.51e-15* 2.66e-14* 1.54e-01 - - -
Frontal 8.90e-15* 5.73e-15* 6.77e-02 7.35e-07* 2.19e-19* 2.99e-17* - -
Parietal 7.38e-02 2.07e-01 1.25e-14* 4.00e-13* 9.44e-01 1.73e-01 1.91e-16* -
Table A.3: Statistical testing for significant differences in volumes of different brain re-
gions of PCA subjects at 10 years after t0 . See Supp. Table A.1 for information on
statistical testing.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 9.26e-03 - - - - - - -
Hippo. 2.04e-10* 2.88e-14* - - - - - -
Entorhinal 2.21e-02 3.40e-01 6.82e-09* - - - - -
Occipital 3.38e-03 2.01e-06* 3.84e-04* 9.98e-04* - - - -
Temporal 3.93e-01 1.04e-01 4.72e-11* 7.64e-02 7.51e-04* - - -
Frontal 8.30e-01 1.04e-02 4.79e-09* 2.41e-02 1.15e-02 3.26e-01 - -
Parietal 4.94e-03 2.13e-06* 3.75e-05* 4.63e-04* 7.35e-01 8.64e-04* 1.57e-02 -
Table A.4: Statistical testing for significant differences in volumes of different brain re-
gions of tAD subjects at -10 years before t0 . See Supp. Table A.1 for information on
statistical testing.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 3.50e-11* - - - - - - -
Hippo. 4.12e-19* 3.61e-25* - - - - - -
Entorhinal 7.83e-02 2.64e-10* 2.93e-13* - - - - -
Occipital 7.65e-02 2.51e-08* 7.95e-10* 7.94e-01 - - - -
Temporal 6.29e-03 4.63e-13* 1.84e-14* 5.12e-01 8.07e-01 - - -
Frontal 2.01e-04* 2.65e-04* 4.16e-20* 2.84e-05* 1.98e-04* 3.31e-07* - -
Parietal 3.56e-03 6.00e-11* 1.94e-10* 2.45e-01 4.77e-01 4.81e-01 2.12e-06* -
Table A.5: Statistical testing for significant differences in volumes of different brain re-
gions of tAD subjects at t0 . See Supp. Table A.1 for information on statistical testing.
Region Whole Br. Ventricles Hippo. Entorh. Occipital Temporal Frontal Parietal
Whole Brain - - - - - - - -
Ventricles 2.92e-02 - - - - - - -
Hippo. 2.83e-01 1.67e-03* - - - - - -
Entorhinal 8.13e-03 6.50e-01 2.63e-04* - - - - -
Occipital 8.40e-02 9.92e-01 1.13e-02 7.28e-01 - - - -
Temporal 2.76e-12* 1.46e-14* 5.41e-11* 6.35e-15* 2.54e-10* - - -
Frontal 1.24e-09* 1.91e-06* 3.87e-11* 4.81e-06* 1.27e-04* 2.43e-19* - -
Parietal 7.92e-01 7.53e-02 1.98e-01 2.51e-02 1.57e-01 5.36e-11* 3.13e-08* -
Table A.6: Statistical testing for significant differences in volumes of different brain re-
gions of tAD subjects at 10 years after t0 . See Supp. Table A.1 for information on
statistical testing.
Table A.7: Statistical testing for significant differences in volumes of different brain re-
gions between PCA and tAD at -10, 0 and 10 years from t0 . Shown here are p-values from
two-tailed t-tests. (*) Statistically significant differences at significance level = 2.08e-3,
Bonferroni corrected for all 24 comparisons.
Figure A.7: Testing for statistically significant differences in positions of each biomarker
in the EBM abnormality sequences, for both PCA and typical AD. (*) Statistically sig-
nificant differences in position of a biomarker in the EBM sequences for PCA and tAD at
99% confidence, Bonferroni corrected for multiple comparisons (significance level = 5e-5).
A non-parametric Mann-Whitney U test has been applied because of non-gaussianity of
the data, which represents discrete ranks in a sequence. Most biomarkers show significant
differences – it is likely that there are differences in atrophy progression between PCA
and tAD.
Figure A.8: Testing for statistically significant differences in biomarker positions in the
EBM sequences of PCA subgroups for (a) Object vs Visual (b) Space vs Object and (c)
Visual vs Space subgroups. Only the first 10 biomarkers from the EBM sequence of one
disease (A – visual B – object C – space) are shown. The images also show only the
first 20 positions on the x-axis to aid visualisation. (*) Statistically significant differences
in biomarker positions between pairs of PCA subgroups at 99% confidence, Bonferroni
corrected for multiple comparisons (significance level = 5e-5). A non-parametric Mann-
Whitney U test has again been applied because of data non-gaussianity. All biomarkers
show significant differences – it is likely that there are differences in the progression of
atrophy between PCA subgroups.
Appendix B
DIVE: A Spatiotemporal
Progression Model of Brain
Pathology in Neurodegenerative
Disorders
172 Appendix B. DIVE: A Spatiotemporal Progression Model of Brain Pathology
Figure B.1: Error in DPS scores (A) and trajectory estimation (B) for Scenario 2 in
simulation experiments. (C-D) The same error scores for Scenario 3. We notice that
as the problem becomes more difficult, the errors in the DIVE estimated parameters
increase. Errors were measured as sum of squared differences (SSD) between the true
parameters and estimated parameters. For the trajectories, the SSD was calculated only
based on the sigmoid centres, due to different scaling of the other sigmoidal parameters.
• ROI-based model: groups vertices according to an a-priori defined ROI atlas. This
model is equivalent to the model by Jedynak et al., Neuroimage, 2012 and is a
special case of our model, where the latent variables zlk are fixed instead of being
marginalised as in equation 6.
• No-staging model: This is a model that doesn’t perform any time-shift of patients
along the disease progression timeline. It fixes αi = 1, βi = 0 for every subject,
which means that the disease progression score of every subject is age.
We performed this comparison using 10-fold cross-validation. For each subject in the
test set, we computed their DPS score and correlated all the DPS values with the same
four cognitive tests used previously. We also tested how well the models can predict
the future vertex-wise measurements as follows: for every subject i in the test set, we
used their first two scans to estimate αi and βi, and then used the rest of the
scans to compute the prediction error. For one vertex location on the cortical surface,
the prediction error was computed as the root mean squared error (RMSE) between its
predicted measure and the actual measure. This was then averaged across all subjects
and visits.
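The prediction experiment can be sketched for a single vertex as follows. The sketch assumes a four-parameter sigmoid for f(s|θ) with illustrative parameter names (`lower`, `upper`, `slope`, `centre`); the parameterisation actually used by DIVE is the one defined in chapter 6 and may differ. The subject parameters αi and βi are taken as already estimated from the first two scans, and all numbers are made up.

```python
import math


def sigmoid(s, lower, upper, slope, centre):
    """Assumed sigmoidal trajectory f(s | theta); parameter names are illustrative."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (s - centre)))


def disease_progression_score(age, alpha, beta):
    """DPS of one visit: linear transform of age with subject-specific alpha, beta."""
    return alpha * age + beta


def rmse(predicted, observed):
    """Root mean squared error between two equally long sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical subject: alpha, beta come from the first two scans,
# the remaining visits are held out for evaluation.
theta = dict(lower=0.0, upper=1.0, slope=0.5, centre=0.0)
alpha_i, beta_i = 1.2, -85.0
held_out_ages = [72.0, 73.0, 74.0]
held_out_values = [0.70, 0.80, 0.90]

predictions = [sigmoid(disease_progression_score(t, alpha_i, beta_i), **theta)
               for t in held_out_ages]
error = rmse(predictions, held_out_values)
```

With `alpha=1.0` and `beta=0.0` the score reduces to the subject's age, which is the setting of the no-staging model.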
B.2.3 Results
Table B.1 shows the results of the model comparison on the ADNI MRI dataset. Each
row represents one model tested, while each column represents a different performance
measure: correlations with four different cognitive tests and accuracy in the prediction of
future vertexwise measurements. In each entry, we give the mean and standard deviation
of the correlation coefficients or RMSE across the 10 cross-validation folds.
Model CDRSOB (ρ) ADAS13 (ρ) MMSE (ρ) RAVLT (ρ) Prediction (RMSE)
DIVE 0.37 +/- 0.09 0.37 +/- 0.10 0.36 +/- 0.11 0.32 +/- 0.12 1.021 +/- 0.008
ROI-based model 0.36 +/- 0.10 0.35 +/- 0.11 0.34 +/- 0.13 0.30 +/- 0.13 1.019 +/- 0.010
No-staging model *0.09 +/- 0.06 *0.03 +/- 0.09 *0.05 +/- 0.06 *0.02 +/- 0.06 *1.062 +/- 0.024
Table B.1: Comparison of DIVE with two more simplistic models on the ADNI MRI
dataset. For each of the three models, we show the correlation of the disease progression
scores (DPS) with respect to several cognitive tests: CDRSOB, ADAS13, MMSE and
RAVLT. The correlation numbers represent the mean correlation across the 10 cross-
validation folds.
$$M^{(u)} = \arg\max_{M} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right) \left[\log p(V, Z \mid M)\right] + \log p(M) \tag{B.1}$$
The E-step involves computing $p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right)$, while the M-step
consists of solving the above equation.
B.3.1 E-step
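In this step we need to estimate $p(Z \mid V, M^{(u-1)})$. For notational simplicity we will drop
the $(u-1)$ superscript from $M$:
$$p(Z \mid V, M) = p(V, Z \mid M) = \frac{1}{C} \prod_{l}^{L} \prod_{(i,j) \in I} \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_{Z_l}), \sigma_{Z_l}\right) \prod_{l_2 \in N_l} \Psi(Z_l, Z_{l_2}) \tag{B.2}$$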
where $N_l$ is the set of neighbours of vertex $l$. However, this doesn’t directly factorise
over the vertices $l$ due to the MRF terms $\Psi(Z_l, Z_{l_2})$. It is however necessary to find
a form that factorises over the vertices, otherwise we won’t be able to represent in
memory the joint distribution over all $Z$ variables. If we make the approximation
$p(Z \mid V, M) \approx \prod_{l}^{L} p(V_l \mid Z_l, M)$ then we lose all the MRF terms and the model won’t
account for spatial correlation. We instead do a first-degree approximation by conditioning
on the values of $Z_{N_l}^{(u-1)}$, the labels of nearby vertices from the previous iteration. The
approximation is thus:
$$p(Z \mid V, M) \approx \prod_{l}^{L} E_{Z_{N_l}^{(u-1)} \mid V_l, M}\left[ p\left(Z_l \mid V_l, M, Z_{N_l}^{(u-1)}\right) \right] \tag{B.3}$$
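This form allows us to factorise over all the vertices to get $p(Z_l \mid V_l, M)$:
$$p(Z_l \mid V_l, M) \approx \frac{1}{C} \sum_{Z_{N_l}^{(u-1)}} p(V_l \mid Z_l, M)\, p\left(Z_l \mid Z_{N_l}^{(u-1)}\right) p\left(Z_{N_l}^{(u-1)} \mid V_l, M\right) \tag{B.4}$$
where $C$ is a normalisation constant that can be dropped. We can now further factorise
$p\left(Z_l \mid Z_{N_l}^{(u-1)}\right) \approx \prod_{m \in \{1,\dots,|N_l|\}} p\left(Z_l \mid M, Z_{N_l(m)}^{(u-1)} = z_{N_l(m)}\right)$ and apply a similar factorisation to
the prior $p\left(Z_{N_l}^{(u-1)} \mid V_l, M\right)$, resulting in:
$$p(Z_l \mid V_l, M) \approx \frac{1}{C}\, p(V_l \mid Z_l, M) \sum_{z_{N_l(1)},\dots,z_{N_l(|N_l|)}} \prod_{m \in \{1,\dots,|N_l|\}} p\left(Z_l \mid Z_{N_l(m)}^{(u-1)} = z_{N_l(m)}\right) p\left(Z_{N_l(m)}^{(u-1)} = z_{N_l(m)} \mid V_l, M\right) \tag{B.5}$$
$$p(Z_l \mid V_l, M) = p(V_l \mid Z_l, M) \prod_{l_2 \in N_l} \sum_{z_{l_2}} p\left(Z_l \mid Z_{l_2}^{(u-1)} = z_{l_2}\right) p\left(Z_{l_2}^{(u-1)} = z_{l_2} \mid V_l, M\right) \tag{B.6}$$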
B.3. Derivation of the Generalised EM Algorithm 175
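$$p(Z_l \mid V_l, M) = p(V_l \mid Z_l, M) \prod_{l_2 \in N_l} \sum_{k_2} p\left(Z_l \mid Z_{l_2}^{(u-1)} = k_2\right) p\left(Z_{l_2}^{(u-1)} = k_2 \mid V_l, M\right) \tag{B.7}$$
We shall also denote $z_{lk} = p(Z_l = k \mid V_l, M)$. Further simplifications result in:
$$z_{lk}^{(u)} \propto \left[ \prod_{(i,j) \in I} \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_k), \sigma_k\right) \right] \left[ \prod_{l_2 \in N_l} \sum_{k_2=1}^{K} z_{l_2 k_2}^{(u-1)} \Psi(Z_l = k, Z_{l_2} = k_2) \right] \tag{B.8}$$
$$\log z_{lk}^{(u)} \propto \sum_{(i,j) \in I} \left[ -\frac{\log (2\pi\sigma_k^2)}{2} - \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] + \sum_{l_2 \in N_l} \log \left[ \sum_{k_2=1}^{K} z_{l_2 k_2}^{(u-1)} \left(\delta_{k_2 k} \exp(\lambda) + (1 - \delta_{k_2 k}) \exp(-\lambda^2)\right) \right] \tag{B.9}$$
$$\log z_{lk}^{(u)} \propto D_{lk} + \sum_{l_2 \in N_l} \log \sum_{k_2=1}^{K} z_{l_2 k_2}^{(u-1)} \left(\delta_{k_2 k} \left(\exp(\lambda) - \exp(-\lambda^2)\right) + \exp(-\lambda^2)\right) \tag{B.11}$$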
Finally, we simplify the sum over $k_2$ to get the update equation for $z_{lk}$:
$$\log z_{lk}^{(u)} \propto D_{lk} + \sum_{l_2 \in N_l} \log \left[ \exp(-\lambda^2) + z_{l_2 k}^{(u-1)} \left(\exp(\lambda) - \exp(-\lambda^2)\right) \right] \tag{B.12}$$
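The update in Eq. B.12 can be sketched for a single vertex as follows. This is an illustrative sketch rather than the DIVE implementation: the function name and argument layout are hypothetical, and the data terms D_lk are taken as precomputed inputs.

```python
import math


def log_z_unnormalised(D_l, z_prev_neighbours, lam):
    """Unnormalised log z_lk for one vertex l and every cluster k (Eq. B.12).

    D_l: list of K data terms D_lk for this vertex.
    z_prev_neighbours: one list per neighbour l2, holding z_{l2 k} from iteration u-1.
    lam: MRF parameter lambda.
    """
    same = math.exp(lam)          # clique potential when the two labels match
    diff = math.exp(-lam ** 2)    # clique potential when the two labels differ
    out = []
    for k, d in enumerate(D_l):
        mrf = sum(math.log(diff + z_n[k] * (same - diff))
                  for z_n in z_prev_neighbours)
        out.append(d + mrf)
    return out
```

With `lam = 0` both potentials equal one, the MRF term vanishes and the update reduces to the data term alone; larger values of `lam` pull a vertex towards the clusters favoured by its neighbours.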
In practice, we cannot naively compute the exponential term $z_{lk} = \exp(\log(z_{lk}))$ due
to precision loss. However, we get around this by computing the exponentiation and
normalisation of $z_{lk}$ simultaneously. Denoting $x(k) = \log z_{lk}$, for $k \in [1 \dots K]$, we get:
$$z_{lk} = \frac{e^{x(k)}}{e^{x(1)} + e^{x(2)} + \dots + e^{x(K)}} = \frac{1}{e^{x(1)-x(k)} + e^{x(2)-x(k)} + \dots + e^{x(K)-x(k)}} \tag{B.13}$$
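A minimal sketch of this joint exponentiation and normalisation is given below. Note that it uses a closely related and widely used variant, subtracting the largest x(k) before exponentiating, rather than the exact form of Eq. B.13; the function name is hypothetical.

```python
import math


def normalise_log_weights(log_z):
    """Return z_k = exp(x_k) / sum_j exp(x_j) without overflow or total underflow."""
    x_max = max(log_z)
    shifted = [math.exp(x - x_max) for x in log_z]  # every exponent is <= 0
    total = sum(shifted)                            # >= 1, so the division is safe
    return [s / total for s in shifted]


# Log-weights this negative would all underflow to 0.0 if exponentiated directly.
z = normalise_log_weights([-1000.0, -1001.0, -1005.0])
```

The ratios between the returned weights are preserved exactly as in Eq. B.13, since the shift cancels between numerator and denominator.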
B.3.2 M-step
The M-step itself does not have a closed-form analytical solution. We choose to solve it
by successive refinements of the cluster trajectory parameters and the subject time shifts.
Taking equation B.1 and fixing the subject time-shifts α, β and measurement noise
σ, we can find its maximum with respect to θ only. More precisely, we want:
$$\theta = \arg\max_{\theta} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right) \left[\log p(V, Z \mid M)\right] + \log p(\theta) \tag{B.14}$$
We observe that for each cluster the individual $\theta_k$’s are conditionally independent,
i.e. $\theta_k \perp\!\!\!\perp \theta_m \mid \{Z, \alpha, \beta, \sigma\}\ \forall k, m$. We also assume that the prior factorises for each $\theta_k$:
$\log p(\theta) = \sum_{k}^{K} \log p(\theta_k)$. This allows us to optimise each $\theta_k$ independently:
$$\theta_k = \arg\max_{\theta_k} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right) \left[\log p(V, Z \mid M)\right] + \log p(\theta_k) \tag{B.15}$$
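$$\theta_k = \arg\max_{\theta_k} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right) \left[ \log \prod_{l=1}^{L} \prod_{(i,j) \in I} \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_{z_l}), \sigma_{z_l}\right) \right] + \log p(\theta_k) \tag{B.16}$$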
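Note that we didn’t include the MRF clique terms, since they are not a function of
$\theta_k$. We propagate the logarithm inside the products:
$$\theta_k = \arg\max_{\theta_k} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}\right) \sum_{l=1}^{L} \sum_{(i,j) \in I} \log \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_{z_l}), \sigma_{z_l}\right) + \log p(\theta_k) \tag{B.17}$$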
We next assume that $Z_l$, the hidden cluster assignment for vertex $l$, is conditionally
independent of the other vertex assignments $Z_m$, $\forall m \neq l$ (see the E-step approximation
from Eq. B.3). This independence assumption induces the following factorisation:
$p\left(Z = (z_1,\dots,z_L) \mid V, M^{(u-1)}\right) = \prod_{l}^{L} p\left(Z_l = z_l \mid V, M^{(u-1)}\right)$. Propagating this product inside
the sum over the vertices, we get:
$$\theta_k = \arg\max_{\theta_k} \sum_{l=1}^{L} \sum_{z_l=1}^{K} p\left(Z_l = z_l \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \log \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_{z_l}), \sigma_{z_l}\right) + \log p(\theta_k) \tag{B.18}$$
The terms which don’t contain $\theta_k$ disappear:
$$\theta_k = \arg\max_{\theta_k} \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \log \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_k), \sigma_k\right) + \log p(\theta_k) \tag{B.19}$$
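We further expand the Gaussian noise model:
$$\theta_k = \arg\max_{\theta_k} \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ \log (2\pi\sigma_k^2)^{-1/2} - \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] + \log p(\theta_k) \tag{B.20}$$
Constants disappear due to the arg max and we get the final update equation for $\theta_k$:
$$\theta_k = \arg\max_{\theta_k} \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ -\frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] + \log p(\theta_k) \tag{B.21}$$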
Measurement noise - σ
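$$\sigma_k = \arg\max_{\sigma_k} \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \log \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_k), \sigma_k\right) \tag{B.22}$$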
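Note that, just as for $\theta$ above, the MRF clique terms were not included because they
are not a function of $\sigma_k$. Expanding the noise model we get:
$$\sigma_k = \arg\max_{\sigma_k} \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ \log (2\pi\sigma_k^2)^{-1/2} - \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] \tag{B.23}$$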
The maximum of a function $l(\sigma_k)$ can be found by taking the derivative of $l$
and setting it to zero. This assumes that $l$ is differentiable, which holds here
although we do not prove it. This gives:
$$\frac{\delta l(\sigma_k \mid .)}{\delta \sigma_k} = \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \frac{\delta}{\delta \sigma_k} \left[ \log (2\pi\sigma_k^2)^{-1/2} - \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] \tag{B.24}$$
Propagating the differential operator further inside the sums we get:
$$\frac{\delta l(\sigma_k \mid .)}{\delta \sigma_k} = \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ -\frac{\delta}{\delta \sigma_k} \frac{\log \sigma_k^2}{2} - \frac{\delta}{\delta \sigma_k} \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] \tag{B.25}$$
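We next perform several small manipulations to reach a more suitable form of the
derivative and then set it to be equal to zero:
$$\frac{\delta l(\sigma_k \mid .)}{\delta \sigma_k} = \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ -\frac{1}{\sigma_k} - \frac{-2}{2\sigma_k^3}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] \tag{B.26}$$
$$\frac{\delta l(\sigma_k \mid .)}{\delta \sigma_k} = \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ -\frac{\sigma_k^2}{\sigma_k^3} + \frac{1}{\sigma_k^3}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] \tag{B.27}$$
$$\frac{\delta l(\sigma_k \mid .)}{\delta \sigma_k} = \sum_{l=1}^{L} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i,j) \in I} \left[ -\sigma_k^2 + \left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] = 0 \tag{B.28}$$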
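Solving Eq. B.28 for $\sigma_k$ gives a responsibility-weighted mean of the squared residuals, $\sigma_k^2 = \sum_l z_{lk} \sum_{(i,j)} r_{l,ij}^2 \,/\, (|I| \sum_l z_{lk})$, where $r_{l,ij}$ is the residual $V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)$. The sketch below computes it under the assumption that every vertex has the same number of measurements; the function name and data layout are hypothetical.

```python
import math


def update_sigma(z_k, residuals):
    """Noise level sigma_k that zeroes the derivative in Eq. B.28.

    z_k: list of L responsibilities p(Z_l = k | V, M^(u-1)).
    residuals: list of L lists; residuals[l][n] is V_l^{ij} - f(...) for the
               n-th (subject, visit) pair.
    """
    weighted_sq = sum(z * sum(r * r for r in res)
                      for z, res in zip(z_k, residuals))
    n_obs = len(residuals[0])  # number of (i, j) pairs, assumed equal for all vertices
    return math.sqrt(weighted_sq / (n_obs * sum(z_k)))
```

Vertices with zero responsibility for cluster k do not influence the estimate, however large their residuals are.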
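$$\alpha_i, \beta_i = \arg\max_{\alpha_i, \beta_i} \sum_{l=1}^{L} \sum_{k=1}^{K} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{(i',j) \in I} \log \mathcal{N}\left(V_l^{i'j} \mid f(\alpha_{i'} t_{i'j} + \beta_{i'} \mid \theta_k), \sigma_k\right) + \log p(\alpha_i, \beta_i) \tag{B.30}$$
$$\alpha_i, \beta_i = \arg\max_{\alpha_i, \beta_i} \sum_{l=1}^{L} \sum_{k=1}^{K} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{j \in I_i} \log \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_k), \sigma_k\right) + \log p(\alpha_i, \beta_i) \tag{B.31}$$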
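Expanding the Gaussian noise model we get:
$$\alpha_i, \beta_i = \arg\max_{\alpha_i, \beta_i} \sum_{l=1}^{L} \sum_{k=1}^{K} p\left(Z_l = k \mid V, M^{(u-1)}\right) \sum_{j \in I_i} \left[ \log (2\pi\sigma_k^2)^{-1/2} - \frac{1}{2\sigma_k^2}\left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] + \log p(\alpha_i, \beta_i) \tag{B.32}$$
After removing constant terms we end up with the final update equation for $\alpha_i, \beta_i$:
$$\alpha_i, \beta_i = \arg\min_{\alpha_i, \beta_i} \left[ \sum_{l=1}^{L} \sum_{k=1}^{K} p\left(Z_l = k \mid V, M^{(u-1)}\right) \frac{1}{2\sigma_k^2} \sum_{j \in I_i} \left(V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k)\right)^2 \right] - \log p(\alpha_i, \beta_i) \tag{B.33}$$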
$$\lambda^{(u)} = \arg\max_{\lambda} E_{p(Z \mid V, M^{(u-1)}, \lambda, Z^{(u-1)})} \left[\log p(V, Z \mid M^{(u-1)})\right] \tag{B.34}$$
Note that p(Z|V, M (u−1) , λ, Z (u−1) ) is a function of λ, so for each lambda we estimate
zlk through approximate inference. We do this because otherwise the optimisation of λ
will only take into account the clique terms and completely exclude the data terms. We
further simplify the objective function for lambda as follows:
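$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}, \lambda, Z^{(u-1)}\right) \log \left[ \prod_{l=1}^{L} \prod_{(i,j) \in I} \mathcal{N}\left(V_l^{ij} \mid f(\alpha_i t_{ij} + \beta_i \mid \theta_{z_l}), \sigma_{z_l}\right) \prod_{l=1}^{L} \prod_{l_2 \in N_l} \Psi(z_l, z_{l_2}) \right] \tag{B.35}$$
$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{z_1,\dots,z_L}^{K} p\left(Z=(z_1,\dots,z_L) \mid V, M^{(u-1)}, \lambda, Z^{(u-1)}\right) \left[ \sum_{l=1}^{L} \sum_{(i,j) \in I} \log \mathcal{N}(V_l^{ij} \mid ..) + \sum_{l=1}^{L} \sum_{l_2 \in N_l} \log \Psi(z_l, z_{l_2}) \right] \tag{B.36}$$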
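Let us denote $z_{lk} = p\left(Z_l = k \mid V, M^{(u-1)}, \lambda, Z^{(u-1)}\right)$. Assuming independence between
the latent variables $Z_l$ we get:
$$\lambda = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} z_{lk} \sum_{(i,j) \in I} \log \mathcal{N}(V_l^{ij} \mid ..) + \sum_{l=1}^{L} \sum_{k=1}^{K} \sum_{l_2 \in N_l} \sum_{k_2=1}^{K} z_{lk}\, z_{l_2 k_2} \log \Psi(Z_l = k, Z_{l_2} = k_2) \tag{B.37}$$
$$\zeta_{lk}(\lambda) = \exp\left( D_{lk} + \sum_{l_2 \in N_l} \log \left[ \exp(-\lambda^2) + z_{l_2 k}^{(u-1)} \left(\exp(\lambda) - \exp(-\lambda^2)\right) \right] \right) \tag{B.38}$$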
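where $D_{lk}$ is as defined in Eq. B.10. We then replace $z_{lk}$ with $\zeta_{lk}(\lambda)$ and introduce the
chosen MRF clique model to get:
$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} \zeta_{lk}(\lambda) D_{lk} + \sum_{l=1}^{L} \sum_{k}^{K} \sum_{l_2 \in N_l} \sum_{k_2=1}^{K} \zeta_{lk}(\lambda)\, \zeta_{l_2 k_2}(\lambda) \left[ \delta_{k k_2} \lambda + (1 - \delta_{k k_2})(-\lambda^2) \right] \tag{B.39}$$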
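We separate the cliques that have matching clusters from the ones that don’t:
$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} \zeta_{lk}(\lambda) D_{lk} + \sum_{l=1}^{L} \sum_{l_2 \in N_l} \sum_{k}^{K} \left[ \zeta_{lk}(\lambda)\, \zeta_{l_2 k}(\lambda)\, \lambda + \sum_{k_2 \neq k} \zeta_{lk}(\lambda)\, \zeta_{l_2 k_2}(\lambda) (-\lambda^2) \right] \tag{B.40}$$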
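We also factorise the clique terms:
$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} \zeta_{lk}(\lambda) D_{lk} + \lambda \sum_{l=1}^{L} \sum_{l_2 \in N_l} \sum_{k}^{K} \zeta_{lk}(\lambda)\, \zeta_{l_2 k}(\lambda) + (-\lambda^2) \sum_{l=1}^{L} \sum_{l_2 \in N_l} \sum_{k}^{K} \zeta_{lk}(\lambda) \left(1 - \zeta_{l_2 k}(\lambda)\right) \tag{B.41}$$
$$\lambda^{(u)} = \arg\max_{\lambda} \sum_{l=1}^{L} \sum_{k=1}^{K} \zeta_{lk}(\lambda) \left[ D_{lk} + \lambda \sum_{l_2 \in N_l} \zeta_{l_2 k}(\lambda) - \lambda^2 \sum_{l_2 \in N_l} \left(1 - \zeta_{l_2 k}(\lambda)\right) \right] \tag{B.42}$$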
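where $\langle V^{ij} \rangle_{\hat{Z}_k}$ is the mean value of the vertices belonging to cluster $k$. Mathematically,
we define $\hat{Z}_k = [z_{1k}\gamma_k, z_{2k}\gamma_k, \dots, z_{Lk}\gamma_k]$ where $\gamma_k = \left(\sum_{l=1}^{L} z_{lk}\right)^{-1}$ is the
normalisation constant. Moreover, we have that $\langle V^{ij} \rangle_{\hat{Z}_k} = \sum_{l=1}^{L} z_{lk} \gamma_k V_l^{ij}$. We take the
derivative of the likelihood function $l_{fast}$ of the fast implementation (Eq. B.43) with
respect to $\theta_k$ and perform several simplifications:
$$\frac{\delta l_{fast}(\theta_k \mid .)}{\delta \theta_k} = \frac{\delta}{\delta \theta_k} \sum_{(i,j) \in I} \left( \sum_{l=1}^{L} z_{lk} \gamma_k V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right)^2 \tag{B.44}$$
$$\frac{\delta l_{fast}(\theta_k \mid .)}{\delta \theta_k} = \sum_{(i,j) \in I} 2 \left( \gamma_k \sum_{l=1}^{L} z_{lk} V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \left( \frac{-\delta f(.)}{\delta \theta_k} \right) \tag{B.45}$$
$$\frac{\delta l_{fast}(\theta_k \mid .)}{\delta \theta_k} = 2 \sum_{(i,j) \in I} \left( \gamma_k \sum_{l=1}^{L} z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \right) \frac{-\delta f(.)}{\delta \theta_k} \tag{B.46}$$
$$\frac{\delta l_{fast}(\theta_k \mid .)}{\delta \theta_k} = 2 \gamma_k \sum_{(i,j) \in I} \frac{-\delta f(.)}{\delta \theta_k} \sum_{l=1}^{L} z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \tag{B.47}$$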
By setting the derivative to zero, the optimal $\theta$ is thus a solution of the following
equation:
$$\sum_{(i,j) \in I} \frac{-\delta f(.)}{\delta \theta_k} \sum_{l=1}^{L} z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) = 0 \tag{B.48}$$
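Taking the derivative of the function above ($l_{slow}$) with respect to $\theta_k$ we get:
$$\frac{\delta l_{slow}(\theta_k \mid .)}{\delta \theta_k} = \sum_{l=1}^{L} z_{lk} \sum_{(i,j) \in I} 2 \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \left( \frac{-\delta f(.)}{\delta \theta_k} \right) = 0 \tag{B.50}$$
$$\sum_{(i,j) \in I} \frac{-\delta f(.)}{\delta \theta_k} \sum_{l=1}^{L} z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) = 0 \tag{B.51}$$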
This is the same optimisation problem as in Eq. B.48, which proves that the two
formulations are equivalent with respect to θ.
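The equivalence can also be checked numerically for a single measurement $(i,j)$ and a single cluster, treating the model prediction $f$ as a free scalar. This is a small sanity-check sketch with made-up numbers and hypothetical function names; in the full model both gradients are additionally multiplied by the same factor $\delta f(.)/\delta \theta_k$, which does not change the location of the zero.

```python
def slow_gradient(f, z, v):
    """d/df of sum_l z_l (v_l - f)^2, the vertex-wise (slow) objective."""
    return sum(-2.0 * z_l * (v_l - f) for z_l, v_l in zip(z, v))


def fast_gradient(f, z, v):
    """d/df of (<v> - f)^2, where <v> is the z-weighted mean (fast objective)."""
    gamma = 1.0 / sum(z)
    mean_v = gamma * sum(z_l * v_l for z_l, v_l in zip(z, v))
    return -2.0 * (mean_v - f)


z = [0.2, 0.7, 0.1, 0.9]   # hypothetical responsibilities z_lk
v = [1.0, 2.0, 4.0, 3.0]   # hypothetical vertex measurements V_l^{ij}
gamma = 1.0 / sum(z)
f_star = gamma * sum(z_l * v_l for z_l, v_l in zip(z, v))  # weighted mean
```

The two gradients differ only by the constant factor $\gamma_k$, so both vanish at the same value `f_star`.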
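In order to prove that this is equivalent to the slow version, we need to take the
derivative of the likelihood function ($l_{fast}$) from the above equation with respect to $\alpha_i$,
$\beta_i$ and set it to zero:
$$\frac{\delta l_{fast}(\alpha_i, \beta_i \mid .)}{\delta \alpha_i, \beta_i} = \frac{\delta}{\delta \alpha_i, \beta_i} \sum_{k=1}^{K} \gamma_k^{-1} \frac{1}{2\sigma_k^2} \sum_{j \in I_i} \left( \langle V^{ij} \rangle_{\hat{Z}_k} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right)^2 = 0 \tag{B.53}$$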
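We expand the average across the vertices and slide the derivative operator inside the
sums:
$$\sum_{k=1}^{K} \gamma_k^{-1} \frac{1}{2\sigma_k^2} \sum_{j \in I_i} 2 \left( \gamma_k \sum_{l=1}^{L} z_{lk} V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \frac{-\delta f(.)}{\delta \alpha_i, \beta_i} \tag{B.54}$$
Since $\sum_{l=1}^{L} \gamma_k z_{lk} = 1$ we get:
$$2 \sum_{k=1}^{K} \gamma_k^{-1} \frac{1}{2\sigma_k^2} \sum_{j \in I_i} \frac{-\delta f(.)}{\delta \alpha_i, \beta_i} \left( \sum_{l=1}^{L} \gamma_k z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \right) \tag{B.55}$$
$$\sum_{k=1}^{K} \gamma_k^{-1} \gamma_k \frac{1}{2\sigma_k^2} \sum_{j \in I_i} \frac{-\delta f(.)}{\delta \alpha_i, \beta_i} \left( \sum_{l=1}^{L} z_{lk} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \right) \tag{B.56}$$
Further sliding $\sum_{l=1}^{L} z_{lk}$ to the left we get the final optimisation problem:
$$\sum_{k=1}^{K} \frac{1}{2\sigma_k^2} \sum_{l=1}^{L} z_{lk} \sum_{j \in I_i} \frac{-\delta f(.)}{\delta \alpha_i, \beta_i} \left( V_l^{ij} - f(\alpha_i t_{ij} + \beta_i \mid \theta_k) \right) \tag{B.57}$$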
This is the same problem as the fast implementation one from Eq. B.57, thus the fast
model is equivalent to the slow model with respect to α, β.
Appendix C
[Figure: panels biomarker 3, biomarker 4 and biomarker 5; x-axis: Disease Progression (years)]
Figure C.1: Estimated biomarker trajectories for the “synthetic AD” disease, plotted
alongside true trajectories. Estimation of the trajectories in biomarkers 0, 1, 4 and 5 has
been done without any data from the “synthetic PCA” disease, only based on the disease-
agnostic correlations with biomarkers 2 and 3.
Appendix D
D.1.1 M-step
In the M-step we aim to find the arguments θ∗ that maximise the expected log-likelihood
of the complete data θ∗ = arg maxθ Q(θ|θold ).
P X
" P
#
X Y
Q(θ|θold ) = C + p(Zi = zi |Xi , θold ) log p(Xi |Zi = zi , θ) (D.4)
i=1 zi i=1
After moving the log inside the products and removing the constant C we get:
P X
" z N
#
X X i X
Q(θ|θold ) = p(Zi = zi |Xi , θold ) log p(xij |ES(j) ) + log p(xij |¬ES(j) )
i=1 zi j=1 j=zi +1
(D.5)
Replacing p(x|E) and p(x|¬E) with the pdf of a Gaussian distribution we get:
$$Q(\theta|\theta^{old}) = \sum_{i=1}^{P} \sum_{z_i} p(Z_i = z_i | X_i, \theta^{old}) \left[ \sum_{j=1}^{z_i} \log N(x_{ij} | \mu^a_{S(j)}, \sigma^a_{S(j)}) + \sum_{j=z_i+1}^{N} \log N(x_{ij} | \mu^n_{S(j)}, \sigma^n_{S(j)}) \right] \tag{D.6}$$
The function $Q(\theta|\theta^{old})$ is differentiable with respect to all parameters apart from $S$
(which is discrete). We can thus find $\theta^*$ by solving $\nabla_\theta Q(\theta|\theta^{old}) = 0$. We show the
derivation for parameter $\mu_k^n$, which is the solution of $\frac{d}{d\mu_k^n} Q(\theta|\theta^{old}) = 0$. Using the result
from equation D.6 and moving the derivative operator inside the sums we get:

$$\frac{d}{d\mu_k^n} Q(\theta|\theta^{old}) = \sum_{i=1}^{P} \sum_{z_i} p(Z_i = z_i | X_i, \theta^{old}) \left[ \sum_{j=1}^{z_i} \frac{d}{d\mu_k^n} \log N(x_{ij} | \mu^a_{S(j)}, \sigma^a_{S(j)}) + \sum_{j=z_i+1}^{N} \frac{d}{d\mu_k^n} \log N(x_{ij} | \mu^n_{S(j)}, \sigma^n_{S(j)}) \right] = 0 \tag{D.7}$$
The derivative term cancels all likelihood terms apart from the one where S(j) = k:
$$\sum_{i=1}^{P} \sum_{z_i} p(Z_i = z_i | X_i, \theta^{old}) \left[ \sum_{j=z_i+1}^{N} I[S(j) = k] \frac{d}{d\mu_k^n} \log N(x_{ij} | \mu_k^n, \sigma_k^n) \right] = 0 \tag{D.8}$$
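Since $S$ is a permutation, $S(j) = k$ holds only for $j = S^{-1}(k)$, and this term lies inside
the sum only when $S^{-1}(k) > z_i$:

$$\sum_{i=1}^{P} \sum_{z_i} p(Z_i = z_i | X_i, \theta^{old}) \frac{d}{d\mu_k^n} \log N(x_{ik} | \mu_k^n, \sigma_k^n) \, I[S^{-1}(k) > z_i] = 0 \tag{D.10}$$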
Further rearranging the sum terms we get:
$$\sum_{i=1}^{P} \frac{d}{d\mu_k^n} \log N(x_{ik} | \mu_k^n, \sigma_k^n) \sum_{z_i=0}^{N} I[S^{-1}(k) > z_i] \, p(Z_i = z_i | X, \theta^{old}) = 0 \tag{D.11}$$
D.1. EBM Fitting using Expectation-Maximisation 189
$$\sum_{i=1}^{P} \frac{d}{d\mu_k^n} \log N(x_{ik} | \mu_k^n, \sigma_k^n) \, p(S^{-1}(k) > Z_i | X, \theta^{old}) = 0 \tag{D.12}$$
which results in the update rule for $\mu_k^n$, the mean of $p(x|\neg E_k)$:

$$\mu_k^n = \sum_{i=1}^{P} x_{ik} w_i^n \tag{D.14}$$

where

$$w_i^n = \frac{p(S^{-1}(k) > Z_i | X, \theta^{old})}{\sum_{i=1}^{P} p(S^{-1}(k) > Z_i | X, \theta^{old})} \tag{D.15}$$
and

$$p(S^{-1}(k) > Z_i | X, \theta^{old}) = \sum_{l=0}^{S^{-1}(k)-1} p(Z_i = l | X, \theta^{old}) \tag{D.16}$$
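Using a similar approach we get the update rules for $\sigma_k^n$, $\mu_k^a$, $\sigma_k^a$:

$$\sigma_k^n = \sqrt{\sum_{i=1}^{P} w_i^n (x_{ik} - \mu_k^n)^2} \tag{D.17}$$

$$\mu_k^a = \sum_{i=1}^{P} x_{ik} w_i^a \tag{D.18}$$

$$\sigma_k^a = \sqrt{\sum_{i=1}^{P} w_i^a (x_{ik} - \mu_k^a)^2} \tag{D.19}$$

where

$$w_i^a = \frac{p(S^{-1}(k) \leq Z_i | X, \theta^{old})}{\sum_{i=1}^{P} p(S^{-1}(k) \leq Z_i | X, \theta^{old})} \tag{D.20}$$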
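
The update rules D.14 to D.20 amount to weighted means and weighted standard deviations, where each subject is weighted by the posterior probability that biomarker $k$ is still normal (or already abnormal) at that subject's stage. A minimal sketch follows; it assumes the stage posteriors are stored as a matrix `stage_post[i, l]` $= p(Z_i = l | X, \theta^{old})$ with stages $l = 0, \dots, N$, and that `S` lists 0-based biomarker indices in sequence order. The function name `m_step_gaussians` and the array layout are illustrative choices, not taken from the thesis code.

```python
import numpy as np


def m_step_gaussians(X, stage_post, S):
    """Weighted Gaussian updates for a fixed sequence S.

    X          : (P, N) biomarker values x_ik
    stage_post : (P, N + 1) stage posteriors p(Z_i = l | X, theta_old), l = 0..N
    S          : length-N sequence, S[j - 1] = biomarker at position j
    """
    X = np.asarray(X, dtype=float)
    stage_post = np.asarray(stage_post, dtype=float)
    N = X.shape[1]
    mu_n, sigma_n = np.zeros(N), np.zeros(N)
    mu_a, sigma_a = np.zeros(N), np.zeros(N)
    for pos, k in enumerate(S, start=1):                 # pos = S^{-1}(k), 1-based
        x = X[:, k]
        p_normal = stage_post[:, :pos].sum(axis=1)       # p(S^{-1}(k) > Z_i | X, theta_old)
        p_abnormal = stage_post[:, pos:].sum(axis=1)     # p(S^{-1}(k) <= Z_i | X, theta_old)
        w_n = p_normal / p_normal.sum()                  # (D.15); assumes a non-zero total
        w_a = p_abnormal / p_abnormal.sum()              # (D.20); assumes a non-zero total
        mu_n[k] = np.sum(w_n * x)                                    # (D.14)
        sigma_n[k] = np.sqrt(np.sum(w_n * (x - mu_n[k]) ** 2))       # (D.17)
        mu_a[k] = np.sum(w_a * x)                                    # (D.18)
        sigma_a[k] = np.sqrt(np.sum(w_a * (x - mu_a[k]) ** 2))       # (D.19)
    return mu_n, sigma_n, mu_a, sigma_a


# Toy example: two subjects certainly at stage 0, two certainly at stage 2.
X_demo = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 5.0], [12.0, 7.0]])
post_demo = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
mu_n, sigma_n, mu_a, sigma_a = m_step_gaussians(X_demo, post_demo, S=[0, 1])
print(mu_n, sigma_n, mu_a, sigma_a)
```

With one-hot posteriors, as in the toy example, the updates reduce to the ordinary mean and standard deviation of the subjects on each side of the event.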
Solving for $S$ in the M-step is intractable, so we use MCMC sampling: at each step of the
sampling process we propose a new sequence $S^{new}$, find the optimal distribution
parameters for each biomarker given $S^{new}$ using the EM update rules, and then evaluate
the expected log-likelihood $Q(\theta|\theta^{old})$. The sequence and parameters that maximise $Q$ are
chosen and the EM proceeds to a new iteration. Although this approach does not
guarantee that we find the globally optimal parameters, it still results in an increase of
$Q(\theta|\theta^{old})$. This approach, called generalised EM, is still guaranteed to converge
to a local maximum [195]. For parameter initialisation, we use the mean and
standard deviation of the control and patient populations.
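
The propose, refit and evaluate loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: it proposes a new sequence by swapping two random positions and keeps the proposal only if it increases the score, which is a greedy stochastic search rather than a full MCMC sampler with an acceptance probability. The callback `fit_and_score` is hypothetical; it stands for "fit the distribution parameters for a given sequence and return them together with $Q$".

```python
import numpy as np


def propose_swap(S, rng):
    """Return a copy of sequence S with two randomly chosen positions swapped."""
    S_new = np.array(S, copy=True)
    a, b = rng.choice(len(S_new), size=2, replace=False)
    S_new[a], S_new[b] = S_new[b], S_new[a]
    return S_new


def search_sequence(S_init, fit_and_score, n_steps=200, seed=0):
    """Greedy stochastic search over sequences.

    fit_and_score(S) must return (params, q), where q plays the role of
    Q(theta | theta_old) evaluated at the parameters fitted for S.
    """
    rng = np.random.default_rng(seed)
    best_S = np.array(S_init, copy=True)
    best_params, best_q = fit_and_score(best_S)
    for _ in range(n_steps):
        S_new = propose_swap(best_S, rng)
        params, q = fit_and_score(S_new)
        if q > best_q:                                   # keep only improving proposals
            best_S, best_params, best_q = S_new, params, q
    return best_S, best_params, best_q


def toy_fit_and_score(S):
    """Toy stand-in for the EM refit: the score is highest for the identity ordering."""
    q = -float(np.abs(np.asarray(S) - np.arange(len(S))).sum())
    return None, q


S_start = np.array([3, 1, 0, 2])
_, q_start = toy_fit_and_score(S_start)
S_best, _, q_best = search_sequence(S_start, toy_fit_and_score)
print(S_best, q_best)
```

Because a proposal is accepted only when the score increases, the score never decreases across steps, which mirrors the generalised EM property that each iteration does not lower $Q$.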
190 Appendix D. Novel Extensions to the EBM and DEM
D.1.2 E-step
In the E-step, we simply estimate the latent disease stages Zi for every subject i. The
probability p(Zi = l|X, θold ) that subject i is at stage l in the abnormality sequence,
conditioned on the previous parameters θold , has a closed-form solution given by:
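$$p(Z_i = l | X, \theta^{old}) = \frac{\prod_{j=1}^{l} N(x_{i,S(j)} | \mu^a_{S(j)}, \sigma^a_{S(j)}) \prod_{j=l+1}^{N} N(x_{i,S(j)} | \mu^n_{S(j)}, \sigma^n_{S(j)})}{\sum_{m=0}^{N} \prod_{j=1}^{m} N(x_{i,S(j)} | \mu^a_{S(j)}, \sigma^a_{S(j)}) \prod_{j=m+1}^{N} N(x_{i,S(j)} | \mu^n_{S(j)}, \sigma^n_{S(j)})} \tag{D.21}$$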
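
A direct implementation of the stage posterior multiplies many small densities, so it is usually safer to work in log-space. The sketch below assumes a uniform prior over stages (as the normalisation in the closed-form expression suggests), 0-based biomarker indices in `S`, and per-biomarker parameter arrays; the function names are illustrative and not taken from the thesis code.

```python
import numpy as np


def log_normal_pdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def e_step_stage_posterior(X, S, mu_a, sigma_a, mu_n, sigma_n):
    """Return a (P, N + 1) matrix with entries p(Z_i = l | X, theta_old), l = 0..N."""
    X = np.asarray(X, dtype=float)
    S = np.asarray(S)
    mu_a, sigma_a = np.asarray(mu_a, float), np.asarray(sigma_a, float)
    mu_n, sigma_n = np.asarray(mu_n, float), np.asarray(sigma_n, float)
    P = X.shape[0]
    Xs = X[:, S]                                         # columns ordered by sequence position
    log_a = log_normal_pdf(Xs, mu_a[S], sigma_a[S])      # abnormal log-likelihoods, (P, N)
    log_n = log_normal_pdf(Xs, mu_n[S], sigma_n[S])      # normal log-likelihoods, (P, N)
    zeros = np.zeros((P, 1))
    # prefix[:, l] = sum over positions j <= l of the abnormal terms
    prefix = np.hstack([zeros, np.cumsum(log_a, axis=1)])
    # suffix[:, l] = sum over positions j > l of the normal terms
    suffix = np.hstack([np.cumsum(log_n[:, ::-1], axis=1)[:, ::-1], zeros])
    log_joint = prefix + suffix
    log_joint -= log_joint.max(axis=1, keepdims=True)    # stabilise before exponentiating
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


# Toy example with two biomarkers: normal values near 0, abnormal values near 10.
X_demo = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]])
post = e_step_stage_posterior(X_demo, S=[0, 1],
                              mu_a=[10.0, 10.0], sigma_a=[1.0, 1.0],
                              mu_n=[0.0, 0.0], sigma_n=[1.0, 1.0])
print(post.argmax(axis=1))
```

Subtracting the row-wise maximum before exponentiating leaves the normalised posterior unchanged while avoiding underflow when many biomarkers are involved.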
Appendix E
Bibliography
[1] Clifford R Jack, David S Knopman, William J Jagust, Leslie M Shaw, Paul S Aisen,
Michael W Weiner, Ronald C Petersen, and John Q Trojanowski. Hypothetical
model of dynamic biomarkers of the Alzheimer’s pathological cascade. The Lancet
Neurology, 9(1):119–128, 2010.
[2] Bruno M Jedynak, Andrew Lang, Bo Liu, Elyse Katz, Yanwei Zhang, Bradley T
Wyman, David Raunig, C Pierre Jedynak, Brian Caffo, Jerry L Prince, et al. A com-
putational neurodegenerative disease progression score: method and results with
the Alzheimer’s Disease Neuroimaging Initiative cohort. Neuroimage, 63(3):1478–
1486, 2012.
[4] Murat Bilgel, Jerry L Prince, Dean F Wong, Susan M Resnick, and Bruno M Jedy-
nak. A multivariate nonlinear mixed effects model for longitudinal image analysis:
Application to amyloid imaging. NeuroImage, 134:658–670, 2016.
[5] Igor Koval, J-B Schiratti, Alexandre Routier, Michael Bacci, Olivier Colliot,
Stéphanie Allassonnière, Stanley Durrleman, and Alzheimers Disease Neuroimag-
ing Initiative. Statistical learning of spatiotemporal patterns from longitudinal
manifold-valued networks. In International Conference on Medical Image Comput-
ing and Computer-Assisted Intervention, pages 451–459. Springer, 2017.
[6] Ashish Raj, Amy Kuceyeski, and Michael Weiner. A network diffusion model of
disease progression in dementia. Neuron, 73(6):1204–1215, 2012.
[7] Alistair Burns and Steve Iliffe. Alzheimer’s disease. BMJ, 338:467–471, 2009.
[8] World Health Organization (WHO) et al. Dementia fact sheet N. 362, 2012.
[10] Charles Marcus, Esther Mena, and Rathan M Subramaniam. Brain PET in the
diagnosis of Alzheimer’s disease. Clinical nuclear medicine, 39(10):e413, 2014.
[11] Amritpal Mudher and Simon Lovestone. Alzheimer’s disease–do tauists and baptists
finally shake hands? Trends in neurosciences, 25(1):22–26, 2002.
[12] Dev Mehta, Robert Jackson, Gaurav Paul, Jiong Shi, and Marwan Sabbagh. Why
do trials for Alzheimer’s disease drugs keep failing? A discontinued drug perspective
for 2010-2015. Expert opinion on investigational drugs, 26(6):735–739, 2017.
[13] Clare J Galton, Karalyn Patterson, John H Xuereb, and John R Hodges. Atypical
and typical presentations of Alzheimer’s disease: a clinical, neuropsychological,
neuroimaging and pathological study of 13 cases. Brain, 123(3):484–498, 2000.
[14] Melissa E Murray, Neill R Graff-Radford, Owen A Ross, Ronald C Petersen, Ranjan
Duara, and Dennis W Dickson. Neuropathologically defined subtypes of Alzheimer’s
disease with distinct clinical characteristics: a retrospective study. The Lancet
Neurology, 10(9):785–796, 2011.
[15] Wei Qiang, Wai-Ming Yau, Jun-Xia Lu, John Collinge, and Robert Tycko. Struc-
tural variation in amyloid-β fibrils from Alzheimer’s disease clinical subtypes. Na-
ture, 541(7636):217, 2017.
[17] D Frank Benson, R Jeffrey Davis, and Bruce D Snyder. Posterior cortical atrophy.
Archives of neurology, 45(7):789–793, 1988.
[18] Sebastian J Crutch, Manja Lehmann, Jonathan M Schott, Gil D Rabinovici, Mar-
tin N Rossor, and Nick C Fox. Posterior cortical atrophy. The Lancet Neurology,
11(2):170–178, 2012.
[19] François-Xavier Borruat. Posterior cortical atrophy: review of the recent literature.
Current neurology and neuroscience reports, 13(12):1–8, 2013.
[20] Manja Lehmann, Sebastian J Crutch, Gerard R Ridgway, Basil H Ridha, Josephine
Barnes, Elizabeth K Warrington, Martin N Rossor, and Nick C Fox. Cortical
thickness and voxel-based morphometry in posterior cortical atrophy and typical
Alzheimer’s disease. Neurobiology of aging, 32(8):1466–1476, 2011.
[22] Yaakov Stern. Cognitive reserve in ageing and Alzheimer’s disease. The Lancet
Neurology, 11(11):1006–1012, 2012.
[23] Hubert M Fonteijn, Marc Modat, Matthew J Clarkson, Josephine Barnes, Manja
Lehmann, Nicola Z Hobbs, Rachael I Scahill, Sarah J Tabrizi, Sebastien Ourselin,
Nick C Fox, et al. An event-based model for disease progression and its application
in familial Alzheimer’s disease and Huntington’s disease. NeuroImage, 60(3):1880–
1889, 2012.
[24] Alexandra L Young, Neil P Oxtoby, Pankaj Daga, David M Cash, Nick C Fox,
Sebastien Ourselin, Jonathan M Schott, and Daniel C Alexander. A data-driven
model of biomarker changes in sporadic Alzheimer’s disease. Brain, 137(9):2564–
2577, 2014.
[25] Victor L Villemagne, Samantha Burnham, Pierrick Bourgeat, Belinda Brown,
Kathryn A Ellis, Olivier Salvado, Cassandra Szoeke, S Lance Macaulay, Ralph
Martins, Paul Maruff, et al. Amyloid β deposition, neurodegeneration, and cogni-
tive decline in sporadic Alzheimer’s disease: a prospective cohort study. The Lancet
Neurology, 12(4):357–367, 2013.
[26] Bruno M. Jedynak, Andrew Lang, Bo Liu, Elyse Katz, Yanwei Zhang, Bradley T.
Wyman, David Raunig, C. Pierre Jedynak, Brian Caffo, and Jerry L. Prince. A com-
putational neurodegenerative disease progression score: Method and results with
the Alzheimer’s disease neuroimaging initiative cohort. NeuroImage, 63(3):1478–
1486, 2012.
[27] J-B Schiratti, Stéphanie Allassonniere, Alexandre Routier, Olivier Colliot, Stanley
Durrleman, and Alzheimers Disease Neuroimaging Initiative. A mixed-effects model
with time reparametrization for longitudinal univariate manifold-valued data. In
International Conference on Information Processing in Medical Imaging, pages 564–
575. Springer, 2015.
[28] Yasser Iturria-Medina, Roberto C Sotero, Paule J Toussaint, José María Mateos-Pérez,
Alan C Evans, Michael W Weiner, Paul Aisen, Ronald Petersen, Clifford R
Jack, William Jagust, et al. Early role of vascular dysregulation on late-onset
Alzheimer’s disease based on multifactorial data-driven analysis. Nature
communications, 7:11934, 2016.
[29] Alexandra L Young et al. Uncovering the heterogeneity and temporal complexity
of neurodegenerative diseases with Subtype and Stage Inference. Nature Commu-
nications, in press, 2018.
[30] Neil P Oxtoby, Alexandra L Young, David M Cash, Tammie LS Benzinger, Anne M
Fagan, John C Morris, Randall J Bateman, Nick C Fox, Jonathan M Schott, and
Daniel C Alexander. Data-driven models of dominantly-inherited Alzheimer’s dis-
ease progression. Brain, 141(5):1529–1544, 2018.
[31] SJ Ross, Naida Graham, Lindsay Stuart-Green, Miriam Prins, John Xuereb, Kar-
alyn Patterson, and John R Hodges. Progressive biparietal atrophy: an atypical
presentation of Alzheimer’s disease. Journal of Neurology, Neurosurgery & Psychi-
atry, 61(4):388–395, 1996.
[32] Maarten Goethals and Patrick Santens. Posterior cortical atrophy. Two case reports
and a review of the literature. Clinical neurology and neurosurgery, 103(2):115–119,
2001.
196 Appendix F. Bibliography
[33] Anna Rita Giovagnoli, Anna Aresi, Fabiola Reati, Alice Riva, Clara Gobbo, and
Alberto Bizzi. The neuropsychological and neuroradiological correlates of slowly
progressive visual agnosia. Neurological sciences, 30(2):123–131, 2009.
[34] Zheng Chang, Paul Lichtenstein, Henrik Larsson, and Seena Fazel. Substance use
disorders, psychiatric disorders, and mortality after release from prison: a nation-
wide longitudinal cohort study. The Lancet Psychiatry, 2(5):422–430, 2015.
[35] Jonathan Kennedy, Manja Lehmann, Magdalena J Sokolska, Hilary Archer, Eliza-
beth K Warrington, Nick C Fox, and Sebastian J Crutch. Visualizing the emergence
of posterior cortical atrophy. Neurocase, 18(3):248–257, 2012.
[36] Sebastian J Crutch, Jonathan M Schott, Gil D Rabinovici, Melissa Murray, Julie S
Snowden, Wiesje M van der Flier, Bradford C Dickerson, Rik Vandenberghe, Sam-
rah Ahmed, Thomas H Bak, et al. Consensus classification of posterior cortical
atrophy. Alzheimer’s & Dementia, 13(8):870–884, 2017.
[37] Manja Lehmann, Josephine Barnes, Gerard R Ridgway, Natalie S Ryan, Eliza-
beth K Warrington, Sebastian J Crutch, and Nick C Fox. Global gray matter
changes in posterior cortical atrophy: a serial imaging study. Alzheimer’s & De-
mentia, 8(6):502–512, 2012.
[38] William W Seeley, Richard K Crawford, Juan Zhou, Bruce L Miller, and Michael D
Greicius. Neurodegenerative diseases target large-scale human brain networks. Neu-
ron, 62(1):42–52, 2009.
[39] Esther E Bron, Marion Smits, Wiesje M Van Der Flier, Hugo Vrenken, Frederik
Barkhof, Philip Scheltens, Janne M Papma, Rebecca ME Steketee, Carolina Méndez
Orellana, Rozanna Meijboom, et al. Standardized evaluation of algorithms for
computer-aided diagnosis of dementia based on structural MRI: the CADDementia
challenge. NeuroImage, 111:562–579, 2015.
[40] Alessia Sarica, Cerasa Antonio, Quattrone Aldo, and Calhoun Vince. A machine
learning neuroimaging challenge for automated diagnosis of Mild Cognitive Impair-
ment. in press, 2018.
[41] Henry W Querfurth and Frank M LaFerla. Mechanisms of disease. New England
Journal of Medicine, 362(4):329–344, 2010.
[42] M Prince, A Wimo, M Guerchet, et al. The global impact of Dementia: an analysis
of prevalence, incidence, cost and trends. 2015. London, UK: Alzheimer’s Disease
International.
[43] Hans Förstl and Alexander Kurz. Clinical features of Alzheimer’s disease. European
archives of psychiatry and clinical neuroscience, 249(6):288–290, 1999.
[45] Joseph J Locascio, John H Growdon, and Suzanne Corkin. Cognitive test perfor-
mance in detecting, staging, and tracking Alzheimer’s disease. Archives of neurol-
ogy, 52(11):1087–1099, 1995.
[46] Vanessa Moore and Maria A Wyke. Drawing disability in patients with senile
dementia. Psychological Medicine, 14(1):97–105, 1984.
[48] Alastair Burns, Robin Jacoby, and Raymond Levy. Psychiatric phenomena in
Alzheimer’s disease. IV: Disorders of behaviour. The British Journal of Psychi-
atry, 157(1):86–94, 1990.
[49] William W Beatty, David P Salmon, Nelson Butters, William C Heindel, and Eric L
Granholm. Retrograde amnesia in patients with Alzheimer’s disease or Huntington’s
disease. Neurobiology of aging, 9:181–186, 1988.
[51] Jeffrey L Cummings, John P Houlihan, and Mary Ann Hill. The pattern of reading
deterioration in dementia of the Alzheimer type: Observations and implications.
Brain and language, 29(2):315–323, 1986.
[52] Jean Neils, Francois Boller, Bernice Gerdeman, and Monroe Cole. Descriptive
writing abilities in Alzheimer’s disease. Journal of Clinical and Experimental neu-
ropsychology, 11(5):692–698, 1989.
[53] Barry Reisberg, Stefanie R Auer, Isabel Monteiro, Istvan Boksay, and Steven G
Sclan. Behavioral disturbances of dementia: an overview of phenomenology and
methodologic concerns. International Psychogeriatrics, 8(S2):169–182, 1996.
[55] Clive Ballard, Serge Gauthier, Anne Corbett, Carol Brayne, Dag Aarsland, and
Emma Jones. Alzheimer’s disease. The Lancet, 377(9770):1019–1031, 2011.
[56] John Hardy and David Allsop. Amyloid deposition as the central event in the
aetiology of Alzheimer’s disease. Trends in pharmacological sciences, 12:383–388,
1991.
[58] J Götz, F Chen, Jo Van Dorpe, and RM Nitsch. Formation of neurofibrillary tangles
in P301L tau transgenic mice induced by Aβ42 fibrils. Science, 293(5534):1491–
1495, 2001.
[59] Jada Lewis, Dennis W Dickson, Wen-Lang Lin, Louise Chisholm, Anthony Corral,
Graham Jones, Shu-Hui Yen, Naruhiko Sahara, Lisa Skipper, Debra Yager, et al.
Enhanced neurofibrillary degeneration in transgenic mice expressing mutant tau
and APP. Science, 293(5534):1487–1491, 2001.
[60] Erik D Roberson, Kimberly Scearce-Levie, Jorge J Palop, Fengrong Yan, Irene H
Cheng, Tiffany Wu, Hilary Gerstein, Gui-Qiu Yu, and Lennart Mucke. Reducing
endogenous tau ameliorates amyloid β-induced deficits in an Alzheimer’s disease
mouse model. Science, 316(5825):750–754, 2007.
[61] George S Bloom. Amyloid-β and tau: the trigger and bullet in Alzheimer disease
pathogenesis. JAMA neurology, 71(4):505–508, 2014.
[62] Karen K Hsiao, David R Borchelt, Kristine Olson, Rosa Johannsdottir, Cheryl Kitt,
Wael Yunis, Sherry Xu, Chris Eckman, Steven Younkin, Donald Price, et al. Age-
related CNS disorder and early death in transgenic FVB/N mice overexpressing
Alzheimer amyloid precursor proteins. Neuron, 15(5):1203–1218, 1995.
[63] Michael C Irizarry, Megan McNamara, Kerri Fedorchak, Karen Hsiao, and
Bradley T Hyman. APPSw transgenic mice develop age-related Aβ deposits and
neuropil abnormalities, but no neuronal loss in CA1. Journal of Neuropathology &
Experimental Neurology, 56(9):965–973, 1997.
[65] H Braak and E Braak. Evolution of neuronal changes in the course of Alzheimer’s
disease. In Ageing and dementia, pages 127–140. Springer, 1998.
[66] E Braak, Heiko Braak, and E-M Mandelkow. A sequence of cytoskeleton changes
related to the formation of neurofibrillary tangles and neuropil threads. Acta neu-
ropathologica, 87(6):554–567, 1994.
[68] Paul T Francis, Alan M Palmer, Michael Snape, and Gordon K Wilcock. The
cholinergic hypothesis of Alzheimer’s disease: a review of progress. Journal of
Neurology, Neurosurgery & Psychiatry, 66(2):137–147, 1999.
[69] Peter Davies and AJF Maloney. Selective loss of central cholinergic neurons in
Alzheimer’s disease. The Lancet, 308(8000):1403, 1976.
[70] Alessandro Martorana, Zaira Esposito, and Giacomo Koch. Beyond the cholinergic
hypothesis: do current drugs work in Alzheimer’s disease? CNS neuroscience &
therapeutics, 16(4):235–245, 2010.
[73] John S Meyer, Gaiane Rauch, Ronald A Rauch, and A Haque. Risk factors for
cerebral hypoperfusion, mild cognitive impairment, and dementia. Neurobiology of
aging, 21(2):161–169, 2000.
[74] Monique Breteler. Vascular involvement in cognitive decline and dementia: epi-
demiologic evidence from the Rotterdam Study and the Rotterdam Scan Study.
Annals of the New York Academy of Sciences, 903(1):457–465, 2000.
[76] MC Polidori, M Marvardi, A Cherubini, U Senin, and P Mecocci. Heart disease and
vascular risk factors in the cognitively impaired elderly: implications for Alzheimer’s
dementia. Aging Clinical and Experimental Research, 13(3):231–239, 2001.
[77] Albert Hofman, Alewijn Ott, Monique MB Breteler, Michiel L Bots, Arjen JC
Slooter, Frans van Harskamp, Cornelia N van Duijn, Christine Van Broeck-
hoven, and Diederick E Grobbee. Atherosclerosis, apolipoprotein E, and preva-
lence of dementia and Alzheimer’s disease in the Rotterdam Study. The Lancet,
349(9046):151–154, 1997.
[78] Morgan Robinson, Brenda Y Lee, and Francis T Hane. Recent progress in
Alzheimer’s disease research, part 2: genetics and epidemiology. Journal of
Alzheimer’s Disease, 57(2):317–330, 2017.
[79] AL Mina Bergem, Knut Engedal, and Einar Kringlen. The role of heredity in late-
onset Alzheimer disease and vascular dementia: a twin study. Archives of General
Psychiatry, 54(3):264–270, 1997.
[80] Margaret Gatz, Nancy L Pedersen, Stig Berg, Boo Johansson, Kurt Johansson,
James A Mortimer, Samuel F Posner, Matti Viitanen, Bengt Winblad, and Anders
Ahlbom. Heritability for Alzheimer’s disease: the study of dementia in Swedish
twins. The Journals of Gerontology Series A: Biological Sciences and Medical Sci-
ences, 52(2):M117–M125, 1997.
[81] Vincent Chouraki and Sudha Seshadri. Genetics of Alzheimer’s disease. In Advances
in genetics, volume 87, pages 245–294. Elsevier, 2014.
[82] George G Glenner and Caine W Wong. Alzheimer’s disease and Down’s syndrome:
sharing of a unique cerebrovascular amyloid fibril protein. Biochemical and bio-
physical research communications, 122(3):1131–1135, 1984.
[84] Dmitry Goldgaber, Michael I Lerman, O Westley McBride, Umberto Saffiotti, and
D Carleton Gajdusek. Characterization and chromosomal localization of a cDNA
encoding brain amyloid of Alzheimer’s disease. Science, 235(4791):877–880, 1987.
[87] Ephrat Levy-Lahad, Wilma Wasco, Parvoneh Poorkaj, Donna M Romano, Junko
Oshima, Warren H Pettingell, Chang En Yu, Paul D Jondro, Stephen D Schmidt,
Kai Wang, et al. Candidate gene for the chromosome 1 familial Alzheimer’s disease
locus. Science, 269(5226):973–977, 1995.
[91] Denise Harold, Richard Abraham, Paul Hollingworth, Rebecca Sims, Amy Ger-
rish, Marian L Hamshere, Jaspreet Singh Pahwa, Valentina Moskvina, Kimber-
ley Dowzell, Amy Williams, et al. Genome-wide association study identifies vari-
ants at CLU and PICALM associated with Alzheimer’s disease. Nature genetics,
41(10):1088, 2009.
[93] Sudha Seshadri, Annette L Fitzpatrick, M Arfan Ikram, Anita L DeStefano, Vil-
mundur Gudnason, Merce Boada, Joshua C Bis, Albert V Smith, Minerva M Car-
rasquillo, Jean Charles Lambert, et al. Genome-wide analysis of genetic loci asso-
ciated with Alzheimer disease. JAMA, 303(18):1832–1840, 2010.
[94] Paul Hollingworth, Denise Harold, Rebecca Sims, Amy Gerrish, Jean-Charles
Lambert, Minerva M Carrasquillo, Richard Abraham, Marian L Hamshere,
Jaspreet Singh Pahwa, Valentina Moskvina, et al. Common variants at ABCA7,
MS4A6A/MS4A4E, EPHA1, CD33 and CD2AP are associated with Alzheimer’s
disease. Nature genetics, 43(5):429, 2011.
[95] Adam C Naj, Gyungah Jun, Gary W Beecham, Li-San Wang, Badri Narayan Var-
darajan, Jacqueline Buros, Paul J Gallins, Joseph D Buxbaum, Gail P Jarvik,
Paul K Crane, et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33
and EPHA1 are associated with late-onset Alzheimer’s disease. Nature genetics,
43(5):436, 2011.
[96] Madhav Thambisetty, Yang An, and Toshiko Tanaka. Alzheimer’s disease risk genes
and the age-at-onset phenotype. Neurobiology of aging, 34(11):2696–e1, 2013.
[97] Madhav Thambisetty, Lori L Beason-Held, Yang An, Michael Kraut, Michael Nalls,
Dena G Hernandez, Andrew B Singleton, Alan B Zonderman, Luigi Ferrucci, Simon
Lovestone, et al. Alzheimer risk variant CLU and brain function during aging.
Biological psychiatry, 73(5):399–405, 2013.
[99] Lori B Chibnik, Joshua M Shulman, Sue E Leurgans, Julie A Schneider, Robert S
Wilson, Dong Tran, Cristin Aubin, Aron S Buchman, Christopher B Heward,
Amanda J Myers, et al. CR1 is associated with amyloid plaque burden and age-
related cognitive decline. Annals of neurology, 69(3):560–569, 2011.
[100] Lyzel S Elias-Sonnenschein, Seppo Helisalmi, Teemu Natunen, Anette Hall, Teemu
Paajanen, Sanna-Kaisa Herukka, Marjo Laitinen, Anne M Remes, Anne M
Koivisto, Kari M Mattila, et al. Genetic loci associated with Alzheimer’s dis-
ease and cerebrospinal fluid biomarkers in a Finnish case-control cohort. PloS one,
8(4):e59676, 2013.
[101] John SK Kauwe, Carlos Cruchaga, Celeste M Karch, Brooke Sadler, Mo Lee, Kevin
Mayo, Wayne Latu, Manti Su’a, Anne M Fagan, David M Holtzman, et al. Fine
mapping of genetic variants in BIN1, CLU, CR1 and PICALM for association with
cerebrospinal fluid biomarkers for Alzheimer’s disease. PloS one, 6(2):e15918, 2011.
[102] Janita Bralten, Barbara Franke, Alejandro Arias-Vásquez, Angelien Heister, Han G
Brunner, Guillén Fernández, and Mark Rijpkema. CR1 genotype is associated
with entorhinal cortex volume in young healthy adults. Neurobiology of aging,
32(11):2106–e7, 2011.
[106] Matthew Baumgart, Heather M Snyder, Maria C Carrillo, Sam Fazio, Hye Kim,
and Harry Johns. Summary of the evidence on modifiable risk factors for cognitive
decline and dementia: a population-based perspective. Alzheimer’s & Dementia,
11(6):718–726, 2015.
[107] Stephen Todd, Stephen Barr, Mark Roberts, and A Peter Passmore. Survival in
dementia and predictors of mortality: a review. International journal of geriatric
psychiatry, 28(11):1109–1124, 2013.
[108] Janine K Cataldo, Judith J Prochaska, and Stanton A Glantz. Cigarette smoking
is a risk factor for Alzheimer’s Disease: an analysis controlling for tobacco industry
affiliation. Journal of Alzheimer’s disease, 19(2):465–480, 2010.
[109] Paula Valencia Moulton and Wei Yang. Air pollution, oxidative stress, and
Alzheimer’s disease. Journal of environmental and public health, 2012, 2012.
[110] J Eric Ahlskog, Yonas E Geda, Neill R Graff-Radford, and Ronald C Petersen.
Physical exercise as a preventive or disease-modifying treatment of dementia and
brain aging. In Mayo Clinic Proceedings, volume 86, pages 876–884. Elsevier, 2011.
[111] Guy McKhann, David Drachman, Marshall Folstein, Robert Katzman, Donald
Price, and Emanuel M Stadlan. Clinical diagnosis of Alzheimer’s disease: Report of
the NINCDS-ADRDA Work Group under the auspices of Department of Health
and Human Services Task Force on Alzheimer’s Disease. Neurology, 34(7):939–939,
1984.
[112] Karl Herholz. Use of FDG PET as an imaging biomarker in clinical trials of
Alzheimer’s disease. Biomarkers in medicine, 6(4):431–439, 2012.
[113] William E Klunk, Henry Engler, Agneta Nordberg, Yanming Wang, Gunnar
Blomqvist, Daniel P Holt, Mats Bergström, Irina Savitcheva, Guo-Feng Huang,
Sergio Estrada, et al. Imaging brain amyloid in Alzheimer’s disease with Pitts-
burgh Compound-B. Annals of neurology, 55(3):306–319, 2004.
[114] Kaj Blennow and Harald Hampel. CSF markers for incipient Alzheimer’s disease.
The Lancet Neurology, 2(10):605–613, 2003.
[115] Bruno Dubois, Howard H Feldman, Claudia Jacova, Steven T DeKosky, Pascale
Barberger-Gateau, Jeffrey Cummings, André Delacourte, Douglas Galasko, Serge
Gauthier, Gregory Jicha, et al. Research criteria for the diagnosis of Alzheimer’s
disease: revising the NINCDS–ADRDA criteria. The Lancet Neurology, 6(8):734–
746, 2007.
[116] Bruno Dubois, Howard H Feldman, Claudia Jacova, Jeffrey L Cummings, Steven T
DeKosky, Pascale Barberger-Gateau, André Delacourte, Giovanni Frisoni, Nick C
Fox, Douglas Galasko, et al. Revising the definition of Alzheimer’s disease: a new
lexicon. The Lancet Neurology, 9(11):1118–1127, 2010.
[117] Urban Ekman, Daniel Ferreira, and Eric Westman. The A/T/N biomarker scheme
and patterns of brain atrophy assessed in mild cognitive impairment. Scientific
reports, 8(1):8431, 2018.
[118] Martin Reuter, Nicholas J Schmansky, H Diana Rosas, and Bruce Fischl. Within-
subject template estimation for unbiased longitudinal image analysis. Neuroimage,
61(4):1402–1418, 2012.
[119] Clifford R Jack Jr, David S Knopman, William J Jagust, Ronald C Petersen,
Michael W Weiner, Paul S Aisen, Leslie M Shaw, Prashanthi Vemuri, Heather J
Wiste, Stephen D Weigand, et al. Update on hypothetical model of Alzheimer’s
disease biomarkers. Lancet neurology, 12(2):207, 2013.
[120] Ann D Cohen and William E Klunk. Early detection of Alzheimer’s disease using
PiB and FDG PET. Neurobiology of disease, 72:117–122, 2014.
[121] Val J Lowe, Geoffry Curran, Ping Fang, Amanda M Liesinger, Keith A Josephs,
Joseph E Parisi, Kejal Kantarci, Bradley F Boeve, Mukesh K Pandey, Tyler Bru-
insma, et al. An autoradiographic evaluation of AV-1451 Tau PET in dementia.
Acta neuropathologica communications, 4(1):58, 2016.
[123] Perminder S Sachdev, Lin Zhuang, Nady Braidy, and Wei Wen. Is Alzheimer’s a
disease of the white matter? Current opinion in psychiatry, 26(3):244–251, 2013.
[125] Yu Zhang, Norbert Schuff, An-Tao Du, Howard J Rosen, Joel H Kramer,
Maria Luisa Gorno-Tempini, Bruce L Miller, and Michael W Weiner. White matter
damage in frontotemporal dementia and Alzheimer’s disease measured by diffusion
MRI. Brain, 132(9):2579–2592, 2009.
[126] Juan Zhou, Efstathios D Gennatas, Joel H Kramer, Bruce L Miller, and William W
Seeley. Predicting regional neurodegeneration from the healthy brain functional
connectome. Neuron, 73(6):1216–1227, 2012.
[127] Hui Zhang, Torben Schneider, Claudia A Wheeler-Kingshott, and Daniel C Alexan-
der. NODDI: practical in vivo neurite orientation dispersion and density imaging
of the human brain. Neuroimage, 61(4):1000–1016, 2012.
[129] Basil H Ridha, Josephine Barnes, Jonathan W Bartlett, Alison Godbolt, Tracey
Pepple, Martin N Rossor, and Nick C Fox. Tracking atrophy progression in familial
Alzheimer’s disease: a serial MRI study. The Lancet Neurology, 5(10):828–834,
2006.
[130] NC Fox, RI Scahill, WR Crum, and MN Rossor. Correlation between rates of brain
atrophy and cognitive decline in AD. Neurology, 52(8):1687–1687, 1999.
[131] Rachael I Scahill, Jonathan M Schott, John M Stevens, Martin N Rossor, and
Nick C Fox. Mapping the evolution of regional atrophy in Alzheimer’s disease: un-
biased analysis of fluid-registered serial MRI. Proceedings of the National Academy
of Sciences, 99(7):4703–4707, 2002.
[133] Jonathan M Schott, Nick C Fox, Chris Frost, Rachael I Scahill, John C Janssen,
Dennis Chan, Rhian Jenkins, and Martin N Rossor. Assessing the onset of structural
change in familial Alzheimer’s disease. Annals of neurology, 53(2):181–188, 2003.
[135] Clifford R Jack, Ronald C Petersen, Yue Cheng Xu, Stephen C Waring, Peter C
O’Brien, Eric G Tangalos, Glenn E Smith, Robert J Ivnik, and Emre Kokmen. Me-
dial temporal atrophy on MRI in normal aging and very mild Alzheimer’s disease.
Neurology, 49(3):786–794, 1997.
[136] Stephane Lehericy, Michel Baulac, Jacques Chiras, Laurent Pierot, Nadine Martin,
Bernard Pillon, Bernard Deweer, Bruno Dubois, and Claude Marsault. Amygdalo-
hippocampal MR volume measurements in the early stages of Alzheimer disease.
American Journal of Neuroradiology, 15(5):929–937, 1994.
[137] Kirsi Juottonen, Mikko P Laakso, Kaarina Partanen, and Hilkka Soininen. Com-
parative MR analysis of the entorhinal cortex and hippocampus in diagnosing
Alzheimer disease. American Journal of Neuroradiology, 20(1):139–144, 1999.
[139] Christopher M Clark, Julie A Schneider, Barry J Bedell, Thomas G Beach, War-
ren B Bilker, Mark A Mintun, Michael J Pontecorvo, Franz Hefti, Alan P Carpenter,
Matthew L Flitter, et al. Use of florbetapir-PET for imaging β-amyloid pathology.
JAMA, 305(3):275–283, 2011.
[141] Katia Andrade, Dalila Samri, Marie Sarazin, Leonardo C de Souza, Laurent Cohen,
Michel T de Schotten, Bruno Dubois, and Paolo Bartolomeo. Visual neglect in
posterior cortical atrophy. BMC neurology, 10(1):68, 2010.
[142] Katia Andrade, Aurélie Kas, Romain Valabrègue, Dalila Samri, Marie Sarazin,
Marie-Odile Habert, Bruno Dubois, and Paolo Bartolomeo. Visuospatial deficits in
posterior cortical atrophy: structural and functional correlates. Journal of Neurol-
ogy, Neurosurgery & Psychiatry, 2012.
[143] Katia Andrade, Aurélie Kas, Dalila Samri, Marie Sarazin, Bruno Dubois, Marie-
Odile Habert, and Paolo Bartolomeo. Visuospatial deficits and hemispheric perfu-
sion asymmetries in posterior cortical atrophy. Cortex, 49(4):940–947, 2013.
[144] Manja Lehmann, Josephine Barnes, Gerard R Ridgway, John Wattam-Bell, Eliza-
beth K Warrington, Nick C Fox, and Sebastian J Crutch. Basic visual function and
cortical thickness patterns in posterior cortical atrophy. Cerebral cortex, 21(9):2122–
2132, 2011.
[145] Raphaël Depaz, Stéphane Haik, Katell Peoch, Danielle Seilhean, David Grabli,
Savine Vicart, Marie Sarazin, Bertrand DeToffol, Catherine Remy, Catherine Fallet-
Bianco, et al. Long-standing prion dementia manifesting as posterior cortical atro-
phy. Alzheimer Disease & Associated Disorders, 26(3):289–292, 2012.
[146] Mario F Mendez, Mehdi Ghajarania, and Kent M Perryman. Posterior cortical
atrophy: clinical characteristics and differences compared to Alzheimer’s disease.
Dementia and geriatric cognitive disorders, 14(1):33–40, 2002.
[149] Raffaella Migliaccio, Federica Agosta, Katya Rascovsky, Anna Karydas, Stephen
Bonasera, Gil D Rabinovici, BL Miller, and ML Gorno-Tempini. Clinical syndromes
associated with posterior atrophy early age at onset AD spectrum. Neurology,
73(19):1571–1578, 2009.
[150] Jonathan M Schott, Basil H Ridha, Sebastian J Crutch, Daniel G Healy, James B
Uphill, Elizabeth K Warrington, Martin N Rossor, and Nick C Fox. Apolipoprotein
E genotype modifies the phenotype of Alzheimer disease. Archives of neurology,
63(1):155–156, 2006.
[152] Martin A Goldstein, Iliyan Ivanov, and Michael E Silverman. Posterior cortical
atrophy: an exemplar for renovating diagnostic formulation in neuropsychiatry.
Comprehensive psychiatry, 52(3):326–333, 2011.
[155] Patrick R Hof, Brent A Vogt, Constantin Bouras, and John H Morrison. Atypical
form of Alzheimer’s disease with prominent posterior cortical atrophy: a review
of lesion distribution and circuit disconnection in cortical visual pathways. Vision
research, 37(24):3609–3625, 1997.
[157] Tomokatsu Yoshida, Kensuke Shiga, Kenji Yoshikawa, Kei Yamada, and Masanori
Nakagawa. White matter loss in the splenium of the corpus callosum in a case of
posterior cortical atrophy: a diffusion tensor imaging study. European neurology,
52(2):77–81, 2004.
[158] Raffaella Migliaccio, Federica Agosta, Monica N Toba, Dalila Samri, Fabian Corlier,
Leonardo C De Souza, Marie Chupin, Michael Sharman, Maria L Gorno-Tempini,
Bruno Dubois, et al. Brain networks in posterior cortical atrophy: a single case
tractography study and literature review. Cortex, 48(10):1298–1309, 2012.
[159] Aurelie Kas, Leonardo Cruz De Souza, Dalila Samri, Paolo Bartolomeo, Lucette
Lacomblez, Michel Kalafat, Raffaella Migliaccio, Michel Thiebaut de Schotten, Lau-
rent Cohen, Bruno Dubois, et al. Neural correlates of cognitive impairment in
posterior cortical atrophy. Brain, 134(5):1464–1478, 2011.
[160] Simona Gardini, Letizia Concari, Salvatrice Pagliara, Caterina Ghetti, Annalena
Venneri, and Paolo Caffarra. Visuo-spatial imagery impairment in posterior cortical
[161] Judith Aharon-Peretz, Ora Israel, Dorit Goldsher, and Aharon Peretz. Posterior
cortical atrophy variants of Alzheimer’s disease. Dementia and geriatric cognitive
disorders, 10(6):483–487, 1999.
[162] Pietro Pietrini, Maura L Furey, Neill Graff-Radford, Ulderico Freo, et al. Prefer-
ential metabolic involvement of visual cortical areas in a subtype of Alzheimer’s
disease: clinical implications. The American journal of psychiatry, 153(10):1261,
1996.
[163] Steven Y Ng, Victor L Villemagne, Colin L Masters, and Christopher C Rowe. Eval-
uating Atypical Dementia Syndromes Using Positron Emission Tomography With
Carbon 11–Labeled Pittsburgh Compound B. Archives of neurology, 64(8):1140–
1144, 2007.
[164] Taiki Kambe, Yumiko Motoi, Kenji Ishii, and Nobutaka Hattori. Posterior cortical
atrophy with 11–C Pittsburgh compound B accumulation in the primary visual
cortex. Journal of neurology, 257(3):469–471, 2010.
[165] Olli Tenovuo, Nina Kemppainen, Sargo Aalto, Kjell Någren, and Juha O Rinne.
Posterior cortical atrophy: A rare form of dementia with in vivo evidence of
amyloid-β accumulation. Journal of Alzheimer’s Disease, 15(3):351–355, 2008.
[166] Maïté Formaglio, Nicolas Costes, Jérémie Seguin, Yannick Tholance, Didier
Le Bars, Isabelle Roullet-Solignac, Bernadette Mercier, Pierre Krolak-Salmon, and
Alain Vighetto. In vivo demonstration of amyloid burden in posterior cortical atro-
phy: a case series with PET and CSF findings. Journal of neurology, 258(10):1841–
1851, 2011.
[167] Leonardo Cruz De Souza, Fabian Corlier, Marie-Odile Habert, Olga Uspenskaya,
Renaud Maroy, Foudil Lamari, Marie Chupin, Stéphane Lehéricy, Olivier Colliot,
Valérie Hahn-Barma, et al. Similar amyloid-β burden in posterior cortical atrophy
and Alzheimer’s disease. Brain, 134(7):2036–2043, 2011.
[168] David N Levine, John M Lee, and CM Fisher. The visual variant of Alzheimer’s
disease: a clinicopathologic case study. Neurology, 43(2):305–305, 1993.
[169] Clifford R Jack Jr, Val J Lowe, Matthew L Senjem, Stephen D Weigand, Bradley J
Kemp, Maria M Shiung, David S Knopman, Bradley F Boeve, William E Klunk,
Chester A Mathis, et al. 11C PiB and structural MRI provide complementary infor-
mation in imaging of Alzheimer’s disease and amnestic mild cognitive impairment.
Brain, 131(3):665–680, 2008.
[172] Bradford C Dickerson, Akram Bakkour, David H Salat, Eric Feczko, Jenni Pacheco,
Douglas N Greve, Fran Grodstein, Christopher I Wright, Deborah Blacker, H Di-
ana Rosas, et al. The cortical signature of Alzheimer’s disease: regionally specific
cortical thinning relates to symptom severity in very mild to mild AD dementia
and is detectable in asymptomatic amyloid-positive individuals. Cerebral cortex,
19(3):497–510, 2009.
[173] Paul M Thompson, Michael S Mega, Roger P Woods, Chris I Zoumalan, Chris J
Lindshield, Rebecca E Blanton, Jacob Moussai, Colin J Holmes, Jeffrey L Cum-
mings, and Arthur W Toga. Cortical change in Alzheimer’s disease detected with
a disease-specific population-based brain atlas. Cerebral Cortex, 11(1):1–16, 2001.
[175] Mert R Sabuncu, Rahul S Desikan, Jorge Sepulcre, Boon Thye T Yeo, Hesheng
Liu, Nicholas J Schmansky, Martin Reuter, Michael W Weiner, Randy L Buckner,
Reisa A Sperling, et al. The dynamics of cortical and hippocampal atrophy in
Alzheimer disease. Archives of neurology, 68(8):1040–1048, 2011.
[176] Clifford R Jack, Prashanthi Vemuri, Heather J Wiste, Stephen D Weigand, Tim-
othy G Lesnick, Val Lowe, Kejal Kantarci, Matt A Bernstein, Matthew L Sen-
jem, Jeffrey L Gunter, et al. Shapes of the trajectories of 5 major biomarkers of
Alzheimer disease. Archives of neurology, 69(7):856–867, 2012.
[177] Rachelle S Doody, Valory Pavlik, Paul Massman, Susan Rountree, Eveleen Darby,
and Wenyaw Chan. Predicting progression of Alzheimer’s disease. Alzheimer’s
research & therapy, 2(1):2, 2010.
[178] I Driscoll, C Davatzikos, Y An, X Wu, D Shen, M Kraut, and SM Resnick. Lon-
gitudinal pattern of regional brain volume change differentiates normal aging from
MCI. Neurology, 72(22):1906–1913, 2009.
[179] Randall J Bateman, Chengjie Xiong, Tammie LS Benzinger, Anne M Fagan, Alison
Goate, Nick C Fox, Daniel S Marcus, Nigel J Cairns, Xianyun Xie, Tyler M Blazey,
et al. Clinical and biomarker changes in dominantly inherited Alzheimer’s disease.
New England Journal of Medicine, 367(9):795–804, 2012.
[180] Tammie L Benzinger, Tyler Blazey, Clifford R Jack, Robert A Koeppe, Yi Su,
Chengjie Xiong, Marcus E Raichle, Abraham Z Snyder, Beau M Ances, Ran-
dall J Bateman, et al. Regional variability of imaging biomarkers in autosomal
dominant Alzheimer’s disease. Proceedings of the National Academy of Sciences,
110(47):E4502–E4509, 2013.
[182] FH Bouwman, SNM Schoonenboom, WM van Der Flier, EJ Van Elk, A Kok,
F Barkhof, MA Blankenstein, and Ph Scheltens. CSF biomarkers and medial tem-
poral lobe atrophy predict dementia in mild cognitive impairment. Neurobiology of
aging, 28(7):1070–1074, 2007.
[184] Oskar Hansson, Henrik Zetterberg, Peder Buchhave, Elisabet Londos, Kaj Blennow,
and Lennart Minthon. Association between CSF biomarkers and incipient
Alzheimer’s disease in patients with mild cognitive impairment: a follow-up study.
The Lancet Neurology, 5(3):228–234, 2006.
[188] John C Morris, Martha Storandt, J Phillip Miller, Daniel W McKeel, Joseph L
Price, Eugene H Rubin, and Leonard Berg. Mild cognitive impairment represents
early-stage Alzheimer disease. Archives of neurology, 58(3):397–405, 2001.
[189] Pedro J Modrego and Jaime Ferrández. Depression in patients with mild cogni-
tive impairment increases the risk of developing dementia of Alzheimer type: a
prospective cohort study. Archives of neurology, 61(8):1290–1293, 2004.
[190] David B Carr, Steven Gray, Jack Baty, and John C Morris. The value of informant
versus individuals complaints of memory impairment in early dementia. Neurology,
55(11):1724–1727, 2000.
[192] J Wesson Ashford and Frederick A Schmitt. Modeling the time-course of Alzheimer
dementia. Current psychiatry reports, 3(1):20–28, 2001.
[193] Eric Yang, Michael Farnum, Victor Lobanov, Tim Schultz, Nandini Raghavan, Ma-
hesh N Samtani, Gerald Novak, Vaibhav Narayan, and Allitia DiBernardo. Quanti-
fying the pathophysiological timeline of Alzheimer’s disease. Journal of Alzheimer’s
Disease, 26(4):745–753, 2011.
[194] A Caroli, GB Frisoni, and Alzheimer’s Disease Neuroimaging Initiative. The dy-
namics of Alzheimer’s disease biomarkers in the Alzheimer’s Disease Neuroimaging
Initiative cohort. Neurobiology of aging, 31(8):1263–1274, 2010.
[195] C Bishop. Pattern Recognition and Machine Learning (Information Science and
Statistics), 1st edn. 2006. corr. 2nd printing edn, 2007.
[197] Jean-Baptiste Schiratti, Stéphanie Allassonniere, Olivier Colliot, and Stanley Dur-
rleman. Learning spatiotemporal trajectories from manifold-valued longitudinal
data. In Advances in Neural Information Processing Systems, pages 2404–2412,
2015.
[198] Igor Koval, Jean-Baptiste Schiratti, Alexandre Routier, Michael Bacci, Olivier Col-
liot, Stephanie Allassonniere, and Stanley Durrleman. Spatiotemporal Propagation
of the Cortical Atrophy: Population and Individual Patterns. Frontiers in Neurol-
ogy, 9, 2018.
[199] Ashish Raj, Eve LoCastro, Amy Kuceyeski, Duygu Tosun, Norman Relkin, Michael
Weiner, and Alzheimer’s Disease Neuroimaging Initiative. Network diffusion
model of progression predicts longitudinal patterns of atrophy and metabolism in
Alzheimer’s disease.
[200] Nicolas Villain, Marine Fouquet, Jean-Claude Baron, Florence Mézenge, Brigitte
Landeau, Vincent de La Sayette, Fausto Viader, Francis Eustache, Béatrice Des-
granges, and Gaël Chételat. Sequential relationships between grey matter and
white matter atrophy and brain metabolic abnormalities in early Alzheimer’s dis-
ease. Brain, 133(11):3301–3314, 2010.
[201] E Englund, A Brun, and C Alling. White matter changes in dementia of Alzheimer’s
type. Brain, 111(6):1425–1439, 1988.
[202] Beth Kuczynski, Elizabeth Targan, Cindee Madison, Michael Weiner, Yu Zhang,
Bruce Reed, Helena C Chui, and William Jagust. White matter integrity and
cortical metabolic associations in aging and dementia. Alzheimer’s & dementia,
6(1):54–62, 2010.
[203] TEJ Behrens, H Johansen Berg, Saad Jbabdi, MFS Rushworth, and MW Woolrich.
Probabilistic diffusion tractography with multiple fibre orientations: What can we
gain? Neuroimage, 34(1):144–155, 2007.
[204] Risi Imre Kondor and John Lafferty. Diffusion kernels on graphs and other discrete
input spaces. In ICML, volume 2, pages 315–322, 2002.
[208] Zhiqiang Lao, Dinggang Shen, Zhong Xue, Bilge Karacali, Susan M Resnick, and
Christos Davatzikos. Morphological classification of brains via high-dimensional
shape transformations and machine learning methods. Neuroimage, 21(1):46–57,
2004.
[209] Yong Fan, Dinggang Shen, and Christos Davatzikos. Classification of structural
images via high-dimensional image warping, robust feature extraction, and SVM.
In International Conference on Medical Image Computing and Computer-Assisted
Intervention, pages 1–8. Springer, 2005.
[210] Janaina Mourão-Miranda, Arun LW Bokde, Christine Born, Harald Hampel, and
Martin Stetter. Classifying brain states and determining the discriminating acti-
vation patterns: support vector machine on functional MRI data. NeuroImage,
28(4):980–995, 2005.
[211] Yasuhiro Kawasaki, Michio Suzuki, Ferath Kherif, Tsutomu Takahashi, Shi-
Yu Zhou, Kazue Nakamura, Mie Matsui, Tomiki Sumiyoshi, Hikaru Seto, and
Masayoshi Kurachi. Multivariate voxel-based morphometry successfully differen-
tiates schizophrenia patients from healthy controls. Neuroimage, 34(1):235–242,
2007.
[212] Tin Kam Ho. Random decision forests. In Document Analysis and Recognition,
1995., Proceedings of the Third International Conference on, volume 1, pages 278–
282. IEEE, 1995.
[214] Katherine R Gray, Paul Aljabar, Rolf A Heckemann, Alexander Hammers, Daniel
Rueckert, and Alzheimer’s Disease Neuroimaging Initiative. Random forest-based
similarity measures for multi-modal classification of Alzheimer’s disease. NeuroIm-
age, 65:167–175, 2013.
[215] Daniel C Alexander, Darko Zikic, Jiaying Zhang, Hui Zhang, and Antonio Criminisi.
Image quality transfer via random forest regression: applications in diffusion MRI.
In International Conference on Medical Image Computing and Computer-Assisted
Intervention, pages 225–232. Springer, 2014.
[216] Victor Lempitsky, Michael Verhoek, J Alison Noble, and Andrew Blake. Ran-
dom forest classification for automatic delineation of myocardium in real-time 3D
echocardiography. In International Conference on Functional Imaging and Model-
ing of the Heart, pages 447–456. Springer, 2009.
[217] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural com-
putation, 9(8):1735–1780, 1997.
[218] Sweta Karlekar, Tong Niu, and Mohit Bansal. Detecting Linguistic Character-
istics of Alzheimer’s Dementia by Interpreting Neural Models. arXiv preprint
arXiv:1804.06440, 2018.
[219] Narges Razavian, Jake Marcus, and David Sontag. Multi-task prediction of dis-
ease onsets from longitudinal laboratory tests. In Machine Learning for Healthcare
Conference, pages 73–100, 2016.
[220] Zachary C Lipton, David C Kale, Charles Elkan, and Randall Wetzel. Learning to
diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677,
2015.
[221] Melissa Aczon, David Ledbetter, L Ho, Alec Gunny, Alysia Flynn, Jon Williams,
and Randall Wetzel. Dynamic mortality risk predictions in pediatric critical care
using recurrent neural networks. arXiv preprint arXiv:1701.06675, 2017.
[222] Hrayr Harutyunyan, Hrant Khachatrian, David C Kale, and Aram Galstyan. Mul-
titask learning and benchmarking with clinical time series data. arXiv preprint
arXiv:1703.07771, 2017.
[223] Konstantinos Georgiadis, Selina Wray, Sébastien Ourselin, Jason D Warren, and
Marc Modat. Computational modelling of pathogenic protein spread in neurode-
generative diseases. PloS one, 13(2):e0192518, 2018.
[224] Guy M McKhann, David S Knopman, Howard Chertkow, Bradley T Hyman, Clif-
ford R Jack Jr, Claudia H Kawas, William E Klunk, Walter J Koroshetz, Jennifer J
Manly, Richard Mayeux, et al. The diagnosis of dementia due to Alzheimer’s
disease: Recommendations from the National Institute on Aging-Alzheimer’s
Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s &
dementia, 7(3):263–269, 2011.
[225] M Jorge Cardoso, Marc Modat, Robin Wolz, Andrew Melbourne, David Cash,
Daniel Rueckert, and Sebastien Ourselin. Geodesic Information Flows: Spatially-
Variant Graphs and Their Application to Segmentation and Fusion. 2015.
[228] Julie A Schneider, Zoe Arvanitakis, Woojeong Bang, and David A Bennett. Mixed
brain pathologies account for most dementia cases in community-dwelling older
persons. Neurology, 69(24):2197–2204, 2007.
[229] Julie A Schneider, Zoe Arvanitakis, Sue E Leurgans, and David A Bennett. The
neuropathology of probable Alzheimer disease and mild cognitive impairment. An-
nals of Neurology: Official Journal of the American Neurological Association and
the Child Neurology Society, 66(2):200–208, 2009.
[230] Alexandra L Young, Neil P Oxtoby, Jonathan Huang, Razvan V Marinescu, Pankaj
Daga, David M Cash, Nick C Fox, Sebastien Ourselin, Jonathan M Schott, Daniel C
Alexander, et al. Multiple Orderings of Events in Disease Progression. In Informa-
tion Processing in Medical Imaging, pages 711–722. Springer, 2015.
[232] Peter Freeborough, Nick C Fox, et al. The boundary shift integral: an accurate and
robust measure of cerebral volume changes from registered repeat MRI. Medical
Imaging, IEEE Transactions on, 16(5):623–629, 1997.
[233] Kelvin K Leung, Matthew J Clarkson, Jonathan W Bartlett, Shona Clegg, Clif-
ford R Jack, Michael W Weiner, Nick C Fox, Sébastien Ourselin, and Alzheimer’s
Disease Neuroimaging Initiative. Robust atrophy rate measurement in Alzheimer’s
disease using multi-site serial MRI: tissue-specific intensity normalization and pa-
rameter selection. Neuroimage, 50(2):516–523, 2010.
[235] John Platt. Sequential minimal optimization: A fast algorithm for training support
vector machines. 1998.
[236] Razvan V Marinescu, Neil P Oxtoby, Alexandra L Young, Esther E Bron, Arthur W
Toga, Michael W Weiner, Frederik Barkhof, Nick C Fox, Stefan Klein, Daniel C
Alexander, et al. TADPOLE Challenge: Prediction of Longitudinal Evolution in
Alzheimer’s Disease. arXiv preprint arXiv:1805.03909, 2018.
[237] Rafid Sukkar, Elyse Katz, Yanwei Zhang, David Raunig, and Bradley T Wyman.
Disease progression modeling using hidden Markov models. In 2012 Annual In-
ternational Conference of the IEEE Engineering in Medicine and Biology Society,
pages 2845–2848. IEEE, 2012.
[238] Gordana Derado, F DuBois Bowman, and Clinton D Kilts. Modeling the spatial
and temporal dependence in fMRI data. Biometrics, 66(3):949–957, 2010.
[239] Jung Won Hyun, Yimei Li, Chao Huang, Martin Styner, Weili Lin, Hongtu Zhu,
and Alzheimer’s Disease Neuroimaging Initiative. STGP: Spatio-temporal Gaussian
process models for longitudinal neuroimaging data. NeuroImage, 134:550–562, 2016.
[240] Marco Lorenzi, Gabriel Ziegler, Daniel C Alexander, and Sebastien Ourselin. Ef-
ficient Gaussian process-based modelling and prediction of image time series. In
International Conference on Information Processing in Medical Imaging, pages 626–
637. Springer, 2015.
[241] Keith A Johnson, Nick C Fox, Reisa A Sperling, and William E Klunk. Brain
imaging in Alzheimer disease. Cold Spring Harbor perspectives in medicine, page
a006213, 2012.
[242] Douglas N Greve, Claus Svarer, Patrick M Fisher, Ling Feng, Adam E Hansen,
William Baare, Bruce Rosen, Bruce Fischl, and Gitte M Knudsen. Cortical surface-
based analysis reduces bias and variance in kinetic modeling of brain PET data.
Neuroimage, 92:225–236, 2014.
[244] Bradford C Dickerson, Akram Bakkour, David H Salat, Eric Feczko, Jenni Pacheco,
Douglas N Greve, Fran Grodstein, Christopher I Wright, Deborah Blacker, H Di-
ana Rosas, et al. The cortical signature of Alzheimer’s disease: regionally specific
cortical thinning relates to symptom severity in very mild to mild AD dementia
and is detectable in asymptomatic amyloid-positive individuals. Cerebral cortex,
19(3):497–510, 2008.
[245] Vivek Singh, Howard Chertkow, Jason P Lerch, Alan C Evans, Adrienne E Dorr,
and Noor Jehan Kabani. Spatial patterns of cortical thinning in mild cognitive
impairment and Alzheimer’s disease. Brain, 129(11):2885–2893, 2006.
[246] Marco Lorenzi, Maurizio Filippone, Daniel C Alexander, and Sebastien Ourselin.
Disease Progression Modeling and Prediction through Random Effect Gaussian
Processes and Time Transformation. arXiv preprint arXiv:1701.01668, 2017.
[247] Marzia A Scelsi, Raiyan R Khan, Marco Lorenzi, Leigh Christopher, Michael D
Greicius, Jonathan M Schott, Sebastien Ourselin, and Andre Altmann. Genetic
study of multimodal imaging Alzheimer’s disease progression score implicates novel
loci. Brain, 2018.
[248] Marcia Hon and Naimul Khan. Towards Alzheimer’s disease classification through
transfer learning. arXiv preprint arXiv:1711.11117, 2017.
[249] Bo Cheng, Mingxia Liu, Dinggang Shen, Zuoyong Li, Daoqiang Zhang, and
Alzheimer’s Disease Neuroimaging Initiative. Multi-domain transfer learning for
early diagnosis of Alzheimer’s disease. Neuroinformatics, 15(2):115–132, 2017.
[250] Bo Cheng, Mingxia Liu, Daoqiang Zhang, Brent C Munsell, and Dinggang Shen.
Domain transfer learning for MCI conversion prediction. IEEE Transactions on
Biomedical Engineering, 62(7):1805–1817, 2015.
[252] Paul S Aisen, Ronald C Petersen, Michael C Donohue, Anthony Gamst, Rema
Raman, Ronald G Thomas, Sarah Walter, John Q Trojanowski, Leslie M Shaw,
Laurel A Beckett, et al. Clinical Core of the Alzheimer’s Disease Neuroimaging Ini-
tiative: progress and plans. Alzheimer’s & dementia: the journal of the Alzheimer’s
Association, 6(3):239–246, 2010.
[253] Giovanni B Frisoni, Nick C Fox, Clifford R Jack Jr, Philip Scheltens, and Paul M
Thompson. The clinical use of structural MRI in Alzheimer disease. Nature Reviews
Neurology, 6(2):67, 2010.
[254] Ricardo Guerrero, Alexander Schmidt-Richberg, Christian Ledig, Tong Tong,
Robin Wolz, Daniel Rueckert, and ADNI. Instantiated mixed effects modeling
of Alzheimer’s disease markers. NeuroImage, 142:113–125, 2016.
[255] Daoqiang Zhang, Yaping Wang, Luping Zhou, Hong Yuan, Dinggang Shen, and
ADNI. Multimodal classification of Alzheimer’s disease and mild cognitive impair-
ment. Neuroimage, 55(3):856–867, 2011.
[256] Jonathan Young, Marc Modat, Manuel J Cardoso, Alex Mendelson, Dave Cash,
Sebastien Ourselin, and Alzheimer’s Disease Neuroimaging Initiative. Accurate
multimodal probabilistic prediction of conversion to Alzheimer’s disease in patients
with mild cognitive impairment. NeuroImage: Clinical, 2:735–745, 2013.
[257] Jussi Mattila, Juha Koikkalainen, Arho Virkki, Anja Simonsen, Mark van Gils,
Gunhild Waldemar, Hilkka Soininen, and Jyrki Lötjönen. A disease state fingerprint
for evaluation of Alzheimer’s disease. Journal of Alzheimer’s Disease, 27(1):163–
176, 2011.
[258] Stanley Durrleman, Xavier Pennec, Alain Trouvé, José Braga, Guido Gerig, and
Nicholas Ayache. Toward a comprehensive framework for the spatiotemporal statis-
tical analysis of longitudinal shape data. International journal of computer vision,
103(1):22–59, 2013.
[259] Marco Lorenzi, Xavier Pennec, Giovanni B Frisoni, and Nicholas Ayache. Disen-
tangling normal aging from Alzheimer’s disease in structural magnetic resonance
images. Neurobiology of aging, 36:S42–S52, 2015.
[260] Michael W Weiner, Dallas P Veitch, Paul S Aisen, Laurel A Beckett, Nigel J Cairns,
Robert C Green, Danielle Harvey, Clifford R Jack, William Jagust, John C Morris,
et al. Recent publications from the Alzheimer’s Disease Neuroimaging Initiative:
Reviewing progress toward improved AD clinical trials. Alzheimer’s & dementia:
the journal of the Alzheimer’s Association, 13(4):e1–e85, 2017.
[261] Genevera I Allen, Nicola Amoroso, Catalina Anghel, Venkat Balagurusamy,
Christopher J Bare, Derek Beaton, Roberto Bellotti, David A Bennett, Kevin L
Boehme, Paul C Boutros, et al. Crowdsourced estimation of cognitive decline
and resilience in Alzheimer’s disease. Alzheimer’s & dementia: the journal of the
Alzheimer’s Association, 12(6):645–653, 2016.
[262] Ronald Carl Petersen, PS Aisen, LA Beckett, MC Donohue, AC Gamst, DJ Harvey,
CR Jack, WJ Jagust, LM Shaw, AW Toga, et al. Alzheimer’s Disease Neuroimaging
Initiative (ADNI) clinical characterization. Neurology, 74(3):201–209, 2010.
[263] William J Jagust, Dan Bandy, Kewei Chen, Norman L Foster, Susan M Landau,
Chester A Mathis, Julie C Price, Eric M Reiman, Daniel Skovronsky, and Robert A
Koeppe. The Alzheimer’s Disease Neuroimaging Initiative positron emission tomog-
raphy core. Alzheimer’s & dementia: the journal of the Alzheimer’s Association,
6(3):221–229, 2010.
[264] John Ashburner. Computational anatomy with the SPM software. Magnetic reso-
nance imaging, 27(8):1163–1174, 2009.
[265] Talia M Nir, Neda Jahanshad, Julio E Villalon-Reina, Arthur W Toga, Clifford R
Jack, Michael W Weiner, Paul M Thompson, and ADNI. Effectiveness of regional
DTI measures in distinguishing Alzheimer’s disease, MCI, and normal aging. Neu-
roImage: clinical, 3:180–195, 2013.
[266] Kenichi Oishi, Andreia Faria, Hangyi Jiang, Xin Li, Kazi Akhter, Jiangyang Zhang,
John T Hsu, Michael I Miller, Peter CM van Zijl, Marilyn Albert, et al. Atlas-
based whole brain white matter analysis using large deformation diffeomorphic
metric mapping: application to normal elderly and Alzheimer’s disease participants.
Neuroimage, 46(2):486–499, 2009.
[267] David J Hand and Robert J Till. A simple generalisation of the area under the ROC
curve for multiple class classification problems. Machine learning, 45(2):171–186,
2001.
[268] Kay Henning Brodersen, Cheng Soon Ong, Klaas Enno Stephan, and Joachim M
Buhmann. The balanced accuracy and its posterior distribution. In International
Conference on Pattern recognition (ICPR), pages 3121–3124. IEEE, 2010.
[269] Meryl A Butters, Oscar L Lopez, and James T Becker. Focal temporal lobe dys-
function in probable Alzheimer’s disease predicts a slow rate of cognitive decline.
Neurology, 46(3):687–692, 1996.
[270] J Green, John C Morris, J Sandson, DW McKeel, and JW Miller. Progressive
aphasia: A precursor of global dementia? Neurology, 40(3 Part 1):423–423, 1990.
[271] John DW Greene, Karalyn Patterson, John Xuereb, and John R Hodges. Alzheimer
disease and nonfluent progressive aphasia. Archives of Neurology, 53(10):1072–1078,
1996.
[272] Jason D Warren, Jonathan D Rohrer, and Martin N Rossor. Frontotemporal
dementia. BMJ, 347:f4827, 2013.
[273] Marvin M Goldenberg. Multiple sclerosis review. Pharmacy and Therapeutics,
37(3):175, 2012.
[274] Arman Eshaghi, Razvan V Marinescu, Alexandra L Young, Nicholas C Firth, Fer-
ran Prados, M Jorge Cardoso, Carmen Tur, Floriana De Angelis, Niamh Cawley,
Wallace J Brownlee, et al. Progression of regional grey matter atrophy in multiple
sclerosis. Brain, 141(6):1665–1677, 2018.
[275] Werner Poewe, Klaus Seppi, Caroline M Tanner, Glenda M Halliday, Patrik
Brundin, Jens Volkmann, Anette-Eleonore Schrag, and Anthony E Lang. Parkinson
disease. Nature reviews Disease primers, 3:17013, 2017.
[276] Nuria Caballol, Maria J Martí, and Eduardo Tolosa. Cognitive dysfunction and
dementia in Parkinson disease. Movement disorders: official journal of the Movement
Disorder Society, 22(S17):S358–S366, 2007.
[277] Raymund AC Roos. Huntington’s disease: a clinical review. Orphanet journal of
rare diseases, 5(1):40, 2010.
[278] Nellie Georgiou-Karistianis, Anthony J Hannan, and Gary F Egan. Magnetic res-
onance imaging as an approach towards identifying neuropathological biomarkers
for Huntington’s disease. Brain research reviews, 58(1):209–225, 2008.
[279] Andrew Feigin, Klaus L Leenders, James R Moeller, John Missimer, Gabriella
Kuenig, Phoebe Spetsieris, Angelo Antonini, and David Eidelberg. Metabolic Net-
work Abnormalities in Early Huntington’s Disease: An 18F FDG PET Study. Jour-
nal of Nuclear Medicine, 42(11):1591–1595, 2001.
[280] Peter A Wijeratne, Alexandra L Young, Neil P Oxtoby, Razvan V Marinescu,
Nicholas C Firth, Eileanoir B Johnson, Amrita Mohan, Cristina Sampaio, Rachael I
Scahill, Sarah J Tabrizi, et al. An image-based model of brain volume biomarker
changes in Huntington’s disease. Annals of clinical and translational neurology,
5(5):570–582, 2018.
[281] David M Cash, Jonathan D Rohrer, Natalie S Ryan, Sebastien Ourselin, and Nick C
Fox. Imaging endpoints for clinical trials in Alzheimer’s disease. Alzheimer’s re-
search & therapy, 6(9):87, 2014.
[282] Gabor G Kovacs, Ivan Milenkovic, Adelheid Wöhrer, Romana Höftberger, Ellen
Gelpi, Christine Haberler, Selma Hönigschnabl, Angelika Reiner-Concin, Harald
Heinzl, Susanne Jungwirth, et al. Non-Alzheimer neurodegenerative pathologies
and their combinations are more frequent than commonly believed in the elderly
brain: a community-based autopsy series. Acta neuropathologica, 126(3):365–384,
2013.
[283] Bryan D James, Robert S Wilson, Patricia A Boyle, John Q Trojanowski, David A
Bennett, and Julie A Schneider. TDP-43 stage, mixed pathologies, and clinical
Alzheimer’s-type dementia. Brain, 139(11):2983–2993, 2016.
[284] John Q Trojanowski, Hugo Vandeerstichele, Magdalena Korecka, Christopher M
Clark, Paul S Aisen, Ronald C Petersen, Kaj Blennow, Holly Soares, Adam Simon,
Piotr Lewczuk, et al. Update on the biomarker core of the Alzheimer’s Disease
Neuroimaging Initiative subjects. Alzheimer’s & Dementia, 6(3):230–238, 2010.
[285] Jason L Stein, Xue Hua, Suh Lee, April J Ho, Alex D Leow, Arthur W Toga,
Andrew J Saykin, Li Shen, Tatiana Foroud, Nathan Pankratz, et al. Voxelwise
genome-wide association study (vGWAS). Neuroimage, 53(3):1160–1174, 2010.
[286] Kathryn A Ellis, Ashley I Bush, David Darby, Daniela De Fazio, Jonathan Foster,
Peter Hudson, Nicola T Lautenschlager, Nat Lenzo, Ralph N Martins, Paul Maruff,
et al. The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging:
methodology and baseline characteristics of 1112 individuals recruited for a longitu-
dinal study of Alzheimer’s disease. International Psychogeriatrics, 21(4):672–687,
2009.
[287] John C Morris, Paul S Aisen, Randall J Bateman, Tammie LS Benzinger, Nigel J
Cairns, Anne M Fagan, Bernardino Ghetti, Alison M Goate, David M Holtzman,
William E Klunk, et al. Developing an international network for Alzheimer research:
the Dominantly Inherited Alzheimer Network. Clinical investigation, 2(10):975,
2012.
[288] Kenneth Marek, Danna Jennings, Shirley Lasch, Andrew Siderowf, Caroline Tan-
ner, Tanya Simuni, Chris Coffey, Karl Kieburtz, Emily Flagg, Sohini Chowdhury,
et al. The Parkinson Progression Marker Initiative (PPMI). Progress in
neurobiology, 95(4):629–635, 2011.
[289] Sarah J Tabrizi, Douglas R Langbehn, Blair R Leavitt, Raymund AC Roos, Alexan-
dra Durr, David Craufurd, Christopher Kennard, Stephen L Hicks, Nick C Fox,
Rachael I Scahill, et al. Biological and clinical manifestations of Huntington’s dis-
ease in the longitudinal TRACK-HD study: cross-sectional analysis of baseline
data. The Lancet Neurology, 8(9):791–801, 2009.
[290] M Arfan Ikram, Guy GO Brusselle, Sarwa Darwish Murad, Cornelia M van Duijn,
Oscar H Franco, André Goedegebure, Caroline CW Klaver, Tamar EC Nijsten,
Robin P Peeters, Bruno H Stricker, et al. The Rotterdam Study: 2018 update on
objectives, design and main results. European Journal of Epidemiology, 32(9):807–
850, 2017.
[291] Yu-Liang Hsu, Pau-Choo Chung, Wei-Hsin Wang, Ming-Chyi Pai, Chun-Yao Wang,
Chien-Wen Lin, Hao-Li Wu, and Jeen-Shing Wang. Gait and balance analysis for
patients with Alzheimer’s disease using an inertial-sensor-based wearable instru-
ment. IEEE journal of biomedical and health informatics, 18(6):1822–1830, 2014.
[292] J Thomas Hutton, JA Nagel, and Ruth B Loewenson. Eye tracking dysfunction in
Alzheimer-type dementia. Neurology, 34(1):99–99, 1984.
[293] Ildikó Hoffmann, Dezso Nemeth, Cristina D Dye, Magdolna Pákáski, Tamás Irinyi,
and János Kálmán. Temporal parameters of spontaneous speech in Alzheimer’s
disease. International journal of speech-language pathology, 12(1):29–34, 2010.
[294] Geoffrey E Hinton and Sam T Roweis. Stochastic neighbor embedding. In Advances
in neural information processing systems, pages 857–864, 2003.
[295] Bishesh Khanal, Marco Lorenzi, Nicholas Ayache, and Xavier Pennec. A biophysical
model of brain deformation to simulate and analyze longitudinal MRIs of patients
with Alzheimer’s disease. NeuroImage, 134:35–52, 2016.
[298] Clifford R Jack, Heather J Wiste, Timothy G Lesnick, Stephen D Weigand, David S
Knopman, Prashanthi Vemuri, Vernon S Pankratz, Matthew L Senjem, Jeffrey L
Gunter, Michelle M Mielke, et al. Brain β-amyloid load approaches a plateau.
Neurology, 80(10):890–896, 2013.
[299] Neil P Oxtoby, Alexandra L Young, Nick C Fox, Pankaj Daga, David M Cash,
Sebastien Ourselin, Jonathan M Schott, Daniel C Alexander, and Alzheimer’s
Disease Neuroimaging Initiative. Learning imaging biomarker trajectories from noisy
Alzheimer’s disease data using a Bayesian multilevel model. In Bayesian and grAph-
ical Models for Biomedical Imaging, pages 85–94. Springer, 2014.
[300] Alexandra L Young, Neil P Oxtoby, Jonathan M Schott, and Daniel C Alexander.
Data-driven models of neurodegenerative disease.
[301] Stephen M Stigler. Francis Galton’s account of the invention of correlation. Statis-
tical Science, 4(2):73–79, 1989.
[302] Joseph Lee Rodgers and W Alan Nicewander. Thirteen ways to look at the corre-
lation coefficient. The American Statistician, 42(1):59–66, 1988.
[303] Gabor J Szekely and Maria L Rizzo. Hierarchical clustering via joint between-within
distances: Extending Ward’s minimum variance method. Journal of classification,
22(2):151–183, 2005.
[304] Robert R Sokal. A statistical method for evaluating systematic relationships. Univ
Kans Sci Bull, 38:1409–1438, 1958.
[305] Bernhard E Boser, Isabelle M Guyon, and Vladimir N Vapnik. A training algo-
rithm for optimal margin classifiers. In Proceedings of the fifth annual workshop on
Computational learning theory, pages 144–152. ACM, 1992.
[306] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine learning,
20(3):273–297, 1995.
[307] D Luenberger and Y Ye. Linear and nonlinear optimization. Linear and Nonlinear
Optimization, 1984.
[309] John Moody and Christian J Darken. Fast learning in networks of locally-tuned
processing units. Neural computation, 1(2):281–294, 1989.
[311] W Keith Hastings. Monte Carlo sampling methods using Markov chains and their
applications. Biometrika, 57(1):97–109, 1970.
[312] Vikas Dhikav and Kuljeet Anand. Potential predictors of hippocampal atrophy in
Alzheimer’s disease. Drugs & aging, 28(1):1–11, 2011.
[313] Sebastian J Crutch, Jonathan M Schott, Gil D Rabinovici, Bradley F Boeve, Ste-
fano F Cappa, Bradford C Dickerson, Bruno Dubois, Neill R Graff-Radford, Pierre
Krolak-Salmon, Manja Lehmann, et al. Shining a light on posterior cortical atrophy.
Alzheimer’s & Dementia, 9(4):463–465, 2013.
[314] Clifford R Jack, Matt A Bernstein, Nick C Fox, Paul Thompson, Gene Alexander,
Danielle Harvey, Bret Borowski, Paula J Britson, Jennifer L Whitwell, Chadwick
Ward, et al. The Alzheimer’s disease neuroimaging initiative (ADNI): MRI meth-
ods. Journal of Magnetic Resonance Imaging, 27(4):685–691, 2008.
[315] Wilma G Rosen, Richard C Mohs, and Kenneth L Davis. A new rating scale for
Alzheimer’s disease. The American journal of psychiatry, 1984.
[317] Elizabeth K Warrington and Merle James. The visual object and space perception
battery. Thames Valley Test Company Bury St Edmunds, 1991.
[318] Elizabeth K Warrington, Tim Shallice, et al. Category specific semantic impair-
ments. Brain, 107(3):829–853, 1984.
[319] David Wechsler. Wechsler adult intelligence scale–Fourth Edition (WAIS–IV). San
Antonio, TX: NCS Pearson, 2008.
[321] Nick C Fox and Peter A Freeborough. Brain atrophy progression measured from
registered serial MRI: validation and application to Alzheimer’s disease. Journal of
Magnetic Resonance Imaging, 7(6):1069–1075, 1997.
[323] Klaus Schmidtke, Michael Hüll, and Jochen Talazko. Posterior cortical
atrophy: variant of Alzheimer’s disease? Journal of neurology, 252(1):27–35, 2005.
[324] Roger Bullock. Future directions in the treatment of Alzheimer’s disease. Expert
opinion on investigational drugs, 13(4):303–314, 2004.
[325] Hans-Wolfgang Klafki, Matthias Staufenbiel, Johannes Kornhuber, and Jens Wilt-
fang. Therapeutic approaches to Alzheimer’s disease. Brain, 129(11):2840–2855,
2006.
[327] Elizabeth Forsyth and Pamela D Ritzline. An overview of the etiology, diagnosis,
and treatment of Alzheimer disease. Physical therapy, 78(12):1325–1331, 1998.
[328] Eric Yang, Michael Farnum, Victor Lobanov, Tim Schultz, R Verbeeck, N Ragha-
van, MN Samtani, G Novak, V Narayan, and A DiBernardo. Quantifying the
pathophysiological timeline of Alzheimer’s disease. Journal of Alzheimer’s disease:
JAD, 26(4):745–753, 2010.
[330] Eunhee Kim, Yunsoo Lee, Jongkeol Lee, and Seol-Heui Han. A case with
cholinesterase inhibitor responsive asymmetric posterior cortical atrophy. Clini-
cal neurology and neurosurgery, 108(1):97–101, 2005.
[334] Antonio Convit, Mony de Leon, Chaim Tarshish, Susan De Santi, Alan Kluger,
Henry Rusinek, and Ajax E George. Hippocampal volume losses in minimally
impaired elderly. The Lancet, 345(8944):266, 1995.
[335] Y Xu, CR Jack, PC O’Brien, E Kokmen, Glenn E Smith, Robert J Ivnik, Bradley F
Boeve, RG Tangalos, and Ronald C Petersen. Usefulness of MRI measures of
entorhinal cortex versus hippocampus in AD. Neurology, 54(9):1760–1767, 2000.
[336] Motohiro Kiyosawa, Thomas M Bosley, John Chawluk, Dara Jamieson, Norman J
Schatz, Peter J Savino, Robert C Sergott, Martin Reivich, and Abass Alavi.
Alzheimer’s disease with prominent visual symptoms: clinical and metabolic eval-
uation. Ophthalmology, 96(7):1077–1086, 1989.
[337] Corina Pennanen, Miia Kivipelto, Susanna Tuomainen, Päivi Hartikainen, Tuomo
Hänninen, Mikko P Laakso, Merja Hallikainen, Matti Vanhanen, Aulikki Nissinen,
Eeva-Liisa Helkala, et al. Hippocampus and entorhinal cortex in mild cognitive
impairment and early AD. Neurobiology of aging, 25(3):303–310, 2004.
[338] Heiko Braak and Eva Braak. Morphological criteria for the recognition of
Alzheimer’s disease and the distribution pattern of cortical changes related to this
disorder. Neurobiology of aging, 15(3):355–356, 1994.
[339] Mikko P Laakso, Giovanni B Frisoni, Mervi Könönen, Mia Mikkonen, Alberto Bel-
tramello, Claudia Geroldi, Angelo Bianchetti, Marco Trabucchi, Hilkka Soininen,
and Hannu J Aronen. Hippocampus and entorhinal cortex in frontotemporal de-
mentia and Alzheimer’s disease: a morphometric MRI study. Biological psychiatry,
47(12):1056–1063, 2000.
[340] João Maroco, Dina Silva, Ana Rodrigues, Manuela Guerreiro, Isabel Santana, and
Alexandre de Mendonça. Data mining methods in the prediction of Dementia:
A real-data comparison of the accuracy, sensitivity and specificity of linear dis-
criminant analysis, logistic regression, neural networks, support vector machines,
classification trees and random forests. BMC research notes, 4(1):299, 2011.
[341] Rik Ossenkoppele, Brendan I Cohn-Sheehy, Renaud La Joie, Jacob W Vogel, Chris-
tiane Möller, Manja Lehmann, Bart NM van Berckel, William W Seeley, Yolande A
Pijnenburg, Maria L Gorno-Tempini, et al. Atrophy patterns in early clinical
stages across distinct phenotypes of Alzheimer’s disease. Human brain mapping,
36(11):4421–4437, 2015.
[342] Bengt Winblad, Philippe Amouyel, Sandrine Andrieu, Clive Ballard, Carol Brayne,
Henry Brodaty, Angel Cedazo-Minguez, Bruno Dubois, David Edvardsson, Howard
Feldman, et al. Defeating Alzheimer’s disease and other dementias: a priority for
European science and society. The Lancet Neurology, 15(5):455–532, 2016.
[343] Heiko Braak and Kelly Del Tredici. Potential pathways of abnormal tau and α-
synuclein dissemination in sporadic Alzheimer’s and Parkinson’s diseases. Cold
Spring Harbor perspectives in biology, page a023630, 2016.
[344] Bess Frost and Marc I Diamond. Prion-like mechanisms in neurodegenerative dis-
eases. Nature Reviews Neuroscience, 11(3):155, 2010.
[345] John Hardy and Tamas Revesz. The spread of neurodegenerative disease. New
England Journal of Medicine, 366(22):2126–2128, 2012.
[346] Zeshan Ahmed, Jane Cooper, Tracey K Murray, Katya Garn, Emily McNaughton,
Hannah Clarke, Samira Parhizkar, Mark A Ward, Annalisa Cavallini, Samuel Jack-
son, et al. A novel in vivo model of tau propagation with rapid and progressive
neurofibrillary tangle pathology: the pattern of spread is determined by connectiv-
ity, not proximity. Acta neuropathologica, 127(5):667–683, 2014.
[347] Johannes Brettschneider, Kelly Del Tredici, Virginia M-Y Lee, and John Q Tro-
janowski. Spreading of pathology in neurodegenerative diseases: a focus on human
studies. Nature Reviews Neuroscience, 16(2):109, 2015.
[348] Michel Goedert. Alzheimer’s and Parkinson’s diseases: The prion concept in relation
to assembled Aβ, tau, and α-synuclein. Science, 349(6248):1255555, 2015.
[349] Jeffrey L Cummings. Neurodegenerative Disorders as Proteinopathies: Phenotypic
Relationships. In Genotype–Proteotype–Phenotype Relationships in Neurodegener-
ative Diseases, pages 1–10. Springer, 2005.
[350] Massimo Filippi and Federica Agosta. Structural and functional network connec-
tivity breakdown in Alzheimer’s disease studied with magnetic resonance imaging
techniques. Journal of Alzheimer’s Disease, 24(3):455–474, 2011.
[351] Jason D Warren, Jonathan D Rohrer, Jonathan M Schott, Nick C Fox, John Hardy,
and Martin N Rossor. Molecular nexopathies: a new paradigm of neurodegenerative
disease. Trends in neurosciences, 36(10):561–569, 2013.
[352] Patrick R Hof, Constantin Bouras, Jean Constantinidis, and John H Morrison. Se-
lective disconnection of specific visual association pathways in cases of Alzheimer’s
disease presenting with Balint’s syndrome. Journal of neuropathology and experi-
mental neurology, 49(2):168–184, 1990.
[353] DF Tang-Wai, KA Josephs, Bradley F Boeve, DW Dickson, JE Parisi, and RC Pe-
tersen. Pathologically confirmed corticobasal degeneration presenting with visu-
ospatial dysfunction. Neurology, 61(8):1134–1135, 2003.
[354] DF Tang-Wai, KA Josephs, Bradley F Boeve, RC Petersen, JE Parisi, and
DW Dickson. Coexistent Lewy body disease in a case of visual variant of Alzheimer’s
disease. Journal of Neurology, Neurosurgery & Psychiatry, 74(3):389–389, 2003.
[355] JA Renner, JM Burns, CE Hou, DW McKeel, M Storandt, and JC Morris. Progressive
posterior cortical dysfunction: a clinicopathologic series. Neurology, 63(7):1175–1180,
2004.
[356] Manja Lehmann, Andrew Melbourne, John C Dickson, Rebekah M Ahmed, Marc
Modat, M Jorge Cardoso, David L Thomas, Enrico De Vita, Sebastian J Crutch,
Jason D Warren, et al. A novel use of arterial spin labelling MRI to demonstrate
focal hypoperfusion in individuals with posterior cortical atrophy: a multimodal
imaging study. J Neurol Neurosurg Psychiatry, pages jnnp–2015, 2016.
[357] Rik Ossenkoppele, Niklas Mattsson, Charlotte E Teunissen, Frederik Barkhof,
Yolande Pijnenburg, Philip Scheltens, Wiesje M van der Flier, and Gil D Rabi-
novici. Cerebrospinal fluid biomarkers and cerebral atrophy in distinct clinical
variants of probable Alzheimer’s disease. Neurobiology of aging, 36(8):2340–2347,
2015.
[358] Rik Ossenkoppele, Daniel R Schonhaut, Michael Schöll, Samuel N Lockhart,
Nagehan Ayakta, Suzanne L Baker, James P O’Neil, Mustafa Janabi, Andreas Lazaris,
Averill Cantwell, et al. Tau PET patterns mirror clinical and neuroanatomical
variability in Alzheimer’s disease. Brain, 139(5):1551–1567, 2016.
[359] Guillaume Dorothée, Michel Bottlaender, Edmond Moukari, Leonardo C De Souza,
Renaud Maroy, Fabian Corlier, Olivier Colliot, Marie Chupin, Foudil Lamari,
Stephane Lehéricy, et al. Distinct patterns of antiamyloid-β antibodies in typi-
cal and atypical Alzheimer disease. Archives of neurology, 69(9):1181–1185, 2012.
[360] William C Kreisl, Chul Hyoung Lyoo, Jeih-San Liow, Joseph Snow, Emily Page,
Kimberly J Jenko, Cheryl L Morse, Sami S Zoghbi, Victor W Pike, R Scott Turner,
et al. Distinct patterns of increased translocator protein in posterior cortical atrophy
and amnestic Alzheimer’s disease. Neurobiology of aging, 51:132–140, 2017.
[361] Keir Yong, Kishan Rajdev, Elizabeth Warrington, Jennifer Nicholas, Jason Warren,
and Sebastian Crutch. A longitudinal investigation of the relationship between
crowding and reading: A neurodegenerative approach. Neuropsychologia, 85:127–
136, 2016.
[362] Silvia Primativo, Keir XX Yong, Timothy J Shakespeare, and Sebastian J Crutch.
The oral spelling profile of posterior cortical atrophy and the nature of the
graphemic representation. Neuropsychologia, 94:61–74, 2017.
[363] Rik Ossenkoppele, Brendan I Cohn-Sheehy, Renaud La Joie, Jacob W Vogel, Chris-
tiane Möller, Manja Lehmann, Bart NM van Berckel, William W Seeley, Yolande A
Pijnenburg, Maria L Gorno-Tempini, et al. Atrophy patterns in early clinical
stages across distinct phenotypes of Alzheimer’s disease. Human brain mapping,
36(11):4421–4437, 2015.
[366] Bradford C Dickerson, David A Wolk, and Alzheimer’s Disease Neuroimaging Ini-
tiative. Dysexecutive versus amnesic phenotypes of very mild Alzheimer’s disease
are associated with distinct clinical, genetic and cortical thinning characteristics.
Journal of Neurology, Neurosurgery & Psychiatry, pages jnnp–2009, 2010.
[367] Jennifer L Whitwell, Stephen D Weigand, Bradley F Boeve, Matthew L Senjem, Jef-
frey L Gunter, Mariely DeJesus-Hernandez, Nicola J Rutherford, Matthew Baker,
David S Knopman, Zbigniew K Wszolek, et al. Neuroimaging signatures of fron-
totemporal dementia genetics: C9ORF72, tau, progranulin and sporadics. Brain,
135(3):794–806, 2012.
[368] Helena Chang Chui, Evelyn Lee Teng, Victor W Henderson, and Arthur C Moy.
Clinical subtypes of dementia of the Alzheimer type. Neurology, 35(11):1544–1544,
1985.
[369] Nancy J Fisher, Byron P Rourke, Linas Bieliauskas, Bruno Giordani, Stan-
ley Berent, and Norman L Foster. Neuropsychological subgroups of patients
with Alzheimer’s disease. Journal of clinical and experimental neuropsychology,
18(3):349–370, 1996.
[370] Julene K Johnson, Elizabeth Head, Ronald Kim, Arnold Starr, and Carl W Cot-
man. Clinical and pathological evidence for a frontal variant of Alzheimer disease.
Archives of neurology, 56(10):1233–1239, 1999.
[371] Benjamin Lam, Mario Masellis, Morris Freedman, Donald T Stuss, and Sandra E
Black. Clinical, imaging, and pathological heterogeneity of the Alzheimer’s disease
syndrome. Alzheimer’s research & therapy, 5(1):1, 2013.
lobe dysfunction and patients with diffuse cognitive impairment. Journal of Neu-
rology, Neurosurgery & Psychiatry, 70(1):22–27, 2001.
[373] Suvarna Alladi, John Xuereb, Thomas Bak, Peter Nestor, Jonathan Knibb, Karalyn
Patterson, and JR Hodges. Focal cortical presentations of Alzheimer’s disease.
Brain, 130(10):2636–2645, 2007.
[374] Răzvan Valentin Marinescu, Arman Eshaghi, Marco Lorenzi, Alexandra L Young,
Neil P Oxtoby, Sara Garbarino, Timothy J Shakespeare, Sebastian J Crutch,
Daniel C Alexander, and Alzheimer’s Disease Neuroimaging Initiative. A vertex
clustering model for disease progression: application to cortical thickness images.
In International Conference on Information Processing in Medical Imaging, pages
134–145. Springer, 2017.