XIth International Myeloma Workshop & IVth International Workshop on
Waldenström's macroglobulinemia
Kos Island, Greece, 25-30 June 2007
From oncogenome mining to
functional validation of Multiple
Myeloma cancer genes
Giovanni Tonon, M.D., Ph.D.
Medical Oncology, Dana-Farber Cancer Institute
Pathology Department, BWH
Harvard Medical School
Boston, MA


Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Third act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Finale: the future, functional genomics.

Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Third act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Finale: the future, functional genomics.

Nonnegative matrix factorization (NMF)
A rank-2 reduction of a DNA microarray of N genes and M samples is obtained by NMF, A ≈ WH.
NMF reduces the dimensionality of expression data from thousands of genes to a handful of metagenes.
Brunet, Jean-Philippe et al. (2004) Proc. Natl. Acad. Sci. USA 101, 4164-4169
Copyright ©2004 by the National Academy of Sciences
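The factorization A ≈ WH can be illustrated with a minimal sketch. It assumes the standard multiplicative updates for the Euclidean objective; the slides do not state which NMF variant or objective was used. The function name `nmf`, the toy matrix and the column-sum normalization of W are illustrative choices, not part of the original analysis.

```python
import numpy as np

def nmf(A, rank, n_iter=1000, seed=0, eps=1e-9):
    """Factor a nonnegative matrix A (genes x samples) as A ~ W @ H.

    W (genes x rank) holds the metagenes, H (rank x samples) holds the
    metagene expression of each sample.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = A.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative at every step.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each metagene sums to 1; the product W @ H is unchanged.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]

# Toy data (hypothetical): 6 genes x 4 samples, two sample groups.
A = np.full((6, 4), 0.1)
A[:3, :2] += 5.0   # genes 0-2 are high in samples 0-1
A[3:, 2:] += 5.0   # genes 3-5 are high in samples 2-3

W, H = nmf(A, rank=2)
# Each sample is assigned to the metagene with the largest coefficient.
labels = H.argmax(axis=0)
```

Assigning each sample to its dominant metagene is what turns the factorization into a classification of samples.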

NMF finds evidence of two major karyotype patterns and up to four subclasses among 64 primary MM samples profiled with aCGH. Consensus matrices are shown for NMF ranks of 2, 3, 4 and 5 components (100 iterations). Cophenetic correlation (top right) drops after rank 4, suggesting that up to four subdivisions may be present in the data.
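A consensus matrix of this kind can be sketched as follows, assuming it records, for each pair of samples, the fraction of randomly initialized NMF runs in which the two samples fall in the same cluster. The helper names, the toy matrix and the number of runs are hypothetical.

```python
import numpy as np

def nmf_labels(A, rank, seed, n_iter=300, eps=1e-9):
    """One NMF run from a random start; returns a cluster label per sample."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], rank)) + eps
    H = rng.random((rank, A.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    # Put H on a common scale (unit-sum metagenes) before taking the argmax.
    H *= W.sum(axis=0)[:, None]
    return H.argmax(axis=0)

def consensus_matrix(A, rank, n_runs=100):
    """Fraction of runs in which each pair of samples clusters together."""
    n_samples = A.shape[1]
    C = np.zeros((n_samples, n_samples))
    for seed in range(n_runs):
        labels = nmf_labels(A, rank, seed)
        C += labels[:, None] == labels[None, :]
    return C / n_runs

# Toy data (hypothetical): 6 rows x 4 samples with two clear groups.
A = np.full((6, 4), 0.1)
A[:3, :2] += 5.0
A[3:, 2:] += 5.0
C = consensus_matrix(A, rank=2, n_runs=20)
```

Entries near 0 or 1 indicate stable clustering at that rank. The cophenetic correlation would then be computed from the distances 1 - C with a hierarchical clustering routine; that step is omitted here.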

gNMF (rank 2) shows two subgroups that mirror the two major cytogenetic subclasses of MM
Hyperdiploid Group: gains of "odd" chromosomes (chr 3, 5, 7, 9, 11, 15, 19, 21)
Non-Hyperdiploid Group
[Figure: K2 subgroups, profiles across chromosomes 1-22]

[Figure: group profiles across chromosomes 1-22, and survival (ratio) vs. time (days, 0 to 1200) for kA (n=38) and kB (n=29); p=0.8]

[Figure: survival (ratio) vs. time (days, 0 to 1200) by subgroup, two panels: k1 (n=21), k2 (n=19), k3 (n=10), k4 (n=17); and k1 vs. k2-4. k1 vs. k2: p=0.05; k1 vs. k3: p=0.03; k1 vs. k4: p=0.12]

Comparison with the T/C classification
· Cyclin D1 overexpression
· Cyclin D1 translocation
· Association of MMSET/FGFR3 translocation with overexpression of 1q genes and chr. 13 losses
· Association of c-MAF translocation with chr. 13 losses

Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Third act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Finale: the future, functional genomics.

Comparison of the genomes of AC and SCC histological subtypes
· Array-CGH profiles
  - By clustering, global profiles do not classify the subtypes
  - By permutation, one regional difference is significant: 3q26-q29 in SCC.
· Expression profiles
  - SAM analysis of expression profiles identified 297 probes (q value < 0.05)
  - One genomic region, 3q26-3q29, is significantly enriched for genes belonging to this list of 297, based on Fisher's Exact Test applied to a moving window of 10MB.
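The moving-window enrichment test can be sketched as below. It makes several assumptions not stated on the slide: a one-sided Fisher's exact test, a 1 MB step, and a gene universe restricted to the genes passed in. The function names and toy coordinates are hypothetical.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k).

    N genes in total, K of them on the list, n genes in the window,
    k list genes observed in the window (hypergeometric upper tail).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def window_scan(positions, on_list, window=10_000_000, step=1_000_000):
    """Slide a fixed-size window along one chromosome.

    positions: gene start coordinates (bp); on_list: True if the gene is on
    the list of differentially expressed genes. Returns (window_start, p).
    """
    N, K = len(positions), sum(on_list)
    results = []
    start = 0
    while start <= max(positions):
        in_window = [hit for pos, hit in zip(positions, on_list)
                     if start <= pos < start + window]
        if in_window:
            p = fisher_enrichment_p(sum(in_window), len(in_window), K, N)
            results.append((start, p))
        start += step
    return results

# Toy chromosome (hypothetical): 40 genes, one per MB; genes 30-35 are on the list.
positions = [i * 1_000_000 for i in range(40)]
on_list = [30 <= i <= 35 for i in range(40)]
best_start, best_p = min(window_scan(positions, on_list), key=lambda r: r[1])
```

In a genome-wide scan the window p-values would also need correction for multiple testing, which this sketch leaves out.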
[Figure: -log p value by chromosome (1-22, X), two panels: Array-CGH and adj. expression]

Affymetrix Probe  Cytogenetic band  Position     Gene   Gene full name
209863_s_at       3q28              190,831,918  TP73L  tumor protein p73-like
211194_s_at       3q28              190,831,918  TP73L  tumor protein p73-like
211195_s_at       3q28              190,831,918  TP73L  tumor protein p73-like
218182_s_at       3q28              191,506,196  CLDN1  claudin 1
222549_at         3q28              191,506,196  CLDN1  claudin 1
1552291_at        3q29              197,927,625  PIGX   phosphatidylinositol glycan, class X
202514_at         3q29              198,258,815  DLG1   discs, large homolog 1 (Drosophila)

Probes located in 1q and 13 differentially expressed between K1 and K2.
Mapping to the corresponding chromosomal cytogenetic bands.

k1 versus k2: 1q gain
Significantly increased expression in k2 versus k1
258 (198 genes) of 2210 probes mapping to chr1q (FDR=10%)
[Figure: significant genes/10MB along 1q (MB)]
In most cases, 1q gains affect most of the long arm.
Overexpressed probes map to a relatively small region on 1q that corresponds to focal amplifications present in a few cases (mostly cell lines).
Dysregulated expression driven by chromatin remodeling or other mechanisms, "genetic independent"?

k1 versus k2: 13 loss
Decreased expression in k2 versus k1
74 (65 genes) of 1163 probes mapping to chr13 (FDR=10%)
[Figure: significant genes/10MB along chromosome 13 (MB)]
In all our cases, 13 losses affect the whole chromosome 13.
Downregulated probes map to a relatively small region on 13 that corresponds to focal deletions/LOH (in ~15% of patients) reported in the literature (13q14, RB1).
Dysregulated expression driven by chromatin remodeling or other mechanisms, "genetic independent"?
RB1 is not among the downregulated genes in this region.

Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Third act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Finale: the future, functional genomics.

Gene Set Enrichment Analysis
No difference reported using SAM or Neighborhood analysis.

Ranking of the genes according to
difference in expression between 2
groups.
Null hypothesis: the rank ordering of
the genes is random with regard to
the diagnostic categorization of the
samples
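The ranking step and the null hypothesis can be sketched as follows. The sketch assumes an unweighted running-sum enrichment score and ranks genes by the plain difference in group means; the GSEA implementation used for the slides may weight or normalize differently. All names and the toy data are hypothetical.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of the running sum over the ranked list."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def rank_genes(expr, labels):
    """Rank genes by mean(group 1) - mean(group 0), highest first."""
    def diff(values):
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        return sum(g1) / len(g1) - sum(g0) / len(g0)
    return sorted(expr, key=lambda g: diff(expr[g]), reverse=True)

def gsea_p_value(expr, labels, gene_set, n_perm=200, seed=0):
    """Null: gene ranking is random with respect to the sample labels."""
    rng = random.Random(seed)
    observed = enrichment_score(rank_genes(expr, labels), gene_set)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the diagnostic categories
        if enrichment_score(rank_genes(expr, shuffled), gene_set) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Toy data (hypothetical): 20 genes, 10 + 10 samples; g0-g4 are up in group 1.
rng = random.Random(1)
labels = [1] * 10 + [0] * 10
expr = {f"g{i}": [rng.gauss(0, 1) + (5.0 if i < 5 and lab == 1 else 0.0)
                  for lab in labels]
        for i in range(20)}
es, p = gsea_p_value(expr, labels, {f"g{i}" for i in range(5)})
```

Because whole sample labels are permuted rather than genes, correlations between genes are preserved under the null, which matches the null hypothesis as stated above.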

Pathways            MM versus normal  K1  K2  K3  K4
Number of samples   65                21  19  8   17

KEGG_Proteasome
mtorPathway
igf1mtorPathway
p53_signalling
KRAS_FINAL_MIT
bcl2family_and_reg_network
KEGG_Glyco_Gluco
HOXA9_UP
wntPathway
rac1Pathway
shhPathway
cell_cycle_checkpointII
cell_cycle_regulator
cell_growth_and_or_maintenance
KEGG_TGF-Beta
insulin_signalling
p53Pathway
rasPathway
p27Pathway
ptenPathway
badPathway
igf1rPathway
erk5Pathway
il3Pathway
igf1Pathway
insulinPathway
cxcr4Pathway
erkPathway
gleevecPathway
caspasePathway
ptdinsPathway
cell_cycle_checkpoint
il6Pathway
aktPathway
arfPathway
rabPathway
il2rbPathway

Green: FDR < 0.25, P<0.05
Yellow: FDR >0.25, P<0.05


Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Third act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Finale: the future, functional genomics.

Multiple Myeloma MCRs
87 high-confidence hotspots identified
· > 0.8 log2 ratio
· Presence in at least one primary tumor.
Median size = 1.3 Mb
Median # of known genes = 14
48% span 1 Mb or less, harboring on average 6 annotated genes
Known oncogenes & TSGs on the list, e.g. Myc, MCL1, KRAS2, RB1

Defining features of regional amplifications and deletions
· Copy number events are inferred by Segmentation Algorithm
· CNAs are defined by
  · segment with Log2 > 0.4
  · Segment size < 20MB
· MCRs are defined by
  · overlap of CNAs from at least two profiles
  · boundaries determined by one or two samples
· High-priority MCRs
  · Log2 >= 0.8
  · Presence in primary tumor
[Figure: CNAs in Sample 1 and Sample 2, with the MCR marked at their overlap]
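The CNA, MCR and high-priority criteria above can be sketched on already segmented data. The sketch assumes the Log2 thresholds apply to the absolute value, so that deletions are covered too, and it considers only pairwise overlaps on a single chromosome. The `Segment` type, field names and toy segments are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    sample: str
    start_mb: float
    end_mb: float
    log2: float
    primary: bool  # True if the sample is a primary tumor

def call_cnas(segments, min_abs_log2=0.4, max_size_mb=20.0):
    """Keep segments beyond the log2 threshold and smaller than 20 MB."""
    return [s for s in segments
            if abs(s.log2) > min_abs_log2
            and (s.end_mb - s.start_mb) < max_size_mb]

def find_mcrs(cnas):
    """Overlap of same-direction CNAs from two different profiles."""
    mcrs = []
    for i, a in enumerate(cnas):
        for b in cnas[i + 1:]:
            same_direction = (a.log2 > 0) == (b.log2 > 0)
            if a.sample == b.sample or not same_direction:
                continue
            # MCR boundaries come from the two overlapping samples.
            start, end = max(a.start_mb, b.start_mb), min(a.end_mb, b.end_mb)
            if start < end:
                mcrs.append({"start_mb": start, "end_mb": end,
                             "peak_log2": max(a.log2, b.log2, key=abs),
                             "in_primary": a.primary or b.primary})
    return mcrs

def high_priority(mcrs, min_abs_log2=0.8):
    """High-priority MCRs: peak |log2| >= 0.8 and seen in a primary tumor."""
    return [m for m in mcrs
            if abs(m["peak_log2"]) >= min_abs_log2 and m["in_primary"]]

# Toy segments (hypothetical) on one chromosome.
segments = [
    Segment("S1", 120.0, 127.0, 0.9, True),
    Segment("S2", 124.0, 130.0, 0.5, False),
    Segment("S3", 10.0, 50.0, 1.2, True),     # too large: 40 MB
    Segment("S4", 125.0, 126.0, 0.2, False),  # below the log2 threshold
]
mcrs = find_mcrs(call_cnas(segments))
top = high_priority(mcrs)
```

Here S1 and S2 overlap between 124 and 127 MB, so that interval becomes the MCR, and it is high-priority because S1 is a primary tumor with Log2 of 0.9.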

MCRs

Cytogenetic Band   Position (Mb)    Size (Mb)  Max/Min value  # Genes  Candidate Genes

Amplifications (4 Amplifications)
1q21.1-1q22        142.60 - 152.10  9.5        2.5            228      MCL1, IL6R, PSMD4, PSMB4, UBE2Q, UBAP2L, RBM8A, RPS27, PIAS3, POLR3C, HIST2H2AA, LASS2, MRPL9, JTB, HAX1, SHC1, APH-1A, BCL9, ZNF364
8q24.12-8q24.13    120.50 - 126.52  6.0        0.8            37       DEPDC6, MRPL13, DERL1, ZHX1, TATDN1, RNF139, FBXO32
8q24.21-8q24.3     128.50 - 146.20  17.7       2.5            127      MYC, PVT1,
20q12-20q13.12     39.14 - 43.39    4.3        1.3            43       TDE1, SFRS6, YWHAB, PPIA

Deletions (11 Deletions)
1p36.22            10.41 - 10.62    0.2        -1.0           4        DFFA
1p35.2             30.86 - 31.04    0.2        -1.0           4        LAPTM5
1p32.3-1p32.2      54.23 - 57.13    2.9        -0.9           23       SSBP3, USP24
1p13.3-1p12        111.49 - 118.21  6.7        -1.0           71       DENND2D, DDX20, ST7L, PPP2CZ, LRIG2, PTPN22, CD58, IGSF2
16q11.2-16q12.2    45.17 - 52.87    7.7        -1.7           45       DNAJA2, SIAH1, PAPD5, NKD1, CARD15, RBL2, FTS, CYLD
16q13              56.23 - 56.57    0.1        -1.1           9        GPR56, KATNB1, KIFC3, CNGB1
16q24.3            88.23 - 88.63    0.4        -1.3           19       GAS8
17p13.2            3.31 - 4.79      1.5        -1.3           43       TAX1BP3, GSG2
17p13.1-17p12      6.45 - 13.92     7.5        -1.0           125      TP53, TNFSF12, TNFSF13
20p12.1            13.08 - 17.41    4.3        -0.9           21       OTOR
20q13.12           43.97 - 44.11    0.1        -1.1           6        PCIF1

Expression Analysis of MCR Resident Genes
Copy number driven expression: T test; Gene weight (permutation based)
Overexpression without amplification: standard deviation of the normals vs. expression level on the PT showing AMP
[Figure: genes within the MCRs. BLUE: all known NCBI genes within the MCRs. RED: genes with copy-number driven and increased expression]
Significance determined by permuting sample labels for expression data (1000 permutations, p value % 0.001)
Hyman, E., Cancer Res., 2002
25% of MCR genes show oncogene-like expression patterns
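A permutation-based gene weight can be sketched as below. The slide does not spell out the formula, so the sketch assumes a signal-to-noise form: the difference in mean expression between amplified and non-amplified samples, divided by the sum of the two standard deviations. The function names and toy values are hypothetical.

```python
import random
from statistics import mean, stdev

def gene_weight(expr, amplified):
    """Signal-to-noise weight of one gene (assumed form, see lead-in).

    expr: expression value per sample; amplified: True where the sample
    carries the amplification. Each group needs at least two samples.
    """
    amp = [e for e, a in zip(expr, amplified) if a]
    non = [e for e, a in zip(expr, amplified) if not a]
    return (mean(amp) - mean(non)) / (stdev(amp) + stdev(non))

def gene_weight_p(expr, amplified, n_perm=1000, seed=0):
    """Empirical p-value obtained by permuting the sample labels."""
    rng = random.Random(seed)
    observed = gene_weight(expr, amplified)
    labels = list(amplified)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if gene_weight(expr, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy gene (hypothetical): 4 amplified samples with high expression, 8 without.
amplified = [True] * 4 + [False] * 8
expr = [9.1, 8.7, 9.4, 8.9, 5.0, 5.2, 4.8, 5.1, 4.9, 5.3, 5.0, 4.7]
weight, p_value = gene_weight_p(expr, amplified)
```

Counting the observed labelling itself in the numerator and denominator keeps the empirical p-value from ever being exactly zero.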

The only gene surviving gene weight in the MYC locus was... MYC
[Figure: MYC expression (log scale, 10 to 25000) in Normals and Tumors; marked values 3130 and 538; samples N01, N30, D-PL208, D-PL019, D-PL012, D-PL386, D-PL407, D-PL454, D-PL466, D-St889, D-PL1233]
Only a few samples showing amplification in c-MYC...

The only gene surviving gene weight in the HGF locus was... HGF
[Figure: HGF expression (log scale, 5 to 25000) in Normals and Tumors; marked values 634 and 42; samples N01, N30, D-PL208, D-PL019, D-PL012, D-PL386, D-PL407, D-PL454, D-PL466, D-St889, D-PL1233]
Only a few samples showing amplification in the HGF region...
At least for focal regions, aCGH is useful only to point to regions where critical oncogenic genes are located; these genes are also overexpressed by other means.

Outline
· Ouverture: gNMF classification of MM samples.
· First act: from above: merging expression and aCGH data.
· Second act: in the trench: Gene Weight to address oncogenic overexpressed genes.
· Third act: pathway analysis: Gene Set Enrichment Analysis (GSEA) in specific MM subgroups.
· Finale: the future, functional genomics.

Cancer-relevant gene Filtering Flow Chart
47 amplicons
2151: total number of Affymetrix probes
567: copy-number driven AND overexpressed Affymetrix probes
MoAb targets: 37; Both: 14; Small Molecule targets: 100

[Figure: panels labeled RAS, RAS/cMYC, Gene 1 and Gene 2]

[Figure: values (0.00 to 2.25) at 48, 72, 96 and 120 hours; conditions: IL3+, IL3-, Gene 1, EGFP]

Ron DePinho
Ruben Carrasco
Cameron Brennan
Alexei Protopopov
John Shaughnessy
Lynda Chin
Arkansas Medical Center
Raktim Sinha
Bin Feng
Yunyu Zhang
Elena Ivanova
Marina Protopopova
Kwok-Kin Wong
Mike Kuehl
Gautam Maulik
Zhaohui Yang
Hongbin Ji
Leif Bergsagel
Rafael Fonseca
Ken Anderson
Nikhil Munshi
Funding
·International Myeloma Foundation
·Fund to Cure Myeloma
·Multiple Myeloma SPORE
·Leukemia and Lymphoma Society