Mathematical and Statistical Problems from DNA Sequencing
The capacity to produce DNA sequence data has increased dramatically since the 2001 announcement of the human genome
sequence. This lecture will survey a few of the algorithmic and statistical issues in this area. The volume of raw
sequence reads from which genomes must be estimated has forced a shift away from the computational methods used in
the Human Genome Project, methods that had been in use and under refinement since the mid-1970s. A newer approach
reduced computation time but required much more memory, which in turn prompted new memory-reduction
techniques. At the same time, traditional sequence comparison methods used since the early 1970s have in part been
replaced by so-called alignment-free methods.
Professor Michael Waterman
University Professor at University of Southern California
Professor Michael Waterman holds an Endowed Associates Chair at the University of Southern California (USC). He came to USC in
1982 after positions at Los Alamos National Laboratory and Idaho State University. He has a bachelor’s degree in Mathematics from
Oregon State University and a PhD in Statistics and Probability from Michigan State University. He has held visiting positions at
the University of Hawaii (1979–80), the University of California at San Francisco (1982), Mt. Sinai Medical School (1988), Chalmers
University (2000), and in 2000–2001 he held the Aisenstadt Chair at University of Montreal.
Professor Michael Waterman was named a Guggenheim Fellow (1995). He is an elected member of the American Academy of Arts
and Sciences (1995), the National Academy of Sciences (2001) and the National Academy of Engineering (2012). He is also an elected
Fellow of the American Association for the Advancement of Science (1990), the Institute of Mathematical Statistics (1991), the Society for
Industrial and Applied Mathematics (2009) and the International Society for Computational Biology (2009). In fall 2000 he became the
first Fellow of Celera Genomics. He received a Gairdner Foundation International Award (2002) and the Friendship Award from the
Chinese government (2013). He is an elected Foreign Member of the French Académie des Sciences (2005) and the Chinese Academy
of Sciences (2013). He received Doctor Philosophiae Honoris Causa from Tel Aviv University (2011) and Southern Denmark
From 2003 to 2008, Professor Waterman served a 5-year term as Faculty Master of Parkside International Residential College (PIRC) at USC. PIRC is a residential college
that is home to over 600 undergraduates and serves as a center for internationally oriented cultural, academic and social events.
From May 2008 to May 2013, in addition to his USC appointment, Professor Michael Waterman served as Chair Professor at Tsinghua University in Beijing. He
led a team of distinguished scientists consisting of Michael Zhang (Cold Spring Harbor Labs), Wing Wong (Stanford), Jun Liu (Harvard), Tao Jiang (UC
Riverside) and Fengzhu Sun (USC). The team collectively worked to enhance Tsinghua’s programs in bioinformatics and computational biology.
Professor Waterman works in the area of Computational Biology and Bioinformatics, concentrating on the creation and application of mathematics,
statistics and computer science to molecular biology, particularly to DNA, RNA and protein sequence data. He is the co-developer of the Smith-Waterman
algorithm for sequence comparison and of the Lander-Waterman formula for physical mapping. His 1995 paper with Idury introduced the use of Eulerian
and de Bruijn graphs for sequence assembly.