Current Projects

The projects for the 2017-2018 academic year are listed below. Applications to participate in these projects are closed. Projects from previous years can be viewed on the project archive page.

Project 1: Analysis of Complex Functional Data

Faculty Mentor: Hans-Georg Müller

The goal of this project is to apply and extend existing methods for functional data with complex structure, such as point processes, densities and functional data with spatio-temporal dependencies. The developments will be primarily data-driven. Possible data sources include brain imaging (fMRI), demography and transportation; methods include appropriate versions of functional principal component analysis. The RTG students will download and preprocess data and participate in data analysis and code development.

Prerequisites: Required: strong computing skills (Python, Matlab or R), probability at the level of 131A, calculus and linear algebra, 106/108. Desirable but not required: 131BC, 135, 141A, 141BC, 106/108.

Number of students: up to 3

Project 2: Image Pattern Generation using Cellular Automata

Faculty Mentor: Thomas Lee

Cellular automata are discrete dynamical systems which evolve on a discrete grid. Recent studies have shown that cellular automata with relatively simple rules can produce highly complex patterns. This project aims to develop methods for learning such rules, and apply the learned rules to generate image patterns and textures.
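As an illustration of how a very simple rule can produce a complex pattern, here is a minimal sketch of an elementary (one-dimensional, two-state) cellular automaton. It is not part of the project materials: the function names step and run, the wrap-around boundary and the choice of Rule 30 are assumptions made for this example only.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    cells: list of 0/1 values; rule: integer 0..255 (Wolfram rule number).
    The grid wraps around at the edges (an assumption of this sketch).
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbouring states form a 3-bit index into the rule number.
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(rule, width, steps):
    """Evolve from a single live cell in the middle; return all generations."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    # Print each generation as one row of text; '#' marks a live cell.
    for row in run(rule=30, width=31, steps=15):
        print("".join("#" if c else "." for c in row))
```

Printing the rows one under another gives a two-dimensional texture; changing the rule number changes the pattern.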

Prerequisites: Strong statistical background (131AB and possibly 131C), time series (137) highly desired, and strong programming skills.

Number of students: 2-3

Project 3: Survey of nonlinear dimension reduction

Faculty Mentor: Xiaodong Li

The goal of this project is to understand and explore the concepts, applications and empirical behaviors of nonlinear dimension reduction methods, such as MDS, Isomap, LLE and t-SNE, particularly for data visualization.

Prerequisites: STA106, STA108, MAT22A/MAT167.

Number of students: 3-4

Project 4: Analysis of high-dimensional proteomics data using networks and topological data analysis

Faculty Mentors: Javier Arsuaga, Dietmar Kueltz, Wolfgang Polonik

The goal of this project is to analyze differential protein expression in response to environmental conditions in fish, in order to address real scientific questions. The high-dimensional data will be analyzed using (i) a certain type of network analysis (the k-core method and refinements thereof) and (ii) methods from topological data analysis (TDA). While the participating students will be split into two groups, one working with networks and the other with TDA, it is expected that all participants will collaborate and participate in joint discussions and training sessions.
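For readers unfamiliar with the k-core idea, the following is a minimal sketch of the standard pruning procedure: nodes with fewer than k neighbours are removed repeatedly until none remain. It is an illustration only, not the project's method or its refinements; the function name k_core, the adjacency-dictionary input and the toy graph are assumptions, and the graph is taken to be undirected with symmetric adjacency sets.

```python
def k_core(adjacency, k):
    """Return the set of nodes in the k-core of an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours
    (assumed symmetric). The input is not modified.
    """
    remaining = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        # Collect nodes whose current degree is below k, then prune them.
        low_degree = [node for node, nbrs in remaining.items() if len(nbrs) < k]
        for node in low_degree:
            for nbr in remaining.pop(node):
                if nbr in remaining:
                    remaining[nbr].discard(node)
            changed = True
    return set(remaining)


if __name__ == "__main__":
    # Toy graph (hypothetical): a triangle a-b-c with a tail c-d-e.
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c", "e"},
        "e": {"d"},
    }
    print(sorted(k_core(graph, 2)))
```

In the toy graph the tail is pruned step by step, leaving only the triangle as the 2-core.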

Prerequisites: Basic knowledge of statistics and linear algebra, as well as good computing skills, are expected. Some background in topology would be helpful but is not necessary.

Number of students: up to 6

Project 5: Causation, confounding and mediation

Faculty Mentor: Christiana Drake

Statisticians and data analysts get data from many sources. In agriculture, investigators often conduct experiments. Studies involving humans are more complicated. Clinical trials, which are mostly randomized, are used to assess the efficacy of new drugs. Epidemiologists and public health workers often study potentially harmful substances, and their studies are observational in nature. Biologists also often have to rely on observational studies to investigate biological processes in the progression of diseases. We will examine the ideas underlying causal inference in observational studies through a concept called the Rubin causal model. We will look at confounding as an obstacle to causal inference and at methods to address it. We will also define the concept of a mediator and study how it differs from a confounder. The concepts will be introduced through reading short papers, and we will apply them to data using SAS and R as needed.

Prerequisites: STA106/108 and STA131A-C, or equivalent

Number of students: up to 5