Earth, Planetary, and Space Sciences

Machine learning on the 5200 element Long Beach array: Localizing small sources and sparse travel-time tomography


Jan. 24, 2018, noon - 12:50 p.m.
Geology 1707

Presented By:
Peter Gerstoft
UCSD


Machine learning on the 5200 element Long Beach array: Localizing small sources and sparse travel-time tomography
In this talk, I will focus on two unsupervised learning methods for analyzing the Long Beach data. In unsupervised machine learning, we infer a function that describes hidden structure in "unlabeled" data. One example is dictionary learning. Dictionary learning can improve sound speed profile (SSP) resolution by generating a dictionary of shape functions for sparse processing (e.g., compressive sensing) that optimally compress SSPs, minimizing both the reconstruction error and the number of coefficients. Dictionary learning can also be used for travel-time tomography: we describe a 2D travel-time tomography method that regularizes the inversion by sparsely modeling patches of slowness pixels from a discrete slowness map and adapts sparse dictionaries to the slowness data.

We develop a model-free technique to identify weak sources within dense sensor arrays using graph clustering. No knowledge of the propagation medium is needed, except that signal strengths decay to insignificant levels within a scale that is shorter than the array aperture. We then reinterpret the spatial coherence matrix of a wave field as a matrix whose support is the connectivity matrix of a graph with sensors as vertices. In a dense network, well-separated sources induce clusters in this graph, and the geographic spread of these clusters can serve to localize the sources. The support of the covariance matrix is estimated from limited-time data using a hypothesis test with a robust phase-only coherence test statistic, combined with a physical distance criterion. The distance criterion ensures graph sparsity and thus prevents clusters from forming by chance. We verify the approach and quantify its reliability on a simulated dataset. The method is then applied to data from a dense 5200-element geophone array that blanketed 7 km × 10 km of the city of Long Beach (CA). The analysis exposes a helicopter traversing the array and oil production facilities.
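As a rough illustration of the graph-clustering idea in the abstract, the sketch below builds a sensor graph from phase-only coherence and reads sources off its connected components. It is not the speaker's implementation. The synthetic line array, the fixed coherence threshold (a simple stand-in for the hypothesis test), the distance cutoff, and all function names are assumptions made for this example.

```python
import cmath
import math
import random


def phase_only_coherence(a, b):
    """Magnitude of the mean unit phasor of the cross-spectrum over snapshots."""
    total = 0j
    for x, y in zip(a, b):
        cross = x * y.conjugate()
        if cross != 0:
            total += cross / abs(cross)  # keep the phase, discard the amplitude
    return abs(total) / len(a)


def coherence_graph(positions, snapshots, threshold, max_distance):
    """Link sensor pairs that are both close together and phase-coherent."""
    n = len(positions)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if math.hypot(dx, dy) > max_distance:
                continue  # distance criterion keeps the graph sparse
            if phase_only_coherence(snapshots[i], snapshots[j]) > threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def clusters(adjacency, min_size=2):
    """Connected components with at least min_size sensors."""
    seen, found = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(component) >= min_size:
            found.append(sorted(component))
    return found


def centroid(positions, members):
    """Mean position of a cluster, used here as a crude source location."""
    xs = [positions[m][0] for m in members]
    ys = [positions[m][1] for m in members]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def simulate(n_sensors=20, n_snapshots=100, seed=0):
    """Hypothetical line array: two weak sources, each seen by 3 sensors only."""
    rng = random.Random(seed)
    positions = [(float(i), 0.0) for i in range(n_sensors)]
    source_of = {3: 0, 4: 0, 5: 0, 14: 1, 15: 1, 16: 1}
    snapshots = [[] for _ in range(n_sensors)]
    for _ in range(n_snapshots):
        source_phase = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(2)]
        for i in range(n_sensors):
            value = complex(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1))  # noise
            if i in source_of:
                delay = 0.3 * i  # assumed fixed propagation phase per sensor
                value += cmath.exp(1j * (source_phase[source_of[i]] + delay))
            snapshots[i].append(value)
    return positions, snapshots


positions, snapshots = simulate()
graph = coherence_graph(positions, snapshots, threshold=0.5, max_distance=3.0)
found = clusters(graph)
for members in found:
    print(members, centroid(positions, members))
```

In this toy setup, sensors that record the same source keep a stable phase difference across snapshots, so their phase-only coherence stays near 1, while noise-only pairs average toward 0. A real analysis would replace the fixed threshold with a proper statistical test and work on measured spectra rather than simulated phasors.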