by C. Grillenzoni and D. Migliavacca

The on-line software estimates the ridges (crests) of 2D point clouds using iterative principal component mean shift (PCMS) methods.

The mean shift (MS) moves the points toward the modes of the cloud, while the principal components (PC) steer the points onto the crests.
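To make the MS half of the method concrete, here is a minimal sketch of a plain mean-shift update. It assumes a Gaussian kernel with a single bandwidth `b`, which this page does not state, and it leaves out the PC step. The function name and the synthetic data are illustrative only, not the tool's code.

```python
import numpy as np

def mean_shift_step(points, data, b):
    """Move each point to the Gaussian-weighted local mean of the data."""
    diff = points[:, None, :] - data[None, :, :]          # shape (P, N, d)
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=2) / b ** 2)  # shape (P, N)
    return (w @ data) / w.sum(axis=1, keepdims=True)

# Synthetic single-cluster cloud (made up for the demo)
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, -2.0], scale=0.3, size=(200, 2))

points = data.copy()
for _ in range(50):
    # The original data are reused at every iteration
    points = mean_shift_step(points, data, b=0.5)
```

With a single cluster, all points should end up at essentially the same location, the mode of the smoothed cloud.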

DATA = [x,y,z,m,t]; (at present, "t" is not necessary).

The rows represent the data units (i.e. the events). The columns provide the spatial features:

x = 1st coordinate (longitude)

y = 2nd coordinate (latitude)

z = 3rd coordinate (depth)

m = mark (magnitude)

t = time (occurrence)

Since the data file may be very large, the user may restrict the analysis to a subset of it by setting a minimum mark (magnitude) for the observations, e.g. m >= 3.0.
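As an illustration of the DATA layout and the minimum-mark filter, the sketch below keeps only the rows whose mark column is at least a chosen threshold. The function name and the three sample events are made up for the demo; the column order follows DATA = [x,y,z,m,t] above.

```python
import numpy as np

def select_by_magnitude(data, min_mark=3.0):
    """Keep rows of DATA = [x, y, z, m, t] whose mark m (column index 3) is >= min_mark."""
    data = np.asarray(data, dtype=float)
    return data[data[:, 3] >= min_mark]

# Made-up events, for illustration only: x, y, z, m, t
demo = np.array([
    [-120.1, 36.2, 5.0, 2.4, 1.0],
    [-119.8, 35.9, 8.5, 3.0, 2.0],
    [-121.3, 37.0, 2.1, 4.7, 3.0],
])
subset = select_by_magnitude(demo, min_mark=3.0)
```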

BLURRING is an accelerated version, which uses the original data only in the first iteration and then re-clusters the previous estimates. It is efficient but suffers from asymptotic bias, so its number of iterations must be small (fewer than 10).
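The blurring idea can be sketched as follows: the estimates are averaged against themselves instead of against the original data. This is a minimal illustration assuming a Gaussian kernel and a single bandwidth `b`, and it omits the PC step; the function name is hypothetical.

```python
import numpy as np

def blurring_mean_shift(data, b, n_iter=5):
    """Blurring mean shift: each iteration re-clusters the previous estimates."""
    est = np.asarray(data, dtype=float).copy()
    for _ in range(n_iter):
        # On the first pass est equals the original data; afterwards
        # only the previous estimates are used.
        diff = est[:, None, :] - est[None, :, :]
        w = np.exp(-0.5 * np.sum(diff ** 2, axis=2) / b ** 2)
        est = (w @ est) / w.sum(axis=1, keepdims=True)
    return est

rng = np.random.default_rng(2)
data = rng.normal(size=(150, 2))   # synthetic cloud for the demo
est = blurring_mean_shift(data, b=0.7, n_iter=5)
```

Because every estimate is a weighted average of the current estimates, the cloud contracts at each pass, which is in line with the advice to keep the number of iterations small.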

The BANDWIDTH SELECTION option selects the bandwidths of the PCMS algorithms automatically (in a data-driven way), by minimizing a global fitting criterion based on the sum of the Euclidean distances between the estimates and the data points, and between the estimates themselves.

MIN = minimum bandwidth value, e.g. 0.5

MAX = maximum bandwidth value, e.g. 2

N = number of bandwidth points, e.g. 5
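As a rough illustration of how MIN, MAX and N might define the candidate bandwidths, the sketch below builds a grid and picks the value with the smallest criterion score. Equal spacing of the grid is an assumption, and `select_bandwidth` and `toy_criterion` are hypothetical names; the tool's actual fitting criterion is not reproduced here.

```python
import numpy as np

def bandwidth_grid(b_min=0.5, b_max=2.0, n=5):
    """Candidate bandwidth values, assumed equally spaced between MIN and MAX."""
    return np.linspace(b_min, b_max, n)

def select_bandwidth(grid, criterion):
    """Return the grid value with the smallest criterion score."""
    scores = np.array([criterion(b) for b in grid])
    return grid[np.argmin(scores)]

def toy_criterion(b):
    # Placeholder only: stands in for the global fitting criterion
    return (b - 1.25) ** 2

grid = bandwidth_grid(0.5, 2.0, 5)   # 0.5, 0.875, 1.25, 1.625, 2.0
best = select_bandwidth(grid, toy_criterion)
```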

TOLERANCE (0.001) : maximum mean Euclidean distance allowed between the estimates of two consecutive iterations. In the CLASSICAL algorithm, the iterations stop once the distance falls below this value, which keeps the computation time to a minimum.
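The stopping rule can be sketched independently of the update itself. In the example below, `step` is any function that maps the current estimates to new ones; `toy_step` is a simple stand-in (it halves every coordinate) and is not the PCMS update. The `max_iter` safeguard is also an assumption.

```python
import numpy as np

def iterate_until_tolerance(step, start, tol=0.001, max_iter=100):
    """Repeat `step` until the mean Euclidean move per point drops below `tol`."""
    est = np.asarray(start, dtype=float)
    dist = np.inf
    n_iter = 0
    while n_iter < max_iter and dist >= tol:
        new = step(est)
        # Mean over points of the distance moved in this iteration
        dist = np.mean(np.linalg.norm(new - est, axis=1))
        est = new
        n_iter += 1
    return est, n_iter, dist

def toy_step(points):
    # Stand-in update for the demo: shrink every point toward the origin
    return 0.5 * points

est, n_iter, dist = iterate_until_tolerance(toy_step, np.ones((4, 3)))
```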

BANDWIDTH SIZE (.5-2) : PCMS algorithms are weighted local means of the data; the bandwidth "b" tunes the number of data points that enter each mean. In the present program, it is proportional to Silverman's rule (based on the standard errors (SE) of the data). In practice, it is given by b = a*SE/N^0.2, where N is the number of data units and 0 < a < infinity. If a = 1, then b equals Silverman's rule; the recommended range is 0.5 < a < 2. Values a > 1 reduce the variance of the estimates but increase the bias; values a < 1 provide multiple ridges but increase the noise.

N. of BANDWIDTHS (1,2) : indicates single (1) or multiple (2) bandwidths. Since Silverman's rule is based on the SE of the data x, y, z, option 1 uses the average SE, whereas option 2 uses the individual SE of each coordinate. This option has proven useful with simulated data.
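The bandwidth formula and the single/multiple option can be sketched together. The example assumes that SE is the sample standard deviation of each coordinate and that N in the formula is the number of rows; `pcms_bandwidth` is a hypothetical name, not the tool's function.

```python
import numpy as np

def pcms_bandwidth(xyz, a=1.0, n_bandwidths=1):
    """b = a * SE / N**0.2, with one shared SE (1) or one SE per coordinate (2)."""
    xyz = np.asarray(xyz, dtype=float)
    n = xyz.shape[0]                      # assumed: N = number of data units
    se = xyz.std(axis=0, ddof=1)          # assumed: SE = sample standard deviation
    if n_bandwidths == 1:
        se = np.full_like(se, se.mean())  # average SE shared by x, y, z
    return a * se / n ** 0.2

# Synthetic coordinates with very different spreads (demo only)
rng = np.random.default_rng(3)
xyz = rng.normal(scale=[1.0, 2.0, 5.0], size=(500, 3))
b_single = pcms_bandwidth(xyz, a=1.0, n_bandwidths=1)
b_multi = pcms_bandwidth(xyz, a=1.0, n_bandwidths=2)
```

When the coordinates have very different spreads, as depth and longitude often do, option 2 gives each axis its own smoothing scale, while option 1 applies one common value.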

COVARIANCE TYPE (0,1,2) : indicates the type of variance-covariance matrix: (0) is global, (1) is local, (2) is intermediate. PCMS algorithms are based on the spectral factorization of the weighted covariance matrix of the data (x,y,z). Choose (0) if all ridges have the same direction in space.
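The difference between a global and a local covariance matrix, and the spectral factorization applied to either, can be sketched as follows. Gaussian weights centred on the query point are an assumption, the function names are hypothetical, and the intermediate type (2) is not sketched because its definition is not given here.

```python
import numpy as np

def global_cov(data):
    """Type (0): one unweighted covariance matrix for the whole cloud."""
    return np.cov(data, rowvar=False)

def local_cov(point, data, b):
    """Type (1): covariance weighted by a Gaussian kernel centred on `point`."""
    d = data - point
    w = np.exp(-0.5 * np.sum(d ** 2, axis=1) / b ** 2)
    w = w / w.sum()
    centred = data - w @ data             # subtract the weighted local mean
    return (centred * w[:, None]).T @ centred

def principal_axes(cov):
    """Eigenvalues and eigenvectors of a symmetric matrix, largest first."""
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Synthetic points scattered around a straight line in (x, y, z)
rng = np.random.default_rng(1)
t = rng.uniform(-3, 3, size=300)
data = np.column_stack([t, t, 0.5 * t]) + rng.normal(scale=0.05, size=(300, 3))

vals_g, vecs_g = principal_axes(global_cov(data))
vals_l, vecs_l = principal_axes(local_cov(data[0], data, b=1.0))
```

For a single straight ridge like this one, the leading eigenvector of both matrices points along the ridge, which is the case where the global type (0) is sufficient.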

California.txt

UserGuide.txt

This website tool has been developed by Prof. Dr. Carlo Grillenzoni and Dr. Diego Migliavacca.