USER GUIDE
% --------------------------------------------------------
This WebGis performs 2D spatial clustering of 3D space-time point data
(typically produced by seismic catalogs, GPS and laser surveys).
The online software estimates the ridges (crests) of 2D point clouds
using iterative principal component mean shift (PCMS) methods.
The mean shift (MS) moves the points toward the modes of the cloud,
while the principal components (PC) steer the points onto the crests.

STEP 1 - Load the Data
% --------------------------------------------
The data file must be located on your PC.
It must be in ASCII format (csv or txt), with values separated by
commas or semicolons:
   DATA = [x,y,z,m,t];   (at present, "t" is not necessary)
The rows represent the data units (i.e. the events);
the columns provide their features:
   x = 1st coordinate (longitude),
   y = 2nd coordinate (latitude),
   z = 3rd coordinate (depth),
   m = mark (magnitude),
   t = time (occurrence).
Since the data file may be very large, the user may consider only a
portion of it by defining the minimum mark (magnitude) of the
observations, e.g. only events with magnitude >= 3.0

STEP 2 - Select the Algorithm
% -------------------------------------
The user may choose between the CLASSICAL and BLURRING PCMS methods;
both iteratively move the points toward the ridges of the cloud.
BLURRING is an accelerated version, which uses the original data only
in the first iteration and then re-clusters the previous estimates.
It is efficient, but suffers from asymptotic bias;
hence, its number of iterations must be small (less than 10).

The BANDWIDTH SELECTION option selects the bandwidths of the PCMS
algorithms in an automatic (data-driven) way, by minimizing a global
fitting criterion based on the sum of Euclidean distances between
estimates and data points, and among the estimates themselves.
   MIN = minimum bandwidth value, e.g. 0.5
   MAX = maximum bandwidth value, e.g. 2
   N   = number of bandwidth points, e.g. 5

STEP 3 - Select the Coefficients
% -----------------------------------
MAX ITERATIONS  : maximum number of iterations allowed for the estimates (1-30)
                  for the BLURRING algorithm it must be less than 10

TOLERANCE       : maximum mean Euclidean distance allowed between the (0.001)
                  estimates of two consecutive iterations.
                  In the CLASSICAL algorithm it helps to limit the
                  computation time (e.g. 0.001)

BANDWIDTH SIZE  : PCMS algorithms are weighted local means of the data; (.5-2)
                  the bandwidth "b" tunes the number of data in the mean.
                  In the present program, it is proportional to
                  Silverman's rule (based on the standard errors (SE)
                  of the data). In practice, it is given by
                     b = a*SE/N^0.2
                  where N is here the number of data points and "a" is
                  the factor entered in this field:
                  a>1 reduces the variance of the estimates, but increases the bias
                  a<1 provides multiple ridges, but increases the noise

N. of BANDWIDTHS: indicates single (1) or multiple (2) bandwidths (1,2)
                  As Silverman's rule is based on the SE of the data x,y,z,
                  1 means the average SE, whereas 2 indicates individual SEs.
                  This option has proven to be useful in simulated data.

COVARIANCE TYPE : indicates the type of variance-covariance matrix (0,1,2)
                  (0) is global, (1) is local, (2) is intermediate.
                  PCMS algorithms are based on the spectral factorization
                  of the weighted covariance matrix of the data (x,y,z).
                  Choose (0) if the ridges have the same direction in space.
%----------------------------------------------------------------------
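
To check a data file before uploading it (STEP 1), the layout and the minimum-magnitude selection can be reproduced offline. The following is a minimal Python sketch, not the WebGis code itself: the function name `load_events`, the delimiter detection, and the sample values are all hypothetical and chosen only for illustration.

```python
import csv
import io


def load_events(stream, min_magnitude=None):
    """Read rows of x, y, z, m[, t] separated by commas or semicolons.

    Rows that are not fully numeric (e.g. a header) or that have fewer
    than four columns are skipped. Only rows with m >= min_magnitude
    are kept when a threshold is given.
    """
    events = []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        # Assumption: a line uses semicolons if any are present, else commas.
        delimiter = ";" if ";" in line else ","
        fields = next(csv.reader([line], delimiter=delimiter))
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # header or malformed row
        if len(values) < 4:
            continue
        if min_magnitude is None or values[3] >= min_magnitude:
            events.append(values)
    return events


# Hypothetical sample: two rows with "t", one row without it.
sample = io.StringIO(
    "13.35;42.35;9.5;3.2;1\n"
    "13.40;42.30;8.0;2.1;2\n"
    "13.38,42.33,10.2,4.0\n"
)
selected = load_events(sample, min_magnitude=3.0)
```

Here `selected` keeps only the first and third rows, since the second has magnitude 2.1.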
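
The bandwidth rule b = a*SE/N^0.2 of STEP 3 and the MIN/MAX/N grid of STEP 2 can be illustrated with a short Python sketch. It rests on several assumptions that the guide does not confirm: "SE" is taken to be the sample standard deviation of each coordinate, the grid is taken to be evenly spaced, and the names `silverman_bandwidths` and `bandwidth_grid` are hypothetical.

```python
import statistics


def silverman_bandwidths(points, a=1.0, n_bandwidths=1):
    """Return one bandwidth per coordinate, b = a * SE / N**0.2.

    n_bandwidths=1 uses the average SE for every coordinate,
    n_bandwidths=2 uses the individual SE of each coordinate.
    Assumption: SE is the sample standard deviation of a coordinate.
    """
    n = len(points)
    columns = list(zip(*points))
    se = [statistics.stdev(column) for column in columns]
    if n_bandwidths == 1:
        mean_se = sum(se) / len(se)
        se = [mean_se] * len(se)
    return [a * s / n ** 0.2 for s in se]


def bandwidth_grid(b_min, b_max, n_points):
    """Return n_points candidate values from b_min to b_max.

    Assumption: the candidates are evenly spaced.
    """
    if n_points == 1:
        return [b_min]
    step = (b_max - b_min) / (n_points - 1)
    return [b_min + i * step for i in range(n_points)]


# Hypothetical 2D data with different spread in x and y.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
single = silverman_bandwidths(points, a=1.0, n_bandwidths=1)
multiple = silverman_bandwidths(points, a=1.0, n_bandwidths=2)
grid = bandwidth_grid(0.5, 2.0, 5)
```

With option 1 both coordinates share one bandwidth; with option 2 the y bandwidth is twice the x bandwidth, because the y values are twice as spread out.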
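
The difference between the CLASSICAL and BLURRING options, and the role of MAX ITERATIONS and TOLERANCE, can be seen in a plain Gaussian mean shift. The Python sketch below deliberately omits the principal component step, so it moves points toward modes rather than ridges; it is an illustration under that simplification, not the PCMS implementation used by the WebGis. The function name `mean_shift`, the Gaussian kernel, and the sample data are assumptions.

```python
import math


def mean_shift(data, bandwidth, max_iter=30, tol=1e-3, blurring=False):
    """Plain Gaussian mean shift (no principal component step).

    Classical: every iteration averages over the original data.
    Blurring: every iteration averages over the previous estimates.
    Stops when the mean Euclidean distance between consecutive
    estimates falls below tol, or after max_iter iterations.
    """
    estimates = [tuple(p) for p in data]
    reference = estimates  # classical: stays fixed at the original data
    for iteration in range(1, max_iter + 1):
        if blurring:
            reference = estimates  # blurring: re-cluster previous estimates
        updated = []
        for p in estimates:
            weights = [
                math.exp(-0.5 * math.dist(p, q) ** 2 / bandwidth ** 2)
                for q in reference
            ]
            total = sum(weights)
            updated.append(tuple(
                sum(w * q[k] for w, q in zip(weights, reference)) / total
                for k in range(len(p))
            ))
        shift = sum(math.dist(p, u) for p, u in zip(estimates, updated))
        shift /= len(updated)
        estimates = updated
        if shift < tol:
            break
    return estimates, iteration


# Hypothetical data: two small, well separated groups of points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
classical, n_classical = mean_shift(data, bandwidth=0.5)
blurred, n_blurred = mean_shift(data, bandwidth=0.5, max_iter=9, blurring=True)
```

In both runs the points of each group collapse onto a common location, and the tolerance check stops the loop before the iteration limit is reached.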