Identifying + removing outliers

Affiliation
Bundesdeutsche Arbeitsgemeinschaft fur Veranderliche Sterne e.V.(Germany) (BAV)
Sun, 04/19/2020 - 16:57

Hi,

In the EE Cep campaign I noticed a large scatter in my data. The problem is this: the mag error is 0.022 and the star does not change quickly, so EE Cep is almost constant during one night. Which data are plausible, then? The 2 (red-marked) data points at around mag 11 have a small error, yet they cannot be valid! So far I have not been able to reproduce the clean data points that other observers get from CCD measurements.

In my data the outliers are often not that obvious. I could arbitrarily exclude all data points at the very top and all data points at the very bottom.

I need something like a trend, so that all data within a certain range of that trend can be included. The mean above is 10.8944.

1) My first idea is to use the mean: all data points whose mag error bars (typically +/- 0.022) do not reach the mean will be discarded.

This looks better, but it may also change the mean, sometimes too significantly?

2) I calculate the standard deviation (STABW) of the magnitudes, and all data points that lie within one standard deviation of the mean are included (standard deviation = 0.06).

The mean here is 10.8893, almost the same as the originally calculated mean.

3) For distant outliers I can take the median, which is more stable than the mean.
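
The three rules listed here can be compared side by side in a minimal Python sketch. The magnitudes are invented for illustration (they are not the measurements from this thread) and the function names are hypothetical; only the 0.022 error bar is taken from the post.

```python
import statistics

def keep_if_error_bar_reaches_mean(mags, mag_err):
    """Rule 1: keep points whose error bar (mag +/- mag_err) reaches the mean."""
    mean = statistics.mean(mags)
    return [m for m in mags if abs(m - mean) <= mag_err]

def keep_within_one_sd(mags):
    """Rule 2: keep points within one sample standard deviation of the mean."""
    mean = statistics.mean(mags)
    sd = statistics.stdev(mags)  # sample standard deviation (STABW in a German spreadsheet)
    return [m for m in mags if abs(m - mean) <= sd]

# Invented magnitudes for one night, NOT real EE Cep data.
mags = [10.80, 10.85, 10.88, 10.89, 10.90, 10.91, 10.93, 10.98, 11.05]

rule1 = keep_if_error_bar_reaches_mean(mags, mag_err=0.022)
rule2 = keep_within_one_sd(mags)

print("all points :", len(mags), "mean", round(statistics.mean(mags), 4))
print("rule 1 kept:", len(rule1), "mean", round(statistics.mean(rule1), 4))
print("rule 2 kept:", len(rule2), "mean", round(statistics.mean(rule2), 4))
# Rule 3: the median needs no rejection step at all.
print("median     :", statistics.median(mags))
```

Note that rule 1 can reject every single point when the scatter is much larger than the quoted error bars, because then no error bar may reach the mean.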

 

What do you think? How do you handle your outliers?

regards WBEA 

 

Affiliation
Bundesdeutsche Arbeitsgemeinschaft fur Veranderliche Sterne e.V.(Germany) (BAV)

So far I use 60 s exposures at ISO 400 with my 102/500 mm Skywatcher achromatic refractor.

For photometry I use Muniwin.

I am also trying to stack and merge 3 frames into 1 to improve the S/N, but that does not always give the desired result either (e.g. reducing the scatter).

Affiliation
Variable Stars South (VSS)

Bernhard,

I'm not sure I understand what you are trying to do. It seems to me that the scatter in your data would be expected for DSLR photometry using 60 second exposures at ISO 400 of a star of magnitude 10.8 - 11.0 through a telescope with specifications like yours.

I would have thought it is not valid to consider removing "outliers". If you do so, you would be deliberately reducing variance. But perhaps I have missed your point ...
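
A small simulation can illustrate this concern. It is only a sketch under the assumption of purely Gaussian scatter around a constant magnitude, which real data need not follow; the values 10.9 and 0.06 are chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Simulated night: constant star at mag 10.9 with purely random scatter of 0.06 mag.
# Both numbers are assumptions chosen for illustration.
mags = [random.gauss(10.9, 0.06) for _ in range(1000)]

mean = statistics.mean(mags)
sd_all = statistics.stdev(mags)

# "Remove outliers": keep only the points within one standard deviation of the mean.
kept = [m for m in mags if abs(m - mean) <= sd_all]
sd_kept = statistics.stdev(kept)

print(f"all points : n={len(mags)}, SD={sd_all:.3f}")
print(f"after cut  : n={len(kept)}, SD={sd_kept:.3f}")
```

The simulated data contain no bad points at all, yet the cut discards roughly a third of them and the standard deviation drops to roughly half. The smaller value describes the cut, not the measurements.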

Roy

Affiliation
Bundesdeutsche Arbeitsgemeinschaft fur Veranderliche Sterne e.V.(Germany) (BAV)

Hi Roy,

My worst dataset from

https://www.aavso.org/LCGv2/index.htm?DateFormat=Julian&RequestedBands=&view=api.delim&ident=ee cep&fromjd=2458916&tojd=2458959.981&delimiter=@@@

is this one: ee cep 2458947.5.png

Even though the photometry software says the error of a single measurement is +/- 0.034, the data points range from 10.91 to 11.20, a spread of dmag = 0.29. This strongly contradicts a claimed accuracy of +/- 0.034 mag per data point.
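
One way to put a number on this contradiction is to compare the observed scatter with the quoted per-point error. This is a rough sketch that assumes the star is constant during the night; the magnitudes are invented to span 10.91 to 11.20, and `scatter_ratio` is a hypothetical helper name.

```python
import statistics

def scatter_ratio(mags, quoted_err):
    """Observed sample standard deviation divided by the quoted per-point error.

    A ratio near 1 means the error bars explain the scatter; a ratio well
    above 1 means the error bars are too small or something else adds noise.
    """
    return statistics.stdev(mags) / quoted_err

# Invented values spanning the range given in the post (10.91 to 11.20).
mags = [10.91, 10.96, 11.00, 11.03, 11.05, 11.08, 11.12, 11.20]

print(f"range = {max(mags) - min(mags):.2f} mag")
print(f"SD    = {statistics.stdev(mags):.3f} mag")
print(f"ratio = {scatter_ratio(mags, 0.034):.1f}")
```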

The most plausible overall data point is perhaps a simple mean, like this: ee cep 2458947.5_with means.png. But it has a large error!

So now the mathematical - philosophical discussion is on (-:

a) I cannot say exactly which data points are outliers here (they are so random that there are no distant outliers). So is there a known trend or function that can help here?

b) If I remove outliers, the real value (like the variance) should not be altered, as you said. How can that be achieved?

c) Can I just leave the data as it is and take the simple mean as the correct data point? Or is it better to remove all data points whose error bars do not reach the mean, or which do not lie within one standard deviation of all magnitudes? Would that give me a better error for the overall mean?
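
For question c), here is a minimal sketch of the "leave the data as it is" option: take the simple mean and quote the standard error of the mean, SD / sqrt(N), as its uncertainty. It assumes the star is constant during the night and the scatter is random; the magnitudes are invented and `nightly_point` is a hypothetical name.

```python
import math
import statistics

def nightly_point(mags):
    """Return (mean, sample SD, standard error of the mean) for one night."""
    mean = statistics.mean(mags)
    sd = statistics.stdev(mags)
    sem = sd / math.sqrt(len(mags))  # uncertainty of the mean, not of one point
    return mean, sd, sem

# Invented magnitudes for one night, NOT real EE Cep data.
mags = [10.91, 10.96, 11.00, 11.03, 11.05, 11.08, 11.12, 11.20]
mean, sd, sem = nightly_point(mags)
print(f"mean = {mean:.3f}, SD = {sd:.3f}, standard error = {sem:.3f}")
```

The standard error shrinks as more points are added, so the mean of a noisy night can still be well determined without discarding anything.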

kindly WBEA

Affiliation
American Association of Variable Star Observers (AAVSO)

Hi,

I agree with Roy. In addition, my hypothesis is that a combination of factors is at work: the atmosphere and mechanical problems with the mount. 1. Test the mechanics of the mount. 2. Take an image of stars of various magnitudes at the zenith. With this data you can analyze the causes of the deviations.

P.S. It is not scientific to delete data without good reason; it is necessary to look for a reason.

Affiliation
Bundesdeutsche Arbeitsgemeinschaft fur Veranderliche Sterne e.V.(Germany) (BAV)

Hi Igor,

Currently I cannot reach the zenith because of the balcony roof... But I can shoot stars of various magnitudes around Polaris and compare them.

It is common to delete outliers. (If an aeroplane, a satellite, clouds, or strong atmospheric turbulence at a given time produce false data points, they should be removed...) The question here is: which data points are outliers and which are not?

Affiliation
Variable Stars South (VSS)

Hi Bernhard,

OK, now I can see your data in the context of the light curve. It seems to me that, for your 'worst' dataset for example, you should take a simple average, because an appropriate cadence for this star does not require time series data.

There are two other things. First, I do not regard any of the points in your dataset as 'outliers' - there is simply a scatter of data points. My personal approach is to regard as outliers isolated points separated by a clear gap from the main body of data.
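
The gap idea can be sketched in a few lines of Python. This is one possible reading of it, not a standard recipe: the gap threshold `min_gap` is an assumption that has to be chosen by eye, the function names are hypothetical, and the data are invented.

```python
def split_at_gaps(mags, min_gap):
    """Split sorted magnitudes into groups wherever neighbours differ by more than min_gap."""
    ordered = sorted(mags)  # mags must not be empty
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > min_gap:
            groups.append([cur])    # clear gap: start a new group
        else:
            groups[-1].append(cur)  # still part of the same body of data
    return groups

def gap_outliers(mags, min_gap):
    """Treat the largest group as the main body; every other point is flagged."""
    groups = split_at_gaps(mags, min_gap)
    main = max(groups, key=len)
    return [m for g in groups if g is not main for m in g]

# Invented data: a continuous scatter plus one isolated point at 11.60.
mags = [10.91, 10.96, 11.00, 11.03, 11.05, 11.08, 11.12, 11.20, 11.60]
print(gap_outliers(mags, min_gap=0.15))  # prints [11.6]
```

Without the isolated point the function flags nothing, however wide the scatter is, which matches the distinction between scatter and outliers.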

Second, the method of calculation of the error bars in your data seems to me important. It may be idiosyncratic, but I never use error bars for individual measurements - I prefer to calculate the SD for a dataset.

Finally, I just want to go back to my original response to your question. I have done a lot of DSLR photometry, and from my experience you would expect to see the sort of scatter you are getting, given your equipment and exposure settings. Have you tried longer exposures (say, 3 minutes)? You may find that your scatter is reduced. However, I note that you said you have tried stacking groups of three images prior to measurement.
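
As a rough guide to what combining three exposures can achieve, the sketch below averages measured magnitudes in groups of three. This is a stand-in for stacking frames before measurement, not the same operation, and it assumes independent random noise; the simulated values and the name `bin_means` are hypothetical.

```python
import random
import statistics

def bin_means(values, n):
    """Average consecutive groups of n values; a leftover partial group is dropped."""
    return [statistics.mean(values[i:i + n])
            for i in range(0, len(values) - n + 1, n)]

random.seed(2)  # fixed seed so the run is repeatable

# Simulated single-exposure magnitudes: constant star, random scatter of 0.06 mag (assumed).
single = [random.gauss(10.9, 0.06) for _ in range(600)]
binned = bin_means(single, 3)

sd_single = statistics.stdev(single)
sd_binned = statistics.stdev(binned)
print(f"single exposures: SD = {sd_single:.3f}")
print(f"groups of three : SD = {sd_binned:.3f}")
```

For independent random noise the scatter falls by about the square root of the group size. If real data improve much less than that, part of the scatter is probably not random.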

Roy