
Ensembles and Bias

potterrb
Ensembles and Bias

I'm using VPhot for image analysis, and I really like its capabilities.  That said, using it is not the same as using it properly.  I like the flexibility of being able to fine tune the ensemble, but am concerned about introducing user bias.

My question is -- how do you select the comps to include in your ensemble?  Do you only eliminate those that show up as darker red, indicating they are perhaps outliers in the ensemble?  Do you choose an ensemble that brackets your target's magnitude, similar to visual estimating? Do you perform trial and error to find the ensemble that minimizes error values (I suspect not)? Or do you -- and this is what "feels" right to me -- select the ensemble that results in a value for the check star you've selected that is closest to the known value (in parentheses)?

I want to make sure that I am reporting values that are proper / most scientifically meaningful, and it seems there's a lot of room for introducing user bias into the numbers if a well thought out method is not followed.

Thanks -- Brian (PRV)

WGR
Bias and Ensembles

Hello Brian

Said tongue in cheek---you usually know the result you are looking for, so adjust the ensemble until you get the answer you are looking for!

This is one reason I have never liked ensembles.  For me, the ensemble should be defined the same way for all observers, and then it helps beat down the noise.  That is a good thing.  An ensemble is a way to beat down the noise at the potential expense of the accuracy of the magnitude estimate.

Sorry, been observing all nite and just a bit punchy this morn.  

I agree with Gary

I always try to use all the comparisons that are included in my image.  A "red" measurement from an individual comparison can indicate that something is wrong, but the discrepancy should not on its own be enough to drop it from the ensemble.  I take a second or third close look at any comps that give a discrepant value, and I only drop them from the analysis if I can come up with an independent good reason to believe there is an issue with that comp (e.g., a cosmic ray hit, a problem with the sky annulus that can't be fixed by changing its size, a satellite trailing through it, etc.).

If a comp is consistently (or variably) discrepant, it should be reported to the sequence team.  It is possible in that case you have discovered a new variable or at the very least you will contribute to the quality of the sequences used.

Generally you should never throw away data for capricious reasons.  The assumption should be that a discrepant point is correct until it is clearly proven otherwise.

HQA
ensembles and bias

Hi Brian,

There are (at least) three ways to perform the photometric analysis:

1) classical single comp and check

2) restricted ensemble (hand-selected in some manner)

3) inhomogeneous full ensemble (use every star in your field that has a calibrated magnitude).

Kent Honeycutt, in his classic PASP paper, used #3.  That introduces no explicit bias, though it will be biased in magnitude (more faint stars, which have poorer photometry and are more likely to have systematic errors) and color (in reddened fields, the all-field comps will be red, and in other fields, will typically be G-K colors).
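
The full-ensemble idea can be sketched in a few lines.  This is a simplification of Honeycutt's method (his PASP paper solves a least-squares system for frame zero points and star magnitudes across many frames simultaneously); here, for a single frame, every calibrated star contributes equally to one zero point.  The function name and argument names are hypothetical:

```python
import numpy as np

def full_ensemble_mag(v_inst, comps_inst, comps_cat):
    """Single-frame sketch of a full (inhomogeneous) ensemble:
    every calibrated field star contributes to one zero point.

    v_inst     -- instrumental magnitude of the target
    comps_inst -- instrumental magnitudes of all calibrated field stars
    comps_cat  -- their catalog magnitudes (same order)
    """
    comps_inst = np.asarray(comps_inst, dtype=float)
    comps_cat = np.asarray(comps_cat, dtype=float)
    # Each star gives its own zero point; the ensemble zero point
    # is their mean (an error-weighted mean would be a refinement).
    zp = comps_cat - comps_inst
    return v_inst + zp.mean()
```

The magnitude bias Arne mentions is visible here: if most calibrated field stars are faint or red, the mean zero point is dominated by them.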

I tend to do #2, but perhaps differently than others.  I select more ensemble stars than would normally appear in an AAVSO sequence, and I select them to bracket the parameters of the target star (magnitude, color, spatially surrounding the target, etc.).  I like to pick 20 or so, depending on what is available in a field.  I will then do one step of iteration, discarding those whose resultant magnitude differs from the mean by more than 3 sigma, and only if the number discarded is ~10 percent of the total (that is, 1-3 stars might get discarded).  My sample rarely includes faint stars with poor signal/noise, so the rejection is rarely because of the ensemble star's photometric precision.
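
That single rejection pass could be sketched as follows.  The function name, and the way residuals are passed in (resultant minus catalog magnitude for each comp), are my assumptions, not VPhot's internals; the 3-sigma threshold and ~10 percent cap follow the procedure described above:

```python
import numpy as np

def clip_ensemble(resid, n_sigma=3.0, max_frac=0.10):
    """One rejection pass over ensemble residuals.

    resid -- per-comp residual (resultant minus catalog magnitude).
    Drops comps deviating from the mean by more than n_sigma standard
    deviations, but never more than ~max_frac of the ensemble.
    Returns (kept indices, dropped indices).
    """
    resid = np.asarray(resid, dtype=float)
    dev = np.abs(resid - resid.mean())
    sigma = resid.std(ddof=1)
    worst_first = np.argsort(dev)[::-1]
    max_drop = int(max_frac * len(resid))
    # Only the worst few may go, and only if they truly exceed 3 sigma.
    drop = [i for i in worst_first[:max_drop] if dev[i] > n_sigma * sigma]
    keep = np.setdiff1d(np.arange(len(resid)), drop)
    return keep, drop
```

With a 20-star ensemble this caps rejection at 2 stars per pass, matching the "1-3 stars might get discarded" scale above.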

#2 introduces human bias, especially if you are using an ensemble with only a few stars.  I watch to see if one star is rejected often, and see if there is a reason; I may then remove it from the ensemble, but I note why I removed it.  Rejection is always fraught with possible bias, so be very careful when removing ANY star unless you fully understand your reasoning.


CTX
Classical vs Ensemble

I much prefer what Arne is referring to as the “classical single comp and check” and like Gary I am not a big fan of using ensembles for the very reasons he offers. However, there are situations where I believe the ensemble approach could potentially be a better option. 

“Classical” amateur differential photometry determines the final magnitude of the variable star (V) being studied through the use of a comp star (C), with a known magnitude, and a check star (K), with a known magnitude and looks like this:

V - C = v - c       (provided the known K - C matches the observed k - c)


c = instrumental magnitude of comp star

C = known magnitude value of a comp star (from AAVSO or similar chart)

k  = instrumental magnitude of a check star

K = known magnitude value of a check star (from same chart as “C” star)

v = instrumental magnitude of variable star

V = magnitude of the variable star

Traditionally the observer would choose comp stars where the observed k - c differed from the known K - C by, ideally, no more than ~0.06 magnitude at the maximum, and that were similar in color and close in magnitude to the target (plus or minus 1.5 magnitudes, maybe).
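
The reduction above is simple enough to write out directly.  This is a sketch using Tim's notation (v, c, C, k, K); the function name and the tolerance default are mine, with 0.06 magnitude taken from the traditional rule of thumb just mentioned:

```python
def classical_differential(v, c, C, k, K, tol=0.06):
    """Classical single comp + check reduction.

    v, c, k -- instrumental magnitudes of variable, comp, check
    C, K    -- known (catalog) magnitudes of comp and check
    Returns (V, ok) where V is the derived target magnitude and ok
    is True when the observed k - c agrees with the known K - C
    to within tol magnitudes.
    """
    V = (v - c) + C                        # V - C = v - c
    check_delta = abs((k - c) - (K - C))   # validate against the check star
    return V, check_delta <= tol
```

If `ok` comes back False, the traditional response is to suspect the comp (or the sequence calibration) before trusting V.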

Occasionally we discover that the underlying calibration of the sequence being used will not allow for a reasonable match of k-c to K-C, and/or the target magnitude is distant from the sequence values, or the sequence colors are discrepant from the target's.  When this occurs, an ensemble is statistically the wiser choice.

One problem today is that some software has become so automated that the observer is not only unaware of what an instrumental magnitude is, but the software may not even provide it as an option.  This can be a handicap for anyone wanting to properly match their comp and check stars.

There are pros and cons for any one of the three methods that Arne describes.  In the final analysis the most important objective for the observer is to be consistent from observation to observation with their photometry solution.  For any given target your observations, over time, should only use one of the three available methods and ideally, always with the same check and comp star or ensemble selection.

Per Ardua Ad Astra,

Tim Crawford, CTX

potterrb
The PRV Ensemble Method

Ok, so maybe the title of this post is a bit tongue in cheek, but here is what I've been doing:

  1. Pull in the variable and the AAVSO comp stars
  2. Set the aperture and check all stars to make sure the sky annulus for each is clean and that there are no comp stars with close neighbors (I have seen this, and when I do I eliminate the comp star)
  3. If this is a new target, save a new sequence
  4. Do a quick check of the photometry report to see roughly the magnitude of the target
  5. Select a check star of similar magnitude to the target, and then bracket the target/check with a comp on either side; remove all other comp stars from the calculations
  6. Recheck the photometry report. If comps are not way off, then I'm done.  If one of the comps is dark red, remove it and select next nearest comp on that side of bracket.
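
Step 5 above, the bracketing, could be sketched like this.  The function name and the label-to-magnitude dictionary are hypothetical stand-ins (VPhot's internal representation differs); the logic is just "nearest comp brighter than the target, nearest comp fainter":

```python
def bracket_comps(target_mag, comp_mags):
    """Pick the comp just brighter and the comp just fainter than
    the target's current magnitude.

    target_mag -- rough magnitude of the target from a first pass
    comp_mags  -- dict mapping comp label -> catalog magnitude
    Returns (bright_label, faint_label); either may be None if the
    target falls outside the sequence.
    """
    brighter = {l: m for l, m in comp_mags.items() if m < target_mag}
    fainter = {l: m for l, m in comp_mags.items() if m >= target_mag}
    lo = max(brighter, key=brighter.get) if brighter else None  # closest on the bright side
    hi = min(fainter, key=fainter.get) if fainter else None     # closest on the faint side
    return lo, hi
```

Note that a `None` on either side is exactly the "target magnitude is distant from the sequence values" situation Tim describes, where an ensemble may serve better.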

So essentially I'm doing something between #1 and #2 in Arne's post, where my comps and check are selected based on the current magnitude of the variable. That is, the comps would change as the star's magnitude changes over time across its range.  

As this bracketing approach is not one of the standard options used, do I need to reprocess those variables that I've already submitted, or does this approach in some way make physical/mathematical sense?

Brian (PRV)

WGR
Ensembles and Bias

I can think of two cases, where one does not care too much about the accuracy of the magnitude, but benefits from having low noise.

1.  In exoplanet light curves, you want to beat down the noise; the magnitude of the target star is often not even given in the published paper, as only relative photometry is shown.

2.  On eclipsing binaries, where one is only interested in the time of minimum, beating down the noise may help the algorithm calculating the minimum, while absolute accuracy is not important within a particular time-series run.  However, if you are going to combine data with another observer, then accuracy matters, because an offset between observers will introduce an error into the algorithm.

When trying to post data that matches other observers, as in the AID, accuracy is important, but not to the detriment of the scatter/noise.  As Arne says, "it depends" on what you are doing.  Most photometry is both a science and a trade-off (some folks call this an art).

Clear Skies
