
Ways to improve data mining and management

KTC
Joined: 2010-12-08

See: http://www.kaggle.com/

Their motto: "We’re making data science a sport."

Also

http://www.kaggle.com/about

http://www.kaggle.com/prospect

What can we learn from this organization? How can they help us deal with ever-increasing amounts of data?

Kaggle.com
wel
Joined: 2010-07-26

Hi Tom,

Very interesting find! It would be interesting to see the regression of the number of Kaggle teams versus the reward value.

It is unclear how the AAVSO might engage with this, but the model bears consideration.

Cheers,

Doug

Regression
GJSa
Joined: 2012-06-22

I too was interested in the regression of no. of teams vs. reward. According to the website, 33 competitions have been completed so far with rewards ranging from 0 to $100k. It turns out the slope of no. of teams against reward size isn't significantly different from zero (p = 0.58), and reward explains less than 1% of the variation in team count. I've attached a plot. The gory statistical details are below.

Joel

Call:
lm(formula = N_Teams ~ Reward, data = nonzero_finished)

Residuals:
    Min      1Q  Median      3Q     Max 
-180.53 -144.44  -92.89   44.37  762.35 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2.013e+02  4.365e+01   4.610 6.54e-05 ***
Reward      1.279e-03  2.309e-03   0.554    0.584    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 225.7 on 31 degrees of freedom
Multiple R-squared: 0.0098,	Adjusted R-squared: -0.02214 
F-statistic: 0.3068 on 1 and 31 DF,  p-value: 0.5836 
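
For anyone who wants to sanity-check the summary above without the original data, here is a rough Python sketch (not the R session that produced the output) that re-derives the t value, F-statistic and adjusted R-squared from the reported numbers alone. The variable names are my own, and the only inputs are figures copied from the printout; the "teams per $100k" line is just the fitted slope scaled up, not a claim that the effect is real.

```python
# Figures copied from the lm() summary above (nothing here is new data).
slope = 1.279e-03      # Reward coefficient: teams per dollar of reward
slope_se = 2.309e-03   # Standard error of that coefficient
r_squared = 0.0098     # Multiple R-squared
resid_df = 31          # Residual degrees of freedom
n_predictors = 1       # Reward is the only predictor

# Residual df = n - predictors - 1 (the 1 is the intercept).
n_obs = resid_df + n_predictors + 1

# t value is the estimate divided by its standard error.
t_value = slope / slope_se

# With a single predictor, the F-statistic equals t squared.
f_stat = t_value ** 2

# F can also be recovered from R-squared as a cross-check.
f_from_r2 = (r_squared / n_predictors) / ((1 - r_squared) / resid_df)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 = 1 - (1 - r_squared) * (n_obs - 1) / resid_df

# Fitted slope expressed per $100k of reward (point estimate only).
teams_per_100k = slope * 100_000

print(f"n = {n_obs}")
print(f"t = {t_value:.3f}")
print(f"F = {f_stat:.4f} (from R^2: {f_from_r2:.4f})")
print(f"adjusted R^2 = {adj_r2:.5f}")
print(f"extra teams per $100k = {teams_per_100k:.1f}")
```

The recomputed values agree with the printout to the digits shown, and n comes out at 33, matching the number of completed competitions mentioned above.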

AAVSO 49 Bay State Rd. Cambridge, MA 02138 aavso@aavso.org 617-354-0484