Ways to improve data mining and management
Posted: June 10, 2012 - 4:03pm
Kaggle's motto: We’re making data science a sport.
Also
http://www.kaggle.com/prospect
What can we learn from this organization? How can it help us deal with ever-increasing amounts of data?
Regression
Posted: June 22, 2012 - 8:47am
I too was interested in the regression of number of teams vs. reward. According to the website, 33 competitions have been completed so far, with rewards ranging from $0 to $100k. It turns out the slope of number of teams against reward size isn't significantly different from zero (p = 0.58), and reward explains about 1% of the variance in team count. I've attached a plot. The gory statistical details are below.
Joel
Call:
lm(formula = N_Teams ~ Reward, data = nonzero_finished)

Residuals:
    Min      1Q  Median      3Q     Max
-180.53 -144.44  -92.89   44.37  762.35

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.013e+02  4.365e+01   4.610 6.54e-05 ***
Reward      1.279e-03  2.309e-03   0.554    0.584
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 225.7 on 31 degrees of freedom
Multiple R-squared:  0.0098,  Adjusted R-squared:  -0.02214
F-statistic: 0.3068 on 1 and 31 DF,  p-value: 0.5836
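For anyone who wants to sanity-check the summary without the raw data, here is a rough Python sketch that recomputes the derived statistics from the reported slope, its standard error, and the residual degrees of freedom. The function name `check_summary` and the constants are my own labels, not part of the original analysis, and the identities used (F = t^2, R^2 = F / (F + df)) only hold for a regression with a single predictor, as is the case here.

```python
# Values copied from the Reward row and footer of the lm() summary.
SLOPE = 1.279e-03      # estimated change in team count per dollar of reward
SLOPE_SE = 2.309e-03   # standard error of that estimate
RESIDUAL_DF = 31       # residual degrees of freedom (33 competitions - 2 parameters)


def check_summary(slope, slope_se, residual_df):
    """Recompute summary statistics for a one-predictor linear regression.

    Hypothetical helper for illustration; it assumes exactly one predictor
    plus an intercept, so the number of observations is residual_df + 2.
    """
    t_value = slope / slope_se                 # t statistic for the slope
    f_stat = t_value ** 2                      # with one predictor, F equals t squared
    r_squared = f_stat / (f_stat + residual_df)
    n_obs = residual_df + 2
    adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df
    return {
        "t_value": t_value,
        "f_stat": f_stat,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "n_obs": n_obs,
    }


if __name__ == "__main__":
    for name, value in check_summary(SLOPE, SLOPE_SE, RESIDUAL_DF).items():
        print(f"{name}: {value:.5g}")
```

The recomputed values should agree with the reported t value, F-statistic, and both R-squared figures up to rounding, which is a quick way to confirm the pasted output is internally consistent.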

Hi Tom,
Very interesting find! I would like to see a regression of the number of Kaggle teams against the reward value.
It is unclear how the AAVSO might engage with this, but the model bears consideration.
Cheers,
Doug