Reproducibility Is Overrated

What’s with the popular research papers on Dota 2 and StarCraft?

Yes, OpenAI and DeepMind have done amazing work, but their benchmarks and hardware specs make it intractable for a normal research lab to even think about reproducing the results!

A baseline estimate for either approach is on the order of 5M USD of training time! (This assumes 500 GPUs for roughly 12 months at, say, 4.00 USD per GPU-hour, running 10 hours a day for 250 business days: 500 × 4.00 × 10 × 250 = 5,000,000 USD.)
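For anyone who wants to plug in their own numbers, here is a minimal sketch of that back-of-the-envelope calculation. The function name and parameters are my own invention for illustration, and the inputs are the rough guesses from the estimate above, not figures reported by either lab.

```python
def training_cost_usd(num_gpus, usd_per_gpu_hour, hours_per_day, days):
    """Rough training cost: GPUs x hourly rate x hours per day x days.

    A simple linear estimate; it ignores storage, networking,
    engineering time, and failed runs.
    """
    return num_gpus * usd_per_gpu_hour * hours_per_day * days


# Assumed inputs from the estimate above (guesses, not published figures).
baseline = training_cost_usd(
    num_gpus=500,
    usd_per_gpu_hour=4.00,
    hours_per_day=10,
    days=250,
)

print(f"{baseline:,.0f} USD")  # 5,000,000 USD
```

Every factor scales the total linearly, so doubling the GPU count or the hourly rate doubles the bill.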

What could we do about this? There have been attempts like “MineRL” to create competitions built around resource constraints and to advance RL within those limits. Perhaps a greater focus on subproblems, or on understanding RL itself, would benefit the area as a whole.

At the end of the day - who am I? I don’t have papers in RL and I’m just a PhD student trying to reproduce some of these big papers, and failing spectacularly whilst I’m at it.