OHBM Open Science Room (OSR)

Panel: The future of open tools/technologies

Tuesday, June 22, 2021 17:00-18:00 UTC


Panelists

André Maia Chagas

Lea Waller

Eilidh MacNicol

Christopher Madan

Host

Oscar Esteban

My Questions

Q1. In your 2019 paper introducing your calcSulc toolbox, you used three lifespan datasets and found consistent age-related relationships for each sulcus in two of them (both Western samples), but "markedly weaker" relationships in the East Asian dataset. This finding led you to caution about the WEIRDness of neuroimaging data. If we bring this problem into the context of your very recent paper "Scan Once, Analyze Many", where you note that open data is already driving the development of methods: how great is the risk that, if we base the development of new tools only on WEIRD datasets, these tools will simply amplify existing biases? How can we (or should we) robustify methods development against this particular bias when we cannot access sufficiently diverse data? Can you share thoughts or strategies for a methods developer midway through their PhD to effectively identify insufficient diversity in their test data?

Q2. I found "Scan Once, Analyze Many" a delightful read, very thought-provoking and extremely useful, with Table 1 being a shiny gem indexing many open datasets. You assigned each dataset an Access score (the capital A of Accessible in the FAIR principles for open data). Let's talk about FAIR in general, but the R (Reusability) in particular. Is it common for open data to be released under somewhat restrictive licenses that disallow, e.g., sharing derivatives? What is your position on that? Relatedly, if time permits: what is the picture regarding the level of preprocessing (or how pristine the data are) across the open datasets you gathered in Table 1?