Measuring language online - lots of studies, but once you dig deeper there are gaps. What if an internet user speaks more than one language? You can combine data on languages spoken with data on internet connectivity in the same region to get a proxy, but what about less homogeneous areas where many languages are spoken? Do people care whether internet content is published in their first language? Opinions vary widely depending on culture (e.g. feelings in Germany are very different from those in France).
ideas & reactions: survey users through the browser. ask users to opt in to sharing their browsing history and use that as a data set. run ads in different languages and see which ones get clicked on.
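A minimal sketch of the regional proxy mentioned above, to show where it breaks down. All region names, populations, rates and language shares are invented for illustration. It assumes each person counts under exactly one language and that connectivity is the same for every language group in a region, which are exactly the assumptions that fail for multilingual users and mixed-language areas.

```python
# Hypothetical regional figures, invented for illustration only.
regions = [
    {"name": "Region A", "population": 1_000_000, "internet_rate": 0.80,
     "language_shares": {"lang_x": 0.90, "lang_y": 0.10}},
    {"name": "Region B", "population": 2_000_000, "internet_rate": 0.25,
     "language_shares": {"lang_x": 0.40, "lang_y": 0.60}},
]


def estimate_online_speakers(regions):
    """Estimate online speakers per language as
    population * internet rate * language share, summed over regions.

    Assumes one language per person and uniform connectivity across
    language groups within a region (both are simplifications).
    """
    totals = {}
    for region in regions:
        online = region["population"] * region["internet_rate"]
        for lang, share in region["language_shares"].items():
            totals[lang] = totals.get(lang, 0.0) + online * share
    return totals


totals = estimate_online_speakers(regions)
for lang, count in sorted(totals.items()):
    print(f"{lang}: about {count:,.0f} online speakers")
```

In a region like the hypothetical Region B, where no language dominates, the estimate depends entirely on the uniform-connectivity assumption, so it says little about who is actually online.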
Putting UN data on Wikipedia to improve articles - idea is that this leads to good outcomes (people learn more and can then act on it). But hard to collect data on how much impact this is having. Can count # of people who view a specific page, but not how many read a particular section of that page or what happened afterwards. How to make the case re: impact?
ideas & reactions: Putting out a survey to ask, sharing the challenge openly/vulnerably and asking for people’s input. Ask (lobby?) Wikipedia to gather more data on this and/or survey their users.
Trying to find info re: who owns my data in Canada and whether I can control it. Hard to gather and maintain research focused on this kind of qualitative, complex data.
ideas & reactions: use and support existing crowdsourcing research tools, like Wikidata
Chatbot prototype on Facebook Messenger - challenge of working with clients without a high level of digital literacy, who don’t necessarily know what data they need or want to capture. And the tools available on platforms are often very limited. It’s not until you see the data at the end of the project that you realize you wish you’d captured something else (but can’t go back or don’t have the budget to do that).
Citizen Lab Security Planner - not collecting data from users, but want to link to data/research in the tool. A lot of data & research about security issues is collected by the people who create devices/software to improve security - so need to be skeptical about the findings, as they often drive people back to the product the researchers created. Research methodology can also be questionable. How to find good statistics/info that is vendor agnostic and addresses a more representative, global audience?
Internet shutdowns - growing group of researchers looking to document these and share the findings with activists who fight against them. Internet shutdowns are happening more frequently, are not always reported by media, can happen in very short windows, can be localized to relatively small areas (is it a shutdown, or is the electricity just off?), can block access to some platforms/sites rather than the whole internet, etc. Feels like this should be measurable, but it’s actually very difficult to track worldwide. How do researchers collect/present this data and ensure it’s credible? How quickly can this be done?
ideas & reactions: hard to report an internet shutdown while it’s happening because you don’t have internet access to report it - could you go to satellite network partners and ask them to provide support in these cases? how do you marry this with some of the automated solutions that already exist?
Working to digitize the shipping industry. Clients are often not open to giving enough data. A facial recognition system can be made from just 1 image of a person; using the same approach for documents, to train a template recognition system.
Internet Health Report - Much more data on western countries than on other parts of the world (e.g. lots of research re: online harassment in North America, far less from Sub-Saharan Africa, and not comparable). Research on net neutrality laws - not many countries have laws that are actually called “net neutrality law”. The best research we found covered 66 countries but looked at only 2 continents… considered combining it with another study that looked at more countries, but that one didn’t use the same scoring system so we couldn’t combine them. Didn’t end up using this in the report. One approach taken with other topics was to combine qualitative and quantitative studies (e.g. re: online harassment of journalists, combined quantitative research from The Guardian with a qualitative study).
One great approach is a network! Let’s stay in contact.