My experience in Zooniverse

Zara Abdi
Feb 16, 2024
Figure 1. Author. Untitled. 2023

The process of the project

In the “Maturity of Baby Sounds” project, initiated in 2020, participants listen to individual sound clips that were selected on the basis of loudness alone, so some clips may contain only silence or non-human noise. The participant first decides whether the clip features a person’s voice. If it does, the participant classifies the speaker’s age as Baby (below 3 years), Child (between 3 and 12 years), Teenager (between 13 and 18 years), or Adult (above 18 years). A clip with no human voice is categorized as “Junk.” When a clip contains multiple voices, the participant focuses on the loudest one; if the individual speakers cannot be told apart, the clip is also classified as “Junk.” For clips categorized as “Teenager” or “Adult,” participants are then asked to estimate whether the speaker is male or female. Participants also classify the vocal sounds into specific categories: “canonical” for clear consonant-vowel combinations, “non-canonical” for speech-like vocalizations lacking a distinct consonant-vowel sequence, and separate categories for “Laughing” and “Crying.” Although the clips are brief, participants are encouraged to give their best judgment, because the results are made reliable by the collective input of many participants (Zooniverse 2020).
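The decision flow described above can be sketched in a few lines of Python. This is only an illustrative reading of the task, not the project's actual software: the names `classify` and `majority_label` are hypothetical, the sketch assumes the vocalization type is asked for every clip that contains a voice, and the majority vote is just one simple way in which many participants' answers could be combined.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

AGE_GROUPS = ("Baby", "Child", "Teenager", "Adult")
VOCAL_TYPES = ("Canonical", "Non-canonical", "Laughing", "Crying")


@dataclass(frozen=True)
class Classification:
    age_group: str                    # one of AGE_GROUPS, or "Junk"
    gender: Optional[str] = None      # only recorded for Teenager/Adult
    vocal_type: Optional[str] = None  # one of VOCAL_TYPES


def classify(has_voice, speaker_is_clear,
             age_group=None, gender=None, vocal_type=None):
    """Hypothetical helper that mirrors the steps a participant follows."""
    # No human voice, or voices that cannot be told apart: "Junk".
    if not has_voice or not speaker_is_clear:
        return Classification("Junk")
    if age_group not in AGE_GROUPS:
        raise ValueError(f"unknown age group: {age_group!r}")
    if vocal_type not in VOCAL_TYPES:
        raise ValueError(f"unknown vocalization type: {vocal_type!r}")
    # Gender is only asked for Teenager and Adult speakers.
    asks_gender = age_group in ("Teenager", "Adult")
    return Classification(age_group, gender if asks_gender else None, vocal_type)


def majority_label(labels):
    """Assumed aggregation: the label chosen most often wins."""
    return Counter(labels).most_common(1)[0][0]
```

For example, `classify(True, True, "Baby", vocal_type="Crying")` records a crying baby with no gender, while any clip without a distinguishable voice collapses to “Junk” regardless of the other answers.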

Figure 2. Zooniverse. Untitled. 2024

The implications of the contribution

The research team highlights that the Zooniverse study on baby sounds, carried out by interested users, has revealed consistent speech development in typically developing children across languages and cultures. Children at risk of language impairment exhibited distinct patterns, providing insights for intervention. Infants produce a variety of sounds before speaking, beginning with crying from birth and progressing to babbling, in which they produce sounds that resemble adult speech, with distinct consonants and vowels. The study now needs more volunteers to explore three questions: which shifts in babies’ speech are universal as they grow, how quickly these changes occur in children at risk of language impairment or autism, and when children come to match adults in how often they produce complex speech sounds, including how this timing is influenced by the language being learned. The research also examines how children’s speech development relates to the amount and type of speech they hear (Zooniverse 2020). Developmental Language Disorder (DLD) is among the most common developmental disorders, affecting roughly one in fourteen kindergarten-aged children, and its effects persist into adulthood (NIH 2023). Infants’ early phonetic discrimination abilities influence their capacity to acquire vocabulary, morphology, and syntax as they grow, and a robust correlation exists between early speech perception skills and subsequent language skills (Kuhl et al. 2005, 238). Participation in such research not only provides valuable insights for improving this field and initiating intervention strategies that support the healthy development of future generations, but also gives participants a deeper understanding of the complexities of early language development, and some of them may be encouraged to study it further. Moreover, involvement in this research cultivates collaboration among multidisciplinary teams, promoting the exchange of ideas and driving innovation within psychology and linguistics.
Part of the project aims to determine whether the ratio of complex speech sounds varies across languages, for instance English versus a Pidgin with ten consonants. Projects like this therefore also have implications for linguistics. Crowdsourcing on platforms such as Zooniverse has demonstrated reliability and cost-effectiveness in natural language tasks (Hossain and Zahidul 2015, 9). Here, as people like me classify these sounds, the researchers study speech development across various languages and analyze whether certain speech patterns are universal or vary with linguistic features. This process provides valuable data for cross-linguistic comparison. Comparative analysis of how languages shape speech development can open a door into language typology, especially for endangered languages, where it may offer a way to help preserve them. Another significant implication is technological and methodological advancement. To collect data, the team equips children with recorders throughout the day and employs software to analyze the extensive 10–16-hour recordings, aiming to identify sounds made by the child or by others. They seek to evaluate different recorders and software, assessing how well each processes child speech versus adult speech and determining which tools perform best in this domain. Diverse input from participants of various cultures and languages helps researchers improve software algorithms and device features such as battery life and data storage capacity. Any faults could reduce how effectively the subtle vocalizations of babies are detected among other sounds, and so compromise the reliability and accuracy of the data. Beyond helping to refine recording devices, this demographic diversity supports the generalizability and applicability of the research findings across different linguistic and cultural contexts.

Figure 3. Zooniverse. Untitled. 2024

What I learned from the experience

Crowdsourcing platforms like Zooniverse serve as invaluable educational tools, democratizing access to scientific inquiry, fostering collaborative learning, and benefiting both researchers and participants. While participating in this Zooniverse project, my curiosity was sparked by the concept of ‘crowdsourcing’ itself, which led me to realize how many related models are available. I learned about terms such as open-source software, micro-tasking, and public participation (Hossain and Zahidul 2015), as well as various types of crowdsourcing, their applications, and a couple of platforms within these taxonomies. Understanding the potential benefits of crowdsourcing, I recognized its advantages over other sourcing models. When I was introduced to Zooniverse during the first days of my Digital Humanities course, I wondered whether users have diverse motivations. I found that users’ motivations across Zooniverse projects are multifaceted, ranging from task enjoyment and deep fascination with a project to factors such as a project’s age, size, and domain (Trouille, Lintott and Fortson 2015, 5). I browsed many Zooniverse projects and looked at their tutorials, tasks, interfaces, and numbers of volunteers to see whether there is any meaningful relationship among them. Some research (Luczak-Roesch et al. 2014, 321) suggests that not all projects employ scientific language in the same way: some start with many domain-specific terms, while others use familiar terminology and leave advanced vocabulary to be acquired as users progress. The authors conclude from their analysis that users in the former projects were well trained or already familiar with the subject domain. That may be why another analysis (Cox et al. 2015, 38) reveals a negative correlation between scientific impact and public engagement, which is consistent with my own observations.
Zooniverse’s democratizing nature means that users need not face this barrier of prior domain knowledge, and its user-friendly interface further enhances the experience for novices. This is achieved through dynamic interactions facilitated by features such as the Talk option and engagement with expert users. Through the project itself, I became acquainted with variations in vocalization patterns, some of which showed how babies attempt to mimic the sounds they hear around them, and I explored acoustic features linked to various emotional expressions. Differentiating between canonical and non-canonical vocalizations deepened my understanding of speech-processing mechanisms and of the human brain’s ability to perceive and classify different types of vocalization. Through the external links and the educational tab provided by the project, I became familiar with remarkable projects on the cognitive abilities of babies (Christophe and Cristia 2023) and discovered the Max Planck Institute for Psycholinguistics and its innovative research (Max Planck Institute for Psycholinguistics 2024). I also gained insight into language learning mechanisms (LSCP 2023) and into various aspects of language neurobiology in early development, such as rapid changes in brain structure, the dynamic nature of language neurodevelopment, and the role of the human genome.

Crowdsourced initiatives in the field of Museum Studies

Museums, as inclusive spaces for learning (Hooper-Greenhill 2007), provide people with varied chances to delve into, comprehend, and value a wide array of topics. Today, museum education extends beyond simply presenting information about exhibits; it frequently involves interactive initiatives aimed at fostering inclusivity and intercultural communication, engaging the public, encouraging participation, empowering communities, and nurturing creativity while promoting innovation (Sani 2015, 10). Museums have already tried to democratize access by making their collections available online. During the COVID-19 pandemic this approach demonstrated its effectiveness, with museums standing out as among the first cultural institutions to take this step (NEMO 2020). By employing forms of crowdsourcing such as idea generation or citizen science, museums can adopt a more inclusive and participatory approach to exhibit development and interpretation, ensuring that the perspectives of various people are represented. The skills and methodologies used in crowdsourced initiatives, such as data analysis and community engagement, align with the principles of participatory museum practice and can be highly educational. Crowdsourcing metadata, creating tags, and cross-referencing benefit museums by engaging volunteers to contribute descriptive information, annotations, or corrections, which improves the organization of collections and makes the narratives and objects more discoverable. Those interested may engage in other crowdsourcing activities, such as providing input on exhibits by helping to write labels based on their own interpretation or specialized knowledge, or by representing specific communities. For instance, the LGBTQ+ community can provide insights into the historical significance of artifacts related to queer history, gender identity, and sexual orientation. Their input can help ensure that museum exhibits accurately represent LGBTQ+ people in society.
Many museums and heritage sites around the world hold a wealth of material related to their collections, such as manuscripts, photographs, amulets, epigraphy, and murals, which remains inaccessible to researchers, scholars, and the public because of limited time, resources, staff, and budget. Engaging volunteers to help transcribe and digitize these materials can mitigate these limitations while fostering an experience of co-creation and co-curation. The virtual presence of museums in such projects can also enhance the promotion and protection of heritage and culture. Identifying geographic locations by matching historical maps with present-day locations is another benefit of crowdsourcing: it helps museum scholars and archeologists gain insight into the historical context of sites, enabling them to better understand the evolution of landscapes and the potential locations of archaeological sites. Museums can also crowdsource ideas and insights about collections acquired during colonial periods through gamification, such as Matrix Games. This approach prompts ethical questions about the ownership, provenance, and repatriation of museum objects. It also invites critical examination of how colonized objects are interpreted, which often occurs within the framework of Western culture and perpetuates colonial narratives and stereotypes. Such efforts can not only facilitate the repatriation of unlawfully acquired objects to their countries of origin but also help establish an instructive framework for addressing stolen artifacts and the repercussions of cultural property trafficking.

Figure 4. Author. Untitled. 2024

References

Christophe, Anne, and Alex Cristia. 2023. Baby lab. Accessed February 9, 2024. http://sapience.dec.ens.fr/babylab/recherche.php.

Cox, Joe, Eun Young Oh, Brooke Simmons, Chris Lintott, Karen Masters, Anita Greenhill, Gary Graham, and Kate Holmes. 2015. “Defining and Measuring Success in Online Citizen Science: A Case Study of Zooniverse Projects.” Computing in Science & Engineering 17 (4): 28–41.

Hooper-Greenhill, Eilean. 2007. Museums and Education: Purpose, Pedagogy, Performance. 1st ed. Oxon: Routledge.

Hossain, Mokter, and K. M. Zahidul. 2015. “Crowdsourcing: A Comprehensive Literature Review.” Strategic Outsourcing: An International Journal 8 (1): 2–22.

Kuhl, Patricia K., Barbara T. Conboy, Denise Padden, Tobey Nelson, and Jessica Pruitt. 2005. “Early Speech Perception and Later Language Development: Implications for the ‘Critical Period’.” Language Learning and Development 1 (3–4): 237–264.

LSCP. 2023. Accessed February 9, 2024. https://lscp.dec.ens.fr/en/research/teams-lscp/language-and-its-acquisition.

Luczak-Roesch, Markus, Ramine Tinati, Elena Simperl, Max van Kleek, Nigel Shadbolt, and Robert Simpson. 2014. “Why Won’t Aliens Talk to Us? Content and Community Dynamics in Online Citizen Science.” The Eighth International AAAI Conference on Weblogs and Social Media. Ann Arbor, Michigan: AAAI. 315–324.

Max Planck Institute for Psycholinguistics. 2024. Accessed February 9, 2024. https://www.mpi.nl/research.

NEMO. 2020. May 12. Accessed February 9, 2024. https://www.ne-mo.org/advocacy/our-advocacy-themes/museums-during-covid-19.

NIH. 2023. May 8. Accessed February 8, 2024. https://www.nidcd.nih.gov/health/developmental-language-disorder.

Sani, Margherita. 2015. “Revisiting the Educational Value of Museums: Connecting to Audiences.” Pilsen, Czech Republic: NEMO-The Network of European Museums.

Trouille, Laura, Chris Lintott, and Lucy Fortson. 2015. “From Clicks to Publications: How the Public is Changing the Way We Do Research.” 16th Frank N. Bash Symposium. Austin: University of Texas. 1–8.

Zooniverse. 2020. Accessed February 5, 2024. https://www.zooniverse.org/projects/laac-lscp/maturity-of-baby-sounds/classify.

Zooniverse. 2020. Accessed February 5, 2024. https://www.zooniverse.org/projects/laac-lscp/maturity-of-baby-sounds/about/research.
