Red Hen Lab - GSoC 2022 Ideas
Ideas are listed at the bottom of this page, but read the page thoroughly before you apply.
Note that GSoC no longer requires that contributors be students; allows both medium (~175 hour) projects and long (~350 hour) projects; and is open to extended timelines for project completion (from 12 to 22 weeks; 12 weeks is the standard). Red Hen Lab, unless noted otherwise for a specific project, is willing to consider both medium and long versions of contributions. Red Hen Lab considers 12 weeks to be the default, but, as warranted, is open to discussing other arrangements.
Timeline: https://developers.google.com/open-source/gsoc/timeline . Contributor Application Period opens 4 April 2022. Contributor Application deadline 19 April 2022, 1800 UTC. All final proposals must be submitted directly to Google, not to Red Hen.
Guide for Potential Contributors: https://google.github.io/gsocguides/student/
Developer resources are at https://developers.google.com/open-source/gsoc/resources/marketing
Red Hen Lab works closely with FrameNet Brasil and with vitrivr. Members of each group routinely serve as mentors for GSoC projects run by the other two groups. Feel free to submit similar proposals to two or three of these groups. Red Hen, FrameNet Brasil, and vitrivr will coordinate to decide on best placement.
Red Hen Google Summer of Code 2022
redhenlab@gmail.com
See Guidelines for Red Hen Developers and Guidelines for Red Hen Mentors
How to Apply
The great distinction between a course and a research opportunity is that in a course, the professor lays everything out and gives assignments, start to finish. In research, the senior researchers are expected to do all of that themselves. They cogitate on the possibilities, use their library and networking skills to locate and review the state of the art, make judicious decisions about investment of time and other resources, and chart a path. Usually, the path they choose turns out to be a dead end, but in research, success even some of the time is a great mark of distinction. Research is about doing something that has never been done before. The junior researcher, or student learning to do research, is not expected to do everything that a senior researcher does, but is expected, first, to work continuously to learn how to improve by studying senior researchers, and, second, to explore general research opportunities picked out by senior researchers, review the literature, get a strong sense of the state of the art, and think about how it could be built upon. The junior researcher, having been directed to an area or areas by the senior researchers, is expected to find and read the research articles, explore the possibilities, and propose a tractable piece of work to undertake. The senior researchers then mentor from time to time. They are especially valuable for their experience, which usually gives them a much sharper sense of which path is more likely to be fruitful, but they are nonetheless sometimes surprised by what the junior researchers manage to hit upon. Junior researchers are largely self-organizing, self-starting, self-inspiring. A proposal from a student of the form, “I am highly motivated and know about X, Y, and Z and would love to do something related to topic W. Where do I start?” is not a research proposal and is inappropriate for Red Hen Lab.
Template
Some projects below are marked as requiring that proposals use this LaTeX Template, but everyone should follow its instructions. Here is a text version:
GSOC 2022 Project Proposal for RedHenLab
Write Your Name Here
Date
Summary of the Proposal
Write a brief summary of the proposal. The summary should not exceed 120 words. A single paragraph is best. The summary should include a few lines of background information, the main research question or problem that you want to address, and your methods. The proposal summary should not contain any references or citations. Your entire proposal cannot exceed 2000 words, so choose the words in this section carefully. This word limit excludes any words contained in the tables, figures, and references.
Background
In the background section, write briefly about the main problem, using as many paragraphs, lists, tables, and figures as you need. This background section will typically have three parts:
• What is known about the topic
• What is not known about the topic, and the challenges
• What unknowns and challenges you will address
Cite all relevant references. If you use any part from previous research, you must cite it properly. Proposals assembled by copypasta from papers and websites will be ignored.
Goal and Objectives
Describe the goal(s) of your project and how you will meet those goals. Typically, the way to write this is something like, "The goal of this research is to ...", and then continue with something like, "Goal 1 will be met by achieving the following objectives ...", and so on. The goal is a broad-based statement, and the objectives are very specific, achievable tasks that show how you will achieve the goal you have set out.
Methods
In this section, discuss:
The challenges you will tackle.
The method you choose to tackle those challenges, and why.
Additional resources (datasets, pre-implemented methods, etc.) you will use in this project.
The result of your project. What is the deliverable?
The future of your project after GSoC2022. What do you see as possibilities for future improvements? Would you be willing to mentor others in the future to continue work on your project?
Tentative Timeline
This is the fourth and final section of your proposal. You need to provide a tentative timeline showing the time frame in which you plan to accomplish the goals you set out in the Goal and Objectives section. We recommend using a Gantt chart [1], with your objectives and milestones listed on the y-axis and the weeks of GSoC on the x-axis.
These are the compulsory sections that you will need to include in your proposal. Then submit it to the named mentor for the project or just to redhenlab@gmail.com. If you use this LaTeX Template on Overleaf, you can generate a PDF of your project proposal by selecting the PDF symbol at the top of its window. Save the PDF to your hard drive and send that one PDF copy. Include complete citations and references. For example, we have cited here a secondary analysis of statistical inference in papers published over about 40 years. It was an interesting paper written by Stang et al. [2] and published in 2016. A full citation of the paper appears in the references section.
References
[1] HL Gantt. 1910. Work, wages and profit. Engineering Magazine. New York.
[2] Andreas Stang, Markus Deckert, Charles Poole, and Kenneth J Rothman. 2016. Statistical inference in abstracts of major medical and epidemiology journals 1975-2014: a systematic review. European Journal of Epidemiology, November.
The Great Range of Projects that You Might Design
Red Hen Lab will consider any mature pre-proposal related to the study of multimodal communication. A pre-proposal is not a collaboration between a mentor and a student; rather, the mentor begins to pay attention once a reasonably mature and detailed outline for a pre-proposal is submitted. A mature pre-proposal is one that completes all the Template sections in a thorough and detailed manner. Red Hen lists a few project ideas below; more are listed in the Barnyard of Possible Specific Projects. But you are not limited to these lists. Do not write to ask whether you may propose something not on this list. The answer is, of course! We look forward to your mature and detailed pre-proposals.
Once you have a mature and detailed idea, and a pretty good sketch for the template above, you may send them to Red Hen to learn whether a mentor is interested in your idea and sketch, and then to receive some initial feedback and direction for finishing your pre-proposal. Red Hen mentors are extremely busy and influential people, and typically do not have time to respond to messages that do not include a mature and detailed idea and a pretty good sketch of the template above. Use the Template above to sketch your pre-proposal, print it to pdf, and send it to redhenlab@gmail.com. If a mentor is already listed for a specific project, send it also to that mentor.
The ability to generate a meaningful pre-proposal is a requirement for joining the team; if you require more hand-holding to get going, Red Hen Lab is probably not the right organization for you this year. Red Hen wants to work with you at a high level, and this requires initiative on your part and the ability to orient in a complex environment. It is important that you read the guidelines of the project ideas, and you have a general idea of the project before writing your pre-proposal.
When Red Hen receives your pre-proposal, Red Hen will assess it and attempt to locate a suitable mentor; if Red Hen succeeds, she will get back to you and provide feedback to allow you to develop a fully-fledged proposal to submit to GSoC 2022. Note that your final proposal must be submitted directly to Google, not to redhenlab@gmail.com.
Red Hen is excited to be working with skilled students on advanced projects and looks forward to your pre-proposals.
Know Red Hen Before You Apply
Red Hen Lab is an international cooperative of major researchers in multimodal communication, with mentors spread around the globe. Together, the Red Hen cooperative has crafted this Ideas page, which offers some information about the Red Hen dataset of multimodal communication (see some sample data here and here) and a long list of tasks.
To succeed in your collaboration with Red Hen, the first step is to orient yourself carefully in the relevant material. The Red Hen Lab website that you are currently visiting is voluminous. Please explore it carefully. There are many extensive introductions and tutorials on aspects of Red Hen research. Make sure you have at least an overarching concept of our mission, the nature of our research, our data, and the range of the example tasks Red Hen has provided to guide your imagination. Having contemplated the Red Hen research program on multimodal communication, come up with a task that is suitable for Red Hen and that you might like to embrace or propose. Many illustrative tasks are sketched below. Orient in this landscape, and decide where you want to go.
The second step is to formulate a pre-proposal sketch of 1-3 pages that outlines your project idea. In your proposal, you should spell out in detail what kind of data you need for your input and the broad steps of your process through the summer, including the basic tools you propose to use. Give careful consideration to your input requirements; in some cases, Red Hen will be able to provide annotations for the feature you need, but in other cases successful applicants will craft their own metadata, or work with us to recruit help to generate it. Please use the LaTeX template to write your pre-proposal, and send us the PDF.
Red Hen emphasizes: Red Hen has programs and processes—see, e.g., her Τέχνη Public Site, Red Hen Lab's Learning Environment—for tutoring high-school and college students. But Red Hen Google Summer of Code does not operate at that level. Red Hen GSoC seeks mature students who can think about the entire arc of a project: how to get data, how to make datasets, how to create code that produces an advance in the analysis of multimodal communication, how to put that code into production in a Red Hen pipeline. Red Hen is looking for the 1% of students who can think through the arc of a project that produces something that does not yet exist. Red Hen does not hand-hold through the process, but she can supply elite and superb mentoring that consists of occasional recommendations and guidance to the dedicated and innovative student.
Requirements for Commitment
In all but exceptional cases, recognized as such in advance, your project must be put into production by the end of Google Summer of Code or you will not be passed or paid. Most projects will create a pipeline or contribute to an existing pipeline in the Red Hen central operations. This can mean, e.g., scripting (typically in bash) an automated process for reading input files from Red Hen's data repository, submitting jobs to the CWRU HPC using the Slurm workload manager, running your code, and finally formatting the output to match Red Hen's Data Format. Consider these requirements as opportunities for developing all-round skills and for being proud of having written code that is not only merged but in regular production! Explore the current Red Hen Lab pipelines and think about how your project would work with them.
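For orientation only, here is a minimal Python sketch of the job-submission side of such a pipeline. The data directory, the job script name, and the output field layout are hypothetical placeholders; the actual repository layout, sbatch wrapper, and Red Hen Data Format conventions are documented elsewhere on this site and should be confirmed with your mentor.

import subprocess
from pathlib import Path

# Hypothetical locations: adjust to the actual repository layout and your own sbatch wrapper.
DATA_DIR = Path("/path/to/one/day/of/recordings")
JOB_SCRIPT = "run_my_tagger.sh"   # wrapper that loads the Singularity container and runs your code on one file

def submit_day():
    """Submit one Slurm job per recording for a single day of data."""
    for video in sorted(DATA_DIR.glob("*.mp4")):
        subprocess.run(["sbatch", JOB_SCRIPT, str(video)], check=True)

def write_annotation(out_file, start, end, tag, value):
    """Append one pipe-delimited annotation line.

    The field layout here only illustrates the general shape of Red Hen's
    pipe-delimited metadata lines; match the documented Data Format exactly.
    """
    with open(out_file, "a", encoding="utf-8") as f:
        f.write(f"{start}|{end}|{tag}|{value}\n")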
Tips for working with your mentors
Note that your project will probably need to be implemented inside a Singularity container (see instructions). This makes it portable between Red Hen's high-performance computing clusters. Red Hen has no interest in toy, proof-of-concept systems that run on your laptop or in your user account on a server. Red Hen is dedicated exclusively to pipelines and applications that run on servers anywhere and are portable. Please study Guidelines for Red Hen Developers, including the section on building Singularity containers. You are required to maintain a github account and a blog.
In almost all cases, you will do your work on CWRU HPC, although of course you might first develop code on your device and then transfer it to CWRU HPC. On CWRU HPC, do not try to sudo; do not try to install software. Check for installed software on CWRU HPC using the command
module
e.g.,
module spider singularity
module load gcc
module load python
On CWRU HPC, do not install software into your user account; instead, if it is not already installed on CWRU HPC, install it inside a Singularity container so that it is portable. Red Hen expects that Singularity will be used in most cases. Why Singularity? Here are 4 answers; note especially #2 and #4:
What is so special about Singularity?
While Singularity is a container solution (like many others), Singularity differs in its primary design goals and architecture:
Reproducible software stacks: These must be easily verifiable via checksum or cryptographic signature in such a manner that does not change formats (e.g. splatting a tarball out to disk). By default Singularity uses a container image file which can be checksummed, signed, and thus easily verified and/or validated.
Mobility of compute: Singularity must be able to transfer (and store) containers in a manner that works with standard data mobility tools (rsync, scp, gridftp, http, NFS, etc.) and maintain software and data controls compliance (e.g. HIPAA, nuclear, export, classified, etc.)
Compatibility with complicated architectures: The runtime must be immediately compatible with existing HPC, scientific, compute farm, and even enterprise architectures, any of which may be running legacy kernel versions (including RHEL6 vintage systems) that do not support advanced namespace features (e.g. the user namespace)
Security model: Unlike many other container systems designed to support trusted users running trusted containers, we must support the opposite model of untrusted users running untrusted containers. This changes the security paradigm considerably and increases the breadth of use cases we can support.
A few further tips for rare, outlier cases:
In rare cases, if you feel that some software should be installed by CWRU HPC rather than inside your Singularity container, write to us with an argument and an explanation, and we will consider it.
In rare cases, if you feel that Red Hen should install some software to be shared on gallina but not otherwise available to the CWRU HPC community, explain what you have in mind, and we will consider it.
Remember to study the blogs of other students for tips, and document on your own blogs anything you think would help other students.
More Tips for Working with your Mentors
Rely on your network and take the lead in building it. It is easy to think of GSoC as a coding job, but the sociology of the operation is at least as important. This is your chance to work on the inside of a high-level global collaboratory and see how such a network thrives. Take the lead in developing your community, your network, your resources.
For everyday learning and help with coding, mentors are only a last resort. Students are paid; mentors are not. The mentors are volunteers who see value in giving some of their time to helping the student and the project, but they are extremely busy people, with many responsibilities. They are angels, but appear only when it's actually necessary. GSoC is not a coding collaboration between you and your mentor. Most of the help you will need should come from the other students in your cohort, from your own research on how to pick up skills, from your connecting with people in your network. The mentor is available for high-level guidance on the goal, strategy, and timeline of the project. But if you encounter routine difficulties, your first request for help should not go to your mentor.
All students in Red Hen Lab GSoC will, during the community bonding period, be put through a common package of setup tasks. Instructions will be provided. Students in a year's cohort always find it useful to establish some communication channel, such as Slack, through which they can mentor each other as needed through the accomplishment of the initial setup. After everyone has completed setup, the Org Admins will schedule a meet-and-greet group videoconference. Thereafter, each student will work principally on his or her self-guided project but should continue to rely on the network of students throughout the GSoC period.
Red Hen requires that you document and explain everything in your blog and github, right down to the commands and code and steps needed to accomplish anything in your project. Think of it this way: suppose that, once you have completed GSoC by installing and demonstrating your working pipeline, fully in production, a later student comes along and wants to build on your work. Of course, we would ask you to mentor; and we hope that you will stay active in Red Hen and keep your project and similar projects going. Many Red Hen mentors were once Red Hen students. Your blog and your github must supply everything in a clear way for that student to hit the ground running (except security credentials in Red Hen). What would that student need to understand, know, re-use, imitate? Put that all in your blog and github. Red Hen needs a full post from you at least weekly. Many people—in Red Hen, universities, tech companies, etc.—will be looking at your blog.
Work closely from the beginning with your mentor on installing the production system. Red Hen is not interested in toy or proof-of-concept efforts. Be sure to work with your mentor on a plan for actually installing and testing your production system before the final evaluation.
Background Information
Red Hen Lab participated in Google Summer of Code in 2015, 2016, 2017, 2018, 2019, 2020, and 2021, working with brilliant students and expert mentors from all over the world. Each year, Red Hen has mentored students in developing and deploying cutting-edge techniques of multimodal data mining, search, and visualization, with an emphasis on automatic speech recognition, tagging for natural language, co-speech gesture, paralinguistic elements, facial detection and recognition, and a great variety of behavioral forms used in human communication. With significant contributions from Google Summer of Code students from all over the world, Red Hen has constructed tagging pipelines for text, audio, and video elements. These pipelines are undergoing continuous development, improvement, and extension. Red Hens have excellent access to high-performance computing clusters at UCLA, Case Western Reserve University, and FAU Erlangen; for massive jobs Red Hen Lab has an open invitation to apply for time on NSF's XSEDE network.
Red Hen's largest dataset is the NewsScape Library of International Television News, a collection of more than 600,000 television news programs, initiated by UCLA's Department of Communication, developed in collaboration with Red Hens from around the world, and curated by the UCLA Library, with processing pipelines at UCLA, Case Western Reserve University, and FAU Erlangen in Germany. Red Hen develops and tests tools on this dataset that can be used on a great variety of data—texts, photographs, audio and audiovisual recordings. Red Hen also acquires big data of many kinds in addition to television news, such as photographs of Medieval art, and is open to the acquisition of data needed for particular projects. Red Hen creates tools that are useful for generating a semantic understanding of big data collections of multimodal data, opening them up for scientific study, search, and visualization. See Overview of Research for a description of Red Hen datasets.
In 2015, Red Hen's principal focus was audio analysis; see the Google Summer of Code 2015 Ideas page. Red Hen students created a modular series of audio signal processing tools, including forced alignment, speaker diarization, gender detection, and speaker recognition (see the 2015 reports, extended 2015 collaborations, and github repository). This audio pipeline is currently running on Case Western Reserve University's high-performance computing cluster, which gives Red Hen the computational power to process the hundreds of thousands of recordings in the Red Hen dataset. With the help of GSoC students and a host of other participants, the organization continues to enhance and extend the functionality of this pipeline. Red Hen is always open to new proposals for high-level audio analysis.
In 2016, Red Hen's principal focus was deep learning techniques in computer vision; see the Google Summer of Code 2016 Ideas page and Red Hen Lab page on the Google Summer of Code 2016 site. Talented Red Hen students, assisted by Red Hen mentors, developed an integrated workflow for locating, characterizing, and identifying elements of co-speech gestures, including facial expressions, in Red Hen's massive datasets, this time examining not only television news but also ancient statues; see the Red Hen Reports from Google Summer of Code 2016 and code repository. This computer vision pipeline is also deployed on CWRU's HPC in Cleveland, Ohio, and was demonstrated at Red Hen's 2017 International Conference on Multimodal Communication. Red Hen is planning a number of future conferences and training institutes. Red Hen GSoC students from previous years typically continue to work with Red Hen to improve the speed, accuracy, and scope of these modules, including recent advances in pose estimation.
In 2017, Red Hen invited proposals from students for components for a unified multimodal processing pipeline, whose purpose is to extract information about human communicative behavior from text, audio, and video. Students developed audio signal analysis tools, extended the Deep Speech project with Audio-Visual Speech Recognition, engineered a large-scale speaker recognition system, made progress on laughter detection, and developed Multimodal Emotion Detection in videos. Focusing on text input, students developed techniques for show segmentation, neural network models for studying news framing, and controversy and sentiment detection and analysis tools (see Google Summer of Code 2017 Reports). Rapid development in convolutional and recurrent neural networks is opening up the field of multimodal analysis to a slew of new communicative phenomena, and Red Hen is in the vanguard.
In 2018, Red Hen GSoC students created Chinese and Arabic ASR (speech-to-text) pipelines, a fabulous rapid annotator, a multi-language translation system, and multiple computer vision projects. The Chinese pipeline was implemented as a Singularity container on the Case HPC, built with a recipe on Singularity Hub, and put into production ingesting daily news recordings from our new Center for Cognitive Science at Hunan Normal University in Hunan Province in China, directed by Red Hen Lab Co-Director Mark Turner. It represents the model Red Hen expects current projects to follow.
In 2019, Red Hen Lab GSoC students made significant contributions adding speech-to-text and OCR for Arabic, Bengali, Chinese, German, Hindi, Russian, and Urdu. We built a new global recording monitoring system, developed a show-splitting system for ingesting digitized news shows, and made significant improvements to the Rapid Annotator. For an overview with links to the code repositories, see Red Hen Lab's GSoC 2019 Projects.
Red Hen's themes for 2020 can be found here.
Red Hen's themes for 2021 can be found here.
In large part thanks to Google Summer of Code, Red Hen Lab has been able to create a global open-source community devoted to computational approaches to parsing, understanding, and modeling human multimodal communication. With continued support from Google, Red Hen will continue to bring top contributors from around the world into the open-source community.
What kind of Red Hen are you?
More About Red Hen
Our mentors
[Mentor portrait gallery, including Frankie Robertson (GSoC student 2020), Wenyue Xu (GSoC student 2020), and Nitesh Mahawar, with affiliations including EPFL & Bibliotheca Hertziana, Smith College, NSIT (Delhi University), the University of Basel, Uni-Leipzig, and the Federal University of Juiz de Fora.]
The profiles of mentors not included in the portrait gallery are linked to their name below.
More guidelines for project ideas
Your project should be in the general area of multimodal communication, whether it involves tagging, parsing, analyzing, searching, or visualizing. Red Hen is particularly interested in proposals that make a contribution to integrative cross-modal feature detection tasks. These are tasks that exploit two or even three different modalities, such as text and audio or audio and video, to achieve higher-level semantic interpretations or greater accuracy. You could work on one or more of these modalities. Red Hen invites you to develop your own proposals in this broad and exciting field.
Red Hen studies all aspects of human multimodal communication, such as the relation between verbal constructions and facial expressions, gestures, and auditory expressions. Examples of concrete proposals are listed below, but Red Hen wants to hear your ideas! What do you want to do? What is possible? You might focus on a very specific type of gesture, or facial expression, or sound pattern, or linguistic construction; you might train a classifier using machine learning, and use that classifier to identify the population of this feature in a large dataset. Red Hen aims to annotate her entire dataset, so your application should include methods of locating as well as characterizing the feature or behavior you are targeting. Contact Red Hen for access to existing lists of features and sample clips. Red Hen will work with you to generate the training set you need, but note that your project proposal might need to include time for developing the training set.
Red Hen develops a multi-level set of tools as part of an integrated research workflow, and invites proposals at all levels. Red Hen is excited to be working with the Media Ecology Project to extend the Semantic Annotation Tool, making it more precise in tracking moving objects. The "Red Hen Rapid Annotator" is also ready for improvements. Red Hen is open to proposals that focus on a particular communicative behavior, examining a range of communicative strategies utilized within that particular topic. See for instance the ideas "Tools for Transformation" and "Multimodal rhetoric of climate change". Several new deep learning projects are on the menu, from "Hindi ASR" to "Gesture Detection and Recognition". On the search engine front, Red Hen also has several candidates, from the "Development of a Query Interface for Parsed Data" to "Multimodal CQPweb". Red Hen welcomes visualization proposals; see for instance the "Semantic Art from Big Data" idea below.
Red Hen is now capturing television in China, Egypt, and India and is happy to provide shared datasets and joint mentoring with our partners CCExtractor, who provide the vital tools for text extraction in several television standards and for on-screen text detection and extraction.
When you plan your proposal, bear in mind that your project should result in a production pipeline. For Red Hen, that means it finds its place within the integrated research workflow. The application will typically be required to be located within a Singularity module that is installed on Red Hen's high-performance computing clusters, fully tested, with clear instructions, and fully deployed to process a massive dataset. The architecture of your project should be designed so that it is clear and understandable for coders who come after you, and fully documented, so that you and others can continue to make incremental improvements. Your module should be accompanied by a Python application programming interface (API) that specifies the input and output, to facilitate the development of a unified multimodal processing pipeline for extracting information from text, audio, and video. Red Hen prefers projects that use C/C++ and Python and run on Linux. For some of the ideas listed, but by no means all, it is useful to have prior experience with deep learning tools.
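As a purely illustrative sketch of what such an API might look like (the names and fields below are hypothetical and should be agreed with your mentor, not taken as Red Hen's specification):

from dataclasses import dataclass
from typing import Iterator

@dataclass
class Annotation:
    start: float      # offset in seconds from the start of the recording
    end: float
    tag: str          # a primary tag naming the detected feature
    attributes: dict  # additional key/value fields describing the feature

def process(input_path: str) -> Iterator[Annotation]:
    """Read one recording (text, audio, or video) and yield annotations.

    Hypothetical signature only: the point is that input and output are
    specified explicitly so the next stage of the pipeline can consume them.
    """
    raise NotImplementedError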
Your project should be scaled to the appropriate level of ambition, so that at the end of the summer you have a working product. Be realistic and honest with yourself about what you think you will be able to accomplish in the course of the summer. Provide a detailed list of the steps you believe are needed, the tools you propose to use, and a weekly schedule of milestones. Choose a task you care about, in an area where you want to grow. The most important thing is that you are passionate about what you are going to work on with us. Red Hen looks forward to welcoming you to the team!
Ideas for Projects
Red Hen strongly emphasizes that a student should not browse the following ideas without first having read the text above them on this page. Red Hen remains interested in proposals for any of the activities listed throughout this website (http://redhenlab.org).
See especially the
Barnyard of Possible Specific Projects
Red Hen is uninterested in a pre-proposal that merely picks out one of the following ideas and expresses an interest. Red Hen looks instead for an intellectual engagement with the project of developing open-source code that will be put into production in our working pipelines to further the data science of multimodal communication. What is your full idea? Why is it worthy? Why are you interested in it? What is the arc of its execution? What data will you acquire, and where? How will you succeed?
Please read the instructions on how to apply carefully before applying for any project. Failure to follow the application guidelines will result in your (pre-)proposal not being considered for GSoC 2022.
1. Red Hen Anonymizer
Build on and develop the existing Red Hen Anonymizer. See
https://yashkhasbage25.github.io/AnonymizingAudioVisualData/
https://github.com/yashkhasbage25/AnonymizingAudioVisualData
There are many new ideas for elaboration of RHA. For example, does StarGAN's use of Generative Adversarial Networks add functionality? See https://arxiv.org/pdf/1711.09020v3.pdf. Contact turner@case.edu to discuss details or ask for clarification. This is a task that can take 12 weeks and 175 hours of work, although a more sophisticated proposal with stretch goals could be considered for 350 hours of work. Difficulty rating: easy to medium.
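For orientation only, here is a minimal sketch of the kind of face obfuscation an anonymizer performs, using OpenCV's stock Haar cascade rather than the RHA's actual approach; the file names are placeholders.

import cv2

# Detect faces with OpenCV's bundled Haar cascade and blur them.
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("frame.png")                       # placeholder input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    img[y:y+h, x:x+w] = cv2.GaussianBlur(img[y:y+h, x:x+w], (51, 51), 0)
cv2.imwrite("frame_blurred.png", img)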
2. AI Frame Blend Nominator
Build on and develop the existing Frame Blend Nomination Tool. Mentored by Wenyue Xi and Mark Turner. The purpose of this project is to extend the highly successful work done by Wenyue Xi during Google Summer of Code 2020, for which Mark Turner was her mentor. Study Wenyue Xi's blog and github page at Red Hen Lab GSoC 2020 Projects.
Contact turner@case.edu to discuss details or ask for clarification.
Red Hen already has a frame tagging system for English that exploits FrameNet; for details, see Tagging for Conceptual Frames. Red Hen Lab works closely with FrameNet Brasil, another Google Summer of Code organization, and is eager to involve other languages in her tagging of conceptual frames. Conceptual blending of frames is a major area of research in cognitive science and cognitive linguistics. Can we develop a system that locates such frame blends in language and images? Wenyue Xi's Frame Blend Nomination System does just that. Study http://redhenlab.org to familiarize yourself with the Red Hen data holdings and other existing tools before submitting a pre-proposal for this project.
Long project (350 hours); difficulty: hard.
3. Émile Mâle Pipeline
Contact turner@case.edu to discuss details or ask for clarification. Depending on the level of ambition, this is a task for 12 weeks and either 175 or 350 hours of work. Difficulty level: medium.
4. Red Hen Hatcher - Installation, Configuration and Management of a Pi Station
Currently, Red Hen has more than a dozen remote Raspberry Pi capture stations around the world that provide media in many different languages and from many different cultures, so that it can be used by researchers from around the globe.
To help further increase this number, new capture stations should be deployed by volunteers who are willing to contribute but want to spend as little time as possible on the technical aspects of setting up and configuring a capture station and on managing its daily operation.
This project proposes to develop a desktop application, Red Hen Hatcher, that is able to prepare an SD card for use in a Raspberry Pi capture station. It should provide a wizard-style set of dialogs that walk through the process, helping the user make informed decisions. After collecting this information, the Red Hen Hatcher application automatically generates the necessary scripts and configuration files to prepare and set up the capture station.
Once the capture station is up and running, the Red Hen Hatcher application should be able to access a central Red Hen repository to obtain updates, and to access the capture station in order to apply those updates and manage it. The management page should look like a dashboard from which several tasks can be viewed and acted upon: backup generation, visualization of the SD card's and hard drive's free space, internet connectivity status, low-voltage warnings, SD card health checks, the timetable of the capture channels (with editing), and the captured files that have not yet been uploaded to the Red Hen Lab central server.
The Red Hen Hatcher application must be developed using only open-source software. It is suggested that it be programmed in Python or Java.
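By way of illustration, two of the dashboard checks could be as simple as the following Python sketch; the vcgencmd parsing is an assumption about the Raspberry Pi firmware tool and should be verified on the target OS image.

import shutil
import subprocess

def free_space_gb(path="/"):
    """Free space on the filesystem holding `path`, in gigabytes."""
    return shutil.disk_usage(path).free / 1e9

def under_voltage():
    """True if the Pi firmware currently reports under-voltage (bit 0 of get_throttled)."""
    out = subprocess.run(["vcgencmd", "get_throttled"],
                         capture_output=True, text=True).stdout
    value = int(out.strip().split("=")[1], 16)
    return bool(value & 0x1)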
This is a task that should be completed in 12 weeks and around 175 hours of work. Difficulty level: easy
Contact jozefonseca@gmail.com to discuss details or ask for clarification.
5. Develop a system for manual joint annotation
While tools like OpenPose can help annotate joint key points in images or videos where the people are fully observable, there are many cases where these tools perform poorly due to occlusion. The problem of pose estimation with partial observation requires data with manual joint annotation. Red Hen is interested in developing a system that can help with this process.
Ideally, this tool should allow human annotators to select certain frames from a video and click on certain positions of each frame to annotate the joint key points. Some possible key points are Left Ankle, Left Knee, Left Hip, Left Wrist, Left Elbow, Left Shoulder, Left Ear, Left Eye, Right Ankle, Right Knee, Right Hip, Right Wrist, Right Elbow, Right Shoulder, Right Ear, Right Eye, Nose, Top Head, Neck. The annotator should also have the option to create new key point names. To improve the accuracy of this system, there should be both an option to enter the exact pixel position for a key point and an option to select and drag existing key points on the frame to adjust their positions.
To make the annotation process faster, when the annotator moves to a subsequent frame of an already annotated frame, there should be an option to display the previous positions of the key points so that they can be dragged to the correct positions in the new frame. Alternatively, estimated positions of these joint key points can be calculated and displayed using interpolation with the key points of previous or following annotated frames. If possible, the tool can also run OpenPose (or another tool that can estimate joint key points) on a frame and display the results so that the annotator can just drag the labels to the correct positions to finish the annotation of that frame.
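As a sketch of the interpolation option described above (the keypoint dictionaries and frame indices are hypothetical names):

def interpolate_keypoints(kp_a, kp_b, frame_a, frame_b, frame_t):
    """Linearly interpolate annotated keypoints between two labeled frames.

    kp_a and kp_b map keypoint names (e.g. "Left Wrist") to (x, y) pixel
    positions at frames frame_a and frame_b; returns estimates for frame_t.
    """
    alpha = (frame_t - frame_a) / (frame_b - frame_a)
    return {
        name: ((1 - alpha) * kp_a[name][0] + alpha * kp_b[name][0],
               (1 - alpha) * kp_a[name][1] + alpha * kp_b[name][1])
        for name in kp_a if name in kp_b
    }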
This functionality needs to be integrated into Red Hen Rapid Annotator. Follow this link; view the talk; and see below. One possible starting point is the VGG Image Annotator. Languages needed include JavaScript and Python. Long project (350 hours); difficulty: medium to hard.
6. Feature tracking based on moving geometric targets
The Semantic Annotation Tool (https://github.com/novomancy/waldorf-scalar/) is a web-based interface for adding time-based annotations with geometric targets to videos. Currently, SAT defines an annotation’s geometric target using a starting and ending set of vertices and linearly tweens the target between those two sets of points over the duration of the annotation as an SVG animation. However, motion in the underlying film is often not linear, so the geometric target does not follow the features it is intended to show if the motion is erratic or changes direction. This project seeks to extend the SAT and related toolsets to automatically move a geometric target by tracking the underlying feature it annotates based on analysis of the image data inside the start/end vertices, generating a description for an SVG animation of the geometric target that tracks the feature, and updating SAT to be able to overlay the new annotation on a video. How tracking is implemented is up to you, so long as it can be integrated with the Red Hen and SAT pipelines; the SAT itself is built using jQuery and npm/grunt tooling. (350 hours, medium difficulty)
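One possible (not prescribed) way to track the annotated region is sparse optical flow on the target's vertices; a minimal OpenCV sketch follows, with the video path and starting vertices as placeholders.

import cv2
import numpy as np

def track_vertices(video_path, start_vertices):
    """Track a polygon's vertices frame by frame with Lucas-Kanade optical flow.

    Returns one list of (x, y) vertices per frame, which could then be turned
    into the keyframes of an SVG animation for the SAT overlay.
    """
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pts = np.array(start_vertices, dtype=np.float32).reshape(-1, 1, 2)
    path = [pts.reshape(-1, 2).tolist()]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        path.append(pts.reshape(-1, 2).tolist())
        prev_gray = gray
    return path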
7. Machine detection of film edits
Students in film school study textbook types of video cuts. See, for example, The Cutting Edge. Red Hen seeks proposals for code that would automatically tag data for such standard film cuts. Study the work done by Shreyan Ganguly in Red Hen GSoC 2021. Red Hen wants to extend this tool to improve efficiency and resiliency across different quality media (e.g., lower resolution video files and videos that feature damaged media). For inspiration, see https://filmcolors.org/ and http://mediaecology.dartmouth.edu/wp/ and http://www.bmva.org/bmvc/2001/papers/111/accepted_111.pdf. See also http://www.cinemetrics.lv/. (Either 175 or 350 hours, medium difficulty)
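By way of orientation only (this is a generic baseline, not Shreyan Ganguly's method), hard cuts can be flagged by comparing color histograms of consecutive frames:

import cv2

def detect_hard_cuts(video_path, threshold=0.5):
    """Return frame indices where the HSV histogram correlation drops sharply."""
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None and cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            cuts.append(idx)
        prev_hist, idx = hist, idx + 1
    return cuts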
8. GUI for OpenPose, PRAAT, Gentle, RapidAnnotator
Red Hen is seeking to build a graphical user interface (GUI) that makes computer vision, speech recognition, and manual annotation tools more accessible to its users. The tools to be integrated are currently, in this order of priority: OpenPose, PRAAT, Gentle (forced aligner), RapidAnnotator. The GUI will make the functionalities of these tools readily available and their output amenable to statistical treatment with R.
These are the three objectives of this project, graded from lower to higher difficulty:
1. Use shiny to create an interface for the related R functions.
2. Integrate R and bash to automate data generation and processing.
3. Create a web interface from which the different tools and functions can be run on a remote server, producing output in CSV format, ready to be modeled.
Smaller projects addressing one or more of the tools mentioned are also acceptable, in that order of priority. The resulting interface is expected to help us optimize the use of these tools, also allowing us to adapt the computing capacities of each user to the developing needs of ongoing research projects.
Mentors: Cristóbal Pagán Cánovas, Brian Herreño Jiménez, Daniel Alcaraz Carrión, Inés Olza, and Javier Valenzuela.
Difficulty: easy. The anticipated duration for this project is 12 weeks, with a medium workload (175 hours). A longer duration and/or workload could be possible (up to 22 weeks and 350 hours), if adequately justified.
9. Gesture temporal detection pipeline for news videos. Must use the LaTeX Template.
Mentored by Claire Walzer <claire.walzer@unibas.ch> and Mahnaz Parian-Scherb <mahnaz.parian-scherb@unibas.ch>
Red Hen invites proposals to build a gesture temporal detection pipeline. For gesture detection, a good starting point is OpenPose, and a useful extension is hand keypoint detection. Our dataset is around 600,000 hours of television news recordings in multiple languages, so the challenge is to obtain good recall rates with this particular content.
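As a rough illustration of one baseline (not the required approach), OpenPose's per-frame JSON output can be scanned for wrist movement; the keypoint indices below assume the BODY_25 model and should be checked against the OpenPose version used.

import json
from pathlib import Path

R_WRIST = 4   # BODY_25 index (assumption; verify for your OpenPose model)

def wrist_positions(json_dir):
    """Yield (x, y) of the first person's right wrist for each OpenPose JSON frame."""
    for f in sorted(Path(json_dir).glob("*_keypoints.json")):
        people = json.loads(f.read_text()).get("people", [])
        if people:
            kp = people[0]["pose_keypoints_2d"]      # flat [x0, y0, c0, x1, y1, c1, ...]
            yield kp[3 * R_WRIST], kp[3 * R_WRIST + 1]

def has_gesture(json_dir, min_displacement=40.0):
    """Crude presence/absence test: does the wrist move more than a pixel threshold?"""
    positions = list(wrist_positions(json_dir))
    if len(positions) < 2:
        return False
    xs, ys = zip(*positions)
    return (max(xs) - min(xs)) + (max(ys) - min(ys)) > min_displacement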
For the GSoC gesture project, Red Hen has the following goals:
Build a system inside a Singularity container for deployment on high-performance computing clusters (see instructions)
Reliably detect the presence or absence of hand gestures
A good command of python and deep learning libraries (Tensorflow/caffe/Keras) is necessary. Please see here for more information regarding proposals.
The anticipated duration for this project is 12 weeks, with a medium workload (175 hours). A longer duration and/or workload could be possible (up to 22 weeks and 350 hours), if adequately justified. Difficulty: medium to hard.
10. Classification of body keypoint trajectories of gesture co-occurring with time expressions
Implement a two-staged classifier, known as a hybrid deep learning framework, to classify gesture trajectories. The classifier is designed upon two well-known neural networks: Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). It will be trained on a dataset extracted from OpenPose data by our MULTIFLOW project at the Daedalus Lab.
We have been developing a large dataset, currently being expanded, of OpenPose-processed videos containing utterances of over 20 different time expressions (“back then,” “from beginning to end,” “earlier/later than…”). The videos are all manually annotated for the linguistic phrase that co-occurs with the gesture, its neighboring words, its semantics (how it distinguishes between different nuances in temporal meaning), and other information. The combination of OpenPose plus manual annotation provides the x-y coordinates for multiple body keypoints, alongside the label for the linguistic time phrase co-occurring with each gesture. Current work mainly focuses on tracing one or multiple independent object trajectories. The goal of this project will be to predict the labels for the linguistic expressions of time. The proposed design addresses this issue by providing enough data embedded in the training set.
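For orientation, a minimal Keras sketch of a hybrid CNN + LSTM classifier over keypoint-trajectory sequences; the sequence length, feature dimension, and class count are placeholders, not the project's actual specification.

import tensorflow as tf

SEQ_LEN, N_FEATURES, N_CLASSES = 120, 36, 20   # placeholder shapes: frames, keypoint coordinates, time expressions

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu",
                           input_shape=(SEQ_LEN, N_FEATURES)),   # local motion features
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.LSTM(64),                                    # temporal dynamics of the trajectory
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),      # one label per time expression
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, ...) once trajectories and labels are prepared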
One possible expansion of this project is to also generate body keypoint trajectories for each linguistic time expression along with true trajectories. Another possible expansion is to integrate the model into a more complex one involving a sequence of PRAAT annotations for this same dataset of time-gesture videos, which we are currently implementing.
Mentors: Cristóbal Pagán Cánovas, Daniel Alcaraz Carrión, Brian Herreño Jiménez, Masoumeh Moradipour-tari, Inés Olza, Mahnaz Parian-Scherb, Claire Walzer, and Javier Valenzuela.
Difficulty: medium. The anticipated duration for this project is 12 weeks, with a medium workload (175 hours). A longer duration and/or workload could be possible (up to 22 weeks and 350 hours), if adequately justified.
11. Tools for improving subtitle/caption quality
Captions play a key role in corpus-based works focused on the study of multimodal communication, where oral language is essential. A good example of this is the NewsScape Library of Television News, a digital collection of around 420,000 television news programs, and Red Hen's main dataset. NewsScape has been recording TV programs since 2004, and stores them along with their associated closed captions. Since the subtitles are force-aligned with the video, it is possible to search for a given verbal pattern and find the exact moment in which the expression was uttered, thus greatly facilitating the study of multimodal aspects. In this regard, Red Hen collaborates closely with CCExtractor.
This project has two main objectives:
1) Implementing and/or improving tools that automatically detect and correct grammatical and spelling errors found in subtitles. Existing spell-checkers have some limitations, especially when working with very large and diverse datasets of textual material, and with languages other than English. In previous projects, we found that these limitations were clearly visible with Spanish captions, where accents play a key role in differentiating meanings. Other problems include the identification of proper nouns, enclitic pronouns, acronyms or abbreviations and punctuation.
2) Increasing the accuracy of automatic captions (content-wise). It is a widely known fact that subtitles and captions often do not match what is actually being said. One of the main aims of Red Hen's researchers who develop corpus-based studies for understanding human communication is to work with spoken language produced in real-world contexts. Indeed, the correspondences between language, gesture, and other multimodal aspects are very important. For this reason, we want to develop tools that take accuracy into account and that are able to provide captions that are as faithful as possible to the actual spoken words, in order to perform precise textual searches.
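One simple way to quantify objective 2, sketched here with Python's standard library only (how the independent transcript is obtained is left open), is to align caption text against an ASR transcript and measure their agreement:

import difflib

def caption_agreement(caption_text, asr_text):
    """Word-level similarity ratio (0-1) between a caption and an ASR transcript.

    A low ratio flags segments whose captions probably diverge from what was said.
    """
    caption_words = caption_text.lower().split()
    asr_words = asr_text.lower().split()
    return difflib.SequenceMatcher(None, caption_words, asr_words).ratio()

# Example: caption_agreement("we will be right back", "we'll be right back after this") -> about 0.55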
Mentors: Rosa Illán Castillo, Cristóbal Pagán Cánovas, Daniel Alcaraz Carrión, Inés Olza, and Javier Valenzuela.
The anticipated duration for this project is 12 weeks, with a medium workload (175 hours). A longer duration and/or workload could be possible (up to 22 weeks and 350 hours), if adequately justified. Difficulty: medium.
12. UI Interface for a Multimodal Pose Retrieval System
Mentored by Claire Walzer <claire.walzer@unibas.ch> and Mahnaz Parian-Scherb <mahnaz.parian-scherb@unibas.ch>
Multimodal systems have already been investigated in computer vision tasks such as Image Captioning, Video Question Answering (VQA), and Image Retrieval. Such systems require an input image along with a textual description, and they learn to fuse the data of both modalities to make them comparable to the underlying database.
The goal of this project is to create the UI for a baseline Multimodal Pose Retrieval System. The interface should correctly receive the user input (text and image) for retrieval, pass it to a multimodal pose retrieval network for predictions, and display the results obtained from the system to the user. This UI will ultimately be tested on part of the NewsScape dataset residing on the Red Hen HPC cluster and will eventually be accessible to the entire Red Hen community.
A very good grasp of Angular and TypeScript is essential for participation in this project. Furthermore, a good command of Python is necessary to implement the communication between the UI and the retrieval system. The student should have sufficient knowledge of machine/deep learning and retrieval systems.
Your proposal should be written using the provided LaTeX template and contain a draft schematic of your proposed UI.
The anticipated duration for this project is 12 weeks, with a medium workload (175 hours). Difficulty: easy to medium.
13. IPTV capture
Red Hen looks to expand her capture to include IPTV. Resources such as BBC iPlayer and Channel 4 in the UK, RTVE in Spain, or the ARD and ZDF Mediathek in Germany need to be channeled into Red Hen's standard scheduling and capture pipelines, including any closed captions or subtitles carried in the broadcast stream. Red Hen routinely uses youtube-dl for such captures. You are asked to (1) extend youtube-dl's capabilities; see, e.g., https://github.com/ytdl-org/youtube-dl/issues/16779#issuecomment-781608403; and (2) create robust automated ingestion of IPTV broadcasts into the Red Hen format. See http://redhenlab.org for specifics on the Red Hen data format.
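As a starting-point illustration of the capture side only (the URL is a placeholder and the option set is a plausible minimum, not Red Hen's production configuration), youtube-dl can be driven from Python as follows:

import youtube_dl

ydl_opts = {
    "outtmpl": "%(id)s.%(ext)s",     # output filename template; Red Hen's naming scheme would go here
    "writesubtitles": True,          # fetch broadcaster-provided subtitles when available
    "subtitleslangs": ["en"],
    "format": "best",
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(["https://example.org/some-iptv-programme"])   # placeholder URL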
This project can be carried out both as a medium (175 hours) and long project (350 hours), depending on the number of supported sites. Difficulty is medium.
Mentors: Francis Steen, Jacek Wózny, Melanie Bell, and Javier Valenzuela.
14. Improving the Visual Recognition of Aztec Hieroglyphs (Decipherment Tool)
Mentors: Jungseock Joo (UCLA), Stephanie Wood, (University of Oregon), Juan José Batalla Rosado (Universidad Complutense)
Summary of the Proposal
The aim is to enlarge the dataset with added iconographic and hieroglyphic examples, varying the angles of the ones we have and adding examples from other manuscript sources (codices). A larger dataset will help us test the accuracy and establish the degree of matching with respect to users’ tests (uploading hieroglyphs for mechanical decipherment). With the expanded and more diversified dataset, we will also establish a protocol of action and behavior between hieroglyphic texts and the previously created Machine Learning prototype. Finally, the results will be linked to potential entries in the Visual Lexicon dictionary and, in turn, the Online Nahuatl Dictionary, where end-users can learn more about the images under study, their visual characteristics, and their linguistic meanings.
Background
Deep Learning techniques are proving useful for deciphering hieroglyphs and establishing relationships between different levels of the graphic speech.
Hieroglyphs, as a cognitive product of specific artists and specific cultures, with their own internal development and range of expression, are more than just a set of pictures.
From our work in 2021 (https://www.redhenlab.org/summer-of-code/red-hen-lab-gsoc-2021-ideas), we obtained and analyzed more than 1800 hieroglyphs and iconographic samples from one manuscript created by two scribes in c. 1541. Obtaining more examples from additional scribes (known as tlacuiloque, in the Aztec language) from similar (but also closed) graphical traditions will show the strengths and weaknesses of the previous tool and just where and how it lacks accuracy.
The results of the previous project from GSoC 2021 helped us see the desirability of connecting visual results of decipherment tests with specific Visual Lexicon dictionary terms (https://aztecglyphs.uoregon.edu/choose-letter), where they are more fully explained. This way, the prototype could become not only predictive, but a functional tool that advances end-user learning and methodology.
Goal and Objectives
The main goal is to test the prototype, expand its dataset, obtain the best possible matching results, and provide extra relevant information for the end-user’s query through Visual Lexicon entries:
Reshape existing images.
Add new content from additional sources.
Do results-oriented tests with the new dataset.
Get a statistical report showing the accuracy level and what kind of improvements would be needed to reach higher levels.
Link the results with the Visual Lexicon entries (which, in turn, links to the Online Nahuatl Dictionary).
Methods
Use image-editing tools to improve and increase the dataset (see the sketch following this list).
Re-label the resulting files if necessary to interrelate them with the Visual Lexicon entries.
Test the Python prototype algorithm and get the results in database form with a final statistical report.
Make small changes if necessary or indicate what future steps should be taken.
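A sketch of that image-editing step, under the assumption that Pillow is the tool used and with placeholder folder names and sizes:

from pathlib import Path
from PIL import Image

SRC = Path("glyphs/original")     # placeholder input folder of cropped glyph images
DST = Path("glyphs/augmented")
DST.mkdir(parents=True, exist_ok=True)

for img_path in SRC.glob("*.png"):
    img = Image.open(img_path)
    for angle in (-15, -5, 5, 15):                     # vary the viewing angle slightly
        rotated = img.rotate(angle, expand=True, fillcolor="white")
        resized = rotated.resize((224, 224))           # a common input size for vision models
        resized.save(DST / f"{img_path.stem}_rot{angle}.png")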
Timeline (we will reshape this into a Gantt chart)
12 weeks, workload of 350 hours; difficulty medium to hard.
Week 1: reshape existing images.
Weeks 2-5: capture and edit images from new sources, cropping and deleting extraneous material surrounding each glyph, rotating and resizing as necessary.
Weeks 6-7: enter new hieroglyphs into the database and export original and new material for use in testing decipherment.
Weeks 8-10: test the Python prototype algorithm using the expanded and improved dataset, make changes as necessary, and repeat until satisfied with results.
Week 11: develop code to link each image in a decipherment results list to the online database of the Visual Lexicon.
Week 12: prepare the report and indicate future steps.
References
Online Nahuatl Dictionary, ed. Stephanie Wood (Eugene, Ore.: Wired Humanities Projects University of Oregon, 2000-present).
Visual Lexicon of Aztec Hieroglyphs, ed. Stephanie Wood (Eugene, Ore.: Wired Humanities Projects University of Oregon, 2020-present). https://aztecglyphs.uoregon.edu/ (accessed 14 February 2022).
Whittaker, G. Deciphering Aztec Hieroglyphs: A Guide to Nahuatl Writing. University of California Press, 2021.
15. Red Hen Rapid Annotator
Mentored by Peter Uhrig, Gulshan Kumar, Vaibhav Gupta
This task is aimed at extending the Red Hen Rapid Annotator, which was re-implemented from scratch as a Python/Flask application during GSoC 2018 and improved in GSoC 2019, 2020 and 2021. Still, there are some feature requests, in particular in experiment configuration and connectivity to other software.
Please familiarize yourself with the project and play around with it.
A good command of Python and HTML5/JavaScript is necessary for this project.
This project is a medium-sized project (175 hours) with an easy difficulty level.
16. CQPweb plugins (and plugin structure)
Red Hen uses open-source software called CQPweb to facilitate linguistic research. However, CQPweb is not yet fully equipped to handle audio and video data, so it needs modifications for our purposes. Your task is to create plugins for audio analysis using the EMU webApp, for better query options (e.g., the ability to search for sounds using IPA symbols), and for additional downloaders for ELAN and Praat files. Where CQPweb's plugin structure cannot cater to our needs, you will submit merge requests to the CQPweb codebase. Proficiency in PHP, JavaScript, and HTML is required.
Mentors: Peter Uhrig, Javier Valenzuela, and others
This project can be a medium-sized (175 hours) or a long (350 hours) project, depending on the number of features to be implemented. This is a medium to hard project.
17. Development of a Query Interface for Parsed Data
Mentored by Peter Uhrig's team
This infrastructure task is to create a new and improved version of a graphical user interface for graph-based search on dependency-annotated data. The new version should have all functionality provided by the prototype plus a set of new features. The back-end is already in place.
Develop current functionality:
add nodes to the query graph
offer choice of dependency relation, PoS/word class based on the configuration in the database (the database is already there)
allow for use of a hierarchy of dependencies (if supported by the grammatical model)
allow for word/lemma search
allow one node to be a "collo-item" (i.e. collocate or collexeme in a collostructional analysis)
color nodes based on a finite list of colors
paginate results
export xls of collo-items
create a JSON object that represents the query to pass it on to the back-end
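To make the last item concrete, the JSON object might take a shape like the following (a hypothetical schema expressed as a Python dict; the real schema must match what the existing back-end expects):

query = {
    "nodes": [
        {"id": "n1", "lemma": "give", "pos": "VERB", "collo": False},
        {"id": "n2", "lemma": None, "pos": "NOUN", "collo": True},   # the collo-item
    ],
    "edges": [
        {"head": "n1", "dependent": "n2", "relation": "dobj"},
    ],
}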
Develop new functionality:
allow for removal of nodes
allow for query graphs that are not trees
allow for specification of the order of the elements
pagination of search results should be possible even if several browser windows or tabs are open.
configurable export to csv for use with R
compatibility with all major Web Browsers (Edge, Firefox, Chrome, Safari)
parse of example sentence can be used as the basis of a query ("query by example")
Steps:
Visit http://www.treebank.info and play around with the development version of the current interface (user: gsoc2022, password: redhen) [taz is a German corpus, the other two are English]
In consultation with Red Hen, decide on a suitable JavaScript Framework, possibly combined with Python/Flask.
Contact Peter Uhrig <peter.uhrig@fau.de> to discuss details or to ask for clarification on any point.
This is a long project (350 hours), requiring some architectural decisions, i.e., we would classify it as a hard project on GSoC's scale (easy/medium/hard).
18. Anomaly and Redundancy Estimation for Newly Created Frames using AI
Mentored by: Arthur Lorenzi (FN-Br | UFJF) | Ely Matos (FN-Br | UFJF) | Tiago Torrent (FN-Br | UFJF) | Mark Turner (Red Hen | CWRU)
General Context:
FN-Br, in cooperation with Red Hen, has developed Lutma, a frame-maker tool that allows people to contribute frames and lexical units to a multilingual data release, to be made available soon as part of Global FrameNet. In the future, when a new frame is created, people in the FrameNet community will be able to suggest edits and revise the new frames. However, if the volume of newly created frames is too large, some sort of automatic identification of potentially anomalous or redundant frames will be needed.
The Idea:
This idea proposes the development of an AI model trained on FrameNet structure data (frame names and definitions, frame element names and definitions, frame-to-frame relations, and lexical units) to detect anomalous and redundant frames in Lutma, flag them, and provide contributors with a report explaining why a newly created frame has been flagged.
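One conceivable ingredient, offered as an assumption rather than the required design, is to embed frame definitions and flag a new frame whose definition is a near-duplicate of an existing one; the model name and threshold below are placeholders.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # any sentence-embedding model would do

def redundancy_candidates(new_definition, existing_definitions, threshold=0.85):
    """Return existing frame definitions whose embedding is very close to the new one."""
    new_vec = model.encode(new_definition, convert_to_tensor=True)
    old_vecs = model.encode(existing_definitions, convert_to_tensor=True)
    scores = util.cos_sim(new_vec, old_vecs)[0]
    return [(existing_definitions[i], float(s))
            for i, s in enumerate(scores) if float(s) >= threshold]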
Why this Idea is Innovative:
The innovation presented by this idea lies in the fact that instead of using annotated data to train an AI model based on FrameNet, it proposes exploiting the richness of the semantic representation provided by FrameNet itself to improve the quality of human-made frames. It also has a training impact on the community of contributors, since the report generated by the system can help identify and correct potential errors in the frame creation process.
Expected proposal type: 350 hours project
Difficulty rating: Hard
Skillset: Python, JavaScript, SQL, and version control system (Git).
19. Measuring Frame Semantic Similarity in Multimodal, Multilingual Corpora
Mentored by: Marcelo Viridiano (FN-Br | UFJF) | Fred Belcavello (FN-Br | UFJF) | Tiago Torrent (FN-Br | UFJF) | Oliver Czulo (Uni-Leipzig) | Zheng-Xin Yong (Brown University) | Debanjana Kar (IBM Research)
General Context:
FN-Br has been annotating the Flickr30k Entities corpus for frames and frame elements. In collaboration with the University of Leipzig, we have also been extending such annotation to the Multi30k corpus, where captions originally produced in English for the Flickr30k images are translated into other languages, including German and Brazilian Portuguese.
The resulting data set will be composed of: (a) 1,000 pictures with bounding boxes identifying elements in them; (b) annotations of each bounding box for frames and frame elements; (c) 2,000 captions in English, two per image, annotated for frames; (d) 2,000 captions for Brazilian Portuguese and 2,000 for German, two per image, currently under annotation for frames.
The Idea:
For this idea, we expect projects focused on extracting semantic similarity (and variation) across (a) communicative modes – image and verbal language – and (b) languages. The task is to implement algorithms for the assessment of semantic similarity between and variation within image descriptions for (1) descriptions in one language, (2) descriptions in one or another language in reference to a gold-standard description and (3) descriptions in different languages. (3) is a stretch goal.
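As a trivially simple baseline for task (1), not a proposed solution, the frame sets evoked by two descriptions of the same image can be compared directly; the frame names in the comment are only illustrative.

def frame_overlap(frames_a, frames_b):
    """Jaccard overlap between the sets of frames evoked by two image descriptions."""
    a, b = set(frames_a), set(frames_b)
    return len(a & b) / len(a | b) if a | b else 1.0

# e.g. frame_overlap({"Motion", "Vehicle"}, {"Motion", "People_by_origin"}) -> 1/3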
Why this Idea is Innovative:
The assessment of semantic similarity in the FrameNet context is not addressed by means of (solely) word embedding models, but by rich, often culturally loaded categories called frames. These categories allow for certain, though abstract and probabilistic predictions of the behavior and the interpretation of the semantics of a linguistic stimulus. On top of this, the network structure of frames models semantic connections between these rich categories. With this project, we want to leverage this information for research into a type of task which can be applied to various scenarios beyond the assessment of image descriptions.
Expected proposal type: 350 hours project
Difficulty rating: Hard
Skillset: Python, JavaScript, SQL, and version control system (Git).
20. Ensemble Jackets
Mentored by: Lilyanne Dorilas.
See https://hd.media.mit.edu/tech-reports/TR-518.pdf, https://www.researchgate.net/publication/2344811_The_Conductor's_Jacket_A_Device_For_Recording_Expressive_Musical_Gestures . This innovation to the "Conductor's Jacket" would serve as an outline/outer "skeleton" of the musician as they play, without obstructing their ability to gesture and play naturally. It would track the mechanical finger patterns of musicians as well as their musical gestures in the context of the chamber group. It would utilize sensors on the upper body (similar to the Jacket), tagging of facial features, arms, legs, and fingers (similar to the tool OpenPose). On a more specific level, there would be sensors/tags on the bow of a string instrument as an extension of the arm's gestures, and some coding or machine-learning conditioning to recognize typical finger patterns for a given instrument (e.g. violin, saxophone, flute), regardless of spacing differences. Scalar and modal patterns (arpeggiation, ascending and descending scales) would be recognized first for the sake of simplicity. The coder would design and write the code.
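As a toy illustration only (not a design for the jacket itself), the scalar and modal patterns to be recognized first could start from simple rules over pitch sequences, assuming MIDI-style pitch numbers as input:

def classify_pitch_pattern(pitches):
    """Toy rule-based labeling of a short MIDI-pitch sequence."""
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    if steps and all(1 <= s <= 2 for s in steps):
        return "ascending scale"
    if steps and all(-2 <= s <= -1 for s in steps):
        return "descending scale"
    if steps and all(abs(s) in (3, 4, 5, 7, 8, 9) for s in steps):
        return "arpeggio"
    return "other"

# classify_pitch_pattern([60, 62, 64, 65, 67]) -> "ascending scale" (C major fragment)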
Expected proposal type: 350 hours project
Difficulty rating: Hard