Speech technology in education research. Can you hear me now?

The primary way students and teachers interact in the classroom is through talking. A teacher poses a question, a student answers, and discussion or argument follows. Back and forth, words are exchanged; ideas are refined and understood.

But unlike words on paper, spoken words disappear as soon as they are expressed. Even if the conversation is recorded, there has been no easy way to analyze each word—let alone the level of collaboration, motivation, and reasoning—outside of laboriously transcribing and coding limited interactions.

What if there were a way to electronically capture, measure, characterize, and understand all the words spoken in the classroom? How would access to that information inform education?

The Concord Consortium and its partners have begun exploring these questions. “Speech technology opens up whole new possibilities for analyzing what’s happening in the classroom,” explains Concord Consortium President and CEO Chad Dorsey. “Speech is the coin of the realm in education. For the most part, the core of teaching and learning has to happen when people are speaking to one another.”

The approaching convergence of speech technology and education has been in view for years. The field may not have reached a total convergence, but recent progress has at least made the impossible seem possible.

To assess the potential of speech technology for education research, the Concord Consortium partnered in 2015 with leaders in spoken language technology research (SRI International and its Speech Technology and Research Laboratory, and the Center for Robust Speech Systems at the University of Texas at Dallas) on a National Science Foundation grant to collate and examine current knowledge about speech recognition and analysis, and to encourage collaborations that can launch the area of spoken language technology for education.

For the past year, the partners have been holding focus groups with education and speech researchers to find out what’s already in place, what their hopes are for the future, and what gaps need to be filled to bridge the two fields. In January, the Concord Consortium and SRI held a webinar, hosted by the Center for Innovative Research in CyberLearning (CIRCL), to share information about the potential for speech technology and education research. The partners have also published a summary of the field as a CIRCL primer. A paper that will provide a broad analysis of speech technology and its use in education is in the works for an educational research journal. Their hope is that bringing this new field out into the open will create “aha” moments that spur new collaborations.

However, the steps needed for a true convergence are many and complex. “There are four or five different stages that involve different kinds of technology that have all been maturing independently over years,” says Dorsey. The speech data has to be captured and turned into an appropriate digital format (no small task), speech has to be distinguished from sound that is not speech, and one speaker has to be distinguished from another. Once all that data has been successfully collected, how do you analyze and make sense of it?
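To give a feel for one of those stages, here is a minimal sketch of separating speech from non-speech sound. It assumes the audio has already been captured as a list of samples scaled between -1 and 1, and it uses a simple loudness threshold as a stand-in for the trained models that research-grade systems rely on. The function names, the frame size, and the threshold are all illustrative assumptions, not part of any tool the partners describe.

```python
import math


def frame_energies(samples, frame_size):
    """Split samples into fixed-size frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return energies


def detect_speech(samples, frame_size=160, threshold=0.1):
    """Flag a frame as speech-like when its RMS energy exceeds the threshold.

    The threshold is an assumed value; a real classroom would need tuning
    or a trained detector to cope with chairs, doors, and hallway noise.
    """
    return [e > threshold for e in frame_energies(samples, frame_size)]


def speech_segments(flags):
    """Merge runs of consecutive speech frames into (first_frame, last_frame) pairs."""
    segments = []
    start = None
    for i, is_speech in enumerate(flags):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(flags) - 1))
    return segments
```

Loudness alone cannot tell a voice from a slammed door, which is part of why this stage, and the later step of telling one speaker from another, remains an active research problem.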

The first step may be getting the education research community to recognize the tremendous unrealized potential of spoken language technologies for collecting word counts and performing keyword analysis, as well as evaluating collaboration, argumentation, teacher questioning, emotions, and social signals. It might also be possible to combine different types of data to create new knowledge. For example, combining data on overlapping speech and speech segments with question detection could yield information on whether a classroom is student-centered.
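As a rough illustration of what combining these data types could look like, the sketch below assumes that earlier stages have already produced transcribed segments labeled with a speaker role and start and end times. The `Segment` structure, the "teacher" and "student" labels, and the rule that a question is any utterance ending in a question mark are all simplifying assumptions made for this example; real question detection would also draw on intonation and wording.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # assumed role label: "teacher" or "student"
    start: float  # seconds from the start of the recording
    end: float
    text: str     # transcript of the segment


def word_counts(segments):
    """Count how often each lowercase word appears across all segments."""
    counts = Counter()
    for seg in segments:
        counts.update(re.findall(r"[a-z']+", seg.text.lower()))
    return counts


def student_talk_share(segments):
    """Fraction of total speaking time that belongs to students."""
    totals = Counter()
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    total = sum(totals.values())
    return totals["student"] / total if total else 0.0


def overlap_seconds(segments):
    """Sum the time in which pairs of segments from different speakers overlap."""
    overlap = 0.0
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if a.speaker != b.speaker:
                overlap += max(0.0, min(a.end, b.end) - max(a.start, b.start))
    return overlap


def questions_by_speaker(segments):
    """Count questions per speaker, naively treating a trailing '?' as a question."""
    counts = Counter()
    for seg in segments:
        if seg.text.strip().endswith("?"):
            counts[seg.speaker] += 1
    return counts
```

A researcher might read a high student talk share, frequent student questions, and a moderate amount of overlapping speech as hints of a student-centered classroom, though any such interpretation would need to be validated against human observation.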

Consumer technologies like Siri and Alexa only scratch the surface of what’s currently available for research-quality engineering applications, but they have focused the public’s attention on speech technology. Dorsey is cautiously optimistic about the future and notes, “Once people realize this really is possible, it drives more research and work in the area.”

The combination of speech technology and education has yet to mature into a fully formed interdisciplinary research field, but work has begun.

“Sometimes pushing big ideas forward takes understanding where the field is now and who the players are and the kinds of alliance needed for something to move from one step to the next,” says Dorsey.

The first step may be simply starting to talk.