The National Science Foundation awards grant to pair an intelligent tutoring system with Geniverse

Games, modeling, and simulation technologies hold great potential for helping students learn science concepts and engage with the practices of science, and these environments often capture meaningful data about student interactions. At the same time, intelligent tutoring systems (ITS) have undergone important advancements in providing support for individual student learning. Their complex statistical user models can identify student difficulties effectively and apply real-time probabilistic approaches to select options for assistance.

The Concord Consortium is proud to announce a four-year $1.5 million grant from the National Science Foundation that will pair Geniverse with robust intelligent tutoring systems to provide real-time classroom support. The new GeniGUIDE—Guiding Understanding via Information from Digital Environments—project will combine a deeply digital environment with an ITS core.

Geniverse is our free, web-based software for high school biology that engages students in exploring heredity and genetics by breeding and studying virtual dragons. Interactive models, powered by real genes, enable students to do simulated experiments that generate realistic and meaningful genetic data, all within an engaging, game-like context.

Geniverse Breeding

Students are introduced to drake traits and inheritance patterns, do experiments, look at data, draw tentative conclusions, and then test these conclusions with more experimentation. (Drakes are a model species that can help solve genetic mysteries in dragons, in much the same way as the mouse is a model species for human genetic disease.)

The GeniGUIDE project will improve student learning of genetics content by using student data from Geniverse. The software will continually monitor individual student actions, taking advantage of ITS capabilities to sense and guide students automatically through problems that have common, easily rectified issues. At the classroom level, it will make use of this same capability to help learners by connecting them to each other. When it identifies a student in need of assistance that transcends basic feedback, the system will connect the student with other peers in the classroom who have recently completed similar challenges, thus cultivating a supportive environment.

At the highest level, the software will leverage the rich data being collected about student actions and the system’s evolving models of student learning to form a valuable real-time resource for teachers. GeniGUIDE will identify students most in need of help at any given time and provide alerts to the teacher. The alerts will include contextual guidance about students’ past difficulties and most recent attempts as well as suggestions for pedagogical strategies most likely to aid individual students as they move forward.

The Concord Consortium and North Carolina State University will research this layered learner guidance system that aids students and informs interactions between student peers and between students and teachers. The project’s theoretical and practical advances promise to offer a deeper understanding of how diagnostic formative data can be used in technology-rich K-12 classrooms. As adaptive student learning environments find broad application in education, GeniGUIDE technologies will serve as an important foundation for the next generation of teacher support systems.

Daily energy analysis in Energy3D

Fig. 1: The analyzed house.
Energy3D already provides a set of powerful tools for analyzing the annual energy performance of a design. For experts, the annual analysis tools are convenient: they can quickly evaluate a design based on the results. For novices who are trying to understand how the energy graphs are calculated (or skeptics who are not sure whether they should trust the results), however, the annual analysis can feel like a black box, because there are too many variables (in this case, the seasonal changes of solar radiation and weather) to deal with at once. The total energy data are the combined result of two astronomical cycles: the daily cycle (caused by the Earth's rotation about its own axis) and the annual cycle (caused by the Earth's orbit around the Sun). This is why novices often have a hard time reasoning about the results.

Fig. 2: Daily light sensor data in four seasons.
To help users reduce one layer of complexity and make sense of the energy data calculated in Energy3D simulations, a new class of daily analysis tools has been added to Energy3D. These tools allow users to pick a day to do the energy analyses, limiting the graphs to the daily cycle.

For example, we can place three sensors on the east, south, and west sides of the house shown in Figure 1, pick four days -- January 1st, April 1st, July 1st, and October 1st -- to represent the four seasons, and then run a simulation for each day to collect the corresponding sensor data. The results, shown in Figure 2, indicate that in the winter the south-facing side receives the highest intensity of solar radiation, compared with the east- and west-facing sides. In the summer, however, it is the east- and west-facing sides that receive the highest intensity of solar radiation. In the spring and fall, the peak intensities of the three sides are comparable, but they occur at different times.

Fig. 3: Daily energy use and production in four seasons.
If you take a closer look at Figure 2, you will notice that, while the radiation intensity on the south-facing side always peaks at noon, the peaks on the east- and west-facing sides shift with the seasons. In the summer, the radiation intensity peaks around 8 am on the east-facing side and around 4 pm on the west-facing side. In the winter, these peaks occur around 9 am and 2 pm, respectively. This difference is due to the shorter day and the lower position of the Sun in the sky in the winter.
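To see where these peak times come from, here is a minimal Python sketch of the underlying geometry (this is not Energy3D's code; the latitude, dates, and clear-sky irradiance are illustrative assumptions). It estimates the direct solar irradiance on vertical east-, south-, and west-facing walls from standard solar-position formulas and reports the hour at which each wall peaks.

```python
# A minimal sketch (not Energy3D's code): direct solar irradiance on vertical
# east-, south-, and west-facing walls, using standard solar-position formulas.
# Latitude, dates, and the constant clear-sky beam irradiance are illustrative.
import numpy as np

LAT = np.radians(42.36)          # assumed latitude (Boston)
DNI = 900.0                      # assumed clear-sky direct normal irradiance, W/m^2
# wall azimuths measured from south, positive toward west
WALLS = {"east": np.radians(-90), "south": np.radians(0), "west": np.radians(90)}

def declination(day_of_year):
    """Solar declination (radians), Cooper's approximation."""
    return np.radians(23.45) * np.sin(2 * np.pi * (284 + day_of_year) / 365)

def wall_irradiance(day_of_year, hour, wall_azimuth):
    """Direct irradiance (W/m^2) on a vertical wall at a given local solar hour."""
    dec = declination(day_of_year)
    h = np.radians(15 * (hour - 12))                   # hour angle
    sin_alt = (np.sin(LAT) * np.sin(dec) +
               np.cos(LAT) * np.cos(dec) * np.cos(h))  # solar altitude
    alt = np.arcsin(np.clip(sin_alt, -1, 1))
    if alt <= 0:
        return 0.0                                     # sun below the horizon
    # solar azimuth measured from south, positive toward west
    az = np.arctan2(np.sin(h),
                    np.cos(h) * np.sin(LAT) - np.tan(dec) * np.cos(LAT))
    # angle of incidence on a vertical wall facing wall_azimuth
    cos_inc = np.cos(alt) * np.cos(az - wall_azimuth)
    return DNI * max(cos_inc, 0.0)

for name, day in [("Jan 1", 1), ("Jul 1", 182)]:
    peaks = {w: max(range(24), key=lambda hr: wall_irradiance(day, hr, a))
             for w, a in WALLS.items()}
    print(name, peaks)   # east peaks in the morning, west in the afternoon
```

Running it shows the east wall peaking in the morning and the west wall in the afternoon, with both peaks moving closer to noon in winter as the day shortens.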

Energy3D also provides a heliodon to visualize the solar path on any given day, which you can use to examine the angle of the sun and the length of the day. If you want to visually evaluate solar radiation on a site, it is best to combine the sensor and the heliodon.

You can also analyze the daily energy use and production. Figure 3 shows the results. Since this house has a lot of south-facing windows with a Solar Heat Gain Coefficient of 80%, solar energy is actually enough to keep the house warm (you may notice that your heater runs less frequently in the middle of a sunny winter day if you have a large south-facing window). The downside is that it also takes a lot of energy to cool the house in the summer. Also note the interesting energy pattern for July 1st -- there are two smaller peaks of solar radiation, one in the morning and one in the afternoon. Why? I will leave that answer to you.
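To get a feel for the magnitude of that solar gain, here is a back-of-the-envelope estimate; the window area and irradiance are made-up illustrative numbers, not values taken from the Figure 1 model.

```python
# Back-of-the-envelope solar gain through south-facing glazing (illustrative numbers).
shgc = 0.8          # Solar Heat Gain Coefficient of the windows
area = 12.0         # assumed total south-facing window area, m^2
irradiance = 600.0  # assumed midday winter irradiance on the glass, W/m^2

solar_gain = shgc * area * irradiance   # roughly 5.8 kW of free heat at midday
print(f"Midday solar gain: {solar_gain / 1000:.1f} kW")
```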

Energy3D in Colombia

Camilo Vieira Mejia, a PhD student at Purdue University, recently brought our Energy3D software to a workshop that is part of Clubes de Ciencia -- an initiative in which graduate students travel to Colombia and share science and engineering concepts with high school students from small towns around Antioquia (a department of Colombia).

Students designed houses with Energy3D, printed them out, assembled them, and put them under the Sun to test their solar gains. They probably also ran the solar and thermal analyses for their virtual houses.

We are glad that our free software is reaching students in these rural areas and helping them become interested in science and engineering. This is one of many examples of how a project funded by the National Science Foundation can also benefit people in other countries and have a positive impact on the world. In this sense, the National Science Foundation is not just a federal agency -- it is a global agency.

If you are also using Energy3D in your country, please consider contacting us and sharing your stories or thoughts.

Energy3D is intended to be global -- it currently includes weather data from 220 locations on all continents. Please let us know if you would like us to add locations in your country to the software so that you can design energy solutions for your own area. As a matter of fact, this was exactly what Camilo asked me to do before he headed for Colombia. I would have had no clue which towns in Colombia should be added or where to retrieve their weather data (which is often published in a foreign language).

[With the kind permission of these participating students, we are able to release the photos in this blog post.]

Geothermal simulation in Energy3D


Fig. 1: Annual air and ground temperatures (daily averages)
A building exchanges heat not only with the outside air but also with the ground. The ground temperature depends on the location and the depth. At a depth of six meters and below, the temperature remains almost constant throughout the year. That constant temperature roughly equals the mean annual air temperature, which depends on the latitude.
Fig. 2: Daily air and ground temperatures on 7/1

The ground temperature has a variation pattern different from that of the air temperature. You may experience this difference when you walk into the basement of a house from the outside in the summer or in the winter at different times of the day.

For our Energy3D CAD software to account for the heat transfer between a building and the ground at any time of the year at the 220 worldwide locations it currently supports, we must develop a physical model for geothermal energy. While there is an abundance of weather data, we found very little ground data (ground data are, understandably, more difficult and expensive to collect). In the absence of real-world data, we have to rely on mathematical modeling.

Fig. 3: Daily air and ground temperatures on 1/1
This goal was accomplished in Version 4.9.3 of Energy3D, which can now simulate heat transfer with the ground. The geothermal model also opens up the possibility of simulating ground source heat pumps, a promising clean energy solution, in Energy3D (which ultimately aims to include various renewable energy sources in its design capacity to support energy engineering).

Exactly how the math works can be found in the User Guide. In this blog post, I will show you some results. Figure 1 shows the daily averages of the air and ground temperatures throughout the year in Boston, MA. There are two notable features of this graph: 1) the deeper we go, the smaller the temperature fluctuation, which essentially vanishes at six meters; and 2) the peaks of the ground temperatures lag behind the peak of the air temperature, due to the heat capacity of the ground (the ground absorbs a lot of thermal energy in the summer and slowly releases it as the air cools in the fall).
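For readers who want a feel for the kind of model involved before opening the User Guide, the classic textbook approximation treats the ground as a semi-infinite solid driven by a sinusoidal annual surface temperature, which yields a damped, phase-lagged wave at depth. The sketch below implements that approximation with illustrative Boston-like numbers; it is not necessarily Energy3D's exact formula.

```python
# A common textbook approximation (not necessarily Energy3D's exact formula):
# ground temperature as a damped, phase-lagged copy of the annual surface wave.
import numpy as np

T_MEAN = 10.0      # assumed annual mean air temperature, deg C (Boston-like)
AMPLITUDE = 12.0   # assumed annual surface temperature amplitude, deg C
ALPHA = 5e-7       # assumed soil thermal diffusivity, m^2/s
OMEGA = 2 * np.pi / (365 * 24 * 3600)        # angular frequency of the annual cycle
DAMPING_DEPTH = np.sqrt(2 * ALPHA / OMEGA)   # a few meters for typical soils

def ground_temperature(depth_m, day_of_year, coldest_day=15):
    """Ground temperature (deg C) at a given depth and day of the year."""
    phase = 2 * np.pi * (day_of_year - coldest_day) / 365
    return (T_MEAN
            - AMPLITUDE * np.exp(-depth_m / DAMPING_DEPTH)
            * np.cos(phase - depth_m / DAMPING_DEPTH))

for z in (0, 1, 3, 6):
    temps = [ground_temperature(z, d) for d in range(365)]
    print(f"depth {z} m: min {min(temps):5.1f}, max {max(temps):5.1f} deg C")
# The swing shrinks with depth and the extremes occur later in the year,
# matching the attenuation and lag visible in Figures 1-3.
```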

Fig. 4: Four snapshots of heat transfer with the ground on a cold day.
In addition to the annual trend, users can also examine the daily fluctuations of the ground temperatures at different depths. Figure 2 shows the results on July 1. There are three notable features of this graph: 1) overall, the ground temperature decreases as we go deeper; 2) the daily fluctuation of the ground temperature decreases as we go deeper; and 3) the peaks of the ground temperatures lag behind the peak of the air temperature. Figure 3 shows the results on January 1 with a similar trend, except that the ground temperatures are higher than the air temperature.

Figure 4 shows four snapshots of the heat transfer between a house and the ground at four different times (12 am, 6 am, 12 pm, and 6 pm) on January 1. The figure shows arrays of heat flux vectors that represent the direction and magnitude of heat flow. To exaggerate the visualization, the R-values of the floor insulation and the windows were deliberately set low. If you observe carefully, you will find that changes in the magnitude of the heat flux vectors into the ground lag behind changes in the heat flux into the air.

The geothermal model also includes parameters that allow users to choose the physical properties of the ground, such as thermal diffusivity. For example, dry land tends to have a smaller thermal diffusivity than wet land. With these properties, geology also becomes a design factor, making the already interdisciplinary Energy3D software even more so.

The National Science Foundation awards grant to study virtual worlds that afford knowledge integration

The Concord Consortium is proud to announce a new project funded by the National Science Foundation, “Towards virtual worlds that afford knowledge integration across project challenges and disciplines.” Principal Investigator Janet Kolodner and Co-PI Amy Pallant will explore how the design of project challenges and the contexts in which they are carried out can support knowledge integration, sustained engagement, and excitement. The goal is to learn how to foster knowledge integration across disciplines when learners encounter and revisit phenomena and processes across several challenges.

Aerial Geography and Air Quality: In this model, students explore the effect of wind direction and geography on air quality as they place up to four smokestacks in the model.

We envision an educational system where learners regularly engage in project-based education within and across disciplines, and in and out of school. We believe that, with such an educational approach, making connections across learning experiences should be possible in new and unexplored ways. If challenges are framed appropriately and their associated figured worlds (real and virtual) and scaffolding are designed to afford it, such education can help learners integrate the content and practices they are learning across projects and across disciplines. “Towards virtual worlds” will help move us towards this vision.

This one-year exploratory project focuses on the possibilities for knowledge integration when middle schoolers who have completed water ecosystem challenges later attempt an air quality challenge. Some students will engage with EcoMUVE, where learners try to understand why the fish in a pond are dying, and others will engage with Living Together from Project-Based Inquiry Science (PBIS), where learners advise about regulations that should be put in place before a new industry is allowed to move into a town. A subset of these students will then encounter specially crafted air quality challenges based on High-Adventure Science activities and models. These, we hope, will evoke reminders of their experiences during the water ecosystem work. We will examine what learners are reminded of, the richness of their memories, and the appeal for learners of applying what they are learning about air quality to better address the earlier water ecology challenge. Research will be carried out in Boston area schools.

Sideview Pollution Control Devices: In this model, students explore the effects of installing pollution control devices, such as scrubbers and catalytic converters, on power plants and cars. Students monitor the levels of primary pollutants (brown line) and secondary pollutants (orange line) in the model over time via the graph.

The project will investigate:

  1. What conditions give rise to intense and sustained emotional engagement?
  2. What is remembered by learners when they have (enthusiastically) engaged with a challenge in a virtual figured world and reflected on it in ways appropriate to learning, and what seems to affect what is remembered?
  3. How does a challenge and/or virtual world need to be configured so that learners notice—while not being overwhelmed by—phenomena not central to the challenge but still important to making connections with content outside the challenge content?

Our exploration will help us understand more about the actual elements in the experiences of learners that lead to different emotional responses and the impacts of such responses on their memory making and desires.

Lessons we learn about conditions under which learners form rich memories and want to go back and improve their earlier solutions to challenges will form some of the foundations informing how to design virtual worlds and project challenges with affordances for supporting knowledge integration across projects and disciplines. Exemplar virtual worlds and associated project challenges will inform design principles for the design and use of a new virtual world genre — one with characteristics that anticipate cross-project and cross-discipline knowledge integration and ready learners for future connection making and knowledge deepening.

Simulating the Hadley Cell using Energy2D

Download the models
Although it is mostly used as an engineering tool, our Energy2D software can also be used to create simple Earth science simulations. This blog post shows some interesting results about the Hadley Cell.

The Hadley Cell is an atmospheric circulation that transports energy and moisture from the equator to higher latitudes in the northern and southern hemispheres. This circulation is intimately related to the trade winds, hurricanes, and the jet streams.

As a simple way to simulate zones of ocean that have different temperatures due to differences in solar heating, I added an array of constant-temperature objects at the bottom of the simulation window. The temperature gradually decreases from 30 °C in the middle to 15 °C at the edges. A rectangle, set to a constant temperature of -20 °C, is used to mimic the high, chilly part of the atmosphere. The viscosity of air is deliberately set much higher than in reality to suppress wild fluctuations and produce a somewhat averaged effect. The results show a stable flow pattern that looks like a cross section of the Hadley Cell, as shown in the first image of this post.

When I increased the buoyant force of the air, an oscillatory pattern was produced. The system swings between two states shown in the second and third images, indicating a periodic reinforcement of hot rising air from the adjacent areas to the center (which is supposed to represent the equator).

Of course, I can't guarantee that the results produced by Energy2D reflect what happens in nature. Geophysical modeling is an extremely complicated business with numerous factors that are not considered in this simple model. Yet Energy2D shows something interesting: the fluctuations of wind speeds seem to suggest that, even without considering the seasonal changes, this nonlinear model already exhibits some kind of periodicity. We know that it is the many kinds of periodicity in Mother Nature that help sustain life on Earth.

Simulating geometric thermal bridges using Energy2D

Fig. 1: IR image of a wall junction (inside) by Stefan Mayer
One of the mysterious things that causes people to scratch their heads when they see an infrared picture of a room is that junctions, such as the edges and corners formed by two exterior walls (or by walls, floors, and roofs), often appear colder in the winter than other parts of the walls, as shown in Figure 1. This is, I hear you saying, caused by an air gap between the two walls. But it is not that simple! While a leaking gap can certainly do it, the effect is there even without a gap. Better insulation only makes the junctions less cold.

Fig. 2: An Energy2D simulation of thermal bridge corners.
A typical explanation of this phenomenon is that, because the exterior surface of a junction (where heat is lost to the outside) is larger than its interior surface (where heat is gained from the inside), the junction ends up losing thermal energy in the winter more quickly than a straight section of the walls, causing it to be colder. The temperature difference is immediately revealed by a very sensitive IR camera. Such a junction is commonly called a geometric thermal bridge, which is different from a material thermal bridge caused by the presence of a more conductive piece in a building assembly, such as a steel stud in a wall or the concrete floor of a balcony.

Fig. 3: IR image of a wall junction (outside) by Stefan Mayer
But the actual heat transfer process is more complicated and confusing. While a wall junction does create a difference between the interior and exterior surface areas of the wall, it also forms a thicker region through which the heat must flow (thicker because the path runs in the diagonal direction). The increased thickness should impede the heat flow, right?

Fig. 4: An Energy2D simulation of an L-shaped wall.
Unclear about the outcome of these competing factors, I made some Energy2D simulations to see if they could help me. Figure 2 shows the first one, which uses a block kept at 20 °C to mimic a warm room, surrounded by an environment at 0 °C, with a four-sided wall in between. Temperature sensors are placed at the corners, as well as at the midpoint of a wall. The results show that, in the steady state, the corners are indeed colder than other parts of the walls. (Note that this simulation only involves heat diffusion, but adding radiative heat transfer should yield similar results.)
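If you would like to poke at the geometry without Energy2D, the following minimal finite-difference sketch (pure conduction, with illustrative sizes and temperatures) sets up the same configuration: a block held at 20 °C, a square wall around it, and 0 °C outside. Its steady-state solution shows the wall corners sitting below the mid-wall temperature.

```python
# A minimal finite-difference sketch (pure conduction, illustrative sizes):
# a warm room (20 C) surrounded by a square wall, with 0 C outside.
# The steady-state temperatures inside the wall show colder corners.
import numpy as np

N = 81       # grid size
ROOM = 25    # room occupies rows/columns ROOM .. N-ROOM-1
WALL = 10    # wall thickness in cells

room = np.zeros((N, N), dtype=bool)
room[ROOM:N - ROOM, ROOM:N - ROOM] = True
wall = np.zeros((N, N), dtype=bool)
wall[ROOM - WALL:N - ROOM + WALL, ROOM - WALL:N - ROOM + WALL] = True
wall &= ~room                      # square annulus between room and outside

T = np.zeros((N, N))
T[room] = 20.0                     # room held at 20 C; everything else starts at 0 C

# Jacobi relaxation toward the steady state (Laplace equation) inside the wall
for _ in range(10000):
    Tn = T.copy()
    Tn[1:-1, 1:-1] = 0.25 * (T[:-2, 1:-1] + T[2:, 1:-1] + T[1:-1, :-2] + T[1:-1, 2:])
    Tn[room] = 20.0                # re-impose the boundary conditions
    Tn[~wall & ~room] = 0.0        # outside air held at 0 C
    T = Tn

mid = N // 2
layer = ROOM - WALL // 2           # a row halfway through the wall thickness
print("mid-wall:", round(T[layer, mid], 2), "C")
print("corner  :", round(T[layer, layer], 2), "C")
# The corner cell sits noticeably below the mid-wall cell, as in Figure 2.
```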

What about more complex shapes like an L-shaped wall that has both convex and concave junctions? Figure 3 shows the IR image of such a wall junction, taken from the outside of a house. In this image, interestingly enough, the convex edge appears to be colder, but the concave edge appears to be warmer!

The Energy2D simulation (Figure 4) shows a pattern similar to the IR image (Figure 3). The simulation results show that the temperature sensor placed near the concave edge outside the L-shaped room does register a higher temperature than the other sensors.

Now, the interesting question is, does the room lose more energy through a concave junction or a convex one? If we look at the IR image of the interior taken inside the house (Figure 1), we would probably say that the convex junction loses more energy. But if we look at the IR image of the exterior taken outside the house (Figure 3), we would probably say that the concave junction loses more energy.

Which statement is correct? I will leave that to you. You can download the Energy2D simulations from this link, play with them, and see if they help you figure out the answer. The download also includes simulations of the reverse case, in which heat flows from the outside into the room (the summer condition).

Time series analysis tools in Visual Process Analytics: Cross correlation

Two time series and their cross-correlation functions
In a previous post, I showed you what the autocorrelation function (ACF) is and how it can be used to detect temporal patterns in student data. The ACF is the correlation of a signal with itself. We are, of course, also interested in exploring the correlations among different signals.

The cross-correlation function (CCF) is a measure of similarity of two time series as a function of the lag of one relative to the other. The CCF can be imagined as a procedure of overlaying two series printed on transparency films and sliding them horizontally to find possible correlations. For this reason, it is also known as a "sliding dot product."

The upper graph in the figure to the right shows two time series from a student's engineering design process, representing about 45 minutes of her construction (white line) and analysis (green line) activities while trying to design an energy-efficient house with the goal of cutting the net energy consumption down to zero. At first glance, you probably have no clue about what these lines represent and how they may be related.

But their CCFs reveal something more striking. The lower graph shows two curves that peak at certain points. I know you have a lot of questions at this point. Let me try to answer some of them below.

Why are there two curves for depicting the correlation of two time series, say, A and B? Because there is a difference between "A relative to B" and "B relative to A." Imagine that you print the series on two transparency films and slide one on top of the other. Which one is on top matters. If you are looking for cause-effect relationships using the CCF, you can treat the antecedent time series as the cause and the subsequent time series as the effect.
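As a concrete illustration of the "sliding dot product" and of why the direction matters, here is a minimal NumPy sketch; the two series are synthetic stand-ins for construction and analysis activity, not the student data shown in the figure.

```python
# A minimal sketch of a normalized cross-correlation function (CCF).
# The two series here are synthetic; they stand in for construction (a) and
# analysis (b) event counts per time bin, not the actual student data.
import numpy as np

rng = np.random.default_rng(42)
n = 270                                   # e.g., 45 minutes at 10-second bins
a = rng.poisson(1.0, n).astype(float)     # "construction" activity
b = np.roll(a, 12) + rng.poisson(0.5, n)  # "analysis" lagging a by 12 bins

def ccf(x, y, max_lag=60):
    """Correlation of x with y shifted by lag; positive lag means y follows x."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    return [np.mean(x[:-k] * y[k:]) for k in range(1, max_lag + 1)]

a_then_b = ccf(a, b)   # "A relative to B": does analysis follow construction?
b_then_a = ccf(b, a)   # "B relative to A": does construction follow analysis?

print("peak lag (construction leads):", 1 + int(np.argmax(a_then_b)))  # ~12 bins
print("peak lag (analysis leads)    :", 1 + int(np.argmax(b_then_a)))
```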

What does a peak in the CCF mean, anyway? It points you to where more interesting things may lie. In the figure of this post, the construction activities of this particular student were significantly followed by analysis activities at about four points (two of them within 10 minutes), but the analysis activities were significantly followed by construction activities only once (after 10 minutes).

Time series analysis tools in Visual Process Analytics: Autocorrelation

Autocorrelation reveals a three-minute periodicity
Digital learning tools such as computer games and CAD software emit a lot of temporal data about what students do when they are deeply engaged with them. Analyzing these data may shed light on whether students learned, what they learned, and how they learned. In many cases, however, the data look so messy that many people are skeptical about their meaning. As optimists, we believe that learning signals are likely buried in these noisy data. We just need to use, or invent, some mathematical tricks to tease them out.

In Version 0.2 of our Visual Process Analytics (VPA), I added a few techniques that can be used to do time series analysis so that researchers can find ways to characterize a learning process from different perspectives. Before I show you these visual analysis tools, be aware that the purpose of these tools is to reveal the temporal trends of a given process so that we can better describe the behavior of the student at that time. Whether these traits are "good" or "bad" for learning likely depends on the context, which often necessitates the analysis of other co-variables.

Correlograms reveal similarity of two time series.
The first tool for time series analysis added to VPA is the autocorrelation function (ACF), a mathematical tool for finding repeating patterns obscured by noise in the data. The shape of the ACF graph, called the correlogram, is often more revealing than just looking at the shape of the raw time series graph. In the extreme case when the process is completely random (i.e., white noise), the ACF will be a Dirac delta function that peaks at zero time lag. In the extreme case when the process is completely sinusoidal, the ACF will be similar to a damped oscillatory cosine wave with a vanishing tail.
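Here is a minimal sketch of how such a sample ACF can be computed, contrasting white noise with a noisy periodic signal; the data are synthetic, and the 60-sample period is chosen simply to mirror the three-minute periodicity in the figure (assuming a 3-second sampling interval).

```python
# A minimal sketch of the sample autocorrelation function (ACF), contrasting
# white noise with a noisy periodic signal (both synthetic, for illustration).
import numpy as np

def acf(x, max_lag):
    """Biased sample ACF: r(k) = sum((x_t - m)(x_{t+k} - m)) / sum((x_t - m)^2)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([1.0] + [np.dot(x[:-k], x[k:]) / denom
                             for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
t = np.arange(600)                               # e.g., one sample every 3 seconds
noise = rng.normal(size=t.size)                  # white noise: ACF spikes only at lag 0
periodic = np.sin(2 * np.pi * t / 60) + 0.5 * rng.normal(size=t.size)
# a 60-sample period, i.e., a "three-minute periodicity" at 3-second sampling

print("white noise ACF at lags 1-3:", np.round(acf(noise, 3)[1:], 2))       # near zero
print("periodic ACF at lag 60     :", round(acf(periodic, 60)[60], 2))
# A clear positive peak appears one full period later, revealing the periodicity.
```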

An interesting question relevant to learning science is whether the process is autoregressive (or under what conditions the process can be autoregressive). Being autoregressive means that the current value of a variable is influenced by its previous values. This could be used to evaluate whether the student learned from past experience -- in the case of engineering design, whether the student's design actions were informed by previous actions. Learning becomes more predictable if the process is autoregressive (just to be careful, note that I am not saying that more predictable learning is necessarily better learning). Different autoregression models, denoted as AR(n) with n indicating the memory length, may be characterized by their ACFs. For example, the ACF of AR(2) decays more slowly than that of AR(1), as AR(2) depends on more previous points. (In practice, the partial autocorrelation function, or PACF, is often used to detect the order of an AR model.)
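The following small sketch, using synthetic data and the statsmodels library, illustrates both points: the ACF of an AR(2) process decays more slowly than that of an AR(1) process, and the PACF of the AR(2) process cuts off after lag 2. The coefficients are arbitrary illustrative choices.

```python
# A small sketch (synthetic data) comparing the ACF decay of AR(1) and AR(2)
# processes and using the PACF to recover the model order.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(1)
n = 2000
e = rng.normal(size=n)

ar1 = np.zeros(n)
ar2 = np.zeros(n)
for t in range(2, n):
    ar1[t] = 0.7 * ar1[t - 1] + e[t]                     # AR(1), memory length 1
    ar2[t] = 0.5 * ar2[t - 1] + 0.3 * ar2[t - 2] + e[t]  # AR(2), memory length 2

print("AR(1) ACF, lags 1-5:", np.round(acf(ar1, nlags=5)[1:], 2))
print("AR(2) ACF, lags 1-5:", np.round(acf(ar2, nlags=5)[1:], 2))  # decays more slowly
print("AR(2) PACF, lags 1-4:", np.round(pacf(ar2, nlags=4)[1:], 2))
# The PACF of the AR(2) process cuts off after lag 2, which is how the model
# order is usually detected in practice.
```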

The two figures in this post show the ACF in action within VPA, revealing temporal periodicity and similarity in students' action data that would otherwise remain obscure. The upper graphs of the figures plot the original time series for comparison.

Visual Process Analytics (VPA) launched


Visual Process Analytics (VPA) is an online analytical processing (OLAP) program that we are developing for visualizing and analyzing student learning from the complex, fine-grained process data collected by interactive learning software such as computer-aided design tools. We envision a future in which every classroom is powered by informatics and infographics such as VPA to support day-to-day learning and teaching at a highly responsive level. In a future where every business person relies on visual analytics to stay in business, it would be a shame if teachers still had to read through piles of paper-based student work to make instructional decisions. The research we are conducting with the support of the National Science Foundation is paving the road to a future in which our educational systems enjoy support equivalent to business analytics and intelligence.

This is the mission of VPA. Today we are announcing the launch of this cyberinfrastructure. We decided that its first version number should be 0.1, a way of indicating that research and development on this software system will continue as a very long-term effort, and that what we have done so far is a small step towards a very ambitious goal.


VPA is written in plain JavaScript/HTML/CSS. It should run in most browsers -- best on Chrome and Firefox -- but it looks and works like a typical desktop app. This means that while you are in the middle of mining the data, you can save what we call "the perspective" as a file on your disk (or in the cloud) so that you can keep track of what you have done. Later, you can load the perspective back into VPA. Each perspective reopens the datasets you have worked on, with your latest settings and results. So if you are halfway through your data mining, your work can be saved for further analysis.

So far Version 0.1 has seven analysis and visualization tools, each of which shows a unique aspect of the learning process with its own type of interactive visualization. We admit that, compared with the dauntingly high dimensionality of complex learning, this is a tiny collection. But we will be adding more tools as we go. At this point, only one repository -- our own Energy3D process data -- is connected to VPA, but we expect to add more repositories in the future. Meanwhile, more computational tools will be added to support in-depth analyses of the data. This will require a tremendous effort in designing a smart user interface to support the various computational tasks that researchers may be interested in defining.

Eventually, we hope that VPA will grow into a versatile platform of data analytics for cutting-edge educational research. As such, VPA represents a critically important step towards marrying learning science with data science and computational science.