Tag Archives: Engineering design

Listen to the data with the Visual Process Analytics

Visual analytics provides a powerful way for people to see patterns and trends in data. In real life, however, we use both our eyes and ears. So can we hear patterns and trends if we listen to the data?

I spent a few days studying the Web Audio API in JavaScript and adding simple data sonification to our Visual Process Analytics (VPA) to explore this question. I don't know where adding the auditory sense to the analytics toolkit may lead us, but you never know. It is always good to experiment with various ideas.

Note that the data sonification capabilities of VPA are very experimental at this point. To make matters worse, I am not a musician by any stretch of the imagination. So the sounds generated by the latest version of VPA may sound horrible to you. But this represents a step toward better interactions with complex learner data. As my knowledge of music improves, the data should sound less terrifying.

The first test feature added to VPA is very simple: It just converts a time series into a sequence of notes and rests. To adjust the sound, you can change a number of parameters such as pitch, duration, attack, decay, and oscillator types (sine, square, triangle, sawtooth, etc.). All these options are available through the context menu of a time series graph.
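As a sketch of how such a mapping might work (the function name, frequency range, and the commented scheduling code below are my illustration, not VPA's actual internals), a time-series value can be mapped linearly onto a pitch, which the browser's Web Audio API can then play with a chosen oscillator type and a gain envelope for attack and decay:

```javascript
// Map a time-series value in [lo, hi] linearly onto a frequency range (Hz).
// The default range (220-880 Hz, i.e., A3 to A5) is an illustrative choice.
function valueToFrequency(value, lo, hi, minFreq = 220, maxFreq = 880) {
  const t = (value - lo) / ((hi - lo) || 1); // normalize; guard zero range
  return minFreq + t * (maxFreq - minFreq);
}

// In a browser, each note could then be scheduled roughly like this:
// const ctx = new AudioContext();
// const osc = ctx.createOscillator();
// osc.type = 'sine';                // or 'square', 'triangle', 'sawtooth'
// osc.frequency.value = valueToFrequency(v, lo, hi);
// const gain = ctx.createGain();    // attack/decay envelope
// gain.gain.setValueAtTime(0, ctx.currentTime);
// gain.gain.linearRampToValueAtTime(1, ctx.currentTime + attack);
// gain.gain.linearRampToValueAtTime(0, ctx.currentTime + attack + decay);
// osc.connect(gain).connect(ctx.destination);
// osc.start(); osc.stop(ctx.currentTime + attack + decay);
```

A rest can simply be represented by skipping the scheduling step for bins whose value is zero.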

While the sound plays, you can also watch a synchronized animation in VPA (as demonstrated by the embedded videos). This means that from now on VPA is a multimodal analytics tool. But I have no plan to rename it, as data visualization is still, and will remain, the dominant mode of the data mining platform.

The next step is to figure out how to synthesize better sounds from multiple types of actions, treating them as multiple sources or instruments (much like the Song from Pi). I will start with sonifying the scatter plot in VPA. Stay tuned.

What’s new in Visual Process Analytics Version 0.3

Visual Process Analytics (VPA) is a data mining platform that supports research on student learning, particularly learning that involves using complex tools to solve complex problems. The complexity of such learning activities entails complex process data (e.g., event logs) that cannot be easily analyzed. This difficulty calls for data visualization that can at least give researchers a glimpse of the data before they conduct in-depth analyses. To this end, the VPA platform provides many different types of visualization that represent many different aspects of complex processes. These graphic representations should help researchers develop intuition about the data. We believe VPA is an essential tool for data-intensive research, which will only grow more important as data mining, machine learning, and artificial intelligence play increasingly critical roles in effective, personalized education.

Several new features were added to Version 0.3, described as follows:

1) Interactions are provided through context menus. Context menus can be invoked by right-clicking on a visualization. Depending on where the user clicks, a context menu provides the available actions applicable to the selected objects. This allows a complex tool such as VPA to still have a simple, pleasant user interface.

2) Result collectors allow users to gather analysis results and export them in CSV format. VPA is a data browser that lets users navigate the ocean of data in the repositories it connects to. Each step of navigation invokes calculations behind the scenes. To collect the results of these calculations in a mining session, VPA now has a simple result collector that automatically keeps track of the user's work. A more sophisticated result manager is also being conceptualized and developed to let users manage their data mining results more flexibly. These results can be exported if they need to be analyzed further with other software tools.

3) Cumulative data graphs render a more dramatic view of time series; it is sometimes easier to spot patterns and trends in cumulative graphs. Cumulative analysis applies to all levels of data granularity supported by VPA (currently three: Top, Medium, and Fine, corresponding to three different ways to categorize action data). VPA also lets users select variables from a list to be highlighted in cumulative graphs.
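Items 2 and 3 above can be illustrated with two small sketches (the function names are mine, not VPA's): a cumulative transform that turns a raw time series into a running total, and a minimal CSV serializer for collected results.

```javascript
// Running total of a time series, e.g., counts of actions per time bin.
function cumulative(series) {
  let sum = 0;
  return series.map(v => (sum += v));
}

// Serialize collected results (an array of uniform objects) to CSV text,
// quoting fields that contain commas, quotes, or newlines.
function toCSV(rows) {
  const cols = Object.keys(rows[0]);
  const esc = v => /[",\n]/.test(String(v))
    ? `"${String(v).replace(/"/g, '""')}"`
    : String(v);
  return [cols.join(','), ...rows.map(r => cols.map(c => esc(r[c])).join(','))]
    .join('\n');
}
```

For example, `cumulative([1, 2, 3])` yields `[1, 3, 6]`, a monotonically rising curve whose slope changes are often easier to spot than fluctuations in the raw series.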

Many other new features were also added in this version. For example, additional information about classes and students is provided to contextualize each data set. In the coming weeks, the repository will incorporate data from more than 1,200 students in Indiana who have undertaken engineering design projects using our Energy3D software. This unprecedentedly large database will potentially provide a goldmine of research data for the study of engineering design.

For more information about VPA, see my AERA 2016 presentation.

Daily energy analysis in Energy3D

Fig. 1: The analyzed house.
Energy3D already provides a set of powerful analysis tools for evaluating the annual energy performance of a design. For experts, the annual analysis tools are convenient: they can quickly evaluate designs based on the results. For novices who are trying to understand how the energy graphs are calculated (or skeptics who are not sure whether they should trust the results), however, the annual analysis can be a bit of a black box. When there are too many variables to deal with at once (in this case, the seasonal changes of solar radiation and weather), we are easily overwhelmed. The total energy data are the combined result of two astronomical cycles: the daily cycle (caused by the Earth's rotation about its axis) and the annual cycle (caused by the Earth's revolution around the Sun). This is why novices have a hard time reasoning about the results.

Fig. 2: Daily light sensor data in four seasons.
To help users reduce one layer of complexity and make sense of the energy data calculated in Energy3D simulations, a new class of daily analysis tools has been added to Energy3D. These tools allow users to pick a day to do the energy analyses, limiting the graphs to the daily cycle.

For example, we can place three sensors on the east, south, and west sides of the house shown in Figure 1, pick four days -- January 1st, April 1st, July 1st, and October 1st -- to represent the four seasons, and then run a simulation for each day to collect the corresponding sensor data. The results, shown in Figure 2, indicate that in the winter the south-facing side receives the highest intensity of solar radiation, compared with the east- and west-facing sides. In the summer, however, it is the east- and west-facing sides that receive the highest intensity of solar radiation. In the spring and fall, the peak intensities of the three sides are comparable, but they occur at different times of day.

Fig. 3: Daily energy use and production in four seasons.
If you take a closer look at Figure 2, you will notice that, while the radiation intensity on the south-facing side always peaks at noon, the intensities on the east- and west-facing sides shift with the seasons. In the summer, the radiation intensity peaks around 8 am on the east-facing side and around 4 pm on the west-facing side. In the winter, these peaks occur around 9 am and 2 pm, respectively. This difference is due to the shorter days and the lower position of the Sun in the sky in the winter.

Energy3D also provides a heliodon to visualize the solar path on any given day, which you can use to examine the angle of the Sun and the length of the day. If you want to visually evaluate solar radiation on a site, it is best to combine the sensors and the heliodon.

You can also analyze the daily energy use and production; Figure 3 shows the results. Since this house has a lot of south-facing windows with a Solar Heat Gain Coefficient of 80%, solar energy alone is actually enough to keep the house warm in the winter (you may notice that your heater runs less frequently in the middle of a sunny winter day if you have a large south-facing window). The downside is that it also takes a lot of energy to cool the house in the summer. Also note the interesting energy pattern for July 1st -- there are two smaller peaks of solar radiation, one in the morning and one in the afternoon. Why? I will leave that answer to you.
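The role of the Solar Heat Gain Coefficient can be shown with a back-of-the-envelope formula (this is the standard textbook definition, not a claim about Energy3D's actual implementation): the solar power admitted through a window is the product of the SHGC, the glazing area, and the incident irradiance.

```javascript
// Solar power (W) admitted through a window:
//   SHGC (dimensionless) x area (m^2) x incident irradiance (W/m^2)
function windowSolarGain(shgc, areaM2, irradianceWm2) {
  return shgc * areaM2 * irradianceWm2;
}
```

With an SHGC of 0.8, a 2 m^2 window under 500 W/m^2 of low winter sun admits 800 W, comparable to a small space heater, which is why the house needs little midday heating in winter.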

Energy3D in Colombia

Camilo Vieira Mejia, a PhD student at Purdue University, recently brought our Energy3D software to a workshop that is part of Clubes de Ciencia -- an initiative in which graduate students travel to Colombia to share science and engineering concepts with high school students from small towns around Antioquia (a state of Colombia).

Students designed houses with Energy3D, printed them out, assembled them, and put them under the Sun to test their solar gains. They probably also ran the solar and thermal analyses for their virtual houses.

We are glad that our free software is reaching students in these rural areas and helping them become interested in science and engineering. This is one of many examples of how a project funded by the National Science Foundation can also benefit people in other countries and impact the world in positive ways. In this sense, the National Science Foundation is not just a federal agency -- it is a global agency.

If you are also using Energy3D in your country, please consider contacting us and sharing your stories or thoughts.

Energy3D is intended to be global -- it currently includes weather data from 220 locations on all continents. Please let us know if you would like locations in your country to be included in the software so that you can design energy solutions for your own area. As a matter of fact, this was exactly what Camilo asked me to do before he headed for Colombia. I would have had no clue which towns in Colombia should be added or where I could retrieve their weather data (the sources of which are often in a foreign language).

[With the kind permission of these participating students, we are able to release the photos in this blog post.]

Time series analysis tools in Visual Process Analytics: Cross correlation

Two time series and their cross-correlation functions
In a previous post, I showed what the autocorrelation function (ACF) is and how it can be used to detect temporal patterns in student data. The ACF is the correlation of a signal with itself. Naturally, we are also interested in exploring the correlations among different signals.

The cross-correlation function (CCF) is a measure of the similarity of two time series as a function of the lag of one relative to the other. The CCF can be imagined as overlaying two series printed on transparency films and sliding one horizontally across the other to find possible correlations. For this reason, it is also known as a "sliding dot product."

The upper graph in the figure to the right shows two time series from a student's engineering design process, representing about 45 minutes of her construction (white line) and analysis (green line) activities while trying to design an energy-efficient house with the goal to cut down the net energy consumption to zero. At first glance, you probably have no clue about what these lines represent and how they may be related.

But their CCFs reveal something more salient: the lower graph shows two curves that peak at certain lags. You probably have a lot of questions at this point; let me try to answer some of them below.

Why are there two curves for depicting the correlation of two time series, say, A and B? This is because there is a difference between "A relative to B" and "B relative to A." Imagine that you print the series on two transparency films and slide one on top of the other. Which one is on the top matters. If you are looking for cause-effect relationships using the CCF, you can treat the antecedent time series as the cause and the subsequent time series as the effect.
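A bare-bones sample CCF can be computed as follows (a sketch of the standard formula, not VPA's actual code; with this sign convention, a peak at positive lag k means that series x tends to lead series y by k steps):

```javascript
// Sample cross-correlation of x and y at lags -maxLag..maxLag.
// r[k] compares x at time t with y at time t + k, normalized so that
// two identical, perfectly aligned series give r[0] = 1.
function ccf(x, y, maxLag) {
  const n = Math.min(x.length, y.length);
  const mean = a => a.reduce((s, v) => s + v, 0) / n;
  const mx = mean(x), my = mean(y);
  const sd = (a, m) => Math.sqrt(a.reduce((s, v) => s + (v - m) ** 2, 0) / n);
  const sx = sd(x, mx), sy = sd(y, my);
  const r = {};
  for (let k = -maxLag; k <= maxLag; k++) {
    let sum = 0;
    for (let t = 0; t < n; t++) {
      if (t + k >= 0 && t + k < n) sum += (x[t] - mx) * (y[t + k] - my);
    }
    r[k] = sum / (n * sx * sy);
  }
  return r;
}
```

For instance, if a pulse in y always repeats a pulse in x two steps later, the CCF is largest at lag 2, which is exactly the "A relative to B" direction described above.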

What does a peak in the CCF mean, anyway? It points you to where more interesting things may lie. In the figure in this post, the construction activities of this particular student were significantly followed by analysis activities about four times (two of them within 10 minutes), whereas the analysis activities were significantly followed by construction activities only once (after 10 minutes).

Time series analysis tools in Visual Process Analytics: Autocorrelation

Autocorrelation reveals a three-minute periodicity
Digital learning tools such as computer games and CAD software emit a lot of temporal data about what students do while deeply engaged with them. Analyzing these data may shed light on whether students learned, what they learned, and how they learned. In many cases, however, these data look so messy that many people are skeptical about their meaning. As optimists, we believe that there are likely learning signals buried in these noisy data. We just need to use or invent some mathematical tricks to dig them out.

In Version 0.2 of our Visual Process Analytics (VPA), I added a few techniques that can be used to do time series analysis so that researchers can find ways to characterize a learning process from different perspectives. Before I show you these visual analysis tools, be aware that the purpose of these tools is to reveal the temporal trends of a given process so that we can better describe the behavior of the student at that time. Whether these traits are "good" or "bad" for learning likely depends on the context, which often necessitates the analysis of other co-variables.

Correlograms reveal similarity of two time series.
The first tool for time series analysis added to VPA is the autocorrelation function (ACF), a mathematical tool for finding repeating patterns obscured by noise in the data. The shape of the ACF graph, called the correlogram, is often more revealing than the shape of the raw time series graph. In the extreme case where the process is completely random (i.e., white noise), the ACF is a delta function that peaks at zero lag. In the opposite extreme, where the process is perfectly sinusoidal, the sample ACF resembles a damped oscillatory cosine wave with a slowly vanishing tail.
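For the curious, the sample ACF takes only a few lines of code (this is a sketch of the textbook estimator, not the code inside VPA):

```javascript
// Sample autocorrelation of series x at lags 0..maxLag.
// r[0] is always 1; white noise decays to ~0 immediately, while a
// periodic series shows peaks at multiples of its period.
function acf(x, maxLag) {
  const n = x.length;
  const mean = x.reduce((s, v) => s + v, 0) / n;
  const c0 = x.reduce((s, v) => s + (v - mean) ** 2, 0) / n;
  const r = [];
  for (let k = 0; k <= maxLag; k++) {
    let ck = 0;
    for (let t = 0; t < n - k; t++) ck += (x[t] - mean) * (x[t + k] - mean);
    r.push(ck / (n * c0));
  }
  return r;
}
```

Note that the sum at lag k runs over n - k overlapping points but is still divided by n, which is what produces the tapering tail of the correlogram for finite samples.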

An interesting question relevant to learning science is whether the process is autoregressive (or under what conditions the process can be autoregressive). The quality of being autoregressive means that the current value of a variable is influenced by its previous values. This could be used to evaluate whether the student learned from the past experience -- in the case of engineering design, whether the student's design action was informed by previous actions. Learning becomes more predictable if the process is autoregressive (just to be careful, note that I am not saying that more predictable learning is necessarily better learning). Different autoregression models, denoted as AR(n) with n indicating the memory length, may be characterized by their ACFs. For example, the ACF of AR(2) decays more slowly than that of AR(1), as AR(2) depends on more previous points. (In practice, partial autocorrelation function, or PACF, is often used to detect the order of an AR model.)
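To make "autoregressive" concrete, here is a toy AR(1) generator (the function and its simple pseudo-noise scheme are my illustration, not part of VPA); the theoretical ACF of AR(1) decays geometrically as phi^k:

```javascript
// Toy AR(1) process: x[t] = phi * x[t-1] + e[t]. A small Park-Miller
// linear congruential generator supplies reproducible pseudo-noise.
function ar1(phi, n, seed = 1) {
  let s = seed;
  const noise = () => {
    s = (s * 16807) % 2147483647;
    return s / 2147483647 - 0.5; // roughly uniform in [-0.5, 0.5)
  };
  const x = [0];
  for (let t = 1; t < n; t++) x.push(phi * x[t - 1] + noise());
  return x;
}
```

Running `ar1(0.8, 5000)` and estimating the lag-1 autocorrelation of the output recovers a value close to 0.8, which is the sense in which the process "remembers" its previous value.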

The two figures in this post show the ACF in action within VPA, revealing temporal periodicity and similarity in students' action data that would otherwise remain obscure. The upper graphs of the figures plot the original time series for comparison.

Seeing student learning with visual analytics

Technology allows us to record almost everything happening in the classroom. The fact that students' interactions with learning environments can be logged in every detail raises the interesting question about whether or not there is any significant meaning and value in those data and how we can make use of them to help students and teachers, as pointed out in a report sponsored by the U.S. Department of Education:
“New technologies thus bring the potential of transforming education from a data-poor to a data-rich enterprise. Yet while an abundance of data is an advantage, it is not a solution. Data do not interpret themselves and are often confusing — but data can provide evidence for making sound decisions when thoughtfully analyzed.” — Expanding Evidence Approaches for Learning in a Digital World, Office of Educational Technology, U.S. Department of Education, 2013
A radar chart of design space exploration.
A histogram of action intensity.
Here we are not talking about just analyzing students' answers to multiple-choice questions, their scores on quizzes and tests, or how frequently they log into a learning management system. We are talking about something much more fundamental, something that runs deep in cognition and learning, such as how students conduct a scientific experiment, solve a problem, or design a product. As learning goes deeper in those directions, the data produced by students grow bigger. It is by no means an easy task to analyze large volumes of learner data, which contain many noisy elements that cast uncertainty on assessment. The validity of an assessment inference rests on the strength of evidence, and evidence construction often relies on the search for relations, patterns, and trends in student data. With a lot of data, this mandates sophisticated computation similar to cognitive computing.

Data gathered from highly open-ended inquiry and design activities, key to authentic science and engineering practices that we want students to learn, are often intensive and “messy.” Without analytic tools that can discern systematic learning from random walk, what is provided to researchers and teachers is nothing but a DRIP (“data rich, information poor”) problem.

A scatter plot of action timeline.
Recognizing the difficulty of analyzing the sheer volume of messy student data, we turned to visual analytics, a category of techniques extensively used in cutting-edge business intelligence systems such as software developed by SAS, IBM, and others. We see interactive visual process analytics as key to accelerating analysis so that researchers can adjust mining rules easily, view results rapidly, and identify patterns clearly. This kind of visual analytics combines the computational power of the computer, the graphical user interface of the software, and the pattern recognition power of the brain to support complex data analyses in data-intensive educational research.

A digraph of action transition.
So far, I have written four interactive graphs and charts that can be used to study four different aspects of the design action data we collected from our Energy3D CAD software. Recording several weeks of student work on complex engineering design challenges, these datasets are high-dimensional, meaning that it is improper to treat them from a single point of view. For each question we want the student data to answer, we usually need a different representation that captures the features specific to that question. In many cases, multiple representations are needed to address a single question.

In the long run, our objective is to add as many graphic representations as needed as we answer more and more research questions based on our datasets. Given time, this growing library of visual analytics may become powerful enough to also be useful for teachers, allowing them to monitor their students' work and thereby conduct formative assessment. To guarantee that our visual analytics runs on all devices, the library is written in JavaScript/HTML/CSS, and a number of touch gestures are supported so the library can be used on a multi-touch screen. A neat feature of this library is that multiple graphs and charts can be grouped together so that when you interact with one of them, the linked ones change at the same time. As the datasets are temporal in nature, you can also animate these graphs to reconstruct and track exactly what students did throughout.

The National Science Foundation funds SmartCAD—an intelligent learning system for engineering design

We are pleased to announce that the National Science Foundation has awarded the Concord Consortium, Purdue University, and the University of Virginia a $3 million, four-year collaborative project to conduct research and development on SmartCAD, an intelligent learning system that informs students' engineering design with automatic feedback generated through computational analysis of their work.

Engineering design is one of the most complex learning processes: it builds on multiple layers of inquiry, involves creating products that meet multiple criteria and constraints, and requires the orchestration of mathematical thinking, scientific reasoning, systems thinking, and, sometimes, computational thinking. Teaching and learning engineering design have become important now that engineering design is officially part of the Next Generation Science Standards in the United States. These new standards expect every student to learn and practice engineering design in every science subject at every level of K-12 education.
Figure 1

In typical engineering projects, students are challenged to construct an artifact that performs specified functions under constraints. What makes engineering design different from other design practices, such as art design, is that engineering design must be guided by scientific principles and its end products must operate predictably according to science. A common problem observed in students' engineering design activities is that their work is insufficiently informed by science, reducing engineering design to drawing or crafting. To circumvent this problem, engineering design curricula often encourage students to learn or review the related science concepts and practices before they put design elements together to construct a product. After students create a prototype, they test and evaluate it using the governing scientific principles, which, in turn, gives them a chance to deepen their understanding of those principles. This common approach to learning is illustrated in the upper image of Figure 1.

There is a problem with this common approach, however. Exploring the form-function relationship is a critical inquiry step toward understanding the underlying science. But to determine whether a change of form results in a desired function, students have to build and test a physical prototype or rely on the opinions of an instructor. This creates a delay in getting feedback at the most critical stage of the learning process, slowing down the iterative cycle of design and cutting short the exploration of the design space. As a result of this delay, experimenting with and evaluating "micro ideas"--very small stepwise ideas, such as investigating one design parameter at a time--through building, revising, and testing physical prototypes becomes impractical in many cases. From the perspective of learning, however, it is often at this level of granularity that foundational science and engineering design ultimately meet.

Figure 2
All these problems can be addressed by supporting engineering design with a computer-aided design (CAD) platform that embeds powerful science simulations to provide formative feedback to students in a timely manner. Simulations based on solving fundamental equations in science such as Newton’s Laws model the real world accurately and connect many science concepts coherently. Such simulations can computationally generate objective feedback about a design, allowing students to rapidly test a design idea on a scientific basis. They also allow the connections between design elements and science concepts to be explicitly established through fine-grained feedback, supporting students in making informed design decisions about each design element, one at a time, as illustrated by the lower image of Figure 1. These scientific simulations give the CAD software tremendous disciplinary intelligence and instructional power, transforming it into a SmartCAD system capable of guiding student design toward a more scientific end.

Despite these advantages, very little developmentally appropriate CAD software is available to K-12 students—most CAD software used in industry is not only a science “black box” to students but also requires a cumbersome tool chain of pre-processors, solvers, and post-processors, making it extremely challenging to use in secondary education. The SmartCAD project will fill this gap with key educational features centered on guiding student design with feedback computed from simulations. For example, science simulations can analyze student design artifacts and compute their distances to specific goals, detecting whether students are zeroing in on those goals or going astray. The development of these features will also draw upon decades of research on formative assessment of complex learning.
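One simple way to realize the "distance to a goal" idea is a Euclidean distance between a design's simulated metrics and the target values; feedback can then report whether successive revisions shrink that distance. This sketch is purely illustrative: the metric names (netEnergyKWh, costUSD) are made-up examples, not SmartCAD's actual variables.

```javascript
// Euclidean distance between a design's simulated metrics and its goals.
// Metric names used below are hypothetical examples.
function goalDistance(metrics, goals) {
  const d2 = Object.keys(goals)
    .reduce((s, k) => s + (metrics[k] - goals[k]) ** 2, 0);
  return Math.sqrt(d2);
}

// A revision counts as progress if it moves the design closer to the goals.
function isProgress(before, after, goals) {
  return goalDistance(after, goals) < goalDistance(before, goals);
}
```

Tracking this distance over a sequence of design snapshots gives a crude but objective trace of whether a student is converging on the goal or drifting away from it.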

A stock-and-flow model for building thermal analysis

Figure 1. A stock-and-flow model of building energy.
Our Energy3D CAD software has two built-in simulation engines for performing solar energy analysis and building thermal analysis. I have extensively blogged about solar energy analysis using Energy3D. This article introduces building thermal analysis with Energy3D.

Figure 2. A colonial house.
The current version of the building energy simulation engine is based on a simple stock-and-flow model of building energy. Viewed from the perspective of system dynamics (a field that studies the behavior of complex systems), the total thermal energy of a building is a stock, and the energy gains or losses through its various components are flows. These gains and losses usually occur through energy exchange between the building and the environment. For instance, the solar radiation that shines into a building through its windows is an input; the heat transfer through its walls may be an input or an output, depending on the temperature difference between the inside and the outside.

Figure 3. The annual energy graph.
Figure 1 illustrates how energy flows into and out of a building in the winter and the summer, respectively. To maintain the temperature inside a building, the thermal energy it contains must remain constant: any shortage of thermal energy must be compensated for, and any excess must be removed. This is done by heating and air conditioning systems, which, together with ventilation systems, are commonly known as HVAC systems. Based on the stock-and-flow model, we can predict the energy cost of heating and air conditioning by summing the energy flows from heat transfer, solar radiation, and energy generation over all the components of the building (such as walls, windows, and roofs) and over a certain period of time (such as a day, a month, or a year).
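In its simplest form, the stock-and-flow bookkeeping looks like this (a sketch of the idea, not Energy3D's actual engine): each component contributes a heat flow, and a conditioned building must supply heating power equal to any net loss and cooling power equal to any net gain.

```javascript
// Conductive heat flow through one component (W): positive = heat gain.
//   U: thermal transmittance (W/(m^2*K)), area in m^2, temperatures in C.
function conductiveFlow(U, area, tOut, tIn) {
  return U * area * (tOut - tIn);
}

// Net flow over all components (conduction, solar gains, etc.); to keep the
// stock constant, HVAC must supply -net as heating when net < 0 or remove
// net as cooling when net > 0.
function hvacPower(flowsW) {
  const net = flowsW.reduce((s, f) => s + f, 0);
  return { heatingW: Math.max(0, -net), coolingW: Math.max(0, net) };
}
```

For a winter example, a wall with U = 0.5 W/(m^2·K) and 100 m^2 of area, 20 °C inside and 0 °C outside, loses 1000 W; with 400 W of solar gain through the windows, the heater must make up the remaining 600 W.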

Figure 2 shows the solar radiation heat map of a house and the distribution of the heat flux density over its building envelope. Figure 3 shows the results of the annual energy analysis for the house shown in Figure 2.

More information can be found in Chapter 3 of Energy3D's User Guide.

Common architectural styles supported by Energy3D

Energy3D supports the design of some basic architectural styles commonly seen in New England, such as Colonial and Cape Cod. Its simple 3D user interface allows users to quickly sketch up a house with an aesthetically pleasing look -- with only mouse clicks and drags (and, of course, some patience). This makes it easy for middle and high school students to create meaningful, realistic designs and learn science and engineering from these authentic experiences -- who wants to keep building those cardboard houses that look nothing like a real house for another 100 years?

The true enabler of science learning in Energy3D is its analytic capability, which can tell students the energy consequences of their designs while they are working on them. Without this analytical capability, learning would be cut short at architectural design (which, undeniably, is the fun part of Energy3D that entices students to explore many different design options that please the eye). With it, the relationship between form and function becomes a major driving force in student design. It is at this point that an Energy3D project becomes an engineering design project.

Architectural design, which focuses on designing the form, and engineering design, which focuses on designing the function, are equally important in both educational and professional practices. Students need to learn both. After all, the purpose of design is to meet various people's needs, including their aesthetic needs. This principle of coupling architectural design and engineering design is of generic importance as it can be extended to the broader case of integrating industrial design and engineering design. It is this coupling that marries art, science, and usability.

We are working on a list of common architectural styles that can be designed using Energy3D. These styles, four of which are shown in this article, represent only the basic form of each style; beginners should be able to sketch up each one in less than an hour. If you want, you can derive more complex and detailed designs from each style.