Category Archives: Molecular Workbench

Integrating Solarize Mass and STEM education through powerful simulation technologies

Fig. 1: Solar simulation in Energy3D.
Solarize Mass is a program launched by the Massachusetts Clean Energy Center that seeks to increase the adoption of small-scale solar electricity in participating communities. In 2016, the towns of Natick and Bolton were selected as pilot communities for Solarize Mass. According to the Town of Natick, "Solarize Mass Natick is a volunteer initiative run by Natick residents. Our goal is to make going solar simple and affordable for Natick residents and small business owners as part of a 2016 state-sponsored program. But it is a limited-time program: the deadline for requesting a site visit is August 1, 2016."

Solar energy does not need to be a limited-time offer. The question is how residents can do their own site assessments when the program's consultants are no longer in town to provide free site visits. Sure, residents can use Google's Sunroof to quickly check whether solar is right for them (if their areas are covered by Sunroof). But Sunroof only screens a building based on its solar potential; it does not provide the more detailed engineering analysis that helps homeowners make up their minds. The latter has to be done by a solar installer, who provides the PV array layout, the output projection, the financial analysis, and so on, in order to make a convincing case. But this is a time-consuming process that poses financial risks to solar installers if the homeowners end up backing out. So we need to find other creative solutions.

Fig. 2: Student work from a Massachusetts school in 2016.
Funded by the National Science Foundation (NSF), we at the Concord Consortium have been exploring meaningful ways to combine solar programs with STEM education so that each effectively boosts the other. We have been developing a powerful computer-aided engineering system called Energy3D that essentially turns an important part of a solar engineer's job into something that even a middle or high school student can do (Figure 1). In a recent case study, we found that Energy3D's prediction outperformed a solar installer's prediction for my colleague's house in Bolton, MA. In a pilot test at an eastern Massachusetts school in June 2016, at least 60% of the 27 ninth graders who participated in the 8-hour activity succeeded -- to varying degrees -- in creating a 3D model of his or her house and designing a solarization solution based on it (Figure 2). Given that they had to learn both Google Earth and Energy3D in a very short time and then perform a serious job, this result is actually quite encouraging. Our challenge in the NSF-funded project is to improve our technology, materials, and pedagogy so that more students can do a better job within the limited amount of time available in the classroom.

With this improving capacity, we are now asking this question: "What can middle or high school students empowered by Energy3D do for the solarization movement?" The fact is that four million children enter our education system each year in the US. If 1% of them become little solar advocates or even solar engineers in school, the landscape for green energy could look quite different from what it is now.

Fig. 3: Energy3D supports rich design.
Three years ago, the Next Generation Science Standards (the science equivalent of the Common Core) began requiring STEM education in the US to incorporate science and engineering practices extensively into the curriculum. The expectation is that students will gradually learn to think and act like real scientists and engineers over the course of their education. To accomplish this goal, an abundance of opportunities for students to practice science and engineering by solving authentic real-world problems will need to be created and researched. On July 8, 2016, NSF also made this clear in a proposal solicitation letter about what it calls Change Makers, which states: "Learners can be Change Makers, identifying and working to solve problems that matter deeply to them, while simultaneously advancing their own understanding and expertise. Research shows that engaging in real world problem solving enhances learning, understanding, and persistence in STEM." Specifically, the letter lists "crowd-sourced solutions to clean energy challenge through global, public participation in science" as an example topic. An NSF letter like this usually reflects the thinking and priorities of the funding agency. From a practical point of view, considering that the choices of engineering projects for schools are currently quite limited, there is a good chance that schools would welcome solar engineering and other types of engineering as an alternative to, say, robotic engineering.

The overlap in timing between the ongoing solarization movement and the ongoing education overhaul presents a great opportunity to unite the two fronts. We envision that Energy3D will play a vitally important role in making this integration a reality because 1) Energy3D is based on rigorous science and engineering principles, 2) its accuracy is comparable to that of other industry-grade simulation tools, 3) it simulates what solar engineers do in the workplace, 4) it covers the education standards for scientific inquiry and engineering design, 5) it supports many architectural styles (Figure 3), 6) it works just like a design game (e.g., Minecraft) for children, and 7) last but not least -- it is free! With more development under way and planned for the future, Energy3D is also on its way to becoming a citizen science platform for anyone interested in residential and commercial solar designs and even solar power plant designs.

Exactly how the integration will be engineered is still a question under exploration. But we are very excited about all the possibilities ahead, and we are already testing some preliminary ideas. If you represent a solar company and are interested in this initiative, please feel free to contact us.

Simulating photovoltaic power plants with Energy3D

Modeling 1,000 PV panels in a desert
Solar radiation simulation
We have just added new modeling capabilities to our Energy3D software for simulating photovoltaic (PV) power stations. With these additions, the latest version of the software can now simulate rooftop solar panels, solar parks, and solar power plants. Our plan is to develop Energy3D into a "one-stop shop" for solar simulations. The goal is to provide students with an accessible (yet powerful) tool for learning science and engineering in the context of renewable energy, and professionals with an easy-to-use (yet accurate) tool for designing, predicting, and optimizing renewable energy generation.

Users can easily copy and paste solar panels to create an array and then duplicate arrays to create more arrays. In this way, users can rapidly add many solar panels. Each solar panel can be rotated around three different axes (normal, zenith, and azimuth). With this flexibility, users can create a PV array in any direction and orientation. At any time, they can adjust the direction and orientation of any or all solar panels.
PV arrays that are oriented differently


What goes into the design of a solar power plant? While the orientation is a no-brainer, the layout may need some thinking and planning, especially for a site with a limited area. Another factor that affects the layout is the design of the solar tracking system used to maximize the output. Also, considering that many utility companies offer peak and off-peak electricity prices, users may explore strategies of orienting some PV arrays towards the west or southwest so that the solar power plant produces more energy in the afternoon, when summer demand is high, especially in the South.
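
The effect of orientation on afternoon output can be seen from the direct-beam geometry alone: the beam component of a panel's output scales roughly with the cosine of the angle between the sun's rays and the panel normal. The Python sketch below is not Energy3D code (Energy3D performs a full-year, whole-sky radiation calculation); it only illustrates the geometric idea, and the sun position numbers are illustrative assumptions.

```python
import numpy as np

def incidence_cosine(sun_altitude_deg, sun_azimuth_deg, panel_tilt_deg, panel_azimuth_deg):
    """Cosine of the angle between the sun's rays and the panel normal.
    Values <= 0 mean the sun is behind the panel (no direct beam)."""
    alt = np.radians(sun_altitude_deg)
    az = np.radians(sun_azimuth_deg)
    tilt = np.radians(panel_tilt_deg)
    paz = np.radians(panel_azimuth_deg)
    cos_theta = (np.sin(alt) * np.cos(tilt) +
                 np.cos(alt) * np.sin(tilt) * np.cos(az - paz))
    return max(cos_theta, 0.0)

# Late-afternoon summer sun, low in the western sky (illustrative numbers only;
# azimuth is measured clockwise from north).
sun_altitude, sun_azimuth = 25.0, 265.0
for panel_azimuth in (180.0, 225.0, 270.0):   # south, southwest, west
    f = incidence_cosine(sun_altitude, sun_azimuth, panel_tilt_deg=30.0,
                         panel_azimuth_deg=panel_azimuth)
    print(f"panel facing {panel_azimuth:5.0f} deg: direct-beam factor {f:.2f}")
```

With these illustrative numbers, the west-facing panel intercepts roughly twice as much direct beam as the south-facing one, which is why shifting some arrays toward the west or southwest can better match afternoon demand.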

Rooftop PV arrays
In addition to designing PV arrays on the ground, users can do the same thing for flat rooftops as well. Unlike solar panels on pitched roofs of residential buildings, those on flat roofs of large buildings are usually tilted.

We are currently implementing solar trackers so that users can design solar power plants that maximize their outputs based on tracking the sun. Meanwhile, mirror reflector arrays will be added to support the design of concentrated solar power plants. These features should be available soon. Stay tuned!

Energy3D makes designing realistic buildings easy

Annual yield and cost-benefit analyses of rooftop solar panels, based on sound scientific and engineering principles, are critical to the financial success of building solarization. Google's Project Sunroof provides a way for millions of property owners to get recommendations for the right solar solutions.

Another way to conduct an accurate scientific analysis of solar panel output based on rooftop layout is to use a computer-aided engineering (CAE) tool to do a three-dimensional, full-year analysis based on ab initio scientific simulation. With support from the National Science Foundation since 2010, we have been developing Energy3D, a piece of CAE software whose goal is to bring the power of sophisticated scientific and engineering simulations to children and laypersons. To achieve this goal, a key step is to enable users to rapidly sketch up their own buildings and the surrounding objects that may affect their solar potential. We feel that most CAD tools out there are probably too difficult for average users who want to create realistic models of their own houses. This forced us to invent new solutions.

We have recently added many new features to Energy3D to progress towards this goal. The latest version allows many common architectural styles found in most parts of the US to be created and their solar potential to be studied. The screenshots embedded in this article demonstrate this capability. With the current version, each of these designs took me approximately an hour to create from scratch. But we will continue to push the limit.

The 3D construction user interface has been developed on the tenet that users should be able to create any structure from a minimal set of building blocks and operations. Once users master a relatively small set of rules, they are empowered to create buildings of almost any shape they wish.

Solar yield analysis of the first house
The actual time-consuming part is getting the right dimensions and orientation of a real building and of surrounding tall objects such as trees. Google's 3D maps may provide a way to extract these data. Once the approximate geometry of a building is determined, users can easily put solar panels anywhere on the roof to check their energy yield. They can then try as many different layouts as they wish to compare the yields and select an optimal one. This is especially important for buildings with partial shading or sub-optimal orientations. CAE tools such as Energy3D can do spatial and temporal analysis and report the daily output of each panel in the array, allowing users to obtain fine-grained, detailed results and thus providing a good simulation of solar panels in day-to-day operation.
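
The layout comparison described above boils down to aggregating per-panel outputs over time for each candidate design. Here is a minimal sketch of that bookkeeping step, using made-up numbers in place of simulated hourly outputs (Energy3D computes these internally):

```python
import numpy as np

# Hypothetical hourly DC output (kWh) for each panel in two candidate layouts,
# e.g., as exported from an hourly simulation: shape = (num_panels, 24).
rng = np.random.default_rng(0)
layout_a = rng.uniform(0.0, 0.3, size=(12, 24))   # placeholder numbers
layout_b = rng.uniform(0.0, 0.3, size=(12, 24))

def daily_summary(hourly_kwh):
    per_panel = hourly_kwh.sum(axis=1)        # daily yield of each panel
    return per_panel, per_panel.sum()

for name, layout in (("A", layout_a), ("B", layout_b)):
    per_panel, total = daily_summary(layout)
    worst = per_panel.argmin()                # panel most affected by shading
    print(f"layout {name}: total {total:.1f} kWh/day, "
          f"weakest panel #{worst} at {per_panel[worst]:.2f} kWh/day")
```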

The engineering principles behind this science-based process of solar design, assessment, and optimization are exactly what the Next Generation Science Standards require K-12 students in the US to learn and practice. So why not ask children to help solarize their own homes, schools, and communities, at least virtually? There has never been a better time to do this. And we have paved the road for this vision by creating one of the easiest 3D user interfaces, with compelling scientific visualizations that can potentially entice and engage a lot of students. It is time for us to test the idea.

To see more designs, visit this page.

Personal thermal vision could turn millions of students into the cleantech workforce of today

So we have signed the Paris Agreement and cheered about it. Now what?

More than a year ago, I wrote a proposal to the National Science Foundation to test the feasibility of empowering students to help combat the energy issues of our nation. There are hundreds of millions of buildings in our country, and some of them are pretty big energy losers. The home energy industry currently employs perhaps 100,000 people at most. It would take them a few decades to weatherize and solarize all these residential and commercial buildings (let alone educate homeowners so that they would take such actions).

But there are millions of students in schools who are probably more likely to be concerned about the world that they are about to inherit. Why not ask them to help?

You probably know of a lot of projects with this very same mission. But I want to do something different. Enough messaging has been done. We don't need to hand out more brochures and flyers about the environmental issues we may be facing. It is time to call for action!

For a number of years, I have been working on infrared thermography and building energy simulation to knock down the technical barriers these techniques pose to children. With NSF awarding us a $1.2M grant last year and FLIR releasing a series of inexpensive thermal cameras, the time to bring these tools to large-scale use in schools has finally arrived.

For more information, see our poster that will be presented at an NSF meeting next week. Note that this project has just begun, so we haven't had a chance to test the solarization part. But the results from the weatherization part, based on infrared thermography, have been extremely encouraging!

Listen to the data with the Visual Process Analytics


Visual analytics provides a powerful way for people to see patterns and trends in data by visualizing them. In real life, we use both our eyes and ears. So can we hear patterns and trends if we listen to the data?

I spent a few days studying the JavaScript Sound API and adding simple data sonification to our Visual Process Analytics (VPA) to explore this question. I don't know where adding the auditory sense to the analytics toolkit may lead us, but you never know. It is always good to experiment with various ideas.


Note that the data sonification capabilities of VPA are very experimental at this point. To make matters worse, I am not a musician by any stretch of the imagination. So the generated sounds in the latest version of VPA may sound horrible to you. But this represents a step toward better interaction with complex learner data. As my knowledge of music improves, the data should sound less terrifying.

The first test feature added to VPA is very simple: It just converts a time series into a sequence of notes and rests. To adjust the sound, you can change a number of parameters such as pitch, duration, attack, decay, and oscillator types (sine, square, triangle, sawtooth, etc.). All these options are available through the context menu of a time series graph.
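
VPA's sonification itself is written in JavaScript against the browser's sound API, but the core mapping idea is simple enough to sketch in a few lines. The Python snippet below is an illustrative, hypothetical mapping from a time series to (frequency, duration) pairs; the envelope (attack, decay) and oscillator type mentioned above would be applied at the synthesis stage, which is omitted here.

```python
import numpy as np

def series_to_notes(values, base_midi=60, span=24, note_seconds=0.25, rest_below=0.0):
    """Map a time series to (frequency_hz, duration_s) pairs, one note per sample.
    Values at or below `rest_below` become rests (frequency None). This only
    sketches the mapping; actual audio synthesis is done separately."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    notes = []
    for v in values:
        if v <= rest_below:
            notes.append((None, note_seconds))       # rest
            continue
        # Scale the value to a MIDI pitch within [base_midi, base_midi + span].
        midi = base_midi + span * (v - lo) / (hi - lo or 1.0)
        freq = 440.0 * 2 ** ((midi - 69) / 12.0)     # MIDI note -> Hz
        notes.append((freq, note_seconds))
    return notes

# Example: sonify a short activity-count series.
print(series_to_notes([0, 2, 5, 3, 0, 8], note_seconds=0.2))
```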

As the sound plays, you can also see a synchronized animation in VPA (as demonstrated by the embedded videos). This means that from now on VPA is a multimodal analytic tool. But I have no plan to rename it, as data visualization is still, and will remain, the dominant mode of the data mining platform.

The next step is to figure out how to synthesize better sounds from multiple types of actions as multiple sources or instruments (much like the Song from Pi). I will start with sonifying the scatter plot in VPA. Stay tuned.

What’s new in Visual Process Analytics Version 0.3


Visual Process Analytics (VPA) is a data mining platform that supports research on how students learn through the use of complex tools to solve complex problems. The complexity of this kind of learning activity produces complex process data (e.g., event logs) that cannot be easily analyzed. This difficulty calls for data visualization that can at least give researchers a glimpse of the data before they conduct in-depth analyses. To this end, the VPA platform provides many different types of visualization representing many different aspects of complex processes. These graphic representations should help researchers develop some sort of intuition. We believe VPA is an essential tool for data-intensive research, which will only grow more important as data mining, machine learning, and artificial intelligence play critical roles in effective, personalized education.

Several new features were added to Version 0.3, described as follows:

1) Interactions are provided through context menus. Context menus can be invoked by right-clicking on a visualization. Depending on where the user clicks, a context menu provides the available actions applicable to the selected objects. This allows a complex tool such as VPA to still have a simple, pleasant user interface.

2) Result collectors allow users to gather analysis results and export them in CSV format. VPA is a data browser that allows users to navigate the ocean of data in the repositories it connects to. Each step of navigation invokes some calculations behind the scenes. To collect the results of these calculations in a mining session, VPA now has a simple result collector that automatically keeps track of the user's work. A more sophisticated result manager is also being conceptualized and developed to let users manage their data mining results in a more flexible way. These results can be exported, if needed, for further analysis in other software tools.

3) Cumulative data graphs are available to render a more dramatic view of time series. It is sometimes easier to spot patterns and trends in cumulative graphs. This cumulative analysis applies to all levels of granularity of data supported by VPA (currently, the three granular levels are Top, Medium, and Fine, corresponding to three different ways to categorize action data). VPA also provides a way for users to select variables from a list to be highlighted in cumulative graphs.
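
Conceptually, a cumulative graph is nothing more than a running sum of the underlying series. A tiny sketch (not VPA code, and with made-up counts) shows why the cumulative view can be easier to read:

```python
import numpy as np

# Hypothetical counts of a student's actions per minute at one level of granularity.
actions_per_minute = np.array([3, 0, 5, 2, 0, 0, 7, 1, 4, 0])

cumulative = np.cumsum(actions_per_minute)
print(cumulative)   # [ 3  3  8 10 10 10 17 18 22 22]
# Flat stretches in the cumulative curve expose pauses; steep stretches expose
# bursts of activity -- patterns that are easy to miss in the raw series.
```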

Many other new features were also added in this version. For example, additional information about classes and students is provided to contextualize each data set. In the coming weeks, the repository will incorporate data from more than 1,200 students in Indiana who have undertaken engineering design projects using our Energy3D software. This unprecedented large-scale database will potentially provide a goldmine of research data in the study of engineering design.

For more information about VPA, see my AERA 2016 presentation.

Infrared imaging evidence of geothermal energy in a basement

Geothermal energy is thermal energy generated or stored in the Earth. Six meters (about 20 feet) down, the ground maintains a nearly constant temperature, roughly equal to the average annual air temperature at the location. In Boston, this is about 13 °C (55 °F).

You can feel the effect of geothermal energy in a basement, particularly on a hot summer day, when the basement can be significantly cooler. But IR imaging provides a unique visualization of this effect.

I happen to have a sub-basement that is partially buried in the ground. When I did an IR inspection of my basement on a cold night, in an attempt to identify places where heat escapes, something I did not expect struck me: as I scanned the basement, the whole basement floor appeared to be 4-6 °F warmer than the walls. Both the floor and the walls of my basement are bare concrete -- there is no insulation -- but the walls are partially or fully exposed to the outside air, which was about 24 °F at the time.

This temperature distribution pattern is opposite to the typical temperature gradient observed in a heated room where the top of a wall is usually a few degrees warmer than the bottom of a wall or the floor as hot air rises to warm up the upper part.

The only explanation for this warming of the basement floor is geothermal energy, caught by the IR camera.

Visualizing thermal equilibration: IR imaging vs. Energy2D simulation

Figure 1
A classic experiment to show thermal equilibration is to put a small Petri dish filled with some hot or cold water into a larger one filled with tap water around room temperature, as illustrated in Figure 1. Then stick one thermometer in the inner dish and another in the outer dish and take their readings over time.

With a low-cost IR camera like the FLIR C2 camera or FLIR ONE camera, this experiment becomes much more visual (Figure 2). As an IR camera provides a full-field view of the experiment in real time, you get much richer information about the process than a graph of two converging curves from the temperature data read from the two thermometers.
Figure 2

The complete equilibration process typically takes 10-30 minutes, depending on the initial temperature difference between the water in the two dishes and the amount of water in the inner dish. A larger temperature difference or a larger amount of water in the inner dish will require more time to reach thermal equilibrium.

Another way to quickly show this process is to use our Energy2D software to create a computer simulation (Figure 3). Such a simulation provides a visualization that resembles the IR imaging result. The advantage is that it runs very fast -- only 10 seconds or so are needed to reach thermal equilibrium. This allows you to test various conditions rapidly, e.g., changing the initial temperature of the water in the inner or outer dish or changing the diameters of the dishes.
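
For readers who want a feel for the physics without either the IR camera or Energy2D, the converging-curves behavior can be reproduced with a crude lumped (zero-dimensional) model in a few lines of Python. All parameters below are assumptions for illustration; Energy2D solves the full two-dimensional heat transfer problem instead.

```python
# A lumped two-body model of the Petri-dish experiment (an assumption-laden
# simplification, not an Energy2D calculation).
c_water = 4186.0                  # J/(kg*K), specific heat of water
m_inner, m_outer = 0.02, 0.10     # kg of water in the inner and outer dish (assumed)
UA = 0.8                          # W/K, assumed heat transfer coefficient * contact area

T_inner, T_outer = 60.0, 20.0     # initial temperatures in deg C (assumed)
dt, t = 1.0, 0.0                  # time step (s), elapsed time
while T_inner - T_outer > 0.5:    # stop when nearly equilibrated
    q = UA * (T_inner - T_outer) * dt        # heat moved in this step (J)
    T_inner -= q / (m_inner * c_water)
    T_outer += q / (m_outer * c_water)
    t += dt
print(f"~{t / 60:.1f} minutes to get within 0.5 deg C")
```

Doubling the initial temperature difference or the amount of water in the inner dish lengthens the time to equilibrium in this toy model, consistent with the observation above.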

Figure 3
Both real-world experiments and computer simulations have their own pros and cons. Exactly which one to use depends on your situation. As a scientist, I believe nothing beats real-world experiments in supporting authentic science learning, and we should favor them whenever possible. However, conducting real-world experiments requires a lot of time and resources, which makes them impractical to use throughout a course. Computer simulations provide an alternative that allows students to get a sense of real-world experiments without the time and cost. But the downside is that a computer simulation is, most of the time, an overly simplified scientific model that lacks the many layers of complexity and the many types of interactions we experience in reality. In a real-world experiment, there are always unexpected factors and details that need to be attended to. It is these unexpected factors and details that create genuinely profound and exciting teachable moments. This important aspect of science is largely missing from computer simulations, even a sophisticated computational fluid dynamics tool such as Energy2D.

Here is how I balance this trade-off: it is essential for students to learn simplified scientific models before they can explore complex real-world situations. The models give students the frameworks needed to make sense of real-world observations. A fair strategy is to use simulations to teach simplified models and then make time for students to conduct experiments in the real world and learn how to integrate and apply their knowledge of the models to solve real problems.

A side note: you may be wondering how well the Energy2D result agrees with the IR result quantitatively. This is an important question -- if the simulation is not a good approximation of the real-world process, it is not a good simulation, and one may challenge its usefulness, even for learning purposes. Figure 4 shows a comparison from a test run. As you can see, while the result predicted by Energy2D agrees in trend with the results observed through IR imaging, there are some details in the real data that may be caused by either human error in taking the data or thermal fluctuations in the room. What is more, after thermal equilibrium was reached, the water in both dishes continued to cool down to room temperature and then below it due to evaporative cooling. The cooling to room temperature was modeled in the Energy2D simulation through a thermal coupling to the environment, but evaporative cooling was not.

Figure 4

Time series analysis tools in Visual Process Analytics: Cross correlation

Two time series and their cross-correlation functions
In a previous post, I showed you what the autocorrelation function (ACF) is and how it can be used to detect temporal patterns in student data. The ACF is the correlation of a signal with itself. We are certainly also interested in exploring the correlations among different signals.

The cross-correlation function (CCF) is a measure of similarity of two time series as a function of the lag of one relative to the other. The CCF can be imagined as a procedure of overlaying two series printed on transparency films and sliding them horizontally to find possible correlations. For this reason, it is also known as a "sliding dot product."
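
For readers who want to see the "sliding dot product" concretely, here is a minimal sketch of a normalized CCF (an illustration, not VPA's implementation). In the toy example, series b imitates series a two steps later, so the CCF of (a, b) peaks at lag 2 while the CCF of (b, a) shows no comparable peak.

```python
import numpy as np

def cross_correlation(a, b, max_lag):
    """Normalized cross-correlation of series a and b for lags 0..max_lag.
    A positive lag k correlates a[t] with b[t + k], i.e., it asks whether
    activity in `a` tends to be followed by activity in `b` k steps later."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    denom = np.std(a) * np.std(b) * len(a)
    return [np.sum(a[: len(a) - k] * b[k:]) / denom for k in range(max_lag + 1)]

# Toy example: b repeats a two steps later.
a = [0, 1, 0, 0, 3, 0, 0, 2, 0, 0, 0, 1, 0, 0]
b = [0, 0, 0, 1, 0, 0, 3, 0, 0, 2, 0, 0, 0, 1]
print(["%.2f" % r for r in cross_correlation(a, b, 4)])   # peak near 1 at lag 2
print(["%.2f" % r for r in cross_correlation(b, a, 4)])   # no comparable peak
```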

The upper graph in the figure to the right shows two time series from a student's engineering design process, representing about 45 minutes of her construction (white line) and analysis (green line) activities while she was trying to design an energy-efficient house with the goal of cutting the net energy consumption to zero. At first glance, you probably have no clue what these lines represent or how they may be related.

But their CCFs reveal something more salient. The lower graph shows two curves that peak at certain points. I know you have a lot of questions at this point. Let me try to provide more explanation below.

Why are there two curves for depicting the correlation of two time series, say, A and B? This is because there is a difference between "A relative to B" and "B relative to A." Imagine that you print the series on two transparency films and slide one on top of the other. Which one is on the top matters. If you are looking for cause-effect relationships using the CCF, you can treat the antecedent time series as the cause and the subsequent time series as the effect.

What does a peak in the CCF mean, anyway? It points you to where more interesting things may lie. In the figure in this post, the construction activities of this particular student were significantly followed by analysis activities about four times (two of them within 10 minutes), but the analysis activities were significantly followed by construction activities only once (after 10 minutes).

Time series analysis tools in Visual Process Analytics: Autocorrelation

Autocorrelation reveals a three-minute periodicity
Digital learning tools such as computer games and CAD software emit a lot of temporal data about what students do when they are deeply engaged in the learning tools. Analyzing these data may shed light on whether students learned, what they learned, and how they learned. In many cases, however, these data look so messy that many people are skeptical about their meaning. As optimists, we believe that there are likely learning signals buried in these noisy data. We just need to use or invent some mathematical tricks to figure them out.

In Version 0.2 of our Visual Process Analytics (VPA), I added a few techniques that can be used to do time series analysis so that researchers can find ways to characterize a learning process from different perspectives. Before I show you these visual analysis tools, be aware that the purpose of these tools is to reveal the temporal trends of a given process so that we can better describe the behavior of the student at that time. Whether these traits are "good" or "bad" for learning likely depends on the context, which often necessitates the analysis of other co-variables.

Correlograms reveal similarity of two time series.
The first tool for time series analysis added to VPA is the autocorrelation function (ACF), a mathematical tool for finding repeating patterns obscured by noise in the data. The shape of the ACF graph, called the correlogram, is often more revealing than just looking at the shape of the raw time series graph. In the extreme case when the process is completely random (i.e., white noise), the ACF will be a Dirac delta function that peaks at zero time lag. In the extreme case when the process is completely sinusoidal, the ACF will be similar to a damped oscillatory cosine wave with a vanishing tail.
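
A minimal sketch (again, an illustration rather than VPA's implementation) shows how a sample ACF can be computed and what the two extreme cases mentioned above look like:

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.sum(x * x)
    return [np.sum(x[: len(x) - k] * x[k:]) / var for k in range(max_lag + 1)]

n = 400
rng = np.random.default_rng(1)
noise = rng.normal(size=n)                          # white noise
periodic = np.sin(2 * np.pi * np.arange(n) / 20)    # period of 20 samples

print(["%.2f" % r for r in acf(noise, 5)])            # ~1 at lag 0, then near 0
print(["%.2f" % r for r in acf(periodic, 25)][::5])   # oscillates, peaks again near lag 20
```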

An interesting question relevant to learning science is whether the process is autoregressive (or under what conditions the process can be autoregressive). The quality of being autoregressive means that the current value of a variable is influenced by its previous values. This could be used to evaluate whether the student learned from the past experience -- in the case of engineering design, whether the student's design action was informed by previous actions. Learning becomes more predictable if the process is autoregressive (just to be careful, note that I am not saying that more predictable learning is necessarily better learning). Different autoregression models, denoted as AR(n) with n indicating the memory length, may be characterized by their ACFs. For example, the ACF of AR(2) decays more slowly than that of AR(1), as AR(2) depends on more previous points. (In practice, partial autocorrelation function, or PACF, is often used to detect the order of an AR model.)
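
Reusing the acf() helper from the sketch above, one can also see the signature of an autoregressive process: the sample ACF of a simulated AR(1) series decays roughly geometrically with the lag (the parameters below are illustrative).

```python
# Illustrative AR(1) process: x[t] = phi * x[t-1] + noise.
phi = 0.8
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = phi * ar1[t - 1] + rng.normal()
print(["%.2f" % r for r in acf(ar1, 5)])   # roughly phi**k: 1.00, 0.80, 0.64, ...
```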

The two figures in this post show the ACF in action within VPA, revealing temporal periodicity and similarity in students' action data that would otherwise be obscure. The upper graphs of the figures plot the original time series for comparison.