Tablet-friendly STEM Resources

Friday, January 24th, 2014 by Jen Goree

Is your New Year’s resolution to find more interactive STEM resources that are tablet-ready? (We understand — we make similar technology-related resolutions, too!) We’ve optimized many of our browser-based interactive resources to run on popular tablets. By tuning our code, we’re able to make the power of our models available for your students!

For example, this Phase Change interactive runs 60% faster than it did before our recent code improvements:

And Metal Forces runs 33% faster:

Here are a few to try now:

Biology

Physics

Chemistry

Mathematics

For even more, check out a complete list of our tablet-friendly STEM resources.

Fireplaces at odds with energy efficiency? An Energy2D simulation

Saturday, January 18th, 2014 by Charles Xie
In the winter, a fireplace is the coziest place in the house when we need some thermal comfort. It is probably hard to remove from our living standards and our culture (it is supposed to be the only way Santa gets into your house). But is the concept of a fireplace -- an ancient way of warming up a house -- really a good idea today, when the entire house is heated by a modern distributed heating system? In terms of energy efficiency, the answer from science is that it probably isn't.

Figure 1. A fire is lit in the fireplace.
When the wood burns, a fireplace creates an updraft that draws warm air from the house to the outside through the chimney. This creates a "negative pressure" that pulls cold air from the outside into the house through small cracks in the building envelope. This is called the stack effect. So while you are getting radiant heat from the fireplace, you are also losing heat from the house at a faster rate through convection. As a result, your furnace has to work harder to keep the other parts of your house warm.
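For a rough sense of the driving force, a textbook estimate of the stack-effect draft pressure (a standard formula, not a number taken from Energy2D) is:

```latex
% Stack-effect draft pressure; temperatures in kelvin,
% subscripts i and o denote inside and outside
\Delta P \;=\; g\,h\,(\rho_o - \rho_i)
         \;\approx\; \rho_o\, g\, h\, \frac{T_i - T_o}{T_i}
```

Here h is the chimney height and ρ the air densities: the warmer the indoor air and the taller the chimney, the stronger the draft that pulls heated air out of the house.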

Figure 2. No fire.
Our Energy2D tool can be used to investigate this because it can simulate both the stack effect and thermostats. Let's create a house heated by a heating board on the floor, as shown in the figures in this article. The heating board is controlled by a thermostat whose temperature sensor is positioned in the middle of the house. A few cracks were purposely created in the wall on the right side to let cold air in from the outside; their sizes are exaggerated in this simulation.

Figure 1 shows the duty cycles of the heating board over two hours as the house was heated from 0 °C to 20 °C with a fire lit in the fireplace. A heating run is a segment of the temperature curve in which the temperature increases, indicating that the house is being heated. In our simulation, the duration of a heating run is approximately the same under different conditions; the difference lies in the durations of the cooling runs. A draftier house tends to have shorter cooling runs, as it loses energy more quickly. So let's count the heating runs. Figure 1 shows that 15 heating runs were recorded in this case.
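Counting heating runs from a logged temperature series is easy to automate. Here is a minimal sketch of the idea in Python; the function name and the noise threshold are illustrative, not part of Energy2D:

```python
# Count "heating runs": maximal segments of a temperature series in
# which the temperature rises, as in a thermostat duty cycle.

def count_heating_runs(temperatures, eps=0.01):
    """Count maximal rising segments in a temperature time series.

    temperatures: readings at equal time intervals (deg C)
    eps: minimum rise between samples to count as heating (noise filter)
    """
    runs = 0
    heating = False
    for prev, curr in zip(temperatures, temperatures[1:]):
        if curr - prev > eps:
            if not heating:        # a new heating run starts here
                runs += 1
                heating = True
        else:
            heating = False        # cooling or flat: the run has ended
    return runs

# A sawtooth-like thermostat trace with three heating runs
trace = [18.0, 18.5, 19.0, 19.5, 20.0, 19.6, 19.2, 19.8, 20.1,
         19.7, 19.3, 19.9, 20.2]
print(count_heating_runs(trace))   # -> 3
```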

Figure 2 shows the case when there was no fire in the fireplace and the fireplace door was closed. 13 heating runs were recorded in this case.

What does this result mean? It means that, in order to keep the house at 20 °C, you actually spend a bit more on your energy bill when the fireplace is burning. This is somewhat counterintuitive, but it may well be true, especially if you have a large, drafty house.

Figure 3. In a house without cracks...
How do we know that the increased energy loss is due to the cracks? Easy. We can just nudge the window and the wall on the right to close the gaps, giving us a tight house. Re-running the simulation shows that only 11 heating runs were recorded (Figure 3). You can also see in Figure 3 that the cooling runs lasted longer, indicating that the rate of heat loss decreased.

Note that this Energy2D simulation is only an approximation: it does not consider the radiant heat gain from the fireplace, and it assumes that the fire burns regardless of air supply. Still, it illustrates the point.

This example demonstrates how useful Energy2D can be for precollege students. In creating this simulation, all I did was drag and drop, change some parameters, run the simulation, and count the heating runs. Simple as that, this tool could be a game changer in science and engineering education in high schools or even middle schools. It creates an abundance of learning opportunities for students to experiment with concepts and designs that would otherwise be inaccessible. Similar experiences are currently possible only at the college level, with expensive professional software that typically costs hundreds or even thousands of dollars for a single license. Yet, according to some of our users, Energy2D rivals those expensive tools to some extent (I would never claim that myself, though).

The time of infrared imaging in classrooms has arrived

Thursday, January 9th, 2014 by Charles Xie
At the Consumer Electronics Show (CES) 2014, FLIR Systems debuted the FLIR ONE, the first thermal imager for smartphones, selling for $349. Compared with standalone IR cameras that often cost between $1,000 and $60,000, this is a huge step toward putting IR technology in the hands of millions.

With this price tag, the FLIR ONE finally brings the power of infrared imaging to science classrooms. Our unparalleled Infrared Tube website is dedicated to IR imaging experiments for science and engineering education. It publishes the experiments I have designed to showcase cool IR visualizations of natural phenomena. Each experiment comes with an illustration of the setup (so you can do it yourself) and a short IR video recorded from the experiment. Teachers and students can watch these YouTube videos to get an idea of what the unseen world of thermodynamics and heat transfer looks like through an IR camera -- before deciding to buy such a camera.

For example, this post shows one of my IR videos that may give you some idea of why people up north are spraying salt on the roads like crazy in this bone-chilling weather. The video demonstrates a phenomenon called freezing point depression, a process in which adding a solute to a solvent decreases the freezing point of the solvent. Spraying salt on the road melts the ice and prevents water from freezing. Check out this video for an infrared view of this mechanism!
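For reference, the textbook colligative-property formula behind the phenomenon is:

```latex
% Freezing point depression: i = van 't Hoff factor,
% K_f = cryoscopic constant of the solvent, m = molality of the solute
\Delta T_f = i \, K_f \, m
```

For NaCl in water, i ≈ 2 and K_f = 1.86 °C·kg/mol, so a 1 mol/kg brine freezes roughly 3.7 °C below pure water, and more salt depresses the freezing point further until the solution saturates.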

Dart projects of Energy2D and Quantum Workbench announced

Wednesday, January 8th, 2014 by Charles Xie
Last month, Google announced Dart 1.0, a new programming language for the Web that aims to greatly accelerate Web development. Dart uses HTML5 for its UI. It can either run on the Dart Virtual Machine being built into Chrome or be compiled into JavaScript to run in other browsers. Dart can also be used to create standalone apps (I guess it is meant to be the main programming language for Google's own Chrome OS) or server-side software. An ECMA Technical Committee (TC 52) has been formed to make Dart an international standard.

This is the moment I have been waiting for. As a developer with a C/Java background, I am not convinced that JavaScript is made for large, complex projects (which is where Web programming seems to be heading) -- even after reading many articles and books about JavaScript. The fact that, after ten years, Google Docs still has only a tiny fraction of the functionality of Word, and that basic functions such as positioning an image have not improved much, suggests that its JavaScript front end has probably reached its limit.

Don't get me wrong. JavaScript is an excellent choice for creating interactive Web experiences. I use JavaScript extensively to create Web interfaces for interacting with the Energy2D applet. But I think it is generally healthy for the developer community to be given more options. Recognizing the weaknesses of JavaScript, the community has already created languages such as CoffeeScript and TypeScript, which compile to JavaScript and strip away its unproductive features. Dart is Google's solution to these problems, and it should be welcomed. To a Java developer like me, Dart is a much better option because it returns the power of class-based object-oriented programming to developers who must create Web-based front ends. What is even sweeter is that its SDK provides a familiar Eclipse-based programming platform that makes many developers feel at home.

Excited about the potential of this new language (plus, it is from Google and will be highly performant on Chrome), I am announcing the development of Dart versions of our Energy2D and Quantum Workbench software. Both programs are based on numerical solutions of complex partial differential equations and will hopefully provide some showcases for anyone interested in Dart. This is not to say that development of the Java versions will cease: we are committed to developing and maintaining both the Dart and the Java versions.

Hopefully 2014 will be an exciting year for us!

Visual learning analytics based on graph theory: Part I

Sunday, December 22nd, 2013 by Charles Xie
All educational research and assessment are based on inference from evidence. Evidence is constructed from learner data. The quality of this construction is, therefore, fundamentally important. Many educational measurements have relied on eliciting, analyzing, and interpreting students' constructed responses to assessment questions. New types of data may engender new opportunities for improving the validity and reliability of educational measurements. In this series of articles, I will show how graph theory can be applied to educational research.

The process of inquiry-based learning with an interactive computer model can be imagined as a trajectory of exploration through the problem space spanned by the user interface of the model. Students use various widgets to control different variables, observe the corresponding emergent behaviors, take data, and then reason with the data to draw a conclusion. This sounds obvious, but exactly how do we capture, visualize, and analyze this process?

From the point of view of computational science, the learning space is enormous: if we have 10 controls in the user interface and each control has five possible inputs, there are 5^10 -- nearly 10 million -- possible combinations of settings to explore. To tackle a problem of this magnitude, we can use some mathematics. Graph theory is one of the tricks we are building into our process analytics. The publication of Leonhard Euler's Seven Bridges of Königsberg in 1736 is commonly considered the birth of graph theory.

Figure 1: A learning graph made of two subgraphs representing two ideas.
In graph theory, a graph is a collection of vertices connected by edges: G = (V, E). When applied to learning, a vertex represents an indicator that may be related to a certain competency of a student and that can be logged by software. An edge represents the transition from one indicator to another. We call a graph that represents a learning process a learning graph.

A learning graph is always a digraph G = (V, A) -- that is, it always has directed edges, or arrows -- because of the temporal nature of learning. Most likely, it is a multigraph with multiple directed edges between one or more pairs of vertices (sometimes called a multidigraph), because the student often needs multiple transitions between indicators to learn their connections. A learning graph often has loops -- edges that connect a vertex back to itself -- because the student may perform several actions related to the same indicator consecutively before making a transition. Figure 1 shows a learning graph that includes two sets of indicators, one for each of two ideas.

Figure 2. The adjacency matrix of the graph in Figure 1.
The size of a learning graph is the number of its arrows, denoted |A(G)|; it represents the number of actions the student takes during learning. The multiplicity of an arrow is the number of parallel arrows sharing the same endpoints; the multiplicity of a graph is the maximum multiplicity over its arrows, and it represents the most frequent transition between two indicators in a learning process. The degree dG(v) of a vertex v in a graph G is the number of edges incident to v, with loops counted twice. A vertex of degree 0 is an isolated vertex; a vertex of degree 1 is a leaf. The degree of a vertex represents the number of times the action related to the corresponding indicator is performed. The maximum degree Δ(G) of a graph G is the largest degree over all vertices; the minimum degree δ(G), the smallest.
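To make these definitions concrete, here is a minimal sketch of building a learning graph from a logged action sequence. The indicator names are hypothetical, and it uses the networkx library for illustration rather than our own analytics code:

```python
import networkx as nx

# Each logged action is mapped to an indicator (a vertex); each pair of
# consecutive actions defines an arrow (a directed edge).
actions = ["A1", "A1", "A2", "B1", "A2", "B1", "B2", "B1", "A2"]

G = nx.MultiDiGraph()
G.add_edges_from(zip(actions, actions[1:]))

# Size |A(G)|: the number of arrows, i.e., the number of transitions
print(G.number_of_edges())                                      # 8

# Multiplicity of the graph: the count of the most frequent transition
print(max(G.number_of_edges(u, v) for u, v in set(G.edges())))  # 2

# Degree of a vertex, with the A1 -> A1 loop counted twice
print(G.degree("A1"))                                           # 3
```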

The distance dG(u, v) between two vertices u and v in a graph G is the length of a shortest path between them. When u and v are identical, their distance is 0; when they are unreachable from each other, their distance is defined to be infinite (∞). The distance between two indicators may reveal how the related constructs are connected in the learning process.

Figure 3. A more crosscutting learning trajectory between two ideas.
Two vertices u and v are called adjacent if an edge exists between them, denoted u ~ v. The adjacency matrix is a square matrix that records which vertices of a graph are adjacent to which others. Figure 2 is the adjacency matrix of the graph in Figure 1; its trace (the sum of the diagonal elements) equals the number of loops in the graph. Once we know the adjacency matrix, we can apply spectral graph theory to study the properties of the graph through the characteristic polynomial, eigenvalues, and eigenvectors of the matrix (because a learning graph is a digraph, its adjacency matrix is asymmetric and the eigenvalues are often complex numbers). For example, the eigenvalues of the adjacency matrix may be used to reduce the dimensionality of the dataset into clusters.
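Continuing the sketch above (the graph is rebuilt so the snippet runs on its own), the adjacency matrix, its trace and spectrum, and the vertex distance defined earlier can all be computed directly:

```python
import numpy as np
import networkx as nx

actions = ["A1", "A1", "A2", "B1", "A2", "B1", "B2", "B1", "A2"]
G = nx.MultiDiGraph()
G.add_edges_from(zip(actions, actions[1:]))

A = nx.adjacency_matrix(G).toarray()   # entries are arrow multiplicities

# The trace counts the loops: here, the single A1 -> A1 self-transition
print(np.trace(A))                     # 1

# A digraph's adjacency matrix is asymmetric, so eigenvalues may be complex
print(np.linalg.eigvals(A))

# Distance d_G(u, v): the length of a shortest directed path
print(nx.shortest_path_length(G, "A1", "B2"))   # 3: A1 -> A2 -> B1 -> B2
```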

Figure 4. The adjacency matrix of the graph in Figure 3.
How might learning graphs be useful for analyzing student learning? Figure 3 gives an example showing a different pattern of exploration between two ideas (such as heat and temperature, or pressure and temperature). In this hypothetical case, the student makes more transitions between the two subgraphs that represent the two ideas and their indicator domains. This pattern can potentially result in a better understanding of the connections between the ideas. The adjacency matrix shown in Figure 4 has a different block structure from that shown in Figure 2: the off-diagonal blocks A-B and B-A are much sparser in Figure 2 than in Figure 4. The spectra of these two matrices may be quite different and could be used to characterize the knowledge integration process that fosters the linkage between the two ideas.

Go to Part II.

Season’s greetings from Energy2D

Saturday, December 14th, 2013 by Charles Xie
I have been so swamped with fundraising these days that I haven't been able to update this blog for more than two months. Since it is that time of the year again, I thought I would share a holiday video made by Matthew d'Alessio, a professor at California State University Northridge, using our signature software, Energy2D.

The simulator currently attracts more than 5,000 unique visitors each month -- a number that probably represents a sizable portion of the engineering students studying heat transfer on the planet. Over the past year, I have received many encouraging emails from Energy2D's worldwide users; some of them even compared it with well-known engineering programs. Franco Landriscina at the University of Trieste features Energy2D in his recent Springer book, "Simulation and Learning: A Model-Centered Approach."

I am truly grateful for these positive reactions and want to say "thank you" for all your kind words. There is nothing more rewarding than hearing from you on this fascinating subject of fluid dynamics and heat transfer. Rest assured that the development of this program will resume irrespective of its funding. In 2014, I hope to come up with a better radiation solver, which I have been thinking about for quite a long time. It turns out that simulating radiation is much more difficult than simulating convection!

Here is a tutorial video in Spanish made by Gabriel Concha.

Molecular modelers won Nobel Prize in Chemistry

Wednesday, October 9th, 2013 by Charles Xie
Martin Karplus, Michael Levitt, and Arieh Warshel won the 2013 Nobel Prize in Chemistry today "for the development of multiscale models for complex chemical systems."

The Royal Swedish Academy of Sciences said the three scientists' research in the 1970s has helped scientists develop programs that unveil chemical processes. "The work of Karplus, Levitt and Warshel is ground-breaking in that they managed to make Newton's classical physics work side-by-side with the fundamentally different quantum physics," the academy said. "Previously, chemists had to choose to use either/or." Together with a few earlier Nobel Prizes in quantum chemistry, this award consecrates the field of computational chemistry.

Incidentally, Martin Karplus was the Harvard thesis adviser of my postdoc co-adviser, Georgios Archontis. Georgios is one of the early contributors to CHARMM, a widely used computational chemistry package. CHARMM was the computational tool that I used when working with Georgios almost 15 years ago. In collaboration with Martin, Georgios and I studied glycogen phosphorylase inhibitors using free energy perturbation analysis in CHARMM. In another project, with Spyros Skourtis, I wrote a multiscale simulation program that couples molecular dynamics and quantum dynamics to study electron transfer in proteins and DNA molecules (i.e., use Newton's equation of motion to predict the trajectories of atoms, construct a Hamiltonian time series from those trajectories, and solve the time-dependent Schrödinger equation using the Hamiltonian series as input).
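For readers curious about the coupling scheme, here is a toy sketch of the last step: propagating the time-dependent Schrödinger equation with a Hamiltonian time series. The random stand-in Hamiltonian below replaces what, in the real program, was constructed from the MD trajectory:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0    # work in units where hbar = 1, for simplicity
dt = 0.1      # time step matching the MD snapshots
n = 4         # toy number of electronic basis states

rng = np.random.default_rng(0)

def hamiltonian(k):
    """Stand-in for the Hamiltonian built from MD snapshot k."""
    H = rng.normal(size=(n, n))
    return (H + H.T) / 2              # make it Hermitian (real symmetric)

psi = np.zeros(n, dtype=complex)
psi[0] = 1.0                          # electron starts in the first state

for k in range(100):
    # short-time propagator for the k-th snapshot of H(t)
    U = expm(-1j * hamiltonian(k) * dt / hbar)
    psi = U @ psi

print(np.abs(psi) ** 2)               # final state populations (sum to 1)
```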

We are thrilled by this news because many of the computational kernels of our Molecular Workbench software were actually inspired by CHARMM. The Molecular Workbench also advocates a multiscale philosophy and pedagogical approach, but for linking concepts at different scales with simulations, in order to help students connect the dots and build a more unified picture of science (see the image above).

We are glad to be part of the "Karplus genealogy tree," as Georgios put it when replying to my congratulatory email. We hope that, through our grassroots work in education, the power of molecular simulation from the top of the scientific research pyramid will enlighten millions of students and ignite their interest and curiosity in science.

Computational process analytics: Compute-intensive educational research and assessment

Saturday, October 5th, 2013 by Charles Xie
Trajectories of building movement (good)
Computational process analytics (CPA) differs from traditional research and assessment methods in that it is not only data-intensive but also compute-intensive. A unique feature of CPA is that it automatically analyzes the performance of student artifacts (including all the intermediate products) using the same science-based computational engines that students used to solve the problems. These engines take into account every detail of the artifacts, and the complex interactions among those details, that is relevant to the nature of the problems students solved. They also recreate the scenarios and contexts of student learning (e.g., the calculated results in such a post-processing analysis are exactly the same as those presented as feedback to students while they were solving the problems). As such, the computational engines provide holistic, high-fidelity assessments of students' work that no human evaluator can match: no one can track the numerous variables students may have created over a long and deep learning process within a short evaluation time, but a computer program can easily do the job. Using disciplinarily intelligent computational engines for performance assessment is a major step forward in CPA, as this approach has the potential to revolutionize computer-based assessment.

No building movement (bad)
To give an example, this weekend I am busy running analysis jobs on my computer to process 1 GB of data logged by our Energy3D CAD software. I am trying to reconstruct and visualize the learning and design trajectories of all the students, projected onto many different axes and planes of the state space. Doing so requires an estimated 30-40 hours of CPU time on my Lenovo X230 tablet, which is a pretty fast machine. Each step loads a sequence of artifacts, runs a solar simulation for each artifact, and analyzes the results (since I have automated the entire process, this is actually not as bad as it sounds). Our assumption is that the time evolution of the performance of these artifacts approximately reflects the time evolution of the performance of their designers. We should be able to tell how well a student was learning by examining whether the performance of her artifacts shows a systematic trend of improvement or is just random. This is far better than performance assessment based on looking only at students' final products.
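As a sketch of that "systematic trend or just random" test, one could regress each student's per-artifact performance scores against revision order and check the slope; the scores below are made up for illustration:

```python
from scipy.stats import linregress

# Hypothetical per-artifact performance scores, in the order the
# artifacts were saved (e.g., simulated solar output of each design)
performance = [42.0, 44.5, 43.8, 47.2, 49.0, 48.6, 51.3, 53.9]

result = linregress(range(len(performance)), performance)
print(f"slope = {result.slope:.2f} per revision, p = {result.pvalue:.4f}")

# A positive slope with a small p-value suggests systematic improvement;
# a slope indistinguishable from zero suggests random variation.
```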

After all the intermediate performance data are retrieved by post-processing the artifacts, we can analyze them using our Process Analyzer -- a visual mining tool being developed to present the analysis results in various visualizations (we hope the Process Analyzer will eventually become a powerful assessment assistant for teachers, as it would free them from having to deal with an enormous amount of raw data or complicated data mining algorithms). For example, the two images in this post show that one student went through a lot of optimization in her design while the other did not (there is no trajectory in the second image).

National Science Foundation funds research that puts engineering design processes under a big data "microscope"

Friday, September 20th, 2013 by Charles Xie
The National Science Foundation has awarded us $1.5 million to advance big data research on engineering design. In collaboration with Professors Şenay Purzer and Robin Adams at Purdue University, we will conduct a large-scale study involving over 3,000 students in Indiana and Massachusetts in the next five years.

This research will be based on our Energy3D CAD software, which automatically collects large amounts of process data behind the scenes while students are working on their designs. Fine-grained CAD logs possess all four characteristics of big data as defined by IBM:
  1. High volume: Students can generate a large amount of process data in a complex open-ended engineering design project that involves many building blocks and variables; 
  2. High velocity: The data can be collected, processed, and visualized in real time to provide students and teachers with rapid feedback; 
  3. High variety: The data encompass any type of information provided by a rich CAD system such as all learner actions, events, components, properties, parameters, simulation data, and analysis results; 
  4. High veracity: The data must be accurate and comprehensive to ensure fair and trustworthy assessments of student performance.
These big data provide a powerful "microscope" that can reveal direct, measurable evidence of learning at extremely high resolution and at a statistically significant scale. Automation will make this research approach highly cost-effective and scalable. Automatic process analytics will also pave the way for building adaptive and predictive software systems for teaching and learning engineering design. Such systems, if successful, could become useful assistants to K-12 science teachers.

Why is big data needed in educational research and assessment? Because we want students to learn more deeply, and deep learning generates big data.

In the context of K-12 science education, engineering design is a complex cognitive process in which students learn and apply science concepts to solve open-ended problems with constraints to meet specified criteria. The complexity, open-endedness, and length of an engineering design process often create a large quantity of learner data that makes learning difficult to discern using traditional assessment methods. Engineering design assessment thus requires big data analytics that can track and analyze student learning trajectories over a significant period of time.
Deep learning generates big data.

This differs from research that does not require sophisticated computation to understand the data. For example, in typical pre/post-tests using multiple-choice assessment, the selections of individual students are used directly as performance indices -- there is basically no depth to these self-evident data. I call this kind of data usage "data picking": analyzing such data is like picking up apples that have already fallen to the ground (as opposed to data mining, which requires some computational effort).

Process data, on the other hand, contain a lot of details that may be opaque to researchers at first glance. In the raw form, they often appear to be stochastic. But any seasoned teacher can tell you that they are able to judge learning by carefully watching how students solve problems. So here is the challenge: How can computer-based assessment accomplish what experienced teachers (human intelligence plus disciplinary knowledge plus some patience) can do based on observation data? This is the thesis of computational process analytics, an emerging subject that we are spearheading to transform educational research and assessment using computation. Thanks to NSF, we are now able to advance this subject.

Measuring the effects of an intervention using computational process analytics

Sunday, September 15th, 2013 by Charles Xie
"At its core, scientific inquiry is the same in all fields. Scientific research, whether in education, physics, anthropology, molecular biology, or economics, is a continual process of rigorous reasoning supported by a dynamic interplay among methods, theories, and findings. It builds understanding in the form of models or theories that can be tested."  —— Scientific Research in Education, National Research Council, 2002
Actions caused by the intervention
Computational process analytics (CPA) is a research method that we are developing in the spirit of the above quote from the National Research Council report. It is a whole class of data mining methods for quantitatively studying the learning dynamics in complex scientific inquiry or engineering design projects that are digitally implemented. CPA views performance assessment as detecting signals from the noisy background often present in large learner datasets due to many uncontrollable and unpredictable factors in classrooms. It borrows many computational techniques from engineering fields such as signal processing and pattern recognition. Some of these analytics can be considered as the computational counterparts of traditional assessment methods based on student articulation, classroom observation, or video analysis.

Actions unaffected by the intervention
Computational process analytics has wide applications in education assessments. High-quality assessments of deep learning hold a critical key to improving learning and teaching. Their strategic importance has been highlighted in President Obama’s remarks in March 2009: “I am calling on our nation’s Governors and state education chiefs to develop standards and assessments that don’t simply measure whether students can fill in a bubble on a test, but whether they possess 21st century skills like problem-solving and critical thinking, entrepreneurship, and creativity.” However, the kinds of assessments the President wished for often require careful human scoring that is far more expensive to administer than multiple-choice tests. Computer-based assessments, which rely on the learning software to automatically collect and sift learner data through unobtrusive logging, are viewed as a promising solution to assessing increasingly prevalent digital learning.

While there has been a lot of work on computer-based assessment for STEM education, one foundational question has rarely been explored: how sensitive are the logged learner data to instruction?

Actions caused by the intervention.
According to the assessment guru Popham, there are two main categories of evidence for determining the instructional sensitivity of an assessment tool: judgmental evidence and empirical evidence. Computer logs provide empirical evidence based on recorded user data: the logs themselves provide empirical data for assessment, and how they change before and after instruction provides empirical data for evaluating instructional sensitivity. Like any other assessment tool, computer logs must be instructionally sensitive if they are to provide reliable data for gauging student learning under intervention.


Actions unaffected by the intervention.
Earlier studies have used CAD logs to capture designers' operational knowledge and reasoning processes. Those studies were not designed to understand the learning dynamics occurring within a CAD system and, therefore, did not need to assess students' acquisition and application of knowledge and skills through CAD activities. Unlike them, we are studying the instructional sensitivity of CAD logs, which describes how students react to interventions through their CAD actions. Although interventions can be carried out either by humans (such as teacher instruction or group discussion) or by the computer (such as adaptive feedback or intelligent tutoring), we have focused on human interventions in this phase of our research. Studying instructional sensitivity to human interventions will inform the development of effective computer-generated interventions for teaching engineering design in the future (which is another reason, besides cost effectiveness, why research on automatic assessment using learning software logs is so promising).

Distribution of intervention effect across 65 students.
The study of instructional effects on design behavior and performance is particularly important when viewed from the perspective of teaching science through engineering design, a practice now mandated by the newly established Next Generation Science Standards of the United States. A problem commonly observed in K-12 engineering projects, however, is that students often reduce engineering design challenges to construction or craft activities that may not truly involve the application of science. This suggests that other driving forces acting on learners, such as hunches and desires about how the design artifacts should look, may overwhelm the effects of instruction on how to use science in design work. Hence, research on the sensitivity of design behavior to science instruction requires careful analyses using innovative data analytics such as CPA to detect the changes, however slight they might be. The insights obtained from studying this instructional sensitivity may yield actionable knowledge for developing effective instruction that can reproduce or amplify those changes.

Our preliminary CPA results show that CAD logs created with our Energy3D CAD tool are instructionally sensitive. The first four figures embedded in this post show two pairs of opposite cases, with one type of action sensitive to an instruction that occurred outside the CAD tool and the other not -- because the instruction was related to the first type of action and had nothing to do with the other. The last figure shows the distribution of instructional sensitivity across 65 students. In this figure, a larger number means higher instructional sensitivity, and a number close to one means the instruction had no effect. From the graph, you can see that the three types of actions that are not related to the instruction fluctuate around one, whereas the fourth type of action is strongly sensitive to the instruction.
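As a sketch of how such a sensitivity number could be computed, one can take, for each action type, the ratio of the action rate after the intervention to the rate before it. The action names and counts below are hypothetical, not our actual analysis code:

```python
def sensitivity(before_counts, after_counts, before_time, after_time):
    """Ratio of per-minute action rates after vs. before an intervention."""
    return {
        action: (after_counts[action] / after_time) /
                (before_counts[action] / before_time)
        for action in before_counts
    }

before = {"add_wall": 30, "move_window": 12, "run_solar_analysis": 3}
after = {"add_wall": 28, "move_window": 11, "run_solar_analysis": 15}

print(sensitivity(before, after, before_time=40.0, after_time=40.0))
# add_wall and move_window stay near 1.0; run_solar_analysis jumps to 5.0
```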

These results demonstrate that software logs can not only record what students do with the software but also capture the effects of what happens outside the software.