Posts Tagged ‘Data Mining’

On the instructional sensitivity of computer-aided design logs

July 20th, 2014 by Charles Xie
Figure 1: Hypothetical student responses to an intervention.
In its fourth issue this year, the International Journal of Engineering Education published our 19-page paper on the instructional sensitivity of computer-aided design (CAD) logs. The study was based on our Energy3D software, which helps students learn science and engineering concepts and skills by creating sustainable buildings with a variety of built-in design and analysis tools related to Earth science, heat transfer, and solar energy. The paper proposes an innovative approach that uses response functions -- a concept borrowed from electrical engineering -- to measure instructional sensitivity from data logs (Figure 1).
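
As a rough illustration of the idea (a minimal sketch of my own, not the formulation used in the paper), one can treat the intervention as a step input and the rate of the targeted CAD actions as the output, so the "response" is simply how the action rate changes after the intervention. The log format, action names, and intervention time below are hypothetical.

```python
# Minimal sketch: action rate as a response to a step-like intervention.
# The log format [(minutes, action), ...] and the action names are
# hypothetical, not Energy3D's actual log vocabulary.
from collections import Counter

def action_rate_series(log, targeted, bin_minutes=5):
    """Count targeted actions per time bin; log is [(minutes, action), ...]."""
    counts = Counter()
    for t, action in log:
        if action in targeted:
            counts[int(t // bin_minutes)] += 1
    n_bins = int(max(t for t, _ in log) // bin_minutes) + 1
    return [counts[i] for i in range(n_bins)]

def step_response(rates, intervention_bin):
    """Mean post-intervention rate minus mean pre-intervention rate."""
    pre, post = rates[:intervention_bin], rates[intervention_bin:]
    return sum(post) / len(post) - sum(pre) / len(pre)

# Fabricated example: a student who starts using the targeted
# "Add Solar Panel" action only after an intervention at minute 60.
log = [(t, "Edit Wall") for t in range(0, 120, 3)] + \
      [(t, "Add Solar Panel") for t in range(65, 120, 4)]
rates = action_rate_series(log, targeted={"Add Solar Panel"})
print(step_response(rates, intervention_bin=60 // 5))
```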

Many researchers are interested in studying what students learn through complex engineering design projects. CAD logs provide fine-grained empirical data on student activities for assessing learning in such projects. However, the instructional sensitivity of CAD logs -- how students' CAD actions respond to interventions -- has never been examined, to the best of our knowledge.
Figure 2. An indicator of statistical reliability.

For the logs to be used as reliable data sources for assessment, they must be instructionally sensitive. Our paper reports the results of our systematic research on this important topic. To guide the research, we first propose a theoretical framework for computer-based assessments based on signal processing. This framework views assessment as the detection of signals against the noisy background that is often present in large temporal learner datasets because of the many uncontrollable factors and events in learning processes. To measure instructional sensitivity, we analyzed nearly 900 megabytes of process data logged by Energy3D as collections of time series. These time-varying data were gathered from 65 high school students who solved a solar urban design challenge in Energy3D over seven class periods, with an intervention occurring in the middle of their design projects.

Our analyses of these data show that the occurrence of design actions unrelated to the intervention was not affected by it, whereas the occurrence of the design actions targeted by the intervention revealed a continuum of reactions ranging from no response to strong response (Figure 2). From the temporal patterns of these student responses, we identified persistent effects and temporary effects with different decay rates. Students' electronic notes taken during the design process were used to validate their learning trajectories. These results show that an intervention occurring outside a CAD tool can leave a detectable trace in the CAD logs, suggesting that the logs can be used to quantitatively determine how effective an intervention has been for each individual student during an engineering design project.
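
One simple way to tell persistent responses from temporary ones (again a sketch of my own, not the analysis reported in the paper) is to fit an exponential decay to the post-intervention rates of the targeted actions and compare decay rates: a small decay rate suggests a persistent effect, a large one a temporary effect. The per-bin rates below are fabricated for illustration.

```python
# Minimal sketch: estimate the decay rate k of a post-intervention
# response r(t) ~ A * exp(-k * t) with a log-linear least-squares fit.
import math

def decay_rate(post_rates):
    """Estimate k from per-bin action rates recorded after the intervention."""
    pts = [(t, math.log(r)) for t, r in enumerate(post_rates) if r > 0]
    n = len(pts)
    if n < 2:
        return float("nan")
    sx = sum(t for t, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(t * t for t, _ in pts); sxy = sum(t * y for t, y in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return -slope  # larger k means the response fades faster

persistent = [6, 6, 5, 6, 5, 6, 5, 5]   # little decay: persistent effect
temporary = [8, 5, 3, 2, 1, 1, 1, 1]    # rapid decay: temporary effect
for label, rates in (("persistent", persistent), ("temporary", temporary)):
    print(label, round(decay_rate(rates), 3))
```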

Design replay: Reconstruction of students’ engineering design processes from Energy3D logs

June 18th, 2014 by Charles Xie
One of the useful features of our Energy3D software is the ability to record the entire design process of a student behind the scenes. We call the reconstruction of a design process from fine-grained process data design replay.


Design replay is not a screencast technology. The main difference is that it records a sequence of CAD models rather than video in a format such as MP4, and the sequence is played back in the original CAD tool that generated it rather than in a video player. As such, every snapshot model is fully functional and editable: a viewer can pause the replay and click around the user interface of the CAD tool to obtain or visualize more information if necessary. In this sense, design replay provides far richer information than a screencast, which captures only as much information as the pixels of the recording screen permit.


Design replay gives researchers and teachers a convenient way to look into students' design work quickly. It compresses hours of student work into minutes of replay without losing any information important for analysis. Furthermore, the reconstructed design sequence can be post-processed in many ways to extract additional information that may shed light on student learning, because any model in the recorded sequence can be used to calculate any of its properties.
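
To give a flavor of that kind of post-processing, the sketch below walks through a recorded sequence of design snapshots and computes one property of each model. The snapshot format (one JSON file per logged model, with a "parts" list) is hypothetical; it is not Energy3D's actual on-disk format.

```python
# Minimal sketch: compute a property of every model in a recorded design
# sequence. The JSON snapshot layout and field names are assumptions.
import json
from pathlib import Path

def property_over_time(snapshot_dir, part_type="SolarPanel"):
    """Return (snapshot name, count of parts of the given type) in log order."""
    series = []
    for path in sorted(Path(snapshot_dir).glob("*.json")):
        model = json.loads(path.read_text())
        count = sum(1 for p in model.get("parts", []) if p.get("type") == part_type)
        series.append((path.stem, count))
    return series

# Usage: trace how the number of solar panels evolved over a design session.
# for name, n in property_over_time("student_042_snapshots"):
#     print(name, n)
```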



The three videos embedded in this post show the design replays of three students' work from a classroom study that we completed just yesterday at a Massachusetts high school. Sixty-seven students spent approximately two weeks designing zero-energy houses -- a zero-energy house is a highly energy-efficient house that consumes net zero (or even negative) energy over a year thanks to its use of passive and active solar technologies to conserve and generate energy. These videos may give you an idea of how these three students solved the design challenge.

Learning analytics is the "crystallography" for educational research

March 24th, 2014 by Charles Xie
To celebrate 100 years of the dazzling history of crystallography, UNESCO has declared 2014 the International Year of Crystallography. To date, 29 Nobel Prizes have been awarded for scientific achievements related to crystallography. On March 7th, Science magazine honored crystallographers with a special issue.

Why is crystallography such a big deal? Because it enables scientists to "see" atoms and molecules and discover the molecular structures of substances. One of the most famous examples is Rosalind Franklin's 1952 X-ray diffraction image of DNA, which paved the way for the double helix model of Watson, Crick, and Wilkins. Enough ink has been spilled on the importance of this discovery.

Science fundamentally relies on techniques such as crystallography to detect and visualize invisible things. Educational research needs this kind of technique, too, to decode students' minds, which are opaque to researchers. Up to this point, educational researchers have depended on methods such as pre/post-tests, observations, and interviews. But these traditional methods are either insufficient or inefficient for measuring learning in complex processes such as scientific inquiry and engineering design. To truly achieve "no child left behind," we will need to develop a research technique that can monitor every student for every minute in the classroom.

Such a technique has to be based on an integrated informatics system that can engage students in meaningful learning tasks, tease out what is in their minds, and capture every bit of information that may be indicative of learning. This involves development across all areas of the learning sciences, including technology, curriculum, pedagogy, and assessment. Eventually, what we will have is a comprehensive set of data through which we can sift to find patterns of learning or evaluate the effectiveness of an intervention.

The whole process is not unlike crystallography. In the end, it is the learning analytics that concludes the research. Today we are seeing a lot of learner data, but we often have no idea what they actually mean. We can either declare that there is no significance in those data and shrug them off, or we can try to figure out the right kind of data analytics to decipher them. Which attitude to choose probably depends on which universe we live in. But the history of crystallography offers a clue. It was Max von Laue who created the first X-ray diffraction pattern in 1912, but he could not interpret it. It wasn't until William Henry Bragg and William Lawrence Bragg's groundbreaking work later in the same year that scientists became able to infer molecular structures from such patterns. In educational research, the equivalent is learning analytics -- the critical piece that will give the data meaning.

For more information, read my new article "Visualizing Student Learning."

The first paper on learning analytics for assessing engineering design?

January 30th, 2014 by Charles Xie
Figure 1
The International Journal of Engineering Education has published our paper, "A Time Series Analysis Method for Assessing Engineering Design Processes Using a CAD Tool," on learning analytics and educational data mining for assessing student performance in complex engineering design projects. I believe this is the first time learning analytics has been applied to the study of engineering design -- an extremely complicated process that is very difficult to assess with traditional methodologies because of its open-ended and practical nature.

Figure 2
This paper proposes a novel computational approach based on time series analysis to assess engineering design processes using our Energy3D CAD tool. To collect research data without disrupting the design learning process, design actions and artifacts are continuously logged as time series by the CAD tool behind the scenes while students work on an engineering design project such as a solar urban design challenge. These "atomically" fine-grained data can be used to reconstruct, visualize, and analyze a student's entire design process at extremely high resolution. Results of a pilot study in a high school engineering class suggest that these data can be used to measure the level of student engagement, reveal gender differences in design behaviors, and distinguish iterative (Figure 1) from non-iterative (Figure 2) cycles in a design process.
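
As a hedged illustration of what such time-series measures can look like (not the method used in the paper), the sketch below derives two simple quantities from an action log: actions per time bin as a rough engagement proxy, and the number of analysis-then-revision cycles as a rough indicator of iterative design. The action categories and log format are assumptions.

```python
# Minimal sketch: two simple time-series measures from a hypothetical
# action log of the form [(minutes, action), ...]. The action names are
# illustrative, not Energy3D's actual log vocabulary.
DESIGN = {"Add Wall", "Edit Wall", "Add Window", "Add Solar Panel"}
ANALYSIS = {"Run Solar Analysis", "Run Energy Analysis"}

def engagement(log, bin_minutes=5):
    """Actions per bin; long runs of empty bins hint at disengagement."""
    n_bins = int(max(t for t, _ in log) // bin_minutes) + 1
    rates = [0] * n_bins
    for t, _ in log:
        rates[int(t // bin_minutes)] += 1
    return rates

def iteration_cycles(log):
    """Count analysis actions that are followed by further design actions."""
    cycles, awaiting_revision = 0, False
    for _, action in sorted(log):
        if action in ANALYSIS:
            awaiting_revision = True
        elif action in DESIGN and awaiting_revision:
            cycles += 1
            awaiting_revision = False
    return cycles
```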

From the perspective of engineering education, this paper contributes to the emerging fields of educational data mining and learning analytics, which aim to expand the approaches to evidence of learning in a digital world. We are working on a series of papers to advance this research direction and expect to help shape the landscape of these fields.