Monthly Archives: February 2018

Uncertainty: Real-world examples

When you live in New England in the winter, you pay attention to the forecast. Large snowstorms can make travel nearly impossible. Heavy snow and strong winds can cause coastal flooding, power outages, and roof collapses.

The National Weather Service (NWS) exists to “provide weather, water, and climate data, forecasts and warnings for the protection of life and property and enhancement of the national economy.” They’re my favorite source for weather forecasts. And yesterday morning (February 26), they gave me one more reason to appreciate them.

You see, there’s a big storm that may (or may not) be coming later this week. Last week, some forecasters (not from the NWS, it should be noted) were calling for blizzard conditions – seven to eight days from any potential storm! That’s lots of planning time, but is it valid to make plans based on seven-day forecasts?

Yesterday morning’s post from NWS Boston included this graphic and description:


Note the words “POTENTIALLY” and “LOW CONFIDENCE FORECAST”. Clicking through to look at the details, you can learn a bit about the model information on which they’re basing their forecast. If you don’t know a lot about meteorology, you can get lost in the abbreviations and details of the models. But the meteorologists have made it easy to understand their shifting confidence by explaining how model runs have shifted as they compile more information. They’ve put a bit of this information into their graphic, illustrating that the model error decreases as more information is known closer to the event.

On a much more novice level, this is what students do when they use High-Adventure Science (HAS) activities. (High-Adventure Science, a National Science Foundation-funded project, produced six NGSS-aligned curricular modules on cutting-edge Earth and environmental science topics. These free, online curricula incorporate real-world data and computational models and are appropriate for middle and high school classrooms.) In HAS activities, students run models and make claims based on data from the model runs. They rate their confidence with their answers and explain the factors that led them to that confidence level.

In our research, we found that when students were asked to write about uncertainty in the context of scientific arguments, they improved their overall argumentation ability. That suggests that teaching about uncertainty in science enables students to better understand real-world science – including weather forecasts.

Will we experience a big snowstorm later this week? I’m confident that the staff at NWS Boston will keep an eye on the model runs, updating me (and the rest of the Boston area) with their forecasts and their levels of certainty in the data. In the meantime, check out a High-Adventure Science activity to enhance your students’ scientific thinking skills!




Virtual Solar Grid adds Crescent Dunes Solar Tower

The Crescent Dunes Solar Tower as modeled in Energy3D
A light field visualization in Energy3D
A top view
The Crescent Dunes Solar Power Tower is a 110 MW utility-scale concentrated solar power (CSP) plant with 1.1 GWh of molten salt energy storage, located about 190 miles northwest of Las Vegas in the United States (watch a video about it). The plant includes a whopping 10,347 large heliostats that collect and focus sunlight onto a central receiver at the top of a 195-meter tall tower to heat 32,000 tons of molten salt. The molten salt circulates from the tower to storage tanks, where it is then used to produce steam and generate electricity. Excess thermal energy is stored in the molten salt and can be used to generate power for up to ten hours, providing electricity in the evening or during cloudy hours. Unlike other CSP plants, Crescent Dunes' advanced storage technology eliminates the need for any backup fossil fuels to melt the salt and jumpstart the plant in the morning. Each heliostat is made up of 35 mirror facets of 6 × 6 feet (1.8 × 1.8 m) each, adding up to a total aperture of 115.7 square meters. The total solar field aperture sums to an area of 1,196,778 square meters, or more than one square kilometer, in a land area of 1,670 acres (6.8 square kilometers). That is, the mirrors can potentially collect roughly one sixth of all the solar energy that shines onto the field. Costing about $1 billion to construct, the plant was commissioned in September 2015.
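The ratio of mirror aperture to land area can be checked with a few lines of R. This is only a back-of-the-envelope sketch using the figures quoted above; the acre-to-square-meter conversion factor and the variable names are my own additions. Because the per-heliostat aperture is rounded to 115.7 square meters, the computed total differs slightly from the quoted 1,196,778 square meters.

```r
# Figures quoted in the text above
n_heliostats  <- 10347
aperture_each <- 115.7      # square meters per heliostat (rounded)
land_acres    <- 1670

# Assumption: standard conversion, 1 acre = 4046.86 square meters
m2_per_acre   <- 4046.86

total_aperture <- n_heliostats * aperture_each   # total mirror area in square meters
land_area      <- land_acres * m2_per_acre       # land area in square meters
coverage       <- total_aperture / land_area     # fraction of the land covered by mirrors

round(total_aperture)       # total aperture, square meters
round(land_area / 1e6, 2)   # land area, square kilometers
round(coverage, 3)          # aperture-to-land ratio
```

The ratio comes out at a little under 18%, which is where the "fraction of the sunlight on the field" estimate comes from.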

A close-up view of accurate modeling of heliostat tracking
Since its inception in January 2018, our Virtual Solar Grid has included the Energy3D models of nearly all the existing large CSP power plants in the world. That covers more than 80 large CSP plants capable of generating more than 11 TWh per year. The ultimate goal of the Virtual Solar Grid is to mirror every solar energy system in the world in the computing cloud through crowdsourcing involving a large number of students interested in engineering, creating an unprecedentedly detailed computational model for learning how to design a reliable and resilient power grid based completely on renewable energy (solar energy in this phase). The modeling of the Crescent Dunes plant has put our Energy3D software to a stress test. Can it handle such a complex project with so many heliostats in such a large field?
A side view

Near the base of the tower
Over the shoulder of the tower
The solar field
This became my President's Day project. To make this happen, I first had to increase the resolution of the Google Maps images supported in Energy3D. A free Google Maps developer account can only fetch images of 640 × 640 pixels. When you are looking at an area as large as a few square kilometers, that resolution gives you very blurry images. To fetch high-resolution images from Google without paying for them, I had to make Energy3D download many more images and then knit them together into one large image that forms an Earth canvas in Energy3D (hence the many Google logos and copyright notices in the ground image, one from each patch, that I could not get rid of).

Once I had the Earth canvas, I drew heliostats on top of it (one by one, more than 10,000 times!) and compared their orientations and shadows as rendered by Energy3D with those shown in the Google Maps images. The problem is that Google doesn't tell you when the satellite image was taken. But based on the shadows of the tower and other structures, I could easily figure out an approximate time and date. I then set that time and date in Energy3D and confirmed that the shadow of the tower in the Energy3D model overlaps with that in the satellite image.

After this calibration, every virtual heliostat that I copied and pasted automatically aligned with its counterpart in the satellite image (as long as the original copy specifies the tower that it points to). This is a visual confirmation that the tracking algorithm for the virtual heliostats in Energy3D is just as good as the one used by the computers that control the motions of the real-world heliostats. Matching the computer model with the satellite image is essential, as the procedure ensures the accuracy of our numerical simulation.
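The image-knitting step mentioned above can be illustrated with a toy sketch in R. This is not how Energy3D does it internally; it only shows the idea of joining a grid of 640 × 640 tiles into one larger canvas. The 3 × 3 grid, the helper make_tile(), and the use of plain numeric matrices as stand-ins for images are all assumptions made for illustration.

```r
# Hypothetical sketch: stitch a grid of square image tiles into one canvas.
# Each tile is a numeric matrix standing in for a 640 x 640 grayscale image.
tile_size <- 640
n <- 3   # assumed 3 x 3 grid of tiles

# make_tile() is a made-up helper that fills a tile with a single value
make_tile <- function(value) matrix(value, nrow = tile_size, ncol = tile_size)

# tiles[[i]][[j]] is the tile in row i, column j of the grid
tiles <- lapply(1:n, function(i) {
  lapply(1:n, function(j) make_tile((i - 1) * n + j))
})

# join the tiles of each grid row side by side, then stack the rows
row_strips <- lapply(tiles, function(tile_row) do.call(cbind, tile_row))
canvas <- do.call(rbind, row_strips)

dim(canvas)   # pixel dimensions of the stitched canvas
```

With nine tiles the canvas is three times wider and taller than a single tile, which is why each patch still carries its own logo and copyright notice.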

The solar field
After numerous other improvements to Energy3D, the latest version (V7.8.4) was finally capable of modeling this colossal power plant. The improvements include the ability to divide the whole project into nine smaller projects and then let Energy3D stitch the smaller 3D models together into the full model using the Import Tool. This divide-and-conquer method makes the user interface a lot faster, as neither you nor Energy3D needs to deal with 9,000 existing heliostats while you are adding the last 1,000. The annual output of the plant predicted by Energy3D is 462 GWh, as opposed to the official projection of 500 GWh, assuming 90% mirror reflectance and 25% thermal-to-electric conversion efficiency.
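To put the two output figures side by side, here is a rough calculation in R. It uses only numbers quoted in this post; the implied capacity factor is my own derived quantity and assumes a year of 8,760 hours and the 110 MW nameplate capacity.

```r
predicted_gwh <- 462   # annual output predicted by Energy3D (from the text)
official_gwh  <- 500   # official projection (from the text)
capacity_mw   <- 110   # nameplate capacity (from the text)

# how close the prediction is to the official projection
ratio <- predicted_gwh / official_gwh

# energy the plant would produce if it ran at full power all year
# (assumption: 8,760 hours in a year)
max_gwh <- capacity_mw * 8760 / 1000

# implied capacity factor of the predicted output
capacity_factor <- predicted_gwh / max_gwh

round(ratio, 3)
round(capacity_factor, 3)
```

The prediction is about 92% of the official projection, and it corresponds to the plant running at a bit under half of its nameplate capacity averaged over the year.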

One thing I had to do, though, was double the memory requirement for the software from the default 256 MB to 512 MB for the Windows version (the Mac version is fine). This would make the software fail on really old computers that have only 256 MB of total memory (but I don't think such old computers would still work properly today anyway). The implication of this change is that, if you are a Windows user and have installed Energy3D before, you will need to re-install it using the latest installer from our website in order to take advantage of this update. If you are not sure, you can find out how much memory your Energy3D is allocated by checking System Information and Preferences under the File menu. If that number is about 250 MB, you have to re-install the software if you want to see the spectacular Crescent Dunes model in Energy3D without crashing it.

With basically only the three Ivanpah Solar Towers left to be modeled and uploaded, the Virtual Solar Grid has incorporated nearly all the operational solar thermal power plants in the world. We will continue to add new CSP plants as they come online and show up in Google Maps. In our next phase, we will add more photovoltaic (PV) solar power plants to the Virtual Solar Grid. At this point, PV accounts for only 8% of the modeled capacity in the Virtual Solar Grid, compared with 92% from CSP. Adding PV power plants will really require crowdsourcing, as there are many more PV projects in the world, including potentially millions of small rooftop systems. On a separate note, the National Renewable Energy Laboratory (NREL) has estimated that, if we add solar panels to every square foot of usable roof area in the U.S., we could meet 40% of our total electricity need. Is their statement realistic? Perhaps only time can tell, but by adding more and more virtual solar power systems to the Virtual Solar Grid, we might be able to tell sooner.

Everyday Inquiry with R: Is Yogurt-X Expensive?

To kick off this Everyday Inquiry with R series, I’d like to recount a conversation between my friend Eric and me about one of Americans’ favorite foods, yogurt.

R is a free programming language for statistical computing and graphics, which we’re using in our new National Science Foundation-funded CodeR4MATH project to research the development of students’ computational thinking and mathematical modeling competencies.

The other day I showed Eric my yogurt collection. He was amazed that I had tried so many different brands and flavors.

Eric: Which one is your favorite?
Jie: Currently yogurt-X (a pseudonym of my favorite brand).
Eric: How much is it?
Jie: $1.59
Eric: That’s expensive.
Jie: No, it’s not.
Eric: It is. Take a look at the prices of all the products you collected.

I had previously stored the prices in a vector called ‘yogurt_price’. Below is the simple R code to do that.

# create a vector 'yogurt_price' consisting of yogurt prices
yogurt_price = c(1.13, 2.00, 1.69, 1.79, 2.09, 1.00, 1.00, 0.60, 1.00, 1.11, 1.79, 3.19, 1.79, 1.99, 3.69, 2.79, 0.60, 1.79, 1.99, 4.09, 4.49, 4.49, 0.89, 0.89, 1.99, 2.09, 2.09, 2.09, 2.09, 0.69, 1.59, 0.69, 0.69, 0.69, 1.00, 1.19, 7.69)

Jie: Here they are (typing yogurt_price in R console to view the data).

# view a vector
yogurt_price

 [1] 1.13 2.00 1.69 1.79 2.09 1.00 1.00 0.60 1.00 1.11 1.79 3.19 1.79 1.99
[15] 3.69 2.79 0.60 1.79 1.99 4.09 4.49 4.49 0.89 0.89 1.99 2.09 2.09 2.09
[29] 2.09 0.69 1.59 0.69 0.69 0.69 1.00 1.19 7.69

Eric: Nice. How many products did you collect?
Jie: There are…(calling the length() function)

# count the number of elements in a vector
length(yogurt_price)

[1] 37

Jie: 37.
Eric: Oh, that’s a lot. Hmmm, which one is the most expensive (trying to eyeball the greatest number)?
Jie: Well, let me show you…(calling the sort() function)

# sort the elements in a vector
sort(yogurt_price)

 [1] 0.60 0.60 0.69 0.69 0.69 0.69 0.89 0.89 1.00 1.00 1.00 1.00 1.11 1.13
[15] 1.19 1.59 1.69 1.79 1.79 1.79 1.79 1.99 1.99 1.99 2.00 2.09 2.09 2.09
[29] 2.09 2.09 2.79 3.19 3.69 4.09 4.49 4.49 7.69

Eric: Wow, $7.69? And the least expensive is only $0.60. What’s the normal price then?
Jie: Normal? Well, there are a number of $0.69s, $1.00s, $1.79s, and $2.09s. Let me show a frequency count (calling the table() function).

# generate a table of counts for each element in a vector
table(yogurt_price)

yogurt_price
 0.6 0.69 0.89    1 1.11 1.13 1.19 1.59 1.69 1.79 1.99    2 2.09 2.79 3.19
   2    4    2    4    1    1    1    1    1    4    3    1    5    1    1
3.69 4.09 4.49 7.69
   1    1    2    1
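As an aside on Eric's question about the "normal" price, the most frequent price and the middle price can also be pulled out directly. This is a small sketch of my own, not part of the original conversation; the names price_counts and most_common are made up for illustration.

```r
yogurt_price <- c(1.13, 2.00, 1.69, 1.79, 2.09, 1.00, 1.00, 0.60, 1.00, 1.11,
                  1.79, 3.19, 1.79, 1.99, 3.69, 2.79, 0.60, 1.79, 1.99, 4.09,
                  4.49, 4.49, 0.89, 0.89, 1.99, 2.09, 2.09, 2.09, 2.09, 0.69,
                  1.59, 0.69, 0.69, 0.69, 1.00, 1.19, 7.69)

# count how often each price occurs
price_counts <- table(yogurt_price)

# the price (or prices) with the highest count
most_common <- names(price_counts)[price_counts == max(price_counts)]
most_common

# the middle value when all prices are sorted
median(yogurt_price)
```

The most frequent price is $2.09, while the middle price of the 37 products is $1.79, so "normal" depends on which summary you pick.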

Jie: There are 5 yogurts priced at $2.09. Is that normal?
Eric: Hmmm…there are four $0.69, four $1.00, and four $1.79. $2.09 seems to be on the expensive end.
Jie: Let’s plot the data points on a number line and see where most prices fall (calling the stripchart() function).

# draw a strip chart for a vector
stripchart(yogurt_price)

Eric: What is this?
Jie: The x axis is price. Each little square stands for a data point. Some of them are overlapping because the default method is ‘overplot’. Let me make a few changes. We’ll use the ‘stack’ method to stack up data points of the same value. Also, let’s use solid dots instead of hollow squares and set some distance between the points.

stripchart(yogurt_price,
  method = "stack",   # stack up data points of the same value
  pch = 16,           # use solid round dots for the points
  offset = 0.5,       # set the distance between points at 0.5
  at = 0              # set the location of points to be near the x axis
)

Eric: Beautiful! Looks like there are one, two, three…(counting dots between 0 and 1) 12 products priced at $1.00 or below. And there are…
Jie: You want to see how many products fall in each dollar bracket? Let’s pull out the histogram (calling the hist() function).

# draw a histogram for a vector
hist(yogurt_price)

Eric: That saves me a lot of time counting. There are 12 products between $1.00 and $2.00. Your yogurt-X is $1.59. I am sure there are a lot below $1.50. Can you narrow the bins so I can see how many cost less than $1.50?
Jie: Sure! (specifying the breaks argument for the hist() function)

hist(yogurt_price,
  breaks = 16   # suggest about 16 bins; R chooses round break points near that number
)
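If you would rather read the bin counts as numbers than off the plot, hist() can return them without drawing anything. In this sketch I pass explicit 50-cent break points, which is my assumption about the bins shown in the plot, and I build a small data frame (bin_table, a made-up name) to list each bin with its count.

```r
yogurt_price <- c(1.13, 2.00, 1.69, 1.79, 2.09, 1.00, 1.00, 0.60, 1.00, 1.11,
                  1.79, 3.19, 1.79, 1.99, 3.69, 2.79, 0.60, 1.79, 1.99, 4.09,
                  4.49, 4.49, 0.89, 0.89, 1.99, 2.09, 2.09, 2.09, 2.09, 0.69,
                  1.59, 0.69, 0.69, 0.69, 1.00, 1.19, 7.69)

# compute the histogram without plotting it
# (assumption: 50-cent bins from $0.50 to $8.00)
bins <- hist(yogurt_price, breaks = seq(0.5, 8, by = 0.5), plot = FALSE)

# list each bin as (lower, upper] together with its count
bin_table <- data.frame(
  lower = bins$breaks[-length(bins$breaks)],
  upper = bins$breaks[-1],
  count = bins$counts
)
bin_table
```

Each bin includes its upper edge, so a $1.00 yogurt is counted in the $0.50 to $1.00 bin.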

Eric: See? There are 12 (between $0.50 and $1.00) and 3 (between $1.00 and $1.50), a total of 15 under $1.50. Your yogurt-X is the 16th and there are 37 yogurts in total…
Jie: So yogurt-X is on the cheap side.
Eric: Wait a second, these are over $3.00 (pointing to the middle part of the histogram)? I have never seen yogurt that expensive. What are they?
Jie: Oh, oops… (checking yogurt collection table), they are family-sized products. My bad.
Eric: Yeah, that’s not even a fair comparison.
Jie: Right, we need to look at products of the same size.
Eric: Also, perhaps the same yogurt type, flavor, organic or not, etc.
Jie: Exactly.
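Eric's counting in the conversation above can also be double-checked with a couple of comparisons. This is a sketch I added afterward; yogurt_x, cheaper, and share_cheaper are made-up names, and the vector still mixes single-serve and family-sized products because the sizes are not stored in it.

```r
yogurt_price <- c(1.13, 2.00, 1.69, 1.79, 2.09, 1.00, 1.00, 0.60, 1.00, 1.11,
                  1.79, 3.19, 1.79, 1.99, 3.69, 2.79, 0.60, 1.79, 1.99, 4.09,
                  4.49, 4.49, 0.89, 0.89, 1.99, 2.09, 2.09, 2.09, 2.09, 0.69,
                  1.59, 0.69, 0.69, 0.69, 1.00, 1.19, 7.69)

yogurt_x <- 1.59   # price of yogurt-X from the conversation

# number of products that cost less than yogurt-X
cheaper <- sum(yogurt_price < yogurt_x)

# position of yogurt-X when prices are ranked from cheapest to most expensive
rank_x <- cheaper + 1

# share of the collection that is cheaper than yogurt-X
share_cheaper <- cheaper / length(yogurt_price)

cheaper
rank_x
round(share_cheaper, 2)
```

Fifteen products are cheaper, so yogurt-X ranks 16th of 37, which agrees with what Eric counted from the histogram.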

Obviously, our inquiry did not start out to be very scientific. But that’s okay. It is the process of seeking knowledge. We explore just because we are curious. We argue just because it is stimulating and fun.

And there is nothing exclusive about R. You don’t need a degree or a job to use it. As long as you have some questions and data that may answer those questions, R is amazingly empowering.

I plan to write on this Everyday Inquiry with R theme for K-12 teachers and students. If you have suggestions, please leave a comment. In the meantime, I encourage you to try R and discover its power on your own.

Note: I’m excited that this blog post will also be published on a site that aggregates content contributed by bloggers who write about R.

This material is based on work supported by the National Science Foundation under Grant No. 1742083. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.