# Everyday Inquiry with R: Is Yogurt-X Expensive?

To kick off this Everyday Inquiry with R series, I’d like to recount a conversation between my friend Eric and me about one of Americans’ favorite foods, yogurt.

R is a free programming language for statistical computing and graphics, which we’re using in our new National Science Foundation-funded CodeR4MATH project to research the development of students’ computational thinking and mathematical modeling competencies.

The other day I showed Eric my yogurt collection. He was amazed that I had tried so many different brands and flavors.

Eric: Which one is your favorite?
Jie: Currently yogurt-X (a pseudonym of my favorite brand).
Eric: How much is it?
Jie: \$1.59
Eric: That’s expensive.
Jie: No, it’s not.
Eric: It is. Take a look at the prices of all the products you collected.

I had previously stored the prices in a vector called ‘yogurt_price’. Below is the simple R code to do that.

```# create a vector 'yogurt_price' consisting of yogurt prices
yogurt_price = c(1.13, 2.00, 1.69, 1.79, 2.09, 1.00, 1.00, 0.60, 1.00, 1.11, 1.79, 3.19, 1.79, 1.99, 3.69, 2.79, 0.60, 1.79, 1.99, 4.09, 4.49, 4.49, 0.89, 0.89, 1.99, 2.09, 2.09, 2.09, 2.09, 0.69, 1.59, 0.69, 0.69, 0.69, 1.00, 1.19, 7.69)
```

Jie: Here they are (typing yogurt_price in R console to view the data).

```# view a vector
yogurt_price

[1] 1.13 2.00 1.69 1.79 2.09 1.00 1.00 0.60 1.00 1.11 1.79 3.19 1.79 1.99

[15] 3.69 2.79 0.60 1.79 1.99 4.09 4.49 4.49 0.89 0.89 1.99 2.09 2.09 2.09

[29] 2.09 0.69 1.59 0.69 0.69 0.69 1.00 1.19 7.69

```

Eric: Nice. How many products did you collect?
Jie: There are…(calling the length() function)

```# count the number of elements in a vector
length(yogurt_price)

[1] 37

```

Jie: 37.
Eric: Oh, that’s a lot. Hmmm, which one is the most expensive (trying to eyeball the greatest number)?
Jie: Well, let me show you…(calling the sort() function)

```# sort the elements in a vector
sort(yogurt_price)

[1] 0.60 0.60 0.69 0.69 0.69 0.69 0.89 0.89 1.00 1.00 1.00 1.00 1.11 1.13

[15] 1.19 1.59 1.69 1.79 1.79 1.79 1.79 1.99 1.99 1.99 2.00 2.09 2.09 2.09

[29] 2.09 2.09 2.79 3.19 3.69 4.09 4.49 4.49 7.69

```

Eric: Wow, \$7.69? And the least expensive is only \$0.60. What’s the normal price then?
Jie: Normal? Well, there are a number of \$0.69s, \$1.00s, \$1.79s, and \$2.09s. Let me show a frequency count (calling the table() function).

```# generate a table of counts for each element in a vector
table(yogurt_price)

yogurt_price

0.6 0.69 0.89   1 1.11 1.13 1.19 1.59 1.69 1.79 1.99   2 2.09 2.79 3.19

2   4   2   4   1   1   1   1   1   4   3   1   5   1   1

3.69 4.09 4.49 7.69

1   1   2   1

```

Jie: There are 5 yogurts priced at \$2.09. Is that normal?
Eric: Hmmm…there are four \$0.69, four \$1.00, and four \$1.79. \$2.09 seems to be on the expensive end.
Jie: Let’s plot the data points on a number line and see where most prices fall (calling the stripchart() function).

```# draw a strip chart for a vector
stripchart(yogurt_price)
```

Eric: What is this?
Jie: The x axis is price. Each little square stands for a data point. Some of them are overlapping because the default method is ‘overplot’. Let me make a few changes. We’ll use the ‘stack’ method to stack up data points of the same value. Also, let’s use solid dots instead of hollow squares and set some distance between the points.

```stripchart(
yogurt_price,
method = "stack",   # stack up data points of the same value
pch = 16,           # use solid round label for the points
offset = 0.5,       # set the distance between points at 0.5
at = 0              # set the location of points to be near the x axis
)
```

Eric: Beautiful! Looks like there are one, two, three…(counting dots between 0 and 1) 12 products priced at \$1.00 or below. And there are…
Jie: You want to see how many products fall in each dollar bracket? Let’s pull out the histogram (calling the hist() function).

```# draw a histogram for a vector
hist(yogurt_price)
```

Eric: That saves me a lot of time counting. There are 12 products between \$1.00 and \$2.00. Your yogurt-X is \$1.59. I am sure there are a lot below \$1.50. Can you narrow the bins so I can see how many cost less than \$1.50?
Jie: Sure! (specifying the breaks argument for the hist() function)

```hist(
yogurt_price,
breaks = 16   # set the number of breaks or bins at 16
)
```

Eric: See? There are 12 (between \$0.50 and \$1.00) and 3 (between \$1.00 and \$1.50), a total of 15 under \$1.50. Your yogurt-X is the 16th and there are 37 yogurts in total…
Jie: So yogurt-X is on the cheap side.
Eric: Wait a second, these are over \$3.00 (pointing to the middle part of the histogram)? I have never seen yogurt that expensive. What are they?
Jie: Oh, oops… (checking yogurt collection table), they are family-sized products. My bad.
Eric: Yeah, that’s not even a fair comparison.
Jie: Right, we need to look at products of the same size.
Eric: Also, perhaps the same yogurt type, flavor, organic or not, etc.
Jie: Exactly.

Obviously, our inquiry did not start out to be very scientific. But that’s okay. It is the process of seeking knowledge. We explore just because we are curious. We argue just because it is stimulating and fun.

And there is nothing exclusive about R. You don’t need a degree or a job to use it. As long as you have some questions and data that may answer those questions, R is amazingly empowering.

I plan to write on this Everyday Inquiry with R theme for K-12 teachers and students. If you have suggestions, please leave a comment. In the meantime, I encourage you to try R and discover its power on your own.

Note: I’m excited that this blog post will also be published on R-Bloggers.com, which aggregates content contributed by bloggers who write about R. Learn more about all things R.

This material is based on work supported by the National Science Foundation under Grant No. 1742083. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

# Computational Thinking in Biology: What is an InSPECT Dataflow Diagram?

Integrating computational thinking into core science content and practices is a major goal of our InSPECT project, which is developing hands-on high school biology investigations using simple electronic sensors with Internet of Things (IoT) connectivity—a far cry from the simple germination experiments students usually encounter.

An article in the Fall 2017 Concord Consortium newsletter (“Science Thinking for Tomorrow Today”) describes the overall InSPECT project. Let’s take a closer look at a unique and powerful component of the project: virtual programming using a dataflow diagram.

The Dataflow interface enables students to do virtual programming within the browser-based interface.

Dataflow diagrams have been around since the 1970s. They’re a visual model of the “flow” of data through a system. InSPECT has created a diagramming environment called Dataflow, the first version of which was developed in partnership with Peter Sand at Manylabs, that is much more than boxes and arrows on a page; components come alive when wirelessly synced to sensors, whose numerical data are displayed on the screen in real time as it’s recorded.

An eco-column activity includes sensors plugged into a Raspberry Pi computer. Dataflow automates data collection and initiates actions, such as raising or lowering temperature.

InSPECT is piloting an eco-column activity that includes electronic sensors plugged into a low-cost Raspberry Pi computer the size of a credit card. The sensors collect data about conditions inside the eco-column chambers, such as humidity and oxygen levels, sending it wirelessly over the Internet to Dataflow, which can automate data collection 24 hours a day, uninterrupted and unattended.

With the addition of actuators, Dataflow also can be programmed to create actions based on the sensor data. Students control the real-world actuators by defining variables and setting up conditionals that are entered directly into Dataflow’s simple, visual interface, which can run in any browser on a Mac or PC platform. For example, students can program a Dataflow diagram to adjust the current to a Peltier cooler in order to tweak the temperature of an eco-chamber when the sensor reading becomes too low or too high.

Finally, students can visualize, analyze, and interpret their data by exporting it to one of Concord Consortium’s most popular tools, CODAP (Common Online Data Analysis Platform), a free web-based environment for data analysis. Dataflow can be embedded directly into CODAP.

The use of sensor technology is still new to many biology classes, but our early research is based on the idea that when students collect data in real time, they make a powerful connection to the concepts they are studying. Using Dataflow, students not only learn to design the experiment and control the variables, they come to understand dynamic ecological systems.

The InSPECT project is currently recruiting biology teachers for our fall 2018 and spring 2019 studies. Our goal is to explore how to support integrated science practices and computational thinking in biology. We have several labs to select from involving photosynthesis, respiration, seed germination, and/or plant growth in chambers that can be from three classes to four weeks long. If you’re interested, contact us at inspect@concord.org.