# Optimizing short-range and long-range atomic interactions

[Editor’s note: Piotr Janik (janikpiotrek@gmail.com) was a Google Summer of Code 2012 student at the Concord Consortium and is now a consultant working on our Next-Generation Molecular Workbench.]

Some time ago we described the core engine used in Molecular Workbench and our attempts to speed it up. At that time we focused mainly on the low-level optimization connected with reducing the number of necessary multiplications. This promising early work encouraged us to think even more about performance.

We next reviewed existing algorithms in the core of the molecular dynamics engine. To make a long story short, atoms interact with each other using two kinds of forces:

• Lennard-Jones forces (repulsion and short-range attraction)
• Coulomb forces (electrostatic and long-range attraction)

Atomic interactions are pairwise, meaning that we have to calculate forces between each pair of atoms while using the basic, naive algorithm. Having n atoms, we must perform about n^2/2 calculations. “The Big O” notation can be used and the computational complexity can be described as O(N^2), which means that the execution time of calculations grows very fast as the number of atoms used in the simulation increases. This is definitely an unwanted effect, but fortunately there are ways to reduce the complexity.

Solutions are different for short-range and long-range forces, so let’s start with short-range. “Short-range” means that atoms interact only while they are quite close to each other. Let’s use rCut as a symbol for the interaction maximum distance. So, one obvious optimization would be to limit calculations to pairs of atoms that are closer to each other than rCut. How? There are two popular approaches—cell lists and Verlet (neighbor) list algorithms.

The cell lists algorithm is based on the concept that we can divide the simulation area into smaller boxes or cells. Each cell dimension is equal to the maximum range of interaction between atoms—rCut. So, while calculating interactions for a given atom, it’s enough to take into account only atoms from the same box and its closest neighbors. Atoms in other boxes are too far to interact with this atom. This is both simple and effective, reducing computational complexity to O(N)! Note that it’s C * O(N) with a pretty significant C, unfortunately.

However, while calculating interactions between atoms in neighboring cells, still only 16% of atoms that we take into account are interacting! This is a waste of resources and where we find room for further optimizations. So, what about creating a list for each atom, which contains only atoms actually interacting with it? This Verlet or neighbor list algorithm as it’s called works well. The only problem is that we have to be smart about updating these lists, as atoms constantly change their position and, thus, their “neighborhood.” We can slightly extend these lists to also include some atoms outside the area of interaction. So each list should include atoms closer than rCut + d from the given atom, where d defines a buffer area size. Because of that, lists need to be updated only when the maximum displacement of some atom, measured since the moment of the previous lists update, is bigger than d. If it’s smaller, neighbor lists are still valid. Lists can be updated using the normal, naive algorithm (which still leaves the complexity O(N^2)), or even better, using the cell lists algorithm presented above! This ensures complexity O(N) and greatly reduces inefficiencies of the cell lists approach.

We’re also working on long-range forces optimization. Since we can no longer use the assumption that atoms interact only when they are close to each other, we can’t rely on the optimization strategies above. The algorithms are now more complicated. The problem of the electrostatic interaction is akin to a problem of gravitational interactions (called N-body problem), popular in astrophysics. One of the most common algorithms for speed-up of such calculations is the Barnes-Hut algorithm. We tried to implement it, but the overhead connected with creating additional data structures was bigger than potential performance gains. The reason is that the number of charged atoms we use in our models is too small to see the advantage of such an approach. As a result, we left our naive algorithms for long-range interactions, which perform better due to their simplicity.

However, we successfully implemented both short-range optimizations in Next-Generation Molecular Workbench and the results are spectacular. The speed-up varies from 20% for really small models (where the number of atoms is less than 50) to 700% for bigger ones (where the number of atoms is about 250). This is the really significant improvement and made complex models usable. As you can see, conceptual, algorithmic optimizations really matter!

We’re still thinking about further optimizations, both low level and algorithmic. Stay tuned as the Next-Generation MW is getting more and more computational power!

# Better than an Apple, a Gift for Teachers

Thanks to everyone who entered our Suggest-a-Model contest. We always enjoy hearing from teachers and love to help with hard-to-teach science concepts. If you haven’t already, please vote for the model you’d most like us to build.

To Vote

2) Look for the poll pinned to the top left of the page’s wall

3) Click on the idea you like most to cast your vote

Our goal is to build a custom computer model to help teach a complex, science, math or engineering concept suggested by real teachers, like YOU! We know all too well the awkwardness of jumping up and down and waving your hands to model the behavior of molecules or dancing around the classroom to model photosynthesis.

We received a lot of great ideas and whittled the list down to three concepts.

One finalist told us that her students “are always making fun of me looking like I am doing a swim stroke in front of the class” when she tries to model convection! She’d love a new set of heat transfer models!

Another finalist is looking for a model of nutrient runoff into coastal waters and how that stimulates harmful algal bloom production. Concerned about the environment? Show your support for this model!

A model of meiosis and genetic recombination (known as crossing over, when exchanges of chromosome portions occurs) also made it to the top three. If you teach biology or know a student who’s taking Bio, this may be the one for you.

Voting ends on November 30th, so please go to our Facebook page and vote now.

After voting is over, we’ll announce the winner and get started on building the model. And once it’s done, it’ll be available for free to everybody. Win-win all around! If you want to know when it’s available, be sure to like us on Facebook, follow us on Twitter and subscribe to our mailing list and RSS feed. We’ll be posting about it through all those channels.

But don’t wait to use our models. Check out our Activity Finder and Classic MW. These free resources contain lots of great examples of the models we already have available for science, math and engineering teachers at all grades. You’re sure to find an activity (or two or three!) that covers other difficult-to-teach concepts. Enjoy!

# Revisiting Educational Parallel Computing

About four years ago, I dreamed about how multicore computing could push educational computing into a high-performance era. It turns out that the progress in multicore computing has been slow. The computer I am using to write this blog post has four physical cores that support eight virtualized cores, but I don't feel it is dramatically faster than my previous one bought more than six years ago. Worse, it feels much slower than my new Thinkpad tablet, which is powered by a recent I7 dual-core processor. The fact that a dual-core Intel CPU beats a quad-core Intel CPU suggests something must be wrong in the multicore business.

Before the real promise of multicore computing arrives in my computers, two other things have changed the landscape of personal parallel computing. The first is general-purpose computing on graphics processing units (GPGPU), which uses hundreds of GPU processors in a graphics card to perform some calculations traditionally done in CPUs. OpenCL and CUDA are currently two frameworks that support developers to write parallel code to leverage the GPU power (or the power of hybrid CPU/GPU).

The second is cloud computing. Public clouds provide access to thousands of processors. IT companies have developed cutting-edge technologies that make modern search engines so fast. Can they be used to accelerate scientific simulations on your tablets? The Magellan Report on Cloud Computing for Science published by the U.S. Department of Energy last year provides some perspectives from the science community. Cloud gaming provides some complementary perspectives from the game industry. Putting these insights together, I think there is an opportunity here for educational technology developers who want to deliver killer animations for digital textbooks or online courses. After all, like games, the competition in the education media market will eventually be driven by the quality of animations. And when it comes to animations, high quality usually means realistic details and fast renderings.

GPGPU and cloud computing represent a departure from multicore computing to many-core computing. Regardless of what the future of computing will be, parallel computing is not optional -- it is inevitable. Educational technology can benefit from this wave of paradigm shift if we take actions now.

# InfraMation Keynote Delivered

Orlando is the center of the thermal imaging universe in November 6-8 when it hosts the largest infrared imaging conference in the world: InfraMation. Invited by FLIR Systems, I gave a Keynote Speech on the educational applications of IR imaging in this morning's Opening Plenary and I felt that it was very well received. The PEPSI joke about how to use an IR camera to produce a PEPSI logo (see the second image in this post) was a hit. Everyone laughed.