The National Aeronautics and Space Administration (NASA) is involved in scientifically significant, exciting and widely known projects like the International Space Station, the Journey to Mars, and the Hubble Space Telescope. The agency also freely shares unique knowledge about climate change and works with institutions around the world to gain new insights into how our planet is changing.

Recently our customer, Dr. Kwo-Sen Kuo, Chief Scientist and CEO at Bayesics, LLC, who works closely with NASA’s Goddard Space Flight Center, came to visit us at Paradigm4. He showed us some of his latest work and shared with us his thoughts on SciDB.

“Our experiments have demonstrated the potential of the SciDB array database system for the Earth Sciences domain,” stated Kuo.

The Use Case

“Long-term, comprehensive observations of Earth’s conditions are necessary for NASA scientists to understand the complex system of systems that contribute to global climatology. However, because approximately two-thirds of Earth consists of oceans where direct and dense measurements are difficult to obtain, remote sensing was a more cost-effective means for obtaining the measurements required to monitor Earth’s current health and to provide data for the prediction of its future.

“Remote sensing problems, however, are usually under-constrained. That is, its problem space is often of a higher dimensionality than that covered by the observations of the instruments. To gain better constraints and to reduce ambiguity, scientists strive to obtain as much simultaneous, co-located and independent information as possible concerning the problem space.

“The complication, however, is that CloudSat and TRMM [the sources of the data] do not fly in formation with similar orbit characteristics. CloudSat is in a sun-synchronous orbit with local overpasses roughly constant in solar time, while TRMM is in an orbit with approximately 35-degree inclination, designed to attain better temporal sampling. As a result, they do not routinely observe the same location of Earth at the same time. Thus, to obtain nearly coincident observations from both the 94-GHz CPR and 13.8-GHz PR, we need to first identify intersections of their ground tracks.”

And that’s where SciDB comes in. 
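
To make the coincidence problem concrete, here is a minimal sketch in plain Python (not SciDB, and not the method used in Dr. Kuo's paper). It assumes each ground track is a list of (time, latitude, longitude) samples and pairs up samples from two tracks that fall within a distance and time window. The function names, the sample layout, and the 50 km and 15 minute thresholds are all hypothetical, chosen only for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical sample layout: (time in seconds, latitude in degrees, longitude in degrees)
Sample = Tuple[float, float, float]

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def find_coincidences(
    track_a: List[Sample],
    track_b: List[Sample],
    max_km: float = 50.0,         # illustrative spatial threshold
    max_seconds: float = 900.0,   # illustrative time threshold (15 minutes)
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where track_a[i] and track_b[j] are close in space and time.

    This is a brute-force comparison of every pair of samples, which is fine
    for a toy example but would not scale to full mission archives.
    """
    pairs = []
    for i, (ta, lat_a, lon_a) in enumerate(track_a):
        for j, (tb, lat_b, lon_b) in enumerate(track_b):
            if abs(ta - tb) > max_seconds:
                continue  # cheap time check first
            if great_circle_km(lat_a, lon_a, lat_b, lon_b) <= max_km:
                pairs.append((i, j))
    return pairs


# Usage with two tiny made-up tracks
track_a = [(0.0, 0.0, 0.0), (60.0, 1.0, 1.0)]
track_b = [(30.0, 0.1, 0.1), (5000.0, 1.0, 1.0)]
print(find_coincidences(track_a, track_b))
```

The all-pairs loop is the part that becomes expensive at scale, since its cost grows with the product of the two track lengths. That is the kind of work a system that indexes data by its space and time dimensions is meant to avoid.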

SciDB or Hadoop/MapReduce?

“Recent technological developments, such as SciDB, which specifically target multidimensional arrays, are providing an attractive alternative to Hadoop/MapReduce (HD/MR) for scientific data analysis. SciDB, a next-generation array-model parallel database system, not only indexes the data it ingests for fast extraction and retrieval, but also provides an attractive, albeit still basic, mathematical/statistical toolbox for data analysis,” concluded Kuo.

Our thanks to Dr. Kuo for his kind permission to use excerpts from his paper.

Read Dr. Kuo’s Full Paper