Paradigm4 White Papers
Accelerating Bioinformatics Research with New Software for Big Data to Knowledge (BD2K)
This white paper describes where SciDB fits in the genomics and translational informatics data analysis pipeline and demonstrates use cases for data mining on genomics, phenotype, and outcomes data. We show how SciDB boosts bioinformatics analytical capabilities in support of research on disease understanding, biomarker discovery, and precision medicine.
SciDB for Bioinformatics AMI: Real-World Use Cases, Sample Queries, Real Data
SciDB is well-matched for managing, accessing, integrating, and analyzing diverse scientific data from ‘omics to outcomes. Our bioinformatics AMI makes it easy to test out SciDB in the cloud via Amazon EC2. This AMI includes data along with example bioassay and variant analysis use cases to let you quickly try out SciDB.
Leaving Data on the Table: Data Scientists Reveal Obstacles to Big Data Analytics
While Big Data enjoys widespread media coverage, not enough attention has been paid to what practitioners think: the data scientists who manage and analyze massive volumes of data. We wanted to know, so Paradigm4 teamed up with Innovation Enterprise to ask over 100 data scientists for their help separating Big Data hype from reality. What we learned is that data scientists face multiple challenges in achieving their company’s analytical aspirations. The upshot is that businesses are leaving data, and money, on the table.
MIT CSAIL Technical Report/GenBase: A Complex Analytics Genomics Benchmark
This paper introduces a new benchmark designed to test database management system (DBMS) performance on a mix of data management tasks (joins, filters, etc.) and complex analytics (regression, singular value decomposition, etc.). As a specific use case, we have chosen genomics data for our benchmark, and have constructed a collection of typical tasks in this area.
Windowed Aggregates in SciDB
SciDB provides two closely related operations for creating intervals of ordered data for aggregation: windows and grids. This Tech Brief discusses useful functions in SciDB for computational finance.
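To make the distinction between windows and grids concrete, here is a minimal plain-Python sketch. It is not SciDB query syntax. The function names `window_mean` and `grid_mean`, the sample prices, and the choice to truncate windows at the array edges are all assumptions made for illustration. A window produces one aggregate per input cell from an overlapping neighborhood, while a grid produces one aggregate per non-overlapping block.

```python
def window_mean(values, before, after):
    """Sliding window: one output per input cell; neighborhoods overlap.

    Assumption: windows are truncated at the array boundaries.
    """
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - before)
        hi = min(n, i + after + 1)
        cells = values[lo:hi]
        out.append(sum(cells) / len(cells))
    return out


def grid_mean(values, block):
    """Grid: one output per non-overlapping block of `block` cells."""
    out = []
    for start in range(0, len(values), block):
        cells = values[start:start + block]
        out.append(sum(cells) / len(cells))
    return out


# Hypothetical ordered series, e.g. prices indexed by time step.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

w = window_mean(prices, before=1, after=1)  # same length as the input
g = grid_mean(prices, block=2)              # input length divided by block size
```

The window result keeps the resolution of the input, which suits moving averages, whereas the grid result downsamples the series, which suits fixed-interval summaries.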