SCI Publications
2011
D. Keyes, V. Taylor, T. Hey, S. Feldman, G. Allen, P. Colella, P. Cummings, F. Darema, J. Dongarra, T. Dunning, M. Ellisman, I. Foster, W. Gropp, C.R. Johnson, C. Kamath, R. Madduri, M. Mascagni, S.G. Parker, P. Raghavan, A. Trefethen, S. Valcourt, A. Patra, F. Choudhury, C. Cooper, P. McCartney, M. Parashar, T. Russell, B. Schneider, J. Schopf, N. Sharp.
Advisory Committee for CyberInfrastructure Task Force on Software for Science and Engineering, Note: NSF Report, 2011.
The Software for Science and Engineering (SSE) Task Force commenced in June 2009 with a charge that consisted of the following three elements:
(1) Identify specific needs and opportunities across the spectrum of scientific software infrastructure: characterize the specific needs and analyze technical gaps and opportunities for NSF to meet those needs through individual and systemic approaches. (2) Design responsive approaches: develop initiatives and programs led (or co-led) by NSF to grow, develop, and sustain the software infrastructure needed to support NSF’s mission of transformative research and innovation leading to scientific leadership and technological competitiveness. (3) Address issues of institutional barriers: anticipate, analyze, and address both institutional and exogenous barriers to NSF’s promotion of such an infrastructure. The SSE Task Force members participated in bi-weekly telecons to address the given charge. The telecons often included additional distinguished members of the scientific community engaged in software issues beyond the task force membership, as well as personnel from federal agencies outside of NSF who manage software programs. It was quickly acknowledged that a number of reports, both loosely and tightly related to SSE, already existed and should be leveraged. By September 2009, the task force had formed three subcommittees focused on the following topics: (1) compute-intensive science, (2) data-intensive science, and (3) software evolution.
S.H. Kim, V. Fonov, J. Piven, J. Gilmore, C. Vachet, G. Gerig, D.L. Collins, M. Styner.
Spatial Intensity Prior Correction for Tissue Segmentation in the Developing Human Brain, In Proceedings of IEEE ISBI 2011, pp. 2049--2052. 2011.
DOI: 10.1109/ISBI.2011.5872815
R.M. Kirby, B. Cockburn, S.J. Sherwin.
To CG or to HDG: A Comparative Study, In Journal of Scientific Computing, Note: published online, 2011.
DOI: 10.1007/s10915-011-9501-7
Hybridization through the border of the elements (hybrid unknowns) combined with a Schur complement procedure (often called static condensation in the context of continuous Galerkin linear elasticity computations) has in various forms been advocated in the mathematical and engineering literature as a means of accomplishing domain decomposition, of obtaining increased accuracy and convergence results, and of algorithm optimization. Recent work on the hybridization of mixed methods, and in particular of the discontinuous Galerkin (DG) method, holds the promise of capitalizing on the three aforementioned properties; in particular, of generating a numerical scheme that is discontinuous in both the primary and flux variables, is locally conservative, and is computationally competitive with traditional continuous Galerkin (CG) approaches. In this paper we present both implementation and optimization strategies for the Hybridizable Discontinuous Galerkin (HDG) method applied to two dimensional elliptic operators. We implement our HDG approach within a spectral/hp element framework so that comparisons can be done between HDG and the traditional CG approach.
We demonstrate that the HDG approach generates a global trace-space system for the unknown that, although larger in rank than the traditional static condensation system in CG, has significantly smaller bandwidth at moderate polynomial orders. We show that if one ignores set-up costs, above approximately fourth-degree polynomial expansions on triangles and quadrilaterals the HDG method can be made as efficient as the CG approach, making it competitive for time-dependent problems even before taking into consideration other properties of DG schemes, such as their superconvergence and their ability to handle hp-adaptivity.
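For readers unfamiliar with the static condensation step mentioned in the abstract, the underlying Schur complement elimination can be sketched generically as follows (illustrative block notation only, not the paper's):

\[
\begin{pmatrix} A_{bb} & A_{bi} \\ A_{ib} & A_{ii} \end{pmatrix}
\begin{pmatrix} u_b \\ u_i \end{pmatrix}
=
\begin{pmatrix} f_b \\ f_i \end{pmatrix}
\quad\Longrightarrow\quad
\bigl(A_{bb} - A_{bi} A_{ii}^{-1} A_{ib}\bigr)\, u_b = f_b - A_{bi} A_{ii}^{-1} f_i ,
\]

where the element-interior unknowns u_i are eliminated locally and recovered element by element once the globally coupled boundary (or, in HDG, trace) unknowns u_b have been solved for; the rank and bandwidth comparison in the abstract refers to this reduced, globally coupled system.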
R.C. Knickmeyer, C. Kang, S. Woolson, K.J. Smith, R.M. Hamer, W. Lin, G. Gerig, M. Styner, J.H. Gilmore.
Twin-Singleton Differences in Neonatal Brain Structure, In Twin Research and Human Genetics, Vol. 14, No. 3, pp. 268--276. 2011.
ISSN: 1832-4274
DOI: 10.1375/twin.14.3.268
A. Knoll, S. Thelen, I. Wald, C.D. Hansen, H. Hagen, M.E. Papka.
Full-Resolution Interactive CPU Volume Rendering with Coherent BVH Traversal, In Proceedings of IEEE Pacific Visualization 2011, pp. 3--10. 2011.
B.H. Kopell, J. Halverson, C.R. Butson, M. Dickinson, J. Bobholz, H. Harsch, C. Rainey, D. Kondziolka, R. Howland, E. Eskandar, K.C. Evans, D.D. Dougherty.
Epidural cortical stimulation of the left dorsolateral prefrontal cortex for refractory major depressive disorder, In Neurosurgery, Vol. 69, No. 5, pp. 1015--1029. November, 2011.
ISSN: 1524-4040
DOI: 10.1227/NEU.0b013e318229cfcd
S. Kumar, V. Vishwanath, P. Carns, B. Summa, G. Scorzelli, V. Pascucci, R. Ross, J. Chen, H. Kolla, R. Grout.
PIDX: Efficient Parallel I/O for Multi-resolution Multi-dimensional Scientific Datasets, In Proceedings of The IEEE International Conference on Cluster Computing, pp. 103--111. September, 2011.
Z. Leng, J.R. Korenberg, B. Roysam, T. Tasdizen.
A rapid 2-D centerline extraction method based on tensor voting, In 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 1000--1003. 2011.
DOI: 10.1109/ISBI.2011.5872570
A. Lex, H. Schulz, M. Streit, C. Partl, D. Schmalstieg.
VisBricks: Multiform Visualization of Large, Inhomogeneous Data, In IEEE Transactions on Visualization and Computer Graphics (InfoVis '11), Vol. 17, No. 12, 2011.
Large volumes of real-world data often exhibit inhomogeneities: vertically in the form of correlated or independent dimensions and horizontally in the form of clustered or scattered data items. In essence, these inhomogeneities form the patterns in the data that researchers are trying to find and understand. Sophisticated statistical methods are available to reveal these patterns; however, the visualization of their outcomes is mostly still performed in a one-view-fits-all manner. In contrast, our novel visualization approach, VisBricks, acknowledges the inhomogeneity of the data and the need for different visualizations that suit the individual characteristics of the different data subsets. The overall visualization of the entire data set is patched together from smaller visualizations: there is one VisBrick for each cluster in each group of interdependent dimensions. Whereas the total impression of all VisBricks together gives a comprehensive high-level overview of the different groups of data, each VisBrick independently shows the details of the group of data it represents. State-of-the-art brushing and visual linking between all VisBricks furthermore allows the comparison of the groupings and the distribution of data items among them. In this paper, we introduce the VisBricks visualization concept, discuss its design rationale and implementation, and demonstrate its usefulness by applying it to a use case from the field of biomedicine.
G. Li, R. Palmer, M. DeLisi, G. Gopalakrishnan, R.M. Kirby.
Formal Specification of MPI 2.0: Case Study in Specifying a Practical Concurrent Programming API, In Science of Computer Programming, Vol. 76, pp. 65--81. 2011.
DOI: 10.1016/j.scico.2010.03.007
We describe the first formal specification of a non-trivial subset of MPI, the dominant communication API in high performance computing. Engineering a formal specification for a non-trivial concurrency API requires the right combination of rigor, executability, and traceability, while also serving as a smooth elaboration of a pre-existing informal specification. It also requires the modularization of reusable specification components to keep the length of the specification in check. Long-lived APIs such as MPI are not usually 'textbook minimalistic' because they support a diverse array of applications, a diverse community of users, and have efficient implementations over decades of computing hardware. We choose the TLA+ notation to write our specifications, and describe how we organized the specification of around 200 of the 300 MPI 2.0 functions. We detail a handful of these functions in this paper, and assess our specification with respect to the aforementioned requirements. We close with a description of possible approaches that may help render the act of writing, understanding, and validating the specifications of concurrency APIs much more productive.
J. Li, J. Li, D. Xiu.
An Efficient Surrogate-based Method for Computing Rare Failure Probability, In Journal of Computational Physics, Vol. 230, No. 24, pp. 8683--8697. 2011.
DOI: 10.1016/j.jcp.2011.08.008
In this paper, we present an efficient numerical method for evaluating rare failure probability. The method is based on a recently developed surrogate-based method from Li and Xiu [J. Li, D. Xiu, Evaluation of failure probability via surrogate models, J. Comput. Phys. 229 (2010) 8966–8980] for failure probability computation. The method by Li and Xiu is of hybrid nature, in the sense that samples of both the surrogate model and the true physical model are used, and its efficiency gain relies on using only very few samples of the true model. Here we extend the capability of the method to rare probability computation by using the idea of importance sampling (IS). In particular, we employ the cross-entropy (CE) method, which is an effective method for determining the biasing distribution in IS. We demonstrate that, by combining with the CE method, a surrogate-based IS algorithm can be constructed and is highly efficient for rare failure probability computation: it requires much less simulation effort than the traditional CE-IS method. In many cases, the new method is capable of capturing failure probabilities as small as 10^-12 ~ 10^-6 with only several hundred samples.
Keywords: Rare events, Failure probability, Importance sampling, Cross-entropy
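As a companion to the abstract above, the following is a minimal, hedged sketch of generic cross-entropy importance sampling for a rare failure probability P[g(x) <= 0], with a cheap surrogate g_hat driving the CE iterations and the expensive model g used only for the final estimate. All names and parameters are illustrative assumptions; this is not the authors' algorithm, which couples the surrogate and true models more carefully.

import numpy as np

def ce_importance_sampling(g, g_hat, dim, n_ce=1000, n_is=500,
                           rho=0.1, max_iter=20, rng=None):
    # Nominal input density assumed standard normal N(0, I); the biasing
    # density is N(mu, diag(sigma^2)), updated from "elite" surrogate samples.
    rng = np.random.default_rng() if rng is None else rng
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(max_iter):
        x = rng.normal(mu, sigma, size=(n_ce, dim))      # sample biasing density
        y = np.array([g_hat(xi) for xi in x])            # cheap surrogate only
        level = max(np.quantile(y, rho), 0.0)            # intermediate threshold
        elite = x[y <= level]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-12
        if level <= 0.0:                                 # reached the failure region
            break
    x = rng.normal(mu, sigma, size=(n_is, dim))          # final IS stage, true model
    y = np.array([g(xi) for xi in x])
    # log likelihood ratio: nominal N(0, I) over biasing N(mu, diag(sigma^2))
    log_w = (-0.5 * (x ** 2).sum(axis=1)
             + 0.5 * (((x - mu) / sigma) ** 2).sum(axis=1)
             + np.log(sigma).sum())
    return np.mean((y <= 0.0) * np.exp(log_w))           # failure probability estimate

As a rough sanity check under these assumptions, a linear limit state such as g(x) = 5.0 - x[0] in a few dimensions should recover approximately the standard normal tail probability at 5 (about 2.9e-7).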
L. Lins, D. Koop, J. Freire, C.T. Silva.
DEFOG: A System for Data-Backed Visual Composition, SCI Technical Report, No. UUSCI-2011-003, SCI Institute, University of Utah, 2011.
W. Liu, S. Awate, J. Anderson, D. Yurgelun-Todd, P.T. Fletcher.
Monte Carlo expectation maximization with hidden Markov models to detect functional networks in resting-state fMRI, In Machine Learning in Medical Imaging, Lecture Notes in Computer Science (LNCS), Vol. 7009/2011, pp. 59--66. 2011.
DOI: 10.1007/978-3-642-24319-6_8
J.P. Luitjens.
The Scalability of Parallel Adaptive Mesh Refinement Within Uintah, Note: Advisor: Martin Berzins, School of Computing, University of Utah, 2011.
Solutions to Partial Differential Equations (PDEs) are often computed by discretizing the domain into a collection of computational elements referred to as a mesh. This solution is an approximation with an error that decreases as the mesh spacing decreases. However, decreasing the mesh spacing also increases the computational requirements. Adaptive mesh refinement (AMR) attempts to reduce the error while limiting the increase in computational requirements by refining the mesh locally in regions of the domain that have large error while maintaining a coarse mesh in other portions of the domain. This approach often provides a solution that is as accurate as that obtained from a much larger fixed-mesh simulation, thus saving on both computational time and memory. Historically, however, these AMR operations have often limited the overall scalability of the application.
Adapting the mesh at runtime necessitates scalable regridding and load balancing algorithms. This dissertation analyzes the performance bottlenecks for a widely used regridding algorithm and presents two new algorithms which exhibit ideal scalability. In addition, a scalable space-filling curve generation algorithm for dynamic load balancing is also presented. The performance of these algorithms is analyzed by determining their theoretical complexity, deriving performance models, and comparing the observed performance to those performance models. The models are then used to predict performance on larger numbers of processors. This analysis demonstrates the necessity of these algorithms at larger numbers of processors. This dissertation also investigates methods to more accurately predict workloads based on measurements taken at runtime. While the methods used are not new, the application of these methods to the load balancing process is. These methods are shown to be highly accurate and able to predict the workload within 3% error. By improving the accuracy of these estimations, the load imbalance of the simulation can be reduced, thereby increasing the overall performance.
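The load-balancing ideas summarized above can be illustrated with a minimal sketch: per-patch costs are forecast from runtime measurements by simple exponential smoothing, and the patches, assumed to be already ordered along a space-filling curve, are split across processors by cumulative cost. This is an assumption-laden illustration, not the dissertation's Uintah implementation; all names are hypothetical.

def forecast_costs(measured, previous_forecast, alpha=0.5):
    # Exponentially smoothed cost estimate per patch from runtime timings.
    return [alpha * m + (1.0 - alpha) * f
            for m, f in zip(measured, previous_forecast)]

def partition_sfc(costs, nprocs):
    # Assign patches (in space-filling-curve order) to processors so that
    # each processor receives roughly total_cost / nprocs worth of work.
    total = float(sum(costs))
    target = total / nprocs
    assignment, running = [], 0.0
    for c in costs:
        proc = min(int(running / target), nprocs - 1)
        assignment.append(proc)
        running += c
    return assignment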
J. Luitjens, M. Berzins.
Scalable parallel regridding algorithms for block-structured adaptive mesh refinement, In Concurrency and Computation: Practice and Experience, Vol. 23, No. 13, pp. 1522--1537. September, 2011.
DOI: 10.1002/cpe.1719
Block-structured adaptive mesh refinement (BSAMR) is widely used within simulation software because it improves the utilization of computing resources by refining the mesh only where necessary. For BSAMR to scale onto existing petascale and eventually exascale computers, all portions of the simulation need to weak scale ideally. Any portions of the simulation that do not will become a bottleneck at larger numbers of cores. The challenge is to design algorithms that will make it possible to avoid these bottlenecks on exascale computers. One step of existing BSAMR algorithms involves determining where to create new patches of refinement. The Berger–Rigoutsos algorithm is commonly used to perform this task. This paper provides a detailed analysis of the performance of two existing parallel implementations of the Berger–Rigoutsos algorithm and develops a new parallel implementation of the Berger–Rigoutsos algorithm and a tiled algorithm that exhibits ideal scalability. The analysis and computational results up to 98,304 cores are used to design performance models, which are then used to predict how these algorithms will perform on 100M cores.
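A minimal sketch of the tiled regridding idea discussed in the abstract, under the assumption that refinement flags are given on a 2-D cell grid and that every fixed-size tile containing at least one flagged cell becomes a patch (the names and the 2-D restriction are illustrative, not the paper's implementation):

import numpy as np

def tiled_regrid(flags, tile_size):
    # flags: 2-D boolean array marking cells that need refinement.
    # Returns patches as ((i0, j0), (i1, j1)) index boxes covering all flags.
    nx, ny = flags.shape
    tx, ty = tile_size
    patches = []
    for i in range(0, nx, tx):
        for j in range(0, ny, ty):
            if flags[i:i + tx, j:j + ty].any():
                patches.append(((i, j), (min(i + tx, nx), min(j + ty, ny))))
    return patches

Because each tile is examined independently, the work parallelizes trivially over tiles, which is consistent with the ideal scalability reported for the tiled algorithm.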
S.A. Maas, B.J. Ellis, D.S. Rawlins, L.T. Edgar, C.R. Henak, J.A. Weiss.
Implementation and Verification of a Nodally-Integrated Tetrahedral Element in FEBio, SCI Technical Report, No. UUSCI-2011-007, SCI Institute, University of Utah, 2011.
Keywords: MRL
R.S. MacLeod, J.J.E. Blauer.
Atrial Fibrillation, In Multimodal Cardiovascular Imaging: Principles and Clinical Applications, Ch. 25, Edited by O. Pahlm and G. Wagner, McGraw Hill, 2011.
ISBN: 0071613463
Atrial fibrillation (AF) is the most common form of cardiac arrhythmia, so a review of the role of imaging in AF is a natural topic to include in this book. Further motivation comes from the fact that the treatment of AF probably includes more different forms of imaging, often merged or combined in a variety of ways, than perhaps any other clinical intervention. A typical clinical electrophysiology lab for the treatment of AF usually contains no fewer than 6, and often more than 8, individual monitors, each rendering some form of image-based information about the patient undergoing therapy. There is naturally great motivation to merge different images and different imaging modalities in the setting of AF, but doing so is also very challenging because of a host of factors related to the small size and extremely thin walls of the atria, the large natural variation in atrial shape, and the fact that fibrillation is occurring, so that atrial shape is changing rapidly and irregularly. Thus, the use of multimodal imaging has recently become a very active and challenging area of image processing and analysis research and development, driven by an enormous clinical need to understand and treat a disease that affects some 5 million Americans alone, a number that is predicted to increase to almost 16 million by 2050.
In this chapter we attempt to provide an overview of the large variety of imaging modalities and their uses in the management and understanding of atrial fibrillation, with special emphasis on the most novel applications of magnetic resonance imaging (MRI) technology. To provide clinical and biomedical motivation, we outline the basics of the disease together with some contemporary hypotheses about its etiology and management. We then briefly describe the imaging modalities in common use in the management and research of AF, focus on the use of MRI for all phases of the management of patients with AF, and indicate some of the major engineering challenges that can motivate further progress.
Keywords: ablation, carma, cvrti, 5P41-RR012553-10
J. Mandel, J.D. Beezley, A. Kochanski, V.Y. Kondratenko, L. Zhang, E. Anderson, J. Daniels II, C.T. Silva, C.R. Johnson.
A wildland fire modeling and visualization environment, In Proceedings of the Ninth Symposium on Fire and Forest Meteorology, Note: published online, 2011.
T. Martin, E. Cohen, R.M. Kirby.
Direct Isosurface Visualization of Hex-Based High-Order Geometry and Attribute Representations, In IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. PP, No. 99, pp. 1--14. 2011.
ISSN: 1077-2626
DOI: 10.1109/TVCG.2011.103
In this paper, we present a novel isosurface visualization technique that guarantees the accurate visualization of isosurfaces with complex attribute data defined on (un-)structured (curvi-)linear hexahedral grids. Isosurfaces of high-order hexahedral-based finite element solutions on both uniform grids (including MRI and CT scans) and more complex geometry represent a domain of interest that can be rendered using our algorithm. Additionally, our technique can be used to directly visualize solutions and attributes in isogeometric analysis, an area based on trivariate high-order NURBS (Non-Uniform Rational B-splines) geometry and attribute representations for the analysis. Furthermore, our technique can be used to visualize isosurfaces of algebraic functions. Our approach combines subdivision and numerical root-finding to form a robust and efficient isosurface visualization algorithm that does not miss surface features, while finding all intersections between a view frustum and desired isosurfaces. This allows the use of view-independent transparency in the rendering process. We demonstrate our technique through a straightforward CPU implementation on both complex-structured and complex-unstructured geometry with high-order simulation solutions, isosurfaces of medical data sets, and isosurfaces of algebraic functions.
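The "subdivision plus numerical root-finding" combination described in the abstract can be illustrated, in heavily simplified 1-D form, by bracketing sign changes of the attribute along a view ray and bisecting each bracket. The paper's actual algorithm operates on high-order hexahedral elements and guarantees that no intersections are missed, which this sketch does not; all names are illustrative.

def ray_isosurface_hits(f, t0, t1, n_intervals=64, tol=1e-8):
    # f(t): attribute sampled along the ray minus the isovalue.
    # Returns ray parameters t where f changes sign (approximate intersections).
    hits = []
    dt = (t1 - t0) / n_intervals
    for k in range(n_intervals):                  # uniform subdivision of the ray span
        a, b = t0 + k * dt, t0 + (k + 1) * dt
        fa, fb = f(a), f(b)
        if fa == 0.0:
            hits.append(a)
            continue
        if fa * fb > 0.0:                         # no sign change detected in this interval
            continue
        while b - a > tol:                        # bisection down to tolerance
            m = 0.5 * (a + b)
            fm = f(m)
            if fa * fm <= 0.0:
                b = m
            else:
                a, fa = m, fm
        hits.append(0.5 * (a + b))
    return hits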
