SCI Publications
2024
B. Orkild, K.M. Arefeen Sultan, E. Kholmovski, E. Kwan, E. Bieging, A. Morris, G. Stoddard, R. S. MacLeod, S. Elhabian, R. Ranjan & E. DiBella.
Image quality assessment and automation in late gadolinium-enhanced MRI of the left atrium in atrial fibrillation patients, In Journal of Interventional Cardiac Electrophysiology, Springer Nature, 2024.
DOI: https://doi.org/10.1007/s10840-024-01971-z
Background
Late gadolinium-enhanced (LGE) MRI has become a widely used technique to non-invasively image the left atrium prior to catheter ablation. However, LGE-MRI images are prone to variable image quality, with quality metrics that do not necessarily correlate to the image’s diagnostic quality. In this study, we aimed to define consistent clinically relevant metrics for image and diagnostic quality in 3D LGE-MRI images of the left atrium, have multiple observers assess LGE-MRI image quality to identify key features that measure quality and intra/inter-observer variabilities, and train and test a CNN to assess image quality automatically.
Methods
We identified four image quality categories that impact fibrosis assessment in LGE-MRI images and trained individuals to score 50 consecutive pre-ablation atrial fibrillation LGE-MRI scans from the University of Utah hospital image database. The trained individuals then scored 146 additional scans, which were used to train a convolutional neural network (CNN) to assess diagnostic quality.
Results
There was excellent agreement among trained observers when scoring LGE-MRI scans, with inter-rater reliability scores ranging from 0.65 to 0.76 for each category. When the quality scores were converted to a binary diagnostic/non-diagnostic rating, the CNN achieved a sensitivity of 0.80 ± 0.06 and a specificity of 0.56 ± 0.10.
Conclusion
The use of a training document with reference examples helped raters achieve excellent agreement in their quality scores. The CNN gave a reasonably accurate classification of diagnostic or non-diagnostic 3D LGE-MRI images of the left atrium, despite the use of a relatively small training set.
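As a point of reference for the binary diagnostic/non-diagnostic evaluation reported above, the sketch below shows how per-scan quality scores can be binarized and sensitivity/specificity computed. It is a minimal illustration only: the score scale, threshold, and arrays are assumptions, not the study's data or code.

```python
import numpy as np

# Hypothetical quality scores on a 1-4 scale (illustrative, not the study's data).
# Scores >= 3 are treated as "diagnostic" (positive class) purely for this example.
true_scores = np.array([4, 3, 2, 4, 1, 3, 2, 4])
pred_scores = np.array([4, 2, 2, 3, 2, 3, 1, 4])

y_true = (true_scores >= 3).astype(int)
y_pred = (pred_scores >= 3).astype(int)

tp = np.sum((y_true == 1) & (y_pred == 1))
tn = np.sum((y_true == 0) & (y_pred == 0))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))

sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```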
T.A.J. Ouermi, J. Li, T. Athawale, C.R. Johnson.
Estimation and Visualization of Isosurface Uncertainty from Linear and High-Order Interpolation Methods, In IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, IEEE, pp. 51--61. 2024.
DOI: 10.1109/UncertaintyVisualization63963.2024.00012
Isosurface visualization is fundamental for exploring and analyzing 3D volumetric data. Marching cubes (MC) algorithms with linear interpolation are commonly used for isosurface extraction and visualization. Although linear interpolation is easy to implement, it has limitations when the underlying data is complex and high-order, which is the case for most real-world data. Linear interpolation can output vertices at the wrong location. Its inability to deal with sharp features and features smaller than grid cells can lead to an incorrect isosurface with holes and broken pieces. Despite these limitations, isosurface visualizations typically do not include insight into the spatial location and the magnitude of these errors. We utilize high-order interpolation methods with MC algorithms and interactive visualization to highlight these uncertainties. Our visualization tool helps identify the regions of high interpolation errors. It also allows users to query local areas for details and compare the differences between isosurfaces from different interpolation methods. In addition, we employ high-order methods to identify and reconstruct possible features that linear methods cannot detect. We showcase how our visualization tool helps explore and understand the extracted isosurface errors through synthetic and real-world data.
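To make the interpolation-error idea concrete, here is a small sketch (not the authors' code) comparing linear and cubic interpolation of a smooth 1D signal against a dense reference; the test function and grid resolution are arbitrary assumptions. The same principle extends to the trilinear interpolation used by marching cubes on 3D grids, where such pointwise error fields can be mapped onto the extracted isosurface.

```python
import numpy as np
from scipy.interpolate import interp1d

# Coarse samples of a smooth signal (illustrative function, not real volumetric data).
x_coarse = np.linspace(0, 2 * np.pi, 17)
y_coarse = np.sin(2 * x_coarse)

# Dense reference for measuring interpolation error.
x_fine = np.linspace(0, 2 * np.pi, 1000)
y_true = np.sin(2 * x_fine)

linear = interp1d(x_coarse, y_coarse, kind="linear")(x_fine)
cubic = interp1d(x_coarse, y_coarse, kind="cubic")(x_fine)

# Pointwise error fields of the kind an uncertainty visualization could highlight.
err_linear = np.abs(linear - y_true)
err_cubic = np.abs(cubic - y_true)
print(f"max linear error: {err_linear.max():.3f}, max cubic error: {err_cubic.max():.3f}")
```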
T.A.J. Ouermi, J. Li, Z. Morrow, B. Waanders, C.R. Johnson.
Glyph-Based Uncertainty Visualization and Analysis of Time-Varying Vector Fields, In IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, IEEE, pp. 73--77. 2024.
DOI: 10.1109/UncertaintyVisualization63963.2024.00014
Uncertainty is inherent to most data, including vector field data, yet it is often omitted in visualizations and representations. Effective uncertainty visualization can enhance the understanding and interpretability of vector field data. For instance, in the context of severe weather events such as hurricanes and wildfires, effective uncertainty visualization can provide crucial insights about fire spread or hurricane behavior and aid in resource management and risk mitigation. Glyphs are commonly used for representing vector uncertainty but are often limited to 2D. In this work, we present a glyph-based technique for accurately representing 3D vector uncertainty and a comprehensive framework for visualization, exploration, and analysis using our new glyphs. We employ hurricane and wildfire examples to demonstrate the efficacy of our glyph design and visualization tool in conveying vector field uncertainty.
A. Panta, X. Huang, N. McCurdy, D. Ellsworth, A. Gooch.
Web-based Visualization and Analytics of Petascale data: Equity as a Tide that Lifts All Boats, In Proceedings of the IEEE Visualization conference, IEEE, 2024.
Scientists generate petabytes of data daily to help uncover environmental trends or behaviors that are hard to predict. For example, understanding climate simulations based on the long-term average of temperature, precipitation, and other environmental variables is essential to predicting and establishing root causes of future undesirable scenarios and assessing possible mitigation strategies. While supercomputer centers provide a powerful infrastructure for generating petabytes of simulation output, accessing and analyzing these datasets interactively remains challenging on multiple fronts. This paper presents an approach to managing, visualizing, and analyzing petabytes of data within a browser on equipment ranging from the top NASA supercomputer to commodity hardware like a laptop. Our novel data fabric abstraction layer allows user-friendly querying of scientific information while hiding the complexities of dealing with file systems or cloud services. We also optimize network utilization while streaming from petascale repositories through state-of-the-art progressive compression algorithms. Based on this abstraction, we provide customizable dashboards that can be accessed from any device with any internet connection, enabling interactive visual analysis of vast amounts of data to a wide range of users - from top scientists with access to leadership-class computing environments to undergraduate students of disadvantaged backgrounds from minority-serving institutions. We focus on NASA’s use of petascale climate datasets as an example of particular societal impact and, therefore, a case where achieving equity in science participation is critical. We validate our approach by improving the ability of climate scientists to visually explore their data via two fully interactive dashboards. We further validate our approach by deploying the dashboards and simplified training materials in the classroom at a minority-serving institution. These dashboards, released in simplified form to the general public, contribute significantly to a broader push to democratize the access and use of climate data.
A. Panta, G. Scorzelli, A. Gooch, V. Pascucci, H. Lee.
Managing Large-scale Atmospheric and Oceanic Climate Data for Efficient Analysis and On-the-fly Interactive Visualization, 2024.
DOI: 10.22541/essoar.173238742.20533901/v1
Managing vast volumes of climate data, often reaching into terabytes and petabytes, presents significant challenges in terms of storage, accessibility, efficient analysis, and on-the-fly interactive visualization. Traditional data handling techniques are increasingly inadequate for the massive atmospheric and oceanic data generated by modern climate research. We tackled these challenges by reorganizing the native data layout to optimize access and processing, implementing advanced visualization algorithms like OpenVisus for real-time interactive exploration, and extracting comprehensive metadata for all available fields to improve data discoverability and usability. Our work utilized extensive datasets, including downscaled projections of various climate variables and high-resolution ocean simulations from NEX GDDP CMIP6 and NASA DYAMOND datasets. By transforming the data into progressive, streaming-capable formats and incorporating ARCO (Analysis Ready, Cloud Optimized) features before moving them to the cloud, we ensured that the data is highly accessible and efficient for analysis, while allowing direct access to data subsets in the cloud. The direct integration of the Python library called Xarray allows efficient and easy access to the data, leveraging the familiarity most climate scientists have with it. This approach, combined with the progressive streaming format, not only enhances the findability, shareability and reusability of the data but also facilitates sophisticated analyses and visualizations from commodity hardware like personal cell phones and computers without the need for large computational resources. By collaborating with climate scientists and domain experts from NASA Jet Propulsion Lab and NASA Ames Research Center, we published more than 2 petabytes of climate data via our interactive dashboards for climate scientists and the general public. Ultimately, our solution fosters quicker decision-making, greater collaboration, and innovation in the global climate science community by breaking down barriers imposed by hardware limitations and geographical constraints and allowing access to sophisticated visualization tools via publicly available dashboards.
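The Xarray-based, lazy cloud-access pattern described above can be sketched roughly as follows; the store URL, variable name, and coordinate ranges are placeholders for illustration, not the published endpoints or data layout.

```python
import xarray as xr

# Hypothetical cloud-hosted, analysis-ready (ARCO/Zarr) climate store; URL is a placeholder.
store = "s3://example-bucket/cmip6-downscaled/tasmax.zarr"

# Open lazily: no data is downloaded until a subset is actually computed.
ds = xr.open_dataset(store, engine="zarr", chunks={})

# Select a region and time range, then pull only that subset from the cloud.
# Coordinate names (time, lat, lon) are assumptions about the dataset's layout.
subset = ds["tasmax"].sel(
    time=slice("2050-01-01", "2050-12-31"),
    lat=slice(35, 45),
    lon=slice(245, 255),
)
monthly_mean = subset.resample(time="1MS").mean().compute()
print(monthly_mean)
```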
M. Parashar.
Enabling Responsible Artificial Intelligence Research and Development Through the Democratization of Advanced Cyberinfrastructure, In Harvard Data Science Review, Special Issue 4: Democratizing Data, 2024.
Artificial intelligence (AI) is driving discovery, innovation, and economic growth, and has the potential to transform science and society. However, realizing the positive, transformative potential of AI requires that AI research and development (R&D) progress responsibly; that is, in a way that protects privacy, civil rights, and civil liberties, and promotes principles of fairness, accountability, transparency, and equity. This article explores the importance of democratizing AI R&D for achieving the goal of responsible AI and its potential impacts.
M. Parashar.
Everywhere & Nowhere: Envisioning a Computing Continuum for Science, Subtitled arXiv:2406.04480v1, 2024.
Emerging data-driven scientific workflows are seeking to leverage distributed data sources to understand end-to-end phenomena, drive experimentation, and facilitate important decision-making. Despite the exponential growth of available digital data sources at the edge, and the ubiquity of non-trivial computational power for processing this data, realizing such science workflows remains challenging. This paper explores a computing continuum that is everywhere and nowhere – one spanning resources at the edges, in the core and in between, and providing abstractions that can be harnessed to support science. It also introduces recent research in programming abstractions that can express what data should be processed and when and where it should be processed, and autonomic middleware services that automate the discovery of resources and the orchestration of computations across these resources.
S. Parsa, B. Wang.
Harmonic Chain Barcode and Stability, Subtitled arXiv:2409.06093, 2024.
The persistence barcode is a topological descriptor of data that plays a fundamental role in topological data analysis. Given a filtration of the space of data, a persistence barcode tracks the evolution of its homological features. In this paper, we introduce a new type of barcode, referred to as the canonical barcode of harmonic chains, or harmonic chain barcode for short, which tracks the evolution of harmonic chains. As our main result, we show that the harmonic chain barcode is stable and it captures both geometric and topological information of data. Moreover, given a filtration of a simplicial complex of size n with m time steps, we can compute its harmonic chain barcode in O(m^2 n^ω + m n^3) time, where n^ω is the matrix multiplication time. Consequently, a harmonic chain barcode can be utilized in applications in which a persistence barcode is applicable, such as feature vectorization and machine learning. Our work provides strong evidence in a growing list of literature that geometric (not just topological) information can be recovered from a persistence filtration.
M. Penwarden, H. Owhadi, R.M. Kirby.
Kolmogorov n-Widths for Multitask Physics-Informed Machine Learning (PIML) Methods: Towards Robust Metrics, Subtitled arXiv preprint arXiv:2402.11126, 2024.
Physics-informed machine learning (PIML) as a means of solving partial differential equations (PDE) has garnered much attention in the Computational Science and Engineering (CS&E) world. This topic encompasses a broad array of methods and models aimed at solving a single or a collection of PDE problems, called multitask learning. PIML is characterized by the incorporation of physical laws into the training process of machine learning models in lieu of large data when solving PDE problems. Despite the overall success of this collection of methods, it remains incredibly difficult to analyze, benchmark, and generally compare one approach to another. Using Kolmogorov n-widths as a measure of effectiveness of approximating functions, we judiciously apply this metric in the comparison of various multitask PIML architectures. We compute lower accuracy bounds and analyze the model's learned basis functions on various PDE problems. This is the first objective metric for comparing multitask PIML architectures and helps remove uncertainty in model validation from selective sampling and overfitting. We also identify avenues of improvement for model architectures, such as the choice of activation function, which can drastically affect model generalization to "worst-case" scenarios, which is not observed when reporting task-specific errors. We also incorporate this metric into the optimization process through regularization, which improves the models' generalizability over the multitask PDE problem.
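For reference, the Kolmogorov n-width invoked as the comparison metric above is the standard quantity below (general mathematical background, not a formula taken from this paper): the worst-case best-approximation error of a function class over all n-dimensional linear subspaces of the ambient normed space.

```latex
d_n(\mathcal{A}; X) \;=\;
\inf_{\substack{X_n \subseteq X \\ \dim X_n = n}}
\; \sup_{f \in \mathcal{A}}
\; \inf_{g \in X_n} \, \| f - g \|_X
```

Here \(\mathcal{A} \subset X\) is the set of target functions (e.g., PDE solutions across a family of tasks) and \(X_n\) ranges over n-dimensional subspaces of \(X\); a small \(d_n\) means the class is well approximated by some n-dimensional linear basis.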
D. Alex Quistberg, S.J. Mooney, T. Tasdizen, P. Arbelaez, Q.C. Nguyen.
Deep Learning Methods to Amplify Epidemiological Data Collection and Analyses, In American Journal of Epidemiology, Oxford University Press, 2024.
Deep learning is a subfield of artificial intelligence and machine learning based mostly on neural networks and often combined with attention algorithms that has been used to detect and identify objects in text, audio, images, and video. Serghiou and Rough (Am J Epidemiol. 0000;000(00):0000-0000) present a primer for epidemiologists on deep learning models. These models provide substantial opportunities for epidemiologists to expand and amplify their research in both data collection and analyses by increasing the geographic reach of studies, including more research subjects, and working with large or high dimensional data. The tools for implementing deep learning methods are not quite yet as straightforward or ubiquitous for epidemiologists as traditional regression methods found in standard statistical software, but there are exciting opportunities for interdisciplinary collaboration with deep learning experts, just as epidemiologists have with statisticians, healthcare providers, urban planners, and other professionals. Despite the novelty of these methods, epidemiological principles of assessing bias, study design, interpretation and others still apply when implementing deep learning methods or assessing the findings of studies that have used them.
S. Saklani, C. Goel, S. Bansal, Z. Wang, S. Dutta, T. Athawale, D. Pugmire, C.R. Johnson.
Uncertainty-Informed Volume Visualization using Implicit Neural Representation, In IEEE Workshop on Uncertainty Visualization: Applications, Techniques, Software, and Decision Frameworks, IEEE, pp. 62--72. 2024.
DOI: 10.1109/UncertaintyVisualization63963.2024.00013
The increasing adoption of Deep Neural Networks (DNNs) has led to their application in many challenging scientific visualization tasks. While advanced DNNs offer impressive generalization capabilities, understanding factors such as model prediction quality, robustness, and uncertainty is crucial. These insights can enable domain scientists to make informed decisions about their data. However, DNNs inherently lack the ability to estimate prediction uncertainty, necessitating new research to construct robust uncertainty-aware visualization techniques tailored for various visualization tasks. In this work, we propose uncertainty-aware implicit neural representations to model scalar field data sets effectively and comprehensively study the efficacy and benefits of estimated uncertainty information for volume visualization tasks. We evaluate the effectiveness of two principled deep uncertainty estimation techniques: (1) Deep Ensemble and (2) Monte Carlo Dropout (MC-Dropout). These techniques enable uncertainty-informed volume visualization in scalar field data sets. Our extensive exploration across multiple data sets demonstrates that uncertainty-aware models produce informative volume visualization results. Moreover, integrating prediction uncertainty enhances the trustworthiness of our DNN model, making it suitable for robustly analyzing and visualizing real-world scientific volumetric data sets.
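A minimal sketch of the MC-Dropout idea applied to a coordinate-based (implicit neural representation) scalar-field model is shown below; the layer sizes, dropout rate, and sample count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ImplicitField(nn.Module):
    """Coordinate -> scalar MLP with dropout, usable for MC-Dropout uncertainty estimation."""
    def __init__(self, hidden=128, p_drop=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):
        return self.net(xyz)

def mc_dropout_predict(model, coords, n_samples=32):
    """Keep dropout active at inference; return predictive mean and std per coordinate."""
    model.train()  # leaves dropout enabled during prediction
    with torch.no_grad():
        samples = torch.stack([model(coords) for _ in range(n_samples)], dim=0)
    return samples.mean(dim=0), samples.std(dim=0)

model = ImplicitField()
coords = torch.rand(1024, 3)              # query positions in the volume
mean, std = mc_dropout_predict(model, coords)
print(mean.shape, std.shape)              # torch.Size([1024, 1]) each
```

The per-coordinate standard deviation produced this way is the kind of uncertainty field that can then be composited alongside the predicted scalar values during volume rendering.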
S.A. Sakin, K.E. Isaacs.
A Literature-based Visualization Task Taxonomy for Gantt Charts, Subtitled arXiv:2408.04050, 2024.
Gantt charts are a widely used idiom for visualizing temporal discrete event sequence data where dependencies exist between events. They are popular in domains such as manufacturing and computing for their intuitive layout of such data. However, these domains frequently generate data at scales which tax both the visual representation and the ability to render it at interactive speeds. To aid visualization developers who use Gantt charts in these situations, we develop a task taxonomy of low-level visualization tasks supported by Gantt charts and connect them to the data queries needed to support them. Our taxonomy is derived through a literature survey of visualizations using Gantt charts over the past 30 years.
C. Scully-Allison, I. Lumsden, K. Williams, J. Bartels, M. Taufer, S. Brink, A. Bhatele, O. Pearce, K. Isaacs.
Design Concerns for Integrated Scripting and Interactive Visualization in Notebook Environments, In IEEE Transactions on Visualization and Computer Graphics, IEEE, 2024.
DOI: 10.1109/TVCG.2024.3354561
Interactive visualization can support fluid exploration but is often limited to predetermined tasks. Scripting can support a vast range of queries but may be more cumbersome for free-form exploration. Embedding interactive visualization in scripting environments, such as computational notebooks, provides an opportunity to leverage the strengths of both direct manipulation and scripting. We investigate interactive visualization design methodology, choices, and strategies under this paradigm through a design study of calling context trees used in performance analysis, a field which exemplifies typical exploratory data analysis workflows with Big Data and hard to define problems. We first produce a formal task analysis assigning tasks to graphical or scripting contexts based on their specificity, frequency, and suitability. We then design a notebook-embedded interactive visualization and validate it with intended users. In a follow-up study, we present participants with multiple graphical and scripting interaction modes to elicit feedback about notebook-embedded visualization design, finding consensus in support of the interaction model. We report and reflect on observations regarding the process and design implications for combining visualization and scripting in notebooks.
M. Shao, A. Singh, S. Johnson, A. Pessin, R. Merrill, A. Page, H. Odeen, S. Joshi, A. Payne.
Design and Evaluation of an Open-Source Block Face Imaging System for 2D Histology to Magnetic Resonance Image Registration, In MethodsX, Vol. 13, Elsevier, pp. 103062. 2024.
ISSN: 2215-0161
DOI: https://doi.org/10.1016/j.mex.2024.103062
This study introduces a comprehensive hardware-software framework designed to enhance the quality of block face image capture—an essential intermediary step for registering 2D histology images to ex vivo magnetic resonance (MR) images. A customized camera mounting and lighting system is employed to maintain consistent relative positioning and lighting conditions. Departing from traditional transparent paraffin, dyed paraffin is utilized to enhance contrast for subsequent automatic segmentation. Our software facilitates fully automated data collection and organization, complemented by a real-time Quality Assurance (QA) section to assess the captured image's quality during the sectioning process. The setup is evaluated and validated using rabbit muscle and rat brain which underwent MR-guided focused ultrasound ablations. The customized hardware system establishes a robust image capturing environment. The software with a real-time QA section, enables operators to promptly rectify low-quality captures, thereby preventing data loss. The execution of our proposed framework produces robust registration results for H&E images to ex vivo MR images.
N. Shingde, T. Blattner, A. Bardakoff, W. Keyrouz, M. Berzins.
An illustration of extending Hedgehog to multi-node GPU architectures using GEMM, In Springer Nature (to appear), 2024.
Asynchronous task-based systems offer the possibility of making it easier to take advantage of scalable heterogeneous architectures. This paper extends the previous work, demonstrating how Hedgehog, a dataflow graph-based model developed at the National Institute of Standards and Technology, can be used to obtain high performance for numerical linear algebraic operations as a starting point for complex algorithms. While the results were promising, it was unclear how to scale them to larger matrices and compute node counts. The aim here is to show how the new, improved algorithm inspired by DPLASMA performs equally well using Hedgehog. The results are compared against the leading library DPLASMA to illustrate the performance of different asynchronous dataflow models. The work demonstrates that using general-purpose, high-level abstractions, such as Hedgehog’s dataflow graphs, makes it possible to achieve similar performance to the specialized linear algebra codes such as DPLASMA.
A. Singh, S. Adams-Tew, S. Johnson, H. Odeen, J. Shea, A. Johnson, L. Day, A. Pessin, A. Payne, S. Joshi.
Treatment Efficacy Prediction of Focused Ultrasound Therapies Using Multi-parametric Magnetic Resonance Imaging, In Cancer Prevention, Detection, and Intervention, Springer Nature Switzerland, pp. 190-199. 2024.
Magnetic resonance guided focused ultrasound (MRgFUS) is one of the most attractive emerging minimally invasive procedures for breast cancer; it induces localized hyperthermia, resulting in tumor cell death. Accurately assessing the post-ablation viability of all treated tumor tissue and surrounding margins for residual tumor immediately after MRgFUS thermal therapy is essential for evaluating treatment efficacy. While both thermal and vascular MRI-derived biomarkers are currently used to assess treatment efficacy, no adequately accurate methods exist for the in vivo determination of tissue viability during treatment. The non-perfused volume (NPV) acquired three or more days following MRgFUS thermal ablation treatment is most correlated with the gold standard of histology; however, its delayed timing impedes real-time guidance for the treating clinician during the procedure. We present a robust deep-learning framework that leverages multiparametric MR imaging acquired during treatment to predict treatment efficacy. The network uses qualitative T1- and T2-weighted images and metrics derived from MR temperature images to predict the three-day post-ablation NPV. To validate the proposed approach, an ablation study was conducted on a dataset (N=6) of VX2 tumor model rabbits that had undergone MRgFUS ablation. Using this deep-learning framework, we evaluated which of the acquired MRI inputs were most predictive of treatment efficacy compared to the expert-annotated three-day post-treatment images.
S. Subramaniam, M. Akay, M. A. Anastasio, V. Bailey, D. Boas, P. Bonato, A. Chilkoti, J. R. Cochran, V. Colvin, T. A. Desai, J. S. Duncan, F. H. Epstein, S. Fraley, C. Giachelli, K. J. Grande-Allen, J. Green, X. E. Guo, I. B. Hilton, J. D. Humphrey, C. R. Johnson, G. Karniadakis, M. R. King, R. F. Kirsch, S. Kumar, C. T. Laurencin, S. Li, R. L. Lieber, N. Lovell, P. Mali, S. S. Margulies, D. F. Meaney, B. Ogle, B. Palsson, N. A. Peppas, E. J. Perreault, R. Rabbitt, L. A. Setton, L. D. Shea, S. G. Shroff, K. Shung, A. S. Tolias, M. C. H. van der Meulen, S. Varghese, G. Vunjak-Novakovic, J. A. White, R. Winslow, J. Zhang, K. Zhang, C. Zukoski, M. I. Miller.
Grand Challenges at the Interface of Engineering and Medicine, In IEEE Open Journal of Engineering in Medicine and Biology, Vol. 5, IEEE, pp. 1--13. Feb, 2024.
ISSN: 2644-1276
DOI: 10.1109/ojemb.2024.3351717
Over the past two decades Biomedical Engineering has emerged as a major discipline that bridges societal needs of human health care with the development of novel technologies. Every medical institution is now equipped at varying degrees of sophistication with the ability to monitor human health in both non-invasive and invasive modes. The multiple scales at which human physiology can be interrogated provide a profound perspective on health and disease. We are at the nexus of creating "avatars" (herein defined as an extension of "digital twins") of human patho/physiology to serve as paradigms for interrogation and potential intervention. Motivated by the emergence of these new capabilities, the IEEE Engineering in Medicine and Biology Society, the Departments of Biomedical Engineering at Johns Hopkins University and Bioengineering at University of California at San Diego sponsored an interdisciplinary workshop to define the grand challenges that face biomedical engineering and the mechanisms to address these challenges. The Workshop identified five grand challenges with cross-cutting themes and provided a roadmap for new technologies, identified new training needs, and defined the types of interdisciplinary teams needed for addressing these challenges. The themes presented in this paper include: 1) accumedicine through creation of avatars of cells, tissues, organs and whole human; 2) development of smart and responsive devices for human function augmentation; 3) exocortical technologies to understand brain function and treat neuropathologies; 4) the development of approaches to harness the human immune system for health and wellness; and 5) new strategies to engineer genomes and cells.
K.M. Sultan, M.H.H. Hisham, B. Orkild, A. Morris, E. Kholmovski, E. Bieging, E. Kwan, R. Ranjan, E. DiBella, S. Elhabian.
HAMIL-QA: Hierarchical Approach to Multiple Instance Learning for Atrial LGE MRI Quality Assessment, Subtitled arXiv:2407.07254v1, 2024.
The accurate evaluation of left atrial fibrosis via high-quality 3D Late Gadolinium Enhancement (LGE) MRI is crucial for atrial fibrillation management but is hindered by factors like patient movement and imaging variability. The pursuit of automated LGE MRI quality assessment is critical for enhancing diagnostic accuracy, standardizing evaluations, and improving patient outcomes. The deep learning models aimed at automating this process face significant challenges due to the scarcity of expert annotations, high computational costs, and the need to capture subtle diagnostic details in highly variable images. This study introduces HAMIL-QA, a multiple instance learning (MIL) framework, designed to overcome these obstacles. HAMIL-QA employs a hierarchical bag and sub-bag structure that allows for targeted analysis within sub-bags and aggregates insights at the volume level. This hierarchical MIL approach reduces reliance on extensive annotations, lessens computational load, and ensures clinically relevant quality predictions by focusing on diagnostically critical image features. Our experiments show that HAMIL-QA surpasses existing MIL methods and traditional supervised approaches in accuracy, AUROC, and F1-Score on an LGE MRI scan dataset, demonstrating its potential as a scalable solution for LGE MRI quality assessment automation.
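For readers unfamiliar with multiple instance learning, the sketch below shows a generic attention-based MIL pooling step in which per-slice (instance) features are aggregated into a single scan-level quality prediction. It is a simplified, standard MIL building block offered for orientation only; the feature dimensions are placeholders, and it does not reproduce HAMIL-QA's hierarchical bag/sub-bag architecture.

```python
import torch
import torch.nn as nn

class AttentionMILPool(nn.Module):
    """Generic attention-based MIL pooling: instance features -> bag-level prediction."""
    def __init__(self, feat_dim=256, attn_dim=64):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Linear(feat_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1)
        )
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, instance_feats):            # (n_instances, feat_dim)
        scores = self.attn(instance_feats)         # (n_instances, 1)
        weights = torch.softmax(scores, dim=0)     # attention over instances in the bag
        bag_feat = (weights * instance_feats).sum(dim=0)  # weighted bag embedding
        return self.classifier(bag_feat), weights

pool = AttentionMILPool()
slices = torch.randn(40, 256)     # hypothetical per-slice features from a 3D LGE-MRI scan
logit, attn = pool(slices)
print(logit.shape, attn.shape)    # torch.Size([1]) torch.Size([40, 1])
```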
X. Tang, B. Zhang, B.S. Knudsen, T. Tasdizen.
DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention, Subtitled arXiv:2407.13920, 2024.
We here propose a novel hierarchical transformer model that adeptly integrates the feature extraction capabilities of Convolutional Neural Networks (CNNs) with the advanced representational potential of Vision Transformers (ViTs). Addressing the lack of inductive biases and dependence on extensive training datasets in ViTs, our model employs a CNN backbone to generate hierarchical visual representations. These representations are then adapted for transformer input through an innovative patch tokenization. We also introduce a ’scale attention’ mechanism that captures cross-scale dependencies, complementing patch attention to enhance spatial understanding and preserve global perception. Our approach significantly outperforms baseline models on small and medium-sized medical datasets, demonstrating its efficiency and generalizability. The components are designed as plug-and-play for different CNN architectures and can be adapted for multiple applications.
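The CNN-to-transformer handoff described above can be sketched minimally as follows: hierarchical CNN feature maps are projected into token sequences and fed to self-attention. The backbone dimensions and the plain encoder layer used here are illustrative assumptions and do not implement the paper's scale-attention mechanism.

```python
import torch
import torch.nn as nn

class CNNToTokens(nn.Module):
    """Project a CNN feature map (B, C, H, W) into a transformer token sequence (B, H*W, D)."""
    def __init__(self, in_channels=512, embed_dim=256):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=1)  # 1x1 conv as patch tokenizer
        self.encoder = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=8, batch_first=True)

    def forward(self, feat_map):
        x = self.proj(feat_map)                # (B, D, H, W)
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, D)
        return self.encoder(tokens)            # standard self-attention over CNN-derived tokens

feat = torch.randn(2, 512, 14, 14)   # hypothetical output of a CNN backbone stage
out = CNNToTokens()(feat)
print(out.shape)                      # torch.Size([2, 196, 256])
```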
