Benchmark

Pig Matrix provides a matched multi-omics framework that can support benchmark-style evaluation of emerging AI models in regulatory genomics. Because genomic, transcriptomic, epigenomic, and 3D genome layers were generated from the same biological samples, the database preserves coherent cross-modal relationships across tissues, developmental stages, and cell lines. This design makes Pig Matrix particularly useful for benchmarking missing-modality prediction, epigenomic imputation, regulatory readout prediction, inference of 3D chromatin features, and cross-species generalization beyond human- and mouse-centered settings.
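As a rough illustration of what a missing-modality benchmark could look like on matched data, the sketch below withholds one layer, compares a model's predictions for it against the observed signal from the same sample, and reports a per-sample Pearson correlation. It is a minimal sketch and not the Pig Matrix evaluation protocol: the sample identifiers, the "H3K27ac" layer name, the binned-signal layout, and the function names are all hypothetical, and Pearson correlation is only one of several metrics a benchmark might use.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length signal vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 bins")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant track
    return cov / sqrt(var_x * var_y)


def score_held_out_modality(observed, predicted, modality):
    """Score predictions for one withheld modality, sample by sample.

    Both arguments use the (assumed) layout
    {sample_id: {modality: [signal per genomic bin]}}.
    Only samples that have the modality in both dicts are scored.
    """
    scores = {}
    for sample_id, layers in observed.items():
        predicted_layers = predicted.get(sample_id, {})
        if modality in layers and modality in predicted_layers:
            scores[sample_id] = pearson(layers[modality], predicted_layers[modality])
    return scores


# Toy values only; sample and layer names are hypothetical placeholders.
observed = {
    "liver_adult": {"H3K27ac": [0.0, 1.0, 4.0, 2.0]},
    "muscle_fetal": {"H3K27ac": [3.0, 0.5, 0.0, 1.0]},
}
predicted = {
    "liver_adult": {"H3K27ac": [0.0, 2.0, 8.0, 4.0]},   # same shape, rescaled
    "muscle_fetal": {"H3K27ac": [1.0, 0.5, 3.0, 0.0]},  # poorly matched shape
}

scores = score_held_out_modality(observed, predicted, "H3K27ac")
for sample_id, r in scores.items():
    print(f"{sample_id}: Pearson r = {r:.3f}")
```

Because the observed and predicted tracks come from the same sample, a score reflects model error rather than sample-to-sample variation, which is the practical benefit of the matched design described above.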
Representative AI-oriented application scenarios enabled by Pig Matrix
Application scenario | Why Pig Matrix fits | AI capability that can be assessed | Representative AI tools | Reference
References
  1. Avsec Z et al. Advancing regulatory variant effect prediction with AlphaGenome. Nature. 2026.
  2. Murphy AE et al. Predicting cell type-specific epigenomic profiles accounting for distal genetic effects. Nature Communications. 2024.
  3. Zhang Z et al. Developing a general AI model for integrating diverse genomic modalities and comprehensive genomic knowledge. Nucleic Acids Research. 2025.
  4. Hawkins-Hooker A et al. Getting personal with epigenetics: towards individual-specific epigenomic imputation with machine learning. Nature Communications. 2023.
  5. Ernst J, Kellis M. Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nature Biotechnology. 2015.
  6. Avsec Z et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods. 2021.
  7. Linder J et al. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nature Genetics. 2025.
  8. Fu X et al. A foundation model of transcription across human cell types. Nature. 2025.
  9. Chen KM et al. A sequence-based global map of regulatory activity for deciphering human genetics. Nature Genetics. 2022.
  10. Yang R et al. Epiphany: predicting Hi-C contact maps from 1D epigenomic signals. Genome Biology. 2023.
  11. Fudenberg G et al. Predicting 3D genome folding from DNA sequence with Akita. Nature Methods. 2020.
  12. Schwessinger R et al. deepC: predicting 3D genome folding using megabase-scale transfer learning. Nature Methods. 2020.
  13. Dalla-Torre H et al. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nature Methods. 2025.
  14. Zhou Z et al. DNABERT-2: efficient foundation model and benchmark for multi-species genome. ICLR. 2024.
  15. Nguyen E et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. NeurIPS. 2023.
  16. Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nature Methods. 2015.
  17. Wang Y et al. Quantifying the regulatory potential of genetic variants via a hybrid sequence-oriented model with SVEN. Nature Communications. 2024.