The Hofmann Lab

March 23, 2019, Filed Under: 2019

SArKS: de novo discovery of gene expression regulatory motif sites and domains by suffix array kernel smoothing

Citation:

Wylie DC, Hofmann HA, Zemelman BV. SArKS: de novo discovery of gene expression regulatory motif sites and domains by suffix array kernel smoothing. Bioinformatics [Internet]. btz198.

Publisher’s Version

Abstract

Motivation: We set out to develop an algorithm that can mine differential gene expression data to identify candidate cell type-specific DNA regulatory sequences. Differential expression is usually quantified as a continuous score (fold-change, test-statistic, P-value) comparing biological classes. Unlike existing approaches, our de novo strategy, termed SArKS, applies non-parametric kernel smoothing to uncover promoter motif sites that correlate with elevated differential expression scores. SArKS detects motif k-mers by smoothing sequence scores over sequence similarity. A second round of smoothing over spatial proximity reveals multi-motif domains (MMDs). Discovered motif sites can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.

Results: We applied SArKS to published gene expression data representing distinct neocortical neuron classes in Mus musculus and interneuron developmental states in Homo sapiens. When benchmarked against several existing algorithms using a cross-validation procedure, SArKS identified larger motif sets that formed the basis for regression models with higher correlative power.

Availability and implementation: https://github.com/denniscwylie/sarks.

Contact: denniswylie@austin.utexas.edu or zemelmanb@mail.clm.utexas.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
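The idea of smoothing sequence scores over sequence similarity, with permutation testing to set a threshold, can be illustrated with a toy sketch. This is not the sarks package or its API. The function names, the naive suffix array construction, the uniform kernel, the `half_window` parameter, and the max-over-permutations threshold are all assumptions chosen for illustration. Each position in the concatenated sequences inherits the score of its source sequence, and positions whose suffixes sort next to each other (and so share a prefix) are averaged together.

```python
import random

def suffix_array(text):
    # Naive construction by sorting suffixes; adequate for a toy example only.
    return sorted(range(len(text)), key=lambda i: text[i:])

def smooth_over_suffixes(seqs, scores, half_window):
    """Average per-sequence scores over neighbours in suffix-array order.

    Returns the concatenated text and one smoothed score per text position.
    """
    text, owner = "", []
    for idx, seq in enumerate(seqs):
        text += seq + "$"                    # "$" marks the end of each sequence
        owner += [idx] * (len(seq) + 1)      # which sequence each position came from
    sa = suffix_array(text)
    ranked = [scores[owner[pos]] for pos in sa]   # scores in suffix-sorted order
    n = len(sa)
    smoothed = [0.0] * n
    for rank in range(n):
        lo = max(0, rank - half_window)
        hi = min(n, rank + half_window + 1)
        # Uniform kernel (assumption): plain mean over the window of neighbours.
        smoothed[sa[rank]] = sum(ranked[lo:hi]) / (hi - lo)
    return text, smoothed

def permutation_threshold(seqs, scores, half_window, n_perm=50, seed=0):
    """Largest smoothed score seen after shuffling scores across sequences."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_perm):
        shuffled = list(scores)
        rng.shuffle(shuffled)
        text, sm = smooth_over_suffixes(seqs, shuffled, half_window)
        # Ignore terminator positions when taking the maximum.
        best = max(best, max(s for s, ch in zip(sm, text) if ch != "$"))
    return best

# Hypothetical toy data: two high-scoring sequences share the k-mer "GCA".
seqs = ["ACGTGCA", "TTGCATT", "AAAAAAA", "CCCCCCC"]
scores = [2.0, 2.0, 0.0, 0.0]
text, smoothed = smooth_over_suffixes(seqs, scores, half_window=1)
threshold = permutation_threshold(seqs, scores, half_window=1)
```

Positions whose smoothed score exceeds `threshold` would be candidate motif sites in this simplified picture. With only four sequences the permutation null is far too coarse to be meaningful, so the example shows the mechanics only.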

wylie_et_al._2019.pdf
