Selected Recent Publications

  1. LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding, arXiv (2024). [PDF]

  2. Agent-as-a-Judge: Evaluate Agents with Agents, arXiv (2024). [PDF]

  3. High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching, arXiv (2024). [PDF]

  4. MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases, ICML 2024. [PDF]

  5. EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything, CVPR 2024 (Highlight). [PDF]

  6. Taming Mode Collapse in Score Distillation for Text-to-3D Generation, CVPR 2024. [PDF]

  7. CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians, ECCV 2024. [PDF]

  8. MVDiffHD: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction, ECCV 2024. [PDF]

  9. TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression for On-Device ASR Models, ICASSP 2024. [PDF]

  10. Stack-and-Delay: A New Codebook Pattern for Music Generation, ICASSP 2024. [PDF]

  11. In-Context Prompt Editing for Conditional Audio Generation, ICASSP 2024. [PDF]

  12. On the Open Prompt Challenge in Conditional Audio Generation, ICASSP 2024. [PDF]

  13. Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition, ICASSP 2024. [PDF]

  14. LLM-QAT: Data-Free Quantization Aware Training for Large Language Models, Findings of ACL 2024. [PDF]

  15. Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts, Findings of ACL 2024. [PDF]

  16. An Introduction to Vision-Language Modeling, arXiv (2024). [PDF]

  17. PathFusion: Path-Consistent Lidar-Camera Deep Feature Fusion, 3DV 2024. [PDF]

  18. SpinQuant: LLM Quantization with Learned Rotations, arXiv (2024). [PDF]

  19. Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications, arXiv (2024). [PDF]

  20. DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads, ASPLOS 2024. [PDF]

  21. SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity, arXiv (2023). [PDF]

  22. MiniGPT-v2: Large Language Model As a Unified Interface for Vision-Language Multi-task Learning, arXiv (2023). [PDF]

  23. Revisiting Sample Size Determination in Natural Language Understanding, arXiv (2023). [PDF]

  24. Enhance Audio Generation Controllability through Representation Similarity Regularization, arXiv (2023). [PDF]

  25. Exploring Speech Enhancement for Low-resource Speech Synthesis, arXiv (2023). [PDF]

  26. FoleyGen: Visually-Guided Audio Generation, arXiv (2023). [PDF]

  27. Towards Zero-Shot Multilingual Transfer for Code-Switched Responses, ACL 2023. [PDF]

  28. XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse, MLSys 2023. [PDF]

  29. Fast Point Cloud Generation with Straight Flows, CVPR 2023. [PDF]

  30. LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting, arXiv (2022). [PDF]

  31. Feature-Align Network with Knowledge Distillation for Efficient Denoising, WACV 2022. [PDF]

  32. NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training, ICLR 2022. [PDF]

  33. Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation, CVPR 2022. [PDF]

  34. DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks, ICML 2022. [PDF]

  35. Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet, ICASSP 2022. [PDF]

  36. Streaming Parallel Transducer Beam Search with Fast-Slow Cascaded Encoders, INTERSPEECH 2022. [PDF]

  37. ScaleNAS: Multi-Path One-Shot NAS for Scale-Aware High-Resolution Representation, AutoML 2022. [PDF]

  38. Contrastive Quant: Quantization Makes Stronger Contrastive Learning, DAC 2022. [PDF]

  39. Feature-Align Network with Knowledge Distillation for Efficient Denoising, WACV 2022. [PDF]

  40. CPT: Efficient Deep Neural Network Training via Cyclic Precision, ICLR 2021 (Spotlight Presentation). [PDF]

  41. AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling, CVPR 2021. [PDF]

  42. KeepAugment: A Simple Information-Preserving Data Augmentation Approach, CVPR 2021. [PDF]

  43. AlphaNet: Improved Training of Supernet with Alpha-Divergence, ICML 2021 (Long Presentation). [PDF]

  44. Double-win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inference, ICML 2021. [PDF]

  45. NASGEM: Neural Architecture Search via Graph Embedding Method, AAAI 2021. [PDF]

  46. Collaborative Training of Acoustic Encoders for Speech Recognition, INTERSPEECH 2021. [PDF]

  47. Memory-efficient Speech Recognition on Smart Devices, ICASSP 2021. [PDF]

  48. Heterogeneous Dataflow Accelerators for Multi-DNN Workloads, HPCA 2021. [PDF]

  49. EVRNet: Efficient Video Restoration on Edge Devices, International Conference on Multimedia, 2021. [PDF]

  50. Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search, ASPLOS 2021. [PDF]

  51. Noisy Training Improves E2E ASR for the Edge, arXiv (2021). [PDF]

  52. Low-Rank+ Sparse Tensor Compression for Neural Networks, arXiv (2021). [PDF]

  53. Vision Transformers with Patch Diversification, arXiv (2021). [PDF]

  54. Can Temporal Information Help with Contrastive Self-Supervised Learning?, arXiv (2020). [PDF]

  55. DNA: Differentiable Network-Accelerator Co-Search, arXiv (2020). [PDF]

  56. One weight bitwidth to rule them all, Embedded Vision Workshop, ECCV 2020 (Best Paper Award). [PDF]

  57. Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks, DAC 2020. [PDF]

  58. RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing, ISCA 2020. [PDF]

  59. Energy-Aware Neural Architecture Optimization With Splitting Steepest Descent, Workshop on Energy Efficient Machine Learning and Cognitive Computing, NeurIPS (2019). [PDF]

  60. Improving Efficiency in Neural Network Accelerator using Operands Hamming Distance Optimization, Workshop on Energy Efficient Machine Learning and Cognitive Computing, NeurIPS (2019). [PDF]

  61. Federated Learning with Non-IID Data, arXiv (2018). [PDF]

  62. CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs, arXiv (2018). [PDF]

  63. Not All Ops are Created Equal!, SysML (2018). [PDF]

  64. PrivyNet: A Flexible Framework for Privacy-Preserving Deep Neural Network Training, arXiv (2018). [PDF]

  65. Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks, International Symposium on Computer Architecture, 2018. [PDF]

  66. Hello Edge: Keyword Spotting on Microcontrollers, arXiv (2017). [PDF]

  67. Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations, arXiv (2017). [PDF]

  68. Throughput-optimized OpenCL-based FPGA accelerator for large-scale convolutional neural networks, FPGA Conference (2016). [PDF]