Selected Recent Publications

  1. CPT: Efficient Deep Neural Network Training via Cyclic Precision, ICLR 2021 (Spotlight Presentation). [PDF]

  2. Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search, ASPLOS 2021. [PDF]

  3. Memory-efficient Speech Recognition on Smart Devices, ICASSP 2021. [PDF]

  4. NASGEM: Neural Architecture Search via Graph Embedding Method, AAAI 2021. [PDF]

  5. Feature-Align Network with Knowledge Distillation for Efficient Denoising, arXiv (2021). [PDF]

  6. AlphaNet: Improved Training of Supernet with Alpha-Divergence, arXiv (2021). [PDF]

  7. Heterogeneous Dataflow Accelerators for Multi-DNN Workloads, HPCA 2021. [PDF]

  8. EVRNet: Efficient Video Restoration on Edge Devices, arXiv (2020). [PDF]

  9. ScaleNAS: One-Shot Learning of Scale-Aware Representations for Visual Recognition, arXiv (2020). [PDF]

  10. AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling, arXiv (2020). [PDF]

  11. Can Temporal Information Help with Contrastive Self-Supervised Learning?, arXiv (2020). [PDF]

  12. KeepAugment: A Simple Information-Preserving Data Augmentation Approach, arXiv (2020). [PDF]

  13. DNA: Differentiable Network-Accelerator Co-Search, arXiv (2020). [PDF]

  14. One weight bitwidth to rule them all, Embedded Vision Workshop, ECCV 2020 (Best Paper Award). [PDF]

  15. Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks, DAC 2020. [PDF]

  16. RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing, ISCA 2020. [PDF]

  17. Energy-Aware Neural Architecture Optimization With Splitting Steepest Descent, Workshop on Energy Efficient Machine Learning and Cognitive Computing, NeurIPS 2019. [PDF]

  18. Improving Efficiency in Neural Network Accelerator using Operands Hamming Distance Optimization, Workshop on Energy Efficient Machine Learning and Cognitive Computing, NeurIPS 2019. [PDF]

  19. Federated Learning with Non-IID Data, arXiv (2018). [PDF]

  20. CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs, arXiv (2018). [PDF]

  21. Not All Ops are Created Equal!, arXiv (2018). [PDF]

  22. PrivyNet: A Flexible Framework for Privacy-Preserving Deep Neural Network Training, arXiv (2018). [PDF]

  23. Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks, ISCA 2018. [PDF]

  24. Hello Edge: Keyword Spotting on Microcontrollers, arXiv (2017). [PDF]

  25. Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations, arXiv (2017). [PDF]

  26. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks, FPGA 2016. [PDF]