ENCODE Lab

Yungu Campus
Westlake University
Hangzhou, China
About Us:
The ENCODE Lab is led by Dr. Huan Wang, a Tenure-Track Assistant Professor in the AI Department at Westlake University. Our lab is dedicated to advancing Artificial Intelligence by creating efficient and effective AI solutions.
Research Focus:
Our research focuses on Efficient AI in vision and language modeling, spanning image classification / detection / segmentation [GReg, PaI-Survey, TPP], neural style transfer [Ultra-Resolution-NST], single-image super-resolution [ASSL/GASSL, SRP, ISS-P, Oracle-Pruning-Sanity-Check], 3D novel view synthesis / neural rendering / NeRF / NeLF [R2L, MobileR2L, LightAvatar], AIGC / diffusion models / Stable Diffusion [SnapFusion, FreeBlend], LLM / MLLM [DyCoke, Poison-as-Cure], and snapshot compressive imaging (SCI) [QuantizedSCI, MobileSCI].
Our Mission:
Our mission is to advance AI by creating efficient, broadly applicable methods and models. We are dedicated to driving theoretical innovation and delivering tangible solutions to diverse real-world problems.
News
2025/02 | [CVPR'25] DyCoke is accepted by CVPR’25! Congrats to Keda! 🎉 DyCoke is a training-free, plug-and-play token compression method for fast video LLMs: 1.5x wall-clock inference speedup and 1.4x memory reduction with no performance drop. [arxiv] [code]
2025/02 | [Preprint] Can diffusion models blend visual concepts that are semantically very dissimilar (e.g., an orange and a teddy bear)? Yes: we introduce FreeBlend, a new method to blend arbitrary concepts. [arxiv] [code] [webpage]
2025/01 | [Preprint] Is adversarial visual noise always malicious “poison” to our models? No: we find it can also be a cure that mitigates the hallucination problem of VLMs. [arxiv] [code] [webpage]
2025/01 | [ICLR'25] Our paper on distilling large foundation models at low cost, “Compressing Vision Foundation Models at ImageNet-level Costs”, is accepted by ICLR’25. Thanks to the lead author Yitian!
2024/12 | [Preprint] We present empirical evidence to show that oracle pruning, the “ground-truth” pruning paradigm that has been followed for around 35 years in the pruning community, does not hold in practice. [arxiv] [webpage]
2024/07 | [NeurIPS'24] We introduce Scala, a training framework for learning slimmable ViTs. With Scala, a ViT is trained once but can run inference at different widths, adapting to devices with different resource budgets. The project is led by Yitian. Congrats!
2024/07 | [MM'24] We present the first real-time on-device video SCI (Snapshot Compressive Imaging) framework via dedicated network design and a distillation-based training strategy. Congrats to Miao!
2024/07 | [ECCV'24] Our paper on efficient video SCI via network quantization is accepted by ECCV’24 as an oral. Congrats to Miao! [code]