Projects & Thoughts

[Blog with Together.ai] Speculative decoding for high-throughput long-context inference

25 min read · September 05, 2024 · together.ai blog on MagicDec

2024
[Blog with Infini AI Lab] MagicDec: Breaking the Latency-Throughput Tradeoff for Long Contexts with Speculative Decoding

14 min read · August 23, 2024 · Blog on MagicDec

2024
Reshaping Bonsai

Pruning LLMs for Mathematical Reasoning. Can we prune LLMs while maintaining their mathematical reasoning abilities? How does a novel comprehensive metric affect pruning?

6 min read · May 21, 2024

2024 · projects · deep-learning nlp llm pruning
Visual Prompt Tuning

Can you transfer prompts? Where is the best place to append them? Do they improve adversarial robustness? Find out here :)

3 min read · May 20, 2024

2024 · projects · deep-learning computer-vision visual-prompt-tuning