Faster LLMs: Accelerate Inference with Speculative Decoding
11 months ago
ibm.com
23:40
Speculative Speculative Decoding: How to Parallelize Drafting and ... for 2x Faster LLM Inference
178 views
2 months ago
YouTube
Xiaol.x
How to Quadruple LLM Decoding Performance with Speculative Decoding (SpD) and Microscaling (MX) Formats on Qualcomm® Cloud AI 100
Aug 1, 2024
qualcomm.com
Speculative Decoding — Think Fast⚡, Then Think Right✅
Apr 13, 2025
substack.com
6:18
What is Speculative Sampling? | Boosting LLM inference speed
4K views
Nov 20, 2024
YouTube
AssemblyAI
0:31
Speculative Decoding • LLM Acceleration Patterns
1 view
1 month ago
YouTube
Technical Interview Essentials A–Z
1:16:02
Speculative Decoding and Efficient LLM Inference with Chris Lott - 717
1.8K views
Feb 3, 2025
YouTube
The TWIML AI Podcast with Sam Charrington
🌵 Speculative Speculative Decoding. What if your draft model could speculate while the target model is still verifying? That's the idea behind Speculative Speculative Decoding (SSD). I've been… | Maxime Labonne | 15 comments
15 views
2 months ago
linkedin.com
1:23
Speculative Speculative Decoding for Faster LLM Inference
2.1K views
2 months ago
YouTube
Rajistics - data science, AI, and machine learning
37:34
Speculative Decoding Explained
7.8K views
Dec 21, 2023
YouTube
Trelis Research
14:37
Understanding Speculative Decoding: Boosting LLM Efficiency and Speed
470 views
Apr 6, 2025
YouTube
MLWorks
2:27:59
COLING 2025 Tutorial: Speculative Decoding for Efficient LLM Inference
398 views
Jan 23, 2025
bilibili
云安Ann
17:56
Behind the Stack, Ep 11 - Speculative Decoding
70 views
6 months ago
YouTube
Doubleword
12:45
Speculative Decoding & Inference Speed — 2-3x Faster LLMs With Zero Quality Loss
2 weeks ago
YouTube
Jeff Heidelberger
24:17
Fast Inference from Transformers via Speculative Decoding
1.3K views
Sep 12, 2023
YouTube
Arxiv Papers
40:19
Speculation is all you need: Intro to Speculative Decoding for High Performance Inference
753 views
2 months ago
YouTube
Modal
5:04
Speculative Decoding: 2-3x Faster LLMs for Free
1 view
1 month ago
YouTube
The AI Century
1:05
What is Speculative decoding - Speculative decoding Explained #generativeai #RAG #ai #llm
320 views
2 months ago
YouTube
Med Bou | AI Tutorials
0:36
How AI Replies So Fast! ⚡ Speculative Decoding
164 views
4 months ago
YouTube
Mr. Doubty – Short. Smart. Techy
12:46
Speculative Decoding: When Two LLMs are Faster than One
32.9K views
Oct 12, 2023
YouTube
Efficient NLP
7:40
Speculative Decoding: 3× Faster LLM Inference with Zero Quality Loss
709 views
5 months ago
YouTube
Tales Of Tensors
0:26
Researchers found a way to make LLMs 8.5x faster! (without compromising accuracy) Speculative decoding is quite an effective way to address the single-token bottleneck in traditional LLM inference. A small "draft" model first generates the next several tokens, then the large model verifies all of them at once in a single forward pass. If a token at any position is wrong, you keep everything before it and restart from there. This never does worse than normal decoding. But current drafters in Speculati
10K views
1 week ago
x.com
Avi Chawla
1:08:32
A New Paradigm for LLM Inference Acceleration! The Latest Survey of Speculative Decoding
3.2K views
Mar 2, 2024
bilibili
NICE学术
8:44
How to PROPERLY Use Speculative Decoding in LM Studio to DOUBLE Your AI Speed
1.9K views
3 months ago
YouTube
AsapGuide
7:00
Speculative Decoding with OpenVINO | Intel Software
197K views
10 months ago
YouTube
Intel Devs
1:09
This AI Trick Gives You 3x Speed For FREE
98 views
1 month ago
YouTube
The AI Century
2:42
AI Explained: Speculative decoding with vLLM
1.1K views
2 months ago
YouTube
Red Hat
9:39
Faster LLMs: Accelerate Inference with Speculative Decoding
22.1K views
11 months ago
YouTube
IBM Technology
0:46
Speculative Decoding Turbocharge Your LLM Inference! #ai, #llm, #inference, #optimization
67 views
3 months ago
YouTube
The Code Architect
6:53
How Speculative Decoding Makes LLMs 2.5x Faster (The Secret to Faster AI)
159 views
8 months ago
YouTube
FranksWorld of AI