You can now convert any LLM into a faster one without retraining from scratch.

NVIDIA just did this to their 30B model. Here's the trick:

1. Duplicate the model into two copies.
2. Freeze one copy; it just reads the prompt and remembers context.
3. Train the other copy to write chunks of text at once instead of one word at a time.
4. Run them together.

The frozen copy barely costs anything (it's already trained). The new copy only needed ~8% of the original training data to learn the new trick.

Result: 2.4x fa
0:13
103.4K views, 1 day ago
x.com, Lior Alexander
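To make the four steps concrete, here is a minimal toy sketch in plain Python of how the two copies might cooperate at generation time. It is not NVIDIA's implementation: `generate`, `toy_read_context`, `toy_write_chunk`, and the chunk size of 4 are all hypothetical stand-ins, and no real model or training is involved. It only shows why writing a chunk per pass needs fewer sequential passes than writing one token per pass.

```python
from typing import Callable, List, Sequence, Tuple


def generate(
    prompt: Sequence[str],
    num_tokens: int,
    chunk_size: int,
    read_context: Callable[[Sequence[str]], object],
    write_chunk: Callable[[object, List[str], int], List[str]],
) -> Tuple[List[str], int]:
    """Emit num_tokens tokens and count the sequential writer passes used."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    # Frozen copy (hypothetical stand-in): reads the prompt once.
    context = read_context(prompt)

    output: List[str] = []
    passes = 0
    while len(output) < num_tokens:
        want = min(chunk_size, num_tokens - len(output))
        # Trained copy (hypothetical stand-in): writes up to `want` tokens.
        chunk = write_chunk(context, output, want)
        if not chunk:
            break  # the writer produced nothing; stop instead of looping forever
        output.extend(chunk[:want])
        passes += 1
    return output, passes


def toy_read_context(prompt: Sequence[str]) -> tuple:
    """Toy 'frozen reader': just remembers the prompt."""
    return tuple(prompt)


def toy_write_chunk(context: object, so_far: List[str], want: int) -> List[str]:
    """Toy 'chunk writer': emits numbered placeholder tokens."""
    return [f"tok{len(so_far) + i}" for i in range(want)]


prompt = ["hello", "world"]
one_by_one, passes_single = generate(prompt, 10, 1, toy_read_context, toy_write_chunk)
chunked, passes_chunked = generate(prompt, 10, 4, toy_read_context, toy_write_chunk)
print(passes_single, passes_chunked)
```

In this toy, both runs produce the same ten placeholder tokens, but the chunked run needs fewer sequential passes. How a reduction in passes translates into wall-clock speed depends on the real model and hardware, which this sketch does not model.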