Discussion about this post

John Holman

Lol 'Speak softly and carry a big GPU' indeed 😂

Speaking of big GPUs, I'd love to hear your take on Cerebras' WSE. Have you actually seen one in action? It sounds like they’ve decoupled memory and compute, which is huge for running and training big models. Reports say they're running 2,000 tokens a second vs. 130 on an H100 cluster, and that training time for a Llama 3.1 70B went from a month to a day. Finally, they leaned into the yield problem and engineered through it. As long as TSMC can reliably get the dinner-plate-sized wafers out the door, I would think there's gonna be a line for them out the door and around the block.

patrick gallagher

“However, the next 59,000 years of care will be paid for differently.”

Too funny. I wonder what the Neanderthal would have paid for a little Novocaine?
