Mixture of Experts

Omar Sanseviero @osanseviero Grok weights are out. Download them quickly at https://huggingface.co/xai-org/grok-1

huggingface-cli download xai-org/grok-1 --repo-type model --include ckpt/tensor* --local-dir checkpoints/ckpt-0 --local-dir-use-symlinks False

Learn about mixture of experts at https://hf.co/blog/moe
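
For reference, the quoted CLI call can also be sketched from Python with huggingface_hub's snapshot_download; the arguments below simply mirror the flags in the command above, assuming huggingface_hub is installed (the full set of checkpoint shards is roughly 300 GB).

# Sketch: same download via the huggingface_hub Python API instead of the CLI.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xai-org/grok-1",
    repo_type="model",
    allow_patterns="ckpt/tensor*",    # mirrors --include ckpt/tensor*
    local_dir="checkpoints/ckpt-0",
    local_dir_use_symlinks=False,     # mirrors --local-dir-use-symlinks False
)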
Replying to @osanseviero

It seems there is a conflict between saying “Grok-1 open-weights model” and “Due to the large size of the model (314B parameters), a multi-GPU machine is required” on the same page. Your people need to learn that “open data” means it has to be accessible to all: a few weights at a time, statistical summaries, examples, an open description, links and background, some effort at an open community. “Here, we dumped this on the web. See, we shared!!” “You have no way to verify it, because you do not have the big computers that we have!!”
 

Your “Mixture of Experts Explained” post is very revealing. Yes, I looked at your “Open Source MoEs” project links.

Are those few groups with big computers supposed to be the “Mixture of Experts”? It is not that hard if you actually share and use global open tokens.  When I see the flopping door on the suborbital test, I know why you do things the way you do. Tell him to stop saying “open”.
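
For what it is worth, the “Mixture of Experts” in that post is an architectural term, not a group of people: each transformer block holds several feed-forward “experts” and a learned gate that routes each token to a few of them. Below is a minimal sketch of a top-2 gated MoE layer; the names, shapes, and plain-numpy routing loop are illustrative only, not Grok-1’s actual code.

# Minimal sketch of a top-2 gated mixture-of-experts layer (illustrative only).
# Each token is routed to the top_k experts with the highest gate scores;
# only those experts contribute to that token's output.
import numpy as np

def moe_layer(x, gate_w, expert_ws, top_k=2):
    """x: (tokens, d_model); gate_w: (d_model, n_experts);
    expert_ws: list of (d_model, d_model) weight matrices, one per expert."""
    logits = x @ gate_w                               # (tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # softmax over experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(probs[t])[-top_k:]           # indices of the top-k experts
        weights = probs[t, top] / probs[t, top].sum() # renormalize their gate weights
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ expert_ws[e])       # weighted sum of expert outputs
    return out

# Tiny usage example with random weights: 4 tokens, d_model=8, 4 experts.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
gate_w = rng.normal(size=(8, 4))
expert_ws = [rng.normal(size=(8, 8)) for _ in range(4)]
print(moe_layer(x, gate_w, expert_ws).shape)          # (4, 8)

Because only the selected experts run for a given token, a 314B-parameter MoE model activates only a fraction of those parameters per token, which is part of why the storage footprint and the compute footprint are discussed separately.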

 
Do you know that “Moe” in Japanese means “feelings of strong affection toward characters in anime, manga, video games, and other media, directed at the otaku markets”? So I recommend you always write out “Mixture of Experts approach.”

Richard Collins, The Internet Foundation
