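Elon Musk, the world's richest man (马“死磕”, as I like to call him), first opened up all of his electric-car technology. Two days ago his company xAI went a step further and officially open-sourced Grok-1, a large language model with 314 billion parameters, releasing both the weights and the network architecture.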

A Mixture-of-Experts model

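Grok-1 is the first generation of Grok, released by xAI, Musk's AI startup. It has 314 billion parameters, far more than the 175 billion usually cited for OpenAI's GPT-3.5 (that figure officially belongs to GPT-3; OpenAI has not published a parameter count for GPT-3.5).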

On March 17, 2024 (local time), Musk announced that Grok-1 was open source.

That makes Grok-1 the open-source large language model with the most parameters to date (compared with other open models).

Grok-1 uses a Mixture-of-Experts (MoE) architecture in which 25% of the weights are active on any given token.
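To make the "only part of the weights is active" idea concrete, here is a minimal pure-Python sketch of top-2 routing over 8 experts. It is an illustration only, not xAI's implementation: the function names, the toy experts, and the choice to renormalise the two selected weights are all assumptions of this sketch.

```python
import math

NUM_EXPERTS = 8  # from the model description
TOP_K = 2        # experts selected per token

def route_token(router_logits, k=TOP_K):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Softmax over all experts, stabilised by subtracting the max logit.
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalise so the selected weights sum to 1 (an assumption of this sketch).
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits):
    """Mix the outputs of the selected experts for a single token vector x."""
    out = [0.0] * len(x)
    for idx, weight in route_token(router_logits):
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

# Toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, s=i + 1: [s * v for v in x] for i in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 1.0]

selected = route_token(logits)
mixed = moe_forward([1.0, -2.0], experts, logits)
print(selected)
print(mixed)
```

Only two of the eight expert functions are called per token, which is why the compute per token is much smaller than the total parameter count suggests.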

xAI trained the model from scratch in October 2023 using a custom training stack, and has open-sourced it under the Apache 2.0 license.

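Tuanzi (that's me) keeps wondering whether AI will go the way electric cars did and spring up all over China like bamboo shoots after rain. I don't think that would be a bad thing; after all, it benefits all of humanity. Technological progress depends on the spirit of the open internet, and even more on our willingness to learn from and build on shared knowledge. I hope China grows stronger, and that Chinese internet companies and engineers do too. Keep it up!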

Is the Grok-1 large language model slightly behind when it goes head to head with OpenAI ("CloseAI")?

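Keep in mind, though, that Grok-1 is an open-source project. Compared with today's OpenAI, which might as well be called "CloseAI" from now on, I personally think Grok-1 comes out ahead, arguably by a wide margin. Grok's contribution to technological progress is far from small, whereas OpenAI ("CloseAI"), for now at least, seems to be in it for the money rather than to serve society as a whole. Admittedly, it has brought us value too. Once again, thanks to Mr. Musk (马“死磕”) for so many surprises. I remember him saying earlier that he might release the model, but I did not expect it to happen this fast: less than a week later it was out. (This is only my personal opinion; if I got something wrong, please point it out in the comments. Thanks in advance from Tuanzi!)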

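1. Download the Grok-1 open model: 【magnet link】. To fetch it, install either the 【enhanced torrent client】 from another GitHub project or the 【qBittorrent installer for your operating system】, then start the download.

2. Grok-1 open-source project: 【Github】

3. If you want to run Grok-1 in the cloud: after comparing most cloud providers, Tuanzi's impression is that a ready-to-use instance costs roughly 20 to 50 US dollars per hour, for example 【renting H100 cloud GPUs from Vultr】.

4. You can also pull it directly from the HuggingFace 🤗 Hub: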
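For readers who would rather script the Hub download in Python than use the command line, the following is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. The repository id `xai-org/grok-1` and the `ckpt-0/*` pattern come from the README quoted later in this post; the helper names `download_kwargs` and `download_checkpoint` are hypothetical, and the checkpoint is hundreds of gigabytes, so check disk space before calling it.

```python
def download_kwargs(local_dir="checkpoints"):
    """Arguments for snapshot_download; helper name is made up for this sketch."""
    return {
        "repo_id": "xai-org/grok-1",      # from the README
        "repo_type": "model",
        "allow_patterns": ["ckpt-0/*"],   # only fetch the checkpoint directory
        "local_dir": local_dir,
    }

def download_checkpoint(local_dir="checkpoints"):
    """Download the Grok-1 checkpoint into local_dir and return the local path."""
    # Imported lazily so this file can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(**download_kwargs(local_dir))

# Usage (starts a very large download):
#   path = download_checkpoint("checkpoints")
```

Keeping the arguments in a separate function makes it easy to inspect or log exactly what will be requested before any download starts.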

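Model overview:

Base model trained on a large amount of text data, not fine-tuned for any particular task
314-billion-parameter MoE model with 25% of the weights active on a given token
Uses rotary embeddings rather than fixed positional embeddings
Tokenizer vocabulary size 131,072; embedding size 6,144
64 Transformer layers; each decoder layer contains a multi-head attention block and a dense block
Multi-head attention: 48 heads for queries, 8 heads for keys/values, key/value size 128
Dense block: widening factor 8, hidden layer size 32,768
2 of 8 experts selected per token
Rotary positional embedding size 6,144
Context length 8,192 tokens; precision bf16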
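As a sanity check on the overview above, the listed dimensions can be plugged into a rough parameter count. This is a back-of-the-envelope sketch, not an official breakdown. It assumes each expert's dense block has three weight matrices (a gated feed-forward layout), that input and output embeddings are shared, and it ignores normalisation weights.

```python
# Dimensions from the model overview; the feed-forward layout is an assumption.
VOCAB = 131_072
D_MODEL = 6_144
N_LAYERS = 64
N_Q_HEADS, N_KV_HEADS, HEAD_DIM = 48, 8, 128
D_FF = 32_768
N_EXPERTS, EXPERTS_PER_TOKEN = 8, 2
FFN_MATRICES = 3  # assumption: gate, up, and down projections per expert

def attention_params():
    q = D_MODEL * N_Q_HEADS * HEAD_DIM        # query projection
    kv = 2 * D_MODEL * N_KV_HEADS * HEAD_DIM  # key and value projections
    o = N_Q_HEADS * HEAD_DIM * D_MODEL        # output projection
    return q + kv + o

def expert_params():
    return FFN_MATRICES * D_MODEL * D_FF

def router_params():
    return D_MODEL * N_EXPERTS

def model_params(experts_counted):
    """Embedding plus all layers, counting `experts_counted` experts per layer."""
    per_layer = attention_params() + experts_counted * expert_params() + router_params()
    return VOCAB * D_MODEL + N_LAYERS * per_layer

total = model_params(N_EXPERTS)
active = model_params(EXPERTS_PER_TOKEN)
print(f"total  ~ {total / 1e9:.1f}B")
print(f"active ~ {active / 1e9:.1f}B ({active / total:.1%} of total)")
```

Under these assumptions the total lands within about one percent of the published 314B, and the share of weights touched per token comes out at roughly a quarter, consistent with the 25% figure.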

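Model capabilities:

Outperforms models of comparable compute on standard LM benchmarks
HumanEval coding task: 63.2%; MMLU: 73%
Earned a C (59%) on the Hungarian national high school mathematics exam
Strong overall on reasoning and coding tasks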

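Limitations:

No ability to search the web on its own; needs to be paired with search tools
May hallucinate, so its output needs human review
Currently cannot fetch real-time information the way the paid version on the X platform does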

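Why open-sourcing matters:

Released under the Apache 2.0 license, so users are free to use, modify, and distribute it
Reflects xAI's stated commitment to transparency and community openness
Provides a valuable resource for further research and innovation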

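Overall, as a large-scale open-source language model, Grok-1 matters both for its capabilities and for its transparency, and it deserves attention and exploration from people inside and outside the industry. Real-world use still has to combine it with other tools and human review to get the most out of it.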

Original text and address: 【Github】https://github.com/xai-org/grok-1

Grok-1

This repository contains JAX example code for loading and running the Grok-1 open-weights model.

Make sure to download the checkpoint and place the ckpt-0 directory in checkpoints - see Downloading the weights

Then, run


pip install -r requirements.txt
python run.py

to test the code.

The script loads the checkpoint and samples from the model on a test input.

Due to the large size of the model (314B parameters), a machine with enough GPU memory is required to test the model with the example code. The implementation of the MoE layer in this repository is not efficient. The implementation was chosen to avoid the need for custom kernels to validate the correctness of the model.
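To get a feel for what "enough GPU memory" means, here is a rough weights-only estimate. It is a sketch under stated assumptions: the 80 GB per-GPU figure is an assumption (it matches common H100 configurations), and the estimate ignores activations, the KV cache, and framework overhead, so real requirements are higher.

```python
import math

N_PARAMS = 314e9                         # from the README
BYTES_PER_PARAM = {"bf16": 2, "int8": 1}

def weight_memory_gb(n_params, dtype):
    """Memory for the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

def min_gpus(n_params, dtype, gpu_mem_gb=80):
    """Lower bound on GPU count; gpu_mem_gb=80 is an assumption of this sketch."""
    return math.ceil(weight_memory_gb(n_params, dtype) / gpu_mem_gb)

for dtype in BYTES_PER_PARAM:
    gb = weight_memory_gb(N_PARAMS, dtype)
    print(f"{dtype}: {gb:.0f} GB of weights, at least {min_gpus(N_PARAMS, dtype)} GPUs")
```

In bf16 the weights alone come to 628 GB, which already needs eight 80 GB cards; 8-bit quantization halves that to 314 GB.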

Model Specifications

Grok-1 is currently designed with the following specifications:

Parameters: 314B
Architecture: Mixture of 8 Experts (MoE)
Experts Utilization: 2 experts used per token
Layers: 64
Attention Heads: 48 for queries, 8 for keys/values
Embedding Size: 6,144
Tokenization: SentencePiece tokenizer with 131,072 tokens
Additional Features:
- Rotary embeddings (RoPE)
- Supports activation sharding and 8-bit quantization
Maximum Sequence Length (context): 8,192 tokens
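The rotary embeddings (RoPE) listed above encode position by rotating pairs of vector components by an angle that grows with the token's position. The sketch below shows the idea in pure Python. It is illustrative only: the base of 10000 and the pairing of adjacent components are common conventions assumed here, not details confirmed by the README.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by position * base**(-i / d)."""
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower dimensions rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.9]

# The attention score depends only on the relative offset (here 2 in both cases).
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 2), rope(k, 0))
print(score_a, score_b)
```

Because the query and key are rotated by position-dependent angles, their dot product depends only on the distance between the two positions, which is the property that makes rotary embeddings attractive compared with fixed positional embeddings.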

Downloading the weights

You can download the weights using a torrent client and this magnet link:
Note: (magnet link)


magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

or directly using HuggingFace 🤗 Hub: (note: the preceding part is a link to Musk's (马“死磕”) personal X account)


git clone https://github.com/xai-org/grok-1.git && cd grok-1
pip install huggingface_hub[hf_transfer]
huggingface-cli download xai-org/grok-1 --repo-type model --include ckpt-0/* --local-dir checkpoints --local-dir-use-symlinks False

License

The code and associated Grok-1 weights in this release are licensed under the Apache 2.0 license. The license only applies to the source files in this repository and the model weights of Grok-1.