ENOT-AutoDL/gpt-j-6B-tensorrt-int8
Tags: Text Generation, Transformers, ONNX, lambada, English, text-generation-inference, causal-lm, int8, tensorrt, ENOT-AutoDL, Inference Endpoints
License: apache-2.0
3 contributors. History: 13 commits.
Latest commit: igor, "added onnx model (fake quant) compatible with trt" (554833e), over 1 year ago
File                                             Scan  Size     LFS  Last commit                                        Updated
.gitattributes                                   Safe  1.57 kB  -    added onnx model (fake quant) compatible with trt  over 1 year ago
NVIDIA_GeForce_RTX_2080_Ti-8_5_3_1-i8f32.engine  Safe  8.5 GB   LFS  added 2080ti engine                                over 1 year ago
NVIDIA_GeForce_RTX_3080_Ti-8_5_3_1-i8f32.engine  Safe  8.5 GB   LFS  normalized engine name                             over 1 year ago
NVIDIA_GeForce_RTX_4090-8_5_3_1-i8f32.engine     Safe  8.5 GB   LFS  added 4090 engine                                  over 1 year ago
README.md                                        Safe  1.72 kB  -    updated README.md (added latency table)            over 1 year ago
gptj-i8.data                                     Safe  24.3 GB  LFS  added onnx model (fake quant) compatible with trt  over 1 year ago
gptj-i8.onnx                                     Safe  1.61 MB  LFS  added onnx model (fake quant) compatible with trt  over 1 year ago
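
The three engine files appear to share one naming pattern: the GPU name with spaces replaced by underscores, then a version string, then a precision tag. The sketch below rebuilds a filename from those parts and checks it against the files listed in this repository. The pattern itself is inferred from the filenames only, reading `8_5_3_1` as TensorRT version 8.5.3.1 is an assumption, and the helper names `engine_filename` and `find_engine` are hypothetical.

```python
import re
from typing import Optional

# Engine files present in the repository listing.
KNOWN_ENGINES = [
    "NVIDIA_GeForce_RTX_2080_Ti-8_5_3_1-i8f32.engine",
    "NVIDIA_GeForce_RTX_3080_Ti-8_5_3_1-i8f32.engine",
    "NVIDIA_GeForce_RTX_4090-8_5_3_1-i8f32.engine",
]


def engine_filename(gpu_name: str, version: str = "8.5.3.1",
                    precision: str = "i8f32") -> str:
    """Build an engine filename following the pattern inferred from the listing.

    Assumption: the middle field is a dotted version (presumably TensorRT)
    with the dots replaced by underscores.
    """
    gpu = re.sub(r"\s+", "_", gpu_name.strip())
    ver = version.replace(".", "_")
    return f"{gpu}-{ver}-{precision}.engine"


def find_engine(gpu_name: str, version: str = "8.5.3.1") -> Optional[str]:
    """Return the matching listed engine filename, or None if none is listed."""
    name = engine_filename(gpu_name, version)
    return name if name in KNOWN_ENGINES else None


if __name__ == "__main__":
    print(find_engine("NVIDIA GeForce RTX 4090"))
    print(find_engine("NVIDIA GeForce RTX 3090"))
```

A filename found this way could then be passed as the `filename` argument of `huggingface_hub.hf_hub_download` together with `repo_id="ENOT-AutoDL/gpt-j-6B-tensorrt-int8"`. Serialized TensorRT engines are generally tied to the GPU model and TensorRT version they were built with, which is probably why the repository ships one engine per GPU.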