Creative Writing Control Vectors Integration for ExLlamaV2
This project provides a wrapper to integrate jukofyork's creative writing control vectors with ExLlamaV2. While ExLlamaV2 does not natively support control vectors, this wrapper enables loading and injecting GGUF control vectors into the model for dynamic text generation control.
Overview
- Wrapper for using control vectors with ExLlamaV2
- Supports loading control vectors from GGUF format
- Injects vectors directly into ExLlamaV2 inference
- Enables dynamic text generation control
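For readers new to control vectors, the core idea behind "injection" can be sketched in a few lines: a per-layer direction vector is scaled and added to the hidden state at that layer. The snippet below is a minimal, dependency-free illustration of that arithmetic only. The function name `apply_control_vectors` is hypothetical, and this is not the wrapper's actual code or an ExLlamaV2 API.

```python
def apply_control_vectors(hidden, vectors):
    """Return hidden + sum(scale * direction) over all (direction, scale) pairs.

    `hidden` is one hidden-state vector as a list of floats; each direction
    must have the same length. Illustrative only: a real implementation
    would operate on tensors inside the model's forward pass.
    """
    out = list(hidden)  # copy so the caller's hidden state is not mutated
    for direction, scale in vectors:
        if len(direction) != len(out):
            raise ValueError("direction length must match hidden size")
        for i, d in enumerate(direction):
            out[i] += scale * d
    return out


hidden = [1.0, 2.0, 3.0]
steered = apply_control_vectors(
    hidden,
    [([1.0, 0.0, -1.0], 0.5), ([0.0, 2.0, 0.0], 0.25)],
)
print(steered)  # [1.5, 2.5, 2.5]
```

Several vectors combine by simple addition, which is why multiple entries can be passed at once with individual strengths.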
Usage
- Download a model in ExLlamaV2 format.
- Create a directory next to the model directory, named after the model directory with a "-vectors" suffix.
- Download the control vectors from jukofyork's repository and place them in the "-vectors" directory.
- Run inference with the --control_vectors (-vc) parameter.
Example command:
python test_inference.py -m Meta-Llama-3-70B-Instruct-8bpw \
-p "<prompt>" \
--control_vectors language:simple:0.5,optimism:optimism:0.5
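The value passed to --control_vectors in the example is a comma-separated list of entries, each with three colon-separated fields. A minimal sketch of how such a string could be parsed follows. The field names `category`, `direction`, and `strength` are assumptions read off the example and the vector file names, and `parse_control_vectors` is a hypothetical helper, not necessarily how the wrapper does it.

```python
from typing import List, NamedTuple


class VectorSpec(NamedTuple):
    category: str    # assumed meaning, e.g. "language"
    direction: str   # assumed meaning, e.g. "simple"
    strength: float  # assumed meaning: scale applied to the vector


def parse_control_vectors(arg: str) -> List[VectorSpec]:
    """Parse 'a:b:0.5,c:d:0.25' into a list of VectorSpec entries."""
    specs = []
    for entry in arg.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        parts = entry.split(":")
        if len(parts) != 3:
            raise ValueError(
                f"expected three colon-separated fields, got {entry!r}"
            )
        category, direction, strength = parts
        specs.append(VectorSpec(category, direction, float(strength)))
    return specs


for spec in parse_control_vectors("language:simple:0.5,optimism:optimism:0.5"):
    print(spec.category, spec.direction, spec.strength)
```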
Directory Structure
Ensure your directory structure follows this format to correctly load the control vectors:
models/
├── Meta-Llama-3-70B-Instruct-8bpw/
│   └── model files...
└── Meta-Llama-3-70B-Instruct-8bpw-vectors/
    ├── llama-3:70b-language__debias.gguf
    ├── llama-3:70b-language__simple.gguf
    ├── llama-3:70b-language__ornate.gguf
    └── ...
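To check the layout before running inference, a small script along these lines can derive the expected "-vectors" directory from the model path and list the GGUF files in it. This is a sketch under the naming convention shown above. `vectors_dir_for` and `list_vector_files` are hypothetical helpers and are not part of the wrapper.

```python
from pathlib import Path


def vectors_dir_for(model_dir):
    """Return the sibling directory '<model directory name>-vectors'."""
    model_dir = Path(model_dir)
    return model_dir.with_name(model_dir.name + "-vectors")


def list_vector_files(model_dir):
    """Return sorted .gguf file names from the vectors directory, or []."""
    vdir = vectors_dir_for(model_dir)
    if not vdir.is_dir():
        return []
    return sorted(p.name for p in vdir.glob("*.gguf"))


model = "models/Meta-Llama-3-70B-Instruct-8bpw"
print(vectors_dir_for(model))
for name in list_vector_files(model):
    print(" ", name)
```

An empty listing means either the "-vectors" directory is missing or it contains no .gguf files.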
Limitations
- Proof of concept implementation
- May impact model performance
- Limited testing with different vector combinations
- No guarantee of exact equivalence to llama.cpp behavior
Acknowledgments
- Control vectors from jukofyork's creative-writing-control-vectors-v3.0
- ExLlamaV2 by turboderp