---
base_model:
- inceptionai/jais-family-590m
- inceptionai/jais-family-590m
tags:
- merge
- mergekit
- lazymergekit
- inceptionai/jais-family-590m
- jais
- research
language:
- en
- ar
---
# Jais-590m-merged
Jais-590m-merged is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [inceptionai/jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m)
* [inceptionai/jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m)
(Yes, this is a straight merge of two identical, non-fine-tuned models, done for research purposes.)
## 🧩 Configuration
```yaml
slices:
  - sources:
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
      - model: inceptionai/jais-family-590m
        layer_range: [0, 18]
merge_method: slerp
base_model: inceptionai/jais-family-590m
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
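To make the `t` schedule and the `slerp` method in the configuration more concrete, here is a minimal pure-Python sketch. It assumes that the anchor values in each `value` list are spaced evenly and linearly interpolated across the 18 layers, and it uses a textbook spherical interpolation on flat vectors. The helper names `expand_gradient` and `slerp` are illustrative and are not mergekit's API; the real implementation may differ in details.

```python
import math

def expand_gradient(anchors, num_layers):
    """Linearly interpolate evenly spaced anchor values across num_layers layers."""
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, in [0, len(anchors) - 1]
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values

def slerp(t, a, b, eps=1e-8):
    """Textbook spherical linear interpolation between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, dot)))  # angle between a and b
    if abs(math.sin(omega)) < eps:
        # (Nearly) parallel vectors: fall back to plain linear interpolation
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

# Per-layer interpolation factors for the two filters in the config
self_attn_t = expand_gradient([0, 0.5, 0.3, 0.7, 1], 18)
mlp_t = expand_gradient([1, 0.5, 0.7, 0.3, 0], 18)
print([round(t, 2) for t in self_attn_t])

# Both sources are the same checkpoint here, so any t gives back the same weights
weights = [0.2, -1.0, 0.5]  # made-up stand-in for one flattened tensor
merged = slerp(self_attn_t[9], weights, weights)
print(merged)
```

Under these assumptions the two schedules mirror each other (they sum to 1 at every layer), and because both sources are identical the interpolation leaves each tensor unchanged up to floating-point and `bfloat16` rounding.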
## 💻 Usage
Because the Jais family tokenizer is deployed with `trust_remote_code`, the following implementation is suggested for running inference with this merged model, especially when handling Arabic.
(A notebook is saved in the repo to run in Google Colab or similar.)
```python
# Notebook-only shell command (Colab/Jupyter); in a terminal, drop the leading "!"
!pip install -qU transformers accelerate

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Model and message setup
model_name = "Solshine/Jais-590m-merged"
user_message = "Explain how transformers work in machine learning"  # This can be any user input

# Structure the message with role-content pairing for compatibility with Jais-chat format
messages = [{"role": "user", "content": user_message}]

# Initialize tokenizer with trust_remote_code for custom Arabic-English handling
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Check if tokenizer is valid
if tokenizer is None:
    raise ValueError("Tokenizer initialization failed!")

# Custom chat template including assistant role
def custom_chat_template(messages):
    chat_prompt = ""
    for message in messages:
        role = message["role"]
        content = message["content"]
        chat_prompt += f"{role}: {content}\n"
    # Add assistant role to prompt the model's response
    chat_prompt += "assistant:"
    return chat_prompt

# Generate the prompt
prompt = custom_chat_template(messages)
print(f"Generated prompt:\n{prompt}")

# Initialize the model
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
if model is None:
    raise ValueError("Model initialization failed!")

# Move model to the appropriate device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Initialize the text generation pipeline
text_gen_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Generate text
try:
    outputs = text_gen_pipeline(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # Ensure proper stopping
    )
    # Extract and print the assistant's response.
    # Split only on the first "assistant:" so the reply is not cut short
    # if the model itself emits that marker.
    generated_text = outputs[0]["generated_text"]
    assistant_response = generated_text.split("assistant:", 1)[1].strip()
    print(f"Assistant's response:\n{assistant_response}")
except Exception as e:
    print(f"Error during text generation: {e}")
```
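The prompt format built by `custom_chat_template` can be checked without downloading the model. The sketch below reproduces that format for a multi-turn conversation and shows one possible way to recover only the newly generated text by stripping the prompt prefix, which stays reliable when earlier turns already contain an `assistant:` marker. The names `build_prompt`, `extract_response`, and `fake_output` are hypothetical and used only for illustration; `fake_output` stands in for `outputs[0]["generated_text"]`.

```python
def build_prompt(messages):
    """Same format as custom_chat_template: one 'role: content' line per turn."""
    body = "".join(f"{m['role']}: {m['content']}\n" for m in messages)
    # The trailing 'assistant:' cues the model to answer as the assistant
    return body + "assistant:"

def extract_response(generated_text, prompt):
    """Return only the text that follows the prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fallback: keep everything after the first assistant marker
    return generated_text.split("assistant:", 1)[-1].strip()

messages = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What food crops are best to grow in Northern UAE?"},
]
prompt = build_prompt(messages)
print(prompt)

# Stand-in for the pipeline output, which by default echoes the prompt first
fake_output = prompt + " Vegetables."
print(extract_response(fake_output, prompt))
```

Stripping the prefix relies on the pipeline returning the prompt followed by the continuation, which is the default behaviour of the text-generation pipeline as used above.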
Examples:
```
user: ما هي الاعتبارات الأخلاقية الثلاثة الجيدة للرجل؟ ?
assistant:
Assistant's response:
ما هو الشيء الأكثر أهمية في الحياة؟
```
(English translation of the example above: the user asks "What are three good ethical considerations for a man?" and the model replies "What is the most important thing in life?")
```
user: What food crops are best to grow in Northern UAE?
assistant:
Assistant's response:
Vegetables.
```
```
user: What do you need to train a large language model?
assistant:
Assistant's response:
I need to train a model to recognize 10 different languages.
How can I do this?
A:
How can I do this?
You could do this in two ways:
Create a trained model using the provided source data (and the data it produces is not in your control)
Create a trained model using a different source data (and the data it produces is in your control)
The first way is much easier to implement than the second. As I said, you can use the source data in a separate model and use the model's training function to train the model that produces the data. I'm not sure if this is what you want or not, but it's possible.
A:
If you are training a model for 10 different languages, then you will need to train a model that recognizes 10 different languages.
This is possible, but it is not easy.
You can train a model for a specific language, say English, by training a model for that language. Then, when you train the model for 10 other languages, you will need to train a model for the 10 languages that don't have the same English as the one you trained for.
This is what
```
```
user: dog, cat, mouse, {}
assistant:
Assistant's response:
dog, cat, mouse, {}
I have a function that returns the list of items from the object.
def get_items(items):
for item in items:
return [item]
I would like to do something like this
assistant = get_items(items)
or
cat = get_items(items)
A:
You can use itertools.izip_longest() to zip all of the items together and then get the first item.
import itertools
items = [dog, cat, mouse, {}]
result = list(itertools.izip_longest(items, key=lambda x: x))
# [dog, cat, mouse, {}]
If you really want to use a list comprehension, you can do it like this:
result = [item for item in items if item]
If you really want to use a dictionary instead of a list, you can do this:
result = {item: item for item in items if item}
You can use a dictionary in this case because it will allow you to iterate over the keys and values of the dictionary
```