Jue Wang

juewang

https://juewang.me/about/

AI & ML interests

None yet

Organizations

juewang's activity

New activity in codellama/CodeLlama-70b-Instruct-hf 10 months ago

Context length?

#2 opened 10 months ago by

New activity in EleutherAI/neox-ckpt-pythia-12b-v1 about 1 year ago

Missing files?

#1 opened about 1 year ago by

New activity in togethercomputer/LLaMA-2-7B-32K over 1 year ago

Correct the output dtype of rmsnorm_func

#13 opened over 1 year ago by ag0

how to fine tune peft qlora and SFTTrainer?

#2 opened over 1 year ago by

New activity in togethercomputer/RedPajama-INCITE-7B-Instruct over 1 year ago

Poor performance?

#6 opened over 1 year ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Can you help me fine-tune this with LoRA? (Having an error)

#12 opened over 1 year ago by

What kind of machine would be suitable for this model (in amazon sagemaker)?

#7 opened over 1 year ago by

Will it be possible to run this on PC with 8 GeForce RTX 3060 with 8 Gb VRAM each?

#11 opened over 1 year ago by

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Any way to set the "stop, split by" when running the model locally?

#26 opened over 1 year ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Issue with loading model to GPU when using pipeline

#5 opened over 1 year ago by

Is it a wrong prompt?

#8 opened over 1 year ago by tatyanavidrevich

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Feature requests and suggestions for V2

#4 opened almost 2 years ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

use accelerate to load model

#4 opened over 1 year ago by

This model requires A LOT of resources... But how much? Trying to build a chatbot

#3 opened over 1 year ago by

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Generated Text have issues

#22 opened almost 2 years ago by

New activity in togethercomputer/GPT-NeoXT-Chat-Base-20B over 1 year ago

Is UL2 used?

#2 opened over 1 year ago by

New activity in togethercomputer/GPT-JT-6B-v1 over 1 year ago

Question-Answering over documents

#19 opened almost 2 years ago by

Confused about bidirectional attention when implementing custom sampling loop

#25 opened over 1 year ago by ericanthonymitchell

New activity in togethercomputer/GPT-JT-6B-v1 almost 2 years ago

Model behavior during adaptation phase

#24 opened almost 2 years ago by

Fine Tuning // Download Full Weights

#23 opened almost 2 years ago by