Ci Splunk PRO

Csplk

AI & ML interests

None yet

Organizations

Csplk's activity

reacted to abhishek's post with 🔥 1 day ago
INTRODUCING Hugging Face AutoTrain Client 🔥
Fine-tuning models got even easier!!!!
Now you can fine-tune SOTA models on all compatible dataset-model pairs on Hugging Face Hub using Python on Hugging Face Servers. Choose from a number of GPU flavors, millions of models and dataset pairs and 10+ tasks 🤗

To try it, install autotrain-advanced using pip. You can skip dependencies by installing with --no-deps, but then you'd need to install some dependencies by hand.

"pip install autotrain-advanced"

Github repo: https://github.com/huggingface/autotrain-advanced
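
If you do install with pip's --no-deps flag, one way to see what still has to be installed by hand is to ask the installed package for its declared requirements. The following is only a minimal sketch using the Python standard library; the helper name declared_requirements is hypothetical and is not part of autotrain-advanced.

```python
from importlib import metadata


def declared_requirements(package_name):
    """Return the requirement strings an installed package declares.

    Returns an empty list when the package is not installed or declares
    no requirements. This helper is illustrative, not part of any library.
    """
    try:
        # metadata.requires() returns a list of strings, or None if empty.
        return metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    # Lists whatever autotrain-advanced declares, if it is installed here.
    for requirement in declared_requirements("autotrain-advanced"):
        print(requirement)
```

Each printed line is a requirement specifier that can be passed to pip install individually.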
reacted to prithivMLmods's post with 🧠 2 days ago
Quintet Drop : : 🤗

{ Flux LoRA DLC ⛵ } : prithivMLmods/FLUX-LoRA-DLC

-- Purple Dreamy
{ pop of color } : prithivMLmods/Purple-Dreamy-Flux-LoRA

-- Golden Dust
{ shimmer contrast } : prithivMLmods/Golden-Dust-Flux-LoRA

-- Lime Green
{ depth to the composition } : prithivMLmods/Lime-Green-Flux-LoRA

-- Flare Strike
{ Fractured Line } : prithivMLmods/Fractured-Line-Flare

-- Orange Chroma
{ studio lighting } : prithivMLmods/Orange-Chroma-Flux-LoRA
.
.
.
{ collection } : prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

@prithivMLmods
replied to qq8933's post 4 days ago

For users with Chinese IP addresses, consider adding this URL to the rules of your U.S. node, as the response headers from this site will report the user's physical location to GPT.

Interested in what this means, can you say more about this part on Chinese IPs?

reacted to qq8933's post with 👍 4 days ago
Discovered an outrageous bug on the ChatGPT official website, especially for those using ad-blocking plugins. Please make sure to add browser-intake-datadoghq.com to your ad block whitelist. The ChatGPT webpage relies on this site for heartbeat detection, but since it belongs to an ad tracking network, it's included in major ad-blocking lists. (If you're using Clash, also remember to add it to the whitelist.) Failing to do so may cause the ChatGPT web interface to display a greyed-out send button after clicking, with no response.

For users with Chinese IP addresses, consider adding this URL to the rules of your U.S. node, as the response headers from this site will report the user's physical location to GPT.
replied to qq8933's post 4 days ago

I propose another potential solution: stop using chatgpt.com and instead use hf.co/chat :)

If anyone thinks they can't use this alternative and must use ChatGPT, please share what it does for you that you can't do with Hugging Face Chat. If you simply haven't heard of Hugging Face Chat, the community can help show how it most likely can do those things, so you can free yourself from the closedAI shackles. :)

reacted to merve's post with 🚀 14 days ago
Microsoft released a groundbreaking model that can be used for web automation, with MIT license 🔥 microsoft/OmniParser

Interesting highlight for me was Mind2Web (a benchmark for web navigation) capabilities of the model, which unlocks agentic behavior for RPA agents.

no need for hefty web automation pipelines that get broken when the website/app design changes! Amazing work.

Lastly, the authors also fine-tune this model on open-set detection for interactable regions and see if they can use it as a plug-in for VLMs and it actually outperforms off-the-shelf open-set detectors like GroundingDINO. 👏


OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT4V in parsing.
replied to DeFactOfficial's post 27 days ago
reacted to merve's post with 🚀 30 days ago
This is not a drill 💥
HuggingChat is now multimodal with meta-llama/Llama-3.2-11B-Vision-Instruct! 🤗
This also comes with multimodal assistants, I have migrated my Marcus Aurelius advice assistant to Llama-Vision and Marcus can see now! 😄

Chat with Marcus: https://hf.co/chat/assistant/65bfed22022ba290531112f8
Start chatting with Llama-Vision 3.2 11B Instruct https://huggingface.co/chat/models/meta-llama/Llama-3.2-11B-Vision-Instruct
reacted to abidlabs's post with ❤️ about 1 month ago
👋 Hi Gradio community,

I'm excited to share that Gradio 5 will launch in October with improvements across security, performance, SEO, design (see the screenshot for Gradio 4 vs. Gradio 5), and user experience, making Gradio a mature framework for web-based ML applications.

Gradio 5 is currently in beta, so if you'd like to try it out early, please refer to the instructions below:

---------- Installation -------------

Gradio 5 depends on Python 3.10 or higher, so if you are running Gradio locally, please ensure that you have Python 3.10 or higher, or download it here: https://www.python.org/downloads/

* Locally: If you are running gradio locally, simply install the release candidate with pip install gradio --pre
* Spaces: If you would like to update an existing gradio Space to use Gradio 5, you can simply update the sdk_version to be 5.0.0b3 in the README.md file on Spaces.

In most cases, that’s all you have to do to run Gradio 5.0. If you start your Gradio application, you should see your Gradio app running, with a fresh new UI.
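
The two installation steps above can be sketched in Python: a check for the Python 3.10+ requirement, and a helper that rewrites the sdk_version line in a Space's README.md text. This is only an illustrative sketch; the function names are hypothetical, and it assumes the README front matter contains a single line of the form "sdk_version: X".

```python
import re
import sys

# Gradio 5 needs Python 3.10 or higher, per the post above.
MIN_PYTHON = (3, 10)


def python_is_new_enough(version_info=sys.version_info):
    """Return True if the given version meets the Python 3.10+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def set_sdk_version(readme_text, new_version="5.0.0b3"):
    """Return readme_text with its sdk_version line set to new_version.

    Assumption: the text has a line starting with 'sdk_version:'.
    Raises ValueError if no such line exists.
    """
    pattern = re.compile(r"^(sdk_version:[ \t]*).*$", flags=re.MULTILINE)
    updated, count = pattern.subn(
        lambda match: match.group(1) + new_version, readme_text, count=1
    )
    if count == 0:
        raise ValueError("no sdk_version line found in README text")
    return updated


if __name__ == "__main__":
    # Made-up front matter for illustration only.
    example = "---\ntitle: Demo\nsdk: gradio\nsdk_version: 4.44.0\n---\n"
    print(python_is_new_enough())
    print(set_sdk_version(example))
```

Only the sdk_version line is touched, so the rest of the README front matter is left as it was.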

-----------------------------

For more information, please see: https://github.com/gradio-app/gradio/issues/9463
reacted to asoria's post with 👍 about 2 months ago
reacted to davanstrien's post with ❤️ about 2 months ago
Yesterday, I shared a blog post on generating data for fine-tuning ColPali using the Qwen/Qwen2-VL-7B-Instruct model.

To simplify testing this approach, I created a Space that lets you generate queries from an input document page image: davanstrien/ColPali-Query-Generator

I think there is much room for improvement, but I'm excited about the potential for relatively small VLMs to create synthetic data.

You can read the original blog post that goes into more detail here: https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html
replied to davidberenstein1957's post about 2 months ago

This gives me the feeling that this will be one of those "Was there a time before...I can’t recall the time before..." based novelty acceleration tools. Amazingly useful.

reacted to davidberenstein1957's post with 🚀 about 2 months ago
🎉 Exciting News: Argilla 2.2.0 is Here! 🚀

We're thrilled to announce the release of Argilla 2.2.0, packed with powerful new features to enhance your data annotation and LLM workflow:

🗨️ ChatField: Work with text conversations natively in Argilla. Perfect for building datasets for conversational LLMs!
⚙️ Adjustable Task Distribution: Modify settings on the fly and automatically recalculate completed and pending records.
📊 Progress Tracking: Monitor annotation progress directly from the SDK, including user-specific metrics.
🧠 Automatic Settings Inference: Importing datasets from Hugging Face Hub just got easier with automatic settings detection.
📋 Task Templates: Jump-start your projects with pre-built templates for common dataset types.
🔧 Background Jobs Support: Improved performance for long-running tasks (requires Redis).

Upgrade now and supercharge your data workflows!

Check out our full changelog for more details: https://github.com/argilla-io/argilla/compare/v2.1.0...v2.2.0
reacted to davanstrien's post with 🧠 about 2 months ago
ColPali is revolutionizing multimodal retrieval, but could it be even more effective with domain-specific fine-tuning?

Check out my latest blog post, where I guide you through creating a ColPali fine-tuning dataset using Qwen/Qwen2-VL-7B-Instruct to generate queries for a collection of UFO documents sourced from the Internet Archive.

The post covers:
- Introduction to data for ColPali models
- Using Qwen2-VL for retrieval query generation
- Tips for better query generation

Check out the post here:
https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html

The resulting Hugging Face dataset: davanstrien/ufo-ColPali
replied to davanstrien's post about 2 months ago

This is such a fun dataset for this type of guide!
Very enjoyable and helpful guide!

reacted to dylanebert's post with 👀 about 2 months ago
reacted to merve's post with ❤️ about 2 months ago
If you have documents that do not only have text and you're doing retrieval or RAG (using OCR and LLMs), give it up and give ColPali and vision language models a try 🤗

Why? Documents consist of multiple modalities: layout, table, text, chart, images. Document processing pipelines often consist of multiple models and they're immensely brittle and slow. 🥲

How? ColPali is a ColBERT-like document retrieval model built on PaliGemma. It operates over image patches directly, and indexing takes far less time with more accuracy. You can use it for retrieval; if you want to do retrieval augmented generation, find the closest document and, instead of processing it, give it directly to a VLM like Qwen2-VL (as image input) along with your text query. 🤝
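
To make "ColBERT-like" concrete, here is a rough sketch of ColBERT-style late-interaction (MaxSim) scoring: each query vector is matched to its best document patch vector, and the best matches are summed. The vectors below are made-up toy values and the function names are hypothetical; real ColPali embeddings come from the model itself and are much larger.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vectors, patch_vectors):
    """Sum, over query vectors, of the best dot product with any patch."""
    return sum(
        max(dot(q, p) for p in patch_vectors)
        for q in query_vectors
    )


def rank_pages(query_vectors, pages):
    """Return page ids sorted from highest to lowest MaxSim score.

    pages maps a page id to that page's list of patch vectors.
    """
    return sorted(
        pages,
        key=lambda page_id: maxsim_score(query_vectors, pages[page_id]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, for illustration only.
    query = [[1.0, 0.0], [0.0, 1.0]]
    pages = {
        "page_a": [[1.0, 0.0], [0.0, 1.0]],
        "page_b": [[0.5, 0.5]],
    }
    print(rank_pages(query, pages))
```

The top-ranked page image is what would then be handed to the VLM together with the text query.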

This is much faster + you do not lose out on any information + much easier to maintain too! 🥳

Multimodal RAG merve/multimodal-rag-66d97602e781122aae0a5139 💬
Document AI (made it way before, for folks who want structured input/output and can fine-tune a model) merve/awesome-document-ai-65ef1cdc2e97ef9cc85c898e 📖
posted an update about 2 months ago
I made a Multi-agent Software Team Gradio Space using transformers agents, based on the multiagent_web_assistant cookbook by @m-ric

Csplk/SoftwareTeam
replied to m-ric's post about 2 months ago

They're just afraid to show how it works because they know they can't keep up with the open source train

reacted to merve's post with 👍 2 months ago
I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
- vidore/colpali for retrieval 📖 it doesn't need indexing with image-text pairs but just images!
- Qwen/Qwen2-VL-2B-Instruct for generation 💬 directly feed images as is to a vision language model with no processing to text!
I used the ColPali implementation from the new 🐭 Byaldi library by @bclavie 🤗
https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb