social-post-explorers (Social Post Explorers)

christopher

authored a paper 6 days ago

Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem

Paper • 2512.03073 • Published 13 days ago • 4

christopher

authored a paper about 2 months ago

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Paper • 2510.13996 • Published Oct 15 • 8

christopher

posted an update 2 months ago

Post

518

Something very cool is cooking at

Lichess

1 reply

·

freddyaboulton

posted an update 3 months ago

Post

2050

Gradio 6.0 is launching this year!

We're revamping the core to give you performance improvements and unprecedented customization. Build better, faster.

Check out the GitHub milestone to learn what's planned under the hood!

https://github.com/gradio-app/gradio/issues?q=is:issue%20state:open%20milestone:%22Gradio%206%22

Abhaykoul

posted an update 3 months ago

Post

3111

🚀 Ever dreamed of training your own Large Language Model from scratch? What if I told you it doesn't require a supercomputer or PhD in ML? 🤯

Introducing LLM Trainer - the educational framework that makes LLM training accessible to EVERYONE! Whether you're on a CPU-only laptop or scaling to distributed GPUs, we've got you covered. 💻➡️🖥️

Why LLM Trainer? Because existing tools are either too simplistic (hiding the magic) or too complex (requiring expert knowledge). We bridge the gap with:

🎓 Educational transparency - every component built from scratch with clear code
💻 CPU-first approach - start training immediately, no GPU needed
🔧 Full customization - modify anything you want
📈 Seamless scaling - from laptop to cluster without code changes
🤝 HuggingFace integration - works with existing models & tokenizers

Key highlights:
✅ Built-in tokenizers (BPE, WordPiece, HF wrappers)
✅ Complete Transformer implementation from scratch
✅ Optimized for CPU training
✅ Advanced features: mixed precision, gradient checkpointing, multiple generation strategies
✅ Comprehensive monitoring & metrics

Perfect for:
- Students learning transformers
- Researchers prototyping new ideas
- Developers building domain-specific models

Ready to train your first LLM? It's easier than you think!

🔗 Check it out: https://github.com/HelpingAI/llm-trainer
📚 Docs: Getting Started Guide
💬 Join the community: GitHub Discussions

#AI #MachineLearning #LLM #DeepLearning #OpenSource #Python #HuggingFace #NLP

Special thanks to HuggingFace and PyTorch teams for the amazing ecosystem! 🙏

1 reply

·

davanstrien

posted an update 3 months ago

Post

1376

I fine-tuned a smol VLM to generate specialized art history metadata!

https://huggingface.co/davanstrien/iconclass-vlm: Qwen2.5-VL-3B trained using SFT to generate ICONCLASS codes (think Dewey Decimal for art!)

Trained with TRL + HF Jobs - single UV script, no GPU needed!

Space to explore predictions on a test set: davanstrien/iconclass-predictions

Blog soon!

merterbak

posted an update 4 months ago

Post

3929

OpenAI is now open again! Check out OpenAI’s brand new gpt‑oss‑20b model hosted on ZeroGPU 🤗

merterbak/gpt-oss-20b-demo

Abhaykoul

posted an update 4 months ago

Post

4136

🚀 Dhanishtha-2.0-preview-0825 Is Here

The Intermediate Thinking Model just leveled up again.

With sharper reasoning, better tool use, and expanded capabilities, Dhanishtha-2.0-preview-0825 is now live and ready to impress.

🧠 What Makes Dhanishtha Special?
Unlike typical CoT models that only thinks one time, Dhanishtha thinks iteratively:

> Think → Answer → Rethink → Improve → Rethink again if needed.

🔗 Try it now: HelpingAI/Dhanishtha-2.0-preview-0825

🔞 Dhanishtha NSFW Preview

For those exploring more expressive and immersive roleplay scenarios, we’re also releasing:

HelpingAI/Dhanishtha-nsfw
A specialized version tuned for adult-themed interactions and character-driven roleplay.

🔗 Explore it here: HelpingAI/Dhanishtha-nsfw

💬 You can also try all of these live at chat.helpingai.co

4 replies

·

IlyasMoutawwakil

posted an update 4 months ago

Post

3480

🚀 Optimum: The Last v1 Release 🚀
Optimum v1.27 marks the final major release in the v1 series. As we close this chapter, we're laying the groundwork for a more modular and community-driven future:
- Optimum v2: A lightweight core package for porting Transformers, Diffusers, or Sentence-Transformers to specialized AI hardware/software/accelerators..
- Optimum‑ONNX: A dedicated package where the ONNX/ONNX Runtime ecosystem lives and evolves, faster-moving and decoupled from the Optimum core.

🎯 Why this matters:
- A clearer governance path for ONNX, fostering stronger community collaboration and improved developer experience..
- Enable innovation at a faster pace in a more modular, open-source environment.

💡 What this means:
- More transparency, broader participation, and faster development driven by the community and key actors in the ONNX ecosystem (PyTorch, Microsoft, Joshua Lochner 👀, ...)
- A cleaner, more maintainable core Optimum, focused on extending HF libraries to special AI hardware/software/accelerators tooling and used by our partners (Intel Corporation, Amazon Web Services (AWS), AMD, NVIDIA, FuriosaAI, ...)

🛠️ Major updates I worked on in this release:
✅ Added support for Transformers v4.53 and SmolLM3 in ONNX/ONNXRuntime.
✅ Solved batched inference/generation for all supported decoder model architectures (LLMs).

✨ Big shoutout to @echarlaix for leading the refactoring work that cleanly separated ONNX exporter logic and enabled the creation of Optimum‑ONNX.

📝 Release Notes: https://lnkd.in/gXtE_qji
📦 Optimum : https://lnkd.in/ecAezNT6
🎁 Optimum-ONNX: https://lnkd.in/gzjyAjSi
#Optimum #ONNX #OpenSource #HuggingFace #Transformers #Diffusers

severo

posted an update 5 months ago

Post

2339

Today, three unrelated Slacks popped up at the same time with enthusiastic messages about the new Qwen model.

And all of them mentioned @simonw 's post!

#TopInfluencer

AtAndDev

posted an update 5 months ago

Post

565

Qwen 3 Coder is a personal attack to k2, and I love it.
It achieves near SOTA on LCB while not having reasoning.
Finally people are understanding that reasoning isnt necessary for high benches...

Qwen ftw!

DECENTRALIZE DECENTRALIZE DECENTRALIZE

Abhaykoul

posted an update 5 months ago

Post

3159

🎉 Dhanishtha-2.0-preview-0725 is Now Live

The Intermediate Thinking Model just got even better.
With the new update, Dhanishtha is now sharper, smarter, and trained further on tool use

🧠 What Makes Dhanishtha Different?
Unlike standard COT models that give one-shot responses, Dhanishtha thinks in layers:

> Think → Answer → Rethink → Improve → Rethink again if needed.

HelpingAI/Dhanishtha-2.0-preview-0725

Smooke

in social-post-explorers/README 5 months ago

Request to post

➕ 2

17

#36 opened over 1 year ago by

Walmart-the-bag

chansung

posted an update 5 months ago

Post

4378

YAML engineering becomes more and more important than ever from infra provisioning to model training (recipes).

Here, I built a simple editor first for @dstackai , and I will share the live endpoint this week. Let me know what you think about this approach.

Based on this approach, if people think this is useful, I am going to do the same thing for the LLM training recipes for popular frameworks such as Hugging Face open-r1, Axolotl, and so on. Let me hear.

Abhaykoul

posted an update 5 months ago

Post

3114

🎉 Dhanishtha 2.0 Preview is Now Open Source!

The world's first Intermediate Thinking Model is now available to everyone!

Dhanishtha 2.0 Preview brings revolutionary intermediate thinking capabilities to the open-source community. Unlike traditional reasoning models that think once, Dhanishtha can think, answer, rethink, answer again, and continue rethinking as needed using multiple blocks between responses.

🚀 Key Features
- Intermediate thinking: Think → Answer → Rethink → Answer → Rethink if needed...
- Token efficient: Uses up to 79% fewer tokens than DeepSeek R1 on similar queries
- Transparent thinking: See the model's reasoning process in real-time
- Open source: Freely available for research and development

HelpingAI/Dhanishtha-2.0-preview
https://helpingai.co/chat

1 reply

·

freddyaboulton

posted an update 6 months ago

Post

4082

The new multimodalart/self-forcing model and demo are truly impressive!

Abhaykoul

posted an update 6 months ago

Post

4640

Introducing Dhanishtha 2.0: World's first Intermediate Thinking Model

Dhanishtha 2.0 is the world's first LLM designed to think between the responses. Unlike other Reasoning LLMs, which think just once.

Dhanishtha can think, rethink, self-evaluate, and refine in between responses using multiple <think> blocks.
This technique makes it Hinghlt Token efficient it Uses up to 79% fewer tokens than DeepSeek R1
---

You can try our model from: https://helpingai.co/chat
Also, we're gonna Open-Source Dhanistha on July 1st.

---
For Devs:
🔑 Get your API key at https://helpingai.co/dashboard

from HelpingAI import HAI  # pip install HelpingAI==1.1.1
from rich import print

hai = HAI(api_key="hl-***********************")

response = hai.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What is the value of ∫0∞𝑥3/𝑥−1𝑑𝑥 ?"}],
    stream=True,
    hide_think=False # Hide or show models thinking
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="", flush=True)

2 replies

·

freddyaboulton

posted an update 6 months ago

Post

757

Time is running out! ⏰

Less than 24 hours to participate in the MCP Hackathon and win thousands of dollars in prizes! Don't miss this opportunity to showcase your skills.

Visit Agents-MCP-Hackathon/AI-Marketing-Content-Creator to register!

freddyaboulton

posted an update 6 months ago

Post

552

🚨 NotebookLM Dethroned?! 🚨

Meet Fluxions vui: The new open-source dialogue generation model.
🤯 100M Params, 40k hours audio!
🎙️ Multi-speaker audio
😂 Non-speech sounds (like [laughs]!)
📜 MIT License

Is this the future of content creation? Watch the video and decide for yourself!

https://huggingface.co/spaces/fluxions/vui-spacehttps://huggingface.co/fluxions/vui

1 reply

·

davanstrien

posted an update 6 months ago

Post

3653

Inspired by Hugging Face's official MCP server, I've developed a complementary tool that exposes my semantic search API to enhance discovery across the HF platform.

Key capabilities:

- AI-powered semantic search for models and datasets
- Parameter count analysis via safetensors metadata
- Trending content discovery
- Find similar models/datasets functionality
- 11 tools total for enhanced ecosystem navigation

The semantic search goes beyond simple keyword matching, understanding context and relationships between different models and datasets.

Example query: "Find around 10 reasoning Hugging Face datasets published in 2025 focusing on topics other than maths and science. Show a link and a short summary for each dataset." (results in video!)

https://github.com/davanstrien/hub-semantic-search-mcp

1 reply

·

Social Post Explorers

AI & ML interests

Recent Activity

Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem

The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models

Request to post

AI & ML interests

Recent Activity

Team members 855

social-post-explorers's activity

Request to post