Hugging Face
In a Training Loop 🔄
126.8
TFLOPS
4
47
78
Karsten Kuhnke
PRO
mindchain
11 followers · 75 following
https://www.linkedin.com/in/jankarstenkuhnke/
KarstenKuh16443
haddock-development
karsten-kuhnke-0b23023a3
AI & ML interests
Mechanistic Interpretability, Sparse Autoencoders, JumpReLU, Reward Modeling, RLHF, AI Alignment, Function Calling, Gemma, Nemotron
Recent Activity
reacted to their post with 🧠 about 7 hours ago
The Architecture of 2026: Beyond the Token Trap 🚀

We are witnessing a tectonic shift in Transformer architecture. It’s no longer just about "predicting the next token"; it’s about executing latent plans on a high-speed data highway. What happens when we combine DeepSeek’s stability with Google’s strategic intelligence?

1️⃣ The Infrastructure: DeepSeek’s mHC
Moving from a single-lane residual stream to a multi-lane highway. Using the Birkhoff Polytope, mHC ensures mathematical stability (Identity Mapping) while routing specialized data through dedicated lanes.

2️⃣ The Intelligence: Google’s Meta-Controller
An internal AI unit that lives inside the Transformer. It escapes the "Token Trap" by extracting data to create a latent plan, steering the model via Temporal Abstraction.

The Synergy: In a Topological Transformer, the Meta-Controller finally has the "dedicated lanes" it needs to steer complex reasoning without causing gradient explosions. We aren't just making models bigger; we are making them architecturally smarter. 🧠

#MachineLearning #DeepSeek #GoogleAI #Transformer #AIArchitecture
upvoted a paper about 10 hours ago
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
updated a collection about 17 hours ago
RLM - Neuro-Symbolic Architecture - Reasonig Traces
Organizations
mindchain's activity
liked a dataset about 17 hours ago
PrimeIntellect/INTELLECT-3-RL • Viewer • Updated Nov 8, 2025 • 70.7k • 22.9k • 5
liked 2 models about 17 hours ago
PrimeIntellect/INTELLECT-3-FP8 • Text Generation • 107B • Updated Nov 27, 2025 • 406 • 19
PrimeIntellect/INTELLECT-3 • Text Generation • 107B • Updated Nov 27, 2025 • 11.9k • 199
liked 2 models about 19 hours ago
deepseek-ai/DeepSeek-R1 • Text Generation • 685B • Updated Mar 27, 2025 • 443k • 12.9k
Goodfire/DeepSeek-R1-SAE-l37 • Updated Apr 21, 2025 • 17
liked a dataset about 21 hours ago
bigai/TongSIM-Asset • Updated 7 days ago • 17k • 269
liked 7 models 1 day ago
upstage/Solar-Open-100B • 103B • Updated 3 days ago • 1.21k • 312
DatologyAI/luxical-one • Feature Extraction • Updated 23 days ago • 10.6k • 56
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B • Text Generation • 31B • Updated Oct 10, 2025 • 9.5k • 787
Tongyi-Zhiwen/QwenLong-L1.5-30B-A3B • 31B • Updated 20 days ago • 14.4k • 154
baidu/ERNIE-4.5-21B-A3B-Paddle • Text Generation • 22B • Updated Sep 9, 2025 • 72 • 13
baidu/ERNIE-4.5-0.3B-Paddle • Text Generation • 0.4B • Updated Aug 20, 2025 • 98 • 19
PaddlePaddle/PP-DocLayout_plus-L • Image-to-Text • Updated Jul 22, 2025 • 9.05k • 13
liked a Space 1 day ago
PP-StructureV3 Online Demo 📊 • Running • 31
Next-Gen High-Precision Doc Parsing Solution
liked a model 1 day ago
PaddlePaddle/PaddleOCR-VL • Image-Text-to-Text • 1.0B • Updated 25 days ago • 14.8k • 1.45k
liked 3 models 2 days ago
openai/circuit-sparsity • Text Generation • 0.4B • Updated 24 days ago • 2.14k • 195
mindchain/t5gemma-270m-container-status • 0.8B • Updated 7 days ago • 21 • 1
tencent/Youtu-LLM-2B • Text Generation • 2B • Updated about 22 hours ago • 843 • 105
liked a model 3 days ago
IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct • Text Generation • 40B • Updated 2 days ago • 3.92k • 215
liked a model 5 days ago
tencent/HY-Motion-1.0 • Text-to-3D • Updated 5 days ago • 330 • 241