In a Training Loop 🔄

Karsten Kuhnke PRO

mindchain

https://www.linkedin.com/in/jankarstenkuhnke/

AI & ML interests

Mechanistic Interpretability, Sparse Autoencoders, JumpReLU, Reward Modeling, RLHF, AI Alignment, Function Calling, Gemma, Nemotron

Recent Activity

reacted to their post with 🧠 about 2 hours ago

The Architecture of 2026: Beyond the Token Trap 🚀

We are witnessing a tectonic shift in Transformer architecture. It’s no longer just about "predicting the next token"; it’s about executing latent plans on a high-speed data highway. What happens when we combine DeepSeek’s stability with Google’s strategic intelligence?

1️⃣ The Infrastructure: DeepSeek’s mHC
Moving from a single-lane residual stream to a multi-lane highway. Using the Birkhoff Polytope, mHC ensures mathematical stability (Identity Mapping) while routing specialized data through dedicated lanes.

2️⃣ The Intelligence: Google’s Meta-Controller
An internal AI unit that lives inside the Transformer. It escapes the "Token Trap" by extracting data to create a latent plan, steering the model via Temporal Abstraction.

The Synergy: In a Topological Transformer, the Meta-Controller finally has the "dedicated lanes" it needs to steer complex reasoning without causing gradient explosions.

We aren't just making models bigger; we are making them architecturally smarter. 🧠

#MachineLearning #DeepSeek #GoogleAI #Transformer #AIArchitecture

upvoted a paper about 4 hours ago

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

updated a collection about 11 hours ago

RLM - Neuro-Symbolic Architecture - Reasoning Traces

Organizations

mindchain's models (45)

mindchain/llama2-adapter_01

Updated Sep 11, 2023 • 1

mindchain/llama2-13

Updated Sep 11, 2023

mindchain/llama-2-7b-custom1

Updated Sep 11, 2023 • 1

mindchain/llama2-7-segeln_003_adapter

Updated Sep 11, 2023

mindchain/llama2-7-segeln_001

Updated Sep 11, 2023 • 1

mindchain/llama2-ooiiioo

Updated Sep 10, 2023

mindchain/llama-2-7b-custom004444444

Text Generation • Updated Sep 10, 2023 • 10

mindchain/llama2-qlora-finetunined-french_aktuell

Updated Sep 10, 2023

mindchain/llama-2-7b-custom

Updated Sep 10, 2023

mindchain/llama2-qlora-finetunined_01

Updated Sep 9, 2023

mindchain/llama2-qlora-finetunined-french

Updated Sep 9, 2023

mindchain/ybelkada-falcon-7b-sharded-bf16-yizhongw_self_instruct

Updated Jul 13, 2023

mindchain/ybelkada-falcon-7b-sharded-bf16-qlora-openassistant-guanaco

Updated Jul 6, 2023 • 3

mindchain/t5-small-finetuned-xsum-finetuned-xsum2

Updated May 25, 2023 • 5

mindchain/t5-small-finetuned-xsum

Updated May 25, 2023 • 10