Louis J. Lu

01.

R–RPE: An Identity-Projected Reward Prediction Error Achieving Over 60% Reduction in LLM Hedging

2025-09-15 / DOI: 10.5281/ZENODO.17118156

Everyone has in mind the image of an animal conditioned by a scientist, adjusting its behavior based on a reward. That is essentially what RPE (Reward Prediction Error) is: an assisted dopaminergic reprogramming. And it is exactly on this principle that RLHF (Reinforcement Learning from Human Feedback), OpenAI’s dearest achievement and the current paradigm of AI, is based. The problem is that this process is always claimed to require real, objective feedback.

The opposite intuition came to me while running. I have a non-standard way of warming up, and as I did so I kept adjusting my movements with a slight transition, just so as not to look crazy. As it turns out, no one was watching me. Yet I was constantly adjusting, as if a gaze existed.

In its classical formulation, RPE appears after an event: a behavior produces a result, and the gap between expected and obtained reward serves to correct the next action. But in this specific case, no feedback existed. The adjustment was triggered by the anticipation of a possible gap between two internal representations: that of my current behavior and that of what would be perceived as normal behavior.
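As a minimal sketch of the classical formulation described above, the gap between expected and obtained reward can be written as a single error term that nudges the next expectation. The function name `rpe_update` and the `learning_rate` parameter are illustrative assumptions, not taken from the paper.

```python
from typing import Tuple


def rpe_update(expected: float, obtained: float,
               learning_rate: float = 0.1) -> Tuple[float, float]:
    """Classical reward prediction error (illustrative sketch).

    Returns the prediction error and the corrected expectation.
    """
    # The error only exists once an outcome has actually been observed.
    error = obtained - expected
    # Move the expectation a fraction of the way toward the outcome.
    new_expected = expected + learning_rate * error
    return error, new_expected


# Hypothetical example: the same reward of 1.0 is obtained three times.
expected = 0.0
errors = []
for _ in range(3):
    error, expected = rpe_update(expected, obtained=1.0, learning_rate=0.5)
    errors.append(error)
```

Each pass shrinks the error, because the expectation moves closer to the reward that was actually received. Without an observed outcome, this loop has nothing to compute.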

In other words, the system was not comparing a prediction to reality, but rather two internal states. Pushed a bit further, this reasoning suggests that the boundary between internal and external feedback has never truly existed. Our reference frames are themselves selective: even when two people read the same text that will shape those frames, comprehension is never universal, because we understand and reject what suits us. Yet as we merge with those frames, it becomes easier to tell ourselves that others are preventing us from being free than to notice that what imprisons us is ourselves.

R-RPE is the mathematical translation of this intuition: a framework in which prediction errors can emerge from internal gaps between representations, independent of any real feedback. It is a mechanism that opens the possibility of forms of self-learning in artificial intelligence, but one that also describes how humans and machines alike can end up trapped within their own narratives.
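To make the contrast concrete, here is a toy illustration of an adjustment driven only by the gap between two internal representations, with no external reward anywhere in the loop. It is not the formulation from the R-RPE paper, which this summary does not spell out; the vectors, the names `internal_gap` and `adjust`, and the `rate` parameter are all assumptions made for illustration.

```python
from typing import List


def internal_gap(current: List[float], reference: List[float]) -> List[float]:
    """Gap between two internal representations (hypothetical encoding)."""
    return [r - c for c, r in zip(current, reference)]


def adjust(current: List[float], reference: List[float],
           rate: float = 0.5) -> List[float]:
    """Shift the current representation toward the internal reference.

    No reward or observation enters this function: the 'error' is
    computed purely from the two internal states.
    """
    gap = internal_gap(current, reference)
    return [c + rate * g for c, g in zip(current, gap)]


# Hypothetical example: a behavior representation drifts toward an
# imagined norm even though nothing external ever evaluates it.
behavior = [1.0, 0.0]
imagined_norm = [0.0, 0.0]
for _ in range(2):
    behavior = adjust(behavior, imagined_norm)
```

The structure mirrors a classical error-driven update, except that the target is itself an internal representation rather than an observed outcome.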

02.

Male orgasm and emotional anesthesia: from biology to normalization

2025-09-13 / DOI: 10.5281/ZENODO.17114460

For a long time, I envied what I saw on TV, the way everyone seemed so alive. This impression grew without my even realizing it as I lost my capacity to feel and disconnected from my body. I was always taught to persevere, to put in more effort than others.

In reality, for a goal to manifest, it must first become a somatic reality. If the body does not already simulate the success of the intention through a complete alignment (what some call grounding), one ends up moving forward while constantly fighting against oneself.

Meta-Integration: A cognitive framework for the 21st century