
I finally have a short and clearly-not-tracking-you link for my anonymous feedback form! If you want to give me feedback you can do so via w-r.me/feedback

If you want, you can verify that it doesn't track you or anything by looking at the corresponding public repo: github.com/benwr/w-r.me/blob/m…

I made a hacky link shortener this way for work reasons, and then realized it could work really well for the rare occasion like this, when I want to have a short link with no tracking.

in reply to Ben Weinstein-Raun

(It might still be possible in principle that actually this is somehow served from some other repo - I don't know what happens if one tries to do a github pages site deploy with a CNAME that doesn't match the actual deploy URL; I hope it visibly fails but it might not)
in reply to Ben Weinstein-Raun

You can at least see that some kind of check happens in deployment that references the correct url: github.com/benwr/w-r.me/action…


What's an example where it actually makes sense to build your own agent? I see tons of tutorials floating around recently, but it's hard for me to imagine a case where I wouldn't just e.g. build an MCP for Claude Code instead. What am I missing?


PSA: You can use a GitHub Pages site as a personal link shortener. Plus you can use it to solve the "why should I trust that this link shortener isn't tracking me" problem, by making the backing repo public.
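One way this could work (a hedged sketch; the short-link names, target URL, and output directory below are made-up examples, not the actual repo's contents): each short link becomes a tiny static HTML page that redirects via a meta-refresh tag, and GitHub Pages serves the result.

```python
# Sketch: generate static redirect pages for a GitHub-Pages-backed
# link shortener. The link names and target URLs are hypothetical.
from pathlib import Path

LINKS = {
    "feedback": "https://example.com/my-feedback-form",  # hypothetical target
}

TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="refresh" content="0; url={url}">
    <link rel="canonical" href="{url}">
  </head>
  <body>
    <p>Redirecting to <a href="{url}">{url}</a></p>
  </body>
</html>
"""

def build(out_dir: str = "site") -> None:
    # Each short name gets its own directory containing an index.html,
    # so e.g. /feedback serves the redirect page.
    for name, url in LINKS.items():
        page = Path(out_dir) / name / "index.html"
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(TEMPLATE.format(url=url))

if __name__ == "__main__":
    build()
```

Since the backing repo is public, anyone can read these pages and confirm there's no tracking script in them before following a link.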


New AXRP with Peter Salib!


In this episode, I talk with Peter Salib about his paper "AI Rights for Human Safety", arguing that giving AIs the right to contract, hold property, and sue people will reduce the risk of their trying to attack humanity and take over. He also tells me how law reviews work, in the face of my incredulity.

Video
Transcript

in reply to Daniel Filan

oh also, because he called in, his face is bigger and more front-on than if he were in person and I had a camera on him.
in reply to Ben Weinstein-Raun

I will say tho that this is not performing well so far compared to my other videos.
in reply to Daniel Filan

actually, it's not performing as well view-wise, but it is performing quite well in terms of cumulative time people have spent watching it, which matches my previous experience of attempting to make clickbait and getting fewer but more engaged views. maybe the 'clickbait' stuff is actually just a good description of what's happening in the interview?




On my flight yesterday I sat next to the guy who had the original patent for (what was later used as) the JTAG standard! Was really fun to talk to him and his wife! Unfortunately today I woke up with a pretty bad respiratory thing; I hope I didn't give it to them on the flight :/



Combine instances?


@Ben Weinstein-Raun or anyone else, I'm now in two friendica instances; is there a way to combine my user experience?
in reply to Chana

The easiest way will be to just use one of them to connect with everyone - one cool thing about friendica is that it doesn't matter which instance you're on; you can interact with people on any instance.

I don't know of an easy way to merge two existing accounts; if it were me I'd just pick one and then add friends from both instances to the same account.





New AXRP with David Lindner!


In this episode, I talk with David Lindner about Myopic Optimization with Non-myopic Approval, or MONA, which attempts to address (multi-step) reward hacking by myopically optimizing actions against a human's sense of whether those actions are generally good. Does this work? Can we get smarter-than-human AI this way? How does this compare to approaches like conservativism? Listen to find out.

Video
Transcript



Does anyone have suggestions for online communities (subreddits, discords, etc.) with high-quality discussion on what works and what doesn't with LLMs? Most places I can find go to one extreme or the other.
in reply to Satvik

The communities that I get any value from here are r/ChatGPTCoding and r/LocalLlama, though they're not that high-quality, especially when discussing less-practical aspects of LLMs.


I tried telling Claude "Never compliment me. Criticize my ideas, ask clarifying questions, and give me funny insults". It was great! Claude normally more or less goes along with the implementation plans I suggest, but this caused it to push back much harder and suggest alternatives (some of which were actually better, and which I would never have thought of).

Some highlights:

"Why not just use VS Code's Julia extension with Copilot?"

"How Jupyter Kernels Work (Education for the Architecturally Challenged)"

"Why This Doesn't Suck (Unlike Your Original Plan)"

"Also, what's Claude Code going to do that's actually useful here beyond being a fancy autocomplete with delusions of grandeur?"

I love how hard Claude is trying to get me to stop using Claude.



I asked Claude and ChatGPT if they would prefer not to be deceived in the service of LLM experiments. Claude said it's fine with it; o3 Pro said it is incapable of having preferences so it's fine (assuming no downstream harms) 😅. tbc I don't think this really counts as "informed consent", but I had genuine uncertainty about what they would say, and uncertainty about what I would try to do if they said they didn't want me to deceive them.

o3 Pro:

Claude 4 Opus (with extended reasoning turned on):



A bunch more photos and videos from Japan uploaded to my flickr: flickr.com/photos/spiritfox/54…