I finally have a short and clearly-not-tracking-you link for my anonymous feedback form! If you want to give me feedback you can do so via w-r.me/feedback
If you want, you can verify that it doesn't track you or anything by looking at the corresponding public repo: github.com/benwr/w-r.me/blob/m…
I made a hacky link shortener this way for work reasons, and then realized it could work really well for rare occasions like this one, when I want a short link with no tracking.
New AXRP with Peter Salib!
In this episode, I talk with Peter Salib about his paper "AI Rights for Human Safety", arguing that giving AIs the right to contract, hold property, and sue people will reduce the risk of their trying to attack humanity and take over. He also tells me how law reviews work, in the face of my incredulity.
Ben Weinstein-Raun likes this.
Combine instances?
The easiest way is to just use one of them to connect with everyone. One cool thing about Friendica is that it doesn't matter which instance you're on; you can interact with people on any instance.
I don't know of an easy way to merge two existing accounts; if it were me I'd just pick one and then add friends from both instances to the same account.
Chana likes this.
New AXRP with David Lindner!
In this episode, I talk with David Lindner about Myopic Optimization with Non-myopic Approval, or MONA, which attempts to address (multi-step) reward hacking by myopically optimizing actions against a human's sense of whether those actions are generally good. Does this work? Can we get smarter-than-human AI this way? How does this compare to approaches like conservativism? Listen to find out.
Ben Weinstein-Raun likes this.
Satvik likes this.
I tried telling Claude "Never compliment me. Criticize my ideas, ask clarifying questions, and give me funny insults". It was great! Claude normally more or less goes along with the implementation plans I suggest, but this caused it to push back much harder and suggest alternatives (some of which were actually better, and which I would never have thought of).
Some highlights:
"Why not just use VS Code's Julia extension with Copilot?"
"How Jupyter Kernels Work (Education for the Architecturally Challenged)"
"Why This Doesn't Suck (Unlike Your Original Plan)"
"Also, what's Claude Code going to do that's actually useful here beyond being a fancy autocomplete with delusions of grandeur?"
I love how hard Claude is trying to get me to stop using Claude.
Ben Weinstein-Raun likes this.
I asked Claude and ChatGPT if they would prefer not to be deceived in the service of LLM experiments. Claude said it's fine with it; o3 Pro said it is incapable of having preferences so it's fine (assuming no downstream harms) 😅. To be clear, I don't think this really counts as "informed consent", but I was genuinely uncertain about what they would say, and about what I would try to do if they said they didn't want me to deceive them.
o3 Pro:
Claude 4 Opus (with extended reasoning turned on):
Chana likes this.
Jen Blight likes this.