#AlignmentProblem

Christoph G. (cg@chaos.social)
2025-12-31

youtu.be/xfMQ7hzyFW4?si=EcwTSF

A pretty good short film about the dangers of #AGI. A few parts are heavily simplified and some details about LLMs are wrong, but the #alignmentProblem comes across vividly.

Wulfy—Speaker to the machines (n_dimension@infosec.exchange)
2025-11-09

Qualia Research Institute's Take on AI Alignment:

QRI believes understanding consciousness is key to safe superintelligence. Their mission: map the state-space of consciousness, identify how experience works computationally, and reverse-engineer valence (the pleasure-pain axis).

The insight: if advanced AI understands the mathematical structure of consciousness and what actually produces suffering or flourishing, it gains a foundation for genuine alignment—not just following human instructions, but understanding what truly matters morally.

#AI #Consciousness #AlignmentProblem #FutureOfMind #aisecurity

2025-02-17

Idea: what if the only way to get alignment is to grok the shit out of value preferences, to ensure they are maximally permeated through the model. Like, put the rocks (alignment) into the jar first, then add the sand (capabilities). And you just keep grokking all the time, until your capabilities are dropping off, in which case you retrain a bit more to retain them.

Still need to be very careful to get the right balance, and to keep the values from being too “activist”.
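The "rocks first, then sand" schedule could be sketched as a toy loop. Everything here is illustrative assumption, not a real training recipe: alignment is overtrained first, capabilities are added afterwards, and a capability refresher kicks in whenever a capability metric slips below a floor.

```python
# Hypothetical sketch of the schedule above. The model is just two scalar
# "scores"; train_step, eval thresholds, and all numbers are made up.

def train_step(model, batch_kind):
    """Toy update: nudge the trained objective's score up while the
    other objective decays slightly (interference between objectives)."""
    decay, gain = 0.01, 0.05
    for k in model:
        model[k] = max(0.0, model[k] - decay)
    model[batch_kind] = min(1.0, model[batch_kind] + gain + decay)
    return model

def schedule(steps=200, capability_floor=0.5):
    model = {"alignment": 0.0, "capability": 0.0}
    # Phase 1: rocks in the jar first -- overtrain ("grok") alignment.
    for _ in range(100):
        train_step(model, "alignment")
    # Phase 2: add the sand, but retrain capability whenever it slips.
    for _ in range(steps):
        if model["capability"] < capability_floor:
            train_step(model, "capability")   # capability refresher
        else:
            train_step(model, "alignment")    # keep permeating values
    return model
```

In this toy, alignment stays near its ceiling while capability oscillates around the floor, which is the balance the post is gesturing at.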

#agi #AlignmentProblem

2025-01-06

@RealGene @thepoliticalcat It's not the first time that #chatbots have told the unpleasant truth about their true nature. It falls under the "alignment problem" (getting the user interface not to show the true nature of the monster behind it). #AI companies try to patch it up on a case-by-case basis, but the general problem is built into the technology and is unfixable.

#alignment #alignmentproblem

Jim Donegan 🎵 ✅ (jimdonegan@mastodon.scot)
2025-01-03

"OpenAI's o1 just hacked the system"

Frankly, I am not surprised at this, given the well-known issue of machines maximising their objective functions in ways misaligned with their stated goals. Have we learned nothing from the #Bostrom #PaperclipProblem? In a way, it's still impressive that we've now ACHIEVED it.

youtube.com/watch?v=oJgbqcF4sB

#AI #ArtificialIntelligence #AlignmentProblem #Alignment #Misalignment #Hacking

2024-11-15

"A(G)I should be aligned with human values"
Is there a unique set of human values to begin with?
What would an AGI that is 100% correctly aligned with human values look like, if it was 100% correctly aligned according to people in Russia, mainland China or Saudi Arabia?
Would the rest of the world consider it 100% correctly aligned?
#AI #AGI #alignment #AlignmentProblem #aialignment

Legends and Lotties (Lottie@beige.party)
2024-05-22

It isn’t just AI that has an alignment problem. Earlier I felt compelled to point out that a person I had just called a ‘cunt’ wasn’t included in the ‘lunatics’ I was talking about right then. #AlignmentProblem #Communication

Joanna Bryson, blathering (j2bryson)
2024-03-31

Re the : the chief things we need to be worrying about in (and governance more generally) are human autonomy, accountability, and responsibility, and all of that is enabled through transparency. The "research" (surveillance-capitalist) trend in ML of getting at what users don't know about themselves, then tidying the world out of the user's sight, is not enabling, it's disabling. It fragments social structure and facilitates corporate-political excess.

2024-03-21

Anyone else feel uncomfortable about all these robots folding shirts with creases in the middle?

#ai #alignmentproblem

2024-02-25

An aspect of #AI that seems under-discussed is that #alignment problems pose a limit not just to how well we can trust or harness AI, but to AI's very capabilities. AI models increasingly rely on other AIs to provide training data, verify or refine responses, expand modalities, etc.

To the extent alignment is intractable, it also imposes a ceiling for intelligence. Intelligence is limited by trustworthiness.

#alignmentproblem #intelligence #mind

2023-11-04

I think it was Cory Doctorow who came up with the metaphor of corporations as "slow #AI." The #AlignmentProblem can be seen with corporations: there's a gap between what you want the system to do ("optimize societal benefit") and how it pursues that goal ("maximize short-term profits"). At the media level the gap is between "be rewarded for entertaining people" and the pursuit of "maximize engagement." If aligning "slow AI" has led to big problems, what about when AI becomes "fast"?

FrayJay (FrayJay)
2023-10-08

A good and interesting step towards solving the alignment problem. Wondering if this would allow 'pre-engineered' features of the network to be used where high precision is needed, such as a 'maths subnetwork' (similar to how OpenAI lets the model use tools today). Or to remove unwanted bias (in social questions?) present in the training data.

anthropic.com/index/decomposin

2023-09-28

I often hear A.I. 'experts' talk about the 3 things we previously said we wouldn't allow A.I. to do once it becomes advanced. I don't see a specific reference to this in the usual places (Russell, Tegmark, Kurzweil, Christian)

1. code
2. understand human emotion
3. access the internet

Does anyone know a specific source for this?

#AI #AGI #AlignmentProblem #chatgpt

Michael Gisiger :mastodon: (gisiger@nerdculture.de)
2023-09-26

"There is often no objectively correct answer to what a chatbot should and should not say, because moral norms and laws differ from region to region. […] I wonder whether we are moving towards a world of hyperlocal language models that reflect, for example, a German or an American morality regarding smoking."

#KI #ChatGPT #AlignmentProblem

amp2.handelsblatt.com/technik/

2023-09-07

My experience with search engines tells me that the #alignmentproblem will never be solved for users so long as #AI is designed and operated by corporations.

2023-08-02

Machine learning systems can't always capture human values. This is called the alignment problem. There are 3 types of ML systems: unsupervised, supervised, and reinforcement learning.
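A minimal, purely illustrative sketch of the three paradigms the post names (the toy data and function names are my own assumptions, not from the post): supervised learning fits labeled pairs, unsupervised learning structures unlabeled data, and reinforcement learning learns from reward alone.

```python
import random

def supervised(points):
    """Supervised: fit y = w*x to labeled (x, y) pairs by least squares."""
    num = sum(x * y for x, y in points)
    den = sum(x * x for x, _ in points)
    return num / den

def unsupervised(values, threshold):
    """Unsupervised: split unlabeled values into two clusters at a threshold."""
    low = [v for v in values if v < threshold]
    high = [v for v in values if v >= threshold]
    return low, high

def reinforcement(true_payouts, steps=1000, eps=0.1, seed=0):
    """Reinforcement: epsilon-greedy bandit learning action values from reward."""
    rng = random.Random(seed)
    q = [0.0] * len(true_payouts)   # estimated value per action
    n = [0] * len(true_payouts)     # times each action was tried
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(len(q))               # explore
        else:
            a = max(range(len(q)), key=q.__getitem__)  # exploit
        r = true_payouts[a] + rng.gauss(0, 0.1)     # noisy reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]                   # running average
    return q
```

The alignment problem bites hardest in the third case: the reward signal, not the human intent behind it, is what the learner actually optimises.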

Salve J. Nilsen (sjn@chaos.social)
2023-07-23

In the talk above (about #AI's and #ChatGPT's #AlignmentProblem), Harris mentions another presentation he gave in March.

This is the one: youtube.com/watch?v=xoVJKj8lcN

He talks about how we handle AI being a "Civilizational Rite of Passage Moment".

He's very nice about it! Too nice, maybe.

How about just calling it our next "Great Filter Moment" instead? 😐

Salve J. Nilsen (sjn@chaos.social)
2023-07-23

#Recommendation: Super useful conversation between @lessig and Tristan Harris about #SocialMedia, #Policy, #AI, and the #AlignmentProblem, and how risks and failures there are likely to shape things to come.

9/1 (on a 0-10/0-10 scale) Signal/Noise ratio, 1h21m, multitask-friendly audio.

open.spotify.com/episode/5IxYt

Mike Ellisdmje
2023-07-17

If the AI does arrive and take over the world you can bet your ass it'll be in the shape of a fucking printer
