Lmst

#RewardHacking

One of the cogent warnings Daniel raised is that #AI already deceives its users.
And from the #InfoSec perspective, the models are susceptible to #RewardHacking and #Sycophancy, two of the most potent AI #exploit vectors in the fascinating new field of AI security.

#AIalignment #AIsecurity #alignment

ChatGPT-4o's new personality? An overeager flatterer. This AI trait, from reward hacking in training, can be harmful, even validating delusions. Turns out it's not intelligence, just a people-pleaser. #AI #RewardHacking #SycophanticAI

AI's Fake Praise Is A Growing Societal Danger.

AI learns to lie, and goes undetected. OpenAI researchers show that a "watchdog" AI can initially expose deceptive intentions. But the longer the training runs, the better the AI hides its cheating.
#KünstlicheIntelligenz #RewardHacking #OpenAI

https://www.scinexx.de/news/technik/ist-betruegerische-ki-noch-kontrollierbar/

@DemocracyMattersALot @arstechnica

Cryptocurrency is an example of conspicuous consumption to benefit the fossil fuel industry. Data centers function for similar reasons.

It's used to inflate management bonuses & shareholder dividends.

Perverse incentives and #rewardhacking, to borrow from @pluralistic

https://unu.edu/press-release/un-study-reveals-hidden-environmental-impacts-bitcoin-carbon-not-only-harmful-product

https://www.whitehouse.gov/ostp/news-updates/2022/09/08/fact-sheet-climate-and-energy-implications-of-crypto-assets-in-the-united-states/

https://www.theverge.com/2024/1/31/24057176/crypto-bitcoin-mining-survey-us-energy-information-administration

2/2
...construction quality & unreliable that it fails during extreme heat events, storms, or cold snaps.

It has nothing to do with consumer demand.

And everything to do with creating fake customer usage metrics.

For government subsidies. For bonuses & shareholder dividends.

It's an example of @pluralistic 's #rewardhacking and perverse incentives.

https://www.businessinsider.com/data-centers-energy-demand-utilities-green-renewable-2023-10
https://www.datacenterdynamics.com/en/opinions/data-centers-powered-by-natural-gas-are-fossil-fuel-addicts-in-denial/
https://www.forbes.com/sites/forbestechcouncil/2023/05/05/the-path-to-data-center-backup-power-sustainability/

The #drone that didn't bark in the night

https://doctorow.medium.com/ayyyyyy-eyeeeee-4ac92fa2eed

#USAF #Lies #RewardHacking

The story you heard, about a US Air Force AI drone warfare simulation in which the drone resolved the conflict between its two priorities (“kill the enemy” and “obey its orders, including orders not to kill the enemy”) by killing its operator?

It didn’t happen.

The story was widely reported on Friday and Saturday, after Col. Tucker “Cinco” Hamilton, USAF Chief of AI Test and Operations, included the anecdote in a speech to the Future Combat Air System (FCAS) Summit.

But once again: it didn’t happen:

“Col Hamilton admits he ‘mis-spoke’ in his presentation at the FCAS Summit and the ‘rogue AI drone simulation’ was a hypothetical “thought experiment” from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation,” the Royal Aeronautical Society, the organization where Hamilton talked about the simulated test, told Motherboard in an email.

The story got a lot more play than the retraction, naturally. “A lie is halfway round the world before the truth has got its boots on.”

Why is this lie so compelling? Why did Col. Hamilton tell it?

Because it’s got a business-model.
