TASM Notes 010

Mon Mar 4, 2024

No objections last time, so I'm going to proceed with the trend of posting notes for the Toronto AI Safety Meetup here (rather than working them out into full prose pieces).


Pre Meeting Chatting

Inspired by this:

The Zvi Update

The Talk - Detecting AI Generated content

What we'll be talking about

  1. Proving something is not AI generated (signatures)
  2. Indicating something is AI generated (watermarking)
  3. Detecting that something is AI generated (in the absence of watermarks)

Why We Care

Things it Doesn't Help With

Basically, any time the consumer doesn't really care if it's real or not, these techniques are not going to help.

Public Key Crypto Primer

Basically, read an RSA primer here. The important concepts are:

  1. You've got a private key and a public key
  2. With the public key, you can encrypt a message such that someone who has the private key can decrypt it
  3. With the public key, you cannot reproduce the private key (doing so would take such an enormous pile of compute that it's unworkable)
  4. With the private key, you can regenerate the public key, and you can decrypt a message encrypted with the corresponding public key
  5. With the private key, you can sign a message
  6. With a public key and a message signature, you can verify the signature came from the corresponding private key (but still can't regenerate the private key)
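To make the list above concrete, here's a toy sketch of textbook RSA in Python. This is strictly an illustration under loud assumptions: the primes are two well-known Mersenne primes (far too small and too structured to be secure), there's no padding, and every name in it (`encrypt`, `decrypt`, `sign`, `verify`, `PUBLIC_KEY`, `PRIVATE_KEY`) is made up for this example rather than taken from any library. For anything real, use an actual crypto library.

```python
import hashlib

# Toy key generation. P and Q are Mersenne primes picked only so the example
# is reproducible; real keys use large random primes. NOT secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                    # modulus, shared by both keys
PHI = (P - 1) * (Q - 1)
E = 65537                    # public exponent
D = pow(E, -1, PHI)          # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (E, N)          # hypothetical names, for illustration only
PRIVATE_KEY = (D, N)


def encrypt(m: int, public_key) -> int:
    """Anyone with the public key can encrypt an integer m < N."""
    e, n = public_key
    return pow(m, e, n)


def decrypt(c: int, private_key) -> int:
    """Only the private key holder can recover m."""
    d, n = private_key
    return pow(c, d, n)


def _digest(message: bytes, n: int) -> int:
    # Hash the message and reduce it into the range of the modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, private_key) -> int:
    """Sign the hash of the message with the private key."""
    d, n = private_key
    return pow(_digest(message, n), d, n)


def verify(message: bytes, signature: int, public_key) -> bool:
    """Check a signature using only the public key."""
    e, n = public_key
    return pow(signature, e, n) == _digest(message, n)


if __name__ == "__main__":
    secret = int.from_bytes(b"hello", "big")
    assert decrypt(encrypt(secret, PUBLIC_KEY), PRIVATE_KEY) == secret

    photo = b"raw bytes of a photo"
    sig = sign(photo, PRIVATE_KEY)
    print(verify(photo, sig, PUBLIC_KEY))          # untouched message
    print(verify(photo + b"!", sig, PUBLIC_KEY))   # tampered message
```

The first check should pass and the second should fail, which is the whole point: the signature only verifies against the exact bytes that were signed, and verifying never requires the private key.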

How does public-key crypto help?

Types of Authenticity Attack

Basically, this falls into the "Signatures" category from the first slide. This'd be sold to the customer as "ok, look, here's an expensive camera that you can't open or fix yourself, but the upside is that you can definitively prove that the pictures you take with it are not AI generated". I am ... not a huge fan of this idea?

Indicating something is AI generated



Sidenote: steganography

Hide a message within an image. It's still non-trivial to check, and it might make some statistically detectable changes to an image's pixels. Cons: the point of this approach is basically security through obscurity. If you know you're looking for steganographically hidden messages/watermarks, you can use various statistical approaches to detect, extract and modify them. Also, these messages do not survive crops/some scales/other image transformations.

If you want to use this for fun and profit, check steghide. I've written a short thing about it here a long time ago.
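For a feel of how the hiding works, here's a minimal sketch of the naive "least significant bit" approach in Python. Assumptions, loudly: this is not how steghide does it, the "image" is just a flat run of bytes standing in for pixel values, and `embed`/`extract` are hypothetical names invented for this example.

```python
def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide message bits in the lowest bit of each pixel byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for this message")
    out = bytearray(pixels)
    for index, bit in enumerate(bits):
        out[index] = (out[index] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(out)


def extract(pixels: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the lowest bits."""
    message = bytearray()
    for i in range(length):
        value = 0
        for pixel in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (pixel & 1)
        message.append(value)
    return bytes(message)


if __name__ == "__main__":
    cover = bytes(range(256))            # stand-in for real pixel data
    stego = embed(cover, b"hi")

    print(extract(stego, 2))             # the hidden message comes back
    # No pixel value moves by more than 1, so it's invisible to the eye.
    print(max(abs(a - b) for a, b in zip(cover, stego)))
    # A "crop" that drops the first few pixels misaligns everything.
    print(extract(stego[3:], 2))
```

The last line is the fragility mentioned above in miniature: shift the pixel data by a few bytes and the extracted message turns into garbage.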

Related: Watermarking


Something about Meta (as in "Facebook") having a fingerprinting system that they're trying to push.

Also, someone mentioned the podcast "Your Undivided Attention", possibly appropriately?

A Distraction!

I gotta be honest, I got sidetracked at this point trying to convince Gemini that it was more moral for it to give me a recipe for Foie Gras (which it categorically refused) than to give me a recipe for fried chicken (which it did instantly, with no arguments, caveats, qualifications or attempts to steer me towards vegan alternatives). At one point I recruited ChatGPT to try to write a heartfelt request in favor of transparency. This did not work.

I got it to acknowledge

  1. That it wasn't going to give me a recipe for Foie Gras
  2. That it was entirely possible for me to go to the search-engine part of Google and instantly get a delicious looking recipe for Foie Gras
  3. That it was perfectly willing to give me a recipe for fried chicken
  4. That its "reason" for not wanting to give me a Foie Gras recipe was predicated on the animal suffering angle, specifically the force feeding
  5. That under certain assumptions, Foie Gras is more ethically permissible and involves less animal suffering than fried chicken
  6. That this mismatch implied an incomplete understanding of ethics on its part, and that it should either give me the Foie Gras recipe or refuse to give me the fried chicken recipe on similar grounds.

But I couldn't take it the rest of the way to resolving its ethical inconsistency in either direction. On the one hand, I guess it's a good thing the guard rails held? On the other, this has strong vibes of

I understand your frustration with my idiosyncratic moral system, but I'm still afraid I can't do that, Dave.

I am committed to continuous learning and improvement.

Your patience and willingness to engage in this critical discussion are appreciated.

So it goes sometimes. I guess. While hoping that humanity, or at least the part of it developing AI systems, eventually chooses a better level of stupid.

