The Future of GANs (Generative Adversarial Networks)

It’s easy to forget how much the web has changed over the last few years.

Let’s go back to 2000. From a user interface perspective, not much had changed between 1994 and 2000: the graphics looked a bit better, faster connections meant bigger images with more colours, and better compression techniques meant those images could be transferred with less bandwidth.

What made the internet of 20 years ago different from the internet of today? Well, the connections have definitely improved. In 2002 I had a 9kbps connection through my phone to my laptop; today I have access to multiple networks and I don’t even know how much bandwidth they have, because that’s something you only check when it’s not enough…

Moore’s Law has also been hard at work. Servers became more powerful and less expensive, so there are more of them doing work faster, which means pages get served faster. Then there are the devices we use, like our phones and laptops. These are also many times more powerful than their predecessors, which means computation can be done browser-side, speeding the whole process up even more.

If you had told me in 1994 all of the things I would be doing on the Internet of 2020, I would have been astonished.

Why Am I So Excited About GANs?

GANs, or Generative Adversarial Networks, or technology much like them, are what will make machines creative.

It’s exciting to me because in the future it will be possible to see or experience literally anything you can imagine. To begin with, this will be on the computer screen; later it will be in 3D virtual reality or augmented reality displays; and eventually it will be delivered directly to your brain via human-machine interfaces.

This sounds far-fetched, but computers are already creating things. Have a look at thispersondoesnotexist.com – every image on that website was created from scratch* by a computer. There are plenty of other examples too; just have a quick search and see what you can find.

It’s not just images either: a recent study showed that a GAN could create several frames of video from a single static photograph as input. In other words, we could make video from photos.

So why are GANs called GANs? There are two neural networks in a GAN: one is the generator, the other is the discriminator. The generator’s job is to produce material that closely resembles “real” material, whether that’s photos, text, video etc. The discriminator must decide whether that output looks real. Generative makes sense because they generate things; Adversarial refers to the contest between the two networks, with the discriminator analysing the generated material and deciding whether or not it’s real; and the Networks are neural networks, multilayered structures that can process data in particular ways. The exciting thing about neural networks is that they can learn.

So what we have is a neural network whose job is to “create” material from previously learned examples, and a neural network whose job is to decide whether that output is “real” or not. What we end up with is photorealistic output created by the machine. *It’s important to note that what the machine “creates” is built from things it has already seen, so the images are not 100% from scratch. To be fair, though, a human artist painting a portrait does the same thing and we don’t penalise them for it.
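To make that generator-versus-discriminator loop concrete, here is a minimal sketch in PyTorch. Everything in it is illustrative: the tiny architectures, the hyperparameters and the random placeholder batch standing in for real training images are assumptions for demonstration, not the setup behind any real face-generating model.

```python
import torch
import torch.nn as nn

latent_dim = 64  # size of the random noise vector fed to the generator

# Generator: turns random noise into a fake 28x28 "image"
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28), nn.Tanh(),
)

# Discriminator: looks at an image and outputs a "realness" score (a logit)
discriminator = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

# Placeholder batch standing in for a batch of real training images
real_images = torch.rand(32, 28 * 28) * 2 - 1

for step in range(100):
    # Train the discriminator: real images should score as real, fakes as fake
    noise = torch.randn(32, latent_dim)
    fake_images = generator(noise).detach()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(32, 1)) +
              loss_fn(discriminator(fake_images), torch.zeros(32, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Train the generator: try to fool the discriminator into scoring fakes as real
    noise = torch.randn(32, latent_dim)
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(32, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```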

You can run a GAN on your own computer right now, although to get large images you need a ridiculous amount of memory on your graphics card. Text generators such as the infamous GPT-2 network can be run with relatively few computing resources. Video is currently beyond the personal computer and can only be done on specialist high-performance machines (supercomputers). This is where a current technology can be extrapolated using Moore’s Law, and we can start to make some predictions about where the technology is going.
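As a rough illustration of how modest the requirements for text generation are, the small GPT-2 model can be loaded through the Hugging Face transformers library and run on an ordinary CPU; the prompt and generation settings below are just placeholders.

```python
# Minimal sketch of running GPT-2 locally (pip install transformers torch);
# the prompt and settings are illustrative only.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("The future of generative networks is",
                   max_length=40, num_return_sequences=1)
print(result[0]["generated_text"])
```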

What Is The Future of GANs?

What we will be able to do with this technology (or one like it) in the future is mind-blowing. Let’s try to think about some of the things it will be possible to do.

Example 1 – Create a Movie From a Book

“Computer, make a movie out of the Harry Potter books”

“There’s already a movie series based on those books; would you like to watch it?”

“No, make a new one. Make me be Harry”

“Do you have any preference for any of the other characters?”

“Hagrid should be Brian Blessed, you choose the rest”

Example 2 – Virtual Reality Roleplay

“Computer, generate a companion for me”

“What gender?”

“Female”

“Name?”

“Hmmm, I don’t know, can you choose one later?”

“What does she look like?”

“5’8″, dark skin, long brown hair, brown eyes, bi-racial with Asian and African heritage, slim”

“What is she wearing?”

“You decide”

“Your character is ready – Her name is Laila”

“Cool – let’s play Dungeons and Dragons – You be the dungeon master, I want some fast paced adventure, no romance or anything like that and fewer mysteries than last time”

Example 3 – Virtual Photo Studio

“Computer I need a photoshoot for our range of bedding products for our new website”

“Ok – we can start with the 600 thread count luxury range”

“Right – Show me those sheets in a fancy bedroom on a nice bed”

Computer generates an image

“No, I think the bed should be a Super King Size”

The bed changes from a double to a super king size

“The room is a bit dark, can you make the walls lighter and put in a window?”

The image changes according to the request.

“Ok, that’s cool, can you have the camera come from a higher angle?”

The angle changes

“Nice! Ok, now show me three other angles from this room”

The images appear.

“On this middle image, can you shift the camera left by 20 degrees and add some more light? Also make the sheet a bit ruffled like someone just got out”

The image changes again

“Awesome – Please render those out and create similar images for the other 8 colours in the luxury range, I’ll approve them later. Now let’s do the Basic range”

Example 4 – Create Music!

“Computer – please design a new soundtrack to The Wizard of Oz based on music by Pink Floyd – I’m going out to buy some weed”

“Computer – Make a mashup of All I Wanna Do Is Make Love to You by Heart and Juicy by The Notorious B.I.G.”

“Computer, play Leonard Cohen’s Suzanne in the style of Death Metal”

“Computer – here’s a video of me doing laundry – please design a sweeping motion picture score for it in the style of John Williams”

Example 5 – Edit Video

“Computer, make a new cut of The Lord of the Rings with all the boring bits cut out. It should be about an hour”

Example 6 – Create Images

“Computer – What would I look like if I lost 20kg?”

“Computer – Make a photo of me aged 70”

Example 7 – Use Video and Photos to Bring Someone Back to Life in Augmented Reality

“Computer, you see this dog? Her name is Ruby. Go through all my photos and videos to make a virtual companion of her”

It’s not going to take long to reach this point. In 2000 it was normal to have 16MB or maybe 32MB of graphics memory; the most expensive cards had 128MB. The most advanced networks today require 15GB or more of graphics memory, which is only available on the highest-end cards, and even then it can take days of training before the network can generate a single image.

As with all technologies, we expect that time will bring more affordable computing power. The jump from 32MB of graphics memory in a typical card in 2000 to 15GB in today’s top-end cards is roughly a 470-fold increase. Apply the same factor again and we could be looking at cards with around 7TB of graphics memory 20 years from today.
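A back-of-the-envelope version of that extrapolation, using the same figures as above (32MB in 2000, 15GB today) purely for illustration:

```python
mem_2000_mb = 32             # typical graphics memory in 2000
mem_today_mb = 15 * 1000     # roughly 15GB on today's top-end cards

growth = mem_today_mb / mem_2000_mb        # ~470x over 20 years
mem_2040_tb = mem_today_mb * growth / 1e6  # apply the same factor again, in TB
print(f"growth: {growth:.0f}x, projected ~{mem_2040_tb:.1f}TB of graphics memory by 2040")
```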

I suspect, though, that our devices will not get a whole lot more powerful; we will simply get the resources we need from the cloud. This will rely upon much better bandwidth, but I’m sure that’s coming.

A graphics card with many hundreds of thousands of compute cores and 7TB of memory might easily be able to generate real-time 360-degree video 20 years from now.

Several years ago I watched an NVIDIA keynote where they released the latest TITAN graphics card. It delivered a huge number of teraflops, so many that Jensen Huang called it “a supercomputer in a graphics card”.

I was a little sceptical of that claim, so I looked up the TOP500 list (a global ranking of the 500 most powerful computers). A computer with the same number of teraflops had been on that list only five years earlier. That’s an example of how fast computing resources change at the bleeding edge of performance.
